Reflexive Behaviour: How publication pressure affects research quality in Astronomy

Reflexive metrics is a branch of science studies that explores how the demand for accountability and performance measurement in science has shaped research culture in recent decades. Hypercompetition and publication pressure are part of this neoliberal culture. How do scientists respond to these pressures? Studies on research integrity and organisational culture suggest that people who feel treated unfairly by their institution are more likely to engage in deviant behaviour, such as scientific misconduct. Building on reflexive metrics, combined with studies on the influence of organisational culture on research integrity, this study reflects on the research behaviour of astronomers: 1) To what extent is research (mis-)behaviour reflexive, i.e. dependent on perceptions of publication pressure and distributive & organisational justice? 2) What impact does scientific misconduct have on research quality? To perform this reflection, we conducted a comprehensive survey of academic and non-academic astronomers worldwide and received 3,509 responses. We found that publication pressure explains 10% of the variance in the occurrence of misconduct and between 7% and 13% of the variance in the perception of distributive & organisational justice as well as in overcommitment to work. Our results on the perceived impact of scientific misconduct on research quality show that the epistemic harm of questionable research practices should not be underestimated. This suggests a need for policy change. In particular, paying less attention to metrics (such as publication rate) in the allocation of grants, telescope time and institutional rewards would foster better scientific conduct and hence research quality.


Introduction
The growing body of research on the effect of evaluation procedures on scientific behaviour (e.g. Hesselmann, 2014; Stephan, 2012; Laudel & Gläser, 2014; Fochler & De Rijcke, 2017) points towards performance indicators (such as publication and citation rates) not only describing, but also prescribing behaviour (Desrosières, 1998; Porter, 1995). In other words, they have constitutive effects on the knowledge production process (Dahler-Larsen, 2014). This suggests that metrics intended to measure concepts like research quality end up defining what research quality means, and thereby shape what researchers strive for. As Dahler-Larsen states: "A claim to measure quality cannot be understood as referring to an already-existing reality, but as an attempt to define reality in a particular way" (Dahler-Larsen, 2019; p.11). Metrics are therefore not merely proxies for quality, but represent a definition of what counts as quality.
Capturing a complex concept such as research quality quantitatively strips it of its complexity. This makes it easier to understand, turning it into something objective and comparable (Desrosières, 1998; Dahler-Larsen, 2019), but it also leads to a "validity problem" (Dahler-Larsen, 2014; p.971). The indicator's inability to account for the phenomenon's full complexity may therefore lead to an "evaluation gap" (Wouters, 2017). This may have unintended consequences when indicators are put into place, such as scientific misconduct resulting from coping with the divergence between what quantitative proxies measure and what researchers themselves value (Heuritsch, 2021).
Which metrics are considered important depends on the culture of science, which has evolved through neoliberal reforms and the rise of New Public Management over the last 30 years (Lorenz, 2012). Researchers have become increasingly dependent on external resources, such as funding and rewards, and competition for these resources and for positions has intensified (Anderson et al., 2007). Publish-or-perish is an integral part of this culture, since a scientist's reputation (arguably the most important currency in academia), their funding opportunities and their career development hinge on metrics such as their publication rate (Tijdink et al., 2014a; Moosa, 2018). As a result, researchers have an interest in scoring well on performance indicators. Due to this "goal displacement", where doing well on quantitative metrics becomes an aim in itself (Fochler & De Rijcke, 2017; p.27), researchers may adopt various gaming strategies to attain the goal (see e.g. Laudel & Gläser, 2014; Rushforth & De Rijcke, 2015). This suggests that an indicator ceases to be a good measure when it becomes a target (Goodhart's Law).
Adopting gaming strategies to hit a target set by performance indicators is what Fochler & De Rijcke (2017) call playing the "indicator game". While some forms of gaming may seem innocent at first (e.g. going for an easy publication), they can result in behaviour which scientists themselves perceive as a threat to research integrity. This may range from questionable research practices (QRPs), such as insufficient supervision of (graduate) students or salami slicing to publish more papers on one's research, to outright scientific misconduct such as fabrication, falsification or plagiarism (FFP; OSTP, 2000). Martinson et al. (2005) have shown that the latter, more extreme kinds of misconduct are less frequent than the "'minor offences', the many cases of 'sloppy science'" (Haven & van Woudenberg, in print) and "carelessness" (Martinson et al., 2006; p.2), represented by the QRPs. Since they are more numerous and more difficult to spot, Martinson et al. (2005; p.737) suggested that such "mundane 'regular'" forms of misbehaviour pose larger threats to research integrity than outright fraud. If playing the indicator game is at least partly causing research misconduct, it follows that using indicators in research evaluation has an impact on research integrity.
Studies on the constitutive effects and unintended consequences of indicator use on research (behaviour) are designated under the umbrella term "Reflexive Metrics" (Heuritsch, 2019; p.146). The relationship between research integrity and research culture & climate has also been studied in the literature (e.g. Martinson et al., 2013; Wells et al., 2014; Martinson et al., 2016). Academic culture may comprise networks of peers, departments, institutions, funding agencies, grant reviewers, journal editors and the peer-review system (Martinson et al., 2006). These authors carried out the first systematic, quantitative analysis of the relationship between organisational culture, perceptions of justice and scientists' behaviours. They found evidence that a greater perception of injustice leads to misbehaviour, especially among researchers whose careers are at stake (e.g. early-career researchers). Other studies related individual perceptions of the research climate, such as advisor-advisee relations or expectations, to misconduct. Anderson et al. (2007) and Martinson et al. (2009) found evidence that the greater competition resulting from the neoliberal culture in science results in gaming strategies to the detriment of research integrity.
Research integrity has been linked to research quality (e.g. Martinson et al., 2010), and the former is easier to measure than the latter (Wells et al., 2014). As both correlate directly with scientific misconduct, and a climate of research integrity fosters science quality, the two terms are often equated in the literature on the effects of cultural aspects on research integrity. As pointed out in the literature (p.837), improving research quality is in fact the "holy grail" of organisational initiatives targeted at an ethical organisational climate. Given that metrics have constitutive effects, simply "fixing" aspects of quality by an indicator (Dahler-Larsen, 2019; p.143) will likely not lead to this holy grail. Instead, quality indicators "are neoliberal instruments which colonize practices and undermine professional values" (Dahler-Larsen, 2019; p.14). Since directly measuring research quality is therefore unfeasible, studying the cultural aspects of a research environment seems to be the way forward to foster scientific quality.
The effects of publication pressure on research quality have received particular attention in recent studies, since publication rate is one of the key metrics in academia (cf. Moosa, 2018; Stephan, 2012). Haven et al. (2019a) summarise some key studies' findings: while some degree of publication pressure may be a driver of productivity, too much of it may have negative effects not only on research integrity and quality, but also on individual researchers. Examples of these negative effects include secrecy (e.g. a lower willingness to share data; Zuiderwijk, 2019), less academic creativity, less reliable science, the neglect of negative findings (Heuritsch, 2019 & 2021) and a greater likelihood of engaging in misbehaviour (QRPs & FFP; Bedeian et al., 2010; Tijdink et al., 2014b; Bouter, 2015). The perceived competition resulting from the publish-or-perish imperative may also lead to emotional exhaustion and feelings of unworthiness on the individual level (Tijdink et al., 2013; Tijdink et al., 2016; Heuritsch, 2019). However, previous studies on the effects of publication pressure on research quality have three shortcomings: 1) Quantitative studies have thus far mainly included scientists from specific disciplines, such as biomedicine, management and population studies, and of specific academic ranks (Miller et al., 2011; Van Dalen & Henkens, 2012; Tijdink et al., 2014b). 2) While previous literature acknowledges the link between research integrity and research quality, we currently lack quantitative studies that explore the direct impact of misbehaviour on research quality, as opposed to the impact implied through compromised research integrity. 3) Haven et al. (under review) is, to our knowledge, the only quantitative study that examines both the impact of publication pressure and the impact of research climate on research integrity.
To address these shortcomings, this paper aims to quantitatively study the impact of research culture on research quality in a natural science field: Astronomy. The aspects of research culture under scrutiny are perceived publication pressure and perceptions of distributive and procedural justice in peer review, grant application and telescope time application processes. These cultural aspects have been found relevant for Astronomy in a qualitative study (Heuritsch, 2021). The author studies the "organisational hinterland" (Dahler-Larsen, 2019) of Astronomy to understand how quality inscriptions are produced, how they diverge from astronomers' definition of quality and how this discrepancy affects research behaviour. In a nutshell, Heuritsch (2021) finds evidence that the structural conditions in the field, especially the over-emphasis on performance as measured by publication rate, receipt of external grants and telescope time, lead to gaming strategies to score well on those indicators, mostly in the form of QRPs. These are found to be a response to the dissonance between cultural values (producing quality research that genuinely pushes knowledge forward) and the institutional objectives imposed on those pursuing a career in academia (scoring well on indicators). In other words, there is a discrepancy between what indicators measure and astronomers' definition of scientific quality: the so-called evaluation gap. Gaming strategies then give the appearance of compliance with cultural values, while using institutionalised means to achieve a good bibliometric record in innovative ways, such as salami slicing, cutting corners or going for easy publications (Haven et al., 2019b). The author finds evidence for a decrease in overall research quality as a consequence of prioritising quantity.
Based on Heuritsch (2019 & 2021), we can use astronomers' own definition of research quality, as well as previous studies on the relationship between academic culture and research behaviour, to analyse the effect of perceived publication pressure and organisational justice on research behaviour and quality in astronomy. Understanding which cultural aspects foster and which inhibit research quality in this field will bring us a step closer to the holy grail: the knowledge of how to support the scientific enterprise. This is not only the first study to analyse the effects of cultural aspects (such as publication pressure) on research quality in Astronomy, but also the first to integrate Reflexive Metrics with studies on the relationship between research culture and integrity. Moreover, it is the first to employ structural equation modelling to fully account for the structural relationships between the phenomena of interest. This paper is structured as follows. First, we give the theoretical background on explanations of misconduct, including a review of research integrity studies that take organisational culture into account. Second, the method section describes the sample selection, the survey instruments, the research question & hypotheses and the technical aspects of our statistical analyses. Third, the results section contains descriptive statistics, the results of our EFAs, CFAs and SEM with scientific misconduct as the dependent variable, as well as the perceived impact of scientific misconduct on research quality. The results section is followed by a discussion, a strengths & limitations section and a conclusion that also gives an outlook on future studies.

Theoretical Background: Explanations for Misconduct
To understand metrics' role in research misconduct (hereafter referred to as misconduct or misbehaviour), we must first reflect on the causes of misbehaviour. Haven & van Woudenberg (in print), based on Sovacool (2008), suggest that there are three narratives which can explain misconduct: (i) failures at the individual actor's level ("impure individuals"); (ii) failures at an institutional level (of a particular university/institute); (iii) failures at the level of the structural system of science. Haven & van Woudenberg (in print) point out that these narratives are not mutually exclusive, and they test six theories taken from previous literature on misconduct to assess their value in explaining one or more of these narratives. Five of those six theories shall be considered for this study: (1) Bad Apple, (2) General Strain, (3) Organizational Culture, (4) New Public Management and (5) Rational Choice Theory.
While a comprehensive review of these theories is outside the scope of this paper, more background information on this discourse can be found in Martinson et al. (2006) and Haven & van Woudenberg (in print). To set the theoretical background for this study, we shall give a brief overview and describe how (1) to (4) can be subsumed under (5), Rational Choice Theory.
Bad apple theories provide perhaps the earliest explanations for misconduct (Hackett, 1994) and account for the first narrative (i), since misbehaviour is thought to be caused solely by an individual and their distorted psychology. However, these theories are regarded as too simplistic in a sociological context, as they do not account for any institutional (ii) or structural (iii) contexts.
General Strain Theory (GST; Agnew, 1992) is an "important strand of deviance theory, as it is the pressure to deviate from accepted norms as a response to perceived injustice" (Martinson et al., 2006; p.4). Based on Durkheim's concept of anomie, Merton (1938) suggests that deviant behaviour may be a coping response to structural strain, resulting from the inability to meet cultural ends with culturally legitimate means. The deviant behaviour (such as scientific misconduct) motivated by this kind of stress is therefore nothing but an "innovative" pathway to success. Agnew's GST enhances Merton's concept with the idea that deviant behaviour is not a necessary outcome of strain, but that coping strategies also depend on individual traits, such as self-esteem and intelligence, and on contextual factors, such as a strong social support network and whether peers show legitimate or illegitimate coping behaviour. Since GST recognises individual-environment interactions and strain resulting from structural conditions, it accounts for all three narratives.
Organisational Culture Theories (OCTs), rooted in organisational psychology, recognise that the culture and structure of the organisation an individual works in affect their behaviour. In the context of academia, the organisational structure spans local settings (e.g. departments or research institutes) and external settings (e.g. funding agencies, [inter-]national peer review systems, the overall academic employment market, etc.; Martinson et al., 2006). Organisational culture encompasses all explicit and implicit norms and values within the organisation. A particular strand of OCT, Organisational Justice Theory (OJT), suggests that individuals who perceive that they are being treated fairly by their organisation behave more fairly themselves (Martinson et al., 2006; Martinson et al., 2010). The fairer people feel their organisation's processes are, the more likely they are to trust their workplace, to comply with decisions made and not to engage in questionable behaviour (ibid.). In other words, people who perceive the distribution of resources and decision-making processes as fair are more likely to respond with normative as opposed to deviant behaviour, such as scientific misconduct (Martinson et al., 2006). One may distinguish between two types of organisational justice (OJ; ibid.): procedural and distributive justice. The former refers to a perception of fairness in decision-making and the latter to fairness in resource distribution processes. In academia, these processes may stem from the local and external settings of the organisational structure, such as the peer review of manuscripts, tenure, promotion and peer-review committees for research grant proposals (Martinson et al., 2010). Since OCTs recognise that characteristics of the environment in which researchers work promote or inhibit scientific integrity, they fall under the institutional (ii) and structural (iii) narratives.
New Public Management (NPM) is a form of public administration based on neoliberal policies and is characterised by a combination of free market ideology and intense managerial control practices (Lorenz, 2012). The author (p.601) poses the formula: "free market = competition = best value for money = optimum efficiency […]". Arguably, this formula is the rationale for competition in science, since NPM practices have reached academia since the 1980s (ibid.). The NPM paradigm also values efficiency as a key objective, as is evident from the formula. The call for increased accountability in science (cf. Espeland & Vannebo, 2008) can be associated with this striving for efficiency. Accountability is in turn sought through quantitative performance indicators, such as publication rates and impact factors. Since resources are limited and there are fewer tenured positions in science than there are graduate students, and given the extreme focus on efficiency and performance, NPM may result in "hypercompetition" in academia (Halffman & De Radder, 2015) at the cost of research integrity and the scientific enterprise (Anderson et al., 2007).
"Academic misconduct is considered to be the logical behavioral consequence of output-oriented management practices, based on performance incentives." (Overman et al., 2016; p.1140).
Therefore, the NPM theory of scientific misbehaviour suggests that if there is an over-emphasis on performance and competition, researchers will tend towards "self-protective and self-promoting" behaviours, such as mistrust in peers, an aversion towards sharing information & data, QRPs and FFPs (Anderson et al., 2007; p.459). This theory falls under the structural system of science narrative (iii).

Rational Choice Theory (RCT) is proposed by Heuritsch (2021) as a suitable framework to study the emergence and the impact of the evaluation gap in the academic field of Astronomy. Despite RCT's roots in economics, and contrary to what is often stated in the literature (e.g. Haven et al., under review; Atkinson-Grosjean & Fairley, 2009), when applied thoroughly in sociology (according to Esser, 1999), RCT does not suggest that individuals act "rationally" in the classical sense (i.e. with a high investment of cognitive resources and disregard of emotions/instincts), nor does it assume the simplistic "Homo Economicus". Instead, RCT explains how it is often more rational not to invest cognitive resources and instead to follow scripts, which are pre-defined instructions on how to act in situations, based on cultural and/or institutional norms (Esser, 1999). A thorough application of RCT follows the Coleman Boat (Coleman, 1990) by first analysing the logic of an actor's situation, which is comprised of one internal and three external constituents: (1) the internal component, made up of the actor's values, drivers, skills and personality; (2) the material opportunities at present; (3) explicit and implicit institutional norms; and (4) the cultural reference frame, such as symbols and shared values. In a second step, this situation is translated into bridge hypotheses, which deliver the variables that are relevant for the individual's action that follows from the situation. An action theory (oftentimes Expected Utility Theory; cf. Esser, 1999 and Heuritsch, 2021, Fig. 3) is applied to explain how choices are made on the basis of the derived variables of the situation. Third, by finding suitable transformation rules, one can explain how individual actions aggregate to the sociological phenomenon in question. The result is a sociological explanation of how the interplay between structural conditions, institutional norms and individuals' personalities causes collective social phenomena. Applied to explaining scientific misbehaviour, RCT therefore pays tribute to all three narratives.

Building on Heuritsch (2021), which followed the Coleman Boat in its analysis, we find that theories (1) to (4) may be subsumed under RCT. First, the internal constituent of the individual's situation accounts for the person's beliefs, values and drivers. If these do not correspond to the cultural values, such an individual may assume the role of a "bad apple" (1) in the respective system/organisation. Second, if the cultural values cannot be lived up to by institutionally legitimate means, the individual may perceive strain in the form of anomie (2; cf. Heuritsch, 2021). Third, the external constituents of the situation account for the organisational culture and environment (3), and therefore for their relevance in the individual's action. For example, perceived injustices may contribute to strain (2; Martinson et al., 2010). Fourth, the prevailing NPM paradigm (4) influences the organisational culture (3), and therefore explains part of the culture's values and norms (Haven et al., under review). For example, NPM is likely to foster hypercompetition, which may impose strain (2), which in turn may lead to deviant behaviour, depending on the individual's dispositions (1) that regulate the response to strain. In this way, theories (1) to (4) deliver partial explanations for misbehaviour, which, subsumed under RCT (5), achieve a wider explanatory value.

Sample Selection and procedure
Astronomy is a highly international and globalised field (Roy & Mountain, 2006; Heuritsch, 2021). This includes a high proportion of international collaborative research (Chang & Huang, 2015), not least because of the sharing of observatories located in specific parts of the world (e.g. ALMA in Chile). Astronomers have a strong common culture through their publication systems (three main journals and conference proceedings), societies and professional associations (Roy & Mountain, 2006; Heuritsch, 2021). Heidler (2011) estimates that there are about 15,000-20,000 active astronomers worldwide. They may work in academic research or in other research facilities, such as space agencies or non-public institutes. Given the close similarity in culture between academic and non-academic astronomers, this survey targeted both groups.
The ideal aim was to run a census. However, there is no official, complete list of all astronomers worldwide. Therefore, we used a four-stage cluster sampling technique to build a sampling frame encompassing as many astronomers as possible. In the first stage, we constructed a list of astronomy institutions worldwide, including universities, non-academic research organisations, observatories, societies and associations. In the second stage, based on this list, we reached out to 176 universities, 56 non-academic research facilities & observatories and 17 societies & associations. We did so by emailing a respective contact person (e.g. a secretary or department head) with an invitation to the survey, asking them to forward the email to all department members (including PhD students). We estimate that 1,200 academic astronomers were reached in this way. In the third stage, we contacted the heads of the 9 divisions of the International Astronomical Union (IAU), the largest association of academic and non-academic astronomers, asking them to forward the invitation to their respective division members. Three of them followed our request and one posted the invitation in their newsletter. In the fourth and final stage, we used an automated script to send out email invitations to the five remaining divisions, whose heads were non-responsive, using publicly available email addresses. In this way, all IAU members (approximately 12,000 astronomers) were reached through at least one channel. Note that some astronomers may have received the invitation more than once, since astronomers may be part of more than one division and may also have been reached through several of the approaches described above. Given that only around 7% of IAU members are "junior members", we asked survey recipients to forward the invitation to early career researchers.
We estimate that around 13,000-15,000 astronomers were reached in total, 3,509 of whom at least partly completed the survey, amounting to a response rate of roughly 25%. 2,011 astronomers completed the survey in full.

Instruments
We used the online tool LimeSurvey to create and host the survey. As outlined above, the survey is embedded in the conceptual framework of rational choice theory (RCT), subsuming organisational culture theory (OCT) and its subtype organisational justice theory (OJT). Our online survey therefore contains a number of instruments to measure our independent and dependent variables:

• Research Misbehaviour
While there is no general definition of research misconduct, since it is highly contextual (Hesselmann, 2014), Martinson et al. (2005, 2006, 2009, 2010) and Bouter et al. (2016) designed questionnaires asking about the occurrence of, in total, 90 different misbehaviours and questionable research practices. On the basis of Heuritsch (2021), we chose for our study the 18 items that were most relevant for the context of astronomy (see S1-Appendix: S1-Table1a-b). In our survey, we ask about the perceived frequency of observed (mis-)behaviour.

• Research Quality
We operationalised research quality in astronomy on the basis of the findings of Heuritsch (2019 & 2021). The author found three quality criteria: (1) good research needs to push knowledge forward, which includes studying a diversity of topics and making incremental contributions; (2) the research needs to be based on clear, verifiable and sound methodology that is (3) reported in an understandable and transparent way, which includes the sharing of data and reduction code. In line with Bouter et al. (2016), who surveyed the frequency of misbehaviour and its impact, for each (mis-)behaviour item we asked about the frequency (as mentioned above), the impact on the validity of the findings and the impact on the communication value of the resulting paper; for two items we additionally asked about the impact on research diversity (see S1-Table1a-b). In analogy to Bouter et al. (2016; p.2), "the total harm [on quality] caused by a specific research misbehaviour depends on the frequency of its occurrence and the impact [on quality] when it occurs".
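The quoted relation from Bouter et al. (2016) can be sketched as a simple product of the two survey ratings. The item names and the scale values below are hypothetical illustrations, not figures from our survey:

```python
# Sketch of the harm logic quoted from Bouter et al. (2016):
# total harm of a misbehaviour = frequency of occurrence x impact when it occurs.
# Item names and 1-5 ratings below are invented placeholders.

items = {
    # item: (perceived frequency, perceived impact on validity), both 1-5
    "salami_slicing": (4, 2),
    "data_fabrication": (1, 5),
    "insufficient_supervision": (3, 3),
}

def harm(frequency, impact):
    """Total harm as the product of frequency and impact."""
    return frequency * impact

# Rank items by their estimated total harm on research quality.
ranked = sorted(items, key=lambda k: harm(*items[k]), reverse=True)
print(ranked)
```

With these illustrative numbers, a frequent "minor" offence (high frequency, low impact) can outrank rare outright fraud (low frequency, high impact), which is the point Martinson et al. (2005) make about "mundane 'regular'" misbehaviour.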

Independent Variables
• Perceived Publication Pressure
To measure perceived publication pressure, which has been linked with (perceived) misbehaviour (Haven et al., under review), we adapted the Publication Pressure Questionnaire (PPQ, as validated by Tijdink et al., 2014a) to the context of research in astronomy. The initial PPQ consists of 18 items, to which we added four more. The added questions deal with the influence of perceived publication pressure on the publication of data, reduction algorithms and replicability, all three of which have been found important for research quality in Astronomy (Heuritsch, 2019 & 2021). The 22 adapted PPQ items can be found in S1-Appendix: S1-Table2.

• Perceived Organisational Justice
To measure perceived distributive justice, we adapted the effort-reward imbalance (ERI) items and added one more item, resulting in three effort and eight reward items (see S1-Appendix: S1-Table3). With regard to perceived procedural justice, in this study we consider the following processes: a) resource allocation, b) peer review, c) grant application and d) telescope time application. We adapted the instruments of Martinson et al. (2006) to the context of astronomy and the specific processes (see S1-Table4a-d). In particular, we added questions about how much the success of each process depends on luck on the one hand and on improper preferential treatment on the other. The addition of these two items followed from findings by Heuritsch (2021) and suggestions from initial tests with astronomers.

Control Variables
In addition to our main independent variables that may predict misbehaviour, we assume that role-associated and individual aspects factor into these relationships. On the basis of Heuritsch (2021), we expect that scientists who have not yet established their reputation, such as early career researchers, will be more likely to perceive publication pressure and organisational injustice, as there are insufficient tenured positions. As opposed to Martinson et al. (2006), who suggest that role-associated aspects ("social identity") mediate the relationship between the independent and the dependent variables, we therefore propose that they predict the independent variables. Our control variables include: gender, academic position, whether one is primarily employed at an academic (as opposed to non-academic) institution, whether one is employed at an institution in the global North/South, and the number of first- or co-author papers published in the last 5 years. The reference categories are: gender: female/non-binary; academic position: full professor; institute location: global South; number of papers published: 1-5.
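The reference-category coding described above can be sketched as follows; the `dummy_code` helper and the category labels are illustrative placeholders, not the survey's actual coding scheme:

```python
# Sketch of dummy-coding a categorical control variable against a reference
# category (as with "full professor" for academic position above). The
# reference level is omitted, so each remaining dummy is interpreted
# relative to it. Labels below are illustrative.

def dummy_code(values, reference):
    """One-hot encode `values`, omitting the reference category."""
    levels = sorted(set(values) - {reference})
    return [{f"is_{lvl}": int(v == lvl) for lvl in levels} for v in values]

positions = ["postdoc", "full professor", "PhD student", "postdoc"]
coded = dummy_code(positions, reference="full professor")
print(coded[1])  # the reference category codes as all zeros
```

In the regression, each dummy coefficient then expresses the difference from the reference group (e.g. a postdoc versus a full professor).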

Research Question & Hypotheses
In light of the theoretical background outlined above, we work from the assumption that higher perceived procedural and distributive injustice and higher perceived publication pressure will increase scientists' chances of observing research misbehaviour. We expect early-career researchers, with a less secure position, to be more likely to perceive both injustice and publication pressure.
Our research question is: To what extent can role-associated factors, cultural aspects and publication pressure explain the variance in perceived research misbehaviour and what effect does misbehaviour have on the research quality in astronomy?
Building on the qualitative study of the evaluation gap and its potential consequences for research quality in astronomy by Heuritsch (2021) and on previous studies on the relationship between research culture and integrity (e.g. Martinson et al., 2006), this study tests the following hypotheses by means of a quantitative survey:
1. (H1): The greater the perceived distributive injustice in Astronomy, the greater the likelihood of a scientist observing misbehaviour.
2. (H2): The greater the perceived organisational injustice in Astronomy, the greater the likelihood of a scientist observing misbehaviour.
3. (H3): The greater the perceived publication pressure in Astronomy, the greater the likelihood of a scientist observing misbehaviour.
4. (H4): Those for whom injustice and publication demands pose a more serious threat to their academic career (e.g. early-career & female researchers in a male-dominated field) will perceive the organisational culture to be more unjust and the publication pressure to be higher, and will subsequently observe more misbehaviour.
5. (H5): Scientific misbehaviour has a negative effect on research quality in astronomy.
6. (H6): The greater the perceived publication pressure in Astronomy, the greater the perceived distributive and organisational injustice.
Based on these hypotheses, we specified our structural equation model. Given that publications are the main output of research and one of the key indicators used to evaluate the performance of astronomers (Heuritsch, 2021), we hypothesise that the perception of distributive and organisational justice depends on the perceived publication pressure. Overcommitment to work may depend on the perceived publication pressure as well as on perceived distributive and organisational justice. We test the influence of all our control variables (academic position, gender, number of published papers, academic/non-academic employment, employment in the global North/South) on all our independent variables and on the dependent variable, scientific misconduct. Figure 1 depicts the latent constructs of this model, excluding the measured indicators, for readability.

Statistical Analyses
The analysis of this survey was performed in SPSS and R. Data preparation, including recoding and the calculation of mean scores, was performed in SPSS. We decided to exclude the 23 Bachelor and Master students from our sample, since we received too few responses from this category to conduct a proper analysis. All instruments measuring the independent variables (see S1-Appendix) are scored on a scale from 1 (strongly disagree) to 5 (strongly agree) and are treated as continuous variables in our structural equation model (SEM). To arrive at our final model, we first tested the independent variable constructs PPQ and ERI (including overcommitment) by performing an exploratory factor analysis (EFA) for ordinal data (CATPCA in SPSS) and derived Cronbach's alphas as scale reliabilities. For the EFA we used Promax rotation with Kaiser normalisation. Next, all independent variable constructs were tested by means of confirmatory factor analysis (CFA) using Lavaan version 0.5-23 (Rosseel, 2012) in R version 3.3.1. For each construct, we used the residual correlation matrices to determine significant correlations between indicators and included these in the respective models. After checking for construct validity, we used Lavaan to perform the structural equation modelling it was designed for. Lavaan uses maximum likelihood estimation for regression analysis and listwise exclusion for missing data. The results section presents the results of our EFAs, CFAs and the complete SEM.
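The scale-reliability step can be illustrated outside SPSS. The sketch below computes Cronbach's alpha for a simulated matrix of 1-5 Likert responses; the formula is the standard one, but the data and code are our own illustration, not the analysis pipeline used in the paper.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for an (n_respondents, n_items) score matrix:
    alpha = k/(k-1) * (1 - sum of item variances / variance of sum score)."""
    k = items.shape[1]
    item_var_sum = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var_sum / total_var)

# Simulated 1-5 Likert responses: three items tracking one common trait,
# so the scale should come out internally consistent (alpha well above 0)
rng = np.random.default_rng(0)
trait = rng.integers(1, 6, size=200)
responses = np.clip(trait[:, None] + rng.integers(-1, 2, size=(200, 3)), 1, 5)
alpha = cronbach_alpha(responses)
```

A value of 0.871, as reported for the cleaned PPQ construct below, is conventionally read as good internal consistency; values around 0.7 as acceptable.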

Descriptive Statistics
In Table 1 we first present the descriptive statistics of the control variables. Females make up around 26% of the sample (N=1827). For further analysis we combined the female and non-binary categories. Of the 2188 astronomers who shared their academic position in the survey, about 15% are PhD candidates, 23% postdocs, 8.5% assistant professors, 25.96% full professors and 12.25% unranked astronomers. Of the 2478 astronomers who declared whether their primary employment is in an academic university setting, 84.18% are employed in such a setting and 15.82% are employed at other research institutions, such as national research institutes, observatories/telescopes or space agencies. Of the 1624 astronomers who reported the country of their primary employment, 84.79% work for an institution in the global North and 15.21% in the global South. 2610 astronomers reported how many papers they published as first or co-authors in the last 5 years. The 7.8% who have not published any are excluded from our regression analysis, since they did not receive the item battery regarding organisational justice in terms of peer review. The largest publication category is 1-5 papers published in the last 5 years (31.72% of respondents), followed by the 11-20 publications category (16.21%). 3.53% have been first or co-author of more than 100 papers in that time frame. Of the 2647 astronomers who answered whether they applied for telescope time in the past 5 years, 58.9% replied yes and the rest replied no. The same number of respondents answered the question about whether they applied for a grant in the past 5 years; here 62.3% answered yes. Those who answered no to either question did not receive the item batteries regarding procedural justice with respect to telescope time or grant application processes, respectively.
For each independent and dependent variable construct, we calculated the mean scores, which are presented in Table 2. The mean of perceived publication pressure lies slightly above the midpoint of the scale. The effort versus reward ratio is 1.15, which means that the perceived effort put into work is higher than the perceived reward received for it. Astronomers also feel a slight overcommitment to work (M=3.39). The four forms of organisational justice are generally above the midpoint of the scale, which indicates that astronomers tend to feel more justice than injustice when it comes to resource allocation, peer review, grant applications and telescope time applications. The mean perceived frequency of scientific misconduct lies just below the midpoint of the scale (M=2.99). The mean impact of the 18 different misbehaviours (listed in Table 2) on the validity of the findings at hand (quality criterion 1) and on the resulting paper's ability to convey the research appropriately (quality criterion 2) is around 3.3. The mean impact of misbehaviour on research diversity is higher (M=3.74). However, one needs to consider that this question was only asked for the two types of misbehaviour (Items 8 and 9) that were expected to have an impact on quality criterion 3. That choice was made in order not to burden participants with a question that did not fit the other misbehaviour items.
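For context, the effort-reward imbalance ratio of the Siegrist scale is conventionally computed as the effort sum divided by the reward sum, corrected for the different numbers of items in the two subscales. A minimal sketch with hypothetical item scores (not the paper's data):

```python
def eri_ratio(effort_items: list[float], reward_items: list[float]) -> float:
    """Effort-reward ratio e / (r * c), where the correction factor c
    adjusts for unequal numbers of effort and reward items."""
    c = len(effort_items) / len(reward_items)
    return sum(effort_items) / (sum(reward_items) * c)

# Hypothetical respondent who rates effort items higher than reward items
ratio = eri_ratio(effort_items=[4, 4, 3], reward_items=[3, 3, 2, 3, 3, 2])
# ratio > 1 indicates that perceived effort outweighs perceived reward
```

On this convention, the sample-level ratio of 1.15 reported above means astronomers on average perceive somewhat more effort than reward.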

Exploratory Factor Analyses
In order to build our SEM, we first performed an EFA for the independent variables PPQ and ERI, the latter of which includes the overcommitment items. The Pearson correlation matrix for the PPQ items shows notably low correlations (<|0.3|) for 10 items. When testing the Cronbach's alphas (see S2-Appendix: S2-Table1a) for the whole construct, we found that removing those items would increase the reliability of the PPQ construct. We therefore removed them, leaving 12 items. The Cronbach's alpha for the remaining PPQ construct is 0.871 (see S2-Table1b), which indicates good internal consistency. We used this cleaned PPQ for all further analysis. The CATPCA resulted in 3 factors, which we classified as F1) Extent & consequences for one's own conduct of research, F2) Impact on relationships with colleagues and F3) Suspected consequences for science (see S2-Table1c).
The EFA for the ERI construct resulted in 5 factors (see S2-Appendix: S2-Table2a): one factor representing the perceived effort put into work (F3), one representing the perceived overcommitment to work (F5) and three representing the perception of being rewarded for one's work (F1: job situation; F2: salary; F4: receiving praise/respect). The Cronbach's alphas are 0.691 for the effort construct, 0.780 for the overcommitment construct and 0.805 for the combined reward construct (S2-Table2b-d): acceptable for the former two and good for the latter.
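The factor-retention logic behind such an EFA can be sketched with a plain eigendecomposition: components of the item correlation matrix whose eigenvalues exceed 1 (the Kaiser criterion) are kept as factors. The toy example below recovers two factors from simulated items; the paper's CATPCA with Promax rotation involves additional steps (optimal scaling of ordinal data, oblique rotation) not shown here.

```python
import numpy as np

# Simulate four items loading on two latent factors, then extract
# factors from the item correlation matrix via eigendecomposition.
rng = np.random.default_rng(1)
latent = rng.normal(size=(300, 2))
loadings = np.array([[0.8, 0.0],
                     [0.7, 0.1],
                     [0.1, 0.8],
                     [0.0, 0.7]])
items = latent @ loadings.T + 0.4 * rng.normal(size=(300, 4))

corr = np.corrcoef(items, rowvar=False)
eigvals = np.linalg.eigvalsh(corr)[::-1]   # eigenvalues, descending
n_factors = int((eigvals > 1.0).sum())     # Kaiser criterion: keep > 1
```

With two well-separated item clusters, two eigenvalues land well above 1 and the rest well below, so the criterion retains two factors.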

Confirmatory Factor Analyses
We subsequently ran CFAs for all independent variables. The results are presented in S2-Appendix: S2-Table3. This table includes the model fit indices CFI and TLI, for which values >0.9 indicate a good fit, and RMSEA, for which <0.05 denotes a good fit. In addition, the fourth column reports the Chi-square values for the difference between the models with and without significant covariation between the indicators measuring the respective independent variable. All independent variable constructs show a good fit according to CFI and TLI. According to RMSEA, the fit is good for ERI and acceptable for the other constructs. For each independent variable, the model unsurprisingly improves when significant covariations between indicators are taken into account.
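For quick reference, these cutoffs can be wrapped in a small helper. The <0.08 "acceptable" RMSEA band is the widely used Browne-Cudeck convention, which is our assumption here: the text itself only states the <0.05 "good" cutoff. The input values in the example are hypothetical, chosen to mirror the pattern reported for the constructs (good CFI/TLI, acceptable RMSEA).

```python
def fit_quality(cfi: float, tli: float, rmsea: float) -> dict:
    """Classify model fit indices with the cutoffs used in the text:
    CFI/TLI > 0.9 -> good; RMSEA < 0.05 -> good, < 0.08 -> acceptable
    (the 0.08 band is a common convention, not stated in the text)."""
    band = "good" if rmsea < 0.05 else "acceptable" if rmsea < 0.08 else "poor"
    return {
        "cfi": "good" if cfi > 0.9 else "poor",
        "tli": "good" if tli > 0.9 else "poor",
        "rmsea": band,
    }

# Hypothetical construct with good CFI/TLI and acceptable RMSEA
verdict = fit_quality(cfi=0.95, tli=0.94, rmsea=0.06)
```

Note that the indices need not agree: a model can pass the RMSEA cutoff while falling short on CFI/TLI, which is why all three are reported.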

Structural Equation Model
This section presents the statistically significant main effects from the regression analysis of the whole structural equation model (Figure 1; N=520 after listwise exclusion). The SEM fit is acceptable, with a CFI of 0.801, a TLI of 0.790 and an RMSEA of 0.043, 90% CI (0.042, 0.044).
For the sake of readability, we split the output by independent and dependent variables, resulting in five tables (Table 3a to Table 3e). Table 3a presents the main effects of the control variables regressed onto perceived publication pressure. Being male as opposed to female/non-binary decreases perceived publication pressure by 0.172 points. Astronomers occupying a position other than associate professor tend to feel more publication pressure than full professors. Whether one works at an academic institution or not has no significant effect on publication pressure. However, astronomers working at an institution located in the global North perceive less publication pressure, by 0.33 points. As for the number of published papers, the effects for most categories are not significant; only having published 21-30, 61-70 or more than 100 papers decreases perceived publication pressure compared to having published between one and five papers. As for the distributive justice factors reward and effort (Table 3b), astronomers who perceive publication pressure feel less rewarded (by 0.381 points) for the work they do, while at the same time feeling that they put more effort (by 0.502 points) into their work than astronomers who feel less publication pressure. Males tend to feel that they need to put less effort into their work than female or non-binary astronomers (by 0.256 points). Postdocs feel less rewarded for their work by 0.315 points, while also putting in less effort than full professors by 0.263 points. Associate professors also feel less rewarded compared to full professors, by 0.318 points, but show no significant effect on the effort factor. Neither being an academic astronomer nor being employed in the global North makes a difference to the reward and effort factors.
Astronomers who have published 11-20, 31-40 or 61-70 papers feel more rewarded for their job than those who have published one to five papers. Those who have published more than 100 papers are more likely to feel that they put a lot of effort into their work than the reference category (1-5 papers). The perception of all four kinds of organisational justice measured depends on the perceived publication pressure (Table 3c). The feeling of being treated fairly decreases with increasing publication pressure for all four latent variables (with parameter estimates between 0.273 and 0.475). PhDs and postdocs feel treated more fairly in terms of resource allocation and peer review than full professors. PhDs also perceive more justice when it comes to grant application processes. Being academic or not has no significant effect on any of the four organisational justice perceptions. Being employed in the global North decreases the likelihood of perceiving fairness in terms of peer review and grant applications compared to being employed in the global South. Astronomers who have published 11-20 or more than 100 papers feel treated more fairly in terms of resource allocation than those who have only published 1-5 papers. Those who have published 11-20, 31-40 or more than 100 papers perceive more organisational justice in terms of peer review and grant applications than the reference category (1-5 papers). Additionally, having published more than 90 papers increases the likelihood of perceiving fairness in peer review processes. As for organisational justice in terms of telescope time applications, only having 11-20 as compared to 1-5 publications increases the perception of fairness significantly. Finally, we turn to the results for our dependent variable: the perception of how often misconduct occurs in astronomy (Table 3e).
The parameter estimate of the main effect of publication pressure on the perceived frequency of misconduct is 0.375, which means that increasing perceived publication pressure leads to an increased perception of misconduct. Perceived fairness of telescope time application processes also has a significant effect on the perception of misconduct: a decreasing feeling of fairness increases the perception of misconduct by 0.113 points. Being employed in the global North also increases the perception of misconduct, by 0.212 points. In addition to the main effects on the perceived frequency of misconduct, our model calculates the mediated effects. Let us first attend to the effects of the control variables as mediated by publication pressure. Being male as compared to female/non-binary reduces the perception of misconduct by 0.064 points. Any position other than associate professor increases the perception of misconduct as compared to full professors. Employment at an institution in the global North decreases the perception of misconduct by 0.124 points, mediated through the perception of publication pressure. Having published 21-30 or more than 100 papers also decreases the likelihood of perceiving misconduct.
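These mediated effects follow the standard product-of-paths rule for indirect effects in an SEM: the path from the control variable to publication pressure multiplied by the path from publication pressure to misconduct. The sketch below uses the estimates quoted in the text; the negative signs encode the direction of the described effects, which is our reading of the reported coding.

```python
# Indirect effect = a * b (product of the two path estimates)
b_pressure_to_misconduct = 0.375   # publication pressure -> misconduct

a_north_to_pressure = -0.33        # global North -> less pressure
indirect_north = a_north_to_pressure * b_pressure_to_misconduct
# about -0.124, matching the mediated effect reported in the text

a_male_to_pressure = -0.172        # male -> less pressure
indirect_male = a_male_to_pressure * b_pressure_to_misconduct
# about -0.064, again matching the reported value
```

That the quoted mediated effects (0.124 and 0.064) are recovered from the quoted main effects is a useful internal consistency check on the reported estimates.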

Perceived impact on research quality
Lastly, we analysed the perceived impact of the 18 types of misbehaviour on the three aspects of research quality: the validity of the findings at hand (quality criterion 1; QC1), the resulting paper's ability to convey the research appropriately (quality criterion 2; QC2) and the impact on research diversity (quality criterion 3; QC3). Following Bouter et al. (2016) and Haven et al. (under review), we calculated the perceived impact as the product score of the means of perceived frequency and impact on the respective quality criterion (means are listed in S3-Appendix: S3-Table1). The higher the resulting number, the greater the harm to research. S3-Table2 lists the perceived impact scores in descending order. In addition, Figure 2 visualises the perceived impact on research quality in four quadrants: high frequency and high impact (Q1), low frequency and high impact (Q2), low frequency and low impact (Q3) and high frequency and low impact (Q4). Item 8 ("Propose study questions solely because they are considered a 'hot' topic") and Item 9 ("Not considering a study question because it isn't considered a 'hot' topic, even though it could be important for astronomy") show the highest perceived impact on research quality (14.33 and 12.89, respectively). This impact relates to QC3 ("impact on research diversity"), and both items can be found in Q1. While the high impact of these types of misbehaviour on QC3 (M=3.7 and M=3.79 for Items 8 & 9, respectively) may well be expected, the comparatively high frequencies (M=3.88 and M=3.4, respectively) are an interesting result. Item 18 ("Biased interpretation of data that distorts results") follows, with a perceived impact of 12.47 for QC1 and 12.17 for QC2. As we can see from Figure 2, these high values stem mostly from the comparatively high impacts on the two quality criteria (M=4.04 & M=3.95) rather than from a high frequency (M=3.08), which is just above the mean (2.99).
The occurrence of Item 13 ("Data fabrication and/or falsification") also has a high impact on both quality criteria (M=4.17 for QC1, M=4.01 for QC2). Due to the comparatively low frequency of Item 13 (M=1.95), the perceived impact of this type of misbehaviour corresponds to rank 30 (QC1) and 31 (QC2) out of 38; Item 13 can be found in Q2. Item 10 ("Giving authorship credit to someone who has not contributed substantively to a manuscript"), which is ranked lower than Item 13 (32 for QC2 and 31 for QC1), can be interpreted as the opposite of Item 13 and is hence found in Q4: it has a high occurrence (M=3.73) but a low impact on the quality criteria (M=1.97 for QC1 and M=2.06 for QC2). By contrast, "Denying authorship credit to someone who has contributed substantively to a manuscript" (Item 11) is located in Q3, since it doesn't occur very often (M=2.05) and, when it occurs, it also has a comparatively low impact on QC1 (M=2.53) and QC2 (M=2.67). The perceived impact of this type of misbehaviour is also the lowest ranked.
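The product-score and quadrant logic just described can be reproduced directly from the quoted means. This is our own sketch: the frequency/impact means are taken from the text and paired with each item's most relevant quality criterion, the quadrant cut at the scale midpoint of 3 is our assumption about how Figure 2 is drawn, and products can differ in the second decimal from the table values because the quoted means are rounded.

```python
# item label: (mean perceived frequency, mean impact on the relevant QC)
item_means = {
    "Item 8 (hot-topic selection, QC3)": (3.88, 3.70),
    "Item 13 (fabrication/falsification, QC1)": (1.95, 4.17),
    "Item 10 (gift authorship, QC1)": (3.73, 1.97),
    "Item 11 (denied authorship, QC1)": (2.05, 2.53),
}

def quadrant(freq: float, impact: float, cut: float = 3.0) -> str:
    """Figure-2-style quadrant from frequency/impact around the midpoint."""
    if freq >= cut:
        return "Q1" if impact >= cut else "Q4"
    return "Q2" if impact >= cut else "Q3"

# perceived impact = frequency * impact, plus the item's quadrant
scores = {name: (round(f * i, 2), quadrant(f, i))
          for name, (f, i) in item_means.items()}
# Item 8 lands in Q1, Item 13 in Q2, Item 11 in Q3 and Item 10 in Q4
```

The quadrants recovered for these four items match their placement as described in the text, and the product scores illustrate why a frequent, moderate-impact QRP can outrank a rare, high-impact FFP behaviour.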

Discussion
Building on previous quantitative research on scientific misconduct (Martinson et al., 2005, 2006, 2009, 2010 & 2016; Haven, 2021) and qualitative research on deviant behaviour in astronomy (Heuritsch, 2019; Heuritsch, 2021), we built a structural equation model relating role-associated factors, such as academic position and location of employment, to environmental factors, such as perceived publication pressure and distributive & organisational justice, and to our dependent variables: scientific misconduct and research quality. We found that the location of the institution where an astronomer is employed, in terms of global North versus global South, accounts for about 5% of the variance of observed misconduct. Perceived organisational justice in terms of telescope time application processes explains 3%, and perceived publication pressure nearly 10%, of the variance of observed misconduct.
In addition to publication pressure having a direct effect on scientific misconduct (cf. Haven et al., under review), we worked from the assumption that perceived publication pressure influences the perception of distributive and organisational justice, which our results confirm. An astronomer who perceives publication pressure is more likely to perceive less reward for their work, less organisational justice (in terms of resource allocation, peer review, grant applications and telescope time applications) and a need to put more effort into work. Hence, publication pressure is indeed a key factor in determining how research culture and integrity in astronomy are perceived.
Publication pressure, in turn, is more likely to be perceived by astronomers with academic ranks below full professor (cf. Miller et al., 2011; Tijdink et al., 2014b). Interestingly, astronomers employed at institutions in the global North feel less publication pressure, despite observing more misconduct. Hence, there is some difference between institutions in the global North and the global South which makes astronomers in the South perceive more publication pressure (which in turn increases the likelihood of observing misconduct), yet at the same time suppresses their perception of scientific misconduct. There is a tendency to perceive less publication pressure when one has published more than 1-5 first- or co-authored papers in the last 5 years. However, since the effect is statistically significant for only 3 out of 11 categories, this requires further investigation. Males not only perceive less publication pressure than female/non-binary astronomers, but also feel that they need to put less effort into their work, which is consistent with previous research showing that working conditions are harder for females than for males.
Early-career researchers perceive more organisational justice in terms of resource allocation, peer review and grant allocation than full professors. This may be because early-career researchers still hold a positive opinion of organisational processes, whereas more experience may bring more encounters with unfairness (cf. Heuritsch, 2021).
The frequency of observed misbehaviour shows that severe types of misbehaviour (FFP type), such as data fabrication & falsification (Item 13), concealing results (Item 17) and forms of plagiarism (Items 15 & 16), occur less often than the QRPs, as expected (Martinson et al., 2005; Haven, 2021). Among the most frequently occurring QRPs are making topic selection dependent on whether a topic is or isn't "hot" (Items 8 & 9), questionable authorship practices (Item 10) and insufficient supervision (Item 4). Our findings agree with  who found that data fabrication & falsification (Item 13) is believed to be the biggest threat to the validity of the findings at hand (QC1) and the communication value of the resulting paper (QC2). In comparison, plagiarism (an FFP type misbehaviour) ranks very low on impact on QC1 and QC2. This makes sense since, as Haven et al. (under review) point out, "plagiarism fails to connect the knowledge to its proper origin, but it need not distort scientific knowledge per se", whereas falsification & fabrication do. Our perceived impact ranking (the product of the mean scores of perceived frequency and impact on research quality) also agrees with the findings of Bouter et al. (2016). It suggests that it is not outright fraud which dominates the negative impact on knowledge creation, but rather behaviour which cuts corners on the way to publishable output (Items 8, 9 & 5). We conclude that many a little makes a mickle: the epistemic harm done to research in astronomy by QRPs seems to be greater than that done by FFPs, which agrees with findings by Haven (2021) and Bouter et al. (2016).

Strengths/ Limitations
Surveys and data analysis come with strengths and limitations. Let us first turn to the strengths of this study. First, we sampled astronomers from all over the world. This paper gives a snapshot of the international cultural climate in astronomy and its impact on scientific misconduct and quality, while previous studies on this topic mainly focused on universities in the US or the Netherlands (e.g. Wells et al., 2014; Haven et al., 2019a). The second strong point of this survey is that the types of misconduct we chose are not only based on previous literature on this topic (e.g. Martinson et al., 2005, 2006, 2009, 2010 & 2016; Haven, 2021), but also on qualitative research on deviant behaviour in the field of astronomy (Heuritsch, 2021). As Hesselmann (2014; p.61f.) points out, "the meaning of misbehaviour is permanently shifting" and therefore, "measuring scientific misconduct quantitatively should not be first on our research agenda." We believe that the qualitative study by Heuritsch (2021) gave us solid ground to tailor the quantitative study of misbehaviour to the field of astronomy. Third, our analysis is the first in this literature relating publication pressure and distributive & organisational justice to scientific misconduct that uses structural equation modelling, allowing the model to be estimated in its whole complexity. Fourth, it is also the first study in this literature to operationalise research quality. We therefore measure the impact of scientific misconduct on scientific quality instead of implying that relationship through the concept of research integrity.
Our study also comes with several limitations. First, while our response rate was acceptable (25%), our completion rate lies at around 14%, which may be considered relatively low but is comparable to that of similar web-based surveys (e.g. Haven et al., 2019a). The reason for this drop may be that the survey was considered long, as evident from some respondents' feedback; it took 30-60 minutes to complete. Second, due to the length of the survey, we may need to consider a response bias towards those who feel publication pressure and feel treated unfairly, and may hence be more enthusiastic about voicing their opinion on this topic. On the one hand, this may overestimate the effects of publication pressure and organisational injustice on misconduct. On the other hand, those who left the field of astronomy as a result of publication pressure and injustice are of course not sampled, so publication pressure and organisational injustice may be underestimated through survivor bias (Kurtz & Henneken, 2017). Third, respondents criticised that there was no "NA" option for the misbehaviour experience items. While respondents didn't have to choose an answer to move forward in the survey, this may have resulted in an underestimation of the occurrence of misconduct, as astronomers with little experience in the field may have clicked the lowest or the middle answer category while preferring an NA option. Because self-reports may result in underreporting of misbehaviour (Hesselmann, 2014), we chose to ask about general experience with the types of misbehaviour. We therefore expect that the underreporting of one's own misconduct mitigates the overreporting of misconduct by others (Haven et al., under review). Fourth, several filter questions, such as whether one has applied for grants or telescope time, resulted in a comparatively small sample (N=520) for the SEM analysis because of listwise exclusion.
Fifth, at the time we designed the survey we had no knowledge of the revised PPQ (2019b), which we would have used instead of the PPQ and which might have yielded better construct validity without our having to adapt the construct for further analysis. Sixth, while our theoretical aim was to conduct a census of all astronomers worldwide, this was practically impossible due to time, budget and resource constraints. Our three-stage sampling design aimed at completeness in identifying astronomical institutions worldwide and at reaching as many astronomers as possible. However, we cannot expect that our list is indeed complete, that our contacts reached all astronomers at the respective institutes, or that our sample is random. Representativeness may therefore be limited. To improve this, further testing of item batteries may also inform weighting to adjust the sample proportions to the population proportions.

Conclusion, Implications and Outlook for Further Research
The aim of this research was to study the impact of perceptions of publication pressure and distributive & organisational justice on the observed occurrence of scientific misconduct, and the impact that certain types of misconduct have on research quality in astronomy. While we did not find statistically significant effects of perceived distributive & organisational justice of the four processes -resource allocation, peer review, grant application and telescope time application -on research misbehaviour, we strongly emphasise that publication pressure is part of research culture. As outlined by Heuritsch (2021), institutional norms define what is seen as a good researcher, and publication rate is one of the key indicators used to measure the performance of an astronomer. Arguably, playing the indicator game is an innovative path to success, so we worked from the assumption that research (mis-)behaviour is reflexive, insofar as it depends on how one's performance is evaluated. We found that publication pressure explains 10% of the variance of the occurrence of misconduct and between 7 and 13% of the variance of the perception of distributive & organisational justice as well as overcommitment to work. We subsequently analysed the impact of the individual types of misbehaviour on three aspects of research quality. We agree with the findings of previous studies (e.g. Haven, 2021; Bouter et al., 2016) that the epistemic harm of QRPs should not be underestimated.
We conclude that there is a need for a policy change. In the distribution of institutional rewards, grants and telescope time, less attention to metrics (such as publication rate) would foster better scientific conduct and hence research quality. Publication pressure could also be reduced by reconsidering what is considered publishable. Since, for example, negative results cannot easily be published (Heuritsch, 2021), a lot of scientific work may not be recognised as valuable research. Future studies could be devoted to exploring potentially more innovative ways of creating quality output which may count towards one's performance. This requires reflecting and working on the structural conditions that comprise the norms and culture of research in astronomy. After all, these constitute the external components of an actor's situation and are therefore highly relevant to the individual's actions. Future quantitative studies may complete the rational choice picture by attending to the internal component of astronomers' research situation. For example, one could study their motivation to do research and to publish, and relate this to the importance of publications in the field.

S1-Appendix: Survey Questions

S1-Table1a: First item-battery for the DVs Scientific Misbehaviour and Research Quality.
Instruction: "Here we present 9 different forms of research behaviour. For each item, please answer the same 3 questions for the general situation in research in Astronomy as you experience it. Items 8 & 9 contain 4 questions. We are interested in your personal views and opinions. These may be based on direct experience, stories from colleagues and/or knowledge of the literature on research behaviour. Please remember that answering honestly about your personal experiences is vital for this study. Your answers are completely anonymous."
Questions for each item:
1 - How frequently does this form of research behaviour happen in Astronomy?
2 - If it occurs, how impactful is it on the validity of the findings of the study at hand?
3 - If it occurs, how impactful is it on the resulting paper's ability to convey the research appropriately?
4 - If it occurs, how impactful is it on ensuring that a diverse set of research questions are studied in Astronomy?
For each question the response scale ranged from 1=Very Low to 5=Very High.

Items:
Items were adapted to the context of astronomy based on Martinson et al. (2005, 2006, 2009, 2010) and Bouter et al. (2016). ° denotes questions added by the authors.
Not considering a study question because it isn't considered a 'hot' topic, even though it could be important for astronomy°

S1-Table1b: Second item-battery for the DVs Scientific Misbehaviour and Research Quality.
Instruction: "Here we present 9 different forms of research misbehaviour. For each item, please answer the same 3 questions for the general situation in research in Astronomy as you experience it. We are interested in your personal views and opinions. These may be based on direct experience, stories from colleagues and/or knowledge of the literature on research misbehaviour. Please remember that answering honestly about your personal experiences is vital for this study. Your answers are completely anonymous."
Questions for each item:
1 - How frequently does this form of research behaviour happen in Astronomy?
2 - If it occurs, how impactful is it on the validity of the findings of the study at hand?
3 - If it occurs, how impactful is it on the resulting paper's ability to convey the research appropriately?
For each question the response scale ranged from 1=Very Low to 5=Very High.

S1-Table2: Item-battery for the IV Perceived Publication Pressure.
Instruction: "Please indicate to what extent you agree/ disagree with the following statements:" Items: Items were adapted to the context of astronomy based on Tijdink et al. (2014a). ° denotes questions added by the authors.
For each question the response scale ranged from 1=Strongly Disagree to 5=Strongly Agree and included the option "NA". * denotes reverse coded items.
Publication of scientific articles is the most important aspect of my work
How often do you feel pressure to publish?° [This question's response scale was "Never", "Very rarely", "Rarely", "Regularly", "Often" and "Very often"; "Never" and "Very rarely" were recoded into one category for the analysis]

S1-Table3: Item-battery for the IV Perceived Distributive Justice.

Instruction: "To what extent do you agree/ disagree with the following statements?"
Items were adapted to the context of astronomy based on Siegrist et al. (2014). ° denotes questions added by the authors. For each question the response scale ranged from 1=Strongly Disagree to 5=Strongly Agree. * denotes reverse coded items.

Items were adapted to the context of astronomy based on Martinson et al. (2006). ° denotes questions added by the authors.

Effort
For each question the response scale ranged from 1=Strongly Disagree to 5=Strongly Agree. * denotes reverse coded items.
Items:
You had an influence in these decisions
You (would) have been able to appeal these decisions

S1-Table4b: Item-battery for the IV Perceived Organisational Justice in terms of Peer Review.

Instruction: "The following items refer to the peer review of your most recent manuscript submitted for publication. When you think of the review, consider the overall quality and the review process. To what extent do you agree that:"
Items were adapted to the context of astronomy based on Martinson et al. (2006). ° denotes questions added by the authors. For each question the response scale ranged from 1=Strongly Disagree to 5=Strongly Agree. * denotes reverse coded items.
Items:
The reviews were appropriate relative to …
... the effort you put into the manuscript
... the quality of the manuscript
I experience the reviews as fair
In my opinion, the acceptance or refusal of a manuscript is mostly based on improper preferential treatment°*
In my opinion, the acceptance or refusal of a manuscript is mostly based on luck°*
The review process was typical of reviews you have received in the past 3 years

S1-Table4c: Item-battery for the IV Perceived Organisational Justice in terms of Grant Application.

Instruction: "The following items refer to the review of your most recent extramural grant application. When you think of the review, consider the overall quality of the review and the review process. To what extent do you agree that:"
Items were adapted to the context of astronomy based on Martinson et al. (2006). ° denotes questions added by the authors. For each question the response scale ranged from 1=Strongly Disagree to 5=Strongly Agree. * denotes reverse coded items.
Item: The review process was typical of reviews you have received in the past 3 years

S1-Table4d: Item-battery for the IV Perceived Organisational Justice in terms of Telescope Time Application.

Instruction: "The following items refer to the review of your latest application for telescope time. When you think of the review, consider the overall quality of the review and the review process. To what extent do you agree that:"
Items were adapted to the context of astronomy based on Martinson et al. (2006). ° denotes questions added by the authors. For each question the response scale ranged from 1=Strongly Disagree to 5=Strongly Agree. * denotes reverse coded items.
Item: The review process was typical of reviews you have received in the past 3 years

S1-Table5: Item-battery for the IV Perceived Overcommitment.

Instruction: "To what extent do you agree/ disagree with the following statements?"
Items were adapted to the context of astronomy based on Siegrist et al. (2014). For each question the response scale ranged from 1=Strongly Disagree to 5=Strongly Agree. * denotes reverse coded items.
Items:
I get easily overwhelmed by time pressures at work
As soon as I get up in the morning I start thinking about work
After I finish my work day, I can easily relax and 'switch off' work*
People close to me say I sacrifice too much for my job
Work rarely lets me go, it is still on my mind when I go to bed
If I postpone something at work that I was supposed to do today I'll have trouble sleeping at night

S2-Table1a: PPQ Cronbach Alphas before removal of items
This table displays the SPSS output of the Cronbach's alpha analysis of the PPQ, consisting of 22 items (see S1-Appendix: S1-Table2). * denotes items that we subsequently removed.
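The Cronbach's alpha values reported here can be reproduced outside SPSS from the standard formula, alpha = k/(k−1) · (1 − Σ item variances / variance of the total score). A minimal sketch in Python, using hypothetical 5-point Likert responses rather than the actual survey data:

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) response matrix.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))
    """
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the summed scale
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Hypothetical data: 100 respondents, 4 correlated Likert items (1-5).
# In the real analysis, reverse-coded items (*) would first be recoded as 6 - x.
rng = np.random.default_rng(0)
base = rng.integers(1, 6, size=(100, 1))      # shared "trait" component
noise = rng.integers(-1, 2, size=(100, 4))    # per-item deviation
data = np.clip(base + noise, 1, 5)

alpha = cronbach_alpha(data)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Because the items share a common component, alpha comes out well above zero; with perfectly identical items the formula yields exactly 1. This is only an illustration of the coefficient, not a re-analysis of the PPQ.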