Open Science Knowledge Production: Addressing Epistemological Challenges and Ethical Implications

Abstract: Open Science (OS) is envisioned to have a wide range of benefits including being more transparent, shared, accessible, and collaboratively developed than traditional science. Despite great enthusiasm, there are also several challenges with OS. In order to ensure that OS obtains its benefits, these challenges need to be addressed. Accordingly, the objective of this study is to provide an overview of one type of challenge, i.e., epistemological challenges with OS knowledge production, and their ethical implications. Based on a literature review, it (a) reveals factors undermining the envisioned benefits of OS, (b) identifies negative effects on knowledge production, and (c) exposes epistemological challenges with the various phases of the OS process. The main epistemic challenges are related to governance, framing, looping effects, proper data procurement, validation, replication, bias, and polarization. The ethical implications are injustice, reduced benefit (efficiency), increased harm (as a consequence of poor-quality science), deception and manipulation (reduced autonomy), and lack of trustworthiness. Accordingly, to obtain the envisioned benefits of OS, we need to address these epistemological challenges and their ethical implications.


Introduction
According to UNESCO's Recommendation on Open Science, open science (OS) is defined as "an inclusive construct that combines various movements and practices aiming to make multilingual scientific knowledge openly available, accessible and reusable for everyone, to increase scientific collaborations and sharing of information for the benefits of science and society, and to open the processes of scientific knowledge creation, evaluation and communication to societal actors beyond the traditional scientific community" [1]. Correspondingly, OS has four key values: Quality and integrity; collective benefit; equity and fairness; and diversity and inclusiveness [2]. Moreover, it has four key characteristics: OS is transparent, shared, accessible, and collaboratively developed. Furthermore, OS has a wide range of practices, such as "open code, open data, open access, data-intense, alternative reputation systems, open notebooks, open lab books, science blogs, collaborative bibliographies, citizen science, open peer review, or pre-registration" [3].
While there obviously are many advantages to collaboratively developed, transparent, shared, and accessible knowledge, there are also several epistemological challenges with OS knowledge production, which have ethical implications. These deserve careful scrutiny and attention as they may undermine the goals and shift the risk-potential-benefit balance for OS. Accordingly, there has been a demand for ethics of open science [4]. Therefore, the objective of this study is to provide an overview of the epistemological challenges with OS and their ethical implications.
Before embarking on the task, I will clarify some key concepts. As this study focuses on the challenges of OS, I will also initially acknowledge the envisioned benefits of OS. This is partly because the identified challenges may undermine these benefits, and partly because addressing the challenges is crucial to ensure that the envisioned benefits occur.

Acknowledging the Envisioned Benefits of OS
Very broadly, OS is envisioned to increase scientific efficiency as more people can access research data, analyses, and results. Accordingly, OS contributes to reaching the productivity goal of science [7]. OS is thereby expected to democratize knowledge (the democratization goal) [7]. Relatedly, it is predicted to promote open culture and solidarity through fostering an attitude of sharing [8], and thereby equitize and remedy global historical injustices in scientific knowledge production [9,10]. By this, OS is envisioned to contribute to reaching the global cognitive justice goal of science. By facilitating the active participation of previously excluded groups and promoting the co-production of knowledge, OS aims to address sustainable development goals [11]. In particular, OS has been envisioned to produce locally relevant knowledge, particularly for developing countries, and to be a driver of innovation [12,13] and socioeconomic growth [10,11,14]. Hence, OS may redirect and redesign the goals of science.
Moreover, by introducing new impact metrics, OS is predicted to undermine the pressure to publish, and thus potentially improve the quality of knowledge production [15], e.g., by reducing downsides and artifacts of publication pressure, such as salami and imalas publication [16]. OS is also envisioned to increase the efficiency of knowledge production and dissemination: "For science to effectively function, and for society to reap the full benefits from scientific endeavours, it is crucial that science data be made open" [17].
Hence, OS has been understood as "improving quality, integrity, citability, reward systems" as well as "quality assurance, enhancement and management" [11]. The general idea is that openness and transparency will foster integrity and quality, e.g., as it makes data available for verification and reproducibility checks. Moreover, OS can improve the quality of peer review through openness about reviewers and by multiplying the number of comments and perspectives addressing publications or posts [18,19]. In particular, it is argued that "[o]pen science broadens the research communication spectrum via the use of preprints and data repository servers, so journals are being moved from their traditional role to become part of an interconnected complex system of information sources and communication vehicles." [18]. This can, in turn, have epistemic benefits, such as data reuse, increased reproducibility, and new and better research results.
Hence, there are many envisioned advantages for knowledge production with OS. Table 1 provides an overview of the most frequently mentioned assets. The point here has not been to provide an in-depth analysis of these advantages, but only to present an overview of the elements to be discussed below.

Methods
To provide an overview of the epistemological challenges and related ethical implications, a basic literature search was performed in Google Scholar with the search string "("open science" AND epistem* AND ethic* AND (problem* OR challeng* OR implication OR issue))." The aim of the search was not to obtain an exhaustive overview of the literature but to identify the main challenges and issues. As such, it follows a method for identifying ethical issues in Health Technology Assessment (HTA) [20][21][22]. Hence, only articles discussing epistemological challenges and ethical implications were included. Articles only mentioning that such challenges and implications exist without any substantial analysis were excluded, as were articles mentioning epistemological challenges and ethical implications already identified but not adding new content.
The search, performed on February 25, 2022, identified 315 references, which were screened by title and abstract. In total, 73 articles were screened in full text and 35 articles were included. Additionally, 12 articles were identified through snowballing.
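The screening flow reported above can be tallied in a few lines. This is only an illustrative sketch: the four counts are those stated in the text, whereas the variable names and the assumption that the 12 snowballed articles were added to the 35 included ones are introduced here for illustration and are not stated in the source.

```python
# Counts reported in the Methods section.
identified = 315   # references returned by the Google Scholar search
full_text = 73     # articles screened in full text
included = 35      # articles included after full-text screening
snowballed = 12    # additional articles identified through snowballing

# Derived figures. Assumption (not stated in the text): the snowballed
# articles are simply added to the included set.
excluded_on_title_abstract = identified - full_text
excluded_on_full_text = full_text - included
total_analyzed = included + snowballed

print(excluded_on_title_abstract, excluded_on_full_text, total_analyzed)
```

Under these assumptions, most references are excluded at the title and abstract stage, which is consistent with a search that aims at identifying the main issues rather than at exhaustiveness.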
The content of the included articles was analyzed with simple directed content analysis [23] directed at addressing two basic questions:

1. What are the epistemological challenges with OS knowledge production identified in the literature?
2. What are the ethical implications of these challenges?
To identify sources of distortion of knowledge production and their ethical implications in more detail, I will investigate the different phases of OS with respect to whether the distortion is detectable, intentional, and manageable (actionability), and on which level (individual, corporate) it appears.
The analysis will follow a simple utilitarian structure, focusing first on (lack of) benefits and then on harms. Thereafter epistemological challenges and ethical implications in the various phases of OS processes will be analyzed.

Epistemological Challenges with OS
The identified challenges can be discussed under three aspects of the impact of OS: (1) Envisioned benefits that do not occur, (2) negative effects on knowledge production, and (3) significant problems specific to open science that occur in discrete phases of the research process: Data procurement (collecting, producing, clarifying, rinsing data), analyzing data, applying algorithms or models, model parameter management (estimating, deliberating on, selecting), producing raw results, interpreting results, synthesizing results, presenting results, envisioning implications, and documenting (articles, reports, synopsis, reviews).

Concerns with Benefits That Do Not Occur
With reference to the envisioned benefits mentioned in the introduction (see Section 1.2 and Table 1), it has been pointed out that OS still has to document its outcomes on the effectiveness and validity of research [24]. Moreover, OS may not be able to remedy the historical injustices in knowledge production. Previously excluded groups may not be able to participate in the co-production of knowledge, and locally relevant knowledge may not be produced, either locally or elsewhere [25]. Moreover, OS may not be the expected driver for innovation and socioeconomic growth. The reason for this may be that the OS infrastructure embeds and promotes power imbalances as its complex infrastructure poses material and symbolic barriers to participation and representation, favoring "Anglospeaking male researchers in North America and Europe" [11]. Open-access publishing is a good example of this. In short, OS may buttress existing epistemic structures and dominances [8] as well as undermine "epistemic justice or cognitive justice" [9].
Yet another reason may be that OS reinforces existing epistemic organization and governance [26], e.g., by being shaped, governed, and reproduced by the institutions, countries, and agencies that govern existing knowledge production [11]. For example, the rhetoric of openness in OS can be used for promoting specific economic competition agendas or for a particular technological and cultural change in scientific knowledge production [11]. Another related problem is epistemic framing [9] or "groupthink," i.e., when formal or informal leaders in collaborative OS research dominate the various phases of knowledge production [27]. A related framing effect may also result from OS infrastructure shaping its content. It has been argued that infrastructures need to be put in place in order to produce OS, and that these will shape what knowledge is considered legible, legitimate, worthy of visibility, and eligible for quality control [11,28]. While all infrastructures (OS or not) may influence their content, OS may have discrete framing effects that need special attention.
Similar concerns have been flagged with respect to research collaboration as expressed in the following quotation: "The interaction between actors with different interests, worldviews, and epistemic structures implies conflicts and negotiations of distinct-often divergent-agendas, expressing asymmetric relations of power. There are significant asymmetries between those who can make use of open knowledge and collaborative practices in their interests, and those who contribute to the common knowledge but do not benefit from it. Therefore, although collaboration is a crucial part of knowledge production, it begs the question: Collaborate for what purpose, with whom, and under what terms?" [9]. Hence, collaboration may reinforce existing framings or power structures. Only when OS bypasses or disrupts suppressive structures can it obtain its collaborative and democratizing goals.
Moreover, OS may not undermine or reduce the pressure to publish, as envisioned. As long as research outcomes are measured and evaluated quantitatively, new types of OS bibliometric measures may emerge and hamper the intended quality improvements.
Qualitative measures, such as narrative CVs, may reduce the problem [29][30][31]. However, whether they will avoid it remains to be seen.
Correspondingly, OS may not increase the quality of research or the efficiency of knowledge production, as expected. One reason for this can be the overrepresentation of knowledge produced by already-dominant groups [11,32]. Another is "missing data" and "data shadows" [33] undermining the benefit of the data analysis. Scientists may selectively share data, biasing the access to data, and "perpetuating the system" [34]. This comes together with what has been called "platform capitalism in science" [26] where the economy of platform owners directs the access and application of data, e.g., by draining value from scientific knowledge production without contributing significant value to the scientific community. Hence, the deliberations on what data become open (and how they can be used) may undermine the laudable goals of OS. Close analyses of what types of data are provided in OS (compared to what is relevant for addressing pressing research questions) may be of great help. The same goes for measures to provide data filtering or selection in OS.
Others have discussed how knowledge-related paradigms, values, and norms are embedded in policies to exert power [11]. The policies behind OS are not neutral, but evoke key actors' profound values, beliefs, and paradigmatic assumptions to promote their interests and institutional demands [11,35].
While OS is envisioned to spur new science assessments and metrics [36], there are reasons to consider looping effects, i.e., the interaction between a classification and the persons (or kinds) that are classified [37]. Scientists may adapt to new metrics in manners that distort fair accreditation and undermine scientific research quality. This is, of course, a general problem in science metrics that does not vanish with OS. However, as it may be continued and enhanced in OS, it is crucial that we pay careful attention to it. As pointed out above, alternative metrics may reduce adaptation, but they may not eliminate the looping effect. New and genuine assessment and accreditation problems may occur.
While increased access to information, knowledge, and evidence is an obvious advantage, epistemic objections have been made. The first and easily accessible research results may be of low quality and even fake, as quality-assured and validated evidence takes much more time, resulting in OS bias or an OS divide. Additionally, the process of making data openly available (e.g., by anonymization) may significantly reduce the value and usefulness of the data [9]. However, it is an empirical question how big these problems will be in practice and whether data with reduced applicability (e.g., due to anonymization) will be better than no data at all.
While alternative peer-review processes, such as (crowdsourced) post-publication peer review and accelerated virtual witnessing, have been thought to improve the validity of scientific knowledge, problems such as rushed circulation of groundless accusations of scientific misconduct, lack of transparency in anonymous online commenting, and lack of inclusiveness have been revealed [38]. The task is to ascertain the good side of the review process (e.g., improving the quality of published scientific research) while avoiding "review trolling".
Moreover, it has been argued that OS will cause the public to misunderstand scientific information [39] and that "open science can make citizens generate a false belief" [40]. It can be argued that this is a general problem and that we need to improve science communication [39] and not OS. As pointed out by Cribb and Sari, "Science is by its nature complicated, making it all the more important that good science writing should be simple, clean and clear" [27]. However, openness and accessibility may enhance the bias towards the simple compared to the complicated.
Yet another related challenge is that open disclosure of knowledge can be used for bad purposes (or dual-use), such as knowledge about the H5N1 virus [27,39]. Accordingly, "[t]he more scientific information is known, the more opportunities exist for the information to be abused" [40]. There are valid arguments that some types of data and information should be kept secret [41][42][43]. Accordingly, OS may not become completely epistemically open, democratic, and just, which has ethical implications for those who produce and use the knowledge.
As the envisioned epistemic advantages have ethical benefits, any shortfalls of such improvements have corresponding ethical implications. Table 2 provides a summary of concerns with benefits that do not occur and what we can do to address them. What then can we do to address these epistemological challenges and their ethical implications? As suggested in Table 2, we may need to change the epistemic governance structure, as the traditional structures related to access and use of data are bypassed. Moreover, we need to actively adapt the incentive systems and funding requirements, to avoid biased knowledge production and assure equity and justice, also from a global perspective. Accordingly, we need to make adaptive adjustments to the impact metrics and peer review system, as openness does not assure content quality and traditional quality control mechanisms are sidestepped. Connecting money to the metrics is but one way of acknowledging the importance of recognition and reward, but it may reinforce socioeconomic structures. Improving infrastructure and facilitating the contributions and access of the under-privileged is crucial, as is countering framing effects and compensation mechanisms to address skewed infrastructure effects. How to address the epistemological challenges and ethical implications in practice of course needs to be tailored to the specific OS context. This is well beyond the objective and scope of this article and is the task of many subsequent and specific studies. The point here is to provide an overview, which hopefully can be helpful for such studies. Nonetheless, the specific solutions may do well in following such general suggestions [1].

Concerns with Negative Effects
In addition to problems with obtaining envisioned benefits, OS may also have some negative effects on knowledge production. Such epistemological challenges need to be addressed and mitigated to obtain maximal outcomes of OS.
While OS has been envisioned to reduce problems with reproducibility, as data are open and can more easily be verified [44], it has been argued that it is important to differentiate between reproducibility (consistent results from specific data), robustness, and replicability (consistent results across different studies) and that "open data initiatives can promote reproducibility and robustness but do little to promote replicability" [45]. Moreover, the replicability problem is even more pronounced in OS artificial intelligence (AI) and machine learning (ML) [46] due to the lack of transparency and "black box problems." If it is (empirically) correct that replicability problems may increase while reproducibility improves, we should pay more attention to and incentivize replication. The most obvious incentive is inherent in OS: Access to more data.
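The distinction between reproducibility and replicability can be illustrated with a small simulation. This is a minimal sketch under stated assumptions and not a method taken from the cited literature: the data are simulated, the "analysis" is simply a sample mean, and the function names, seeds, sample size, and effect size are all hypothetical choices made for illustration.

```python
import random
import statistics


def draw_sample(seed, n=30, true_mean=0.3):
    """Simulate the data set of one study (hypothetical data, no real study)."""
    rng = random.Random(seed)
    return [rng.gauss(true_mean, 1.0) for _ in range(n)]


def analyze(data):
    """Stand-in for a study's analysis: here simply the sample mean."""
    return statistics.mean(data)


# Reproducibility: the same (open) data and the same analysis
# yield exactly the same result.
original_data = draw_sample(seed=1)
original_result = analyze(original_data)
reproduced_result = analyze(list(original_data))

# Replicability: a new study collects new data, so the estimate
# may differ even though the analysis is identical.
replication_result = analyze(draw_sample(seed=2))

print("reproduced:", original_result == reproduced_result)
print("original vs. replication:", original_result, replication_result)
```

In this sketch, sharing the original data guarantees the first comparison, but it says nothing about the second, which depends on collecting new data.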
These epistemological challenges relate to the ethical issues of trustworthiness and trust [47]. Increasing efficiency in knowledge production is of little value if the research results cannot be verified or trusted. Whether and how such negative effects occur (for various kinds and phases of OS) is of course an empirical question. Nonetheless, being aware of these effects can help in avoiding them and ensuring the positive impact of OS. Moreover, adhering to strict validation procedures and fostering research integrity may be important.
Another identified challenge is that easy access to data may incentivize poor-quality research. See also Section 3.1. It is documented that researchers who are strongly incentivized to publish are drawn to study designs with low statistical power [48]. Moreover, easy access can spur unqualified use, as contextual constraints and quality caveats, for example, may be ignored. Again, it is an empirical question whether easy access spurs low-quality research, but it calls attention to the danger that easy access fosters easy designs, which foster low-quality evidence. New measures for data quality assurance may be needed as part of OS data use.
Moreover, OS may shift the perspective of research ethics and integrity: "From the point of view of open science, the ethical dimension takes on new formats and reaches different levels and ranges. It concerns the ethical commitment to making the research work and its results immediately available for use and remix by others, whereas codes of integrity and ethics in research adopted by scientific and teaching institutions have mostly focused on the combat of plagiarism" [49]. While this appears to be a gross oversimplification, the point is that research ethics institutions and research integrity infrastructure may not be appropriately devised to address the ethical (and epistemological) issues in OS. Hence, we should pay attention to how the systems for research ethics and integrity are prepared to regulate OS. In particular, we must make sure that we have scientific norms against the misuse of OS and that measures against free riders are sufficiently developed.
Another challenge is handling intellectual interests and intellectual property rights as OS threatens to undermine "cognitive capitalism," which is explained to be a "parasitical and rentier exploitation of collective production, offering the conditions for its reproduction as in free platforms of access to digital networks. At the same time, it spoils this very dynamics of value creation with the toughening of mechanisms for protecting intellectual property" [49]. Accordingly, obtaining epistemic efficiency and promoting pluralism appears to be a substantial challenge: "This should be the greatest ethical challenge of open science: the dialogue with the other, the building of bridges and mutual fertilisation in the diversity of knowledge" [49]. In practice, avoiding appropriation and unwarranted profiting from OS is identified as an important task. Accordingly, we may need to adjust assessment and accreditation systems.
Yet another challenge is that OS knowledge production depends on specific types of institutions and organizations [50]. Some have described the demand for a "new institutionality," i.e., that OS requires or enforces new institutional formats as well as new normative and legal frameworks for the production, circulation, appropriation, evaluation, and use of scientific knowledge [49]. Correspondingly, OS may foster new epistemic cultures, with emerging local knowledge-producing practices [51]. However, it is pointed out that without institutions verifying the quality and validity of its knowledge, OS may come to promote epistemic pluralism [52] and polarized research [53]. While the validation problem is not unique to OS, it may become more pressing, enhanced, or find new expressions in OS, and thus is important for maintaining the trustworthiness of science. Hence, OS may contribute substantially to scientific development by refining existing or elaborating on new validation methods.
While OS can be placed in the tradition of the community of inquiry in pragmatic philosophy (with reference to Peirce and Dewey), it faces topical challenges of sloppy science, conspiracy theories, and post-truth phenomena [54]. Accordingly, it is argued that resilience and best practices are needed, as well as post-normal literacy [54]. While quality and validity problems with temporal knowledge production appear to be a general problem (independent of OS), they may be enhanced or directed by OS. In any case, this indicates that we need to focus on the validity and quality of knowledge production and the institutions that contribute to this, with a special focus on the aspects influenced by OS.
OS can also change the contract and interplay between science and society, e.g., by power structure and distortions, but also by involvement and co-evolution [55]. It can imply a "constitutional recalibration" in terms of "the recalibration of old agreements concerning the autonomy of science and its relations with its societal environment, in particular politics, law, and the economy" [56]. Moreover, it is argued that OS promotes a shift in knowledge production from industrial "productionist metaphysics" to a "post-industrial mode of consumption as use, reuse, and modification" [57]. Within science itself, OS can alter the social process, such as the peer review and publishing process [58], as well as the gift relationship and reciprocity within the scientific society [59].
Moreover, problems may occur in collaboration between researchers with different attitudes to openness and OS [60]. There are many conceptions of openness in OS, which may cause practical, regulatory, and epistemological challenges, e.g., between ideological, legal, technical, and operational openness. Accordingly, it is argued that it is important to ascertain coherence between the different conceptions of openness [61]. Good communication and clear premises for collaboration may contribute to this.
Additionally, it has been pointed out that "[d]espite ethical codes urging researchers to focus on participant needs and potential harms, the OS movement focuses on the benefits of transparency and efficiency for researchers, rarely mentioning associated participant risk" [62]. In particular, it is pointed out that there is a lack of sensitivity to the "potential for harm for marginalized participants, communities, and researchers" and that OS conveys and reinforces "cultural values of its creators (competition, capitalism)" [62]. The important point is to take into account unforeseen and reinforcing side effects of well-intended measures.
While it is envisioned that OS can contribute to open contact and fruitful influence of different theoretical perspectives, OS may also enhance groupthink, polarization, and theoretical "myopia and sclerosis." For example, OS is predicted to promote "open theorizing," i.e., "when loosely coordinated researchers realize they can draw on one another's empirical, methodological, or theoretical material to develop theoretical contributions", and in technical terms, that "enactment of the social epistemological principles of free criticism and diversity fosters the concentration, extension, reinvigoration, and procreation of theoretical vocabularies, which may promote theoretical deepening, expansion, rejuvenation, and generativity, respectively" [63]. At the same time, it is pointed out that open theorizing "may escalate the field's already existing tendencies toward theoretical myopia, dilution, shallowness, and faddishness" due to theoretical narrowmindedness, theoretical branching out, superficial analogies, and "theoretical cul-de-sacs" [63]. Hence, whether OS will contribute to open theory building is an open question that deserves attention (both epistemically and ethically).
Moreover, OS may enhance (or reduce) bias in research, as bias is difficult to detect and address. While the openness and transparency of OS can reveal (and hopefully reduce) biases in research, it may also create or enhance them, e.g., in parts of OS that are hidden (such as data procurement, see below). Guidelines, such as the Transparency and Openness Promotion (TOP) Guidelines for the reporting of scientific research, may be used, adapted, or developed to detect and avoid bias. On a general level, it has been argued that traditional single-level epistemic approaches are inadequate. Virtue theorists focus on individuals, paternalists on environments, and collectivists on groups. Accordingly, it is argued that we need to apply an interactionist approach to integrating individuals, environments, and groups to reduce the distortion of bias in OS [64].
On a more practical level, it has been pointed out that informed consent can be difficult to respect in an OS setting [49,62] as it can be challenging to have the knowledge and information about future data use. The same goes for the return of research results. Hence, OS may also challenge traditional issues in research ethics. Again, this may not be specific to OS, but OS may enhance or transform the challenge, which needs to be addressed to avoid the negative effects of OS.
Hence, OS not only faces ethical implications from not obtaining its epistemic benefits, but raises a series of ethical issues due to epistemological challenges. Table 3 summarizes the potential negative effects.

Epistemological Challenges with OS Processes
While the identified literature discusses many crucial challenges of OS and its ethical implications, the scope of this article does not allow a detailed analysis of each of them. The reader will find the details in the provided references. Below, I will address epistemological challenges related to various phases of OS processes in more detail as this has not been found in the literature. The phases have been identified in the literature and mentioned in the introduction: Data procurement (collecting, producing, clarifying, rinsing data), analyzing data, applying algorithms or models, model parameters management (estimating, deliberating on, selecting), producing raw results, interpreting results, synthesizing results, presenting results, envisioning implications, and documenting (articles, reports, synopsis, reviews).
First, there exist risks of scientific misconduct in various phases of OS knowledge production. While open data facilitates the verification of analyses, algorithms, and models, the entry and quality of the data themselves may not be as easily checked and assured. The reasons for this may be multiple, e.g., that the data user is remote from the data source and has limited information about the context of its retrieval. It may be easy to fabricate or falsify data. Hence, while OS may reduce scientific misconduct in several parts of OS knowledge production, it may not reduce the chance of fabrication, falsification, and plagiarism in data provision and procurement [65]. While this problem certainly is not unique to OS, the point is that we should pay special attention to data procurement and provision in OS.
When it comes to interpreting, synthesizing, and presenting results as well as discussing implications and documenting results, the risk of fabrication, falsification, or plagiarism is no different from that in traditional or "closed" science. One reason for this, it is argued, is that traditional ethical codes of conduct are not well suited for OA [66]. However, when data-processing steps, syntheses, and interpretations are open and accessible, an external review can reveal, and in the long run, come to prevent, scientific misconduct.
Regarding other types of data manipulation, such as fudging, OS does not seem to increase the chances of manipulation in the various phases of OS except for data procurement [65]. Again, the reason appears to be the lack of control in the initial process. Therefore, special attention should be paid to this part of the OS process.
When it comes to verifiability, the various phases of OS knowledge production differ substantially. While data analyses, algorithm application, and the production of raw results can be easy to test and verify, data procurement can be difficult to verify. Correspondingly, it can be difficult to detect errors in this phase. On the other hand, it can be easy to detect errors (or fraud) in data analysis and in the production of raw results [38].
When it comes to intentionality, errors or flaws are more likely to be intentional in data procurement, interpreting and presenting results, and the documentation of scientific research than in other phases.
As shown in the debate on the results from the Open Science Collaboration's Reproducibility Project [67], errors or flaws in OS knowledge production can occur both on the individual and corporate level, depending on the type of research. Obviously, individual errors are more likely to occur in phases of research that are individually performed, such as data analysis and synthesizing results.
In sum, OS makes it easier to verify and detect errors and flaws in data analyses, the application of models and algorithms, and raw results (output from data analyses). Data procurement, however, remains a source of errors and flaws, as it is in traditional science. Accordingly, OS may improve research integrity, but it also leaves ample room for breaches of norms and regulations in research ethics. The point is not that OS will enhance such breaches, but that we have to pay attention to this potential in order to avoid the harms and to safeguard and enhance the benefits of OS and the trust in science.

Discussion
There are of course also epistemic issues related to other aspects and parts of knowledge production than those discussed here, e.g., in the selection of research perspectives, research question generation, methods selection, etc. These are general issues related to all types of scientific knowledge production, and not specific to OS. Moreover, they may be implicit and not available for scrutiny through OS. They are thus beyond the scope of this article.
Many problems with and objections to OS have not been addressed here, such as that OS violates or undermines the concept of copyright [40]. This article has focused on the epistemological challenges, and not problems with OS in general.
While I have focused on the epistemological challenges and ethical implications of OS and briefly discussed the ethical issues with various parts of OS, each of the issues deserves more discussion in detail than can be provided in an overarching review. There is also a wide range of ethical issues related to OS in general that are not directly related to epistemic challenges, for example, that OS in qualitative research may violate privacy rights [68]. The objective here has been to provide an overview of the epistemological challenges and their ethical implications so that the specific issues can be addressed in a more detailed manner in further research.
Relatedly, it has been noted that OS can increase epistemic injustice [69]. This may of course have both epistemological and ethical implications. It is interesting to note that the epistemic component of epistemic injustice attracts less attention than its ethical aspects. For example, epistemic injustice is considered to be induced by the characteristics of the persons and not of the knowledge. Closer studies of epistemic injustice in OS are most welcome, e.g., how OS is involved in framing and directing knowledge production (hermeneutical epistemic injustice). However, this is beyond the scope of this article.
Related problems, such as using other "open" data sources to carry out research, e.g., from social media [70], are also beyond the scope of this article. The same goes for the subversion of data for many reasons, such as hiding poor research integrity or avoiding competing research groups.
It may well be argued that open data standards or guidelines can address some of the challenges discussed above [71]. However, as the analysis shows, the challenges may remain and even be increased, e.g., with respect to data procurement.
This article has not addressed the various aspects of OS specifically, such as Open Access, Open Policy, Open Data, Open Reproducible Research, or engaged with the taxonomy of OS in detail [72]. Moreover, much more can be said about the ideological and philosophical basis of OS, e.g., its connection to and extension of Merton's principle of "communism" to the digital information age. While highly interesting and relevant for epistemology and ethics, this is beyond the scope of the present article.
Epistemological challenges with OS have been broadly defined here in terms of challenges to knowledge production in OS. This very broad definition has been applied to be inclusive, as the article is meant to provide an overview, hopefully inspiring more in-depth analyses. Moreover, epistemological challenges have been specifically connected to the various phases of OS knowledge production. Certainly, each of the challenges with each phase deserves closer scrutiny. Again, the intention here has been to provide an overview that can guide more in-depth analyses. Correspondingly, the various challenges can be analyzed in terms of specific theoretical frameworks, e.g., epistemic cultures [51] or STS. Hopefully, this article can be useful to identify the issues to be addressed from specific perspectives.
The results are presented under three main headings (Concerns with benefits that do not occur, Concerns with negative effects, and Epistemological challenges with OS processes). Certainly, the epistemological challenges and ethical implications could have been organized in many other ways. As explained in the methods section, the first two follow a basic utilitarian approach. This seems relevant as so much focus has been on the benefits of OS. Moreover, following the epistemological challenges along the OS process of knowledge production appears justified as well. Future and more specific and systematic reviews may apply other perspectives, categories, and headings.
Moreover, the identified challenges in the three main areas are not mutually exclusive. An undermined or non-occurring benefit may also have harmful (side-) effects, or the cause of the benefit shortfall may also cause negative effects. A more detailed analysis (for each type of OS) is needed to fully address these issues. The aim here has been to provide an overview of challenges and implications and not an exhaustive and exclusive taxonomy. Notably, there is no consistency in the literature. What has been pointed out as epistemological challenges (or ethical implications) for some types of OS may not be relevant for other types of OS. Moreover, what some consider to be a challenge for a specific type (or part) of OS, others do not. Mapping the differences and inconsistencies with respect to the various challenges and implications is beyond the scope of this article.
The search strategy of the literature search is not very elaborate, nor is the database very specific. More targeted searches may have included additional references. For example, including synonyms for OS such as "open research" and "open scholarship" (search string = (("open science" OR "open research" OR "open scholarship") AND epistem* AND ethic* AND (problem* OR challeng* OR implication OR issue))) identifies 514 references (199 more references at revision). However, the content of the additional references does not appear to identify issues that have not been covered. Moreover, as stated, the aim was not exhaustiveness and completeness of references, but rather to provide an overview of the challenges and implications, and a more elaborate search may not extend the content of the results. While not an exhaustive literature review, this review may appear to be an exemplary literature review insofar as it "presents only key references to reacquaint the reader with representative work that relate to the research study" [73]. However, it is in line with an ethics review (in Health Technology Assessment), which aims to provide the reader with an overview of the main issues to consider [20]. Moreover, many of the identified articles dealt with the same challenges, indicating that important issues were covered. A more systematic review with detailed information on the type of OS, stakeholders, epistemic cultures, and the number of references for various challenges is certainly welcome. However, this is beyond the scope of this study, which aims to provide an overview of the epistemological challenges and the ethical implications of OS knowledge production in general, and which hopefully can be a helpful starting point for a systematic review. As OS is a very broad field with a wide range of merits and challenges, it has been necessary to restrict the scope of this study.
The objective has been delimited to providing an overview of the epistemological challenges with OS knowledge production, and their ethical implications, in terms of (a) factors undermining the envisioned benefits, (b) direct negative epistemic effects, and (c) epistemological challenges with the various phases of the OS process. The objective appears relevant and warranted as OS is elaborate enough for its benefits and challenges to be critically assessed in order to adjust and improve OS in the future. Although the objective is not to discuss the details of each of the challenges and implications, the important contribution of this article is the identification, gathering, and synthesizing of a wide range of information, which otherwise is very labor-intensive to obtain, as well as providing an overview and inspiration for further and more in-depth studies.
While many epistemological challenges and ethical implications have been identified, we have very limited evidence on how they play out in the practice of OS. This is, of course, because it is still early days in OS knowledge production. This means that we should pay attention to the challenges to obtain the envisioned benefits of OS (as pointed out), but also that we need empirical research on whether these challenges occur and how they play out. Hence, we need more research on the epistemological and ethical aspects of OS.

Summary and Conclusions
While there are important envisioned epistemic advantages with OS, this article has identified and addressed some of the challenges and their ethical implications. First, it has identified and discussed concerns with envisioned benefits that do not or may not occur. In particular, the intended increased access to the process of knowledge production, the results, and the producers of knowledge may be hampered by embedded and ingrained norms and values or by existing epistemic governance being reinforced. Public misunderstanding of research and misuse of research results may also backfire and distort knowledge production. Intended broader research participation may be undermined by domination by privileged actors. The same goes for the predicted improvements in collaboration. When it comes to alternative research evaluation and incentive structures, these may be haunted by looping effects and pecuniary domination. The foreseen improvement in the peer review system may also be undermined by looping effects. Finally, the improved technological infrastructure may be overridden by framing effects favoring the privileged.
To address these epistemological challenges, we may need to change the epistemic governance structure, actively adapt the incentive systems and funding requirements, and make adaptive adjustments to the assessment and impact metrics and peer review system. Connecting money to the metrics is but one way of doing so, acknowledging the importance of reward. Improving the infrastructure, counter-framing, and finding compensatory measures may be ways to address skewed infrastructure effects.
When it comes to negative epistemic effects of OS, a range of challenges appears, such as undermined quality (lack of replicability, bias, polarization) and a lack of norms, institutions, and regulations of scientific activity, which together can undermine trustworthiness and trust in scientific research. To address such issues, we need to find appropriate ways to verify OS elements and results and develop new institutions that foster new norms and regulations.
When it comes to the specific phases of OS, the greatest challenges are related to data procurement. The reason for this is that data procurement is the most difficult process to verify, and thus also the most difficult in which to detect flaws. Errors in data procurement, applying algorithms or models, model parameter management, and interpreting, presenting, and documenting results are also more likely to be intentional than errors in the other phases of OS. Accordingly, problems with data procurement are hard to address. However, doing so is of utmost importance, as data procurement influences all the other phases of OS. Hence, data procurement is epistemically the most vulnerable phase of OS knowledge production, and it thus threatens to distort OS's impact and value.
Hence, there are many challenges with OS knowledge production: benefits that may not occur, negative epistemic effects, and challenges in the various phases of the OS process. The related main ethical implications are injustice, deception and manipulation (reduced autonomy), reduced benefits (efficiency), increased harm (as a consequence of poor-quality science), and a lack of trustworthiness. Accordingly, to obtain the envisioned benefits of OS, we need to address these epistemological challenges and their ethical implications.
Funding: This research has received funding from the European Union's Horizon 2020 research and innovation programme under grant number GA 101006430. The APC was funded by the University of Oslo as part of the project.

Data Availability Statement:
All data is available in this publication.