Open Research Data and Open Peer Review: Perceptions of a Medical and Health Sciences Community in Greece

Recently, significant initiatives have been launched for the dissemination of Open Access as part of the Open Science movement. Nevertheless, two other major pillars of Open Science, Open Research Data (ORD) and Open Peer Review (OPR), are still in an early stage of development among the communities of researchers and stakeholders. The present study sought to unveil the perceptions of a medical and health sciences community about these issues. Through the investigation of researchers' attitudes, valuable conclusions can be drawn, especially in the field of medicine and health sciences, where scientific publishing is growing explosively. A quantitative survey was conducted based on a structured questionnaire, with 179 valid responses. The participants in the survey agreed with the Open Research Data principles. However, they were unaware of basic terms like FAIR (Findable, Accessible, Interoperable, and Reusable) and appeared reluctant to permit the exploitation of their data. Regarding Open Peer Review (OPR), participants expressed their agreement, implying their support for a trustworthy evaluation system. In conclusion, researchers need to receive proper training in both Open Research Data principles and Open Peer Review processes, which, combined with a reformed evaluation system, will enable them to take full advantage of the opportunities that arise from the new scholarly publishing and communication landscape.


Introduction
It is an unquestionable fact that new knowledge is created through global interdisciplinary collaborations. Therefore, the European Commission has made "Open Science" a high-priority goal, along with the "Open Innovation" and "Open to the World" initiatives, to keep European Union states competitive at a global level [1].
Through remarkable technological advances, Open Science offers the opportunity to archive, curate and disseminate interdisciplinary research results across the globe with greater efficiency, productivity and transparency. It has to be clarified that Open Science is not a dogma but a "movement which aims to make scientific research, data and dissemination accessible to all levels of an inquiring society" [2].
The "Berlin Declaration on Open Access to Knowledge in the Sciences and Humanities" [3] is regarded as the main reference for the access and sharing of scientific results in the digital age. Previous definitions of Open Access were related to free access to peer-reviewed literature (Budapest Open Access Initiative, 2002 [4] and Bethesda Statement on Open Access Publishing, 2003 [5]), whereas the "Berlin Declaration" considers not only articles but also "raw data and metadata, source materials, digital representations of pictorial and graphical materials and scholarly multimedia material" to be openly accessible and reusable. Although significant initiatives have been launched and substantial efforts have been made for the implementation and dissemination of Open Access, the other two main components of Open Science, namely Open Research Data (ORD) and Open Peer Review (OPR), are still in an early stage of development among the communities of researchers, funders and information scientists [6].
Open Research Data are classified into various categories, such as observational, experimental and computer simulation data. They can also be found in many forms, such as documents, laboratory notebooks, questionnaires, images, audio and video clips, samples, specimens and data files [7]. In the current data-intensive landscape, sharing research data is a matter of considerable significance. Sharing scientific data will speed up the research process and improve the transparency of both research and its assessment, while at the same time promoting public access to the results [9], [10]. It is worth mentioning that funders such as the National Institutes of Health (NIH) have officially supported data sharing since 2003 as an essential means of translating research results into knowledge, products and procedures for the benefit of human health [8].
In this context, the FAIR data principles were published a few years ago to describe how research data should be shared. FAIR stands for Findable, Accessible, Interoperable and Reusable, and these are the features that research data should possess when shared, in line with the "as open as possible, as closed as necessary" philosophy [9], [11]. The success of FAIR depends substantially on cultural changes that will eliminate negative behaviours and on significant changes to researchers' existing system of incentives and rewards [12]. It is essential to mention that a report was released recently by the FAIR in Practice Task Force of the European Open Science Cloud FAIR Working Group [13] as a follow-up to the 2018 report [11] in the context of the COVID-19 pandemic. According to this report, FAIR practices help significantly to expedite the research processes needed to fight this new disease. However, more acceleration would have been possible if the FAIR principles had already been implemented more broadly [13].
The second major pillar of Open Science is Open Peer Review, which has neither a standardized definition nor an agreed schema of its features and implementations. It often serves as an umbrella term for several overlapping ways in which peer review models can be adapted in line with the aims of Open Science (e.g. open identities, open reports, open participation, open interaction) [14]. An Open Peer Review process is a valuable addition to the scientific publishing landscape because transparency is as important as openness [15]. It is not just a matter of making reviews publicly available, but also of bringing the choices of editors and reviewers, editorial decisions and authors' responses into the public space, allowing creative interaction [15].
In this context, "mega journals" like PLoS were among the early adopters of the Open Access movement, emphasizing the rapid publication of "good enough" research and post-publication evaluation through article-level statistics, and implementing OPR techniques (e.g. open reports) [16]. Additionally, to decouple the peer review process from journals and their functions, built-in systems for post-publication peer review were created (e.g. F1000 Research) [17].
Apart from publishers' initiatives, there is also substantial research activity on supporting the application of OPR. For example, Artificial Intelligence could be a valuable tool for shaping a new and more efficient open peer review process, along with further diversity in the reviewer pool [20]. Moreover, there are suggestions that the peer review process should adopt the characteristics of open code social platforms (e.g. Wikipedia, Amazon, Reddit) to achieve democratization. At the same time, the implementation of "blockchain" technology is being discussed as a way of maximizing reviewers' rewards [17]. Additionally, an interactive open access peer review model can be easily integrated into both new and existing scientific journals. Repositories such as arXiv.org have already adopted an interactive discussion platform for efficient review and open discussion [21], [22]. Another example of implementing OPR attributes is the Transparent Peer Review model, applied by Wiley in more than 60 journals with the use of the ScholarOne and Publons platforms. Remarkably, more than 87% of the authors of these journals have chosen this approach for their articles [23].
To further promote the concepts of openness for both data and the review process, it is first necessary to identify the scientific communities' perceptions about these topics. In particular, librarians and information scientists can develop more effective strategies for promoting ORD and OPR in interdisciplinary communities only if they are aware of the scientists' attitudes.
In this context, the present research aims to investigate the viewpoints of a representative medical and health sciences community towards Open Peer Review, and to reveal the degree of their knowledge about Open Research Data and of their agreement on sharing their content.

Literature review
Even though data sharing offers great potential for scientific progress by enabling the reproduction of study results and the reuse of data, scientists hesitate to make their files publicly available. The study of Zuiderwijk et al. [24] provides a systematic review of 32 open data studies on individual researchers' drivers and inhibitors for sharing and using open research data. The following paragraphs attempt to identify the most common inhibitors that result in low rates of data sharing among researchers, focusing on the medical community.
Medical research is reported to have a low data sharing culture, probably because it works with individual-related data [25]. The study of Tenopir et al. [26] revealed that the researcher's age strongly influences the willingness to share data. In particular, junior scientists are less likely to make their data available to others than senior ones (over 50 years old), who are willing to share their data. This behavior can be explained by high competition, especially among non-tenured researchers. The study also points out differences in disciplinary practices. Specifically, in medicine and health sciences, most researchers are reticent regarding the accessibility of their own data. It has also been found that academics are strongly against the use of their work for commercial gain without their prior knowledge or permission, even when they receive credit as the original authors [27]. Specifically, Savage and Vickers reported that only one author among the authors of ten articles published in PLoS Medicine or PLoS Clinical Trials allowed access to the research data [28]. In contrast, the majority of atmospheric science researchers were willing to share their data [26].
Additionally, researchers express the need to retain control over access to and use of the data that accompany publications. They also report a lack of sufficient formal recognition compared with journal articles and a lack of professional reward for sharing [30], [25].
It is also reported that researchers often do not share their data because they worry about people misusing and misinterpreting them [31], [32], [33], [34]. Moreover, sharing data can be time-consuming (e.g. preparing a data set) and researchers are often unaware of the existence of repositories that offer related services. Researchers have not been trained adequately in managing and sharing data [35]. The extra workload on top of teaching and administrative obligations, the disciplinary norms, the unclear career benefits and the risks that may occur could have a significant negative impact on researchers' attitudes towards data sharing [36], [37]. Scientists who exclusively conduct research are more likely to share their data because they have more time available [24], [25].
Nevertheless, most medical scientists consider it a fair exchange for others to use their data under certain conditions: a) if there is an opportunity to collaborate on the project, b) if results based on the data can only be disseminated with the data provider's approval, c) if at least part of the costs of data acquisition, retrieval or provision is recovered or d) if legal permission for data use is obtained [26].
Due to the nature of medical data (e.g. huge volume and breadth, privacy issues) and the need for an efficient process, it is vital to extend FAIR to the FAIR-Health principles, as proposed by Holub et al. [38], and to introduce a positive framework of incentives aimed at the health communities. Moreover, health data (e.g. those involving the use of human material) demand special considerations such as privacy-protecting principles and protection regulations [38]. Even when complying with medical data protection policies (e.g. the Health Insurance Portability and Accountability Act (HIPAA) [39]), there is always the risk of data re-identification [36], [40], making researchers more reluctant to share their data openly.
As far as the peer review process is concerned, it was introduced by the Royal Society of Edinburgh in 1731 [41] and journals started using it formally after 1960 [20]. It has played a decisive role and has become so well established that it is challenging to conceive of any future change [42]. The two traditional forms of peer review (single and double blind) and, lately, the open peer review model are the principal evolved forms [17], all of which aim to achieve maximum transparency. Focusing more on the health and medical disciplines, the PEER D4 study reports that researchers from these fields have the highest number of articles compared with other researchers and always choose peer-reviewed journals [43]. Moreover, it has been confirmed that scientists in medicine and health science decide to publish their papers in journals that implement a formal and trustworthy peer review process, regardless of the content access model (e.g. OA and non-OA) [44]. One of the aims of the present paper is to identify whether they also consider OPR a trustworthy peer review process.
From a large number of studies on single and double-blind peer review models, the conclusion is inescapable that they are rarely impartial or evidence-based and that they involve high levels of complexity [45], [46], [17]. Reported biases concerning gender, nationality, affiliation, language etc. [47], [45] confirm that the question is no longer whether traditional peer review is impartial, but what the probable causes and solutions are [17]. OPR could contribute positively to improving transparency by highlighting such incidents of inappropriate reviewer behavior and, at the same time, helping to mitigate and tackle them. Transparency is an ultimate goal for an evaluation framework that can function as a mechanism of accountability, which is almost absent from the traditional models [17]. Moreover, OPR can reinforce innovative efforts and minimize the negative aspects of peer review.
Apart from the incidents of bias, traditional peer review models are time-consuming and demand great effort from researchers [18], [19]. The single and double-blind peer review models, even though they are more widely preferred, are difficult in practice and have little impact on the quality or speed of the review or on acceptance rates [17], [48]. Nowadays, the increasing number of papers produced by researchers has resulted in higher rejection rates and multiple reviews of the same content by different journals ("review process" overload), which may be one of the reasons for the reduced quality of evaluation reports [49].
Given the above discussion, the crucial question that arises is whether Open Peer Review could actually be the alternative path that will help the research community to overcome or minimize the negative aspects of the existing assessment framework. For example, Sueur et al. [50] suggest that a more open evaluation model could improve the educational value of peer review, increase the constructive criticism that encourages researchers, and reduce pride and prejudice in editorial processes. Besides, studies have already shown that articles with open peer review reports could be expected to have significantly greater citation counts than articles with a closed peer review history [51]. Nevertheless, there is no longitudinal evidence that new evaluation models such as OPR are superior to the traditional ones at either a population or system-wide level [17]. OPR cannot be applied under all circumstances, since there is no standard procedure for publishing review reports [52]. Additionally, it has to be clarified that although OPR provides incentives for a more transparent and collaborative evaluation framework, it is not yet able to completely prevent undesirable behavior and misconduct incidents or to automatically eliminate all identity-related biases [53], [54].
The present paper aims to contribute to the ongoing discussion on ORD and OPR by providing the viewpoints of the medical and health sciences community.

Materials and Methods
To investigate the attitude of the medical and health sciences community towards open research data and open peer review, a quantitative survey was conducted based on a structured questionnaire divided into three parts: awareness of the FAIR principles, ORD-related questions and OPR-related questions (see Appendix A).
The 215 actual participants (out of a total population of 415) were academic doctors (42), doctors in the National Health System (107), nursing staff (24), paramedical staff (10), and postgraduate medical students and health researchers (32, "others"), all affiliated with the General Hospital of Athens "Hippocratio". They all agreed to participate in a series of surveys, which included topics related to scholarly publishing such as the ones presented in this paper. The questionnaires were sent electronically through the LimeSurvey platform. As soon as the completed questionnaires were gathered in January 2020, the data processing began. Apart from the professional category, the other demographic characteristics recorded were years of professional experience and gender.
The questions were based on the psychometric Likert scale, which is often used in similar surveys to evaluate the attitude or opinion of a population. Participants could choose from four response options instead of five, in an attempt to achieve more concrete results by preventing them from taking the easy way of neutrality. Similar surveys [55], [56] inspired the content of the questionnaire.
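As an illustration of how answers on such a four-point scale can be condensed into a single "agree or somewhat agree" percentage, the following minimal Python sketch may be helpful. It is not the processing code used in the survey; the function name `summarize_item` and the example answers are hypothetical and serve only to show the arithmetic.

```python
from collections import Counter

# Labels of the four-point scale (no neutral midpoint)
SCALE = {1: "Disagree", 2: "Somewhat disagree", 3: "Somewhat agree", 4: "Agree"}


def summarize_item(responses):
    """Return per-option percentages and the combined share of options 3 and 4."""
    if not responses:
        raise ValueError("no responses to summarize")
    invalid = [r for r in responses if r not in SCALE]
    if invalid:
        raise ValueError(f"responses outside the 1-4 scale: {invalid}")
    counts = Counter(responses)
    n = len(responses)
    # Percentage of respondents choosing each option
    shares = {SCALE[k]: round(100 * counts[k] / n, 2) for k in SCALE}
    # "Agree or somewhat agree" share: options 3 and 4 taken together
    positive = round(100 * (counts[3] + counts[4]) / n, 2)
    return shares, positive


# Hypothetical answers from ten respondents (illustrative only, not survey data)
example = [4, 3, 3, 2, 1, 4, 3, 2, 4, 3]
shares, positive = summarize_item(example)
print(shares)
print(f"Agree or somewhat agree: {positive}%")
```

Because the scale has no neutral option, every valid answer falls on either the agreeing or the disagreeing side, so the two combined shares always add up to 100%.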

Results
In the first part of the survey (Part 1), the participants were asked if they were familiar with the FAIR principles. It is noted that, before completing the survey, participants were presented with the definitions of the key terms contained in the questions (e.g. OA, FAIR, OPR, ORD etc.). As depicted in Table 1, most of the population (81.9%) was unaware of the FAIR principles before the survey.
In Part 2 of the survey, medical and health scientists were asked to define their level of agreement (1 - Disagree, 2 - Somewhat disagree, 3 - Somewhat agree, 4 - Agree) regarding specific ORD issues through six questions. Specifically, as presented in Figure 1/All responses, 86.97% of the participants agree or somewhat agree with the statement that a publication should include related research data files with open or closed access (Figure 1/All responses - Q1). They also believe (92.09% agree or somewhat agree) that Open Research Data will significantly contribute to research promotion and science progress (Figure 1/All responses - Q2). They recognize that the FAIR principles are difficult to implement because of the lack of proper training and support (78.14% agree or somewhat agree - Figure 1/All responses - Q3). Only 45.58% of the participants agree or somewhat agree with the exploitation of their own data by others in a non-commercial way, with proper credit attribution (Figure 1/All responses - Q4). Moreover, participants do not agree with the exploitation of their own data by third parties for commercial purposes (only 20% agree or somewhat agree - Figure 1/All responses - Q5). Finally, most of the participants' viewpoints (72.1% agree or somewhat agree - Figure 1/All responses - Q6) align with the argument that the FAIR principles are more difficult to apply in disciplines such as medical sciences (e.g. due to the personal information contained in medical research data).
Comparing the overall results of Part 2 ( [...]
Finally, in Figure 3, responses are depicted according to the years of experience of the sample population. It is observed that younger scientists with 1-10 years of experience disagree or somewhat disagree more (66% - Figure 3 [...]
In Part 3, medical and health scientists were asked to define their position (1 - Disagree, 2 - Somewhat disagree, 3 - Somewhat agree, 4 - Agree) regarding OPR potentials through five questions. In particular, participants seem to agree or somewhat agree (79.07% - Figure 4/All responses - Q1) with the potential to submit a scientific article to a journal that follows an OPR system in which the reviewers' identities would be revealed to the authors and vice versa, and would also be included in the final publication. Moreover, the participants adopt the same positive position (89.77% - Figure 4/All responses - Q2) towards the potential to publish their work in a journal that aligns with OPR, where the whole reviewers' report together with the authors' response would be included in the final publication. The sample population also agreed or somewhat agreed (78.6% - Figure [...] Figure 6) no remarkable differences were observed. It seems that most of the population sample agrees with the OPR process and its potentials. Nevertheless, it has to be pointed out that nursing staff consider it moderately less preferable (only 50% have a positive attitude, compared with 68.84% of the overall population) to publish articles in journals such as PLoS, where the speed of publishing is a high priority in combination with the technical correctness of the papers, regardless of their potential novelty.
The above results lead to the conclusion that most of the participants expressed a positive attitude towards ORD but a low level of personal knowledge about them. Regarding the OPR model and its potentials, most of the population sample again reacted positively. However, the evaluation procedures and mechanisms are not yet clear or have not been described in a uniform or standardized way. A full analysis and discussion of the results follows in the next sections.

Discussion
Looking more closely at the results, the majority of the population sample, even though they agreed with the beneficial contribution of data sharing to research promotion and science progress, clearly opposed the exploitation of their data by third parties, commercially and non-commercially, even when credit attribution applies [27]. The unwillingness to share their own data, among other reasons presented in the discussion, could also be explained by the participants' positive response to the statement that in medicine and health sciences it is hard to implement ORD principles (e.g. FAIR) due to individual-related data [25]. Additionally, younger participants with 1-10 years of experience, especially women, are less likely to share their research data than their older colleagues, as also confirmed by the study in [26]. The results of the present study also point out the lack of proper training and support regarding the implementation of the FAIR principles in the data sharing process, which is also reported in other studies [37], [36], revealing the absence of the necessary skills for sharing data along with limited awareness of the available repositories and data preparation standards.
Regarding OPR and its potentials, the majority of the participants expressed their willingness to publish the peer review reports of their papers and to act as reviewers in journals that follow an OPR system, in line with the results of other studies [57], [58]. The outcomes of the present research also support the finding that OPR adoption is most prevalent in medical disciplines and agree with the steady growth in OPR adoption that has been observed since 2001 [58]. Even though the present study captured a positive trend in the medicine and health sciences community towards OPR and its potentials, it is questionable whether the participants have actually realized its full potential, a doubt also raised in the study by Patel et al. [59]. Additionally, the majority of the respondents appear to consider peer review mainly as a process of validation, as reported in other studies [59], [60], since they rated highly publication choices like PLoS and F1000Research, where the first priority is the technical soundness and reliability of the research. This result could also be related to the fact that medical scientists care mostly about a trustworthy peer review process, regardless of the content access model (e.g. OA and non-OA) [44].

Limitations - Future suggestions
It is important to clarify that the present survey focuses on the medicine and health sciences discipline because of its explosive pace of publishing and its great need for scientific communication. Additionally, although the target group of the survey is limited to the Greek medical community, its methodology could be applied to other scientific fields. Moreover, the results of the survey are considered a valuable contribution towards the development of an effective strategy for the dissemination of Open Science principles among the members of the medical community. Examining the similarities and differences in the viewpoints of researchers from diverse scientific fields towards ORD and OPR is our target for future activities.

Conclusion
As already stated in the results section, it becomes apparent that the participants from the medicine and health care community are positive towards the Open Research Data initiative as regards its beneficial contribution to the conduct of research and the advancement of science in general. In contrast, they are not familiar with basic terms of ORD, like the FAIR principles, and they are opposed to any exploitation of their research data for commercial or non-commercial use. This somewhat contradictory perception is mostly related to the competition for career progression and/or inside knowledge of inconsistencies in the research methodology [61], [25]. In addition, especially for health sciences, it may also be related to the personal (sensitive) character of medical research data, which demands compliance with medical data protection policies, and this is also reflected in participants' responses in the present survey. These outcomes also imply a lack of knowledge about the benefits of the data sharing process and the positive potential of re-usability, indicating at the same time an apparent absence of proper training in subject open repositories, research data management, data anonymization techniques etc.
As far as OPR is concerned, the participants seem to agree with the new potentials such as open reports and post-publication peer review, but it is not certain whether they have fully understood the functions of the OPR model. Nevertheless, researchers conceive of peer review as a highly significant research validation mechanism through which they will obtain reputation and promotion [62], [63], and they seek transparency in the research publishing process. Therefore, as shown in the survey's results, they are willing to support every effort that targets a highly trustworthy peer review system.
One of the most important conclusions is that researchers need to receive proper training in both ORD principles and OPR processes. Librarians, information scientists, publishers and other stakeholders should undertake initiatives such as data services, workflows and consultations that are tailored to the needs and particular requirements of each scientific community. Proper education about the key components of the Open Science movement, in conjunction with the reform of the tenure-centred evaluation system, will enable researchers to take full advantage of the opportunities that arise from the new scholarly publishing and communication landscape.

Abbreviations
The following abbreviations are used in this manuscript:
Q2 - Open Access to Research Data related to published articles will significantly contribute to research promotion and science progress.
Q3 - Researchers are unable to follow / apply FAIR principles due to lack of proper or appropriate training and support.
Q4 - Would you share your research data for exploitation by third parties, allowing them to remix, adapt, and build upon your work non-commercially, as long as they credit you?
Q5 - Would you share your research data for exploitation by third parties, allowing them to remix, adapt, and build upon your work even for commercial purposes, as long as they credit you?
Q6 - Do you consider it more difficult to implement FAIR principles in disciplines such as medical sciences (e.g. due to personal information contained in medical research data)?
Answer options: 1 - Disagree, 2 - Somewhat disagree, 3 - Somewhat agree, 4 - Agree
Part 3 - Define your position regarding OPR potentials.
Q1 - Would you submit a scientific article to a journal following OPR where the reviewers' identities would be revealed to authors and vice versa and included in the final publication?
Q2 - Would you submit a scientific article to a journal following OPR where the reviewers' reports and the authors' responses would be included in the final publication?
Q3 - Would you accept to review an article for a journal following OPR knowing that your identity and report would be included in the final publication?
Q4 - Would you submit a scientific article to a journal following the PLoS review process, meaning fast publication based on the technical soundness of research without any judgement on its novelty?
Q5 - Would you submit a scientific article to a journal following the F1000 Research review process, meaning fast post-publication peer review after a basic formal check done by selected experts and readers?