FAIR Metadata Standards for Low Carbon Energy Research—A Review of Practices and How to Advance

Abstract: The principles of Findability, Accessibility, Interoperability, and Reusability (FAIR) have been put forward to guide optimal sharing of data. The potential for industrial and social innovation is vast. Domain-specific metadata standards are crucial in this context, but are widely missing in the


Introduction
Metadata is needed to communicate knowledge about the energy transition concisely to its stakeholders. The concept of metadata lends itself to the analogy of maintaining a laboratory notebook: data represents the outcome of a specific experiment, a physical notebook records how and what data was collected in a given project, and a database is the library where all notebooks from various projects are stored. All this data is connected to some context, which is captured by metadata, here represented by a card index. Metadata enables navigation within and across databases. In other words, data is inevitably linked to its metadata, and without metadata, data loses usefulness [1]. Hence, a credible and traceable documentation of knowledge about the energy system is not possible without metadata.
Metadata for low carbon energy research is essential as it provides the context for data used to decide on energy transition pathways. From the research community's perspective, metadata needs to enable reproduction and validation, which is an integral part of science. For society, metadata needs to be comprehensible (trackable). Metadata supports transparency of energy transition processes, the validity of the basis for decision-making, and collaboration across disciplines and societal groups. Metadata, therefore, also serves the purpose of communicating data between its users. As the research and policy questions connected to the low carbon energy transition are interdisciplinary and complex, it is crucial to understand the energy system as a whole and in its parts (refer to Georgescu-Roegen [2] for further reading on the Principle of the Whole). This implies that data describing the historic, present, and future course has to be in a suitable format that allows scaling, analysis, and comparison of heterogeneous information with different granularity levels. The required richness and detail of metadata may differ across user groups, which is why it is essential to identify these groups beforehand.
Wilkinson et al. [3] stress that 'Metadata needs to meet domain-specific standards'. However, current metadata practices in the energy domain fall short of these aspirations. There are no data standards to link to units, technologies, methodologies, and infrastructures. This report is a collaborative work and examines this gap in more detail in the following sections. It reviews, explores, and tests metadata practices and related governance principles for the low carbon energy domain. Building on a series of community workshops and discussions with dozens of researchers from June 2020 to December 2020, we elicit lessons learned and provide recommendations for advancing FAIR (meta)data standards for low carbon energy research. (We use the phrase '(meta)data' as a shorthand for 'metadata and data'.) FAIR stands for Findability, Accessibility, Interoperability, and Reusability of data [3]. Metadata plays a vital role in all the FAIR guiding principles. Out of the 15 criteria, 13 explicitly refer to metadata.
While metadata needs to be accessible by anybody, research data in energy science covers a wide range of openness, from open data to shared data and closed data. While open data is understood as data 'that can be freely used, re-used and shared by anyone for any purpose' [4], shared data may be widely accessible, but subject to some conditions on its use. Closed data is limited to a well-defined, usually small group of users. Given the commercial interest of energy stakeholders, closed data is not uncommon in the energy domain. Moreover, the General Data Protection Regulation (EU GDPR) may require that access to the data be limited. While there are good reasons for the protection of commercial interests and private data, the European Union encourages its member states to ensure that data is 'open by design and open by default' [4]. In any case, the FAIR principles cover the entire range of openness by understanding accessibility as 'accessibility under well defined conditions' and linking re-usability to 'clear and accessible data usage licenses' [3]. Moreover, note that metadata may even play an important purely internal role for a research team by enabling proper documentation of the data context.
The report is structured as follows. Section 2 outlines the approach used to engage the low carbon energy research community and other experts. We believe the method to be useful for replication. Section 3 presents insights from the workshop discussions, structured along reflections on the needs of user groups (Section 3.1), and on methods to advance metadata standards from the perspective of humans and machines together with technical points of departure in the energy domain (Section 3.2). These sections review and evaluate current metadata practices and standards in the field. The report concludes with lessons learned and recommendations on advancing metadata standards for low carbon energy research (Section 4). That section also expands on the role of openness for data.

Method to Enable Community-Wide Reflection on Metadata
This report results from discussions within the low carbon energy research community organized through a series of online workshops between June and December 2020. The consortium of the EERAdata project has developed the method for community engagement [5]. This project aims to create and establish a FAIR and open data ecosystem for low carbon energy research, for which the (small) project consortium is 'going where the data are'. EERAdata strives to transfer best practices in FAIRification developed in other communities to the low carbon energy research domain. To this end, a series of six workshops is conducted, following the concept of defining, implementing, and sustaining FAIR practices. The second workshop brought together experts for FAIR and open data with energy domain specialists with the aim of creating a shared understanding of FAIR and open energy (meta)data. This report presents a white paper on the matter. The recommendations will be taken up in the new transversal Joint Program 'Digitalization for Energy' of the European Energy Research Alliance (EERA) [6]. While all workshops were designed to gather and reflect inputs from data scientists and data stewards (from the field and beyond), the December workshops were dedicated to compiling content and preparing a first version of the current paper as a community effort. Figure 1 presents an overview of the different seminars held. Additional details are also provided in the Supplementary Materials. A community-wide initiative was chosen as a suitable litmus test for various reasons. Firstly, advancing metadata practices and closing the gaps that hinder low carbon energy research affects the global community. Existing gaps and needs cannot be viewed myopically as an individual researcher's problem but rather as a global one. They will continue to negatively affect the FAIR and open data ecosystem in low carbon energy if urgent steps are not taken to change the current narrative and practices.
Secondly, constructive inputs from a wider community open up transparent discussions about metadata and tap into a diversified knowledge base. All broad community stakeholders should form resilient solutions for the current issues that hinder metadata standards. Thirdly, expert practitioners' input enabled non-specialists to understand recent research on specific topics better and generate further interest that may influence the longed-for change.
All workshops were delivered in an online format. Participants could contribute to the working draft of this paper via online collaboration tools that were always available and accessible. Participants were invited to join writing teams and discussion groups to share comments, provide useful research from credible sources, and add their contributions from expert analysis sessions featured throughout the workshops.

Figure 1. Overview of the workshops held (figure not reproduced). Recoverable entries: 'Metadata user stories' (to discuss the alignment of metadata design and workflows for data access and processing; to elicit user profiles and test mock-ups) and 'Making domain-relevant machine-actionable metadata at scale' (expert inputs and panel discussion to plan domain-specific metadata-for-machine workshops).

Users of Low Carbon Energy (meta)Data and their Needs
As already discussed, metadata provides the context needed to communicate the nature of the underlying data between data users, including data creators and data re-users. The extent of context needed naturally depends on the needs of the data users. Thus, a definition of the required richness and depth of metadata as well as the conditions for accessing (meta)data is inseparable from a stakeholder and needs analysis. The group of energy metadata users is diverse, ranging from different academic disciplines to the general public. In addition, some stakeholders have been identified from the general point of view of scholarly communication [7], where energy research is one topic among many. In detail, we have identified the following user groups, their use of energy data, and the specific metadata required to accommodate their needs:

1. Researchers. (a) Energy domain experts create data from scratch and re-use existing data to curate, aggregate, analyze, and publish data. They come from various disciplines. As re-users of data, they need support from metadata in searching and finding data. They require context information with high granularity and rich provenance to assess data quality and relevance, precise information on usage rights, and intellectual property rights (IPR) requirements. (b) Interdisciplinary scientists inform themselves on (other) expert knowledge, re-use data, aggregate and analyze data, and publish data. They require context information for searching and finding data. They need information on the context of data and provenance (mostly on aggregated levels) and precise information on usage rights and IPR. Shared and open data fall within the interest of this group.
2. Science funders of energy R&D activities inform themselves of the results of funded research and projects. They need to monitor, adjust, and plan funding policies and principles to better direct R&D investments and to ensure the impact of policy measures [8][9][10]. Evaluation agencies are a key mediator in this regard as they compile existing (meta)data and develop evaluation tools. Another important objective of funders is to unlock the potential of added value by exploring knowledge generated by research, exploiting and expanding its use [11]. Therefore, funding agencies also require the Data Management Plan (DMP) to track data's provenance and future maintenance. Typically, this application is at a high level of aggregation, and information should be easy to understand and disseminate. Last but not least, a pivotal interest is that the funding agency be acknowledged in the metadata.
3. Planners and decision-makers (incl. energy market regulators, security coordinators, policy-makers) inform themselves on expert knowledge. They may re-use data, analyze some data, and publish aggregated data and decisions. This involves a middle to high level of aggregation, information on the context of data, and provenance. An important aspect is the metadata information on aggregated data. While this group has an interest in open data policies, they are also obliged to protect private data pertaining to citizens.
4. Energy and other industries (incl. technical and operational planners and decision-makers, energy market operators). They inform themselves on expert knowledge and re-use and analyze some data on all aggregation levels. They require information on the context and provenance of data. Metadata information on aggregated data is important. To assess the commercialization potential, legally secure information on usage rights and IPR is needed.
5. General public. The group informs itself to adjust behavior and practices (e.g., energy consumption behavior, voting in elections, engagement as prosumers, activists or citizen-scientists). They may re-use data to tell others and pose questions leading to research activities. A high level of aggregation is needed to make data easy to understand and navigate.
6. Data scientists (incl. data engineers, software and algorithm developers). This group codes, tests, and validates software with existing data. Specifically, data engineers (who may be domain specific or agnostic) are key builders of (meta)data pipelines and back-end solutions. Data scientists may re-use (meta)data to design scientific workflows considering interoperability with other tools and demonstrate applications. They need concise metadata to integrate data sources within the software tool, machine-actionable open file formats, agreed standards, terminologies, and interoperability protocols. Being in legally-compliant control of private data is of particular interest for this user group.
7. Publishers, librarians, and data curators publish, store, and archive research data. They may re-use data to link them to metrics such as access statistics and to cross-reference. They need metadata regarding ownership/authorship of results or the size of the data and information such as keywords to provide searchability in their publications and archives.
Note that in practice, a data stakeholder may take two or more roles at the same time. Some institutions may also provide several functions simultaneously by, e.g., bringing together basic researchers, interdisciplinary scientists, public relations specialists, and data custodians.
The data activities and metadata needs identified above can be categorized into broader topics to illustrate the user groups' differences. Typical activities by data stakeholders in the life-cycle of data are: (1) creating and collecting data, (2) actively re-using data (e.g., creating new data or products), (3) passively re-using data (e.g., by distributing the information consumed), (4) accessing information, (5) publishing data, and (6) saving and archiving data (data curation). Note that the order of activities might differ between the user groups depending on the actual workflow. The needs can be organized roughly under the topics: (1) to identify data context and data organization (including standards applied), (2) to retrieve (meta)data, (3) to manage and process information, and (4) to inform on ownership, funding, IPR, commercial protection, and data protection regulations (such as European GDPR). Note that these categories are not entirely exclusive and may support each other in providing the required context. They are similar to some of the purposes of metadata found in Haynes [12].
Applying the categories above, Figure 2 visualizes the extent of activities and needs for each user group. We distinguish between a low engagement (L), an intermediate level (M), and a pivotal movement (H). As discussed above, the needs of each user group usually come with different levels of granularity. For simplicity, we distinguish only fine (F-symbolized by a circle) and coarse (C-symbolized by a pentagon) levels; when a user group requires specific types of metadata at the different levels, we show it as both (B-symbolized by a diamond). The choices for the values reflect the results of workshop discussions and impressions from the literature review. This figure confirms the variation in typical activities with data, e.g., scientists mainly create research data. In contrast, all other user groups mainly re-use data in analysis, aggregation, and publication. It underlines the role of standards as a basis for mutual understanding between the user groups, accommodating their interests and backgrounds.

Gaps for Users of Energy (meta)Data
This section provides a review of gaps between metadata needs and existing (meta)data practices. We consider the user groups identified in the previous section.
Most of the literature is narrated from the perspective of the researcher (domain experts, interdisciplinary, and other scientists). Limitations and gaps related to data and metadata are significant impediments for research activities [13,14]. The primary issues raised pertain to the findability and accessibility aspects of the FAIR principles [15][16][17]. For an enriched research cycle, the metadata and data should be descriptive, complete, easy to find, accessible, and machine parsable. Data that lacks descriptive and full metadata is less reusable and therefore less valuable. Further limitations derive from a lack of completeness (e.g., missing years from time series, incomplete sampling, etc.), the lack of a user-friendly database, and (meta)data formats not being machine-actionable. The data itself should fit with the domain norms regarding semantics and packaging. In addition, metadata should, where possible, be provided in a Linked Data or RDF format, such as JSON-LD.
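To make the recommendation of a Linked Data format more concrete, the following is a minimal sketch of a dataset description serialized as JSON-LD. It assumes the schema.org vocabulary (the `Dataset` and `PropertyValue` types) purely for illustration; the review does not prescribe a vocabulary. The dataset, its identifier, and the helper `missing_fields` are hypothetical placeholders.

```python
import json

# Hypothetical dataset description; every value below is a placeholder.
# The property names follow the schema.org "Dataset" type (an assumption,
# not a vocabulary chosen by the review).
record = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "name": "Example hourly wind generation time series",
    "description": "Illustrative placeholder record, not a real dataset.",
    "identifier": "https://example.org/datasets/wind-hourly",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "temporalCoverage": "2019-01-01/2019-12-31",
    "keywords": ["wind energy", "time series"],
    "variableMeasured": {
        "@type": "PropertyValue",
        "name": "electricity generation",
        "unitText": "MWh",
    },
}


def missing_fields(rec, required=("name", "identifier", "license")):
    """Return the required keys that are absent or empty in a record."""
    return [key for key in required if not rec.get(key)]


# Serialize to a JSON-LD document that both humans and machines can read.
serialized = json.dumps(record, indent=2)
print(serialized)

# A simple completeness check, e.g. before depositing in a repository.
print(missing_fields(record))
```

The choice of required fields here (name, identifier, license) is only an example of how findability and reusability information could be checked automatically; a community would need to agree on its own minimum set.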
The factors mentioned above decrease the ability to locate data while at the same time disrupting the pace of research, slowing adoption and use by other users [18,19]. Metadata and data should be registered or indexed in searchable sources that academics and other data stakeholders can easily access. Although a significant amount of data is continuously generated, only a small proportion is available in academic databases that are easy to find and access [20]. In this sense, the lack of an appropriate format to store the data is a significant barrier since a considerable amount of information is not deposited in sustainable and widely-known repositories [15]. Moreover, the available data is not always linked to metadata that conforms to any metadata schema.
From a theoretical perspective, particularly in qualitative research, data is prone to inaccuracies, omissions, and inconsistencies [21]. This weakens the reusability of the data. Interdisciplinary research may be affected by a broader set of data and metadata-related challenges. For example, the interoperability criterion of the FAIR guiding principles is of crucial importance: where it is not met, essential information is simply absent from specific datasets, making the processing of multiple datasets in academic research needlessly time-consuming, costly, labor-intensive, and error-prone [22]. The difficulty of processing data across different academic fields is linked to the following deficiencies: (1) lack of controlled vocabularies and terminologies, (2) lack of agreements on controlled vocabularies and terminologies, (3) lack of agreement on formats in which controlled vocabularies and terminologies are expressed, (4) lack of linking between existing controlled vocabularies, (5) lack of agreement on formats for metadata, and (6) lack of agreement on metadata schemas.
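The effect of a missing controlled vocabulary can be illustrated with a small sketch. It harmonizes free-text technology labels from two datasets against a shared vocabulary and reports the labels that cannot be matched. The vocabulary, the variant labels, and the two datasets are hypothetical and exist only for this illustration.

```python
# Hypothetical controlled vocabulary: preferred term -> known variant labels.
CONTROLLED_VOCABULARY = {
    "wind_onshore": {"onshore wind", "wind onshore", "wind (onshore)"},
    "solar_pv": {"solar pv", "photovoltaics", "pv"},
}


def build_lookup(vocabulary):
    """Map every lower-cased variant (and the preferred term) to the preferred term."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        lookup[preferred] = preferred
        for variant in variants:
            lookup[variant.lower()] = preferred
    return lookup


def harmonize(labels, vocabulary=CONTROLLED_VOCABULARY):
    """Return (mapped, unmatched): labels resolved to preferred terms, and the rest."""
    lookup = build_lookup(vocabulary)
    mapped, unmatched = {}, []
    for label in labels:
        key = label.strip().lower()
        if key in lookup:
            mapped[label] = lookup[key]
        else:
            unmatched.append(label)
    return mapped, unmatched


# Two hypothetical datasets that label the same technologies differently.
dataset_a = ["Onshore wind", "PV"]
dataset_b = ["wind (onshore)", "Photovoltaics", "Hydro"]

mapped, unmatched = harmonize(dataset_a + dataset_b)
print(mapped)
print(unmatched)  # labels that need manual curation or a vocabulary extension
```

Without an agreed vocabulary, the variant table has to be rebuilt by hand for every pair of datasets, which is the time-consuming and error-prone work described above.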
Specific gaps faced by funders of energy R&D activities (incl. evaluation agencies) are connected with their need to monitor and evaluate the efficiency of the provided funds. Therefore, access to (meta)data, the richness of documentation of activities and results, as well as the interoperability are current key issues. For example, the literature points to the significance of research output in showing which funds are appropriately utilized to enhance socioeconomic progress and to assess the impact of policy support measures [8,9]. While some evaluation tools are available [23], their proper application suffers from a lack of coverage, appropriate levels of detail, as well as missing interlinkages between the deliverables of research projects.
Data and metadata are significant ingredients and drivers of the policy process. Evidence from the literature highlights that data sharing is essential for policymakers to carry out planning, monitoring, decision-making, and implementation on the basis of accurate scientific data [16]. In this context, planners and policymakers, as well as analysts preparing policy-relevant reports, benefit from shared and/or open data platforms. At the same time, transparency requires that policymakers share data openly to achieve successful cooperation between the academic and policy-making levels. Transparency is essential to gain social trust and engage the public in the policy-making process [19,24]. However, as the literature review shows, governance problems or insufficient skills can impede the openness of data [25]. Moreover, legal problems have been identified as one main category of issues that policymakers face in managing data [24]. Ethical and security concerns and additional workload are key reasons for the difficulty in achieving openness, even for metadata, at the policy-making level [26]. As policymakers' decisions affect the public and industry significantly, they should engage private sector representatives and citizens in the decision-making processes, and the data generated should be accessible at every level [21]. Another related topic that has received interest from policy data stakeholders is the accessibility and openness of government-run sites, which are deemed essential to increase communication between municipalities and citizens [17].
Energy and other industries are another main user group of low carbon energy (meta)data, especially against the background of the increasing digitization of the energy sector. High quality metadata is a prerequisite for new business concepts such as smart demand response, the operation of small-scale distributed renewables, smart charging of electric vehicles, and the development and management of energy storage solutions. The literature stresses the general lack of common standards for sharing (meta)data in the industry [18,20,22]. Without such standards, data is lost and interoperability is limited if not disabled, causing forgone opportunities for performance optimization [27], demand forecasting [28], the monitoring of energy use [29], and the creation of energy management frameworks [27]. The matter is even more pressing when aiming at a global adoption of circular economy concepts, in which the flows of resources, waste, and, consequently, data produced by different participants of an industrial symbiosis need to be integrated. The coordination of such a deeply linked industrial infrastructure requires collaboration based on FAIR (meta)data. Finally, unresolved issues around the sharing of (meta)data and intellectual property rights need to be settled.
Traditionally, the general public requires (meta)data that can be trusted to judge the accountability and transparency of governmental measures [30][31][32]. Cantadora et al. [33] emphasize the general public's data needs as a prerequisite for citizen participation in policymaking or community initiatives. The authors discuss the OECD's Open Government Data project that relies on making the governments' data available to the general public for transparency and accountability.
Of increasing concern is the protection of privacy rights with the advent of large-scale metering and the collection of consumer information. Currently, (meta)data are not designed to prevent the inference of sensitive personal information, nor do they support the regulation of discrimination practices based on customer segmentation [34]. The authors also point out that the need for fine-grained data that allows the tracking of choices of individual household members is at odds with the requirement for data aggregation due to data protection. Moreover, with the application of aggregation procedures, the task of following the flow of data becomes demanding.
With the opening of energy markets for new actors, the general public's role broadens from a passive participant (who receives information and takes IT-informed consumption choices) to active engagement in the energy transition. Gendron and Killian [35] refer to the concepts of a 'data citizen' and 'information democratization', whereby they argue that citizens lack knowledge on concepts like data agility and data literacy. For example, Anhalt-Depies et al. [36] discuss knowledge gaps around privacy rights when participating in citizen-science activities. The empowerment of citizens regarding FAIR (meta)data knowledge and competences is also important for steering interest in the general public to take up new professions, such as energy data stewards and data analysts. Lhoste [37] investigates possibilities of engaging civil society in R&D with the help of specifically tailored research systems. Dunnett et al. [38] provide a best-practice example of harmonizing data for solar and wind farm locations by sourcing information from the collaborative OpenStreetMap project.
The specific gaps data scientists are confronted with are presented in regular surveys conducted by Appen, an artificial intelligence (AI) platform. Their annual Data Scientist Report presents the results of interviews with more than 100 participants on data management practices. The larger share of participants represents technology companies with sizes in a spectrum of 100 to more than 10,000 employees. The results from 2017 [39] show that data scientists spend most of their time (51%) on data curation, i.e., the collecting, labeling, cleaning, and organizing of data. At the same time, only 19% is devoted to data analysis and modeling. This imbalance could be rectified with the help of FAIR (meta)data. 51% of the respondents characterize the data they work on as unstructured data, pointing to the need to accommodate heterogeneous data. Regarding the type of data, 91% of the data scientists use text data, 33% image data, 11% audio data, and 20% video data, showing that (meta)data have to comply with various multimedia standards. While several approaches are applied occasionally, agreements on selected standards are missing. A subsequent Appen White Paper [40] highlights several trends. The first one is the increasing number of AI projects and the parallel growth in the need for high quality data. Also, data privacy and related ethical issues are gaining significance because data-driven AI is now increasingly applied in decision-making. The 2018 survey also reports on the perception of algorithmic bias. Indeed, 9% of data scientists believe that there is no algorithmic bias, whereas 75% think there is less algorithmic bias than human (data scientist) bias. This result may support the machine actioning of data, while there is at the same time a danger of amplifying bias coming from training sets that are not representative or of poor quality [41]. Rich metadata is the way forward to reveal and track potential data quality issues.
A detailed compilation of gaps from the point of view of publishers, librarians, and data curators can be found in Gregg et al. [7]. Summarizing their findings, publishers are faced with the transition from traditional cataloging and metadata compilation to more automated forms. Also, proprietary concerns may lead to the in-house development of metadata concepts. Librarians are engaged in creating metadata, consuming metadata, and preserving metadata at the same time. As consumers, they rely on high-quality metadata from researchers and publishers and fail if gaps are encountered from that side. Bascones and Staniforth [42] even state that "due to poor quality metadata supply combined with complex and continuous processing, librarians rarely know the status of their library holdings and collections of e-resources," affecting user experience, user satisfaction, and cost calculations. Of course, poor quality metadata should be corrected rather than preserved, but procedures, agreements, and guidelines are lacking. Finally, data curators are faced with the wish to assign persistent identifiers to highly granular data against the background of limited financial and administrative resources. Indeed, the lack of acknowledgment for data curation and the allocation of corresponding resources is a gap that can be observed for all energy data stakeholders.

Human Perspective on Metadata Navigation
When knowledge was codified in traditional encyclopedias, users oriented themselves within that given framework of knowledge organization. However, with the growing amount of experience and increased fragmentation, new ways of organizing and codifying knowledge are needed to enable users to search, understand, and publish data. To tap that potential, metadata must be appropriate and well-structured, according to how specific users view the domain's knowledge. That may differ across groups of users and within a group, e.g., acknowledging different schools of thought or disciplinary differences.
The user groups of low carbon energy data discussed in the preceding section present different views on the energy domain, which shape how knowledge is encoded in general and specifically through metadata. Ontologies and taxonomies are essential tools to tell what the data is about. They organize knowledge with the help of concepts and relationships between them, allowing researchers to navigate the information and to capture the context of data. Ontologies are understood here following Studer et al. [43] as a "formal, explicit specification(s) of a shared conceptualization." Note that ontologies are shared between users in the communication of data. Having many stakeholders with various needs and interests may lead to fragmentation into several niche ontologies and the need for standardization and integration to come to a mutual understanding. Taxonomies are a specific form of ontology where all concepts are arranged hierarchically as sub- and super-concepts. Ontologies allow for richer knowledge representation, including different types of generic classes (e.g., materials, buildings, renewable energy generators), which are linked by multiple relations between them (e.g., steel is used to build wind turbines) [44,45]. Ontologies support the communication between humans, between machines, as well as between humans and machines. However, while humans typically have a predefined set of concepts, e.g., from being educated in energy science, machines need to learn about concepts in the form of agreed-upon, persistent labels.
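The difference between a taxonomy and a richer ontology can be sketched in a few lines. The taxonomy below only records sub- and super-concepts, while the ontology stores typed relations as (subject, predicate, object) triples, including the 'steel is used to build wind turbines' example from the text. The relation names and the assignment of 'wind turbine' and 'solar panel' to 'renewable energy generator' are assumptions made for illustration, not taken from any of the ontologies reviewed.

```python
# Taxonomy: every concept has at most one parent (sub-/super-concept hierarchy).
TAXONOMY_PARENT = {
    "wind turbine": "renewable energy generator",
    "solar panel": "renewable energy generator",  # hypothetical extra concept
    "steel": "material",
}

# Ontology: typed relations between concepts, stored as triples.
TRIPLES = [
    ("wind turbine", "is_a", "renewable energy generator"),
    ("steel", "is_a", "material"),
    ("steel", "is_used_to_build", "wind turbine"),  # example from the text
]


def ancestors(concept, parent_of=TAXONOMY_PARENT):
    """Walk up the taxonomy and return all super-concepts, nearest first."""
    chain = []
    while concept in parent_of:
        concept = parent_of[concept]
        chain.append(concept)
    return chain


def objects(subject, predicate, triples=TRIPLES):
    """Return every object linked to `subject` by `predicate` in the ontology."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(ancestors("wind turbine"))
print(objects("steel", "is_used_to_build"))
```

The taxonomy can answer only 'what kind of thing is this?', whereas the triple store can also answer questions that cut across the hierarchy, such as which generators depend on a given material.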
In the energy domain, no unified approach to manage the domain knowledge exists. Instead, several ontological schemes have been developed, leading to numerous 'ontology islands' that only partially match each other. Consequently, metadata from different ontologies are primarily unrelated, potentially inconsistent, and not harmonized, preventing the navigation and reuse of heterogeneous data. In view of these difficulties, the idea of multiple application-specific metadata schemes is gaining ground. Table 1 presents a list of ontologies that have been developed within the energy domain. To explore ontologies' role in constructing hierarchical metadata and to brainstorm on solutions for overcoming the problems of ontology islands, a practical exercise was undertaken in the workshop session 'EU policy and energy research taxonomies' (see Supplementary Materials). Participants explored existing ontology islands in the energy domain to discover knowledge organization conflicts between given ontologies and their understandings. Second, participants were asked to identify options for linking ontology islands, attempting to unify the subdomains into a consistent whole. The practical exercise focused on the European Union's ontology frameworks and researcher-based ontologies for representing knowledge in the energy system with different granularity levels. One important lesson learned from the workshop discussions is that matching different ontologies is laborious. Therefore, preference should be given to well-established concepts over less established ones, e.g., backed-up by a larger institution or community-agreed standards.
Participants in the workshop agreed overwhelmingly that it is relatively straightforward for specialists in a sub-domain to develop standard terminology and to codify it in an ontology with high granularity. The difficulty lies in defining sub-domain boundaries. The challenge is to keep a specific level of detail for terms and concepts (tailored to user needs) while at the same time matching them with those that exist outside the sub-domain. A further problem is the lack of agreement on terminology, concepts, and even philosophical approaches. According to Collier and Mahon [68] and Sartori [69], empirical and theoretical concepts should be delineated to avoid (or at least be aware of) conceptual stretching, that is, to "broaden the meaning-and thereby the range of application-of the conceptualizations at hand ... i.e., to vague, amorphous conceptualizations [with the net result that . . . ] our gains in extensional coverage tend to be matched by losses in connotative precision. It appears that we can cover more only by saying less, and by saying less in a far less precise manner" [69]. An example from the workshop is the term 'asset', which is understood and operationalized differently across disciplines.
Ontology islands need to be overcome to enable integrated knowledge frameworks, yet integration and the coexistence of different ideas seem contradictory at first sight. However, ontologies are not limited to 1:1 relations; they also allow connections such as 'is similar to' or 'is connected to'. A further problem is the impossibility of replicating the complexity of reality: the number of links from one sub-domain to others is simply too large. Even aligning a few sub-domains is beyond manual editing. However, algorithms can navigate vast numbers of relations as long as they are semantically linked.
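How an algorithm might traverse such semantic links can be sketched as follows. The example assumes two hypothetical ontology islands (prefixes `A:` and `B:`) that are bridged by 'is similar to' and 'is connected to' relations, and uses a plain breadth-first search to find a chain of terms between them. All term names are invented for illustration; real ontology alignment is considerably more involved.

```python
from collections import deque

# Hypothetical terms from two "ontology islands" (prefixes A: and B:).
# Each link is (source, relation, target); all labels are illustrative.
links = [
    ("A:WindTurbine", "subClassOf", "A:Generator"),
    ("B:WindPowerPlant", "subClassOf", "B:PowerPlant"),
    ("A:WindTurbine", "isConnectedTo", "B:WindPowerPlant"),  # cross-island bridge
    ("A:Generator", "isSimilarTo", "B:PowerPlant"),          # cross-island bridge
]


def build_graph(links):
    """Treat every link as traversable in both directions."""
    graph = {}
    for source, relation, target in links:
        graph.setdefault(source, []).append((relation, target))
        graph.setdefault(target, []).append((relation, source))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns a shortest chain of terms or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _relation, neighbour in graph.get(path[-1], []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(links)
print(find_path(graph, "A:Generator", "B:WindPowerPlant"))
```

Without the two bridging links, no path between the islands would exist, which mirrors the situation of unrelated metadata described above.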
First attempts have been made to generate and align ontologies automatically, e.g., based on natural-language processing [70]. Therefore, an automatically generated ontology, namely EuroSciVoc, was among those studied in the workshop. This vocabulary organizes scientific topics that were extracted from the CORDIS database on EU projects. For energy-related entries, the classification fails to reproduce sufficiently the understanding of experts and the links between subjects. For example, one would expect 'energy crops' as an entry within agricultural sciences or 'energy history' as part of humanities/history.
As knowledge about any subject continually changes with time, the knowledge representation should capture these changes. An example is the rising importance of new energy technologies. The dynamic update does not only concern the content of some metadata but also requires the review, creation, and deletion of metadata terms over time. Therefore, implementing metadata with the help of ontologies means that specific concepts and relations have to be regularly reconciled with reality to remain timely and relevant. However, how best to enable dynamic and automated updates of concepts and connections is still a work in progress and was seen as a significant issue by the workshop participants. Approaches from the 'Continuous capture of metadata' project [71] may establish a support infrastructure. AI should support the process since the amount of information involved is impossible for humans to screen.
During the discussions, it became evident that selective knowledge representation leads to bias. As with data itself, the choice of relevant metadata may be prone to bias, particularly selection bias [72]. To some extent, this is natural since populating an ontology is usually not a randomized process but involves human choice based on prescribed knowledge [73]. The sequential character of updating the ontology may lead to self-amplifying effects that result in substantial deviations from reality. Transferring this to automated procedures, e.g., machine learning, may even increase this danger if the training data is not carefully screened for bias [41]. To avoid selection bias, a systematic review of the ontologies by experts and all relevant stakeholders should be undertaken. To this end, open metadata is essential. However, as the evaluation of metadata practices reveals, it is not uncommon for metadata, too, to be locked behind paywalls (Section 3.2.3). Aligning newly created ontologies with existing ones may also help to minimize bias. In this regard, the initiative for an Open Energy Ontology (OEO) is an important step forward. It supports co-editing and hence a pluralistic view on critical terms and concepts, minimizing possible bias. Its design process is documented on GitHub, and agreements are reached by consensus at open developer meetings [55].

Machine Perspective on Metadata Navigation
Recently, more and more routine tasks are automated due to the advances in software engineering. This also applies to many of the activities considered in the previous sections. Data search and literature review now rely extensively on algorithms. Humans are assisted if not replaced by algorithms with their own needs for clarity of communication regarding the exchange and re-use of (meta)data. Typically, strict enforcement of standards, terminologies, and protocols is needed to implement persistent and unique identifiers. Only then will it be possible to interlink, analyze, and aggregate data at the massive scale required in tomorrow's machine-based energy infrastructure.
Up to this point, we have not elaborated much on the importance of enabling the navigation of data and metadata from the machines' perspective. The rationale for machine-actionability is that the amount and complexity of energy data have surpassed the threshold of what humans can manage. Machine support is needed for finding, organizing, and analyzing data. Indeed, a rapidly increasing number of datasets in the energy domain qualify as 'big data' in terms of volume, velocity, variety, and veracity. The trend will intensify with the continued digitization of the energy sector and beyond.
An illustrative example is the amount of data generated from household-level electrical load measurements, which easily exceeds one billion data readings when monitoring the inhabitants of a medium-sized European city over a period of two years [74].
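A back-of-the-envelope calculation shows how quickly such volumes arise. The figures below (number of households and a 15-minute metering interval) are illustrative assumptions chosen for this sketch and are not taken from the cited study.

```python
# All figures are illustrative assumptions, not values from the cited study.
households = 50_000        # assumed number of metered households
readings_per_day = 24 * 4  # one reading every 15 minutes
days = 2 * 365             # two years, ignoring leap days

total_readings = households * readings_per_day * days
print(f"{total_readings:,} readings")  # prints "3,504,000,000 readings"
```

Under these assumptions, the one-billion mark is already passed after roughly seven months of monitoring.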
The following example illustrates the new way of integrating (meta)data. Energy grids can be connected to devices, such as IoT sensors, that collect significant amounts of metadata and place them in real-time storage [75]. This helps grid operators to better manage operations and helps energy suppliers to smartly match energy supply with demand. Metadata obtained through machine-to-machine data exchange and advanced IoT sensors can help make weather and load predictions more accurate, which is key to ensuring effective renewable energy integration. The combination of metadata from electrical and IoT infrastructures with AI is the fundamental aspect of cloud-based energy management systems [76]. The participation of end-users is critical for the energy cloud, as it connects and coordinates excess electricity with different end-users. This AI coordination can modify peak storage capacity based on energy generation and consumption [28]. While the grid of end-users is connected to a broader public distribution network [77], the exchange of information between various automation devices is realized by interconnecting communication networks [78].
When done correctly, machine-actionability enhances the communication of metadata and supports the efficient processing and integration of data from multiple sources. The FAIR principles were designed with machine-actionability in mind. Indeed, they were developed with the idea of serving (meta)data to machines first and to humans second. The systematic assignment of persistent identifiers to metadata and the linking of metadata are central to this. For this purpose, well-established semantic web technologies exist, e.g., the Resource Description Framework (RDF), the Web Ontology Language (OWL), and Linked Open Data (LOD). However, the use of these technologies is demanding for researchers outside of the IT community and exceeds their capacities in terms of time and skills. Consequently, activities for machine-actionability need funding, training, guidance by specialists, and supporting tools. In the end, implementing machine-actionability has to be enabled by involving the different (human) user groups of the domain-specific data. How to implement machine-actionability is still under discussion, with limited experience gathered so far in the energy domain. However, templates, software tools, and organizational support exist, ready to be customized for and with the energy research community. For example, linked data and RDF are well known and established, but simple tools that would allow anyone to leverage these technologies are missing. A panel discussion as part of the International FAIR Convergence Symposium 2020 was organized to start the community's conversation. Led by the GO FAIR initiative [79], so-called Metadata-for-Machines (M4M) workshops for the energy community were discussed, with input from experiences gained in other organizations. M4M workshops intend to kick-start FAIRification efforts in a community, generating minimally viable metadata components that are modular, reusable, and extensible later on.
The process starts with community experts who develop a set of informal terminologies. Then domain experts, with the help of data stewards, go through a formalization process using metadata templates to describe both the structure and the semantics of metadata. Moreover, care is taken that the results are compliant with the FAIR principles. Online tools such as the CEDAR workbench [80] are available to make the metadata machine-actionable (i.e., the tools convert the templates into semantic components), while sheet2rdf [81] and excel2rdf [82] are available for the simple creation of machine-actionable controlled vocabularies. In one of the first M4M workshops related to the energy domain [83], the wind energy community created and employed machine-actionable metadata templates for datasets [84]. A lightweight yet comprehensive metadata specification, such as the one proposed by the Frictionless Data framework, has also been adopted by energy-related projects [85].
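The step from an informal term list to a machine-actionable controlled vocabulary can be sketched in a few lines. The snippet below reads a small term sheet in CSV form and serializes each row as a SKOS concept in Turtle syntax. It only illustrates the general idea and does not reproduce how sheet2rdf or excel2rdf work; the column names, the example terms, and the `https://example.org/energy-vocab/` namespace are assumptions made for this sketch, and the `id` values are assumed to be valid Turtle local names.

```python
import csv
import io

# Hypothetical vocabulary sheet; in practice this would be exported from a spreadsheet.
SHEET = """id,label,definition,broader
generator,Generator,Device that converts energy into electricity,
wind-turbine,Wind turbine,Generator driven by wind,generator
"""

BASE = "https://example.org/energy-vocab/"  # placeholder namespace (assumption)


def escape(text):
    """Escape characters that are special inside a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def sheet_to_turtle(sheet_text, base=BASE):
    """Convert CSV rows into SKOS concepts serialized as Turtle."""
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        f"@prefix ex: <{base}> .",
        "",
    ]
    for row in csv.DictReader(io.StringIO(sheet_text)):
        statements = [
            "a skos:Concept",
            f'skos:prefLabel "{escape(row["label"])}"@en',
            f'skos:definition "{escape(row["definition"])}"@en',
        ]
        # Only rows with a parent term get a hierarchical link.
        if row["broader"]:
            statements.append(f'skos:broader ex:{row["broader"]}')
        lines.append(f'ex:{row["id"]} ' + " ;\n    ".join(statements) + " .")
    return "\n".join(lines)


print(sheet_to_turtle(SHEET))
```

Each term receives a resolvable identifier under the chosen namespace, which is what allows other metadata records to link to it.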
The process of documenting the implementation of FAIR principles in a community is known as the 'FAIR Implementation Profile (FIP)' [86]. In summary, the FIP is a collection of implementation choices made by a specific community for each of the FAIR principles. Once realized, FIPs provide guidance and orientation on FAIR infrastructures for different data communities. Overall, the above process of implementing machine-actionable metadata is echoed by the EERAdata project (involving data stakeholders, linking them to IT experts, developing and agreeing on common standards, and FAIRification through support tools). The panel discussion also revealed essential insights for the planning of energy-specific M4M workshops. Most importantly, the community's broad involvement is crucial to ensure that the design of metadata templates reflects the interdisciplinary character of energy science.
The metadata repositories resulting from FIPs are known as the network of FAIR Data Points. They provide software that allows data owners to expose the metadata of their data to the public for seamless access by humans and machines alike. When such repositories are created by trusted communities, reuse and uptake by other communities can accelerate the overall implementation and convergence of the FAIRification of (meta)data.

Advancing Metadata Practices: Technical Aspects
We expand on the practical approach towards defining metadata for low carbon energy research by focusing on the functions that metadata should serve for the different groups of users discussed in Section 3.1. Typically, a range of standards is implemented to provide these functionalities. We denote the set of standards chosen to deliver the desired functionality as a metadata concept, and the implementation of such a concept as a metadata practice. To understand the extent of performance, we analyze existing metadata practices in the low carbon energy research community, applying the following evaluation criteria:
• Richness relates to the extent and completeness of metadata, supporting the usefulness of metadata. For example, only 12 identifiers for administrative data are used in the repository 'figshare', whereas 100 identifiers are requested for the 'dataverse'.
• Consensus concerns the level of agreement in the community and the institutional backing. For example, a consortium working on the concept has more extensive backing than an individual, which supports the credibility of the metadata. Alternatively, a community may have formally approved a practice, or the practice may even have been adopted by standard-setting institutions such as CODATA. Continuous curation further increases consensus.
• Accessibility and transparency concern the ease of using metadata for different stakeholders. This criterion supports usefulness and findability, answering the following questions: Does metadata support the empowerment of many users, e.g., by providing information on varying levels of granularity? How good is the quality of data documentation? Are open standards supported? Specifically for machines: Are persistent and unique identifiers used? Are they retrievable by standard search engines?
• Linked metadata supports compatibility and interoperability between data. This criterion concerns how well metadata is interlinked, structured, and grounded, allowing navigation at different levels.
• Functional implementation should be up to standard, supporting authorization and authentication. Recommended are the ubiquitous use of persistent identifiers, a high degree of modularity, and the inclusion of licensing information. JSON-LD and RDF/XML are recommended metadata formats.
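To make the functional-implementation criterion more tangible, the following sketch builds a minimal JSON-LD description of a dataset and runs three simple checks on it. The record uses schema.org terms (`Dataset`, `identifier`, `license`, `keywords`); the DOI, dataset name, and the `check_record` helper are hypothetical and serve illustration only. Treating a `https://doi.org/` prefix as proof of a persistent identifier is a deliberate simplification and not part of the evaluation reported here.

```python
import json

# Hypothetical dataset description using schema.org terms in JSON-LD.
# The DOI, name, and keywords are placeholders, not a real dataset.
record = {
    "@context": "https://schema.org/",
    "@type": "Dataset",
    "@id": "https://doi.org/10.5555/example-dataset",
    "name": "Hourly household electricity load (example)",
    "identifier": "https://doi.org/10.5555/example-dataset",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "keywords": ["electricity load", "smart meter"],
}


def check_record(rec):
    """Return simple, illustrative checks inspired by the criteria above."""
    identifier = rec.get("identifier", "")
    return {
        # Simplification: only DOIs are recognized as persistent identifiers.
        "has_persistent_identifier": identifier.startswith("https://doi.org/"),
        "has_license": bool(rec.get("license")),
        "is_linked_data": "@context" in rec and "@type" in rec,
    }


print(json.dumps(record, indent=2))
print(check_record(record))
```

A spreadsheet or PDF description of the same dataset would fail the last check, since it offers no machine-readable context for its terms.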
The above criteria to evaluate metadata practices were derived from reviews and discussions in the workshops (Section 2). It is important to note that equal emphasis has been put on assessing human-readability and machine-actionability of metadata. The requirements were tested against the extent to which they align with standards proposed in the literature [12,87–95]. The Supplementary Materials offer additional details on the development of the evaluation criteria as well as the results from the assessment of metadata practices within the community (Supplementary Materials, Table S1).
The workshop discussions led to identifying several metadata practices and platform developments in the energy domain, most of them being in-house developments. Twelve of the more mature metadata practices were chosen for assessment. These are (1) metadata strings of the Open Energy Platform, (2) the Standard International Energy Product Classification, (3) the Energy Identification Code scheme (ENTSO-E), (4) the CityGML Energy Application Domain Extension, (5) the Common Information Model by the International Electrotechnical Commission, (6–11) metadata models of the projects Hotmaps, EMODnet, ShareWind, PV-GIS, NOMAD, SEMANCO, and (12) the Materials Experiment and Analysis Database. The reader should note that the assessment of the metadata practices is not meant to scrutinize these pioneering efforts. Instead, the evaluation is used to analyze the current state of the art and to identify common prevailing issues.
We find a large diversity in compliance when applying the evaluation criteria listed above. Many of the practices show a high granularity within their field of application (criterion on richness). Still, they do not link systematically to areas outside of their scope (criterion on linked metadata). This limits the possibility of traversing the metadata for broader reuse. Besides, the existence of these metadata standards is usually unknown outside of the particular sub-domain. We also found that some of the metadata practices are already supported by standard-setting organizations. An example is the Common Information Model by the International Electrotechnical Commission. However, this example also shows that the metadata information is only available under commercial licensing, which is a major impediment because it prevents critical scrutiny and co-editing of the standard (accessibility and transparency criteria). On the opposite end of the spectrum lies the Open Energy Platform, which encourages broad community involvement. The initiative is currently the broadest and most comprehensive approach towards metadata in the energy domain. The biggest challenge for all metadata practices in the energy domain is, however, machine-actionability. Many of the current metadata practices are limited to documentation for humans in the form of, e.g., widely used spreadsheets and PDF files. A proper functional implementation does not require the use of semantic web technologies but would nonetheless benefit from them. For further details, see also Table S1 in the Supplementary Materials.

Lessons Learned
A common ground for metadata is essential for the communication and validation of results in the energy research domain. The importance of standards will increase with the sustained digitization of the energy sector, creating demand for both human-readability and machine-actionability of data. The stakeholders of energy (meta)data form a diverse group whose members use data differently and have specific needs, with varying expectations on the coverage, granularity, and accessibility of (meta)data. Also, effective communication of data within and between user groups requires domain-specific metadata, emphasizing the need for agreed-upon standards and user-group-specific implementations. Multiple conceptual approaches and metadata practices exist. Many of them are, however, islanded solutions, and connecting them is a big challenge. Moreover, the domain-specific translation of the FAIR data guiding principles is a new research area, and few mature best practices exist. Science has reached a threshold where data curation is becoming so resource-intensive that an individual researcher becomes overburdened with this task, rendering the organization of a community-wide process to FAIRify domain (meta)data nothing but wishful thinking. The workshop discussions and literature review revealed several specific challenges, which we highlight below. The final section complements these conclusions of this community white paper with recommendations.

• Reaching agreements on common standards requires a synthesis of established terminologies, vocabularies, and data formats to create rich and linked, but still comprehensible, metadata that caters to the needs of all stakeholder groups. Metadata concepts to track the domain- and user-specific workflows are still under development. Practices on licensing and access control are uncommon. While the functional implementation should adhere to emerging solutions found in the ICT community (e.g., the widespread use of persistent identifiers, semantic web technology), it needs to tie in with established practices in the energy domain and match existing IT competencies.
• Limited functionality of supporting tools and infrastructure. If an energy research team aims at tackling the problem of interoperability and machine-actionability of their (meta)data, very limited or unspecific top-down support is available. For example, one of the main projects of the European Cloud, the EUDAT Collaborative Data Infrastructure, offers support for integrated data services. However, none of the seven core communities covers low carbon energy research, and none of the use cases of the European Open Science Cloud includes best practice examples from the field. (Compare with the information about the core communities at https://eudat.eu/news/eudat-keeps-engaging-communities-all-you-need-to-know-on-the-eudat-datapilots, accessed on 1 September 2021.) At the same time, guidance is needed to address the lack of resources and skills. In consequence, the reluctance to engage with new data curation technologies appears well founded.
• Durable solutions for the energy domain and beyond. Since the context of data is given by the current domain knowledge, metadata is inevitably dynamic in nature. Therefore, metadata needs to be updated to align with the current state of knowledge. Moreover, the relevance and appropriateness of data descriptions have to be regularly verified. No practices are established yet to enable dynamic and machine-supported updating.
• Data curation literacy across user groups. The lack of FAIR data curation skills is an issue for all stakeholders of (low carbon) energy data. Indeed, the level of competence ranges from IT enthusiasts to technology skeptics. Also, little experience exists in mapping specialist terminology to non-specialist terminology to facilitate the broad reuse of (meta)data.
• Organizing a community approach. For advancing (meta)data practices, a framework able to address and involve all stakeholders is needed. A suitable framework should have high visibility in the community to ensure legitimacy and to promote the uptake of standards. The challenge is that the implementation of such a framework is beyond the possibilities of a few funded projects because it requires a long-term and consistent commitment.
• A challenge to fund data curation activities. Data curation as well as skill building is resource-intensive for individual researchers and institutions alike, but available funding clearly does not match the needs.
• Lack of acknowledgement for data curators. Data curation is not acknowledged in the same way as data processing and analysis. Despite its relevance, this task is largely seen as inferior. Moreover, contributions by data curators are generally less visible (this includes a lack of suitable journals to publish on the matter).

Recommendations
Advancing FAIR (meta)data practices for low carbon energy research data supports the energy transition and enables the sector to tap the potential of the knowledge economy. However, the explorations in this community report reveal a number of challenges. We address these with the following recommendations:

1. Augment and synthesize existing metadata practices in the energy domain. The starting point to reach agreements on common standards is the identification, extension, and integration of existing approaches. This includes the search for best practices in other domains. In this way, domain-specific attempts can be matched with emerging solutions. At the same time, researchers are met where they are, and their contributions are put to use. With this white paper, we invite interested researchers and existing initiatives in the field to organize a joint conference with the Open Energy Modelling Community, the Wind Energy Community, and the EERAdata project, among others.

2. Establish platforms that gather and document domain-specific metadata practices. The recommendation is to build a knowledge base with supporting tools, guidelines, best practices, FAIR implementation choices, and links to (meta)data repositories. The hub should be anchored in a number of widely frequented energy community websites. The platform should also offer contacts to domain experts and other FAIR data curators (e.g., linking energy-related working groups of top-down initiatives). This promotes the use of general metadata standards such as Dublin Core and also supports an energy-specific implementation of FAIR (meta)data principles that aligns with the general trends. The FAIR Implementation Profiles [86] can serve as an inspiration.

3. Build a FAIR energy data network and foster long-term commitment. A starting point is to utilize existing international and EU-level networks in the field. The idea is to link up with periodic conferences and workshops, which are already mainstream meeting places for the energy community. Established international umbrella organizations that target the wide range of energy data stakeholders, from researchers to policy-makers, industries, funders, publishers, and so forth, appear to be a good choice. One candidate is the European Energy Research Alliance (EERA), working at the interface of European policy-making, national governments, the research community, and industry. Members of EERA engage in a broad range of research and cross-cutting activities facilitated in 18 Joint Programs. In particular, the recently established transversal program 'Digitalization of Energy' may serve as a suitable means to kick off community activities.

4. Develop a multi-year work program actively engaging low carbon energy data stakeholders to (1) estimate the resources needed to implement and sustain FAIR data activities from the perspective of user groups, (2) coordinate the development of community practices and consensus on standards, (3) organize workshops to advance unresolved issues (incl. the open/closed data spectrum, incentive problem of data curation, business opportunities, data protection and GDPR, data licensing, and so forth), and (4) promote the building of FAIR data competences and skills.

5. Advocate for broadening funding programs and encourage industry funding. The lack of funding for FAIR data activities is a major barrier to advancing community practices, building the necessary skills, and increasing awareness of the need for data curation. It is important to actively approach funders to motivate the funding of, for instance, on-the-job training and the development of new university education on energy data stewardship. Industry should also be targeted, and public-private partnerships seem a way forward in this respect.
We conclude this community report on the advancement of (meta)data practices for low carbon energy research with a quote by Scott [96] that "Metadata is a love note to the future". In the same spirit, we hope to inspire the readers. It is the hour for pioneers to discover largely uncharted territory.

Conflicts of Interest:
The authors declare no conflict of interest.