Transitions in Journalism—Toward a Semantic-Oriented Technological Framework

The technologies behind today’s web services, tools, and applications are evolving continually. As a result, the workflows and methods of different business sectors are undergoing constant change. The news industry and journalism are heavily affected by these changes. New technological means for practicing journalism and producing news items are being incorporated in media workflows, challenging well established journalistic norms and practices. The perpetual technological evolution of the web creates a wide range of opportunities. For this reason, both technology companies and media organizations have begun to experiment with semantic web technologies. Our focus in this paper was to discover and define the ways that semantic technologies can contribute to the technological upgrade of everyday journalism. From this perspective, we introduced the term ‘semantic journalism’ and attempted to investigate the transition of journalism to a semantic-oriented technological framework.


Introduction
The introduction of the Internet and the exploitation of its services resulted in a variety of consequences for the way the news industry operates (Deuze 2003) and the way journalism is produced (Van der Haak et al. 2012). The emergence of new tools and practices, along with the digitalization of the work process, provided new ways to produce information and a redefinition of professional journalism (Deuze and Witschge 2018). Currently, the audience has direct access to multiple digital sources, ranging from mainstream media to individual blogs and digital social platforms, from which they can create and distribute a wide range of information (Veglis and Maniou 2018). Journalists are expected to operate multiple services and platforms in order to be constantly informed of what is happening and deliver the news (Spyridou et al. 2013). In addition, the rapidly growing tendency of media organizations to use advanced web technologies in order to manage the everyday workload of news production and publication has attracted the interest of many academic researchers (Maass and Kowatsch 2012; Osman et al. 2014; Caswell 2019). These researchers conceptualized the transformation of journalism and explained the drastic changes under the rubric of convergence.
In Newman's recent survey (Newman 2019), Digital News Report: Journalism, Media and Technology Trends and Predictions 2019, the intentions of 200 publishers from 29 countries to invest in harnessing the potential of artificial intelligence (AI) and machine learning (ML) were clearly captured. Other studies in the field presented evidence that journalists understand the merits and the necessity to enhance their practices by exploiting new technological capabilities and foster innovation in the newsroom (Paulussen 2016). The emergence of more and more technologically advanced types of journalism leaves no space for doubts regarding the technological optimization of today's journalism.
Using these findings as a starting point, this article suggests that there is a burgeoning need to further investigate how advanced web technologies and services like the semantic web (SW) influence journalism and media studies. Many scholars have attempted to present semantic-oriented solutions to long-standing problems experienced by various companies, including media organisations, for example information retrieval, data integration, prominence of news material, discovery, collection and verification of content, data management and many more (Janev and Vranes 2009). Although this work offers important guidelines for understanding the SW and its implications for journalism, as well as insights for better use, it has not been categorized, described and placed in a context that showcases journalism in the SW environment. For this reason, this study's principal purpose is to introduce the new-born practice of semantic journalism (SJ), which derives from the development of the SW and the embedding of semantic technologies in everyday journalism practices. Secondarily, we intend to dispel doubts regarding SJ by presenting the relevant documentation and arguing that it is already happening and evolving behind the scenes, fragmented and marginalized in an uncharted field. Overall, through the examination of various semantic-based solutions for journalism-related tasks, this research effort attempts to describe how the profession of journalism could be envisioned and realized in a technologically enhanced web environment, such as the SW.
The rest of the article is organized as follows. Section 2 provides a detailed literature review and is divided into four subsections. The first subsection deals with the problematic aspects of the current web and proposes the SW's philosophy as an alternative solution. The second gives a brief presentation of the old and new web standards on which semantic solutions base their applications. The third subsection outlines the SW's technological framework in which journalism is contextualized. The fourth subsection lists the semantically oriented technological standards developed specifically for the news industry, featuring examples of real use cases. Similarly structured, Section 3 continues with the definition of the term semantic journalism (SJ), followed by its characteristics. Subsequently, the classification between certain semantic journalism production processes and publication practices is documented and analyzed. Finally, Section 4 discusses the major limitations of this study, followed by our concluding remarks and future research extensions.

The Emergence of the Semantic Web
The web as an information space has been through various phases of development since it was established for public use. This progress was accomplished through the years, by incorporating modern innovative web technologies that redefined several aspects. This led to a mixture of spectacular successes and failures (Choudhury 2014). Compared to other media, like TV, radio, and newspapers, the simplicity, availability, reachability, and reduced exploitation costs of the web rendered it the most popular and convenient platform for information publishing and dissemination (Panagiotidis et al. 2019). However, most of this information is published as unstructured text that is made available to a general audience only by means of web pages (Frasincar et al. 2011). In other words, although today's web services facilitate and underpin actions, such as interaction, participation, collaboration, content creation, and information sharing, it seems that the implementation means of these actions are flawed in terms of usability (Saridou et al. 2018). Over the years, the web's infrastructure proved insufficient and unable to manage the ever-growing unstructured amount of inherent information.
Journal. Media 2020, 1, 1
The key problem of today's web is located in the following paradox: while numerous technological developments enabled and promoted the production and circulation of web content, the developments required to organize, manage, and manipulate this content have been left behind. Unfortunately, the web grew primarily in terms of volume and not in structure. The media industry, and especially journalism, welcomed the abundance of information on the web because they understood the dynamics and the potential benefits of its exploitation. However, the latter is not fully feasible today because journalists must navigate in a network of unstructured interconnected forms of information (e.g., documents, images, and text) (Kuck 2004), which are designed only for human consumption (Shadbolt et al. 2006; Antoniou et al. 2012). On the one hand, this status quo strengthened journalists' performance regarding information availability, but on the other hand, undermined it in terms of functionality and efficiency. In this technological era, where computers cannot read, understand, or manipulate information as humans do, certain journalistic tasks (e.g., discovery, collection, verification, analysis of information, etc.) are difficult to complete in a meaningful and effective way. At this point, SW technologies step in and offer suitable solutions.
According to Berners-Lee et al. (2001), "the semantic web is an extension of the current web, in which information is given well-defined meaning, better enabling computers and people to work in cooperation". The need for this extension was brought to the foreground by the founders of the web themselves as a solution to the problems caused by its anarchic structure. The SW's added value is the structure it proposes. By semantically structuring web pages or documents, web content gains meaning (Poullet et al. 1997), and the web as a whole becomes more accessible to computers (Antoniou et al. 2012). As Choudhury (2014) argues, the "SW is driving the evolution of the current web by enabling machines to understand and respond to complex human requests". Such an understanding requires that the relevant information sources be semantically structured. The aim of the SW architecture (Horrocks et al. 2005) is to structure the web content in such a way as to enable users to effectively discover, integrate, exchange, and interlink information (Ankolekar et al. 2007) and improve data management. In the SW framework, every piece of information will transmit its meaning, allowing computer programs to interpret information like humans and intelligently generate and distribute useful content tailored to human needs (Aghaei et al. 2012).

Semantic Web Components
In order for the semantic solutions to be applied, certain old and new technological standards are used, most of which are maintained by the World Wide Web Consortium (W3C). They are presented briefly below, to introduce the reader to the SW era. It is not within the intentions of this paper to delve deeper into every standard. Specifically:

• Unicode is the global character encoding standard, with more than 137,000 characters (Unicode Consortium 2019), and it is used to represent and manipulate text in many languages.

• The Uniform Resource Identifier (URI) is a globally scoped character string that is used to identify digital resources or concepts on the web (Berners-Lee et al. 2005).

• The Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human- and machine-readable (Bray et al. 2008). Additionally, XML Schema is the vocabulary that is used to describe and validate the elements inside an XML document and to show the interrelationships between them and their complementary attributes (W3C World Wide Web Consortium).

• The Resource Description Framework (RDF) is a set of specifications used to create statements for conceptual descriptions of the information implemented in web resources. Moreover, RDF Schema is a data-modelling vocabulary for RDF data that is used to describe groups of resources and the relationships between them through a hierarchical system of classes and properties (Decker et al. 2000).

• The Web Ontology Language (OWL) is a family of representation languages that provides terminological knowledge, bringing reasoning power to the SW. It provides infrastructure to RDF statements by describing concepts, characteristics of properties, constraints, and categories of things and relationships (W3C World Wide Web Consortium). An ontology is an explicit and formal specification of a conceptualization and is used for integrating heterogeneous databases, enabling interoperability among disparate systems and specifying interfaces to independent, knowledge-based services (Gruber 1995).

• Lastly, the SPARQL Protocol and RDF Query Language (SPARQL) is a semantic query language for databases and is used to retrieve data that follows the RDF specifications (W3C SPARQL Working Group 2013).
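
To make the roles of URIs, RDF statements, and SPARQL-style retrieval more tangible, the following minimal sketch models statements as subject-predicate-object triples, prints them in an N-Triples-like form, and answers one triple-pattern query. It is an illustration under stated assumptions only: the `http://example.org/` namespace, the sample facts, and the helpers `Literal`, `to_ntriples`, and `match` are all hypothetical, and the matcher is a toy stand-in for a SPARQL engine rather than an implementation of the W3C specifications.

```python
from typing import Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

EX = "http://example.org/"  # hypothetical namespace, used only for this sketch


class Literal(str):
    """Marks an object value as a plain text literal rather than a URI."""


# Each statement is a (subject, predicate, object) triple, as in RDF.
TRIPLES: List[Triple] = [
    (EX + "article/1", EX + "headline", Literal("Port reopens")),
    (EX + "article/1", EX + "mentions", EX + "place/athens"),
    (EX + "article/2", EX + "mentions", EX + "place/athens"),
    (EX + "article/2", EX + "author", EX + "person/bob"),
]


def to_ntriples(triple: Triple) -> str:
    """Serialize one triple in N-Triples style: <s> <p> <o> .

    Only backslashes and double quotes are escaped in literals here;
    a full serializer would handle more cases.
    """
    s, p, o = triple
    if isinstance(o, Literal):
        escaped = o.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{o}>"
    return f"<{s}> <{p}> {obj} ."


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that fit the pattern; None acts as a wildcard."""
    for triple in triples:
        if all(want is None or want == got
               for want, got in zip((s, p, o), triple)):
            yield triple


# Roughly the intent of a SPARQL query such as:
#   SELECT ?article WHERE { ?article ex:mentions <http://example.org/place/athens> }
articles_about_athens = [
    s for s, _, _ in match(TRIPLES, p=EX + "mentions", o=EX + "place/athens")
]

for triple in TRIPLES:
    print(to_ntriples(triple))
print(articles_about_athens)
```

Because every resource is named by a URI, statements published by different sources about the same URI can be merged into a single list and queried together, which is the property the later sections rely on.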

Contextualize Journalism in the Semantic Web
In today's exploding web landscape, where vast amounts of information are published from different sources across the world, journalists and media professionals often find it difficult to take advantage of all the available data. For this reason, and in order to deal with the everyday workload, professionals use more and more web services, tools, and applications (Maass and Kowatsch 2012). Unfortunately, no matter how many technological aids they use, these are often not sufficient to fully exploit the content of the web. The evolution of the SW will introduce a web of data (Polleres and Huynh 2009) in which the relations of every data item will be fully clarified and structured. As a result, humans and computers will be able to coexist and collaborate in an environment where the interconnection of concepts, rather than simply documents, will be feasible. In this way, journalists will acquire the power to recover targeted information and draw useful conclusions regarding topics that were previously difficult or impossible to research.
The SW promises an upgraded web environment with potential journalistic capabilities, powered by advanced technological features (Antoniou et al. 2012). The major contribution of these features is to define new methods of web information usage by computers, not only for display purposes but also for interoperability between systems (Cardoso et al. 2007). In the context of journalism, this type of technological evolution will unlock various prospects of data exploitation and information handling (Panagiotidis et al. 2020). As Saridou et al. (2018) argued, "the technologies responsible for upgrading the usability of the current web and its content are at the same time responsible for upgrading journalism and its research products". In theory, the applicability of SW's philosophy on contemporary investigative practices leads journalism to a higher functional level.
In practice, the use of semantic techniques for exploiting structured data (Gray et al. 2012) enables the development of new tools and methods not only for information processing (Heravi et al. 2012) but also for discovery, collection, analysis, integration, and reuse. This is why leading media organizations, in cooperation with technological giants and startups, are already beginning to embrace and incorporate SW features into their workflows. The British Broadcasting Corporation (BBC) constitutes one of the most notable examples of semantic technologies adoption. The process has been revolutionary and has been in progress since 2004. In brief, it started with a general philosophy of assigning one web address (or Uniform Resource Locator, URL) to every BBC programme (i.e., TV, Radio, etc.) and afterwards converting these identifiers to URIs. Then, it rolled out incrementally with the consumption of Open Data, the adoption of Linked Data principles and the exploitation of RDF and ontologies in the organisation's websites. It culminated in the 2010 World Cup website, the 2012 Olympic website and the BBC Programmes project, providing a hitherto traditional journalism outlet with a significant strategic advantage in the market (Feigenbaum 2012). Although it may not be clearly perceived, low-level semantic technologies are also trending in a few other media pioneers like the New York Times (NYT), Thomson Reuters, the Norwegian National Broadcaster (NRK), Agence France-Presse (AFP) and the Associated Press (AP). These are some of the leading media organizations that are working towards the utilization of semantic services, technologies and tools in various ways, ranging from speeding up research to accumulating and cross-referencing data (Engels and Tønnesen 2007; Underwood 2019).

Scenery has a wide variety of characteristics. Developers are using semantic technologies to augment the ways in which they create, link, and reuse content on social media sites (Bojars et al. 2008). Services, like Spectee and Trint, which provide the means to extract information from user-generated content (UGC) by semantically analyzing the aural, visual, and textual components, have become increasingly popular. Likewise, services, like Factmata and Fabula AI, and tools, like Twitteraudit, can be seen as detection mechanisms that give journalists the ability to better understand multimedia content and verify the authenticity of information (Saridou et al. 2018). Lastly, there are services, like Google Street View (Anguelov et al. 2010), that correlate content with geolocation data. Konstantinou et al. (2010) stressed that the great potential of the SW and its technologies becomes more evident through a number of semantically rich online applications. An indicative set of case studies presented by the W3C Semantic Web Education and Outreach (SWEO) Interest Group (www.w3.org/2001/sw/sweo/public/UseCases/) demonstrated the potential carried by semantic technologies in several areas of interest, including the arts, automotive, broadcasting, cultural heritage, education, eGovernment, energy, eTourism, financial, geographic information systems, health care, the IT industry, and many more.

Semantic Web Standards for the Media Industry
There are also semantically oriented technological standards developed specifically for the news industry. The International Press Telecommunications Council (IPTC; https://iptc.org/standards/) is responsible for them, and their purpose is to improve the management and exchange of information between content providers, intermediaries, and consumers. Many news industry giants are currently members of the IPTC, like the Deutsche Presse-Agentur (dpa), BBC, Getty Images, Press Association (PA), and Reuters. Some of the most widely spread standards are the following:
• News Industry Text Format (NITF) is a solution for sharing news. Through the use of XML, the standard defines the content and structure of news articles in order to make them more searchable and useful. NITF is the means for identifying and describing numerous news characteristics, such as the who, what, when, where and why of an instance. Currently, many news agencies/companies publish NITF documents as Table 1 showcases.

• Photo metadata is a global standard that is used to provide general information regarding a picture (e.g., the format, copyrights, location, etc.), to be readable by both machines and humans.

• Video Metadata Hub is a newly introduced standard, and it serves as a recommendation, comprising widely used properties of a video (e.g., the name, data type, etc.), for shared video management.

• Media Topics, as recently updated, is a vocabulary with over 1200 terms used for the classification of media items based on text categorization.

• NewsML-G2 is the most common standard for managing and exchanging individual news items and packages of news items in various formats (e.g., text, images, graphic, video, and audio) between various news outlets (e.g., newspapers, TV stations, and news aggregators). It provides rich metadata in XML format in order to facilitate the convergence with the SW, in terms of rich functionality, ease of use, compactness, and compatibility. Furthermore, the EventsML-G2 and SportsML standards are encountered under NewsML-G2, and these are used for the interchange of events and sports data (e.g., facts about an event, statistics, etc.), respectively. Some of the media organizations that are currently using NewsML-G2, EventsML-G2, and SportsML standards are listed in Table 2.

• RightsML is a standard realized through rights expression languages. These are machine-readable languages that determine the various restrictions and permissions for the use of a digital media item (e.g., an image) when it is made available within the media industry (e.g., by photo agencies).

• Ninjs standardizes the representation of news in the JavaScript Object Notation (JSON) format. JSON is a data format with a wide range of applications. By expressing news items in JSON format, the information becomes easily interpreted and interchanged between Application Programming Interfaces (APIs), mobile apps, and others.

• rNews is a standard for embedding machine-readable metadata in online news. As news publishers are using their web pages to disseminate information, by using semantic markup to annotate news-specific metadata they render their web page documents and the content of their databases more recoverable by machines and more visible to the search, social, and aggregation ecosystem. The New York Times (https://www.nytimes.com/) and the Austria Presse Agentur (APA) (https://www.ots.at/) are currently using this standard.

The term semantic journalism (SJ) derives from the convergence of the terms semantic web and journalism. The key points of SJ theory are the following: It represents a new trend in journalism. It aims at the modernization of existing workflows by adopting the fundamental principles of the SW. It embodies all those use cases of applications with semantic characteristics that can serve journalism. It fulfills the technological enrichment of research methods, such as searching, collecting, editing, categorizing, and verifying information. It leverages specific aspects of semantic technology through a journalistic perspective. It marks the use of new tools that improve the management, analysis, and dissemination of web content for greater efficiency and impact of journalistic outputs. Lastly, it maps a new field of application for semantic technologies.
In detail, SJ is a technologically advanced type of journalism that integrates characteristics from the SW in order to provide journalists with modern alternatives. The use of semantically structured data and of tools with semantic features is among the first indications of the transition from journalism to SJ. This new breed of journalism concerns all journalists who understand the power of the Internet and its services. In recent years, the development of the SW, as an extended version of today's most popular service of the Internet, the World Wide Web, has attracted significant attention due to its aggregation features (Heravi et al. 2012). Some of the media organisations mentioned before have already devoted serious amounts of resources to explore and take advantage of semantic technologies. This is because the operational and managerial difficulties of today's web leave them with no choice but to search for more efficient methods of data exploitation. The prevailing circumstances are the ones that cause the development and application of semantic technologies in the media. These, in turn, generate fruitful conversations regarding the future of journalism.

Characteristics of Semantic Journalism
The SW package of theory and practice gives everyday journalistic activities a technological boost. Actions that are able to accommodate semantic features lead to the realization of journalism in the SW context. Through this process, the extension of journalism and the construction of SJ are achieved. The concept of SJ is controversial, however: although it is easily conceived, it remains undocumented. Over the years, initiatives, like CNN's iReport (Hellmueller and Li 2015) and BBC's News Lab BBC Programmes (Dowman et al. 2005) discussed the different ways journalists can benefit from the use of semantic technologies.
One of the recent efforts to investigate and identify this field, within the specific context of user-generated content (UGC) and social media, is social semantic journalism (SSJ) (Heravi et al. 2012). SSJ and its framework deal with the vast amount of UGC across social media platforms and the limited amount of time that journalists have to spare in order to extract potential news stories from these mostly unstructured, unfiltered, and unverified data. For this reason, it proposes a semantic-based solution that can formalise and link unstructured UGC to other semantically enriched data sets for integration, verification, and fact-checking purposes (Heravi and McGinnis 2015). The researchers behind the SSJ effort focused on assisting journalists with content creation by providing them with contextual information, first for aggregating and verifying user-generated news and second, for making judgments related to a social media item or a media item from an unknown source.
Another initiative of high significance that embodies semantic features is structured journalism or database journalism (Caswell 2019). This is an emerging form of journalism that publishes news content as entries in a database, enabling users to explore the content in ways that reveal trends and patterns and create new stories and visualizations. The database itself is the story, which enables researchers to explore and sort the content guided by their own personal curiosities, mix and match bits and pieces of information and see tallies of published work (Gourarie 2015).
However, the introduction of SJ serves a wider purpose. SJ is an attempt to address pre-publication and post-publication issues and catalogue every aspect of journalism practice with semantic orientation. This is the reason we believe that SJ is an umbrella term that incorporates not only news production processes, but also publication practices that exploit semantic technologies.

Semantic Journalism Production Processes
The classification regarding the semantic-oriented news production processes, as depicted in Figure 2, could have been made in various ways as most of them are not completely isolated and independent, but instead have overlaps and are related to each other. As the analysis below will show, although some production processes are self-sufficient and can stand on their own, they can also be identified in one or more publication practices. We choose to proceed with this segmentation in order to investigate them from a specific point of view, that of SJ.
Semantic annotation is defined as the action of describing (part of) an electronic resource (e.g., text, image, video, etc.) using metadata whose meaning is formally specified in an ontology (Fernández 2010). This is the process that creates semantic labels of documents for the SW, aiming to establish relations between pieces of annotated data. Semantic annotation plays an important role in a variety of semantic applications, such as the generation of linked data, extraction of information, advanced searching (based on concepts), and reasoning regarding web sources (Pech et al. 2017). Currently, there are a few semantic annotation use cases that are potentially useful for journalists, such as the Concretely Annotated New York Times schema, which provides layers of annotation for over 1.8 million of its published articles (Ferraro et al. 2014). Additionally, real-time semantic annotation use cases that are potentially useful for journalists are the Slovenian Press Agency paradigm (Event Registry 2017) and the SciHi blog (http://scihi.org/) powered by the company YOVISTO.
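
As a rough illustration of what semantic annotation produces, the sketch below links surface mentions in a sentence to concept identifiers taken from a tiny lookup table that plays the role of an ontology. Everything in it is an assumption made for the example: the `ONTOLOGY` table, its `http://example.org/` URIs, the `Annotation` record, and the `annotate` function are hypothetical, and exact string matching stands in for the far richer entity recognition and disambiguation that production annotators perform.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical mini-ontology: surface form -> (concept URI, concept class).
ONTOLOGY: Dict[str, Tuple[str, str]] = {
    "BBC": ("http://example.org/org/bbc", "Organization"),
    "Athens": ("http://example.org/place/athens", "Place"),
}


@dataclass(frozen=True)
class Annotation:
    start: int          # character offset where the mention begins
    end: int            # character offset just past the mention
    text: str           # the mention as it appears in the document
    concept_uri: str    # identifier of the ontology concept
    concept_class: str  # class of that concept in the ontology


def annotate(text: str, ontology: Dict[str, Tuple[str, str]]) -> List[Annotation]:
    """Return one annotation per whole-word occurrence of a known label."""
    found: List[Annotation] = []
    for label, (uri, cls) in ontology.items():
        pattern = r"\b" + re.escape(label) + r"\b"
        for m in re.finditer(pattern, text):
            found.append(Annotation(m.start(), m.end(), m.group(), uri, cls))
    # Sort by position so annotations follow reading order.
    return sorted(found, key=lambda a: a.start)


sentence = "The BBC opened a new bureau in Athens."
annotations = annotate(sentence, ONTOLOGY)
for a in annotations:
    print(a)
```

The annotations are stored apart from the text and point back into it by offset, so several layers of annotation can coexist over the same article without altering it.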
The rest fall into the blogosphere's realm. Through the use of semantically enriched blog metadata, the following processes take place (Cayzer and Shabajee 2003; Cayzer 2004a, 2004b):
The Semantic View enables context-sensitive, schema-driven views of the blog content. For example, bibliographic items within a blog are viewed as small explanatory cards arranged in tables, grouped in clusters, and based on a scoring system.
The Semantic Navigation facilitates new navigation modalities for finding items of interest by using links such as "agrees with", "part of", and "find similar entries".
The Semantic Query empowers richer query and discovery mechanisms over and above free text search. Semantic metadata can help build rich queries for accessing a community's collective knowledge using links, such as "Find all blog entries about a paper written by this author" and "Find all blog items about my friends".
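
The navigation and query processes above can be sketched over a handful of blog metadata statements. The example below answers "Find all blog entries about a paper written by this author" by chaining two lookups, and follows an "agrees with" link for navigation. It is a minimal sketch under stated assumptions: the identifiers (`entry:1`, `paper:A`, `person:smith`), the predicate names, and the helper functions are hypothetical and are not taken from the semantic blogging work cited above.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical blog metadata as (subject, predicate, object) statements.
TRIPLES: Set[Triple] = {
    ("entry:1", "about", "paper:A"),
    ("entry:2", "about", "paper:B"),
    ("entry:3", "about", "paper:A"),
    ("entry:3", "agreesWith", "entry:1"),
    ("paper:A", "author", "person:smith"),
    ("paper:B", "author", "person:jones"),
}


def subjects(triples: Set[Triple], predicate: str, obj: str) -> Set[str]:
    """All subjects linked to `obj` through `predicate`."""
    return {s for s, p, o in triples if p == predicate and o == obj}


def entries_about_papers_by(triples: Set[Triple], author: str) -> List[str]:
    """Semantic query: blog entries about any paper written by `author`."""
    entries: Set[str] = set()
    for paper in subjects(triples, "author", author):
        entries |= subjects(triples, "about", paper)
    return sorted(entries)


# Semantic query: two hops (entry -> paper -> author), beyond free-text search.
print(entries_about_papers_by(TRIPLES, "person:smith"))

# Semantic navigation: which entries declare agreement with entry:1?
print(sorted(subjects(TRIPLES, "agreesWith", "entry:1")))
```

A free-text search for the author's name would miss entries that discuss the paper without naming its author; the typed links are what make the two-hop question answerable.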

Semantic Journalism Publication Practices
Despite their contribution so far, the last three processes also play a significant role as parts of the semantic blogging practice. This is one of the four semantic-oriented news production practices demonstrated in Figure 3. Semantic blogging is defined by the use of rich metadata, as emphasized above, in order to transform blogs from simple online diaries to full participants of an information-sharing ecosystem. This vision proposes the materialization of informal decentralized knowledge management. The semantic blogging demonstrator is a simple prototype application that was produced around bibliography management and includes the abovementioned semantic capabilities (Reynolds et al. 2004). Additionally, BBC Programmes is considered a successful example of the embodiment of semantic blogging technology (Raimond et al. 2010).
Semantic publishing refers to the process of publishing content on the web accompanied by metadata (Shadbolt et al. 2006) and well-defined, machine-readable information, which enables computers to understand its structure and even its meaning (Shotton et al. 2009). As Peroni (2017) points out, "Semantic publishing involves the use of web and SW technologies and standards for the semantic enhancement, for example of a published journal article, so as to improve its discoverability, interactivity, openness and (re-)usability for both humans and machines". Complementary literature states that the term describes a set of technologies that set up the conditions for the semantic enrichment of content, aiming to augment its meaning, facilitate its linking to semantically related articles, provide access to data within the article in actionable form, or permit the integration of data between papers (Kuhn and Dumontier 2017). Representative examples of semantic publishing web sites and solutions, such as those of BBC and Ontotext, displayed in Table 3, are beginning to come to the foreground (Georgiev et al. 2013; Gonzalez 2014).
Semantic authoring of content combines semantically structured pieces of information based on an ontology (Hasida 2007). In detail, as Khalili and Auer (2013) indicated, "semantic authoring is the tool-supported manual composition process of creating semantic documents, resulting either from the use of semantic knowledge representation formalisms (i.e., RDF, OWL, etc.) or from the use of non-semantic representation forms (i.e., text or hypertext), which are enriched with semantic representations during the authoring process". The digital agency 3pc GmbH provides a prime environment for semantic authoring that can semantically process collections of information in order to enable the efficient authoring of professional, visually appealing, and engaging content products. Over the years, multiple ontologies, as presented in Table 4, have been developed and used in the news domain for authoring purposes.
Semantic storytelling can be conceptualised as the automatic (or semi-automatic) generation of different storylines based on the information extracted, classified, and annotated within extensive textual datasets (Rishes et al. 2013). Semantic storytelling can also be defined as the identification of interesting story paths (Schneider et al. 2017a) or the recommendation of interesting nuggets of information based on a certain set of content using a concrete narrative style or voice. Semantic storytelling is a storytelling approach that bundles a flexible set of semantic services, like the semantic analysis of document collections, in order to point out stimulating correlations between the different entities mentioned in the collections (Schneider et al. 2017b; Rehm et al. 2017). Today, there are different semantic storytelling systems, as shown in Table 5. Their basic functions are, first, to take a large number of documents and extract the entities and the relations between them, as well as temporal information and relationships, and, second, to automatically produce a hypertext view of the document cluster that helps journalists familiarise themselves quickly and efficiently with the content of the processed documents (Rehm et al. 2019).
[Table fragment: Zweites Deutsches Fernsehen (ZDF), German public-service television broadcaster; Condat AG, https://en.condat.de/; rbb Fernsehen (RBB), German free-to-air television channel.]
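The two basic functions of a semantic storytelling system described above can be sketched, under strong simplifying assumptions, as follows. A small hypothetical gazetteer (`KNOWN_ENTITIES`) stands in for a real named-entity recogniser, relations are approximated by counting entity co-occurrence within a document, and temporal information is not handled. The sample documents and all names are invented for illustration and do not reproduce any of the systems in Table 5.

```python
import re
from collections import Counter
from html import escape
from itertools import combinations
from typing import Dict, List, Set

# Hypothetical gazetteer standing in for a real named-entity recogniser.
KNOWN_ENTITIES: Set[str] = {"Acme", "Globex", "Initech"}

# Invented sample documents, keyed by a document identifier.
DOCS: Dict[str, str] = {
    "doc1": "Acme and Globex announced a joint archive project.",
    "doc2": "Initech criticised the plan presented by Acme.",
    "doc3": "Globex confirmed that Acme will host the archive.",
}


def extract_entities(text: str, gazetteer: Set[str]) -> Set[str]:
    """Return the gazetteer entries that occur as whole words in the text."""
    tokens = set(re.findall(r"\w+", text))
    return {entity for entity in gazetteer if entity in tokens}


def cooccurrences(docs: Dict[str, str], gazetteer: Set[str]) -> Counter:
    """Count in how many documents each pair of entities appears together."""
    pairs: Counter = Counter()
    for text in docs.values():
        found = sorted(extract_entities(text, gazetteer))
        pairs.update(combinations(found, 2))
    return pairs


def hypertext_view(docs: Dict[str, str], gazetteer: Set[str]) -> str:
    """Render an HTML list that links each entity to the documents mentioning it."""
    index: Dict[str, List[str]] = {}
    for doc_id, text in docs.items():
        for entity in extract_entities(text, gazetteer):
            index.setdefault(entity, []).append(doc_id)
    items = []
    for entity in sorted(index):
        links = ", ".join(
            f'<a href="#{escape(doc_id)}">{escape(doc_id)}</a>'
            for doc_id in sorted(index[entity])
        )
        items.append(f"<li>{escape(entity)}: {links}</li>")
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

Calling `cooccurrences(DOCS, KNOWN_ENTITIES)` yields the entity pairs that a journalist might inspect first, while `hypertext_view(DOCS, KNOWN_ENTITIES)` produces a simple entity index over the document cluster. A production system would replace the gazetteer lookup with trained entity and relation extraction.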

Research Challenges and Elements of Discussion
Each of the aforementioned semantic-oriented news production and publication activities contributes to the construction of a sophisticated journalism reality. In this reality, such activities, along with their fundamental operational principles, become integral aspects of journalism. This is where the difference between SJ and today's journalism is located. SJ differs in the culture of web usage. The web as a service, along with its technologies and content, will not have a secondary role in practising journalism, as is happening today in several cases, but likely a primary role instead. As a result, in the future, more sophisticated web content management technologies will be used, providing more complete journalistic products in terms of knowledge acquisition, as well as automated investigative methods powered by new tools. Against this background, a few key limitations appear. The authors of this research focused on three interrelated topics that are open to debate. These are the evolutionary course of the web, the adoption of semantic technologies, and the establishment of SJ.
Specifically, the first and foremost issue that must be discussed concerns the integration of the SW and the usage options. As already explained, the SW effort provides smart solutions and alternatives to the flaws of the current web (Domingue et al. 2011). However, since its first conception in the 1960s, and after the endorsement of the term by Tim Berners-Lee and his team in 2001 (Berners-Lee et al. 2001), the SW vision remains incomplete. Admittedly, the fact that it has been under development for a long time (Bengtson 2015) raises many questions regarding the feasibility of its fulfillment (Marshall and Shipman 2003), censorship and privacy, the scalability and visualization of content, the availability of ontologies, and the multilinguality and stability of SW languages (Benjamins et al. 2002). The W3C Uncertainty Reasoning for the World Wide Web Incubator Group (URW3-XG) placed all the challenges that the SW has to face under the single heading of 'uncertainty' (Kenneth et al. 2008). The question is, thus, when will the SW develop and become mainstream?
Following this skeptical point of view, another issue arises. Many journalists, media scholars, and professionals question the design, development, and implementation prospects of semantic technologies and debate whether and how they will affect the current status quo in their respective fields (Pellegrini 2012; Halevy and McGregor 2012; Dörr 2016). As presented earlier, there are many recorded cases of semantic technology embodiment. Despite their number, researchers observed that such technologies have not been widely adopted by media organizations (Pomp et al. 2018).
From this perspective, the third issue comes to the foreground and poses a question regarding the evolution and establishment of SJ. The above-mentioned journalistic trends rely on contemporary technologies (software and hardware) that are mature enough to be used for both live and on-demand news reporting. By contrast, the build-up of SJ depends on techniques and technologies that are still developing, such as the semantic ones. Their immaturity raises doubts regarding the foundations of SJ and its formulation in today's media landscape. In addition, its dual purpose concerning the enhancement of current reporting methods and the revitalisation of journalism's credibility is undermined by several corporate-related factors, such as problems in incorporating new techniques (Paulussen et al. 2011), disregard for multiple initiatives in the field (i.e., apps and web platforms), investment stagnation in conducting educational workshops and utilizing specialized tools, and, most importantly, the low levels of information literacy (Association of College & Research Libraries, Education & Behavioral Sciences Section, Communication Studies Committee) among employees.

Concluding Remarks and Future Research
The researchers of this paper believe that prospective study of the SW philosophy can bring to the foreground adoptable, contemporary approaches regarding journalism. We attempted to introduce an enhanced version of today's journalism in which every activity throughout the reporting process acquires a semantic orientation. Following the rapid proliferation of interest around semantic technologies, it became clear that they have much to offer in response to media needs (Evain 2014). It was inevitable that they would be partially incorporated into journalistic practices and contribute to the rise of a new tendency, the so-called 'semantic journalism'.
As analysed above, the development and consolidation of the SW and its technologies is the key precondition for the evolution of SJ. Without this, the forms of journalistic practice that use semantic technologies will remain in the background and continue to be criticized as a science fiction scenario. The apparent superiority of semantic technologies does not seem sufficient to displace the current web technologies, which remain a cherished mainstay of journalism. Fortunately, there are a few indications in the opposite direction. The adoption of SW philosophy by some media pioneers can motivate other organizations and companies to act similarly and, together, launch the extension of the web. As a result of this extension, several long-established journalistic norms will be re-evaluated (i.e., the use of semantically structured information as raw material for reporting purposes), workflows and practices will be challenged as dysfunctional (i.e., information retrieval, interconnectivity between data sources, conclusion extraction, and the automation of processes) and research methodologies will be upgraded (i.e., the formulation and integration of semantic technologies for the exploitation of