Special Issue "Archiving Community Memories"

A special issue of Future Internet (ISSN 1999-5903).

Deadline for manuscript submissions: closed (15 April 2014)

Special Issue Editors

Guest Editor
Dr. Thomas Risse

L3S Research Center, Leibniz University Hannover, Appelstrasse 9a, 30167 Hannover, Germany
Fax: +49 (0) 511 762 17779
Interests: semantic evolution; service computing; data management in distributed systems; federated search; self-organizing systems
Guest Editor
Dr. Wim Peters

Department of Computer Science, University of Sheffield, Regent Court, Sheffield S1 4DP, UK
Fax: +44 (0) 114 2221810
Interests: information extraction; knowledge management; web archiving; language technology; semantic resource creation and analysis

Special Issue Information

Dear Colleagues,

Given the ever-increasing importance of the World Wide Web as a source of information, adequate Web archiving and preservation has become a cultural necessity for preserving knowledge. This is especially the case for non-traditional digital publications, e.g., blogs, micro-blogs and social networks. Given the deluge of digital information and the rapid pace of change on the Web, a necessary first step is to respond quickly by creating archives in a timely manner and with minimum overhead, so that more costly preservation actions can follow further down the line and an irreparable loss of knowledge is avoided.

In addition to the “common” challenges of digital preservation, Web preservation has to deal with the sheer size, ever-increasing growth and change rate of Web data. Hence, the selection of content sources becomes a crucial and challenging task for archival organizations. Instead of following a “collect-all” strategy, archival organizations are trying to build community memories that reflect the diversity of information people are interested in.

Besides the creation of Web archives, their usage in applications plays an increasingly important role. Allowing easy access to information based on different facets and across time is just one aspect. The possibility to look into the past and to understand how things evolve opens the space for new application scenarios and analysis approaches.

This special issue of the Future Internet journal contains selected, extended papers presented at the 1st International Workshop on Archiving Community Memories (ARCOMEM 2013, http://www.arcomem.eu/ipres-2013), held in conjunction with the 10th International Conference on Preservation of Digital Objects, 2-6 September 2013, Lisbon, Portugal. However, the special issue is not limited to workshop papers but is open to any submission related to the topic.

Dr. Thomas Risse
Dr. Wim Peters
Guest Editors

Submission

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, click here to go to the submission form. Manuscripts can be submitted until the deadline. Papers will be published continuously (as soon as accepted) and will be listed together on the special issue website. Research articles, review articles as well as communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are refereed through a peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Future Internet is an international peer-reviewed Open Access quarterly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 500 CHF (Swiss Francs). English correction and/or formatting fees of 250 CHF (Swiss Francs) will be charged in certain cases for those articles accepted for publication that require extensive additional formatting and/or English corrections.


Keywords

  • Web and Social Web Harvesting
  • Focused & Topical Crawling
  • Deep Web Capture
  • Social Web Analysis
  • Information Extraction
  • Video and Image Analysis
  • Appraisal and selection of content
  • Applications & Use Cases
  • Semantic Web Technologies
  • Temporal Analytics

Published Papers (6 papers)


Research

Open Access Article: The ARCOMEM Architecture for Social- and Semantic-Driven Web Archiving
Future Internet 2014, 6(4), 688-716; doi:10.3390/fi6040688
Received: 15 April 2014 / Revised: 25 September 2014 / Accepted: 8 October 2014 / Published: 4 November 2014
Abstract
The constantly growing amount of Web content and the success of the Social Web lead to increasing needs for Web archiving. These needs go beyond the pure preservation of Web pages. Web archives are turning into “community memories” that aim at building a better understanding of the public view on, e.g., celebrities, court decisions and other events. Due to the size of the Web, the traditional “collect-all” strategy is in many cases not the best method to build Web archives. In this paper, we present the ARCOMEM (From Collect-All Archives to Community Memories) architecture and implementation that uses semantic information, such as entities, topics and events, complemented with information from the Social Web to guide a novel Web crawler. The resulting archives are automatically enriched with semantic meta-information to ease the access and allow retrieval based on conditions that involve high-level concepts.
(This article belongs to the Special Issue Archiving Community Memories)
Open Access Article: ARCOMEM Crawling Architecture
Future Internet 2014, 6(3), 518-541; doi:10.3390/fi6030518
Received: 15 April 2014 / Revised: 11 July 2014 / Accepted: 14 July 2014 / Published: 19 August 2014
Cited by 2
Abstract
The World Wide Web is the largest information repository available today. However, this information is very volatile and Web archiving is essential to preserve it for the future. Existing approaches to Web archiving are based on simple definitions of the scope of Web pages to crawl and are limited to basic interactions with Web servers. The aim of the ARCOMEM project is to overcome these limitations and to provide flexible, adaptive and intelligent content acquisition, relying on social media to create topical Web archives. In this article, we focus on ARCOMEM’s crawling architecture. We introduce the overall architecture and we describe its modules, such as the online analysis module, which computes a priority for the Web pages to be crawled, and the Application-Aware Helper, which takes into account the type of Web sites and applications to extract structure from crawled content. We also describe a large-scale distributed crawler that has been developed, as well as the modifications we have implemented to adapt Heritrix, an open source crawler, to the needs of the project. Our experimental results from real crawls show that ARCOMEM’s crawling architecture is effective in acquiring focused information about a topic and leveraging the information from social media.
Open Access Article: Should I Care about Your Opinion? Detection of Opinion Interestingness and Dynamics in Social Media
Future Internet 2014, 6(3), 457-481; doi:10.3390/fi6030457
Received: 18 April 2014 / Revised: 19 June 2014 / Accepted: 11 July 2014 / Published: 13 August 2014
Cited by 1
Abstract
In this paper, we describe a set of reusable text processing components for extracting opinionated information from social media, rating it for interestingness, and for detecting opinion events. We have developed applications in GATE to extract named entities, terms and events and to detect opinions about them, which are then used as the starting point for opinion event detection. The opinions are then aggregated over larger sections of text, to give some overall sentiment about topics and documents, and also some degree of information about interestingness based on opinion diversity. We go beyond traditional opinion mining techniques in a number of ways: by focusing on specific opinion-target extraction related to key terms and events, by examining and dealing with a number of specific linguistic phenomena, by analysing and visualising opinion dynamics over time, and by aggregating the opinions in different ways for a more flexible view of the information contained in the documents.
Open Access Article: Analysing and Enriching Focused Semantic Web Archives for Parliament Applications
Future Internet 2014, 6(3), 433-456; doi:10.3390/fi6030433
Received: 16 April 2014 / Revised: 19 June 2014 / Accepted: 11 July 2014 / Published: 30 July 2014
Cited by 1
Abstract
The web and the social web play an increasingly important role as an information source for Members of Parliament and their assistants, journalists, political analysts and researchers. It provides important and crucial background information, like reactions to political events and comments made by the general public. The case study presented in this paper is driven by two European parliaments (the Greek and the Austrian parliament) and targets an effective exploration of political web archives. In this paper, we describe semantic technologies deployed to ease the exploration of the archived web and social web content and present evaluation results.
Open Access Article: The Use of Personal Value Estimations to Select Images for Preservation in Public Library Digital Community Collections
Future Internet 2014, 6(2), 359-377; doi:10.3390/fi6020359
Received: 7 February 2014 / Revised: 23 April 2014 / Accepted: 7 May 2014 / Published: 27 May 2014
Cited by 1
Abstract
A considerable amount of information, particularly in image form, is shared on the web through social networking sites. If any of this content is worthy of preservation, who decides what is to be preserved and based on what criteria? This paper explores the potential for public libraries to assume this role of community digital repositories through the creation of digital collections. Thirty public library users and thirty librarians were solicited from the Indianapolis metropolitan area to evaluate five images selected from Flickr in terms of their value to public library digital collections and their worthiness of long-term preservation. Using a seven-point Likert scale, participants assigned a value to each image in terms of its importance to self, family and society. Participants were then asked to explain the reasoning behind their valuations. Public library users and librarians had similar value estimations of the images in the study. This is perhaps the most significant finding of the study, given the importance of collaboration and forming partnerships for building and sustaining community collections and archives.
Open Access Article: Exploiting Multimedia in Creating and Analysing Multimedia Web Archives
Future Internet 2014, 6(2), 242-260; doi:10.3390/fi6020242
Received: 25 February 2014 / Revised: 31 March 2014 / Accepted: 16 April 2014 / Published: 24 April 2014
Abstract
The data contained on the web and the social web are inherently multimedia and consist of a mixture of textual, visual and audio modalities. Community memories embodied on the web and social web contain a rich mixture of data from these modalities. In many ways, the web is the greatest resource ever created by humankind. However, due to the dynamic and distributed nature of the web, its content changes, appears and disappears on a daily basis. Web archiving provides a way of capturing snapshots of (parts of) the web for preservation and future analysis. This paper provides an overview of techniques we have developed within the context of the EU-funded ARCOMEM (ARchiving COmmunity MEMories) project to allow multimedia web content to be leveraged during the archival process and for post-archival analysis. Through a set of use cases, we explore several practical applications of multimedia analytics within the realm of web archiving, web archive analysis and multimedia data on the web in general.

Journal Contact

MDPI AG
Future Internet Editorial Office
St. Alban-Anlage 66, 4052 Basel, Switzerland
futureinternet@mdpi.com
Tel. +41 61 683 77 34
Fax: +41 61 302 89 18