Open Access Article
Future Internet 2014, 6(4), 688-716

The ARCOMEM Architecture for Social- and Semantic-Driven Web Archiving

L3S Research Center, Leibniz Universität Hannover, Hannover 30167, Germany
NLP Group, Department of Computer Science, University of Sheffield, S1 4DP Sheffield, UK
ATHENA - Research and Innovation Center in Information, Communication and Knowledge Technologies, 15125 Maroussi, Athens, Greece
CNRS LTCIT, Institut Mines-Télécom, Télécom ParisTech, 75634 Paris Cedex 13, France
Internet Memory Foundation, 45 ter rue de la Révolution, 93100 Montreuil, France
Yahoo Research, 08018 Barcelona, Spain
Athens Technology Center (ATC), 15233 Halandri Athens, Greece
Author to whom correspondence should be addressed.
Received: 15 April 2014 / Revised: 25 September 2014 / Accepted: 8 October 2014 / Published: 4 November 2014
(This article belongs to the Special Issue Archiving Community Memories)


The constantly growing amount of Web content and the success of the Social Web lead to increasing needs for Web archiving. These needs go beyond the pure preservation of Web pages. Web archives are turning into “community memories” that aim at building a better understanding of the public view on, e.g., celebrities, court decisions and other events. Due to the size of the Web, the traditional “collect-all” strategy is in many cases not the best method to build Web archives. In this paper, we present the ARCOMEM (From Collect-All Archives to Community Memories) architecture and implementation that uses semantic information, such as entities, topics and events, complemented with information from the Social Web to guide a novel Web crawler. The resulting archives are automatically enriched with semantic meta-information to ease access and allow retrieval based on conditions that involve high-level concepts.
Keywords: web archiving; web crawler; architecture; text analysis; social Web

Figure 1

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

Share & Cite This Article

MDPI and ACS Style

Risse, T.; Demidova, E.; Dietze, S.; Peters, W.; Papailiou, N.; Doka, K.; Stavrakas, Y.; Plachouras, V.; Senellart, P.; Carpentier, F.; Mantrach, A.; Cautis, B.; Siehndel, P.; Spiliotopoulos, D. The ARCOMEM Architecture for Social- and Semantic-Driven Web Archiving. Future Internet 2014, 6, 688-716.

Future Internet EISSN 1999-5903 Published by MDPI AG, Basel, Switzerland