Open Access Article
Future Internet 2014, 6(4), 688-716; doi:10.3390/fi6040688

The ARCOMEM Architecture for Social- and Semantic-Driven Web Archiving

1 L3S Research Center, Leibniz Universität Hannover, Hannover 30167, Germany
2 NLP Group, Department of Computer Science, University of Sheffield, S1 4DP Sheffield, UK
3 ATHENA - Research and Innovation Center in Information, Communication and Knowledge Technologies, 15125 Maroussi, Athens, Greece
4 CNRS LTCIT, Institut Mines-Télécom, Télécom ParisTech, 75634 Paris Cedex 13, France
5 Internet Memory Foundation, 45 ter rue de la Révolution, 93100 Montreuil, France
6 Yahoo Research, 08018 Barcelona, Spain
7 Athens Technology Center (ATC), 15233 Halandri Athens, Greece
* Author to whom correspondence should be addressed.
Received: 15 April 2014 / Revised: 25 September 2014 / Accepted: 8 October 2014 / Published: 4 November 2014
(This article belongs to the Special Issue Archiving Community Memories)

Abstract

The constantly growing amount of Web content and the success of the Social Web lead to increasing needs for Web archiving. These needs go beyond the pure preservation of Web pages. Web archives are turning into “community memories” that aim at building a better understanding of the public view on, e.g., celebrities, court decisions and other events. Due to the size of the Web, the traditional “collect-all” strategy is in many cases not the best method to build Web archives. In this paper, we present the ARCOMEM (From Collect-All Archives to Community Memories) architecture and implementation, which uses semantic information, such as entities, topics and events, complemented with information from the Social Web, to guide a novel Web crawler. The resulting archives are automatically enriched with semantic meta-information to ease access and allow retrieval based on conditions that involve high-level concepts.
Keywords: web archiving; web crawler; architecture; text analysis; social Web

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

Risse, T.; Demidova, E.; Dietze, S.; Peters, W.; Papailiou, N.; Doka, K.; Stavrakas, Y.; Plachouras, V.; Senellart, P.; Carpentier, F.; Mantrach, A.; Cautis, B.; Siehndel, P.; Spiliotopoulos, D. The ARCOMEM Architecture for Social- and Semantic-Driven Web Archiving. Future Internet 2014, 6, 688-716.

Future Internet, EISSN 1999-5903. Published by MDPI AG, Basel, Switzerland.