Istituto di Scienza e Tecnologie dell'Informazione     
Baeza-Yates R., Castillo C., Junqueira F., Plachouras V., Silvestri F. Challenges on distributed Web retrieval. In: IEEE 23rd International Conference on Data Engineering. ICDE 2007 (Istanbul, Turkey, 15-20 April 2007). Proceedings, pp. 6 - 20. IEEE, 2007.
In the ocean of Web data, Web search engines are the primary way to access content. As the data is on the order of petabytes, current search engines are very large centralized systems based on replicated clusters. Web data, however, is always evolving. The number of Web sites continues to grow rapidly and there are currently more than 20 billion indexed pages. In the near future, centralized systems are likely to become ineffective against such a load, thus suggesting the need for fully distributed search engines. Such engines need to achieve the following goals: high quality answers, fast response time, high query throughput, and scalability. In this paper we survey and organize recent research results, outlining the main challenges of designing a distributed Web retrieval system.
URL: http://ieeexplore.ieee.org/iel5/4221634/4221635/04221649.pdf?isnumber=4221635&prod=CNF&arnumber=4221649&arSt=6&ared=20&arAuthor=Baeza-Yates%2C+R.%3B+Castillo%2C+C.%3B+Junqueira%2C+F.%3B+Plachouras%2C+V.%3B+Silvestri%2C+F.
DOI: 10.1109/ICDE.2007.367846
Subject: Web Search Engine Engineering
Distributed Search Engines
Challenges in Web Search Engines
H.3 Information Storage and Retrieval
H.3.5 Online Information Services. Web-based services
H.3.5 Online Information Services. Commercial services

For further information, contact: Librarian http://puma.isti.cnr.it
