PUMA
Istituto di Informatica e Telematica     
Geraci F., Cavaleri G. A Clustering-based Approach for the Identification of Parked Domains. Technical report /cnr.iit/2014-TR-001, 2014.
 
 
Abstract
(English)
Parked domains (PDs) are domains whose owners do not intend to use them as gateways for their own activities but keep them reserved for sale on the secondary market for web domains. To turn the cost of the annual registration fees into a revenue opportunity, parked domains most often host a large number of ads, in the hope that someone who lands on the site by chance clicks on some of them. Since parking has become a widespread activity, a large number of specialized companies have emerged and made parking a straightforward task that simply requires setting the domain's name servers appropriately. Although parking is a legal activity, it places a heavy burden on crawling systems and web mining tools. In fact, without filtering parked domains, crawlers could spend a non-negligible part of their time downloading fat web sites whose content can negatively affect the performance of analysis algorithms. In this paper, we address the problem of compiling the list of name servers used for domain parking so that parked domains can be discarded before the first connection, just after the first DNS query.
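The filtering step mentioned at the end of the abstract can be illustrated with a minimal sketch. It is not the report's clustering method: it only assumes that a list of parking name servers has already been compiled and shows how a crawler might consult it right after the DNS query. The blocklist entries, the function names, and the `lookup_ns` callable are hypothetical; a real crawler would plug in its own NS resolver. Treating a domain as parked when any of its name servers is listed is also an assumption of this sketch.

```python
from typing import Callable, Iterable, List, Set

# Hypothetical blocklist: placeholder host names, not real parking services.
PARKING_NAME_SERVERS: Set[str] = {"ns1.parking.example", "ns2.parking.example"}


def normalize(ns: str) -> str:
    """Lower-case a name server host name and drop the trailing root dot."""
    return ns.strip().rstrip(".").lower()


def is_parked(name_servers: Iterable[str],
              blocklist: Set[str] = PARKING_NAME_SERVERS) -> bool:
    """Return True if any of the domain's name servers is in the blocklist."""
    return any(normalize(ns) in blocklist for ns in name_servers)


def filter_crawl_queue(domains: Iterable[str],
                       lookup_ns: Callable[[str], List[str]],
                       blocklist: Set[str] = PARKING_NAME_SERVERS) -> List[str]:
    """Keep only the domains whose NS records are not in the blocklist.

    `lookup_ns` is a caller-supplied function that returns the NS host
    names for a domain (assumed here; it stands in for the DNS query).
    """
    return [d for d in domains if not is_parked(lookup_ns(d), blocklist)]


# Usage with a stubbed resolver, so no network access is needed.
fake_dns = {
    "shop.example": ["ns1.hosting.example."],
    "forsale.example": ["NS1.Parking.Example.", "ns2.parking.example."],
}
kept = filter_crawl_queue(fake_dns, lambda d: fake_dns[d])
```

With the stubbed data above, only `shop.example` stays in the queue, so the crawler never opens a connection to the parked domain.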
Subject Parked domains
Web spam
Information systems - World Wide Web - Online advertising





 


For further information, contact: Librarian http://puma.isti.cnr.it
