Istituto di Scienza e Tecnologie dell'Informazione     
Murdock V., Donato D., Gionis A., Silvestri F. System and method for identifying spam hosts using stacked graphical learning. Patent n. AG06F1516FI. Registered in New York in 2007.
Systems and methods for identifying spam hosts are disclosed in which hosts known to the system are initially classified as spam or non-spam by a baseline classifier. Then, for each node u in the host graph, a new feature is computed as an aggregate function of the baseline classifier's initial classifications of the neighbors of u. The set of neighbors can be defined in many different ways: in-link neighbors, out-link neighbors, bi-directional neighbors, k-hop neighbors, etc. The new feature is then added to the existing set of features, and the baseline classifier is trained again, producing new predictions for each node. The results may then be used in many different ways, including filtering search results based on host classifications so that spam hosts are either not displayed or are displayed last in a result set.
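The procedure described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the baseline classifier here is a tiny logistic regression chosen only for self-containment, the aggregate function is assumed to be the mean of the neighbors' baseline scores, every host is assumed to be labeled, and all names (`train_logistic`, `predict_proba`, `neighbor_average`, `stacked_round`) and the toy data are hypothetical.

```python
from math import exp


def predict_proba(w, x):
    """Spam probability for feature vector x; the bias is the last entry of w."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + w[-1]
    return 1.0 / (1.0 + exp(-z))


def train_logistic(X, y, epochs=500, lr=0.5):
    """Stand-in baseline classifier: logistic regression by batch gradient descent."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def neighbor_average(graph, scores, default=0.5):
    """Assumed aggregate: mean baseline score over each node's neighbors.

    Nodes without neighbors receive `default` (an arbitrary neutral value).
    """
    return {
        u: (sum(scores[v] for v in nbrs) / len(nbrs) if nbrs else default)
        for u, nbrs in graph.items()
    }


def stacked_round(features, labels, graph):
    """One round: baseline predictions, neighbor feature, retraining."""
    hosts = sorted(features)
    y = [labels[h] for h in hosts]
    # Step 1: initial classification by the baseline classifier.
    w = train_logistic([features[h] for h in hosts], y)
    baseline = {h: predict_proba(w, features[h]) for h in hosts}
    # Step 2: new feature aggregated from the neighbors' initial scores.
    agg = neighbor_average(graph, baseline)
    # Step 3: append the new feature and train the classifier again.
    augmented = {h: features[h] + [agg[h]] for h in hosts}
    w2 = train_logistic([augmented[h] for h in hosts], y)
    final = {h: predict_proba(w2, augmented[h]) for h in hosts}
    return final, augmented


# Toy host graph (hypothetical): out-link neighbors, 1 = spam, 0 = non-spam.
features = {"a": [0.9], "b": [0.8], "c": [0.1], "d": [0.2]}
labels = {"a": 1, "b": 1, "c": 0, "d": 0}
graph = {"a": ["b"], "b": ["a"], "c": ["d"], "d": ["c"]}

final, augmented = stacked_round(features, labels, graph)
```

Swapping the contents of `graph` (in-link, bi-directional, or k-hop neighbor lists) changes the neighborhood definition without touching the rest of the sketch; `final` could then be thresholded to filter or demote hosts in a result set.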
URL: http://www.faqs.org/patents/app/20090089373
Subject: Web spam detection
H.4.m Information Systems Applications. Miscellaneous



For further information, contact: Librarian http://puma.isti.cnr.it
