Istituto di Scienza e Tecnologie dell'Informazione     
Cassarà P., Colucci M., Gotta A., Tonellotto N. Joint modeling of arrival process and length distribution of queries in Web search engines. Technical report, 2016.
This paper proposes a novel fitting procedure, based on non-parametric kernel-based models, for the probability mass function of a discrete arrival process derived from real traffic traces of queries to a Web search engine. Most commonly adopted estimation techniques for probability mass functions rely on estimating the parameters of a given family of probability distribution functions. In contrast, the proposed procedure, combined with a kernel-based model of the probability distribution function, requires no assumptions about membership in a family of distributions or about its parameters. The fitting procedure, based on the Generalized Cross-Entropy, solves a quadratic programming problem. Furthermore, the estimated probability mass function can be expressed in closed form as a weighted sum of kernel functions. We also examine the performance of the proposed procedure through numerical experiments and present an example of traffic analysis with real traffic data. Results show that our estimate of the probability mass function closely matches the empirical probability mass function. In particular, the procedure approximates well both temporal and statistical characteristics, such as autocorrelation, long-range dependence, and skewness.
Subject: Web Search Engine
Batch Arrival Process
Kernel-Based Probability Distribution Models
Generalized Cross Entropy
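
The abstract describes the estimate as a weighted sum of kernel functions whose weights come from a quadratic program. The following is a minimal sketch of that general idea, not the report's own procedure: it assumes discretized Gaussian kernels, replaces the Generalized Cross-Entropy objective with a plain least-squares fit to the empirical probability mass function, and solves the resulting simplex-constrained quadratic program by projected gradient descent. The function names, the bandwidth `h`, and the synthetic Poisson counts standing in for real query traces are all illustrative assumptions.

```python
import numpy as np

def kernel_matrix(support, centers, h):
    """Column j is a Gaussian kernel centered at centers[j], evaluated on the
    integer support and normalized so that it sums to 1 (assumed kernel choice)."""
    d = support[:, None] - centers[None, :]
    K = np.exp(-0.5 * (d / h) ** 2)
    return K / K.sum(axis=0, keepdims=True)

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def fit_weights(K, p_emp, iters=5000):
    """Minimize 0.5 * ||K w - p_emp||^2 over the simplex (a quadratic program)
    with projected gradient descent; a stand-in for the report's GCE objective."""
    w = np.full(K.shape[1], 1.0 / K.shape[1])
    step = 1.0 / np.linalg.norm(K, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(iters):
        w = project_to_simplex(w - step * (K.T @ (K @ w - p_emp)))
    return w

# Synthetic per-interval arrival counts (hypothetical stand-in for real traces).
rng = np.random.default_rng(0)
counts = rng.poisson(4.0, size=2000)

support = np.arange(counts.max() + 1)
p_emp = np.bincount(counts, minlength=len(support)) / len(counts)

K = kernel_matrix(support, centers=support.astype(float), h=1.0)
w0 = np.full(K.shape[1], 1.0 / K.shape[1])   # uniform starting weights
w = fit_weights(K, p_emp)
p_hat = K @ w                                # closed form: weighted sum of kernels
```

Because every kernel column sums to 1 and the weights lie on the simplex, `p_hat` is itself a valid probability mass function, which is what makes the closed-form weighted sum convenient to work with.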



For further information, contact: Librarian http://puma.isti.cnr.it
