Abstract (PT):
Abstract (EN):
We propose a data mining model that captures user navigation behaviour patterns. User navigation sessions are modelled as a hypertext probabilistic grammar whose higher-probability strings correspond to the user's preferred trails, and we give an algorithm to mine such trails efficiently. We make use of the N-gram model, which assumes that the last N pages browsed affect the probability of the next page to be visited. The model is based on the theory of probabilistic grammars, which provides it with a sound theoretical foundation for future enhancements. Moreover, we propose the use of entropy as an estimator of the grammar's statistical properties. Extensive experiments were conducted, and the results show that the algorithm runs in linear time, that the grammar's entropy is a good estimator of the number of mined trails, and that the rules mined from real data confirm the effectiveness of the model. © Springer-Verlag Berlin Heidelberg 2000.
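The ideas in the abstract can be illustrated with a small sketch. This is not the authors' algorithm. It is a simplified reading that assumes sessions are lists of page identifiers, that artificial start and final states (here named `S` and `F`, both hypothetical) frame each session, that a "trail" is a complete start-to-final string whose probability is at least a chosen cut-point, and that entropy is taken as the frequency-weighted average of the per-state entropies. The function names and the `cut_point` and `max_len` parameters are likewise illustrative.

```python
from collections import defaultdict
from math import log2

START, END = "S", "F"  # hypothetical artificial start and final states


def count_transitions(sessions, n=1):
    """Count how often each page follows each history of up to n pages."""
    counts = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        padded = [START] + list(session) + [END]
        for i in range(1, len(padded)):
            history = tuple(padded[max(0, i - n):i])
            counts[history][padded[i]] += 1
    return counts


def to_probabilities(counts):
    """Normalise counts into P(next page | history)."""
    probs = {}
    for history, nxt in counts.items():
        total = sum(nxt.values())
        probs[history] = {page: c / total for page, c in nxt.items()}
    return probs


def mine_trails(probs, cut_point, n=1, max_len=20):
    """Depth-first search for complete trails with probability >= cut_point.

    max_len is a safety bound on trail length (an assumption of this sketch).
    """
    trails = []
    stack = [([START], 1.0)]
    while stack:
        trail, p = stack.pop()
        history = tuple(trail[-n:])
        for page, q in probs.get(history, {}).items():
            new_p = p * q
            if new_p < cut_point:
                continue  # prune: extending the trail can only lower p
            if page == END:
                trails.append((trail[1:], new_p))
            elif len(trail) < max_len:
                stack.append((trail + [page], new_p))
    return sorted(trails, key=lambda t: -t[1])


def grammar_entropy(counts):
    """Frequency-weighted average of per-state entropies, in bits."""
    grand_total = sum(sum(nxt.values()) for nxt in counts.values())
    h = 0.0
    for nxt in counts.values():
        total = sum(nxt.values())
        state_h = -sum((c / total) * log2(c / total) for c in nxt.values())
        h += (total / grand_total) * state_h
    return h


if __name__ == "__main__":
    sessions = [["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"], ["E"]]
    counts = count_transitions(sessions, n=1)
    probs = to_probabilities(counts)
    for trail, p in mine_trails(probs, cut_point=0.2, n=1):
        print(trail, round(p, 3))
    print("entropy (bits):", round(grammar_entropy(counts), 3))
```

Raising `cut_point` prunes more of the search and returns fewer trails, while raising `n` conditions each step on a longer browsing history at the cost of more states. A lower entropy means the transition probabilities are more concentrated, which is the intuition behind using it to anticipate how many trails survive a given cut-point.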
Language:
English
Type (Faculty Evaluation):
Scientific
Number of pages:
20