We present Filtron, a prototype anti-spam filter that integrates the main empirical conclusions of our comprehensive analysis on using machine learning to construct effective personalized anti-spam filters. Filtron is based on the experimental results over several design parameters on four publicly available benchmark corpora. After describing Filtron's architecture, we assess its behavior in real use over a period of seven months. The results are deemed satisfactory, though they can be improved with more elaborate preprocessing and regular re-training.
The World Wide Web has become a major source of information that can be turned into valuable knowledge for individuals and organisations. In the work presented here, we are concerned with the extraction of meta-knowledge from the Web, in particular knowledge about Web usage, which is invaluable to the construction of Web sites that meet their purposes and prevent disorientation. Towards this goal, we propose the organisation of the users of a Web site into groups with common navigational behaviour (user communities). We view the task of building user communities as a data mining task, that is, a search for interesting patterns within a database. The database used in our experiments consists of access logs collected from the Web site of the Advanced Course on Artificial Intelligence 1999. The unsupervised machine learning algorithm COBWEB is used to organise the users of the site who follow similar paths into a small set of communities. Particular attention is paid to the interpretation of the communities generated through this process. For this purpose, we use a simple metric to identify the representative navigational behaviour of each community. This information can then be used by the administrators of the site to re-organise it in a way that is tailored to the needs of each community. The proposed Web usage analysis is much more insightful than the common approach of examining simple usage statistics of the Web site.
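The interpretation step described in the abstract can be illustrated with a minimal sketch. It does not implement COBWEB; community membership is taken as given. The abstract does not define its "simple metric", so the one used here (how much more often a page is visited inside a community than across all users) is an assumption made for illustration only. The function names, the `threshold` parameter, and the `(user, page)` log format are likewise hypothetical.

```python
from collections import defaultdict


def page_sets(log_entries):
    """Group (user, page) log entries into the set of pages each user visited."""
    visited = defaultdict(set)
    for user, page in log_entries:
        visited[user].add(page)
    return dict(visited)


def representative_pages(visited, communities, threshold=0.3):
    """For each community, list the pages whose visit frequency inside the
    community exceeds the frequency over all users by more than `threshold`.

    The metric is an assumed stand-in, not the one from the paper.
    """
    n_users = len(visited)

    # Number of users (over the whole site) who visited each page.
    overall = defaultdict(int)
    for pages in visited.values():
        for page in pages:
            overall[page] += 1

    result = {}
    for name, members in communities.items():
        # Number of community members who visited each page.
        inside = defaultdict(int)
        for user in members:
            for page in visited[user]:
                inside[page] += 1
        result[name] = sorted(
            page
            for page in inside
            if inside[page] / len(members) - overall[page] / n_users > threshold
        )
    return result
```

Under this assumed metric, a page visited by nearly everyone characterises no community in particular, whereas a page visited mostly by one group does. That is the kind of distinction plain usage statistics cannot make.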