Workshop on "Statistical Approaches to Web Mining" SAWM'04
20 September 2004, Pisa, Italy
in conjunction with ECML/PKDD 2004: The 15th
European Conference on Machine Learning (ECML) and
The 8th European Conference
on Principles and Practice of Knowledge Discovery
in Databases (PKDD), 20-24 September 2004, Pisa, Italy
- Marco Gori Dipartimento di Ingegneria dell'Informazione, Siena, Italy
- Michelangelo Ceci (co-chair) Department of Informatics, University of Bari, Bari, Italy
- Mirco Nanni (co-chair) KDDLab, ISTI-CNR, Pisa, Italy
The explosive growth and popularity of the World-Wide Web have resulted
in a huge number of information sources on the Internet and the promise of unprecedented information-gathering
capabilities, which can be focused on:
- Extraction of knowledge from the Web: the Web is a huge collection of documents, and
sophisticated knowledge extraction methods are required to effectively access the information
they contain. Such methods include machine learning and data mining techniques for information
categorization, extraction, and search, as well as for adapting to the interests of the users.
- Extraction of knowledge from the user's behaviour: the Web is a venue for doing business
electronically, as well as for interaction, information acquisition and service exploitation
by public authorities, non-governmental organizations, communities of interest and private
persons. When observed as a venue for the achievement of business goals, the Web presence should
be aligned with the objectives of its owner and the requirements of its users. This raises the
demand for understanding Web usage, combining it with other sources of knowledge inside an
organization, and deriving lines of action.
Unfortunately, the morass of sources presents a formidable hurdle to extracting information
from them effectively. In recent years a growing number of machine learning and data mining methods have been applied
to this problem. In many cases, the theoretical glue binding them together is manifestly that of statistics.
Some examples of well-known statistical learning methods applied to web mining problems are Support Vector
Machines (SVM), Bayesian classifiers, neural networks, as well as unsupervised learning methods (clustering,
principal component analysis, and so on). Statistical data mining methods are also used for data pre-processing,
transformation and result visualization.
The purpose of this workshop is to bring together researchers with backgrounds in machine learning,
data mining, statistics and pattern recognition who are interested in facing different problems of
information-gathering on the Web. The workshop is the third follow-up event within the Web Mining
Forum supported by the Network of Excellence on Knowledge Discovery (KDNet, IST project No. 2001-33086).
The workshop will maintain a balance between theoretical issues and
descriptions of case studies to promote synergy between theory and
practice. It aims to be a highly communicative meeting place for
researchers working on similar topics, but coming from different
communities. In order to achieve these goals, the workshop will
consist of one or two invited talks, followed by short presentations
and longer discussions.
Each author will be encouraged to read another accepted paper and
to comment on it after the original talk has been given. Authors
should make certain that the techniques they describe deal with the
issues that are associated with the workshop.
All ECML/PKDD'04 SAWM workshop participants must also register for
the main ECML/PKDD conference. Workshop attendance will be limited to
The SAWM'04 workshop will be held in conjunction with the following tutorial:
The following invited talk will be presented within the workshop:
- "Computational and Statistical methods for web usage
mining: a critical comparison", by Paolo Giudici.
June 21, 2004
Notification of acceptance:
July 12, 2004
Workshop paper camera-ready deadline:
July 19, 2004
Workshop proceedings (camera- and web-ready):
July 26, 2004
Workshop: September 20, 2004