Instant and Interactive
The goal of the Instant and Interactive Data Mining workshop (IID) is to address the development of data mining techniques that allow users to interactively explore their data, receiving near-instant updates to every requested refinement. While instant mining and stream mining start from different perspectives and operate under different constraints, there is a significant overlap in techniques, and developments in either setting can have a significant impact on the other. Therefore, this workshop aims to bring together researchers interested in instant and adaptive data mining methods, whether for use in interactive systems or in the processing of large streams of evolving data.
Workshop Program
IID is a full-day workshop on September 24th, organized in conjunction with ECML PKDD 2012. The workshop will be held in room 3.30 of the Wills Memorial Building, Park Street, Bristol.
The program for IID is:
9:00 | Workshop Opening |
9:05 | Keynote Presentation 'Real-World Interactive Machine Learning of Customer Support Logs at Hewlett-Packard' by George Forman |
10:05 | 'A Case of Visual and Interactive Data Analysis: Geospatial Redescription Mining' by Esther Galbrun & Pauli Miettinen |
10:30 | Coffee Break |
11:00 | 'Online Estimation of Discrete Densities using Classifier Chains' by Michael Geilke & Eibe Frank & Stefan Kramer |
11:20 | 'iST-MRF: Interactive Spatio-Temporal Probabilistic Models for Sensor Networks' by Nico Piatkowski |
11:40 | 'From Block-based Ensembles to Online Learners In Changing Data Streams: If- and How-To' by Dariusz Brzezinski & Jerzy Stefanowski |
12:00 | Lunch (on your own) |
13:30 | Keynote Presentation 'Real Data Mining for Real Users: Instant, Interactive – A Dream?' by Michael Berthold |
14:30 | 'Towards Exploratory Search of Scientific Information' by Dorota Glowacka & Ksenia Konyushkova & Tuukka Ruotsalo & Samuel Kaski |
14:55 | 'Towards Real-Time Machine Learning' by Andreas Hapfelmeier & Christian Mertes & Jana Schmidt & Stefan Kramer |
15:20 | 'Instant Selection of High Contrast Projections in Multi-dimensional Data Streams' by Andrei Vanea & Emmanuel Müller & Fabian Keller & Klemens Böhm |
15:45 | Discussion & Closing |
16:00 | Coffee Break |
16:30 | Conference Opening |
Invited Speakers
We are proud to have the following invited speakers:
- Michael Berthold (Konstanz University)
- George Forman (HP Labs)
Michael Berthold will give a keynote with the title: 'Real Data Mining for Real Users: Instant, Interactive - A Dream?'. He holds the Nycomed-Chair for Bioinformatics and Information Mining at Konstanz University in Germany, where his research focuses on using data mining methods for the interactive analysis of large information repositories in the Life Sciences. Most of the research results are made available to the public via the open source data mining platform KNIME.
Important Dates
Submission Deadline | 29th of June 2012, 23:59 PST |
---|---|
Notification to Authors | 20th of July 2012, 23:59 PST |
Camera-ready Deadline | 3rd of August 2012, 23:59 PST |
Workshop day | 24th of September 2012 |
Organizers
- Jilles Vreeken (Universiteit Antwerpen)
- Nikolaj Tatti (Universiteit Antwerpen)
- Bart Goethals (Universiteit Antwerpen)
- Anton Dries (Katholieke Universiteit Leuven)
- Matthijs van Leeuwen (Katholieke Universiteit Leuven)
- Siegfried Nijssen (Katholieke Universiteit Leuven)
iid2012 (at) easychair.org
Program Committee
- Bettina Berendt, KU Leuven
- Michael Berthold, University of Konstanz
- Albert Bifet, University of Waikato
- Mario Boley, University of Bonn and Fraunhofer IAIS
- Polo Chau, Georgia Tech
- Tijl De Bie, University of Bristol
- Jaakko Hollmén, Aalto University
- Florian Mansmann, University of Konstanz
- Naren Ramakrishnan, Virginia Tech
- Thomas Seidl, RWTH Aachen University
- Geoff Webb, Monash University
- Indrė Žliobaitė, Bournemouth University
What's IID?
Today, we lack the technology to perform 'free-style' exploratory analysis on large amounts of data, allowing users to make discoveries by following their intuition. Standard data mining aims at finding highly interesting results, but this typically results in techniques that are computationally extremely demanding and therefore time-consuming. Consequently, these techniques are hardly useful for the interactive exploration of large databases. To tackle this problem, we propose instant, interactive and adaptive data mining as a new data mining paradigm.
By instant, we mean that good results should be presented to the user within a few seconds. Short waiting times are essential to keep the user's attention. By interactive, we mean that the user should be able to give feedback on the fly, allowing the user to influence the analysis. Identifying intermediate results as (un)interesting allows the algorithm to focus on specific parts of the database and certain types of results. By adaptive, we mean that the system should learn from previous interactions with the user. These elements are clearly intertwined: without near-instant results, there can be no true interactivity, and without techniques that can take feedback into account, it is unrealistic to expect good results fast. Being instant, interactive and adaptive are therefore key requirements for next-generation data mining. This will require a shift of focus compared to contemporary data mining work, in that instant algorithms will no longer be complete or optimal, while interactive algorithms will provide users an easy means to influence their calculations.
IID and Stream Mining
Several key challenges of instant and adaptive data mining are shared with the field of stream mining, which starts from the premise that the flow of data is continuous and never-ending. The key limitation in stream mining is the lack of arbitrary access to the data; data is provided in a given order and at a given rate, and only limited amounts of data can be buffered for processing at a later time. This brings along the need for on-line and any-time algorithms, that is, algorithms that are capable of processing the data as it becomes available and that can quickly provide partial results based on the data seen so far, without accessing the complete data. These algorithms should also be capable of adapting to changing circumstances such as gradual drifts of distributions or sudden shifts of the concepts underlying the data. Hence, adaptability and instantaneousness are key for developing successful stream mining algorithms as well.
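As a rough illustration of the on-line, any-time and adaptive properties described above, the following minimal Python sketch maintains a running mean over a stream in constant memory. It is not taken from any workshop contribution: the class name `OnlineMean`, the `decay` forgetting factor and the synthetic stream in `demo` are all assumptions made for this example. Each observation is processed once and then discarded, the current estimate can be read at any time, and a decay below 1.0 lets the estimate follow a sudden shift in the data.

```python
from dataclasses import dataclass


@dataclass
class OnlineMean:
    """Any-time estimate of a stream's mean using O(1) memory.

    `decay` is a hypothetical forgetting factor: 1.0 weights all
    observations equally, smaller values favour recent data.
    """
    decay: float = 1.0
    weight: float = 0.0   # decayed count of observations seen so far
    mean: float = 0.0     # current estimate, readable at any time

    def update(self, x: float) -> None:
        # Down-weight the past, then fold in the new observation.
        self.weight = self.decay * self.weight + 1.0
        self.mean += (x - self.mean) / self.weight


def demo():
    # Synthetic stream (an assumption): the value shifts from 1.0 to 5.0.
    stream = [1.0] * 50 + [5.0] * 50
    plain = OnlineMean()               # no forgetting
    forgetful = OnlineMean(decay=0.9)  # adapts to the shift
    for x in stream:
        plain.update(x)
        forgetful.update(x)
    return plain.mean, forgetful.mean
```

Calling `demo()` returns both estimates after the shift: the estimator without forgetting averages over the whole stream, while the one with `decay=0.9` ends up close to the new level. The decay value is a design choice that trades stability against responsiveness to drift.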
Even though instant mining and stream mining start from different perspectives and operate under different constraints, we strongly believe there is a significant overlap in techniques and that developments in either setting can have a significant impact on the other. Therefore, the goal of this workshop is to bring together researchers with a shared interest in instant and adaptive data mining methods, whether for use in interactive systems or in the processing of large streams of evolving data.