Instant and Interactive

The goal of the Instant and Interactive Data Mining workshop (IID) is to address the development of data mining techniques that allow users to interactively explore their data, receiving near-instant updates to every requested refinement. While instant mining and stream mining start from different perspectives and operate under different constraints, there is a significant overlap in techniques, and developments in either setting can have a significant impact on the other. Therefore, this workshop aims to bring together researchers interested in instant and adaptive data mining methods, whether for use in interactive systems or in the processing of large streams of evolving data.

Workshop Program

IID is a full-day workshop on September 24th, organized in conjunction with ECML PKDD 2012. The workshop will be located in room 3.30 of the Wills Memorial Building, Park Street, Bristol.

The program for IID is:


9:00 Workshop Opening
9:05 Keynote Presentation
'Real-World Interactive Machine Learning of Customer Support Logs at Hewlett-Packard'
by George Forman
10:05 'A Case of Visual and Interactive Data Analysis: Geospatial Redescription Mining'
by Esther Galbrun & Pauli Miettinen

10:30 Coffee Break

11:00 'Online Estimation of Discrete Densities using Classifier Chains'
by Michael Geilke & Eibe Frank & Stefan Kramer
11:20 'iST-MRF: Interactive Spatio-Temporal Probabilistic Models for Sensor Networks'
by Nico Piatkowski
11:40 'From Block-based Ensembles to Online Learners In Changing Data Streams: If- and How-To'
by Dariusz Brzezinski & Jerzy Stefanowski

12:00 Lunch (on your own)

13:30 Keynote Presentation
'Real Data Mining for Real Users: Instant, Interactive – A Dream?'
by Michael Berthold
14:30 'Towards Exploratory Search of Scientific Information'
by Dorota Glowacka & Ksenia Konyushkova & Tuukka Ruotsalo & Samuel Kaski
14:55 'Towards Real-Time Machine Learning'
by Andreas Hapfelmeier & Christian Mertes & Jana Schmidt & Stefan Kramer
15:20 'Instant Selection of High Contrast Projections in Multi-dimensional Data Streams'
by Andrei Vanea & Emmanuel Müller & Fabian Keller & Klemens Böhm
15:45 Discussion & Closing

16:00 Coffee Break
16:30 Conference Opening

Invited Speakers

We are proud to have Michael Berthold and George Forman as the keynote speakers at our workshop.

Michael Berthold will give a keynote with the title: 'Real Data Mining for Real Users: Instant, Interactive - A Dream?'. He holds the Nycomed-Chair for Bioinformatics and Information Mining at Konstanz University in Germany, where his research focuses on using data mining methods for the interactive analysis of large information repositories in the Life Sciences. Most of the research results are made available to the public via the open source data mining platform KNIME.

George Forman will present: 'Real-World Interactive Machine Learning of Customer Support Logs at Hewlett-Packard'. He is a senior research scientist at Hewlett-Packard Labs. His research interests stem from practical issues that arise in the application of machine learning to industrial problems, e.g. feature selection, robustness, small training sets, and novel problem formulations, such as interactive machine learning. With over 40 publications and 48 patents, he frequently serves as a journal reviewer and on program committees of conferences such as KDD and ECML PKDD. He received his PhD in Computer Science & Engineering from the University of Washington, Seattle, in 1996.

Important Dates

Submission Deadline: 29th of June 2012, 23:59 PST
Notification to Authors: 20th of July 2012, 23:59 PST
Camera-ready Deadline: 3rd of August 2012, 23:59 PST
Workshop day: 24th of September 2012

Organizers

You can contact us at:
iid2012 (at) easychair.org

Program Committee

  • Bettina Berendt, KU Leuven
  • Michael Berthold, University of Konstanz
  • Albert Bifet, University of Waikato
  • Mario Boley, University of Bonn and Fraunhofer IAIS
  • Polo Chau, Georgia Tech
  • Tijl De Bie, University of Bristol
  • Jaakko Hollmén, Aalto University
  • Florian Mansmann, University of Konstanz
  • Naren Ramakrishnan, Virginia Tech
  • Thomas Seidl, RWTH Aachen University
  • Geoff Webb, Monash University
  • Indrė Žliobaitė, Bournemouth University

What's IID?

Today, we lack the technology to perform 'free-style' exploratory analysis on large amounts of data, allowing users to make discoveries by following their intuition. Standard data mining aims at finding highly interesting results, but this typically results in techniques that are computationally extremely demanding and therefore time-consuming. Consequently, these techniques are hardly useful for the interactive exploration of large databases. To tackle this problem, we propose instant, interactive and adaptive data mining as a new data mining paradigm.

By instant, we mean that good results should be presented to the user within a few seconds. Short waiting times are essential to keep the user's attention. By interactive, we mean that the user should be able to give feedback on the fly, allowing the user to influence the analysis. Identifying intermediate results as (un)interesting allows the algorithm to focus on specific parts of the database and certain types of results. By adaptive, we mean that the system should learn from previous interactions with the user. These elements are clearly intertwined: without near-instant results, there can be no true interactivity, and without techniques that can take feedback into account, it is unrealistic to expect good results fast. Being instant, interactive and adaptive are therefore key requirements for next generation data mining. This will require a shift of focus compared to contemporary data mining work, in that instant algorithms will no longer be complete or optimal, while interactive algorithms will provide users with an easy means to influence their calculations.

IID and Stream Mining

Several key challenges of instant and adaptive data mining are shared with the field of stream mining, which starts from the premise that the flow of data is continuous and never-ending. The key limitation in stream mining is the lack of arbitrary access to the data; data is provided in a given order and at a given rate, and only limited amounts of data can be buffered for processing at a later time. This brings along the need for on-line and any-time algorithms, that is, algorithms that are capable of processing the data as it becomes available and that can quickly provide partial results based on the data seen so far, without accessing the complete data. These algorithms should also be capable of adapting to changing circumstances such as gradual drifts of distributions or sudden shifts of the concepts underlying the data. Hence, adaptability and instantaneousness are key for developing successful stream mining algorithms as well.
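As a small illustration of what "on-line and any-time" can mean in practice, the sketch below keeps a running mean and variance over a stream using Welford's method. It is only an example chosen for this page, not a technique taken from any of the workshop contributions; the class name RunningStats and the stream() generator are hypothetical, and the short list of numbers merely stands in for an unbounded data source. Each item is seen once, nothing is buffered, and a partial result can be read at any moment.

```python
class RunningStats:
    """Online mean and variance (Welford's method): one pass, constant memory."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        # Process a single item as it arrives; the item itself is not stored.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    def variance(self):
        # Sample variance of the items seen so far (0.0 until two items arrive).
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


def stream():
    # Hypothetical data source; stands in for a continuous, never-ending stream.
    for x in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield x


stats = RunningStats()
for x in stream():
    stats.update(x)
    # Any-time property: stats.mean and stats.variance() are valid here,
    # based only on the data seen so far.
```

Handling drift or sudden concept shifts would need more than this, for example some form of forgetting over older items; the sketch covers only the one-pass, limited-memory aspect.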

Even though instant mining and stream mining start from different perspectives and operate under different constraints, we strongly believe there is a significant overlap in techniques and that developments in either setting can have a significant impact on the other. Therefore, the goal of this workshop is to bring together researchers with a shared interest in instant and adaptive data mining methods, whether for use in interactive systems or in the processing of large streams of evolving data.