IoT: Large-Scale Learning from Data Streams

Workshop + Tutorial, ECML-PKDD 2017
18 September 2017
The Workshop

2nd ECML/PKDD 2017 Workshop on
Large-scale Learning from Data Streams in Evolving Environments

Workshop + Tutorial
18 September 2017
09:00 am

The volume of data is increasing rapidly with advances in information and communication technology, and much of it arrives in the form of streams. Learning from this ever-growing amount of data requires flexible models that self-adapt over time. In addition, these models must cope with several constraints: (pseudo) real-time processing, high velocity, and dynamic, multi-form change such as concept drift and novelty. This workshop welcomes novel research on learning from data streams in evolving environments. It provides researchers and participants with a forum for exchanging ideas, presenting recent advances, and discussing challenges related to data stream processing. It solicits original work, either completed or in progress; position papers are also welcome. The workshop is combined with a tutorial on the same topic, presented on the same day.

Motivation and focus


Learning from streams of evolving and unbounded data requires developing new algorithms and methods able to learn under the following constraints:

  • random access to observations is not feasible, or has a high cost;

  • memory is small with respect to the size of the data;

  • the data distribution, or the phenomenon generating the data, may evolve over time, which is known as concept drift;

  • the number of classes may evolve over time.

Therefore, efficient data stream processing requires particular learning techniques:

  • Incremental learning, in order to integrate the information carried by each newly arriving sample;

  • Decremental learning, in order to forget or unlearn samples that are no longer useful;

  • Novelty detection, in order to learn new concepts as they appear.

It is worth emphasizing that streams are very often generated by distributed sources, especially with the advent of the Internet of Things (IoT). Processing them centrally may therefore be inefficient, especially when the infrastructure is large and complex; scalable and decentralized learning algorithms are potentially more suitable and efficient in such settings.
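
To make the incremental and decremental steps concrete, the snippet below is a minimal sketch (not part of the workshop material) of a regressor that maintains running sums over a fixed-size sliding window: each arriving sample updates the sums, and the oldest sample is removed once the window is full, so the model forgets outdated data and can follow a drifting concept. The class name and window size are illustrative choices only.

```python
from collections import deque

class WindowedLinearRegressor:
    """Least-squares fit y ~ a*x + b over the most recent `window` samples."""

    def __init__(self, window=200):
        self.window = window
        self.buffer = deque()
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def _update(self, x, y, sign):
        # sign = +1 adds a sample to the sufficient statistics, -1 removes it
        self.n += sign
        self.sx += sign * x
        self.sy += sign * y
        self.sxx += sign * x * x
        self.sxy += sign * x * y

    def learn_one(self, x, y):
        self.buffer.append((x, y))
        self._update(x, y, +1)                  # incremental step
        if len(self.buffer) > self.window:
            old_x, old_y = self.buffer.popleft()
            self._update(old_x, old_y, -1)      # decremental step (forgetting)

    def predict_one(self, x):
        if self.n < 2:
            return 0.0
        denom = self.n * self.sxx - self.sx * self.sx
        if denom == 0.0:
            return self.sy / self.n             # degenerate case: constant x
        a = (self.n * self.sxy - self.sx * self.sy) / denom
        b = (self.sy - a * self.sx) / self.n
        return a * x + b
```

Because only the window buffer and a handful of sums are kept, memory stays constant regardless of stream length, matching the constraints listed above.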


Aim and scope

​

This workshop welcomes novel research on learning from data streams in evolving environments and solicits original work, either completed or in progress; position papers are also welcome. The scope of the workshop includes, but is not limited to:

  •  Online and incremental learning

  •  Online classification, clustering and regression

  •  Online dimension reduction

  •  Data drift and shift handling

  •  Online active and semi-supervised learning

  •  Online transfer learning

  •  Adaptive data pre-processing and knowledge discovery

  •  Applications in

    •  Monitoring

    •  Quality control

    •  Fault detection, isolation and prognosis

    •  Internet analytics

    •  Decision Support Systems

    •  etc.

 

Submission and Review process


Regular and short papers presenting work completed or in progress are invited. Regular papers should not exceed 12 pages, while short papers should not exceed 6 pages. Papers must be written in English and submitted in PDF format online via the EasyChair submission interface:

​

https://easychair.org/conferences/?conf=iotstreaming2017

 

Each submission will be evaluated by at least two members of the program committee on the basis of relevance, significance of contribution, technical quality, and quality of presentation. All accepted papers will be included in the workshop proceedings and will be publicly available on the conference website. At least one author of each accepted paper is required to attend the workshop and present the paper.


Important dates


Paper submission deadline: Monday, July 17, 2017 (extended)
Paper acceptance notification: July 30, 2017
Paper camera-ready submission: Monday, August 7, 2017


Program Committee members (to be confirmed)
​

 

  • Carlos Ferreira, LIAAD INESC Porto LA, ISEP, Portugal

  • Edwin Lughofer, Johannes Kepler University of Linz, Austria

  • Sylvie Charbonnier, Université Joseph Fourier-Grenoble, France

  • Bruno Sielly Jales Costa, IFRN, Natal, Brazil

  • Fernando Gomide, University of Campinas, Brazil

  • José A. Iglesias, Universidad Carlos III de Madrid, Spain

  • Anthony Fleury, Mines-Douai, Institut Mines-Télécom, France

  • Teng Teck Hou, Nanyang Technological University, Singapore

  • Plamen Angelov, Lancaster University, UK

  • Igor Skrjanc, University of Ljubljana, Slovenia

  • Indre Zliobaite, Aalto University, Finland

  • Elaine Faria, Univ. Uberlandia, Brazil

  • Mykola Pechenizkiy, TU Eindhoven, Netherlands

  • Raquel Sebastião, Univ. Aveiro, Portugal


Workshop Organizers

 

Moamar Sayed-Mouchaweh

Computer Science and Automatic Control Labs, High Engineering School of Mines, Douai, France
moamar.sayed-mouchaweh@mines-douai.fr


Albert Bifet
Telecom-ParisTech, Paris, France
albert.bifet@telecom-paristech.fr


Hamid Bouchachia
Department of Computing & Informatics, Bournemouth University, Bournemouth, UK
abouchachia@bournemouth.ac.uk


João Gama
Laboratory of Artificial Intelligence and Decision Support, University of Porto, Porto, Portugal
jgama@fep.up.pt


Rita Ribeiro
Laboratory of Artificial Intelligence and Decision Support, University of Porto, Porto, Portugal
rpribeiro@dcc.fc.up.pt


The Tutorial

Tutorial: IoT Big Data Stream Mining

The challenge of deriving insights from the Internet of Things (IoT) has been recognized as one of the most exciting and key opportunities for both academia and industry. Advanced analysis of big data streams from sensors and devices is bound to become a key area of data mining research as the number of applications requiring such processing increases. Dealing with the evolution over time of such data streams, i.e., with concepts that drift or change completely, is one of the core issues in IoT stream mining. This tutorial is a gentle introduction to mining IoT big data streams. The first part introduces data stream learners for classification, regression, clustering, and frequent pattern mining. The second part deals with scalability issues inherent in IoT applications, and discusses how to mine data streams on distributed engines such as Spark, Flink, Storm, and Samza.
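
As a taste of the stream-learning setting covered in the first part, stream classifiers are commonly evaluated prequentially (test-then-train): each arriving example is first used to test the current model and only then to update it. The sketch below is a minimal illustration of that loop with a trivial majority-class baseline; the class and function names are placeholders, not material from the tutorial.

```python
from collections import Counter

class MajorityClassClassifier:
    """Trivial incremental baseline: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict_one(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn_one(self, x, y):
        self.counts[y] += 1

def prequential_accuracy(stream, model):
    """Test-then-train evaluation: predict first, then update, one example at a time."""
    correct = total = 0
    for x, y in stream:
        if model.predict_one(x) == y:   # test on the incoming example...
            correct += 1
        model.learn_one(x, y)           # ...then train on it
        total += 1
    return correct / total if total else 0.0

# Tiny synthetic stream with an abrupt concept change at t = 500:
stream = [({"t": t}, 0 if t < 500 else 1) for t in range(1000)]
print(prequential_accuracy(stream, MajorityClassClassifier()))
```

On this toy stream, the baseline's accuracy collapses after the change at t = 500, which is exactly the kind of degradation that adaptive stream learners and drift detectors are designed to handle.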

​

Content:
​

1. IoT Fundamentals and Stream Mining Algorithms

     – IoT Stream mining setting

     – Concept drift

     – Classification and Regression

     – Clustering

     – Frequent Pattern mining

     – Concept Evolution

     – Limited Labeled Learning

 

2. IoT Distributed Big Data Stream Mining

     – Distributed Stream Processing Engines

     – Classification

     – Regression

     – Open Source Tools

     – Applications

​

Presenters:
​
  • Gianmarco De Francisci Morales

  • Albert Bifet

  • Latifur Khan

  • Moamar Sayed-Mouchaweh

  • João Gama

  • Wei Fan


Program

 

9:00 - 10:40 Tutorial: 1. IoT Fundamentals and Stream Mining Algorithms

10:40 - 11:00 Morning coffee break

11:00 - 12:40 Tutorial: 2. IoT Distributed Big Data Stream Mining and Applications

​

12:40 - 14:00 Lunch break

 

14:00 - 15:40 SESSION 1 (Chair: Albert Bifet)

14:00 - 14:45 Invited talk: Geoff Webb, Learning from non-stationary distributions

15:00 - 15:15 A Sliding Window Filter for Time Series Streams
Gordon Lesti and Stephan Spiegel

15:20 - 15:35 Evolutive deep models for online learning on data streams with no storage
Andrey Besedin, Pierre Blanchart, Michel Crucianu and Marin Ferecatu

15:40 - 16:00 Coffee break

16:00 - 17:15 SESSION 2 (Chair: Moamar Sayed-Mouchaweh)

16:00 - 16:15 Hybrid Self Adaptive Learning Scheme for Simple and Multiple Drift-like Fault Diagnosis in Wind Turbine Pitch Sensors
Houari Toubakh and Moamar Sayed-Mouchaweh

16:20 - 16:35 Comparison between Co-training and Self-training for single-target regression in data streams using AMRules
Ricardo Sousa and João Gama

16:40 - 16:55 Self-Adaptive Ensemble Classifier for Handling Complex Concept Drift
Imen Khamassi and Moamar Sayed-Mouchaweh

17:00 - 17:15 Summary Extraction on Data Streams in Embedded Systems
Sebastian Buschjäger and Katharina Morik

Keynote Talk

Geoff Webb

Geoff Webb is Director of the Monash Centre for Data Science. He is a technical advisor to the data science startup BigML. He was Editor-in-Chief of the premier data mining journal, Data Mining and Knowledge Discovery (2005 to 2014), and Program Committee Chair of the two top data mining conferences, ACM SIGKDD (2015) and IEEE ICDM (2010), as well as General Chair of ICDM (2012). His primary research areas are machine learning, data mining, user modelling and computational structural biology. Many of his learning algorithms are included in the widely used BigML, R and Weka machine learning workbenches. He is an IEEE Fellow and received the inaugural Eureka Prize for Excellence in Data Science in 2017, the 2013 IEEE ICDM Service Award, a 2014 Australian Research Council Discovery Outstanding Researcher Award, the 2016 Australian Computer Society ICT Researcher of the Year Award and the 2016 Australasian Artificial Intelligence Distinguished Research Contributions Award.

​

​

Learning from non-stationary distributions

​

The world is dynamic – in a constant state of flux – but most learned models are static. Models learned from historical data are likely to decline in accuracy over time.  This talk presents formal tools for analyzing non-stationary distributions and some insights that they provide. Shortcomings of standard approaches to learning from non-stationary distributions are discussed together with strategies for developing more effective techniques.
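
As a generic illustration of why such analysis matters (the snippet below is not from the talk), one simple way to notice a model's accuracy declining under a changing distribution is to compare its recent error rate against its long-run error rate and raise a warning when the gap grows too large. The class name, window size and threshold are arbitrary choices for illustration.

```python
from collections import deque

class ErrorRateDriftMonitor:
    """Heuristic drift monitor: warns when the error rate over a recent window
    exceeds the long-run error rate by a fixed margin."""

    def __init__(self, recent_window=100, margin=0.15):
        self.recent = deque(maxlen=recent_window)   # 0/1 errors of the last `recent_window` predictions
        self.errors = 0                             # errors since the start of the stream
        self.total = 0                              # predictions since the start of the stream
        self.margin = margin

    def update(self, prediction, truth):
        """Record one prediction; return True if a drift warning should be raised."""
        err = int(prediction != truth)
        self.recent.append(err)
        self.errors += err
        self.total += 1
        long_run_rate = self.errors / self.total
        recent_rate = sum(self.recent) / len(self.recent)
        return recent_rate > long_run_rate + self.margin
```

When such a monitor fires, a typical reaction is to retrain or adapt the model using recent data only, rather than the full history.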
