
IoT

Large Scale
Learning from Data Streams

ECML-PKDD 2017
18 September 2017
The Conference

IoT

Large Scale
Learning from Data Streams

​

Workshop+Tutorial

ECML-PKDD 2018
10 September 2018

The Workshop

Workshop + Tutorial
18 September 2017
09:00 am

The volume of data is rapidly increasing due to advances in information and communication technology. This data comes mostly in the form of streams. Learning from this ever-growing amount of data requires flexible learning models that self-adapt over time. In addition, these models must take into account many constraints: (pseudo) real-time processing, high velocity, and dynamic multi-form change such as concept drift and novelty. This workshop welcomes novel research about learning from data streams in evolving environments. It will provide researchers and participants with a forum for exchanging ideas, presenting recent advances and discussing challenges related to data stream processing. It solicits original work, already completed or in progress. Position papers are also considered. This workshop is combined with a tutorial on the same topic, which will be presented on the same day.


The Workshop

3rd ECML/PKDD 2018 Workshop on
IoT Large Scale Machine Learning from Data Streams

Workshop + Tutorial
10 September 2018
09:00 am


​

Motivation and focus


The volume of data is rapidly increasing due to advances in information and communication technology. This data comes mostly in the form of streams. Learning from this ever-growing amount of data requires flexible learning models that self-adapt over time. In addition, these models must take into account many constraints: (pseudo) real-time processing, high velocity, and dynamic multi-form change such as concept drift and novelty. Consequently, learning from streams of evolving and unbounded data requires developing new algorithms and methods able to learn under the following constraints:

  • random access to observations is infeasible or costly;

  • memory is small relative to the size of the data;

  • the data distribution or the phenomena generating the data may evolve over time, which is known as concept drift;

  • the number of classes may evolve over time.

Therefore, efficient data stream processing requires particular drivers and learning techniques:

  • Incremental learning, in order to integrate the information carried by each newly arriving data sample;

  • Decremental learning, in order to forget or unlearn the data samples that are no longer useful;

  • Novelty detection, in order to learn new concepts.

It is worth emphasizing that streams are very often generated by distributed sources, especially with the advent of the Internet of Things; processing them centrally may therefore be inefficient, especially if the infrastructure is large and complex. Scalable and decentralized learning algorithms are potentially more suitable and efficient.
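As an illustration only, the three techniques listed above (incremental learning, decremental learning and novelty detection) can be sketched on a one-dimensional stream. The class name `WindowedNoveltyDetector`, the window size and the z-score threshold are assumptions made for this example and are not taken from the workshop material. The summary is kept as running sums for brevity, which is numerically fragile on long streams; a production system would likely use a more stable update.

```python
from collections import deque
import math


class WindowedNoveltyDetector:
    """Hypothetical sketch: bounded-memory mean/variance with novelty flagging."""

    def __init__(self, window=50, threshold=3.0):
        self.window = window        # memory budget: number of samples retained
        self.threshold = threshold  # z-score above which a sample is flagged
        self.buffer = deque()
        self.total = 0.0
        self.total_sq = 0.0

    def _learn(self, x):
        # Incremental step: fold one new observation into the summary.
        self.buffer.append(x)
        self.total += x
        self.total_sq += x * x

    def _forget(self):
        # Decremental step: remove the oldest observation from the summary.
        old = self.buffer.popleft()
        self.total -= old
        self.total_sq -= old * old

    def is_novel(self, x):
        # Novelty check: distance from the current mean in standard deviations.
        n = len(self.buffer)
        if n < 2:
            return False
        mean = self.total / n
        var = max(self.total_sq / n - mean * mean, 0.0)
        std = max(math.sqrt(var), 1e-9)  # avoid division by zero
        return abs(x - mean) / std > self.threshold

    def update(self, x):
        # Test first, then learn, so a sample is judged against past data only.
        novel = self.is_novel(x)
        self._learn(x)
        if len(self.buffer) > self.window:
            self._forget()
        return novel
```

Calling `update(x)` once per arriving sample processes the stream in a single pass with constant memory. Because old samples are forgotten, a value that was novel before a change in the data distribution stops being flagged once the window has filled with samples from the new regime.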


Aim and scope

​

This workshop welcomes novel research about learning from data streams in evolving environments. It will provide researchers and participants with a forum for exchanging ideas, presenting recent advances and discussing challenges related to data stream processing. It solicits original work, already completed or in progress. Position papers are also considered. The scope of the workshop includes, but is not limited to:

  •  Online and incremental learning

  •  Online classification, clustering and regression

  •  Online dimension reduction

  •  Data drift and shift handling

  •  Online active and semi-supervised learning

  •  Online transfer learning

  •  Adaptive data pre-processing and knowledge discovery

  •  Applications in

    •  Monitoring

    •  Quality control

    •  Fault detection, isolation and prognosis

    •  Internet analytics

    •  Decision support systems

    •  etc.

 

Submission and Review process


Regular and short papers presenting work completed or in progress are invited. Regular papers should not exceed 12 pages, while short papers are limited to 6 pages. Papers must be written in English and submitted in PDF format online via the EasyChair submission interface:

​

https://easychair.org/conferences/?conf=iotstreaming2018

 

Each submission will be evaluated by at least two members of the program committee on the basis of relevance, significance of contribution, quality of presentation and technical quality. All accepted papers will be included in the workshop proceedings and will be publicly available on the conference website. At least one author of each accepted paper is required to attend the workshop and present the paper.


Important dates


Paper submission deadline: Monday, July 16th, 2018
Paper acceptance notification: Friday, July 27th, 2018
Paper camera-ready submission: Monday, August 6th, 2018


Program Committee members (to be confirmed)
​

 

  • Carlos Ferreira, LIAAD INESC Porto LA, ISEP, Portugal

  • Edwin Lughofer, Johannes Kepler University of Linz, Austria

  • Sylvie Charbonnier, Université Joseph Fourier-Grenoble, France

  • Bruno Sielly Jales Costa, IFRN, Natal, Brazil

  • Fernando Gomide, University of Campinas, Brazil

  • José A. Iglesias, Universidad Carlos III de Madrid, Spain

  • Anthony Fleury, Mines-Douai, Institut Mines-Télécom, France

  • Teng Teck Hou, Nanyang Technological University, Singapore

  • Plamen Angelov, Lancaster University, UK

  • Igor Skrjanc, University of Ljubljana, Slovenia

  • Indre Zliobaite, Aalto University, Finland

  • Elaine Faria, Univ. Uberlandia, Brazil

  • Mykola Pechenizkiy, TU Eindhoven, Netherlands

  • Raquel Sebastião, Univ. Aveiro, Portugal


Workshop Organizers

 

Moamar Sayed-Mouchaweh

Computer Science and Automatic Control Labs, High Engineering School of Mines, Douai, France
moamar.sayed-mouchaweh@mines-douai.fr


Albert Bifet
Telecom-ParisTech, Paris, France
albert.bifet@telecom-paristech.fr


Hamid Bouchachia
Department of Computing & Informatics, University of Bournemouth, Bournemouth, UK
abouchachia@bournemouth.ac.uk


João Gama
Laboratory of Artificial Intelligence and Decision Support, University of Porto, Porto, Portugal
jgama@fep.up.pt


Rita Ribeiro
Laboratory of Artificial Intelligence and Decision Support, University of Porto, Porto, Portugal
rpribeiro@dcc.fc.up.pt


The Tutorial

Tutorial: IoT Data Stream Mining in Practice

The challenge of deriving insights from the Internet of Things (IoT) has been recognized as one of the most exciting and key opportunities for both academia and industry. The advent of IoT applications is here: industry 4.0, connected industry, industry automation, smart cities, smart grids, energy efficiency, etc. All these IoT applications require advanced analysis of big data streams from sensors and small devices, while addressing security and privacy concerns. This tutorial is a gentle introduction to mining IoT big data streams. The first part introduces data stream learners for several learning tasks, including distributed algorithms. The second and third parts present applications in predictive maintenance, prediction for renewable energies, and social network analysis for telecommunications data streams. The last part shows how to use Apache Spark Streaming to apply scalable machine learning to big data streams.

​

Content:
​

1. IoT Fundamentals and IoT Stream Mining Algorithms
– Predictive Learning
– Descriptive Learning
– Frequent Pattern Mining
2. Case Study: Predictive Maintenance
– Problem Definition
– Change, Anomaly and Novelty Detection
– Failure Prediction and Detection
3. Case Study: Social Network Analysis
– Challenges in mining networked data
– Online sampling
– Evolving centralities and communities
– Tracking the dynamics of evolving communities
4. Big Data Stream Mining using Spark Streaming
– Fundamental concepts
– Examples
– API

​

Presenters:
​
  • João Gama

  • Rita Ribeiro

  • Moamar Sayed-Mouchaweh

  • Heitor Murilo Gomes

  • Latifur Khan

  • Albert Bifet


Program

 

9:00 - 10:40 Tutorial I: IoT Stream Mining Algorithms and Predictive Maintenance

10:40 - 11:00 Morning coffee break

11:00 - 12:40 Tutorial II: Social Networks and Big Data

​

12:40 - 14:00 Lunch break

 

14:00 - 15:40 SESSION 1

14:00 - 14:40 Invited talk: Bernhard Pfahringer: Partial solutions to some current challenges in stream mining

14:40 - 15:00 Query Log Analysis: Detecting Anomalies in DNS Traffic at a TLD Resolver.
Pieter Robberechts, Maarten Bosteels, Jesse Davis and Wannes Meert.

15:00 - 15:20 Multimodal Tweet Sentiment Classification Algorithm Based on Attention Mechanism.
Peiyu Zou and Shuangtao Yang.

15:20 - 15:40 Deep Online Storage-free Learning on Unordered Image Streams.
Andrey Besedin, Pierre Blanchart, Michel Crucianu and Marin Ferecatu.

15:40 - 16:00 Coffee break

16:00 - 17:00 SESSION 2

16:00 - 16:20 Self Hyper-parameter Tuning for Stream Recommendation Algorithms.
Bruno Veloso, Joao Gama and Benedita Malheiro.

16:20 - 16:40 Active Learning by Clustering for Drifted Data Stream Classification.
Jakub Zgraja, Joao Gama and Michal Wozniak.

16:40 - 17:00 Fault Prognostics for the Predictive Maintenance of Wind Turbines: State of the Art.
Koceila Abid, Moamar Sayed-Mouchaweh and Laurence Cornez.

​

Keynote Talk

Bernhard Pfahringer

Bernhard Pfahringer received his PhD degree from the University of
Technology in Vienna, Austria, in 1995. He is a Professor with the
Department of Computer Science at the University of Waikato in New
Zealand. His interests span a range of data mining and machine
learning sub-fields, with a focus on streaming, randomization, and
complex data.

​

​

Partial solutions to some current challenges in stream mining

​

Stream mining is concerned with online learning from non-stationary
data sources. I will argue that many, if not all, big data mining
endeavours are instances of stream mining. This presentation will
highlight issues in stream mining, including proper evaluation,
temporal dependencies, label acquisition, and preprocessing, and will
present some preliminary solutions for these challenges.
