Collaborative Research Center SFB 876 - Providing Information by Resource-Constrained Data Analysis

The Collaborative Research Center SFB 876 brings together data mining and embedded systems. On the one hand, embedded systems can be further improved using machine learning. On the other hand, data mining algorithms can be realized in hardware, e.g. on FPGAs, or run on GPGPUs. The restrictions of ubiquitous systems in computing power, memory, and energy demand new algorithms for known learning tasks. These resource-bounded learning algorithms may also be applied to extremely large databases on servers.

DockHa - Personal Hadoop cluster on Docker Swarm in minutes

Analysing Big Data typically involves developing for Hadoop or comparing against it. For research on new algorithms, a personal Hadoop cluster that runs independently of other software and other Hadoop clusters provides a sealed environment for testing and benchmarking. Easy setup, resizing and stopping enable rapid prototyping on a containerized playground.

DockHa is a project developed at the Artificial Intelligence Group, TU Dortmund University, that aims to simplify and automate the setup of independent Hadoop clusters in the SFB 876 Docker Swarm cluster. The Hadoop properties and setup parameters can be modified to suit the application. More information can be found in the software section (DockHa) and the Bitbucket repository (DockHa-Repository).

Survey on the Internet of Things published: Opportunities and Challenges for Distributed Data Analysis

As part of the work for project B3, the survey on Opportunities and Challenges for Distributed Data Analysis by Marco Stolpe has now been published at ACM SIGKDD.

This survey motivates how the real-time analysis of data, embedded into the Internet of Things (IoT), enables entirely new kinds of sustainable applications in sectors such as manufacturing, transportation and distribution, energy and utilities, the public sector, and healthcare. It presents and discusses the challenges that real-time constraints pose for state-of-the-art analysis methods. Current research strongly focuses on cloud-based big data analysis. Our survey provides a more balanced view, also taking into account highly communication-constrained scenarios which require research on decentralized analysis algorithms. These must analyse data directly on sensors and small devices. The survey also discusses the vertical partitioning of data that is common in the IoT, which is particularly challenging since information about the same observations is assessed at different networked nodes. The paper includes a comprehensive bibliography that should provide readers with a good starting point for their own work.
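
To illustrate what vertical partitioning means here, the following minimal Python sketch contrasts it with horizontal partitioning on a toy data set. The sensor names, values, and helper functions are hypothetical and chosen only for illustration; they are not taken from the survey.

```python
# Toy data: 4 observations (rows), each with 3 features (columns).
# All values and feature names are made up for illustration.
feature_names = ["temperature", "humidity", "pressure"]
observations = [
    [21.5, 0.40, 1012.0],
    [22.1, 0.38, 1011.5],
    [19.8, 0.55, 1009.2],
    [20.3, 0.51, 1010.0],
]


def partition_horizontally(rows, n_nodes):
    """Each node holds complete observations, but only a subset of them."""
    return [rows[i::n_nodes] for i in range(n_nodes)]


def partition_vertically(rows, names):
    """Each node holds a single feature, but for every observation."""
    return {name: [row[j] for row in rows] for j, name in enumerate(names)}


def reassemble(vertical, names, index):
    """Rebuild one full observation; this touches every node once."""
    return [vertical[name][index] for name in names]


horizontal = partition_horizontally(observations, 2)
vertical = partition_vertically(observations, feature_names)

# Horizontally, a node can inspect a full observation locally.
print(horizontal[0][0])
# Vertically, a full observation requires data from all three nodes.
print(reassemble(vertical, feature_names, 2))
```

In the vertical case, no single node sees a complete observation, so any analysis that relates features to each other needs communication between nodes. This is the setting in which communication constraints matter most.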
