The Summer School will be offered as a hybrid event. Due to the ongoing COVID-19 pandemic, it is not guaranteed that every international participant or lecturer can travel to Dortmund. The event will thus be a mixture of local and (possibly some) remote lectures.
All lectures will also be streamed via Zoom and YouTube for the remote audience of participants who could not travel to Germany.
Lectures will be available on-demand on YouTube during the week of the Summer School.
Each lecture will be accompanied by a dedicated Q&A session.
The schedule for lectures and Q&A sessions will be announced on this webpage soon.
The following lectures will be part of the Summer School.
Note that this is a preliminary list.
More lectures are yet to be added and the list will be updated regularly.
The warm welcome to the summer school comes with an introduction to the collaborative research center SFB 876, which organises REAML 2022. What are the hot topics of resource-aware machine learning? Why and how should we save energy and communication when learning or applying the learned models? We conclude with practical hints.
Data science and machine learning are taking the world by storm. Almost all theory and methods, however, share a basic flaw that prevents them from being used in practice. Contrary to what most papers assume, in many applications (e.g., autonomous driving, industrial machines, or healthcare) it is impossible or hugely impractical to gather all data into one place. This is not only due to privacy concerns; the sheer size of the data often makes centralizing and processing it infeasible. Federated learning offers a solution: models are trained only locally and combined to create a well-performing joint model, without sharing data. As with many data science techniques, applying federated learning in practice requires a high level of trust. However, guaranteeing model quality, training and resource efficiency, bounding the communication, and ensuring data privacy is a huge undertaking. In this talk I will present efficient, theoretically sound, and practically useful methods for federated machine learning, and identify important and exciting open problems.
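To make the core idea concrete, here is a minimal sketch of parameter averaging in the spirit of federated averaging (the function names, the linear model, and the size-weighted combination are illustrative assumptions, not the speaker's method):

```python
import numpy as np

def local_update(weights, data, labels, lr=0.1):
    """One round of local training on a client's private data
    (here: a single gradient step for a linear model)."""
    preds = data @ weights
    grad = data.T @ (preds - labels) / len(data)
    return weights - lr * grad

def federated_average(client_weights, client_sizes):
    """Combine local models into a joint model without sharing any
    raw data: average parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    return sum(w * (n / total)
               for w, n in zip(client_weights, client_sizes))

# Each client trains locally; only model parameters leave the device.
rng = np.random.default_rng(0)
global_w = np.zeros(5)
clients = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]
for _ in range(10):  # communication rounds
    updates = [local_update(global_w, X, y) for X, y in clients]
    global_w = federated_average(updates, [len(X) for X, _ in clients])
```

The communication cost per round is one parameter vector per client, which is exactly the quantity the lecture's efficiency guarantees aim to bound.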
The lecture starts with information theoretic considerations, which show why inverse problems are hard when a measurement is distorted by finite-resolution effects. In most cases this implies that the problem can only be solved by biasing the result, for example through regularisation methods. Different ways to implement this approach and to control the resulting bias are discussed, with a special focus on the proper interpretation of the results.
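As a concrete illustration of this bias (a standard Tikhonov example, not taken from the lecture): for a linear measurement model $b = Ax + \varepsilon$, the regularised estimate

$$
\hat{x}_\lambda = \arg\min_x \lVert Ax - b \rVert^2 + \lambda \lVert x \rVert^2 = (A^\top A + \lambda I)^{-1} A^\top b
$$

is stable against noise for $\lambda > 0$ but systematically biased towards solutions with small $\lVert x \rVert$; the choice of $\lambda$ is precisely the knob that trades bias against stability and must be accounted for when interpreting the result.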
In the field of data mining, there is one big open problem: optimization subject to binary constraints. Binary constraints make data mining results interpretable and definite. Is this picture showing a cat? Should this movie be recommended to this user? Should the next chess move be this one? Binary results give a definite answer to questions like these. There are many methods able to solve binary constrained problems; however, they mostly work under one condition: exclusivity. That is, if a picture shows a cat, then it cannot show a dog; if a movie is recommended to one user, then it can't be recommended to another; and there should be only one next chess move which is the optimal one. Depending on the application, this assumption is more or less justified. The field of clustering is an area in which optimization subject to binary constraints is explicitly studied. In this talk, we will discuss the broad spectrum of tasks where a matrix factorization approximation error in Frobenius norm is minimized subject to binary constraints. We will unveil under which circumstances this optimization task defines the clustering objectives of k-means, spectral clustering, and subspace clustering, but we will also make connections to methods like deep learning. We will also see how bridging those disciplines under the umbrella of matrix factorization establishes novel research ideas and insights, providing inspiration to tackle pending research questions of adversarial learning, of computing meaningful embeddings, and of learning sensible similarity metrics.
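For concreteness (a standard identity, stated here as background rather than taken from the lecture): the k-means objective is one such binary-constrained Frobenius-norm factorization,

$$
\min_{Y, M} \; \lVert X - Y M \rVert_F^2 \quad \text{s.t.} \quad Y \in \{0,1\}^{n \times k},\; Y\mathbf{1}_k = \mathbf{1}_n,
$$

where the rows of $X \in \mathbb{R}^{n \times d}$ are the data points, $Y$ is the binary cluster-assignment matrix (each point belongs to exactly one cluster, which is the exclusivity condition above), and the rows of $M$ are the cluster centroids.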
Data analyses usually entail the application of many scripts, notebooks, and command line tools to transform, filter, aggregate or plot data and results. With ever increasing amounts of data being collected in science, reproducible and scalable automatic workflow management becomes increasingly important. Snakemake is a workflow management system consisting of a text-based workflow specification language and a scalable execution environment that allows the parallelized execution of workflows on workstations, compute servers and clusters without modification of the workflow definition. Snakemake thereby puts a particular focus on transparency and human readability, as well as adaptability and modularization of data analyses.
With over 380,000 downloads and on average more than 7 new citations per week in 2021 (>1300 in total), Snakemake is one of the most widely used systems for reproducible data analysis.
This tutorial will introduce the Snakemake workflow definition language and describe how to use the execution environment. Further, it will be shown how Snakemake helps to create reproducible and transparent analyses that can be adapted to new data with little effort.
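To give a flavour of the workflow definition language ahead of the tutorial, here is a minimal Snakefile (written in Snakemake's Python-based DSL; the file names and the plotting script are illustrative assumptions):

```python
# Snakefile: a two-step workflow. Rules declare their inputs and
# outputs; Snakemake infers the execution order from these
# dependencies and runs independent jobs in parallel.

SAMPLES = ["a", "b"]

rule all:
    input:
        expand("plots/{sample}.png", sample=SAMPLES)

rule filter_data:
    input:
        "data/{sample}.csv"
    output:
        "filtered/{sample}.csv"
    shell:
        "grep -v '^#' {input} > {output}"

rule plot:
    input:
        "filtered/{sample}.csv"
    output:
        "plots/{sample}.png"
    script:
        "scripts/plot.py"  # hypothetical plotting script
```

Because the workflow is described purely in terms of file dependencies, the same definition runs unchanged on a laptop or, with a different execution backend, on a cluster.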
The generalization mystery in deep learning is the following: Why do over-parameterized neural networks trained with gradient descent (GD) generalize well on real datasets even though they are capable of fitting random datasets of comparable size? Furthermore, from among all solutions that fit the training data, how does GD find one that generalizes well (when such a well-generalizing solution exists)?
We argue that the answer to both questions lies in the interaction of the gradients of different examples during training. Intuitively, if the per-example gradients are well-aligned, that is, if they are coherent, then one may expect GD to be (algorithmically) stable, and hence generalize well. We formalize this argument with an easy-to-compute and interpretable metric for coherence, and show that the metric takes on very different values on real and random datasets for several common vision networks. The theory also explains a number of other phenomena in deep learning, such as why some examples are reliably learned earlier than others, why early stopping works, and why it is possible to learn from noisy labels. Moreover, since the theory provides a causal explanation of how GD finds a well-generalizing solution when one exists, it motivates a class of simple modifications to GD that attenuate memorization and improve generalization.
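One plausible formalization of such a coherence measure (an illustrative sketch, not necessarily the exact metric of the lecture): compare the squared norm of the averaged per-example gradient to the average per-example squared gradient norm.

```python
import numpy as np

def coherence(per_example_grads):
    """Alignment of per-example gradients: close to 1 if all gradients
    point the same way, close to 0 if they cancel each other out.
    per_example_grads: array of shape (m, d), one flattened gradient
    per training example."""
    g = np.asarray(per_example_grads)
    mean_grad = g.mean(axis=0)
    # ||E[g_i]||^2 relative to E[||g_i||^2]
    return np.dot(mean_grad, mean_grad) / np.mean(np.sum(g * g, axis=1))

# Aligned gradients (think: real labels) score high ...
aligned = np.tile([1.0, 2.0, 3.0], (5, 1))
print(coherence(aligned))        # -> 1.0
# ... while mutually cancelling gradients (think: random labels) score low.
conflicting = np.vstack([np.eye(3), -np.eye(3)])
print(coherence(conflicting))    # -> 0.0
```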
Generalization in deep learning is an extremely broad phenomenon, and therefore, it requires an equally general explanation. We conclude with a survey of alternative lines of attack on this problem, and argue on this basis that the proposed approach is the most viable one.
Hardware implementation is critical to reducing execution time and energy consumption for the training and deployment of deep learning models. The use of field-programmable gate arrays (FPGAs) is a promising approach to achieve a good trade-off between the design cycle and performance for deep learning systems. This lecture on FPGA-based deep learning consists of two parts. The first part gives an overview of state-of-the-art FPGA design for training and inference of deep learning models. Specifically, this part covers potential benefits, application scenarios, main challenges, and design optimisation techniques for FPGA-based deep learning, with examples. The second part discusses a basic FPGA design for feed-forward networks (FFNs). The design accelerates the back-propagation process for FFN training and can be extended to support more complicated network architectures.
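For reference, here is a compact software version of the computation such a design accelerates (a plain NumPy sketch of one back-propagation step for a one-hidden-layer FFN; the layer sizes and learning rate are illustrative, and the FPGA mapping noted in the comments is a general observation, not this lecture's specific design):

```python
import numpy as np

rng = np.random.default_rng(0)
# One hidden layer with ReLU; sizes are illustrative.
W1, b1 = rng.normal(scale=0.1, size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(scale=0.1, size=(1, 8)), np.zeros(1)

def train_step(x, y, lr=0.01):
    """One back-propagation step for a tiny feed-forward network.
    On an FPGA, the matrix-vector products below map onto parallel
    multiply-accumulate units, which is where the speedup comes from."""
    global W1, b1, W2, b2
    # Forward pass
    h_pre = W1 @ x + b1
    h = np.maximum(h_pre, 0.0)          # ReLU activation
    y_hat = W2 @ h + b2
    # Backward pass (squared-error loss)
    d_y = y_hat - y
    dW2 = np.outer(d_y, h)
    d_h = (W2.T @ d_y) * (h_pre > 0)    # ReLU derivative
    dW1 = np.outer(d_h, x)
    # Gradient-descent update
    W2 -= lr * dW2; b2 -= lr * d_y
    W1 -= lr * dW1; b1 -= lr * d_h
    return 0.5 * float(d_y @ d_y)

for _ in range(100):
    loss = train_step(rng.normal(size=4), np.array([1.0]))
```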
Although cosmic rays were already discovered more than 100 years ago, their exact origin, as well as the physical mechanisms involved in their acceleration, remain largely unknown. In order to resolve this mystery, large-scale facilities have been set up around the globe, which target different messenger particles. For many of these facilities the use of machine learning algorithms has become a standard analysis technique. Algorithms and their application differ between individual analyses, but especially Boosting, Random Forests and Deep Neural Networks are not only popular but also highly successful choices. This lecture will provide an overview of the challenges associated with the detection and analysis of different messenger particles and how these challenges can be addressed via the use of machine learning and deep learning algorithms.
The COVID-19 pandemic has shown the importance of medical testing for an early detection of regional disease hot spots and for monitoring the course of the pandemic. In particular, the coupling of medical biosensors with concepts of machine learning has the potential to meet the requirements for efficient and robust detection of current and future pathogens. The lecture illustrates this with the example of the plasmon-assisted microscopy sensor that can make nanometer-sized particles (e.g., viruses) visible. The principle of operation of the sensor and the concept for the detection of nanometer-sized particles are explained. The challenge is that the analysis is carried out on the basis of data-intensive and very noisy or artefact-afflicted image sequences, and that the processing of the image sequences should be done in (soft) real time while minimising resource consumption, e.g., of energy and memory. The lecture is thus at the same time an introduction to the hackathon.
Fitting the context of the COVID-19 pandemic, the summer school is accompanied by a challenge on the detection of nanoparticles such as viruses. Using a plasmon-assisted microscopy sensor that can make nanometer-sized particles visible, we provide real-world images containing virus-like signals. The participants are challenged to test their knowledge of machine learning and cyber-physical systems in this real-world scenario. In this hackathon, they aim to achieve the most reliable and rapid detection possible with limited resources. They will receive datasets with particles of defined sizes for training and validating their approaches. All submitted approaches are evaluated against a previously unknown dataset and ranked using a metric that considers both the predictive quality and the resource efficiency of the model.
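The exact scoring metric is not specified here; purely as an illustration of how predictive quality and resource use might be combined into a single ranking score, consider a sketch like the following (the function name, the choice of F1, the budgets, and the weights are all hypothetical):

```python
def challenge_score(f1, runtime_s, memory_mb,
                    runtime_budget_s=60.0, memory_budget_mb=512.0,
                    alpha=0.7):
    """Hypothetical combined score: trade predictive quality (F1)
    against normalized resource consumption. Higher is better."""
    resource_penalty = 0.5 * (runtime_s / runtime_budget_s
                              + memory_mb / memory_budget_mb)
    return alpha * f1 - (1 - alpha) * resource_penalty
```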