Active Simulation Data Mining
Learning from Simulations
A simulation models how the state of a system evolves over time, given the simulation parameters. Taken together, multiple steps of a simulation constitute a black-box data generator that produces data from its input parameters.
Active sampling tries to acquire an optimal set of labeled data for training a supervised model. If we consider the observation or the label to be part of the simulation input, this task corresponds to active learning [1] or to active class selection [2], respectively.
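The two settings can be illustrated with a minimal sketch of a toy simulator. Everything in it is an assumption made for illustration: the class names `signal` and `background`, the Gaussian feature model, and the function names are hypothetical and not taken from any real simulation.

```python
import math
import random

# Hypothetical toy model: one Gaussian feature per class (unit variance).
CLASS_MEANS = {"signal": 1.0, "background": -1.0}

def simulate_given_label(label, rng):
    """Active class selection: the label is an input, the observation is generated."""
    x = rng.gauss(CLASS_MEANS[label], 1.0)
    return x, label

def simulate_given_observation(x, rng):
    """Active learning: the observation is an input, the label is generated."""
    # Posterior of "signal" under the toy model above with equal class priors.
    p_signal = 1.0 / (1.0 + math.exp(-2.0 * x))
    label = "signal" if rng.random() < p_signal else "background"
    return x, label

rng = random.Random(42)
print(simulate_given_label("signal", rng))    # we chose the label
print(simulate_given_observation(0.3, rng))   # we chose the observation
```

In both cases the simulator stays a black box from the learner's perspective; only the choice of which input to control differs.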
Active Class Selection
The goal of active class selection (ACS) [2] is to optimize the class proportions in newly acquired data, so that a classifier trained on that data exhibits maximum performance during deployment. In a simulation where the label is part of the simulation parameters, we face exactly the ACS problem: in which class proportions should we simulate?
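To make the question concrete, the following sketch splits a fixed simulation budget over classes according to given class proportions. It shows two simple baseline choices only, uniform proportions and proportions that match an assumed deployment distribution. The class names, the budget, the deployment proportions, and the function names are hypothetical; this is not the strategy proposed in the publications listed on this page.

```python
def allocate(budget, proportions):
    """Split `budget` simulation runs over classes (largest-remainder rounding)."""
    total = sum(proportions.values())
    shares = {c: budget * p / total for c, p in proportions.items()}
    counts = {c: int(s) for c, s in shares.items()}
    leftover = budget - sum(counts.values())
    # Hand the remaining runs to the classes with the largest fractional parts.
    by_remainder = sorted(shares, key=lambda c: shares[c] - counts[c], reverse=True)
    for c in by_remainder[:leftover]:
        counts[c] += 1
    return counts

def uniform_proportions(classes):
    """Baseline: simulate every class equally often."""
    return {c: 1.0 / len(classes) for c in classes}

classes = ["signal", "background"]
assumed_deployment = {"signal": 0.1, "background": 0.9}  # hypothetical values

print(allocate(1000, uniform_proportions(classes)))
print(allocate(1000, assumed_deployment))
```

Each resulting count is the number of simulation runs to request with the corresponding label as input. An ACS strategy replaces the fixed proportions with ones chosen to maximize deployment performance.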
You can find all of our online conference talks on YouTube.
Use Case—Cherenkov Astronomy
Cherenkov astronomy reasons about the characteristics of cosmic objects by studying their gamma radiation. Since no labeled data is available from the actual detectors, the training data must be simulated. In each simulation run, we can freely choose the type of particle to be simulated, which is the label in the prediction task at hand.
Publications
- M. Bunse and K. Morik: Active Class Selection with Uncertain Deployment Class Proportions. In Interactive Adaptive Learning Workshop at ECML-PKDD, 2021 (to appear).
- M. Bunse and K. Morik: Certification of Model Robustness in Active Class Selection. In Europ. Conf. on Mach. Learn. and Knowledge Discovery in Databases, 2021 (to appear).
- M. Bunse, D. Weichert, A. Kister, and K. Morik: Optimal Probabilistic Classification in Active Class Selection. In Int. Conf. on Data Mining, 2020.
- M. Bunse, A. Saadallah, and K. Morik: Towards Active Simulation Data Mining. In Int. Tutorial and Workshop on Interactive Adaptive Learning at ECML-PKDD, pages 104-107, 2019.
- M. Bunse and K. Morik: What Can We Expect from Active Class Selection? In Lernen, Wissen, Daten, Analysen (LWDA), 2019, pages 79-83.
Supplementary Material
- Experiments on ACS at ECML-PKDD 2021: https://github.com/mirkobunse/AcsCertificates.jl
- Experiments on ACS at ICDM 2020: https://github.com/mirkobunse/acs-icdm20
- Interactive Adaptive Learning at ECML-PKDD 2019: https://p.ies.uni-kassel.de/ial2019/index.html
Bibliography
- [1] B. Settles. Active Learning. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers, 2012.
- [2] R. Lomasky, C. E. Brodley, M. Aernecke, D. Walt, and M. A. Friedl. Active class selection. In Proc. of the ECML, pages 640–647, 2007.
- [3] C. Bockermann, K. Brügge, J. Buss, A. Egorov, K. Morik, W. Rhode, and T. Ruhe. Online analysis of high-volume data streams in astroparticle physics. In Proc. of the ECML-PKDD, pages 100–115, 2015.
Share your ideas with us!
We are always looking for comments, criticism, and collaborators. Can we count you in? :)
mirko.bunse [ät] cs.tu-dortmund.de
You may also like our work on deconvolution.