Bibtype |
Inproceedings |
Bibkey |
Buschjaeger/etal/2021e |
Author |
Buschjäger, Sebastian and Honysz, Philipp-Jan and Pfahler, Lukas and Morik, Katharina |
Title |
Very Fast Streaming Submodular Function Maximization |
Booktitle |
Machine Learning and Knowledge Discovery in Databases. Research Track: European Conference, ECML PKDD 2021, Bilbao, Spain, September 13-17, 2021, Proceedings, Part III |
Pages |
151-166 |
Address |
Berlin, Heidelberg |
Publisher |
Springer-Verlag |
Abstract |
Data summarization has become a valuable tool in understanding even terabytes of data. Due to their compelling theoretical properties, submodular functions have been the focus of summarization algorithms. Submodular function maximization is a well-studied problem with a variety of algorithms available. These algorithms usually offer worst-case guarantees to the expense of higher computation and memory requirements. However, many practical applications do not fall under this mathematical worst-case but are usually much more well-behaved. We propose a new submodular function maximization algorithm called ThreeSieves that ignores the worst-case and thus uses fewer resources. Our algorithm selects the most informative items from a data-stream on the fly and maintains a provable performance in most cases on a fixed memory budget. In an extensive evaluation, we compare our method against 6 state-of-the-art algorithms on 8 different datasets including data with and without concept drift. We show that our algorithm outperforms the current state-of-the-art in the majority of cases and, at the same time, uses fewer resources.
|
Year |
2021 |
Projekt |
SFB876-A1 |
Doi |
https://doi.org/10.1007/978-3-030-86523-8_10 |
Isbn |
978-3-030-86522-1 |