• German
German

# A2  Algorithmic aspects of learning methods in embedded systems

Prof. Dr. Sohler, Christian
Prof. Dr. Teubner, Jens

The main objective of this project is to develop design paradigms for learning algorithms in embedded systems. To this end, we will consider computational models that reflect important aspects of embedded systems, develop and analyse algorithms in these models and empirically evaluate their performance to make conclusions about their efficiency as well as the prediction power of the considered computational models. Then we would like to use the insights gained this way to derive general algorithmic design paradigms for embedded systems.

In this third phase of the project we will continue our study of the aspects energy consumption, communication and computation architecture, data streams and distributed systems. We know from the first two phases of the project as well as the findings of other projects (for example, A1, A6, B3, B4 and C4) that small summaries of the data, such as sketches or coresets, are a suitable design paradigm to obtain data-streaming and distributed algorithms. In the current phase we will continue in this direction and try to unite our findings in different computational models to design a single approach that fits different aspects of embedded systems. For this purpose, we want to continue to deepen our understanding of the individual approaches. Our focus will lie on energy consumption, which we will study both theoretically and empirically.

## Project management:

Prof. Dr. Sohler, Christian
Prof. Dr. Teubner, Jens

## Project members:

MSc Funke, Henning

## Alumni project management:

Prof. Dr. Vahrenhold, Jan

## Alumni:

Dr. Breß, Sebastian
Ehsanfar, Ebrahim
Krivošija, Amer
Lüdemann, Dierk
Dr. Scheffer, Christian
Dr. Schwiegelshohn, Chris
Dr. Skopalik, Alexander
Temme, Sylvie
Thom, Andreas

## Publications:

 Bury/etal/2018a Bury, Marc and Schwiegelshohn, Chris and Sorella, Mara. Sketch 'Em All: Approximate Similarity Search for Dynamic Data Streams. In Yoelle Maarek and Yan Liu (editors), Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, (WSDM) 2018, Los Angeles, CA, USA, Febuary 6-8, 2018, ACM, 2018. Driemel/Krivosija/2018a Driemel, Anne and Krivosija, Amer. Probabilistic embeddings of the Fr\'echet distance. In Epstein, Leah and Erlebach, Thomas (editors), Proceedings of the 16th International Workshop on Approximation and Online Algorithms (WAOA), Vol. 11312, pages 218--237, Springer, 2018. Funke/etal/2017a Henning Funke and Sebastian Bre\ss and Stefan Noll and Volker Markl and Jens Teubner. Pipelined Query Processing on Coprocessor Environments. In Proceedings of the 2018 ACM SIGMOD Conference on Management of Data, 2018. Munteanu/etal/2018a Alexander Munteanu and Chris Schwiegelshohn and Christian Sohler and David P. Woodruff. On Coresets for Logistic Regression. In Advances in Neural Information Processing Systems (NIPS), to appear, 2018. Munteanu/Schwiegelshohn/2018a Munteanu, Alexander and Schwiegelshohn, Chris. Coresets - Methods and History: A Theoreticians Design Pattern for Approximation and Streaming Algorithms. In KI - Künstliche Intelligenz, Vol. 32, No. 1, pages 37-53, 2018. Noll/etal/2017b Stefan Noll and Jens Teubner and Norman May and Alexander Böhm. Accelerating Concurrent Workloads with CPU Cache Partitioning. In Proc. of the 34th IEEE Int'l Conference on Data Engineering (ICDE), Paris, France, 2018. Braverman/etal/2017a Braverman, Vladimir and Frahling, Gereon and Lang, Harry and Sohler, Christian and Yang, Lin F.. Clustering High Dimensional Dynamic Data Streams. In Doina Precup and Yee Whye Teh (editors), Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, Australia, August 6-11, 2017. Bury/Schwiegelshohn/2017a Bury, Marc and Schwiegelshohn, Chris. On Finding the Jaccard Center. In Indyk, Piotr and Kuhn, Fabian and Muscholl, Anca (editors), 44th International Colloquium, ICALP 2017, Warsaw, Poland, July 10-14. Proceedings, 2017. CohenAddad/Schwiegelshohn/2017a Cohen-Addad, Vincent and Schwiegelshohn, Chris. On the Local Structure of Stable Clustering Instances. In Chris Umans (editors), Proceedings of the 58th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2017), Berkeley, CA, October 15-17, 2017. Gmyr/etal/2017a Gmyr, Robert and Hinnenthal, Kristian and Scheideler, Christian and Sohler, Christian. Distributed Monitoring of Network Properties: The Power of Hybrid Networks. In Indyk, Piotr and Kuhn, Fabian and Muscholl, Anca (editors), 44th International Colloquium, ICALP 2017, Warsaw, Poland, July 10-14. Proceedings, 2017. Noll/2017a Stefan Noll. Energy Efficiency in Main-Memory Databases. In Datenbanksysteme für Business, Technologie und Web (BTW), Studierendenprogramm, pages 335--344, 2017. Noll/etal/2017a Stefan Noll and Henning Funke and Jens Teubner. Energy Efficiency in Main-Memory Databases. In Datenbank-Spektrum, 2017. CohenAddad/etal/2016a Cohen-Addad, Vincent and Schwiegelshohn, Chris and Sohler, Christian. Diameter and k-Center in Sliding Windows. In Mitzenmacher, Michael and Rabani, Yuval and Sangiorgi, Davide (editors), 43nd International Colloquium, ICALP 2016, Rome, Italy, July 12-15. Proceedings, 2016. Driemel/2016a Driemel, Anne and Krivosija, Amer and Sohler, Christian. Clustering time series under the Fr\'echet distance. In Krauthgamer, Robert (editors), Proceedings of the 27th Symposium on Discrete Algorithms (SODA), pages 766-785, SIAM, 2016. Schwiegelshohn/Schwiegelshohn/2016a Schwiegelshohn, Chris and Schwiegelshohn, Uwe. The Power of Migration for Online Slack Scheduling. In Piotr Sankowski and Christos Zaroliagis (editors), 24th European Symposium of Algorithms, ESA 2016, Aarhus, Denmark, 22 - 26 August, Proceedings, 2016. Bury/Schwiegelshohn/2015a Bury, Marc and Schwiegelshohn, Chris. Sublinear Estimation of Weighted Matchings in Dynamic Data Streams. In Nikhil Bansal and Irene Finocchi (editors), Algorithms - ESA 2015 - 23th Annual European Symposium, Patras, Greece, September 14-16, 2015. Proceedings, Springer, 2015. Schwiegelshohn/Sohler/2014a Chris Schwiegelshohn and Christian Sohler. Logistic Regression for Datastreams. No. 1, TU Dortmund, 2014. Siedhoff/etal/2014b Siedhoff, Dominic and Fichtenberger, Hendrik and Libuschewski, Pascal and Weichert, Frank and Sohler, Christian and Müller, Heinrich. Signal/Background Classification of Time Series for Biological Virus Detection. In Xiaoyi Jiang, Joachim Hornegger, Reinhard Koch (editors), Pattern Recognition - 36th German Conference, GCPR 2014, Münster, Germany, September 2-5, 2014. Proceedings, Springer, 2014. Czumaj/etal/2013a Artur Czumaj and Christiane Lammersen and Morteza Monemizadeh and Christian Sohler. $(1+\varepsilon)$-Approximation for Facility Location in Data Streams. In Sanjeev Khanna (editors), Proceedings of the 24th Symposium on Discrete Algorithms (SODA), pages 1710-1728, SIAM, 2013. Feldman/etal/2013a Dan Feldman and Melanie Schmidt and Christian Sohler. Turning Big Data into Tiny Data: Constant-size Coresets for k-means, PCA and Projective Clustering. In Sanjeev Khanna (editors), Proceedings of the 24th Symposium on Discrete Algorithms (SODA), pages 1434-1453, SIAM, 2013. Fichtenberger/etal/2013a Fichtenberger, Hendrik and Gillé, Marc and Schmidt, Melanie and Schwiegelshohn, Chris and Sohler, Christian. BICO: Birch meets Coresets for k-means. In Hans L. Bodlaender and Giuseppe F. Italiano (editors), Proceedings of the 21st European Symposium on Algorithms (ESA), Springer, 2013. Caragiannis/etal/2012a Ioannis Caragiannis and Angelo Fanelli and Nick Gravin and Alexander Skopalik. Approximate pure Nash equilibria in weighted congestion games: existence, efficient computation, and structure. In Branislav Rovan and Vladimiro Sassone and Peter Widmayer (editors), Proceedings of the 13th ACM Conference on Electronic Commerce, pages 284--301, New York, NY, USA, ACM, 2012. Fanelli/etal/2012a Angelo Fanelli and Luca Moscardelli and Alexander Skopalik. On the Impact of Fair Best Response Dynamics. In Boi Faltings and Kevin Leyton-Brown and Panos Ipeirotis (editors), Proceedings of the 37th International Symposium on Mathematical Foundations of Computer Science, Springer, 2012. Gieseke/etal/2012a Fabian Gieseke and Gabriel Moruz and Jan Vahrenhold. Resilient K-d Trees: K-Means in Space Revisited. In Frontiers of Computer Science, Vol. 6, No. 2, pages 166-178, 2012. Gudmundsson/etal/2012a Joachim Gudmundsson and Andreas Thom and Jan Vahrenhold. Of Motifs and Goals: Mining Trajectory Data. In Egemen Tanin and Peer Kröger and Peter Widmayer (editors), Proceedings of the 20th International Conference on Advances in Geographic Information Systems (SIGSPATIAL), pages 129-138, ACM, 2012. Hoefer/Skopalik/2012a Hoefer, Martin and Skopalik, Alexander. Social Context in Potential Games. In Internet and Network Economics - 8th International Workshop, WINE 2012, Liverpool, UK, December 10-12, 2012. Proceedings, pages 364-377, 2012. Scheffer/Vahrenhold/2012a Christian Scheffer and Jan Vahrenhold. Approximating Geodesic Distances on 2-Manifolds in $R^3$. In Computational Geometry: Theory & Applications, 2012. Scheffer/Vahrenhold/2012b Christian Scheffer and Jan Vahrenhold. Simplified Medial-Axis Approximation with Guarantees. In Walter Didimo and Guiseppe Liotta (editors), Proceedings of the 28th European Workshop on Computational Geometry, pages 161--164, 2012. Temme/Vahrenhold/2012a Sylvie Temme and Jan Vahrenhold. Revisiting the Construction of SSPDs in the Presence of Memory Hierarchies. In Walter Didimo and Guiseppe Liotta (editors), Proceedings of the 28th European Workshop on Computational Geometry, pages 57--60, 2012. Vahrenhold/2012a Jan Vahrenhold. On the Space-Efficiency of the Ultimate Planar Convex Hull Algorithm''. In Greg Aloupis and David Bremner (editors), Proceedings of the 24th Canadian Conference on Computational Geometry (CCCG), pages 139--144, 2012. Scheffer/Vahrenhold/2011a Christian Scheffer and Jan Vahrenhold. Approximating Geodesic Distances on 2-Manifolds in $R^3$. In Greg Aloupis and David Bremner (editors), Proceedings of the 23rd Canadian Conference on Computational Geometry (CCCG 2011), pages 325--330, 2011. Scheffer/Vahrenhold/2011b Christian Scheffer and Jan Vahrenhold. Learning a 2-Manifold with a Boundary in $R^3$. In Michael Hoffmann (editors), Proceedings of the 27th European Workshop on Computational Geometry, pages 213--216, 2011.

## Disserations:

 Schwiegelshohn/2017a Schwiegelshohn, Chris. On Algorithms for Large-Scale Graph and Clustering Problems. Faculty of Computer Science, TU Dortmund University, 2017.

## Final Theses:

 Gahn/2016a Benedikt Gahn. Energieeffiziente Implementierungen von Algorithmen zur Bestimmung von minimalen Spannbäumen. Faculty of Computer Science, TU Dortmund University, 2016. Koehler/2016a Köhler, Oliver. Energieeffiziente Algorithmen zur Lösung des Frequent Item Problems''. TU Dortmund University, 2016. Schigulski/2016a Daniel Schigulski. Energieeffiziente Datenanalyse. TU Dortmund University, 2016. Bohr/2015a Christian Bohr. Sublineare Approximation von Eigenschaften geometrischer Punktmengen mittels Bereichsanfragen. Faculty of Computer Science, TU Dortmund University, 2015. Goertz/2015a Michael Dominik Görtz. Energy Efficient Database Join Algorithms. TU Dortmund University, 2015. Mezlaf/2015a David Mezlaf. Online Minimum Makespan Scheduling With Reordering On Uniformly Related Machines. Faculty of Computer Science, TU Dortmund University, 2015. Schwitalla/2015a Martin Schwitalla. Optimierung von Genomsequenzanalysen mit Hauptspeicherdatenbanksystemen. TU Dortmund, 2015. Boettinger/2014a Torsten Böttinger. Hauptkomponentenanalyse für k-means. Faculty of Computer Science, TU Dortmund University, 2014. Krause/2014a Philipp Krause. Energieeffiziente Software. TU Dortmund University, 2014. Rensen/2014a Fabian Rensen. k-Means basierte Klassifikation auf teilweise unklassifizierten Daten. Faculty of Computer Science, TU Dortmund University, 2014. Heinlein/2013a Tobias Heinlein. Algorithms for the j-subspace problem using MapReduce. Faculty of Computer Science, TU Dortmund University, 2013. Brueggen/2012a Georg von der Brüggen. Geometrische Datenanalyse in großen astronomischen Datenbanken unter Berücksichtigung von Ungenauigkeiten. Faculty of Computer Science, TU Dortmund University, 2012. Bulinski/2012a Bulinski, Michael. Parallelisierung von Clusteralgorithmen. Faculty of Computer Science, TU Dortmund University, 2012. Temme/2011a Sylvie Temme. Untersuchungen zur Ressourceneffizienz der Berechnung einer Semi-Separated Pair Decomposition. Faculty of Computer Science, TU Dortmund University, 2011.
• Gahn/2016a - Energieeffiziente Implementierungen von Algorithmen zur Bestimmung von minimalen Spannbäumen
• Koehler/2016a - Energieeffiziente Algorithmen zur Lösung des Frequent Item Problems''
• Schigulski/2016a - Energieeffiziente Datenanalyse
• Bohr/2015a - Sublineare Approximation von Eigenschaften geometrischer Punktmengen mittels Bereichsanfragen
• Goertz/2015a - Energy Efficient Database Join Algorithms
• Mezlaf/2015a - Online Minimum Makespan Scheduling With Reordering On Uniformly Related Machines
• Schwitalla/2015a - Optimierung von Genomsequenzanalysen mit Hauptspeicherdatenbanksystemen
• Boettinger/2014a - Hauptkomponentenanalyse für k-means
• Krause/2014a - Energieeffiziente Software
• Rensen/2014a - k-Means basierte Klassifikation auf teilweise unklassifizierten Daten
• Heinlein/2013a - Algorithms for the j-subspace problem using MapReduce
• Brueggen/2012a - Geometrische Datenanalyse in großen astronomischen Datenbanken unter Berücksichtigung von Ungenauigkeiten
• Bulinski/2012a - Parallelisierung von Clusteralgorithmen
• Temme/2011a - Untersuchungen zur Ressourceneffizienz der Berechnung einer Semi-Separated Pair Decomposition

## Preliminary Work:

 Blunck/Vahrenhold/2010a Henrik Blunck and Jan Vahrenhold. In-Place Algorithms for Computing (Layers of) Maxima. In Algorithmica, Vol. 57, No. 1, pages 1--21, 2010. Feldman/etal/2010a Dan Feldman and Morteza Monemizadeh and Christian Sohler and David Woodruff. Coresets and Sketches for High Dimensional Subspace Approximation Problems. In Proceedings 21st Annual ACM-SIAM Symposium on Discrete Algorithms, pages 630-649, 2010. Gieseke/etal/2010a Fabian Gieseke and Joachim Gudmundsson and Jan Vahrenhold. Pruning Spanners and Constructing Well-Separated Pair Decompositions in the Presence of Memory Hierarchies. In Journal of Discrete Algorithms, Vol. 8, No. 3, pages 259--272, 2010. Ackermann/etal/2008a Marcel R. Ackermann and Johannes Blömer and Christian Sohler. Clustering for metric and non-metric distance measures. In Shang-Hua Teng (editors), Proceedings of the 19th Annual ACM-SIAM Symposium on Discrete Algorithms, pages 799--808, 2008. Frahling/etal/2008a Gereon Frahling and Piotr Indyk and Christian Sohler. Sampling in Dynamic Data Streams and Applications. In International Journal of Computational Geometry and Applications (Special Issue with selected papers from the 21st ACM Symposium on Computational Geometry), Vol. 18, No. 1/2, pages 3 -- 28, 2008. Frahling/Sohler/2008a Gereon Frahling and Christian Sohler. A Fast k-Means Implementation Using Coresets. In International Journal on Computational Geometry and Applications, Vol. 18, No. 6, pages 605--625, 2008. Hazel/etal/2008a Thomas Hazel and Laura Toma and Jan Vahrenhold and Rajiv Wickremesinghe. TerraCost: A Versatile and Scalable Approach to Computing Least-Cost-Path Surfaces for Massive Grid-Based Terrains. In ACM Journal of Experimental Algorithmics, Vol. 12, 2008. Bose/etal/2007a Prosenjit Bose and Anil Maheshwari and Pat Morin and Jason Morrison and Michiel Smid and Jan Vahrenhold. Space-Efficient Geometric Divide-and-Conquer Algorithms. In Computational Geometry: Theory & Applications., Vol. 37, No. 3, pages 209--227, 2007. Blunck/Vahrenhold/2006a Henrik Blunck and Jan Vahrenhold. In-Place Randomized Slope Selection. In Proceedings of the 6th International Conference on Algorithms and Complexity, Vol. 3998, pages 31--40, Berlin, Springer, 2006. Gehweiler/etal/2006a Joachim Gehweiler and Christiane Lammersen and Christian Sohler. A distributed $O(1)$-approximation algorithm for the uniform facility location problem. In Proceedings of the 18th Annual ACM Symposium on Parallel Algorithms and Architectures, pages 237-243, 2006. Arge/etal/2002a Lars Allan Arge and Klaus Helmer Hinrichs and Jan Vahrenhold and Jeffrey Scott Vitter. Efficient Bulk Operations on Dynamic R-trees. In Algorithmica, Vol. 33, No. 1, pages 104--128, 2002.