Event Date: January 14, 2021 16:15
Text indexing for large amounts of data
Abstract - Large amounts of text are produced in bioinformatics, web crawling, and text mining, to name just a few examples. These texts must be indexed so that they can be processed efficiently. Classical text indexes are typically designed for sequential processors and main memory, and thus quickly reach their limits in real-world problems. In this talk, I will show some recent results on index construction for data sizes that exceed the available main memory and for settings in which the parallelism of modern systems is to be exploited. The concrete models considered are multi-core CPUs and the PRAM model, distributed systems with message passing, and the external memory model. Applications in text compression are also discussed.
Short-Bio - Johannes Fischer has been Professor of Algorithm Engineering for Computer Science at TU Dortmund University since October 2013. After receiving his computer science degree from the University of Freiburg in 2003, he worked as a doctoral student at LMU Munich, where he received his PhD in 2007 for a dissertation in algorithmic bioinformatics. He then worked as a postdoctoral researcher at the University of Chile, the University of Tübingen, and KIT. His current research is at the intersection of theory and algorithm engineering and is mainly concerned with space-efficient data structures, text indexing and compression, and parallel algorithms on large data sets.