Rohit Babbar, Max-Planck Institute Tuebingen, OH14 E23

Event Date: January 26, 2017 16:15

Scalable Algorithms for Extreme Multi-class and Multi-label Classification

In the era of big data, large-scale classification involving tens of thousands of target categories is not uncommon. This setting is referred to as Extreme Classification, and it has recently been shown that the machine learning challenges arising in recommendation systems and web advertising can be effectively addressed by reducing them to extreme multi-label classification. In this talk, I will discuss two of my recent works, accepted at SDM 2016 and WSDM 2017, and present the TerseSVM and DiSMEC algorithms for extreme multi-class and multi-label classification. The training process for these algorithms makes use of OpenMP-based distributed architectures, thereby using thousands of cores for computation, and trains models in a few hours that would otherwise take several weeks. The precision@k and nDCG@k results using DiSMEC improve by up to 10% on benchmark datasets over state-of-the-art methods such as SLEEC and FastXML, which are used by Microsoft in Bing Search. Furthermore, the model size is up to three orders of magnitude smaller than that obtained by off-the-shelf solvers.
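For readers unfamiliar with the two ranking metrics named in the abstract, the following is a minimal sketch of how precision@k and nDCG@k are commonly computed for a single test instance with binary label relevance. It is not taken from the talk or from the TerseSVM/DiSMEC code; the function names, the toy scores, and the choice of a base-2 logarithm are assumptions made for illustration.

```python
import math


def top_k_labels(scores, k):
    """Return the indices of the k highest-scoring labels, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


def precision_at_k(scores, relevant, k):
    """Fraction of the top-k predicted labels that are truly relevant."""
    hits = sum(1 for i in top_k_labels(scores, k) if i in relevant)
    return hits / k


def ndcg_at_k(scores, relevant, k):
    """DCG of the top-k predictions divided by the best achievable DCG.

    Assumes binary relevance and a log2 position discount (illustrative choice).
    """
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so position 1 -> log2(2)
        for rank, i in enumerate(top_k_labels(scores, k))
        if i in relevant
    )
    ideal_hits = min(k, len(relevant))
    if ideal_hits == 0:
        return 0.0
    ideal_dcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / ideal_dcg


# Hypothetical example: 5 labels, of which labels 0 and 4 are relevant.
scores = [0.9, 0.1, 0.8, 0.3, 0.7]
relevant = {0, 4}
p3 = precision_at_k(scores, relevant, 3)
n3 = ndcg_at_k(scores, relevant, 3)
```

In benchmark evaluations these per-instance values are typically averaged over all test instances; the sketch leaves that averaging out.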

Bio
Rohit Babbar has been a post-doc in the Empirical Inference group at Max-Planck Institute Tuebingen since October 2014. His work has primarily focused on large-scale machine learning and big data problems. His research interests also include optimization and deep learning. Before that, he completed his PhD at the University of Grenoble in 2014.
