Event Date: August 27, 2015 16:15
In Learning with Label Proportions (LLP), the objective is to learn a supervised classifier when, instead of labels, only label proportions for bags of observations are known. This setting has broad practical relevance, in particular for privacy-preserving data processing. We first show that the mean operator, a statistic that aggregates all labels, is sufficient for the minimization of many proper losses with linear classifiers without using labels. We provide a fast learning algorithm that estimates the mean operator via a manifold regularizer, with guaranteed approximation bounds. Experiments show that our algorithms outperform the state of the art in LLP and, in many cases, compete with the Oracle, which learns knowing all labels. In more recent work, we show that the mean operator trick can be generalized, so that it is possible to learn without knowing individual feature vectors either. We leverage this surprising result to design learning algorithms that need no individual examples, only their aggregates, for training, and for which many privacy guarantees can be proven.
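To make the idea concrete, below is a minimal, hypothetical Python (NumPy/SciPy) sketch of the mean-operator approach to LLP. It estimates the mean operator from bag means and label proportions with a plain least-squares step (the speaker's algorithm instead uses a manifold regularizer for this estimate, with approximation guarantees), and then minimizes the logistic loss rewritten so that labels enter only through that single statistic. All function names and the bag layout are illustrative assumptions, not the speaker's exact implementation.

import numpy as np
from scipy.optimize import minimize

def estimate_mean_operator(bags, proportions):
    """Estimate mu = (1/m) * sum_i y_i x_i from bag means and label proportions.

    bags        : list of (m_j, d) arrays of feature vectors (no labels)
    proportions : list of fractions of positive examples in each bag
    Illustrative simplification: plain least squares instead of the talk's
    manifold-regularized estimator.
    """
    bag_means = np.array([b.mean(axis=0) for b in bags])        # (n_bags, d)
    bag_sizes = np.array([len(b) for b in bags], dtype=float)
    pis = np.asarray(proportions, dtype=float)

    # Mean-map style step: bag mean ~= pi * mu_plus + (1 - pi) * mu_minus,
    # assuming class-conditional means are shared across bags.
    Pi = np.column_stack([pis, 1.0 - pis])                      # (n_bags, 2)
    M, *_ = np.linalg.lstsq(Pi, bag_means, rcond=None)          # rows: mu+, mu-
    mu_plus, mu_minus = M[0], M[1]

    m = bag_sizes.sum()
    p_plus = (bag_sizes * pis).sum() / m                        # overall class prior
    return p_plus * mu_plus - (1.0 - p_plus) * mu_minus         # mean operator

def fit_llp_logistic(bags, proportions, lam=1e-2):
    """Minimize the logistic risk written so labels enter only through mu."""
    X = np.vstack(bags)
    mu = estimate_mean_operator(bags, proportions)

    def risk(theta):
        z = X @ theta
        # log(1 + e^{-y z}) = 0.5*[log(1 + e^{-z}) + log(1 + e^{z})] - 0.5*y*z,
        # so the empirical risk depends on labels only through mu = mean(y_i x_i).
        label_free = 0.5 * (np.logaddexp(0, -z) + np.logaddexp(0, z)).mean()
        return label_free - 0.5 * theta @ mu + lam * theta @ theta

    theta0 = np.zeros(X.shape[1])
    return minimize(risk, theta0, method="L-BFGS-B").x

The identity in the comment is what makes the mean operator sufficient for the logistic loss: the label-dependent part of the risk collapses to a single inner product with mu, so a linear classifier can be trained from unlabeled examples plus that one aggregate.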
Bio: Giorgio Patrini is a PhD student in Machine Learning at the Australian National University/NICTA. His main research is on understanding how learning is possible when some variables are only known as aggregates; for example, how to learn individual-level models from census-like data. His research naturally touches on themes in the social sciences, econometrics and privacy. He co-founded and advises Waynaut, an online travel start-up based in Milan, Italy.