C3 Regression approaches for large-scale high-dimensional data
- Prof. Dr. Christian Sohler
- Prof. Dr. Katja Ickstadt
Modern regression approaches often reach the limits of their scalability when the number of observations and/or variables is large.
This makes them difficult to use in embedded systems. The goal of this project is therefore to develop
highly efficient regression methods. We pursue two directions: algorithms
that reduce the number of observations using, e.g., random linear projections and sampling (streaming algorithms),
and methods that reduce the dimensionality of the underlying, possibly Bayesian, model classes
by imposing structural constraints, e.g., monotonicity.
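To make the idea of reducing the number of observations with a random linear projection concrete, here is a minimal sketch. It is not the project's own method: it assumes a CountSketch-style sparse projection as a stand-in, and the names `countsketch`, `fit_line`, and `make_stream` as well as the toy model y = 2 + 3x are hypothetical choices made for illustration. Each observation is read once, multiplied by a random sign, and added to one of k buckets, so memory stays at k rows no matter how long the stream is. The regression is then solved on the k sketched rows instead of the full data.

```python
import random


def countsketch(rows, k, seed=0):
    """Compress a stream of rows (c, x, y) into k sketched rows.

    Assumption: a CountSketch-style projection. Every row gets a random
    bucket and a random sign, and is added to that bucket.
    """
    rng = random.Random(seed)
    sketch = None
    for row in rows:
        if sketch is None:
            sketch = [[0.0] * len(row) for _ in range(k)]
        bucket = rng.randrange(k)        # random bucket for this observation
        sign = rng.choice((-1.0, 1.0))   # random sign for this observation
        target = sketch[bucket]
        for j, value in enumerate(row):
            target[j] += sign * value
    return sketch


def fit_line(rows):
    """Least squares for y = b0*c + b1*x via the 2x2 normal equations.

    c is the intercept column: 1 in the raw data, a signed sum after
    sketching, so it is treated like any other column.
    """
    s_cc = s_cx = s_xx = s_cy = s_xy = 0.0
    for c, x, y in rows:
        s_cc += c * c
        s_cx += c * x
        s_xx += x * x
        s_cy += c * y
        s_xy += x * y
    det = s_cc * s_xx - s_cx * s_cx
    b0 = (s_xx * s_cy - s_cx * s_xy) / det
    b1 = (s_cc * s_xy - s_cx * s_cy) / det
    return b0, b1


def make_stream(n, noise, seed=1):
    """Hypothetical toy data: y = 2 + 3x plus Gaussian noise."""
    rng = random.Random(seed)
    for _ in range(n):
        x = rng.random()
        yield (1.0, x, 2.0 + 3.0 * x + rng.gauss(0.0, noise))


if __name__ == "__main__":
    full = fit_line(make_stream(20_000, noise=0.1))
    small = fit_line(countsketch(make_stream(20_000, noise=0.1), k=200))
    print("fit on all 20000 rows:   ", full)
    print("fit on 200 sketched rows:", small)
```

With noisy data the two fits generally differ slightly, and a larger k brings the sketched fit closer to the full one at the cost of more memory. Because the sketch is a linear map of the data, it can be updated row by row, which is what makes this kind of projection suitable for streaming settings.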
Publications
- [1] Bornkamp, B., A. Fritsch, O. Kuss and K. Ickstadt: Penalty specialists among goalkeepers: A nonparametric Bayesian analysis of 44 years of German Bundesliga. In: Schipp, B. and W. Krämer (eds.): Statistical Inference, Econometric Analysis and Matrix Algebra: Festschrift in Honour of Götz Trenkler, pp. 63-76. Physica Verlag, 2009.
- [2] Bornkamp, B. and K. Ickstadt: Bayesian nonparametric estimation of continuous monotone functions with applications to dose-response analysis. Biometrics, 65:198-205, 2009.
- [3] Bornkamp, B., K. Ickstadt and D. B. Dunson: Stochastically ordered multiple regression. Biostatistics, 2010.
- [4] Feldman, D., M. Monemizadeh and C. Sohler: A PTAS for k-means clustering based on weak coresets. In: Proceedings of the 23rd ACM Symposium on Computational Geometry, pp. 11-18, 2007.
- [5] Feldman, D., M. Monemizadeh, C. Sohler and D. Woodruff: Coresets and sketches for high-dimensional subspace approximation problems. In: Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 630-649, 2010.
- [6] Frahling, G., P. Indyk and C. Sohler: Sampling in dynamic data streams and applications. International Journal of Computational Geometry and Applications (Special Issue with selected papers from the 21st ACM Symposium on Computational Geometry), 18(1/2):3-28, 2008.
- [7] Frahling, G. and C. Sohler: Coresets in dynamic geometric data streams. In: Proceedings of the 37th Annual ACM Symposium on Theory of Computing, pp. 209-217, 2005.
- [8] Fritsch, A. and K. Ickstadt: Comparing logic regression based methods for identifying SNP interactions. In: Hochreiter, S. and R. Wagner (eds.): Bioinformatics in Research and Development. Springer, Berlin, 2007.
- [9] Ickstadt, K. and R. L. Wolpert: Spatial regression for marked point processes. In: Bernardo, J. M., J. O. Berger, A. P. Dawid and A. F. M. Smith (eds.): Bayesian Statistics 6, pp. 323-341. Oxford University Press, Oxford, 1999.
- [10] Nunkesser, R., T. Bernholt, H. Schwender, K. Ickstadt and I. Wegener: Detecting high-order interactions of single nucleotide polymorphisms using genetic programming. Bioinformatics, 23:3280-3288, 2007.
- [11] Schwender, H. and K. Ickstadt: Identification of SNP interactions using logic regression. Biostatistics, 9:187-198, 2008.
- [12] Wolpert, R. L. and K. Ickstadt: Poisson/Gamma random field models for spatial statistics. Biometrika, 85:251-267, 1998.