SFB 876 - News

Data Mining Competition: Mining your smartphone

Schedule: Wednesday, 05.09., 18.00-19.00 and Friday, 11.00-12.30

Organizer: Olaf Spinczyk

Mobile phone data has everything that makes resource-aware machine learning a challenge: huge amounts of data, captured in a distributed environment, from a wide range of sensors. Modern smartphones and open operating systems provide access to rich personal data. As most people carry their phone with them all day, phone usage provides a comprehensive picture of their daily life. Thanks to widely available location information, from GPS to cell-based position estimates, the when and what of phone usage can now be combined with the where.

Join our experts in analyzing real-world mobile phone data as part of this competition and learn which insights the investigation of big phone data may deliver in the future. The data will be provided to all participants for personal research after the summer school.

As part of the Collaborative Research Center SFB 876, a distributed data acquisition system has been developed that collects various usage data on Android-based smartphones without disturbing their regular operation. The resulting data sets contain environmental data such as available WiFi access points or cellular signal strength, user-defined settings such as screen brightness or phone call settings, and system software internals such as running processes or file operations. The data will be carefully anonymized before being given to the challenge participants. The participants will analyze the collected data with respect to energy savings and operating system optimization.

Small groups of attendees may work together on exciting, novel data mining tasks such as Resource Aware Feature Selection and Classification. The winning group will receive its award during the closing session on Friday.

Gathering Android smartphone data made easy

Retrieving a detailed view of phone data and usage is no longer hard. Modern smartphones, e.g. those running the Android operating system, provide API access to all sensors inside the phone. For demonstration purposes, we provide a small script for free download under the MIT license. The script already captures much of the available data, such as battery status, network connection, WiFi and acceleration sensors.

The script stores all captured data locally on your phone, so nothing is transferred to us. Please be aware that the script itself as well as the scripting engine for Android are still in alpha stage.

  1. Get SL4A and PythonForAndroid from http://code.google.com/p/android-scripting/
    Direct links to the respective Android packages:
    SL4A: http://android-scripting.googlecode.com/files/sl4a_r5.apk
    P4A: http://android-scripting.googlecode.com/files/PythonForAndroid_r4.apk
  2. In some cases, you may have to manually delete the file /sdcard/sl4a/scripts and create a directory with that path.
  3. Run the "Python for Android" app on your device and press "Install" (you will need Internet access for that).
  4. Download our Python script standalone_data.py and save it in /sdcard/sl4a/scripts
  5. Start the SL4A app, tap standalone_data.py, and tap one of the two icons on the left (leftmost means execution in a terminal).
  6. The script will now collect data, writing the results into /sdcard/sl4a.

The data set produced by the script will contain features similar to those used in the competition. Hence, you can become familiar with the data even before the competition starts!
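To give an impression of what such API access looks like, here is a minimal sketch of reading two values through SL4A's Python bindings and appending them to a local file. It is not the standalone_data.py script itself: the function names, the JSON-lines output format and the file name snapshot_demo.jsonl are assumptions made for illustration only. The calls batteryStartMonitoring, batteryGetLevel, batteryStopMonitoring and wifiGetConnectionInfo are part of the SL4A API, and the android module is only available when the code runs inside SL4A on the phone.

```python
import json
import time


def collect_snapshot(droid):
    """Read a few values from an SL4A facade object and return them as a dict.

    Each SL4A call returns a result object whose payload is in `.result`.
    """
    return {
        "timestamp": time.time(),
        "battery_level": droid.batteryGetLevel().result,
        "wifi": droid.wifiGetConnectionInfo().result,
    }


def append_snapshot(path, snapshot):
    """Append one snapshot as a single JSON line to a local file."""
    with open(path, "a") as handle:
        handle.write(json.dumps(snapshot) + "\n")


def run_on_device(path="/sdcard/sl4a/snapshot_demo.jsonl"):
    """Take one snapshot on the phone; `path` is a hypothetical output file."""
    import android  # provided by SL4A, not available on a desktop Python

    droid = android.Android()
    droid.batteryStartMonitoring()
    time.sleep(1)  # give the battery monitor time to deliver a first reading
    try:
        append_snapshot(path, collect_snapshot(droid))
    finally:
        droid.batteryStopMonitoring()
```

On the phone, calling run_on_device() from a script started in SL4A would write one line per call. Like the demonstration script, this sketch keeps everything on the device and sends nothing over the network.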

Where will you go: Next place prediction

Participants in the competition will have to solve a resource-constrained location prediction task. Predicting the location of phone users is highly relevant for context-aware and mobile recommendation systems and may help to optimize evacuation planning, local traffic or phone networks. We consider the task of predicting the destination of a user given some features of their current context, subject to some resource constraints.
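As a rough illustration of the task, the following sketch implements a very simple baseline: it counts how often each place follows another in a user's visit history and predicts the most frequent successor of the current place. This is not the competition's reference method. The place labels, the function names and the choice of a first-order transition count are assumptions made for the example. It uses only a single context feature (the current place), which keeps memory and computation low.

```python
from collections import Counter, defaultdict


def fit_transitions(visits):
    """Count how often each place is directly followed by each other place."""
    counts = defaultdict(Counter)
    for current, following in zip(visits, visits[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, current, default=None):
    """Return the most frequent successor of `current`, or `default` if unseen."""
    followers = counts.get(current)
    if not followers:
        return default
    return followers.most_common(1)[0][0]


# Hypothetical visit history of one user, in chronological order.
history = ["home", "work", "cafe", "work", "home", "work", "home"]
model = fit_transitions(history)

print(predict_next(model, "work"))  # "home" follows "work" twice, "cafe" once
print(predict_next(model, "gym"))   # never seen, so the default (None) is returned
```

A real solution would add further context features such as time of day or visible WiFi access points, and would have to weigh each feature's predictive value against the cost of acquiring it on the phone.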

We will prepare a reference process for the data analysis software RapidMiner (presented at the summer school’s Tuesday session) that allows all participants to benefit from a rich toolbox of data mining and machine learning algorithms for solving the task.