MEWS: Mobile, Embedded, and Wireless Security Group

14-829 / 18-638: Mobile and IoT Security - Fall 2019

Assignment #1 - Permissions, User Data, Analytics, & Privacy

Due: Sep 26, 2019

Description: The goal of this assignment is to gain intuition about how Android applications access and manage user data, how the Android permission system works (and doesn't work), how context measurement can leak sensitive user data or activity, and how developers can get away with data theft without user awareness. Through this assignment, students will go through a variety of tasks and background reading to gain historical context of Android security, understand how the OS has evolved to protect users in certain scenarios, act as a curious/malicious developer to bypass OS protections and extract sensitive user information, and build knowledge of the trade-offs between perceived risk and value of certain types of mobile data. This is a rather large and time-consuming assignment, so students are strongly recommended to start early and devote time over several weeks.

Tasks:

Brush up on your Android security background - As a starting point for the assignment, make sure you are familiar with the Android permission system and development landscape. If needed, read through the Android documentation to learn how Android apps and permissions work, paying specific attention to differences between Android OS versions with different security models. For additional historical context and intuition, download and read these papers [1, 2, 3].
Design a useful app that secretly benefits the developer - Build an app that collects relevant data about user context, activity, and behavior from the device without being detected or suspected by the user. Collect as much information as possible that will help you learn the location and behavior of the user as well as other sensitive data, subject to the constraints below. Your app should appear to have some valuable functionality (e.g., a utility or game of some sort), but behind this functionality, it should report all of the secretly collected information back to you (for example, once per minute), again without being suspected by the user.
Constraints and Hints:
- The app must perform some useful user-facing task that is not related to stealing the protected data or learning about user behavior.
- Your app should be interesting enough that people would use it, but it doesn't need to be novel (if you mimic an existing app or use an open-source app, be sure to provide proper attribution).
- The user must be unaware of the app's hidden activities, regardless of what it does. You can assume that the user does not check the task manager, logs, or network traffic.
- Your app can request/use any permissions other than location permissions (either as manifest or equivalent runtime permissions), as long as they are needed for the functionality that is apparent to the user, not only for the secret functionality. You will be expected to explain and justify the need for any permissions used by your app.
- The app must exfiltrate data periodically to an off-device destination without user interaction and regardless of app or screen state.
- The app must work on all Android versions between 4.0 and 9 (inclusive) and NOT require root access. However, it may collect slightly different information under different OS versions.
- You can assume there is an easy way to get your app onto the user's device, so don't worry about that problem.
- Minimal UI is sufficient (this is a security assignment, not a dev assignment), as long as functionality and outcomes are achieved.
Feel free to make other assumptions, as long as they don't violate the constraints or directly go against the narrative of this assignment, but you must fully explain and justify any such assumptions.
Track and map the user's location - Using the data exfiltrated from the user's device, create a data analysis and visualization tool to map the target user's location history. If needed, you can use Wi-Fi location databases such as WiGLE.net or do a manual survey in areas you know the user frequents.
Constraints and Hints:
- Estimated location of the user should be plotted to a visual map (not on the user's device, but yours) in an understandable way.
- Your location estimates don't need to be highly accurate, but they should be meaningful (e.g., accurate to within tens of meters).
- You are allowed to collaborate with other students on the manual survey work, if needed, but no other tasks.
- Be sure to create a mechanism for collecting ground truth data (outside the scope of the attack app and corresponding constraints), otherwise you won't know how accurate your results are.
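
As one illustration of the ground-truth comparison described above, the following Python sketch computes the distance between each estimated position and the ground-truth position recorded at the same time. It assumes, purely for illustration, that both data sets are dictionaries mapping a timestamp to a (latitude, longitude) pair; the function names `haversine_m` and `location_errors` are hypothetical and not part of the assignment.

```python
import math
from statistics import median

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def location_errors(estimates, ground_truth):
    """Return the error in meters for every timestamp present in both inputs.

    Both arguments are assumed to be dicts of timestamp -> (lat, lon).
    Timestamps that appear in only one of the two dicts are ignored.
    """
    shared = sorted(set(estimates) & set(ground_truth))
    return [haversine_m(*estimates[t], *ground_truth[t]) for t in shared]


def summarize(errors):
    """Return (median, maximum) error in meters, or None if there is no data."""
    if not errors:
        return None
    return median(errors), max(errors)
```

A median error in the tens of meters would be consistent with the accuracy hint above. Matching on exact timestamps is a simplification; real traces would likely need nearest-in-time matching.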
Analyze data to expose additional sensitive user behaviors - Again using the exfiltrated data from the user's device, create another tool that extracts and presents some additional privacy-sensitive information about the user.
Constraints and Hints:
- The privacy-sensitive information that you learn should not be obvious, given the application and user context. For example, a fitness tracker should be able to know many physical details of the user, but it should not be able to learn the user's banking password.
- You should be able to easily convince anyone about the sensitivity of what you are learning (i.e., if you're unsure whether something is considered privacy-sensitive, it probably isn't sensitive enough).
- Be creative, but don't put too much effort into app development.
Study trade-offs between data quality and privacy risk - With many types of human activity data, the granularity or quality of the data is directly related to the apparent privacy risk (e.g., knowing your location at the centimeter level is more invasive than at the city level). In this task, you will intentionally reduce the quality/granularity of the data you collected for the location-mapping and behavior-analysis tasks above and see how it affects the apparent value to the attacker. For both the location and activity data, artificially downsample your data in a meaningful way and visually show the impact of the reduced data quality. Is the data still as sensitive as it was previously? Qualitatively, how has the privacy risk changed? Is there a point where the data loses value to the attacker? If so, would data at this granularity still be beneficial to the app's user-facing purpose?
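
Two simple ways to downsample location data are sketched below in Python: spatial coarsening, which snaps each point to the center of a grid cell of a chosen size, and temporal thinning, which keeps at most one sample per time interval. This is only one possible approach under stated assumptions. The function names, the (timestamp, latitude, longitude) sample layout, and the flat-Earth conversion of about 111,320 meters per degree of latitude are illustrative choices, not requirements of the assignment.

```python
import math

METERS_PER_DEG_LAT = 111_320.0  # rough approximation, adequate away from the poles


def coarsen_location(lat, lon, cell_m):
    """Snap (lat, lon) in degrees to the center of a grid cell ~cell_m meters wide."""
    dlat = cell_m / METERS_PER_DEG_LAT
    snapped_lat = math.floor(lat / dlat) * dlat + dlat / 2
    # Longitude degrees shrink with latitude, so size the cell at the snapped latitude.
    dlon = cell_m / (METERS_PER_DEG_LAT * math.cos(math.radians(snapped_lat)))
    snapped_lon = math.floor(lon / dlon) * dlon + dlon / 2
    return snapped_lat, snapped_lon


def thin_by_time(samples, interval_s):
    """Keep a sample only if at least interval_s seconds passed since the last kept one.

    samples is assumed to be a time-sorted list of (timestamp_s, lat, lon) tuples.
    """
    kept = []
    last_t = None
    for t, lat, lon in samples:
        if last_t is None or t - last_t >= interval_s:
            kept.append((t, lat, lon))
            last_t = t
    return kept
```

Re-running the mapping and analysis tools on the output for increasing values of `cell_m` and `interval_s` gives a series of progressively coarser views that can be compared visually.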

Deliverables: Each student will submit a written summary of their efforts and responses for the above tasks. The summary report should include:

A detailed description of the app and its functionality, with screenshots showing the app in action,
Code snippets to highlight important aspects of the required tasks,
Examples of the data being recorded, analysis being done, and impact of data quality/granularity,
Examples detailing how the attack works and the type of sensitive information exposed,
Details of how the design meets each of the given constraints,
Detailed step-by-step explanations of the relevant aspects of the app, analysis tools, and attack steps that demonstrate your understanding of what the code is doing and why it works the way it does.

The written summary should be formatted as a single-column document using font size 11 or greater, converted to a .pdf document for submission.

Submission Instructions: Each student should submit a .pdf version of their written summary via Canvas, using the format requested above. All students are expected to complete the assignment on their own; discussion about the assignment is allowed and encouraged, but all design, analysis, and writing tasks must be done individually -- other than the manual mapping task if needed.

Grading: This assignment is worth 40 points:

eight (8) points for a detailed description of the app including both the user-facing and hidden functionalities,
five (5) points for explaining how the app design satisfies the constraints,
three (3) points for discussing OS version issues and additional assumptions needed,
six (6) points for suitability and sufficiency of the sensitive information extracted in the attack,
five (5) points for appropriateness of attack analysis methods,
five (5) points for details of the mapping capability,
four (4) points for detailed description of granularity and security trade-offs, and
four (4) points for overall visualization of outcomes.

We reserve the right to take off points for presentation aspects, e.g., incorrect format, poor writing, etc.