Trustworthy-Software/GeoTwins

GeoTwins: Uncovering Hidden Geographic Disparities in Android Apps

In this repository, we host all the data and code related to our paper titled "GeoTwins: Uncovering Hidden Geographic Disparities in Android Apps".

📜 Abstract

While mobile app evolution over time has been extensively studied, geographical variation in app behavior remains largely unexplored. This paper presents the first large-scale study of location-based Android app differentiation, revealing two critical and previously unknown phenomena with significant security and policy implications. First, we introduce the concept of GeoTwins: apps that are functionally similar and share branding, yet are released under different package names (i.e., as two separate apps) in different countries. Despite their apparent similarity, GeoTwins often diverge in critical aspects such as requested permissions, third-party libraries, and privacy disclosures. For example, we found that the Japanese version of the game Unison League requests the ACCESS_FINE_LOCATION permission, whereas its international counterpart does not, even though both share the same branding and user interface. Second, we investigate the Android App Bundle ecosystem and uncover unexpected regional differences in the supposedly consistent base.apk files, which are generally assumed to be invariant. Contrary to expectations, our analysis shows that even base.apk files can vary by region, revealing hidden customizations that may affect app behavior or security. The discrepancies observed through these two phenomena raise concerns about potential digital inequality, where users in different regions experience varying levels of access to features, privacy protection, or security standards. To support our study, we developed a distributed app collection pipeline spanning multiple regions and analyzed thousands of apps. We also release our dataset of 81 963 GeoTwins to facilitate further research. 
Our findings reveal systemic regional disparities in mobile software, with critical implications for app developers (who must consider regional compliance and user expectations), platform architects (who must address distribution inconsistencies), and policymakers (focused on digital fairness and global user protection).

🗂️ Repository Organization

The repository is structured into the following main directories:

  • 📁 0_Data/
    Contains all the datasets related to our experiments. Due to storage constraints, some large files (e.g., APKs, full privacy policy contents) are not included.

Note (Contributions): This folder includes all key data contributions presented in the paper, such as the GeoTwins list, GeoFamilies list, and the AndroZoo icons file, among others.

  • 📂 1_Code/
    Includes all the code used in the experiments, primarily in the form of Jupyter Notebooks to ease reproducibility and exploration.
    Organized by research question (RQ), each subfolder contains one or more notebooks.

    • 🔧 Utility Scripts:
      • AppUtils.py: Feature extraction from APK files
      • PairwiseAnalysisUtils.py: Pairwise comparisons and similarity score computations

      Note (Contributions): These two files contain all the code necessary to compare any pair of APKs and represent an additional contribution of this work.

⚠️ Note: Some data files (e.g., raw APKs) are not included due to size restrictions. However, the code is fully provided to reproduce the feature extraction and analysis processes.

📋 Requirements

🐍 Conda Environment

Running the Jupyter notebooks requires several Python libraries. We provide a requirements.txt file that you can install into a dedicated conda environment.

Follow the steps below:

  1. Create a conda environment named demoEnv:

    conda create --name demoEnv python=3.8
  2. Activate the newly created environment:

    conda activate demoEnv
  3. Install the required packages using pip and requirements.txt:

    pip install -r requirements.txt

Once these steps are complete, your environment will be set up with all the necessary libraries.
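As an optional sanity check after the steps above, the following sketch lists the packages from requirements.txt that are not installed in the active environment. It is not part of the repository: the helper names `requirement_name` and `missing_packages` are hypothetical, and the parsing only handles plain `name==version` style lines (URLs and pip options are skipped or may be misread).

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_name(line):
    """Return the distribution name from one requirements.txt line, or None."""
    line = line.split("#", 1)[0].strip()  # drop trailing comments
    if not line or line.startswith("-"):  # skip blanks and pip options such as -r / -e
        return None
    # Cut at the first version specifier, extras bracket, marker, or whitespace.
    return re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0] or None


def missing_packages(requirement_lines):
    """Return the names of required distributions that are not installed."""
    missing = []
    for line in requirement_lines:
        name = requirement_name(line)
        if name is None:
            continue
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")
    if req_file.is_file():
        lines = req_file.read_text(encoding="utf-8").splitlines()
        print("Missing packages:", missing_packages(lines) or "none")
    else:
        print("requirements.txt not found in the current directory")
```

Run it from the repository root with the demoEnv environment activated. It only checks that each package is present, not that the installed version matches the pinned one.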

🔧 ApkTool

To decompile APKs, ApkTool must be installed on your system. Follow the steps below to set it up:

  1. Download ApkTool:
    Visit the official ApkTool page at https://ibotpeaches.github.io/Apktool/ and download the latest version.

  2. Install ApkTool:
    Follow the installation instructions for your operating system, which typically involve:

    • Placing the downloaded JAR file in a suitable directory.
    • Adding the ApkTool executable to your system's PATH for easier access.
  3. Verify Installation:
    Ensure ApkTool is installed correctly by running the following command in your terminal:

    apktool

    This should display the ApkTool usage instructions if the installation was successful.
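The same check can be done from Python, for example at the top of a notebook. This is a minimal sketch and not code from the repository: it assumes the launcher is named `apktool` and that it accepts a `--version` flag. The function names are hypothetical.

```python
import shutil
import subprocess


def find_apktool(executable="apktool"):
    """Return the absolute path of the ApkTool launcher, or None if it is not on PATH."""
    return shutil.which(executable)


def apktool_version(executable="apktool"):
    """Return the version string printed by ApkTool, or None if it cannot be obtained."""
    path = find_apktool(executable)
    if path is None:
        return None
    try:
        # Assumption: the launcher supports "--version" and exits promptly.
        result = subprocess.run(
            [path, "--version"],
            capture_output=True,
            text=True,
            check=False,
            timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    return result.stdout.strip() or None


if __name__ == "__main__":
    if find_apktool() is None:
        print("apktool was not found on PATH")
    else:
        print("apktool version:", apktool_version())
```

If `find_apktool()` returns None, revisit step 2 and confirm that the directory containing the launcher is on your PATH.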

📌 Environment File (.env)

Some settings should be specified in an environment file named .env, which should be placed in the main folder of this repository.

🔑 AndroZoo API Key

To analyze APKs, you need access to AndroZoo.

ANDROZOO_API_KEY: This key is required to download apps from the AndroZoo repository. APK files are handled on the fly: each app is downloaded, extracted, and deleted as needed. The key can be requested here: https://androzoo.uni.lu/access
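To illustrate what an on-the-fly download could look like, here is a hedged sketch. It is not the code in AppUtils.py: the endpoint and the `apikey` and `sha256` query parameters are assumed from AndroZoo's public API documentation, and the function names are hypothetical.

```python
import re
import shutil
import urllib.parse
import urllib.request
from pathlib import Path

# Assumption: AndroZoo's download endpoint as described in its public documentation.
ANDROZOO_DOWNLOAD_URL = "https://androzoo.uni.lu/api/download"


def build_download_url(api_key, sha256):
    """Build the download URL for one APK identified by its SHA-256 hash."""
    if not re.fullmatch(r"[0-9A-Fa-f]{64}", sha256):
        raise ValueError("sha256 must be a 64-character hexadecimal string")
    query = urllib.parse.urlencode({"apikey": api_key, "sha256": sha256})
    return f"{ANDROZOO_DOWNLOAD_URL}?{query}"


def download_apk(api_key, sha256, dest_dir="."):
    """Download one APK to dest_dir and return the path of the saved file."""
    url = build_download_url(api_key, sha256)
    dest = Path(dest_dir) / f"{sha256}.apk"
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest
```

A caller would typically delete the returned file once the features of interest have been extracted, which keeps disk usage low when processing many apps.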

Example .env entry for the AndroZoo API Key:

ANDROZOO_API_KEY = [YOUR_ANDROZOO_API_KEY]

📚 Third-Party Libraries Path

During APK analysis, the tool compares against a list of known third-party libraries. You must provide the path to this file in your .env.

THIRD_PARTY_LIBS_PATH: This specifies the path to the list of known third-party libraries used during the analysis of the APKs. The list can be obtained from: https://github.com/JordanSamhi/AndroLibZoo

Example .env entry for the third-party libraries path:

THIRD_PARTY_LIBS_PATH = [YOUR_PATH_TO_THE_THIRD_PARTY_LIBS_FILE]
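
Before running the notebooks, it can help to confirm that both entries are present and usable. The sketch below parses `KEY = VALUE` lines with the standard library only. How the repository itself loads the .env file is not shown here, so treat `load_env_file` and `check_settings` as hypothetical helpers.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY = VALUE lines from an environment file into a dictionary."""
    settings = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def check_settings(settings, required=("ANDROZOO_API_KEY", "THIRD_PARTY_LIBS_PATH")):
    """Return a list of human-readable problems; an empty list means all is well."""
    problems = [f"{name} is missing or empty" for name in required if not settings.get(name)]
    libs_path = settings.get("THIRD_PARTY_LIBS_PATH")
    if libs_path and not os.path.isfile(libs_path):
        problems.append(f"THIRD_PARTY_LIBS_PATH does not point to a file: {libs_path}")
    return problems


if __name__ == "__main__":
    if Path(".env").is_file():
        for problem in check_settings(load_env_file(".env")) or ["No problems found"]:
            print(problem)
    else:
        print(".env not found in the current directory")
```

The parser accepts spaces around the equals sign, matching the example entries above.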

⚙️ Usage

The analysis can be carried out by running the Jupyter Notebooks provided in the 1_Code/ directory. Each subfolder corresponds to a specific research question (RQ) and contains one or more notebooks related to that topic.

Feature extraction and pairwise comparisons are handled by two utility scripts:

  • AppUtils.py: Extracts features from APK files
  • PairwiseAnalysisUtils.py: Computes similarity scores between app versions

The notebooks rely on these two utilities and can be executed in sequence to reproduce the analysis. Some large datasets (such as APK files or raw privacy policy contents) are not included due to storage limitations, but the code can regenerate the necessary data if you have access to the underlying sources (for example, AndroZoo for APKs).
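
To give a feel for what a pairwise comparison over extracted features can look like, here is a small sketch based on Jaccard similarity of permission sets. This is not the scoring implemented in PairwiseAnalysisUtils.py, whose actual metrics may differ. The function names are hypothetical, and the two permission sets are illustrative values, not measured data.

```python
def jaccard_similarity(first, second):
    """Return |A ∩ B| / |A ∪ B| for two feature collections (1.0 if both are empty)."""
    a, b = set(first), set(second)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def diff_features(first, second):
    """Split two feature collections into shared items and items unique to each side."""
    a, b = set(first), set(second)
    return {
        "only_in_first": sorted(a - b),
        "only_in_second": sorted(b - a),
        "shared": sorted(a & b),
    }


if __name__ == "__main__":
    # Illustrative permission sets for two regional variants of the same app.
    regional = {
        "android.permission.INTERNET",
        "android.permission.ACCESS_NETWORK_STATE",
        "android.permission.ACCESS_FINE_LOCATION",
    }
    international = {
        "android.permission.INTERNET",
        "android.permission.ACCESS_NETWORK_STATE",
    }
    print("similarity:", round(jaccard_similarity(regional, international), 3))
    print("difference:", diff_features(regional, international))
```

The same two functions apply unchanged to other set-valued features, such as detected third-party libraries.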

📌 Make sure the .env file is correctly configured with your AndroZoo API key and the path to the third-party libraries list.
