Instagram Political Bias Research 🔍

Academic research investigating political bias in social media algorithms through automated data collection and machine learning analysis.

📄 Full Paper: https://research.juleslemee.com

What This Research Does

This study examines how Instagram's recommendation algorithm creates political filter bubbles by systematically collecting and analyzing the recommendations shown to different user personas. Part 2 reproduces a 2019 paper by Thomas & Kovashka that trained its own model to detect political bias in images.

Example of part 1: the scraper building a database of accounts with their political leaning

Part 1 methodology: automated user simulation and data collection

Example of part 2: the workflow to create the final classifier from the 2019 paper

Part 2 methodology: computer vision and algorithmic bias detection

Project Summary paper with findings and analysis

Why This Research Matters

The Problem: Social media algorithms create "echo chambers" that reinforce existing political beliefs, potentially polarizing society. Understanding how these systems work is crucial for digital literacy and informed civic participation.

Our Approach: By creating controlled experiments with bot accounts, we can systematically study algorithmic bias without relying on self-reported user data or proprietary company information.

Academic Impact: This research contributes to the growing field of algorithmic auditing and provides methodologies for studying recommendation systems across social platforms.

Research Components

Part 1: Automated Data Collection System

Simulated User Behavior: Bot accounts with different political personas
Systematic Interaction Tracking: Automated likes, follows, and engagement patterns
Recommendation Analysis: Monitoring how algorithm suggestions change over time
Political Content Classification: Building databases of politically-labeled accounts
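
As a rough illustration of the recommendation analysis step, the sketch below computes the share of each political leaning among the recommendations seen in one session. The account names, the `labels` dictionary, and the `leaning_share` helper are hypothetical and not taken from the project code; the repository's own scripts may store and analyze this data differently.

```python
from collections import Counter


def leaning_share(recommended, labels):
    """Return the fraction of labeled recommendations per political leaning.

    Accounts that are missing from the label database are ignored.
    """
    counts = Counter(labels[a] for a in recommended if a in labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {leaning: n / total for leaning, n in counts.items()}


# Hypothetical label database and two observation sessions for one persona.
labels = {"acct_a": "left", "acct_b": "right", "acct_c": "left", "acct_d": "right"}
sessions = [
    ["acct_a", "acct_b", "acct_x"],            # early session (acct_x is unlabeled)
    ["acct_a", "acct_c", "acct_b", "acct_c"],  # later session
]

for number, recommended in enumerate(sessions, start=1):
    print(number, leaning_share(recommended, labels))
```

Comparing these shares across sessions gives one simple way to see whether a persona's recommendations drift toward a single leaning over time.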

Part 2: Computer Vision Analysis

Bias Detection in Images: Adapting Thomas & Kovashka (2019) methodology
High-Performance Computing: Custom SLURM implementations for GPU cluster training
Feature Extraction: CLIP and Doc2Vec models for image analysis
Classification Pipeline: End-to-end bias detection system
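
To make the shape of the pipeline concrete, here is a minimal sketch that classifies precomputed feature vectors with a nearest-centroid rule. This is a stand-in chosen for brevity, not the model from Thomas & Kovashka (2019) or from this repository, and the two-dimensional toy vectors only take the place of real CLIP or Doc2Vec features. All function names are hypothetical.

```python
def centroid(vectors):
    """Average a list of equal-length feature vectors component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train(features, labels):
    """Build one centroid per class label."""
    grouped = {}
    for vector, label in zip(features, labels):
        grouped.setdefault(label, []).append(vector)
    return {label: centroid(vectors) for label, vectors in grouped.items()}


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(model, vector):
    """Return the label whose centroid is closest to the given vector."""
    return min(model, key=lambda label: squared_distance(model[label], vector))


# Toy stand-ins for extracted image features (assumption: real vectors are much larger).
features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
targets = ["left", "left", "right", "right"]

model = train(features, targets)
print(predict(model, [0.7, 0.3]))
```

A real run would replace the toy vectors with features extracted from the dataset and would likely use a stronger classifier; the train and predict split is the part that carries over.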

Key Features

✅ Automated Data Collection - Selenium-based Instagram scrapers
✅ Political Bias Detection - Computer vision analysis of images
✅ High-Performance Computing - SLURM GPU cluster implementation
✅ Academic Rigor - Peer-reviewed methodology and findings
✅ Reproducible Research - Complete codebase and documentation
✅ Real-World Impact - Insights into social media algorithmic bias

How to Reproduce This Research

Prerequisites

Python 3.7+ with virtual environment capability
Access to Instagram accounts (for Part 1) or a GPU cluster (for Part 2)
Git for cloning repositories

Part 1: Data Collection Setup

Clone and Configure

git clone https://github.com/juleslemee/instagram-algorithm-research.git
cd instagram-algorithm-research/research-part-1
pip install -r requirements.txt

Set Up Credentials

cp input/keys/credentials.example.py input/keys/credentials.py
cp input/keys/openai_key.example.py input/keys/openai_key.py
# Edit files with your actual credentials
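
Before starting a long collection run, it can help to confirm that both credential files were actually created. The sketch below only checks that the two paths from the commands above exist; the helper name `missing_credential_files` is hypothetical, and the check says nothing about whether the contents are valid.

```python
from pathlib import Path

# Paths taken from the copy commands above, relative to research-part-1.
REQUIRED = ("input/keys/credentials.py", "input/keys/openai_key.py")


def missing_credential_files(project_root="."):
    """Return the required credential files that do not exist yet."""
    root = Path(project_root)
    return [rel for rel in REQUIRED if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_credential_files()
    if missing:
        print("Missing credential files:", ", ".join(missing))
    else:
        print("All credential files are present.")
```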

Run Data Collection

python src/mainsim.py  # Automated user simulation
python src/maincol.py  # Political account collection

Part 2: Machine Learning Analysis

Prepare Environment

cd research-part-2  # run from the repository root
pip install -r requirements.txt

Download Original Dataset
- Visit the Thomas & Kovashka project page
- Download the 130 GB dataset into the project directory
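
Because the dataset is large, a quick free-space check before downloading can save a failed transfer. The sketch below uses `shutil.disk_usage` from the standard library; the 10% headroom and the reading of 130 GB as decimal gigabytes are assumptions, and `has_room_for_dataset` is a hypothetical helper.

```python
import shutil

# 130 GB as stated above; assumes decimal gigabytes (10**9 bytes each).
DATASET_BYTES = 130 * 10**9


def has_room_for_dataset(path=".", required=DATASET_BYTES, headroom=0.10):
    """Return True if the filesystem holding `path` has enough free space.

    `headroom` adds a safety margin on top of the raw dataset size.
    """
    free = shutil.disk_usage(path).free
    return free >= required * (1 + headroom)


if __name__ == "__main__":
    if has_room_for_dataset():
        print("Enough free space for the dataset.")
    else:
        print("Not enough free space; choose another directory or free some up.")
```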

Run Analysis (GPU Cluster)

module load slurm
sbatch slurm_all.sh  # For continuous training
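
When submissions are scripted, the job ID can be captured from the `sbatch` confirmation message, which has the form "Submitted batch job <id>". The sketch below wraps that in Python; the `submit` and `parse_job_id` helpers are hypothetical, and `submit` assumes `sbatch` is on the PATH after `module load slurm`.

```python
import re
import subprocess


def parse_job_id(sbatch_output):
    """Extract the numeric job ID from sbatch's confirmation message."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    return int(match.group(1)) if match else None


def submit(script="slurm_all.sh"):
    """Submit a batch script and return its job ID (None if it cannot be parsed)."""
    result = subprocess.run(
        ["sbatch", script],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if sbatch itself fails
    )
    return parse_job_id(result.stdout)
```

The returned ID can then be passed to `squeue -j <id>` to check on the job.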

Project Structure

instagram-algorithm-research/
├── research-part-1/          # Instagram data collection
│   ├── src/                  # Scraping and automation scripts
│   ├── input/                # Configuration and credentials
│   └── output/               # Collected data and screenshots
└── research-part-2/          # Machine learning analysis
    ├── SLURM outputs/        # GPU cluster training logs
    ├── Models/               # Trained classifiers
    └── Features/             # Extracted image features

Research Findings

Filter Bubble Formation: Documented rapid creation of political echo chambers
Algorithmic Amplification: Measured bias in content recommendation systems
Methodological Contributions: Novel approaches to social media algorithm auditing
Technical Challenges: Identified limitations in automated data collection

Academic Context

Institution: McGill University
Course: COMP 396 - Individual Research Project
Supervisor: Professor Joseph Vybihal
Collaborator: Rayane Bouafia

Ethical Considerations

All data collection followed academic ethical guidelines
Bot accounts were clearly identified in profile descriptions
No private user information was collected
Research aims to promote algorithmic transparency

Technologies Used

Web Automation: Selenium WebDriver, Python
Machine Learning: PyTorch, CLIP, Doc2Vec, scikit-learn
Computer Vision: OpenCV, PIL
High-Performance Computing: SLURM, GPU clusters
Data Processing: pandas, NumPy

Citations & References

This research builds upon and adapts:

Thomas, C. & Kovashka, A. (2019). "Predicting the Politics of an Image Using Webly Supervised Data." NeurIPS 2019.
McGill University computing resources
Instagram's public API and web interface

License

MIT License - See LICENSE file for academic and research use permissions.
