The PhonePe Transaction Insights project focuses on analyzing digital payment transaction data from the PhonePe platform to uncover meaningful business insights, transaction trends, and user behavior patterns across different states in India.
This project includes:
- Data Extraction from JSON files
- SQL Database Integration
- Exploratory Data Analysis (EDA)
- Data Visualization
- Machine Learning Model Implementation
- Interactive Streamlit Dashboard
The main objective of this project is to analyze transaction behavior and generate data-driven insights that can help improve digital payment services, customer engagement, and business decision-making.
Key goals include:
- Understanding transaction trends
- Identifying top-performing states
- Analyzing payment categories
- Predicting transaction amounts using Machine Learning
- Building an interactive dashboard for real-time insights
Technologies used:
- Python
- MySQL
- Pandas
- NumPy
- Matplotlib
- Seaborn
- Scikit-learn
- Streamlit
The dataset was obtained from the PhonePe Pulse GitHub repository and consists of transaction-related JSON files categorized into:
- Aggregated Data
- Map Data
- Top Transaction Data
The data contains:
- State
- Year
- Quarter
- Transaction Type
- Transaction Count
- Transaction Amount
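The fields above can be pulled out of the nested JSON files with a short directory walk. The following is a minimal sketch, not the project's actual extraction script. It assumes the aggregated transaction files are laid out as `<root>/<state>/<year>/<quarter>.json`, and that each file holds a `data.transactionData` list whose entries carry a `name` and a `paymentInstruments` list with `count` and `amount`. The function name `extract_aggregated_transactions` and the output column names are hypothetical.

```python
import json
from pathlib import Path


def extract_aggregated_transactions(root):
    """Walk <root>/<state>/<year>/<quarter>.json and return one dict per record.

    The folder layout and JSON keys are assumptions about the dataset,
    not something confirmed by this README.
    """
    rows = []
    for path in sorted(Path(root).glob("*/*/*.json")):
        # State, year and quarter are encoded in the path, not in the file body.
        state, year, quarter = path.parts[-3], path.parts[-2], path.stem
        with path.open(encoding="utf-8") as fh:
            payload = json.load(fh)
        # Tolerate files where "data" or "transactionData" is missing or null.
        entries = (payload.get("data") or {}).get("transactionData") or []
        for entry in entries:
            for instrument in entry.get("paymentInstruments", []):
                rows.append({
                    "state": state,
                    "year": int(year),
                    "quarter": int(quarter),
                    "transaction_type": entry["name"],
                    "transaction_count": int(instrument["count"]),
                    "transaction_amount": float(instrument["amount"]),
                })
    return rows
```

The returned list of dictionaries can be passed straight to `pandas.DataFrame(rows)` or inserted into a database table.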
The setup.sql file contains the SQL commands used to create the database and tables for the project. The phonepe-db.session.sql file is a session file generated automatically by the SQLTools extension to store query history and connection details; it is not required to run the project.
Data extraction and SQL integration steps:
- Extracted JSON data from nested folders
- Created MySQL database and tables
- Loaded transaction data into SQL
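The create-and-load steps above can be sketched as follows. To keep the example runnable without a database server, it uses Python's built-in `sqlite3` module as a stand-in for MySQL; the project itself uses MySQL and setup.sql. The table name `aggregated_transaction`, its columns, and the sample rows are illustrative assumptions, not the project's actual schema or data.

```python
import sqlite3

# Hypothetical schema mirroring the six fields in the dataset description.
CREATE_TABLE_SQL = """
CREATE TABLE IF NOT EXISTS aggregated_transaction (
    state TEXT NOT NULL,
    year INTEGER NOT NULL,
    quarter INTEGER NOT NULL,
    transaction_type TEXT NOT NULL,
    transaction_count INTEGER NOT NULL,
    transaction_amount REAL NOT NULL
)
"""

INSERT_SQL = """
INSERT INTO aggregated_transaction
    (state, year, quarter, transaction_type, transaction_count, transaction_amount)
VALUES (?, ?, ?, ?, ?, ?)
"""


def load_rows(conn, rows):
    """Create the table if needed and bulk-insert the given tuples."""
    conn.execute(CREATE_TABLE_SQL)
    conn.executemany(INSERT_SQL, rows)
    conn.commit()


def amount_by_state(conn):
    """Return (state, total amount) pairs, largest total first."""
    cur = conn.execute(
        "SELECT state, SUM(transaction_amount) FROM aggregated_transaction "
        "GROUP BY state ORDER BY SUM(transaction_amount) DESC"
    )
    return cur.fetchall()


# Made-up sample rows, used only to exercise the functions.
conn = sqlite3.connect(":memory:")
load_rows(conn, [
    ("kerala", 2021, 1, "Merchant payments", 10, 2500.0),
    ("kerala", 2021, 1, "Peer-to-peer payments", 4, 1500.0),
    ("goa", 2021, 1, "Merchant payments", 3, 900.0),
])
print(amount_by_state(conn))  # -> [('kerala', 4000.0), ('goa', 900.0)]
```

With a MySQL driver such as mysql-connector-python or PyMySQL, the same pattern applies, but the parameter placeholder is `%s` instead of `?`.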
Exploratory data analysis covered:
- Data Cleaning
- Missing Value Handling
- Outlier Detection
- Feature Engineering
- Visualization & Insights
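The cleaning, missing-value, outlier, and feature-engineering steps above could look roughly like this in Pandas. This is a sketch under stated assumptions: the column names are hypothetical, missing numeric values are treated as zero, outliers are flagged with the 1.5 × IQR rule, and the engineered feature is the average amount per transaction. The project may have made different choices.

```python
import pandas as pd


def clean_and_engineer(df):
    """Return a cleaned copy with an outlier flag and an average-value feature."""
    # Data cleaning: remove exact duplicate rows.
    out = df.drop_duplicates().copy()
    # Missing value handling (assumed policy): a row without a state is unusable,
    # while a missing count or amount is treated as zero.
    out = out.dropna(subset=["state"])
    num_cols = ["transaction_count", "transaction_amount"]
    out[num_cols] = out[num_cols].fillna(0)
    # Outlier detection: flag amounts outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR].
    q1 = out["transaction_amount"].quantile(0.25)
    q3 = out["transaction_amount"].quantile(0.75)
    iqr = q3 - q1
    in_range = out["transaction_amount"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    out["is_outlier"] = ~in_range
    # Feature engineering: average amount per transaction, 0 when the count is 0.
    safe_count = out["transaction_count"].where(out["transaction_count"] > 0)
    out["avg_transaction_value"] = (out["transaction_amount"] / safe_count).fillna(0.0)
    return out.reset_index(drop=True)
```

Flagging outliers rather than dropping them keeps the decision visible: a very large state total may be a genuine value rather than an error.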
Three regression models were implemented to predict transaction amount:
- Linear Regression
- Decision Tree Regressor
- Random Forest Regressor
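A minimal sketch of how the three models could be trained and compared with Scikit-learn. The feature set (transaction count and quarter), the 80/20 split, the hyperparameters, and the helper name `compare_regressors` are all assumptions. The data is synthetic and exists only to make the example runnable, so the scores it produces say nothing about the project's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def compare_regressors(X, y, random_state=42):
    """Fit the three regressors on one split and return their test R² scores."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=random_state
    )
    models = {
        "Linear Regression": LinearRegression(),
        "Decision Tree Regressor": DecisionTreeRegressor(random_state=random_state),
        "Random Forest Regressor": RandomForestRegressor(
            n_estimators=100, random_state=random_state
        ),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = r2_score(y_test, model.predict(X_test))
    return scores


# Synthetic stand-in data: amount grows roughly linearly with count, plus noise.
rng = np.random.default_rng(0)
counts = rng.uniform(1_000, 100_000, size=300)
quarters = rng.integers(1, 5, size=300)
X = np.column_stack([counts, quarters])
y = 1_500 * counts + rng.normal(0, 1e6, size=300)

scores = compare_regressors(X, y)
for name, score in scores.items():
    print(f"{name}: R² = {score:.3f}")
```

Using one shared train/test split for all models keeps the comparison fair; cross-validation would give a more stable ranking on small datasets.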
Built an interactive dashboard with:
- KPI Metrics
- State-wise Analysis
- Transaction Trend Analysis
- Dynamic Filtering
- Data Visualization
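The dashboard pieces listed above can be sketched as two small Pandas helpers plus a Streamlit rendering function. This is an illustration, not the project's app.py: the column names, the function names `filter_transactions`, `compute_kpis`, and `render_dashboard`, and the choice of charts are assumptions. Streamlit is imported inside `render_dashboard` so that the helpers can be used and tested without it.

```python
import pandas as pd


def filter_transactions(df, state=None, year=None):
    """Dynamic filtering: keep rows matching the selected state and/or year."""
    mask = pd.Series(True, index=df.index)
    if state is not None:
        mask &= df["state"] == state
    if year is not None:
        mask &= df["year"] == year
    return df[mask]


def compute_kpis(df):
    """KPI metrics: total amount, total count, and average value per transaction."""
    total_amount = float(df["transaction_amount"].sum())
    total_count = int(df["transaction_count"].sum())
    return {
        "total_amount": total_amount,
        "total_count": total_count,
        "avg_value": total_amount / total_count if total_count else 0.0,
    }


def render_dashboard(df):
    """Draw sidebar filters, KPI cards, a trend chart, and a state-wise chart."""
    import streamlit as st  # lazy import: only needed when rendering

    st.title("PhonePe Transaction Insights")
    state = st.sidebar.selectbox("State", ["All"] + sorted(df["state"].unique()))
    year = st.sidebar.selectbox("Year", ["All"] + sorted(df["year"].unique().tolist()))
    view = filter_transactions(
        df,
        state=None if state == "All" else state,
        year=None if year == "All" else year,
    )

    kpis = compute_kpis(view)
    col1, col2, col3 = st.columns(3)
    col1.metric("Total amount", f"{kpis['total_amount']:,.0f}")
    col2.metric("Total transactions", f"{kpis['total_count']:,}")
    col3.metric("Average value", f"{kpis['avg_value']:,.2f}")

    # Transaction trend: total amount per year and quarter.
    trend = view.groupby(["year", "quarter"])["transaction_amount"].sum()
    trend.index = [f"{y} Q{q}" for y, q in trend.index]
    st.line_chart(trend.to_frame())

    # State-wise analysis: total amount per state.
    st.bar_chart(view.groupby("state")["transaction_amount"].sum().to_frame())
```

In an app.py, `render_dashboard(df)` would be called once at module level after loading the data, since Streamlit re-runs the script from top to bottom on every interaction.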
Key insights:
- Peer-to-peer and merchant payments dominate transaction volume.
- Digital payment adoption has increased significantly over the years.
- Certain states account for a disproportionately large share of the total transaction amount.
- Transaction count is strongly correlated with transaction amount.
- Seasonal variations exist across quarters.
Among the implemented models, the Random Forest Regressor performed best, with higher prediction accuracy and better generalization than Linear Regression and the Decision Tree Regressor.
Evaluation Metrics Used:
- MAE (Mean Absolute Error)
- MSE (Mean Squared Error)
- RMSE (Root Mean Squared Error)
- R² Score
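For reference, the four metrics above can be computed from their standard definitions in a few lines of plain Python. The helper name `regression_metrics` and the sample numbers are illustrative only; in practice the equivalent functions in `sklearn.metrics` would normally be used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R² from their standard definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # same units as the target
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R² is undefined when all true values are identical (ss_tot == 0).
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Made-up values: errors are -10, +10 and 0.
metrics = regression_metrics([100.0, 200.0, 300.0], [110.0, 190.0, 300.0])
print(metrics)
```

MAE and RMSE are in the same units as the transaction amount, which makes them easier to interpret than MSE; R² is unitless and compares the model against always predicting the mean.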
Dashboard features:
- Interactive Filters
- Real-time Visualization
- State-wise Insights
- Transaction Category Analysis
- Trend Monitoring
- KPI Cards
To run the project, install the dependencies and start the Streamlit app:
pip install -r requirements.txt
python -m streamlit run app.py
This project demonstrates how data analytics and machine learning can be used to generate valuable insights from digital payment transaction data. The analysis helps support strategic business decisions, improve customer engagement, and optimize digital payment services.