Vikas-2703/README.md

Vikas | Data & BI Analyst

Turning messy data into clear decisions · SQL · Python · Power BI · AWS


👋 About Me

I am a Data & Business Intelligence Analyst who enjoys turning raw, messy data into insights that business teams can actually use.
My experience spans dashboarding, analytics, and cloud data engineering – from cleaning and modelling data in SQL/Python to building modern data pipelines on AWS and visualising results in Power BI / Tableau / R Shiny.

I’m especially interested in:

  • Building end-to-end analytics solutions – ingestion → modelling → dashboards
  • Designing data lakes / ETL pipelines in the cloud (AWS S3, Glue, Lambda, Athena)
  • Creating interactive dashboards that answer real business questions and tell a clear story

I enjoy collaborating with stakeholders, asking “what decision will this chart actually drive?”, and then designing the data model and visuals around that.


🚀 Skills & Technologies

🧩 Languages

Python · R · SQL

📊 Data & Analytics

Pandas · NumPy · Matplotlib · Seaborn · scikit-learn · Excel

☁️ Cloud & Data Engineering

AWS S3 · AWS Glue · AWS Lambda · Athena · Git · GitHub

📈 BI & Storytelling

Power BI · Tableau · R Shiny

🧰 Tools & IDEs

VS Code · Jupyter · Windows


📂 Featured Projects

1. YouTube Trending Data Lake on AWS

AWS · S3 · Glue (PySpark) · Lambda · Athena · Parquet

  • Built an end-to-end data lake on AWS for the YouTube Trending Kaggle dataset
  • Landed raw CSV/JSON into an S3 landing zone and used Glue (PySpark) + Lambda to build a cleansed Parquet layer
  • Registered datasets in the AWS Glue Data Catalog and queried them from Athena for analytics
  • Documented the full architecture and data flow for interview-ready walkthroughs
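
As a rough illustration of the Lambda side of this flow, the sketch below reads an S3 event notification and works out where each raw file would land in the cleansed layer. The `raw/` and `cleansed/` prefixes and the `cleansed_key` helper are assumptions made for the example, not the repo's actual layout, and the Parquet conversion itself is left out.

```python
import posixpath
from urllib.parse import unquote_plus

# Hypothetical prefixes; the real bucket layout is not described in this README.
RAW_PREFIX = "raw/"
CLEAN_PREFIX = "cleansed/"


def cleansed_key(raw_key: str) -> str:
    """Map a raw landing-zone key to a Parquet key in the cleansed layer."""
    relative = raw_key[len(RAW_PREFIX):] if raw_key.startswith(RAW_PREFIX) else raw_key
    stem, _extension = posixpath.splitext(relative)  # drop .csv / .json
    return f"{CLEAN_PREFIX}{stem}.parquet"


def lambda_handler(event, context):
    """Return the source and target keys for every object in an S3 event."""
    targets = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        targets.append({"bucket": bucket, "source": key, "target": cleansed_key(key)})
    # A real function would read each object and write Parquet at this point.
    return targets
```

Keeping Hive-style folders such as `region=us/` in the key means the cleansed tables can be partitioned by region when they are registered in the Glue Data Catalog.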

🔗 Repo: YouTube-Trending-Video-Analysis


2. Online Sales Dashboard (R Shiny)

R · R Shiny · dplyr · ggplot2

  • Designed an interactive sales performance dashboard for an e-commerce store
  • Provided KPIs (revenue, units sold, AOV, transactions) with filters by product category and region
  • Built visual breakdowns of sales by category/region and an interactive transaction table
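
The dashboard itself is written in R; purely to show how the four KPIs relate, here is a small Python sketch. The `OrderLine` fields and the `sales_kpis` function are hypothetical names, and AOV is assumed to mean filtered revenue divided by the number of distinct transactions in the filtered set.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class OrderLine:
    transaction_id: str
    category: str
    region: str
    units: int
    unit_price: float


def sales_kpis(lines: Iterable[OrderLine],
               category: Optional[str] = None,
               region: Optional[str] = None) -> dict:
    """Compute revenue, units sold, transactions and AOV for the filtered lines."""
    selected = [
        line for line in lines
        if (category is None or line.category == category)
        and (region is None or line.region == region)
    ]
    revenue = sum(line.units * line.unit_price for line in selected)
    units = sum(line.units for line in selected)
    transactions = len({line.transaction_id for line in selected})
    # Average order value; guard against an empty filter result.
    aov = revenue / transactions if transactions else 0.0
    return {"revenue": revenue, "units": units,
            "transactions": transactions, "aov": aov}
```

Passing `category` or `region` mirrors the dashboard filters: every KPI is recomputed from the filtered rows only.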

(Screenshot in repo)
🔗 Repo: ONLINE_SALES_DASHBOARD


3. Audiobook Pricing & Sales Insights

Python · SQL · Pandas · scikit-learn · Tableau

  • Cleaned audiobook metadata and listener behaviour data
  • Performed exploratory data analysis to understand pricing, ratings and sales patterns
  • Trained baseline ML models to explore what drives audiobook ratings
  • Built Tableau visuals to communicate insights to non-technical stakeholders
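
To make "baseline" concrete, the sketch below compares a predict-the-mean baseline with a one-feature least-squares line, scored by mean absolute error. It uses only the standard library rather than scikit-learn, and the function names are illustrative; it is not the modelling code from the repo.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def compare_to_baseline(x, y):
    """Score a mean-only baseline against a fitted line on the same data."""
    baseline = [mean(y)] * len(y)
    slope, intercept = fit_line(x, y)
    fitted = [slope * xi + intercept for xi in x]
    return {"baseline_mae": mae(y, baseline), "line_mae": mae(y, fitted)}
```

If a feature does not beat the mean-only baseline, it is unlikely to explain much of the variation in ratings on its own.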

🔗 Repo: AudioBook-Prediction-Using-ML-models


4. US Military Spending Forecasting

R · Time-Series · ARIMA

  • Analysed 60+ years of defence budget data
  • Performed stationarity tests (ADF, PP, KPSS) and time-series diagnostics
  • Built ARIMA models and produced forward forecasts with interpretation
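
Stationarity tests such as ADF, PP and KPSS are typically used to choose the differencing order d in ARIMA(p, d, q). The project is in R; the minimal Python sketch below only shows what differencing does to a series, and the order used in the actual analysis is not stated here.

```python
def difference(series, order=1):
    """Apply first differencing `order` times (the "I" step in ARIMA)."""
    values = list(series)
    for _ in range(order):
        # Each pass replaces the series with its successive changes.
        values = [later - earlier for earlier, later in zip(values, values[1:])]
    return values
```

Each pass shortens the series by one observation, so a series with a steady trend becomes roughly constant after one difference.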

🔗 Repo: us-military-spending-forecasting


5. HR Attrition Analytics Dashboard (Power BI)

Power BI · DAX · Power Query · Drill-through · Bookmarks

  • Built a 4-page HR analytics dashboard to monitor attrition KPIs and uncover key drivers across department, job role, overtime, satisfaction, travel, age band, and education
  • Implemented a Decomposition Tree for interactive root-cause exploration and a heatmap matrix (Role × Satisfaction) to surface high-risk combinations
  • Added an Employee Detail drill-through page with searchable Employee ID slicer and reset bookmarks for clean default states and smooth navigation
  • Designed clear tooltips (rate + base) and consistent slicers to improve interpretability and reduce “small sample size” misreads
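
The "rate + base" idea can be sketched outside Power BI as well. The Python below stands in for the DAX measures, which are not shown in this README: it reports each group's attrition rate alongside its headcount and flags groups below a hypothetical `MIN_BASE` threshold.

```python
from collections import defaultdict

MIN_BASE = 30  # hypothetical cut-off for a "small sample" warning


def attrition_by_group(rows, min_base=MIN_BASE):
    """rows: iterable of (group, left) pairs, where left is True for leavers."""
    counts = defaultdict(lambda: [0, 0])  # group -> [leavers, headcount]
    for group, left in rows:
        counts[group][0] += int(left)
        counts[group][1] += 1
    return {
        group: {
            "rate": leavers / base,
            "base": base,
            "small_sample": base < min_base,
        }
        for group, (leavers, base) in counts.items()
    }
```

Showing the base next to the rate makes it clear when a 100% attrition rate comes from a single employee rather than a whole team.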

🔗 Repo: powerbi-hr-attrition-dashboard


💬 Let’s Connect

Always happy to talk about data, analytics, dashboards, and cloud projects!

Pinned

  1. YouTube-Trending-Video-Analysis (Public)

    End-to-end AWS data lake for YouTube Trending videos using S3, Glue (PySpark), Lambda and Athena, from raw Kaggle CSV/JSON to analytics-ready Parquet tables

    Python

  2. AudioBook-Prediction-Using-ML-models (Public)

    Two-phase ML project predicting audiobook ratings using Python

    Jupyter Notebook

  3. ONLINE_SALES_DASHBOARD (Public)

    📊 Interactive R Shiny dashboard visualizing online sales data with multi-tab analytics, dynamic filters, and real-time KPIs, live on shinyapps.io.

    R

  4. Clinical-NLP-pipeline (Public)

    Jupyter Notebook

  5. us-military-spending-forecasting (Public)

    Time series analysis of US military

    R