Welcome to my GitHub! I'm an Entry-Level Data Scientist with industry experience at Rolls-Royce Power Systems, passionate about turning data into actionable insights through machine learning, statistical analysis, and scalable data pipelines.
My work spans exploratory data analysis, predictive modeling, time-series analytics, and applied AI systems deployed in real-world environments.
- Exploratory Data Analysis & Data Wrangling: cleaning, feature engineering, insight discovery
- Predictive Modeling: classification, regression, ensemble methods
- Time-Series Analysis: segmentation, anomaly detection, forecasting
- Unstructured Data Analytics: Retrieval-Augmented Generation (RAG), document intelligence
- Built and deployed machine learning pipelines for time-series classification and thermal prediction on real-world sensor data.
- Optimized deep learning models for production, reducing model size from ~29 GB to <1 GB with minimal accuracy loss.
- Accelerated end-to-end inference pipelines using PyTorch GPU acceleration, reducing processing time by ~95%.
- Developed physics-informed ML models to estimate engine node temperatures with ±5–6°C accuracy, reducing reliance on costly simulations.
- Collaborated with cross-functional engineering teams and stakeholders to validate model outputs against domain constraints.
- Built and evaluated ensemble ML models achieving 81% accuracy and 87% F1-score.
- Performed feature engineering and model comparison to identify key churn drivers and derive actionable insights.
- Conducted EDA including data imputation, outlier detection, correlation analysis, and feature distribution analysis.
- Delivered clean, ML-ready datasets for downstream modeling.
- Implemented a RAG pipeline using FAISS and transformer models for semantic retrieval and context-aware answer generation over large PDF documents.
- Compared KNN, Logistic Regression, and Decision Tree models to analyze performance trade-offs in weather prediction tasks.
Languages & Data:
Python · SQL · PostgreSQL · Excel
Data Science & ML:
Pandas · NumPy · Scikit-learn · Matplotlib · PyTorch · TensorFlow
Tools & Platforms:
Git · Jupyter · AWS (Learning)
- Indian Patent (Published): AI-Driven Multi-Factor Authentication Using Hardware Fingerprints
- IEEE Access: Robust Authentication Using Hardware Fingerprints and AI
- Springer: Residual Network Depth vs Accuracy for Crop Disease Classification
- Elsevier (Under Review): Autoencoder-Based Mixed Pixel Correction in Thermal Images
- Email: sachinsmanoj02@gmail.com
- LinkedIn: linkedin.com/in/sachinsm2002
Thanks for visiting! Always open to discussions on data, machine learning, and real-world problem solving.