peiyan0/ai-stats-toolkit


🧮 Statistics Toolkit: AI-Augmented Analytics & Predictive Suite

A professional-grade predictive analytics platform and statistical modeling suite. Run hypothesis tests and machine learning models with AI insights served by a local Ollama model, so research data can stay on your own machine.


🎯 Core Analytics Capabilities

The Statistics Toolkit is designed for researchers, data scientists, and students who require rigorous statistical validation without sacrificing privacy.

| Category | Description | Primary Analytical Tools |
|---|---|---|
| Predictive Suite | Advanced Machine Learning for forecasting. | Linear/Logistic Regression, Feature Importance, Cross-Validation. |
| Inferential Logic | Scientific Hypothesis Testing & Validation. | Independent T-Test, One-Way ANOVA, Assumption Auditing. |
| AI Consultant | Private AI-driven results interpretation. | "Which Test?" Wizard, Automated APA 7th Results Writer, AI Dataset Profiler & Clean. |
| Descriptive Data | Comprehensive summary statistics. | Mean, Median, Skewness, Kurtosis, Outlier Detection. |
| Data Persistence | Secure result management & portability. | CSV Research Export, SQLite Session History. |
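
As a rough illustration of the Descriptive Data row, the sketch below computes the listed summary statistics with Python's standard library and flags outliers with the common 1.5 * IQR rule. The helper name `describe`, the population-moment formulas for skewness and excess kurtosis, the "inclusive" quartile method, and the sample values are all assumptions made for this example; the toolkit's own implementation may differ.

```python
import statistics


def describe(values, iqr_multiplier=1.5):
    """Summarize a numeric sample (hypothetical helper, not the toolkit's API)."""
    n = len(values)
    if n < 4:
        raise ValueError("need at least 4 values for quartiles and moments")
    mean = statistics.fmean(values)
    median = statistics.median(values)

    # Population central moments: m_k = (1/n) * sum((x - mean) ** k)
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; skewness is undefined")
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution

    # Quartiles, then the IQR fences used to flag outliers.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low = q1 - iqr_multiplier * iqr
    high = q3 + iqr_multiplier * iqr
    outliers = [x for x in values if x < low or x > high]

    return {
        "mean": mean,
        "median": median,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
        "q1": q1,
        "q3": q3,
        "outliers": outliers,
    }


# Made-up sample with one obviously extreme value.
sample = [12, 14, 14, 15, 16, 17, 18, 19, 21, 58]
summary = describe(sample)
print(summary["median"], summary["outliers"])  # 16.5 [58]
```

The single extreme value pulls the mean (20.4) well above the median (16.5) and produces positive skewness, which is the kind of pattern a descriptive summary is meant to surface before any inferential test is run.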

🌟 Premium Features

  • 🛡️ Automated Assumption Audit: Never run a test on unchecked assumptions again. Every inference tool automatically checks normality (Shapiro-Wilk test) and homogeneity of variance (Levene's test) before presenting results.
  • 🤖 Local & Cloud AI Insights: A dual-channel hybrid routing engine connects to direct cloud endpoints or falls back to a local Ollama server, and returns plain-English interpretations of complex statistics.
  • 📄 Automated APA 7th Results Writer: Drafts publication-ready "Results and Analysis" sections that follow the APA 7th Edition manual, setting the standard statistical symbols (t, F, p, d, df) in italics and reporting precise statistical figures.
  • 📊 AI Dataset Profiling & Clean: Probes variable structures, identifies skewness and distribution anomalies, runs IQR outlier detection, and suggests target-specific cleaning protocols inside an immersive, glassmorphic viewport.
  • 🧠 Local Hybrid RAG Engine: Features a Retrieval-Augmented Generation (RAG) pipeline. AI recommendations and results interpretation are grounded in a local statistical knowledge base (e.g., APA 7th standards, test decision matrices).
    • Semantic Embedding Match: Matches search queries with local text chunks using local embedding vectors.
    • Zero-Dependency TF-IDF Fallback: Automatically transitions to an optimized, pure-Python search index if the local embedding API is offline.
  • 📈 Model Audit Dashboard: Visualize Feature Impact and Error Metrics (MAE, RMSE, R²) in real time to understand what drives your predictions.
  • ♿ Inclusive Design: Fully compliant with WCAG 2.2 AA standards, featuring aria-live regions, high-contrast themes, and full keyboard navigation support.
  • 📥 Professional Data Export: Export your entire calculation history as a research-ready CSV file for use in academic manuscripts or external BI tools.
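
The Automated Assumption Audit described above can be sketched with SciPy, which is part of the project's analytics stack. This is a minimal illustration, not the toolkit's actual code: the helper name `audited_t_test`, the 0.05 threshold, the sample data, and the choice to switch to Welch's t-test when Levene's test rejects equal variances are all assumptions made for the example.

```python
from scipy import stats


def audited_t_test(group_a, group_b, alpha=0.05):
    """Check normality and equal variances, then run an independent t-test.

    Hypothetical helper for illustration; not the toolkit's actual API.
    """
    # Shapiro-Wilk: a p-value below alpha suggests the sample is not normal.
    normality_p = [float(stats.shapiro(g).pvalue) for g in (group_a, group_b)]
    # Levene's test: a p-value below alpha suggests unequal variances.
    levene_p = float(stats.levene(group_a, group_b).pvalue)

    equal_var = levene_p >= alpha
    # Assumption of this sketch: with unequal variances, fall back to
    # Welch's t-test, which SciPy selects via equal_var=False.
    result = stats.ttest_ind(group_a, group_b, equal_var=equal_var)

    return {
        "normality_p": normality_p,
        "normal": [p >= alpha for p in normality_p],
        "levene_p": levene_p,
        "equal_variances": equal_var,
        "test": "Student t-test" if equal_var else "Welch t-test",
        "t": float(result.statistic),
        "p": float(result.pvalue),
    }


# Made-up data: similar centers, very different spreads.
control = [10.1, 9.9, 10.0, 10.2, 9.8, 10.05, 9.95, 10.15, 9.85, 10.0]
treated = [5.0, 15.0, 2.0, 18.0, 9.0, 11.0, 0.0, 20.0, 7.0, 13.0]
audit = audited_t_test(control, treated)
print(audit["test"], audit["equal_variances"])
```

Returning the audit results alongside the test statistic keeps the assumption checks visible to whoever reads the output, rather than hiding them behind a single p-value.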

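The Zero-Dependency TF-IDF Fallback listed under the RAG engine can be pictured as a small in-memory index. The sketch below is one possible pure-Python version, assuming smoothed IDF weights and cosine similarity; the class name `TfidfIndex`, the tokenizer, and the three knowledge-base chunks are hypothetical, and the routing logic that would choose between embeddings and this index is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TfidfIndex:
    """Tiny in-memory TF-IDF index (illustrative; not the toolkit's code)."""

    def __init__(self, chunks):
        self.chunks = list(chunks)
        docs = [Counter(tokenize(c)) for c in self.chunks]
        n_docs = len(docs)
        # Document frequency: iterating a Counter yields each term once.
        df = Counter(term for d in docs for term in d)
        # Smoothed IDF keeps weights positive even for very common terms.
        self.idf = {
            t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()
        }
        self.vectors = [self._weigh(d) for d in docs]

    def _weigh(self, counts):
        """Build an L2-normalized TF-IDF vector, ignoring unknown terms."""
        vec = {t: tf * self.idf[t] for t, tf in counts.items() if t in self.idf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        return {t: w / norm for t, w in vec.items()} if norm else {}

    def search(self, query, top_k=1):
        """Return up to top_k (score, chunk) pairs ranked by cosine similarity."""
        q = self._weigh(Counter(tokenize(query)))
        scored = [
            (sum(w * vec.get(t, 0.0) for t, w in q.items()), chunk)
            for vec, chunk in zip(self.vectors, self.chunks)
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(round(s, 3), c) for s, c in scored[:top_k] if s > 0]


# Hypothetical knowledge-base chunks, invented for this example.
chunks = [
    "Use an independent t-test to compare the means of two unrelated groups.",
    "Use one-way ANOVA to compare the means of three or more groups.",
    "Report p values in APA style without a leading zero.",
]
index = TfidfIndex(chunks)
hits = index.search("how do I report a p value in APA format")
print(hits[0][1])
```

Because both the query and the chunk vectors are normalized, the dot product in `search` is the cosine similarity, and a query that shares no vocabulary with the knowledge base returns an empty list rather than an arbitrary chunk.
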
💻 Tech Stack

  • Backend: Python 3.12+ (Flask)
  • Analytics: Scikit-Learn, SciPy, NumPy, Statsmodels
  • AI Engine: Ollama (Local Model Integration)
  • Frontend: Vanilla JS, Chart.js
  • Accessibility: WCAG 2.2 AA Standardized

🚀 Getting Started

  1. Install Ollama: Visit ollama.com and install the local server.
  2. Pull the Model: Run ollama pull phi3.5:3.8b.
  3. Set Up the Environment:
    # 1. Install dependencies
    pip install -r requirements.txt
    
    # 2. Run the app
    flask run
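
Before starting the app, it can help to confirm that the Ollama server is running and that the model from step 2 was pulled. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally available models, on the default port 11434. The helper names are hypothetical and this check is not part of the toolkit itself; it assumes the server uses the default address.

```python
import json
import urllib.error
import urllib.request


def model_is_listed(tags_payload, model_name):
    """Return True if model_name appears in an Ollama /api/tags response."""
    models = tags_payload.get("models", [])
    return any(m.get("name") == model_name for m in models)


def ollama_has_model(model_name, base_url="http://localhost:11434", timeout=3.0):
    """Ask a local Ollama server whether a model has been pulled.

    Returns False if the server is unreachable or replies with invalid JSON.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return False
    return model_is_listed(payload, model_name)


# Offline demonstration with a hand-written payload in the /api/tags shape.
sample_payload = {"models": [{"name": "phi3.5:3.8b"}]}
print(model_is_listed(sample_payload, "phi3.5:3.8b"))  # True
```

With the server running, `ollama_has_model("phi3.5:3.8b")` should return True once the pull in step 2 has completed, and False if the server is down or the model is missing.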

🔮 Future Roadmap

We are continuously evolving the toolkit to meet the needs of modern researchers. Upcoming features include:

  • 📊 Advanced Visual Diagnostics: Integration of Violin plots, Q-Q plots, and Residual dashboards for deeper model validation.
  • ⚖️ Bayesian Module: Moving beyond frequentist p-values with Bayesian credible intervals and posterior distribution modeling.
  • 📉 Time-Series Forecasting: Implementation of ARIMA and Prophet models for sequential data analysis and trend prediction.
  • 🧠 Deep Learning Integration: A neural network wizard for building simple Keras/PyTorch models with automated hyperparameter tuning.
  • 🔐 Encrypted Cloud Sync: Optional, end-to-end encrypted synchronization for multi-device research collaboration.

Developed for researchers who value privacy, precision, and inclusive design.
