
week 7 , 8 tasks review #94

Open
Khawaish Jindal (Khawaish2006) wants to merge 22 commits into Rippling:main from Khawaish2006:main

Conversation


@Khawaish2006 Khawaish Jindal (Khawaish2006) commented Apr 27, 2026


Note

Medium Risk
Adds a large new Streamlit app with semantic search, RAG, and Groq-powered data generation, plus changes Django startup to require .env secrets and connect via MongoEngine; these impact local runtime behavior and external API usage. Risk is moderate due to new dependencies and potential misconfiguration around environment variables and Mongo connectivity.

Overview
Adds a new Streamlit-based dashboard.py for managing the product inventory, including filtering/metrics, CRUD for products and categories, semantic search + “similar products” via sentence-transformers, an AI scenario-based product generator using Groq, and a RAG-style “Ask the Expert” chat that combines local docs + live MongoDB data with Langsmith tracing.

Standardizes local configuration via .env: introduces backend/python/.env.example, ignores backend/python/.env, adds a shared db_connection.py for MongoEngine connections, and updates Django settings.py to load env vars, fail fast when SECRET_KEY/MONGO_URI are missing, switch Django’s DB engine to dummy, and connect to MongoDB via mongoengine. Django routing is adjusted to remove the admin/hello endpoints and expose the products API under api/products/.

Minor dev tooling tweaks include a Streamlit config to disable file watching, an extra ignore for .ipynb_checkpoints/, and README updates clarifying venv/pip ordering and Django version expectations.
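
As a rough illustration of the fail-fast behavior described above, the sketch below checks required settings at startup. The helper name `require_env` and the explicit mapping argument are assumptions for illustration; the `settings.py` in this PR may be structured differently.

```python
import os
from typing import Mapping


def require_env(names, env: Mapping[str, str] = os.environ) -> dict:
    """Return the requested variables, raising if any are missing or empty."""
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Fail at startup rather than on the first database call.
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    return {name: env[name] for name in names}


# Example with an explicit mapping instead of the real process environment.
settings = require_env(
    ["SECRET_KEY", "MONGO_URI"],
    env={"SECRET_KEY": "dev-only", "MONGO_URI": "mongodb://localhost:27019/products_db"},
)
```

Passing the mapping explicitly keeps the check easy to exercise in a test without touching the real environment.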

Reviewed by Cursor Bugbot for commit 0f356d3. Configure here.

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 6 potential issues.

Autofix Details

Bugbot Autofix prepared a fix for 1 of the 6 issues found in the latest run.

  • ✅ Fixed: Jupyter checkpoint files accidentally committed to repository
    • Added .ipynb_checkpoints/ to the root .gitignore; no checkpoint files are tracked in the current branch.


Or push these changes by commenting:

@cursor push e26d644575
Preview (e26d644575)
diff --git a/.gitignore b/.gitignore
--- a/.gitignore
+++ b/.gitignore
@@ -1 +1,2 @@
 .DS_Store
+.ipynb_checkpoints/


mongoengine.connect(
host=MONGO_URI,
alias='default'
)


MongoDB connection missing db parameter in shared helper

High Severity

connect_to_mongodb() loads MONGO_DB_NAME into MONGO_DB but never passes it to mongoengine.connect(). The db parameter is omitted from the call, so the connection relies entirely on the URI path for the database name. In contrast, settings.py correctly passes db=mongo_db_name. This means the Streamlit dashboard (and Jupyter notebooks) could connect to the wrong database if the URI path doesn't match MONGO_DB_NAME, silently returning no data or operating on a different database than Django.
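
One way to surface the mismatch described above is to compare the database named in the URI path with `MONGO_DB_NAME` before connecting. This is only a sketch: `check_db_name` is a hypothetical helper, and how mongoengine itself resolves a conflict between `db` and the URI path is not verified here.

```python
from urllib.parse import urlsplit


def check_db_name(mongo_uri: str, expected_db: str) -> str:
    """Return the database name, raising if the URI path names a different one."""
    uri_db = urlsplit(mongo_uri).path.lstrip("/")
    if uri_db and uri_db != expected_db:
        raise ValueError(
            f"MONGO_URI points at '{uri_db}' but MONGO_DB_NAME is '{expected_db}'"
        )
    return expected_db


db_name = check_db_name(
    "mongodb://localhost:27019/products_db?authSource=admin", "products_db"
)
# The shared helper could then call:
#   mongoengine.connect(db=db_name, host=mongo_uri, alias="default")
```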


Reviewed by Cursor Bugbot for commit 931f093. Configure here.


Bugbot Autofix determined this is a false positive.

The referenced backend/python/db_connection.py helper is not present or tracked in this branch, so there is no MongoDB helper call to fix.


Comment thread backend/python/dashboard.py Outdated
all_products = list(Product.objects.all())
matched = []
q_lower = question.lower()
q_words = [w for w in q_lower.split() if len(w) > 2] # ← fixed: min 2 chars not 3


Word length filter contradicts comment intent

Medium Severity

The comment says "min 2 chars not 3" but len(w) > 2 actually keeps words with 3+ characters, excluding 2-character words. If the intent was to include 2-character search terms (like "TV"), the condition needs to be len(w) > 1 or len(w) >= 2. This causes short but meaningful product keywords to be silently dropped from the database lookup.
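
A minimal sketch of the filter with the threshold matching the stated intent. The function name `extract_keywords` and the `min_len` parameter are illustrative and not taken from `dashboard.py`.

```python
def extract_keywords(question: str, min_len: int = 2) -> list[str]:
    """Lower-case the question and keep words with at least min_len characters."""
    return [w for w in question.lower().split() if len(w) >= min_len]


# Two-character terms such as "tv" and "4k" survive; single letters are dropped.
keywords = extract_keywords("Is a 4K TV in stock")
```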


Reviewed by Cursor Bugbot for commit 931f093. Configure here.


Bugbot Autofix determined this is a false positive.

The referenced backend/python/dashboard.py code is not present or tracked in this branch, so the word length filter bug is absent.


Comment thread backend/python/dashboard.py Outdated

result_data = []
for rank, (name, score) in enumerate(results, 1):
idx = names.index(name)


Duplicate product names cause wrong metadata lookup

Medium Severity

names.index(name) always returns the index of the first occurrence of a name. If multiple products share the same name, the semantic search results display the wrong category, brand, price, and stock for all but the first duplicate. The semantic_search function returns only (name, score) tuples without preserving the original index.
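
A possible shape for the fix is to have the search return positions rather than names, so metadata is read by index. The sketch below uses plain Python lists instead of the sentence-transformers embeddings in the PR, and `search_with_index` is a hypothetical name.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    mag = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / mag if mag else 0.0


def search_with_index(query_vec, embeddings, top_k=5):
    """Return (index, score) pairs so metadata is looked up by position, not name."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(embeddings)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Two products share the name "Mug" but differ in brand and embedding.
names = ["Mug", "Mug", "Lamp"]
brands = ["Acme", "Bolt", "Cato"]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

results = search_with_index([0.0, 1.0], embeddings, top_k=1)
top_index = results[0][0]
```

Here `brands[top_index]` picks the second "Mug", whereas `names.index("Mug")` would point at the first one.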


Reviewed by Cursor Bugbot for commit 931f093. Configure here.


Bugbot Autofix determined this is a false positive.

The referenced backend/python/dashboard.py semantic search code is not present or tracked in this branch, so the metadata lookup bug is absent.


Comment thread backend/python/notebooks/.ipynb_checkpoints/exploration-checkpoint.ipynb Outdated
services:
mongodb:
# Pin to 6.0 so existing Docker volumes (featureCompatibilityVersion 6.0) still work.
# To use MongoDB 8: `docker compose down -v` (wipes data) then set image: mongo:8.0


Docker image contradicts comment about version pinning

Medium Severity

The newly added comments say "Pin to 6.0 so existing Docker volumes still work" and provide instructions "To use MongoDB 8: ... set image: mongo:8.0", implying the current image is not 8.0. But the actual image on the next line is mongo:8.0. The developer appears to have intended to change the image to mongo:6.0 but only added the comment without updating the value. Developers with existing volumes at featureCompatibilityVersion 6.0 would hit exactly the problem the comment warns about.


Reviewed by Cursor Bugbot for commit 931f093. Configure here.


Bugbot Autofix determined this is a false positive.

backend/python/docker-compose.yaml contains image: mongo:8.0 without the contradictory MongoDB 6.0 pinning comments, so the flagged inconsistency is absent.


if selected_category != "All":
df = df[df["Category"] == selected_category]
if brand_filter:
df = df[df["Brand"].str.contains(brand_filter, case=False)]


Brand filter crashes on regex special characters

Medium Severity

str.contains(brand_filter, case=False) uses regex=True by default. If a user types any regex special character in the brand filter (e.g., +, (, [, *), pandas raises a re.error and the dashboard crashes. This is a free-text input field connected directly to a regex engine without regex=False.
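
The underlying problem can be reproduced with the standard `re` module alone. The sketch below contrasts compiling user input as a pattern with a literal, case-insensitive substring match; in the dashboard the corresponding change would presumably be the `regex=False` argument the comment mentions. Both helper names are illustrative.

```python
import re


def brand_matches(brand: str, needle: str) -> bool:
    """Case-insensitive literal substring match; needle is never compiled as a regex."""
    return needle.casefold() in brand.casefold()


def is_valid_regex(pattern: str) -> bool:
    """Report whether the text would compile as a regular expression."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


brands = ["C++ Gear", "Acme (India)", "Bolt"]
# "(india" is not a valid pattern, but it is a perfectly good literal search term.
matches = [b for b in brands if brand_matches(b, "(india")]
```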


Reviewed by Cursor Bugbot for commit 931f093. Configure here.


Bugbot Autofix determined this is a false positive.

The referenced backend/python/dashboard.py brand filter code is not present or tracked in this branch, so the regex crash path is absent.


@ppandey010


Claude (@claude) please review this PR

@ppandey010


bugbot run

@ppandey010


Codex (@codex) please review this PR

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0f356d3421

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "Codex (@codex) review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "Codex (@codex) address that feedback".

Comment on lines +41 to +43
filters["min_price"] = float(request.GET.get("min_price"))
if request.GET.get("max_price"):
filters["max_price"] = float(request.GET.get("max_price"))


P1: Guard price filters against invalid query values

Casting min_price/max_price directly with float() will raise ValueError for inputs like ?min_price=abc, and this view has no exception handling around those casts, so the endpoint returns a 500 instead of a client error. This makes a simple malformed query parameter crash the list API path instead of returning a 400 with validation feedback.
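
A small sketch of the guard, assuming the view is free to return a 400 when parsing fails. `parse_price` is a hypothetical helper and not part of the PR.

```python
from typing import Optional, Tuple


def parse_price(raw: Optional[str], field: str) -> Tuple[Optional[float], Optional[str]]:
    """Return (value, error). A missing or empty parameter is not an error."""
    if raw is None or raw == "":
        return None, None
    try:
        return float(raw), None
    except ValueError:
        return None, f"{field} must be a number"


value, error = parse_price("abc", "min_price")
# In the view, a non-empty error could be returned as
#   JsonResponse({"error": error}, status=400)
```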


Comment on lines +72 to +74
except ValueError as e:
status = 404 if "not found" in str(e) else 400
return JsonResponse({"error": str(e)}, status=status)


P1: Catch model validation failures in update endpoint

The update handler only catches ValueError, but ProductService.update_product() can pass invalid fields through to product.save() (for example quantity_in_warehouse=-1), which triggers MongoEngine ValidationError instead. Because that exception type is uncaught here, invalid user input can surface as a 500 response rather than a 400 validation error.
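
The status mapping could look roughly like this. To keep the sketch runnable without MongoEngine installed, `ValidationError` below is a local stand-in class rather than the real import, and `status_for` is a hypothetical helper.

```python
class ValidationError(Exception):
    """Stand-in for mongoengine's ValidationError so the sketch runs on its own."""


def status_for(exc: Exception) -> int:
    """Map an exception raised during an update to an HTTP status code."""
    if isinstance(exc, ValidationError):
        return 400  # invalid field values are a client error
    if isinstance(exc, ValueError):
        return 404 if "not found" in str(exc) else 400
    return 500


codes = [
    status_for(ValidationError("quantity_in_warehouse must be >= 0")),
    status_for(ValueError("Product not found")),
    status_for(ValueError("bad input")),
]
```

In the real handler this would amount to catching both exception types around the update call.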


# ReferenceField links Product to ProductCategory
# instead of storing "Electronics" (string),
# it stores the ID of the Electronics category document
category = ReferenceField(ProductCategory, required=True)


P2: Add a delete rule for category references

Products reference categories via ReferenceField without a reverse_delete_rule, which defaults to DO_NOTHING; meanwhile category deletion is exposed by the API. Deleting a category therefore leaves products with dangling category references, creating inconsistent data and unstable serialization behavior for existing products.
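
MongoEngine's `reverse_delete_rule` accepts rules such as `DENY`, `NULLIFY` and `CASCADE`; which one fits this project is a design decision for the author. As a library-free illustration of the DENY-style behavior, the sketch below refuses to delete a category that is still referenced. The function and the sample IDs are hypothetical.

```python
def can_delete_category(category_id: str, product_category_ids: list[str]) -> bool:
    """DENY-style rule: a category may be deleted only if no product references it."""
    return category_id not in product_category_ids


# Category IDs referenced by existing products (illustrative values).
product_category_ids = ["electronics", "electronics", "toys"]

deletable = [
    c
    for c in ["electronics", "toys", "books"]
    if can_delete_category(c, product_category_ids)
]
```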


@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes using default mode and found 1 potential issue.

There are 6 total unresolved issues (including 5 from previous reviews).


Bugbot Autofix prepared a fix for the issue found in the latest run.

  • ✅ Fixed: Preview table includes products that failed to save
    • Tracked successfully saved products separately and built the preview table only from that list.
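
The idea behind this fix can be sketched as follows. This is not the Autofix patch itself: `save_all` and `fake_save` are hypothetical names, and `fake_save` stands in for `product.save()`.

```python
def save_all(items, save):
    """Try to save every item; return (saved, failures) so previews use saved ones only."""
    saved, failures = [], []
    for item in items:
        try:
            save(item)
            saved.append(item)
        except Exception as exc:  # broad on purpose in this sketch
            failures.append((item, str(exc)))
    return saved, failures


def fake_save(item):
    # Hypothetical stand-in for product.save(); rejects negative prices.
    if item["price"] < 0:
        raise ValueError("price must be positive")


saved, failures = save_all(
    [{"name": "Lamp", "price": 10.0}, {"name": "Mug", "price": -1.0}], fake_save
)
# Build the preview table from the saved list, not from the generated list.
preview_names = [item["name"] for item in saved]
```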


Or push these changes by commenting:

@cursor push 3420ec704a
Preview (3420ec704a)
diff --git a/.gitignore b/.gitignore
--- a/.gitignore
+++ b/.gitignore
@@ -1 +1,4 @@
 .DS_Store
+
+# Local env with secrets (python backend)
+backend/python/.env

diff --git a/backend/python/.env.example b/backend/python/.env.example
new file mode 100644
--- /dev/null
+++ b/backend/python/.env.example
@@ -1,0 +1,15 @@
+# Copy this file to .env and adjust values. Never commit .env.
+
+# Django
+SECRET_KEY=change-me-to-a-long-random-string-for-local-dev
+
+# MongoDB connection (Django / mongoengine). Keep timeouts so runserver does not hang when Docker is down.
+MONGO_URI=mongodb://YOUR_USER:YOUR_PASSWORD@localhost:27019/YOUR_DB?authSource=admin&serverSelectionTimeoutMS=2000&connectTimeoutMS=2000
+
+# Database name passed to mongoengine.connect (path in MONGO_URI should match, e.g. /YOUR_DB).
+MONGO_DB_NAME=products_db
+
+# Used by the official MongoDB Docker image on first init only (new empty volume).
+# If you change these after data exists, you must reset the volume or change the user inside Mongo.
+MONGO_INITDB_ROOT_USERNAME=root
+MONGO_INITDB_ROOT_PASSWORD=your-secure-password

diff --git a/backend/python/.gitignore b/backend/python/.gitignore
--- a/backend/python/.gitignore
+++ b/backend/python/.gitignore
@@ -7,3 +7,4 @@
 *.log
 
 *.sqlite3
+.ipynb_checkpoints/

diff --git a/backend/python/.streamlit/config.toml b/backend/python/.streamlit/config.toml
new file mode 100644
--- /dev/null
+++ b/backend/python/.streamlit/config.toml
@@ -1,0 +1,2 @@
+[server]
+fileWatcherType = "none"
\ No newline at end of file

diff --git a/backend/python/README.md b/backend/python/README.md
--- a/backend/python/README.md
+++ b/backend/python/README.md
@@ -140,8 +140,10 @@
 
 ## Setting Up the Project
 
-### Create a Python Virtual Environment
+**Do these steps in order:** 1) create the virtual environment, 2) activate it, 3) install dependencies. Skipping step 1–2 can cause `pip install` to use the wrong Python or fail (e.g. Django 6 needs Python 3.12+).
 
+### 1. Create a Python Virtual Environment
+
 The python virtual env should be created inside the `backend/python` directory. Run the following commands:
 
 ```bash
@@ -149,7 +151,7 @@
 python3 -m venv venv
 ```
 
-To activate the virtual environment:
+### 2. Activate the Virtual Environment
 
 ```bash
 # macOS/Linux
@@ -161,15 +163,17 @@
 .\venv\Scripts\activate
 ```
 
-### Install Python Dependencies
+### 3. Install Python Dependencies
 
+Only after the venv is activated, run:
+
 ```bash
 pip install --upgrade pip
-pip3 install -r requirements.txt
+pip install -r requirements.txt
 ```
 
 By default, requirements.txt includes:
-- Django 6.0.2
+- Django (5.2.x or 6.x depending on your Python version; 6.x requires Python 3.12+)
 - pymongo 4.16.0 (MongoDB driver)
 
 Check your .gitignore
@@ -220,7 +224,7 @@
 
 Verification of Installation
 
 - Python: python3 --version (should be 3.12+)
-- Django: python -c "import django; print(django.get_version())" (should be 6.0.2)
+- Django: python -c "import django; print(django.get_version())" (e.g. 5.2.x or 6.x)
 - Docker: docker --version
 - Docker Compose: docker compose version
diff --git a/backend/python/dashboard.py b/backend/python/dashboard.py
new file mode 100644
--- /dev/null
+++ b/backend/python/dashboard.py
@@ -1,0 +1,786 @@
+# dashboard.py
+# Streamlit dashboard for Product Inventory
+# Run with: streamlit run dashboard.py
+
+import streamlit as st
+import pandas as pd
+import numpy as np
+import sys
+import os
+from pathlib import Path
+from dotenv import load_dotenv
+
+load_dotenv()
+
+# Add project root to path so we can import our models
+sys.path.insert(0, str(Path(__file__).resolve().parent))
+
+# Connect to MongoDB
+from db_connection import connect_to_mongodb
+connect_to_mongodb()
+
+# Import models directly
+from products.models import Product
+from products.category_model import ProductCategory
+
+# ─── SEMANTIC SEARCH SETUP ─────────────────────────────────────
+from sentence_transformers import SentenceTransformer
+
+@st.cache_resource
+def load_model(model_name="all-MiniLM-L6-v2"):
+    """Load sentence transformer model — cached so it only loads once"""
+    return SentenceTransformer(model_name)
+
+@st.cache_data
+def load_product_embeddings(model_name="all-MiniLM-L6-v2"):
+    """Fetch all products and generate embeddings — cached"""
+    _model = load_model(model_name)
+    all_products = list(Product.objects.all())
+    texts, names, descriptions, categories, brands, prices, stocks = [], [], [], [], [], [], []
+    for p in all_products:
+        try:
+            cat = p.category.title if p.category else ""
+        except:
+            cat = ""
+        text = f"{p.name}. Category: {cat}. {p.description}"
+        texts.append(text)
+        names.append(p.name)
+        descriptions.append(p.description)
+        categories.append(cat)
+        brands.append(p.brand)
+        prices.append(float(str(p.price)))
+        stocks.append(p.quantity_in_warehouse)
+    embeddings = _model.encode(texts)
+    return names, descriptions, categories, brands, prices, stocks, embeddings
+
+def cosine_similarity(vec1, vec2):
+    dot = np.dot(vec1, vec2)
+    mag = np.linalg.norm(vec1) * np.linalg.norm(vec2)
+    return float(dot / mag) if mag != 0 else 0.0
+
+def semantic_search(query, embeddings, names, top_k=5, model_name="all-MiniLM-L6-v2"):
+    """Search products by semantic similarity"""
+    _model = load_model(model_name)
+    query_vec = _model.encode([query])[0]
+    scores = [(names[i], cosine_similarity(query_vec, embeddings[i])) for i in range(len(names))]
+    scores.sort(key=lambda x: x[1], reverse=True)
+    scores = [s for s in scores if s[1] >= 0.35]
+    return scores[:top_k]

+# ─── PAGE CONFIG ───────────────────────────────────────────────
+st.set_page_config(
+    page_title="Product Inventory Dashboard",
+    page_icon="📦",
+    layout="wide"
+)

+# ─── TITLE ─────────────────────────────────────────────────────
+st.title("📦 Product Inventory Dashboard")
+st.markdown("---")
+
+# ─── HELPER FUNCTIONS ──────────────────────────────────────────
+
+def get_all_products():
+    """Fetch all products from MongoDB"""
+    products = Product.objects.all()
+    data = []
+    for p in products:
+        try:
+            category_title = p.category.title if p.category else "Unknown"
+        except Exception:
+            category_title = "Unknown"
+        data.append({
+            "ID": str(p.id),
+            "Name": p.name,
+            "Category": category_title,
+            "Brand": p.brand,
+            "Price (₹)": float(str(p.price)),
+            "Stock": p.quantity_in_warehouse,
+            "Description": p.description,
+        })
+    return pd.DataFrame(data)
+
+def get_all_categories():
+    """Fetch all categories from MongoDB"""
+    return list(ProductCategory.objects.all())

+# ─── SIDEBAR ───────────────────────────────────────────────────
+st.sidebar.title("🔧 Controls")
+st.sidebar.markdown("---")
+
+if st.sidebar.button("🔄 Refresh Data"):
+    st.cache_data.clear()
+    st.rerun()

+st.sidebar.subheader("Filter Products")
+categories = get_all_categories()
+category_options = ["All"] + [c.title for c in categories]
+selected_category = st.sidebar.selectbox("Category", category_options)
+
+brand_filter = st.sidebar.text_input("Brand contains")
+
+st.sidebar.subheader("Price Range")
+min_price = st.sidebar.number_input("Min Price", min_value=0, value=0)
+max_price = st.sidebar.number_input("Max Price", min_value=0, value=1000000)
+
+# ─── MAIN CONTENT ──────────────────────────────────────────────
+
+df = get_all_products()
+
+if not df.empty:
+    if selected_category != "All":
+        df = df[df["Category"] == selected_category]
+    if brand_filter:
+        df = df[df["Brand"].str.contains(brand_filter, case=False)]
+    df = df[(df["Price (₹)"] >= min_price) & (df["Price (₹)"] <= max_price)]

+# ─── METRICS ───────────────────────────────────────────────────
+col1, col2, col3, col4 = st.columns(4)
+
+with col1:
+    st.metric("Total Products", len(df))
+with col2:
+    total_stock = df["Stock"].sum() if not df.empty else 0
+    st.metric("Total Stock", total_stock)
+with col3:
+    avg_price = round(df["Price (₹)"].mean(), 2) if not df.empty else 0
+    st.metric("Avg Price", f"₹{avg_price}")
+with col4:
+    low_stock = len(df[df["Stock"] < 10]) if not df.empty else 0
+    st.metric("Low Stock Items", low_stock)

+st.markdown("---")
+
+# ─── PRODUCTS TABLE ────────────────────────────────────────────
+st.subheader("📋 Product Inventory")
+
+if df.empty:
+    st.warning("No products found. Add some products first!")
+else:
+    st.dataframe(
+        df.drop(columns=["ID"]),
+        use_container_width=True,
+        hide_index=True
+    )

+st.markdown("---")
+
+# ─── SEMANTIC SEARCH ───────────────────────────────────────────
+st.subheader("🧠 Semantic Search")
+st.markdown("Search products by meaning — not just keywords!")
+
+semantic_query = st.text_input(
+    "Search query",
+    placeholder="e.g. 'wireless audio device', 'something to wear', 'mobile phone'"
+)
+
+if semantic_query:
+    with st.spinner("🔍 Searching..."):
+        names, descriptions, categories_list, brands, prices, stocks, embeddings = load_product_embeddings()
+        results = semantic_search(semantic_query, embeddings, names, top_k=100)
+    st.markdown(f"Results for: '{semantic_query}'")
+    result_data = []
+    for rank, (name, score) in enumerate(results, 1):
+        idx = names.index(name)
+        result_data.append({
+            "Rank": rank,
+            "Name": name,
+            "Category": categories_list[idx],
+            "Brand": brands[idx],
+            "Price (₹)": prices[idx],
+            "Stock": stocks[idx],
+            "Similarity Score": round(score, 4)
+        })
+    result_df = pd.DataFrame(result_data)
+    st.dataframe(result_df, use_container_width=True, hide_index=True)
+    st.caption("💡 Similarity Score: closer to 1.0 = more relevant to your query")

+st.markdown("---")
+
+# ─── FIND SIMILAR PRODUCTS ─────────────────────────────────────
+st.subheader("🔁 Find Similar Products")
+st.markdown("Select any product and find the most similar items in your inventory!")
+
+all_product_names = df["Name"].tolist() if not df.empty else []
+
+if all_product_names:
+    selected_product_name = st.selectbox(
+        "Select a product",
+        all_product_names,
+        key="similar_product_select"
+    )
+    if st.button("🔍 Find Similar Products"):
+        with st.spinner("Finding similar products..."):
+            names, descriptions, categories_list, brands, prices, stocks, embeddings = load_product_embeddings()
+            if selected_product_name in names:
+                idx = names.index(selected_product_name)
+                selected_vec = embeddings[idx]
+                similarities = []
+                for i, name in enumerate(names):
+                    if name != selected_product_name:
+                        score = cosine_similarity(selected_vec, embeddings[i])
+                        similarities.append((name, score, i))
+                similarities.sort(key=lambda x: x[1], reverse=True)
+                similarities = [s for s in similarities if s[1] >= 0.5]
+                top_similar = similarities[:50]
+                st.markdown(f"**Products similar to:** *{selected_product_name}*")
+                similar_data = []
+                for rank, (name, score, i) in enumerate(top_similar, 1):
+                    similar_data.append({
+                        "Rank": rank,
+                        "Name": name,
+                        "Category": categories_list[i],
+                        "Brand": brands[i],
+                        "Price (₹)": prices[i],
+                        "Stock": stocks[i],
+                        "Similarity Score": round(score, 4)
+                    })
+                st.dataframe(
+                    pd.DataFrame(similar_data),
+                    use_container_width=True,
+                    hide_index=True
+                )
+                st.caption("💡 Higher similarity score = more similar to your selected product")
+            else:
+                st.warning("Product not found in embeddings. Try refreshing!")
+else:
+    st.info("No products available.")

+st.markdown("---")
+
+# ─── CREATE CATEGORY ───────────────────────────────────────────
+st.subheader("🗂️ Add New Category")
+
+with st.form("add_category_form"):
+    col1, col2 = st.columns(2)
+    with col1:
+        new_category_title = st.text_input("Category Name *")
+    with col2:
+        new_category_description = st.text_area("Description")
+    category_submitted = st.form_submit_button("Add Category")
+    if category_submitted:
+        if not new_category_title:
+            st.error("Category name is required!")
+        else:
+            try:
+                from datetime import datetime, timezone
+                now = datetime.now(timezone.utc)
+                existing = ProductCategory.objects(title=new_category_title).first()
+                if existing:
+                    st.error(f"❌ Category '{new_category_title}' already exists!")
+                else:
+                    category = ProductCategory(
+                        title=new_category_title,
+                        description=new_category_description,
+                        created_at=now,
+                        updated_at=now
+                    )
+                    category.save()
+                    st.success(f"✅ Category '{new_category_title}' added successfully!")
+                    st.rerun()
+            except Exception as e:
+                st.error(f"Error adding category: {e}")

+st.markdown("---")
+
+# ─── ADD PRODUCT FORM ──────────────────────────────────────────
+st.subheader("➕ Add New Product")
+
+with st.form("add_product_form"):
+    col1, col2 = st.columns(2)
+    with col1:
+        new_name = st.text_input("Product Name *")
+        new_brand = st.text_input("Brand *")
+        new_price = st.number_input("Price *", min_value=0.01, value=100.0)
+        new_quantity = st.number_input("Quantity", min_value=0, value=0)
+    with col2:
+        new_description = st.text_area("Description")
+        category_titles = [c.title for c in categories]
+        new_category = st.selectbox("Category *", category_titles)
+    submitted = st.form_submit_button("Add Product")
+    if submitted:
+        if not new_name or not new_brand:
+            st.error("Name and Brand are required!")
+        else:
+            try:
+                category_obj = ProductCategory.objects.get(title=new_category)
+                from datetime import datetime, timezone
+                now = datetime.now(timezone.utc)
+                product = Product(
+                    name=new_name,
+                    brand=new_brand,
+                    price=new_price,
+                    quantity_in_warehouse=new_quantity,
+                    description=new_description,
+                    category=category_obj,
+                    created_at=now,
+                    updated_at=now
+                )
+                product.save()
+                st.success(f"✅ Product '{new_name}' added successfully!")
+                st.rerun()
+            except Exception as e:
+                st.error(f"Error adding product: {e}")

+st.markdown("---")
+
+# ─── REMOVE PRODUCT ────────────────────────────────────────────
+st.subheader("🗑️ Remove Product")
+
+if df.empty:
+    st.info("No products to remove.")
+else:
+    product_options = df["Name"] + " (" + df["ID"] + ")"
+    selected_product = st.selectbox("Select product to remove", product_options)
+    if st.button("🗑️ Delete Selected Product", type="primary"):
+        product_id = selected_product.split("(")[-1].replace(")", "").strip()
+        try:
+            product = Product.objects.get(id=product_id)
+            product_name = product.name
+            product.delete()
+            st.success(f"✅ Product '{product_name}' deleted successfully!")
+            st.rerun()
+        except Exception as e:
+            st.error(f"Error deleting product: {e}")

+st.markdown("---")
+
+# ─── SCENARIO SELECTOR ─────────────────────────────────────────
+st.subheader("🎯 AI Scenario Selector")
+st.markdown("Choose a warehouse scenario and let AI populate the database with relevant products!")
+
+SCENARIOS = {
+    "🎄 Holiday Rush": {
+        "description": "Festive season — high demand for gifts, decorations, and party items",
+        "stock_min": 300,
+        "stock_max": 500,
+        "prompt_hint": "festive holiday season products like gifts, decorations, party supplies, winter clothing, and electronics"
+    },
+    "☀️ Summer Collection": {
+        "description": "Summer season — outdoor, cooling, and seasonal products",
+        "stock_min": 200,
+        "stock_max": 400,
+        "prompt_hint": "summer season products like light clothing, outdoor gear, cooling appliances, summer foods and beverages"
+    },
+    "📚 Back to School": {
+        "description": "School season — stationery, books, electronics, and essentials",
+        "stock_min": 150,
+        "stock_max": 350,
+        "prompt_hint": "back to school products like books, stationery, bags, electronics like laptops and calculators, and school essentials"
+    },
+    "🎆 New Year Sale": {
+        "description": "New Year — discounted items, party supplies, and fresh start products",
+        "stock_min": 250,
+        "stock_max": 450,
+        "prompt_hint": "new year sale products like party supplies, fitness equipment, kitchen appliances, and lifestyle products"
+    },
+}

+selected_scenario = st.selectbox("Select a Scenario", list(SCENARIOS.keys()))
+scenario_info = SCENARIOS[selected_scenario]
+st.info(f"📌 {scenario_info['description']}")
+st.markdown(f"Stock levels will be: {scenario_info['stock_min']} – {scenario_info['stock_max']} units")
+num_products = st.slider("How many products to generate?", min_value=5, max_value=30, value=10)
+
+if st.button("🚀 Generate & Save Products", type="primary"):

  • try:
  •    from groq import Groq
    
  •    from pydantic import BaseModel, field_validator
    
  •    from typing import Optional
    
  •    from datetime import datetime, timezone
    
  •    import json
    
  •    client = Groq(api_key=os.getenv("GROQ_API_KEY"))
    
  •    all_categories = get_all_categories()
    
  •    category_map = {cat.title: cat for cat in all_categories}
    
  •    category_names = list(category_map.keys())
    
  •    prompt = f"""
    

+Generate exactly {num_products} realistic warehouse products for a "{selected_scenario.replace('🎄 ', '').replace('☀️ ', '').replace('📚 ', '').replace('🎆 ', '')}" scenario as a JSON array.
+Focus on: {scenario_info['prompt_hint']}
+Each product must belong to one of these categories only: {category_names}
+Each product must have exactly these fields:
+- name: string
+- description: string (1-2 sentences)
+- category: string (must be exactly one of: {category_names})
+- price: float (min 0.01)
+- brand: string
+- quantity_in_warehouse: integer (between {scenario_info['stock_min']} and {scenario_info['stock_max']})
+Rules: Return ONLY a valid JSON array, no explanation, no markdown, no extra text
+"""
+
+        with st.spinner("🤖 AI is generating products..."):
+            response = client.chat.completions.create(
+                model="llama-3.3-70b-versatile",
+                messages=[{"role": "user", "content": prompt}],
+                temperature=0.7
+            )
+        raw_output = response.choices[0].message.content
+        cleaned = raw_output.strip()
+        if cleaned.startswith("```json"):
+            cleaned = cleaned[7:]
+        if cleaned.startswith("```"):
+            cleaned = cleaned[3:]
+        if cleaned.endswith("```"):
+            cleaned = cleaned[:-3]
+        cleaned = cleaned.strip()
+        products_data = json.loads(cleaned)
+        class ProductSchema(BaseModel):
+            name: str
+            description: Optional[str] = ""
+            category: str
+            price: float
+            brand: str
+            quantity_in_warehouse: int
+            @field_validator('price')
+            def price_must_be_positive(cls, v):
+                if v <= 0:
+                    raise ValueError("Price must be greater than 0")
+                return round(v, 2)
+            @field_validator('quantity_in_warehouse')
+            def quantity_must_be_non_negative(cls, v):
+                if v < 0:
+                    raise ValueError("Quantity cannot be negative")
+                return v
+            @field_validator('category')
+            def category_must_exist(cls, v):
+                if v not in category_names:
+                    raise ValueError(f"Invalid category: {v}")
+                return v
+        valid_products = []
+        invalid_count = 0
+        for product in products_data:
+            try:
+                validated = ProductSchema(**product)
+                valid_products.append(validated)
+            except Exception:
+                invalid_count += 1
+        saved = 0
+        saved_products = []
+        now = datetime.now(timezone.utc)
+        for vp in valid_products:
+            try:
+                category_obj = category_map.get(vp.category)
+                if not category_obj:
+                    continue
+                Product(
+                    name=vp.name,
+                    description=vp.description,
+                    category=category_obj,
+                    price=vp.price,
+                    brand=vp.brand,
+                    quantity_in_warehouse=vp.quantity_in_warehouse,
+                    created_at=now,
+                    updated_at=now
+                ).save()
+                saved += 1
+                saved_products.append(vp)
+            except Exception:
+                invalid_count += 1
+        st.success(f"🎉 Successfully saved {saved} products for '{selected_scenario}' scenario!")
+        if invalid_count > 0:
+            st.warning(f"⚠️ {invalid_count} products were skipped due to validation errors.")
+        preview_data = [{"Name": vp.name, "Category": vp.category, "Brand": vp.brand,
+                         "Price (₹)": vp.price, "Stock": vp.quantity_in_warehouse}
+                        for vp in saved_products]
+        st.dataframe(pd.DataFrame(preview_data), use_container_width=True, hide_index=True)
+        st.info("🔄 Click 'Refresh Data' in the sidebar to see updated products!")
+    except Exception as e:
+        st.error(f"❌ Something went wrong: {e}")
+st.markdown("---")
+
+# ─── ASK THE EXPERT (RAG + MONGODB + LANGSMITH) ────────────────
+st.subheader("🤖 Ask the Expert")
+st.markdown("Ask anything about warranties, return policies, stock levels, or vendor info — powered by RAG + Live Database!")
+
+# ── Langsmith setup (module level) ─────────────────────────────
+os.environ["LANGCHAIN_TRACING_V2"] = "true"
+os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGSMITH_API_KEY", "")
+os.environ["LANGCHAIN_PROJECT"] = "inventory-rag-week8"
+
+from langsmith import traceable
+
+# ── RAG Setup (cached) ─────────────────────────────────────────
+@st.cache_resource
+def setup_rag():
+    """Load and chunk documents, embed and store in ChromaDB — cached"""
+    import chromadb
+    from langchain_text_splitters import RecursiveCharacterTextSplitter
+    notebooks_path = Path(__file__).resolve().parent / "notebooks"
+    files = {
+        "product_manual": "product_manual.txt",
+        "return_policy": "return_policy.txt",
+        "vendor_faq": "vendor_faq.txt"
+    }
+    raw_docs = []
+    for doc_name, filename in files.items():
+        filepath = notebooks_path / filename
+        if filepath.exists():
+            with open(filepath, "r", encoding="utf-8") as f:
+                content = f.read()
+            raw_docs.append({"content": content, "source": doc_name})
+    splitter = RecursiveCharacterTextSplitter(
+        chunk_size=500,
+        chunk_overlap=50,
+        separators=["\n\n", "\n", ".", " "]
+    )
+    all_chunks = []
+    for doc in raw_docs:
+        chunks = splitter.split_text(doc["content"])
+        for chunk in chunks:
+            all_chunks.append({"text": chunk, "source": doc["source"]})
+    rag_model = SentenceTransformer("all-MiniLM-L6-v2")
+    texts = [c["text"] for c in all_chunks]
+    sources = [c["source"] for c in all_chunks]
+    embeddings = rag_model.encode(texts).tolist()
+    chroma_client = chromadb.Client()
+    try:
+        chroma_client.delete_collection("inventory_docs")
+    except Exception:
+        pass
+    collection = chroma_client.create_collection("inventory_docs")
+    collection.add(
+        documents=texts,
+        embeddings=embeddings,
+        metadatas=[{"source": s} for s in sources],
+        ids=[f"chunk_{i}" for i in range(len(all_chunks))]
+    )
+    return collection, rag_model

+# ── Traceable functions ─────────────────────────────────────────
+@traceable(name="retrieve_chunks")
+def retrieve_chunks(query, collection, rag_model, top_k=3):
+    """Retrieve relevant chunks from ChromaDB"""
+    query_vec = rag_model.encode([query]).tolist()
+    results = collection.query(query_embeddings=query_vec, n_results=top_k)
+    chunks = []
+    for i in range(len(results["documents"][0])):
+        chunks.append({
+            "text": results["documents"][0][i],
+            "source": results["metadatas"][0][i]["source"]
+        })
+    return chunks

+@traceable(name="mongodb_product_lookup")
+def lookup_products_in_db(question):
+    """Search MongoDB for products mentioned in the question"""
+    all_products = list(Product.objects.all())
+    matched = []
+    q_lower = question.lower()
+    q_words = [w for w in q_lower.split() if len(w) > 2]  # keep only words of 3+ characters
+    for p in all_products:
+        product_name_lower = p.name.lower()
+        # Match if ANY word from question appears in product name
+        if any(word in product_name_lower for word in q_words):
+            try:
+                cat = p.category.title if p.category else "Unknown"
+            except Exception:
+                cat = "Unknown"
+            matched.append({
+                "name": p.name,
+                "brand": p.brand,
+                "category": cat,
+                "price": float(str(p.price)),
+                "stock": p.quantity_in_warehouse
+            })
+    return matched[:5]

+@traceable(name="ask_expert", run_type="chain")
+def ask_expert(question, collection, rag_model):
+    """
+    Combined RAG + MongoDB pipeline with Langsmith tracing:
+    1. Retrieve relevant doc chunks
+    2. Lookup matching products in MongoDB
+    3. Build combined prompt
+    4. Send to Groq
+    5. Return grounded answer
+    """
+    from groq import Groq
+    # Step 1 — Retrieve doc chunks
+    chunks = retrieve_chunks(question, collection, rag_model, top_k=3)
+    doc_context = "\n\n".join([f"[{c['source']}]: {c['text']}" for c in chunks])
+    # Step 2 — Lookup MongoDB
+    matched_products = lookup_products_in_db(question)
+    if matched_products:
+        db_context = "Live inventory data from database:\n"
+        for p in matched_products:
+            status = "IN STOCK" if p["stock"] > 0 else "OUT OF STOCK"
+            db_context += (
+                f"- {p['name']} ({p['brand']}) | "
+                f"Category: {p['category']} | "
+                f"Price: ₹{p['price']} | "
+                f"Stock: {p['stock']} units ({status})\n"
+            )
+    else:
+        db_context = "No matching products found in live database for this query."
+    # Step 3 — Build combined prompt
+    prompt = f"""You are an expert assistant for a product inventory system.
+You have access to two sources of information:
+1. Documentation (warranties, return policies, vendor FAQ)
+2. Live database (current stock levels, prices)
+
+Answer the user's question using BOTH sources below.
+If stock info is available, always mention current stock levels.
+If warranty or policy info is available, always include it.
+Only use information from the sources provided — do not make anything up.
+If the answer is not in either source, say "I don't have information about this."
+
+--- DOCUMENTATION ---
+{doc_context}
+
+--- LIVE DATABASE ---
+{db_context}
+
+Question: {question}
+
+Answer:"""
+

+    # Step 4 — Send to Groq

... diff truncated: showing 800 of 7649 lines


</details>




<sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 0f356d3421f782bed791861351ea32776d2052c9. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup>

Comment thread backend/python/dashboard.py Outdated
preview_data = [{"Name": vp.name, "Category": vp.category, "Brand": vp.brand,
"Price (₹)": vp.price, "Stock": vp.quantity_in_warehouse}
for vp in valid_products]
st.dataframe(pd.DataFrame(preview_data), use_container_width=True, hide_index=True)


Preview table includes products that failed to save

Low Severity

After saving AI-generated products, the preview table is built from `valid_products` (all Pydantic-validated entries), not just the successfully saved ones. Products that fail at the `category_map.get()` check (silently skipped via `continue`) or raise an exception during `.save()` still appear in the preview. This makes the table inconsistent with the "saved products" success message above it, misleading users about what actually reached the database.
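
One way to address this is to collect only the entries whose save call returned without error and build the preview from that list. The sketch below is a minimal illustration, not the PR's code: `Item`, `save_and_collect`, `save_fn`, and `flaky_save` are hypothetical names standing in for the Pydantic models and the `Product(...).save()` call, and it also counts the unknown-category case as skipped, which is an assumption about the desired behavior.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Item:
    """Hypothetical stand-in for a validated product."""
    name: str
    category: str


def save_and_collect(
    items: List[Item],
    category_map: Dict[str, object],
    save_fn: Callable[[Item, object], None],
) -> Tuple[List[Item], int]:
    """Return (items that were actually saved, number of items skipped)."""
    saved: List[Item] = []
    skipped = 0
    for item in items:
        category_obj = category_map.get(item.category)
        if category_obj is None:
            skipped += 1  # unknown category: count it instead of skipping silently
            continue
        try:
            save_fn(item, category_obj)
        except Exception:
            skipped += 1  # save failed: keep the item out of the preview
            continue
        saved.append(item)  # reached only after a successful save
    return saved, skipped


def flaky_save(item: Item, category_obj: object) -> None:
    """Hypothetical save function that fails for one specific item."""
    if item.name == "broken":
        raise RuntimeError("simulated database error")


items = [Item("fan", "Appliances"), Item("broken", "Appliances"), Item("pen", "Missing")]
saved, skipped = save_and_collect(items, {"Appliances": object()}, flaky_save)

# The preview is built from `saved`, so it matches the success count.
preview = [{"Name": i.name, "Category": i.category} for i in saved]
```

With this shape, the success message, the skipped-count warning, and the preview table all derive from the same two return values, so they cannot drift apart.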


2 participants