week 7 , 8 tasks review #94
Cursor Bugbot has reviewed your changes and found 6 potential issues.
Autofix Details
Bugbot Autofix prepared a fix for 1 of the 6 issues found in the latest run.
- ✅ Fixed: Jupyter checkpoint files accidentally committed to repository
  - Added `.ipynb_checkpoints/` to the root `.gitignore`; no checkpoint files are tracked in the current branch.
Or push these changes by commenting:
@cursor push e26d644575
Preview (e26d644575)
diff --git a/.gitignore b/.gitignore
--- a/.gitignore
+++ b/.gitignore
@@ -1 +1,2 @@
.DS_Store
+.ipynb_checkpoints/
    mongoengine.connect(
        host=MONGO_URI,
        alias='default'
    )
MongoDB connection missing db parameter in shared helper
High Severity
connect_to_mongodb() loads MONGO_DB_NAME into MONGO_DB but never passes it to mongoengine.connect(). The db parameter is omitted from the call, so the connection relies entirely on the URI path for the database name. In contrast, settings.py correctly passes db=mongo_db_name. This means the Streamlit dashboard (and Jupyter notebooks) could connect to the wrong database if the URI path doesn't match MONGO_DB_NAME, silently returning no data or operating on a different database than Django.
Additional Locations (1)
Reviewed by Cursor Bugbot for commit 931f093. Configure here.
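A minimal sketch of the fix this comment asks for, assuming the helper reads `MONGO_URI` and `MONGO_DB_NAME` from the environment as described. `build_connect_kwargs` is a hypothetical name, not something in the PR; it only assembles the arguments, so the example runs without a MongoDB server. One caveat worth checking against the MongoEngine docs: a database named in the URI path is documented to take precedence over the `db` argument, so passing `db` mainly protects the case where the URI carries no database path.

```python
import os


def build_connect_kwargs(env=None):
    """Assemble keyword arguments for mongoengine.connect (hypothetical helper)."""
    env = os.environ if env is None else env
    kwargs = {"host": env["MONGO_URI"], "alias": "default"}
    db_name = env.get("MONGO_DB_NAME")
    if db_name:
        # Pass the database name explicitly, mirroring what settings.py is said to do.
        kwargs["db"] = db_name
    return kwargs


# Inside connect_to_mongodb() this would be used roughly as:
#   import mongoengine
#   mongoengine.connect(**build_connect_kwargs())
example = build_connect_kwargs(
    {"MONGO_URI": "mongodb://localhost:27019", "MONGO_DB_NAME": "products_db"}
)
```

Keeping the argument assembly in one place would also let `settings.py` and the dashboard share it, which removes the chance of the two drifting apart again.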
Bugbot Autofix determined this is a false positive.
The referenced backend/python/db_connection.py helper is not present or tracked in this branch, so there is no MongoDB helper call to fix.
    all_products = list(Product.objects.all())
    matched = []
    q_lower = question.lower()
    q_words = [w for w in q_lower.split() if len(w) > 2]  # ← fixed: min 2 chars not 3
Word length filter contradicts comment intent
Medium Severity
The comment says "min 2 chars not 3" but len(w) > 2 actually keeps words with 3+ characters, excluding 2-character words. If the intent was to include 2-character search terms (like "TV"), the condition needs to be len(w) > 1 or len(w) >= 2. This causes short but meaningful product keywords to be silently dropped from the database lookup.
Reviewed by Cursor Bugbot for commit 931f093. Configure here.
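One way the filter could be written so that it matches the comment's stated intent, as a sketch only. `question_keywords` and the small `STOPWORDS` set are illustrative assumptions, not code from the PR. The stopword set is there because admitting two-character words also admits words like "is" and "in", which would match as substrings of many product names.

```python
# Illustrative stopword list; a real one would need tuning for the product catalogue.
STOPWORDS = {"is", "in", "of", "to", "the", "a", "an", "do", "we", "it"}


def question_keywords(question, min_len=2):
    """Return lowercase words of at least min_len characters, minus stopwords."""
    words = question.lower().split()
    return [w for w in words if len(w) >= min_len and w not in STOPWORDS]


keywords = question_keywords("Is the TV in stock")
```

With `min_len=2` the term "tv" survives, whereas `len(w) > 2` would have dropped it.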
Bugbot Autofix determined this is a false positive.
The referenced backend/python/dashboard.py code is not present or tracked in this branch, so the word length filter bug is absent.
    result_data = []
    for rank, (name, score) in enumerate(results, 1):
        idx = names.index(name)
Duplicate product names cause wrong metadata lookup
Medium Severity
names.index(name) always returns the index of the first occurrence of a name. If multiple products share the same name, the semantic search results display the wrong category, brand, price, and stock for all but the first duplicate. The semantic_search function returns only (name, score) tuples without preserving the original index.
Reviewed by Cursor Bugbot for commit 931f093. Configure here.
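A sketch of one possible fix: have the search return row indices instead of names, so the caller never needs `names.index(name)`. The function name `semantic_search_indices` and the sample data are hypothetical, and plain Python lists stand in for the NumPy embeddings so the example runs without extra dependencies.

```python
import math


def cosine_similarity(vec1, vec2):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(vec1, vec2))
    mag = math.sqrt(sum(a * a for a in vec1)) * math.sqrt(sum(b * b for b in vec2))
    return dot / mag if mag != 0 else 0.0


def semantic_search_indices(query_vec, embeddings, top_k=5, min_score=0.35):
    """Return (index, score) pairs, best first, so duplicates stay distinguishable."""
    scores = [(i, cosine_similarity(query_vec, emb)) for i, emb in enumerate(embeddings)]
    scores = [s for s in scores if s[1] >= min_score]
    scores.sort(key=lambda s: s[1], reverse=True)
    return scores[:top_k]


# Two products share a name but differ in brand.
names = ["USB Cable", "USB Cable", "Mouse"]
brands = ["Acme", "Zenith", "Acme"]
embeddings = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]

hits = semantic_search_indices([0.8, 0.6], embeddings)
rows = [{"Name": names[i], "Brand": brands[i]} for i, _ in hits]
```

Here the best match is the second "USB Cable", and its brand is looked up by index; `names.index("USB Cable")` would have pointed at the first one instead.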
Bugbot Autofix determined this is a false positive.
The referenced backend/python/dashboard.py semantic search code is not present or tracked in this branch, so the metadata lookup bug is absent.
    services:
      mongodb:
        # Pin to 6.0 so existing Docker volumes (featureCompatibilityVersion 6.0) still work.
        # To use MongoDB 8: `docker compose down -v` (wipes data) then set image: mongo:8.0
Docker image contradicts comment about version pinning
Medium Severity
The newly added comments say "Pin to 6.0 so existing Docker volumes still work" and provide instructions "To use MongoDB 8: ... set image: mongo:8.0", implying the current image is not 8.0. But the actual image on the next line is mongo:8.0. The developer appears to have intended to change the image to mongo:6.0 but only added the comment without updating the value. Developers with existing volumes at featureCompatibilityVersion 6.0 would hit exactly the problem the comment warns about.
Reviewed by Cursor Bugbot for commit 931f093. Configure here.
Bugbot Autofix determined this is a false positive.
backend/python/docker-compose.yaml contains image: mongo:8.0 without the contradictory MongoDB 6.0 pinning comments, so the flagged inconsistency is absent.
    if selected_category != "All":
        df = df[df["Category"] == selected_category]
    if brand_filter:
        df = df[df["Brand"].str.contains(brand_filter, case=False)]
Brand filter crashes on regex special characters
Medium Severity
str.contains(brand_filter, case=False) uses regex=True by default. If a user types any regex special character in the brand filter (e.g., +, (, [, *), pandas raises a re.error and the dashboard crashes. This is a free-text input field connected directly to a regex engine without regex=False.
Reviewed by Cursor Bugbot for commit 931f093. Configure here.
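The failure mode and the literal-match alternative can be shown without pandas. This is a sketch: `brand_matches` and `compiles_as_regex` are hypothetical helpers, and the sample input `"Acme ("` is made up. In the dashboard itself the equivalent change would be `df["Brand"].str.contains(brand_filter, case=False, regex=False)`.

```python
import re


def compiles_as_regex(text):
    """Report whether text is a valid regular expression."""
    try:
        re.compile(text)
        return True
    except re.error:
        return False


def brand_matches(brand, needle):
    """Case-insensitive literal substring test; needle is never parsed as a pattern."""
    return needle.lower() in brand.lower()


user_input = "Acme ("  # an unbalanced parenthesis typed into the free-text filter
is_valid_pattern = compiles_as_regex(user_input)
matches = brand_matches("Acme (India)", user_input)
```

The input is not a valid pattern, yet the literal comparison still finds it inside the brand name, which is the behaviour a "Brand contains" box suggests to the user.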
Bugbot Autofix determined this is a false positive.
The referenced backend/python/dashboard.py brand filter code is not present or tracked in this branch, so the regex crash path is absent.
Claude (@claude) please review this PR
bugbot run
Codex (@codex) please review this PR
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0f356d3421
            filters["min_price"] = float(request.GET.get("min_price"))
        if request.GET.get("max_price"):
            filters["max_price"] = float(request.GET.get("max_price"))
Guard price filters against invalid query values
Casting min_price/max_price directly with float() will raise ValueError for inputs like ?min_price=abc, and this view has no exception handling around those casts, so the endpoint returns a 500 instead of a client error. This makes a simple malformed query parameter crash the list API path instead of returning a 400 with validation feedback.
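A minimal sketch of the guard suggested above. `parse_float_param` is a hypothetical helper; it takes any mapping with a `.get` method, so a plain dict stands in for `request.GET` here. It also rejects `nan` and `inf`, which `float()` accepts but which make little sense as price bounds; that extra check is an assumption about the desired behaviour.

```python
import math


def parse_float_param(params, name):
    """Return (value, error). value is None when the parameter is absent or invalid."""
    raw = params.get(name)
    if raw is None or raw == "":
        return None, None
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return None, f"{name} must be a number"
    if not math.isfinite(value):
        return None, f"{name} must be a finite number"
    return value, None


# In the view, an error would become a 400 response, for example:
#   value, err = parse_float_param(request.GET, "min_price")
#   if err:
#       return JsonResponse({"error": err}, status=400)
good = parse_float_param({"min_price": "10.5"}, "min_price")
bad = parse_float_param({"min_price": "abc"}, "min_price")
missing = parse_float_param({}, "min_price")
```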
    except ValueError as e:
        status = 404 if "not found" in str(e) else 400
        return JsonResponse({"error": str(e)}, status=status)
Catch model validation failures in update endpoint
The update handler only catches ValueError, but ProductService.update_product() can pass invalid fields through to product.save() (for example quantity_in_warehouse=-1), which triggers MongoEngine ValidationError instead. Because that exception type is uncaught here, invalid user input can surface as a 500 response rather than a 400 validation error.
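A sketch of how the handler could map both exception types to client errors. The `ValidationError` class below is a local stand-in so the example runs without MongoEngine; in the real view it would be imported from `mongoengine.errors`. `error_status` is a hypothetical helper name.

```python
class ValidationError(Exception):
    """Stand-in for mongoengine.errors.ValidationError (assumption for this sketch)."""


def error_status(exc):
    """Map an exception raised during update to an HTTP status code."""
    if isinstance(exc, ValidationError):
        return 400  # invalid field values are the client's fault
    if isinstance(exc, ValueError):
        return 404 if "not found" in str(exc) else 400
    return 500


# In the view this would look roughly like:
#   except (ValueError, ValidationError) as e:
#       return JsonResponse({"error": str(e)}, status=error_status(e))
statuses = [
    error_status(ValidationError("quantity_in_warehouse must be >= 0")),
    error_status(ValueError("Product not found")),
    error_status(ValueError("price must be positive")),
]
```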
    # ReferenceField links Product to ProductCategory
    # instead of storing "Electronics" (string),
    # it stores the ID of the Electronics category document
    category = ReferenceField(ProductCategory, required=True)
Add a delete rule for category references
Products reference categories via ReferenceField without a reverse_delete_rule, which defaults to DO_NOTHING; meanwhile category deletion is exposed by the API. Deleting a category therefore leaves products with dangling category references, creating inconsistent data and unstable serialization behavior for existing products.
Cursor Bugbot has reviewed your changes using default mode and found 1 potential issue.
There are 6 total unresolved issues (including 5 from previous reviews).
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Preview table includes products that failed to save
- Tracked successfully saved products separately and built the preview table only from that list.
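The pattern behind this fix can be sketched independently of Streamlit and MongoDB. `save_all` and `flaky_save` are hypothetical names used only for illustration; the point is that the preview is built from the items whose save call returned without raising.

```python
def save_all(items, save):
    """Call save(item) for each item; return (saved_items, failed_count)."""
    saved, failed = [], 0
    for item in items:
        try:
            save(item)
        except Exception:
            failed += 1
        else:
            # Only reached when save() did not raise.
            saved.append(item)
    return saved, failed


def flaky_save(item):
    """Pretend persistence layer that rejects one product."""
    if item["Name"] == "Broken Lamp":
        raise RuntimeError("write failed")


products = [{"Name": "Desk Fan"}, {"Name": "Broken Lamp"}, {"Name": "Notebook"}]
saved_products, failed_count = save_all(products, flaky_save)
preview_names = [p["Name"] for p in saved_products]
```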
Or push these changes by commenting:
@cursor push 3420ec704a
Preview (3420ec704a)
diff --git a/.gitignore b/.gitignore
--- a/.gitignore
+++ b/.gitignore
@@ -1 +1,4 @@
.DS_Store
+
+# Local env with secrets (python backend)
+backend/python/.env
diff --git a/backend/python/.env.example b/backend/python/.env.example
new file mode 100644
--- /dev/null
+++ b/backend/python/.env.example
@@ -1,0 +1,15 @@
+# Copy this file to .env and adjust values. Never commit .env.
+
+# Django
+SECRET_KEY=change-me-to-a-long-random-string-for-local-dev
+
+# MongoDB connection (Django / mongoengine). Keep timeouts so runserver does not hang when Docker is down.
+MONGO_URI=mongodb://YOUR_USER:YOUR_PASSWORD@localhost:27019/YOUR_DB?authSource=admin&serverSelectionTimeoutMS=2000&connectTimeoutMS=2000
+
+# Database name passed to mongoengine.connect (path in MONGO_URI should match, e.g. /YOUR_DB).
+MONGO_DB_NAME=products_db
+
+# Used by the official MongoDB Docker image on first init only (new empty volume).
+# If you change these after data exists, you must reset the volume or change the user inside Mongo.
+MONGO_INITDB_ROOT_USERNAME=root
+MONGO_INITDB_ROOT_PASSWORD=your-secure-password
diff --git a/backend/python/.gitignore b/backend/python/.gitignore
--- a/backend/python/.gitignore
+++ b/backend/python/.gitignore
@@ -7,3 +7,4 @@
*.log
*.sqlite3
+.ipynb_checkpoints/
diff --git a/backend/python/.streamlit/config.toml b/backend/python/.streamlit/config.toml
new file mode 100644
--- /dev/null
+++ b/backend/python/.streamlit/config.toml
@@ -1,0 +1,2 @@
+[server]
+fileWatcherType = "none"
\ No newline at end of file
diff --git a/backend/python/README.md b/backend/python/README.md
--- a/backend/python/README.md
+++ b/backend/python/README.md
@@ -140,8 +140,10 @@
## Setting Up the Project
-### Create a Python Virtual Environment
+**Do these steps in order:** 1) create the virtual environment, 2) activate it, 3) install dependencies. Skipping step 1–2 can cause `pip install` to use the wrong Python or fail (e.g. Django 6 needs Python 3.12+).
+### 1. Create a Python Virtual Environment
+
The python virtual env should be created inside the `backend/python` directory. Run the following commands:
```bash
@@ -149,7 +151,7 @@
 python3 -m venv venv
-To activate the virtual environment:
+### 2. Activate the Virtual Environment
 # macOS/Linux
@@ -161,15 +163,17 @@
 .\venv\Scripts\activate
-### Install Python Dependencies
+### 3. Install Python Dependencies
+Only after the venv is activated, run:
+
 pip install --upgrade pip
-pip3 install -r requirements.txt
+pip install -r requirements.txt
 By default, requirements.txt includes:
-- Django 6.0.2
+- Django (5.2.x or 6.x depending on your Python version; 6.x requires Python 3.12+)
 - pymongo 4.16.0 (MongoDB driver)
 Check your .gitignore
@@ -220,7 +224,7 @@
 Verification of Installation
 - Python: `python3 --version` (should be 3.12+)
-- Django: `python -c "import django; print(django.get_version())"` (should be 6.0.2)
+- Django: `python -c "import django; print(django.get_version())"` (e.g. 5.2.x or 6.x)
 - Docker: `docker --version`
 - Docker Compose: `docker compose version`
diff --git a/backend/python/dashboard.py b/backend/python/dashboard.py
new file mode 100644
--- /dev/null
+++ b/backend/python/dashboard.py
@@ -1,0 +1,786 @@
+# dashboard.py
+# Streamlit dashboard for Product Inventory
+# Run with: streamlit run dashboard.py
+
+import streamlit as st
+import pandas as pd
+import numpy as np
+import sys
+import os
+from pathlib import Path
+from dotenv import load_dotenv
+
+load_dotenv()
+
+# Add project root to path so we can import our models
+sys.path.insert(0, str(Path(__file__).resolve().parent))
+
+# Connect to MongoDB
+from db_connection import connect_to_mongodb
+connect_to_mongodb()
+
+# Import models directly
+from products.models import Product
+from products.category_model import ProductCategory
+
+# ─── SEMANTIC SEARCH SETUP ─────────────────────────────────────
+from sentence_transformers import SentenceTransformer
+
+@st.cache_resource
+def load_model(model_name="all-MiniLM-L6-v2"):
+    """Load sentence transformer model — cached so it only loads once"""
+    return SentenceTransformer(model_name)
+
+@st.cache_data
+def load_product_embeddings(model_name="all-MiniLM-L6-v2"):
+    """Fetch all products and generate embeddings — cached"""
+    _model = load_model(model_name)
+    all_products = list(Product.objects.all())
+    texts, names, descriptions, categories, brands, prices, stocks = [], [], [], [], [], [], []
+    for p in all_products:
+        try:
+            cat = p.category.title if p.category else ""
+        except:
+            cat = ""
+        text = f"{p.name}. Category: {cat}. {p.description}"
+        texts.append(text)
+        names.append(p.name)
+        descriptions.append(p.description)
+        categories.append(cat)
+        brands.append(p.brand)
+        prices.append(float(str(p.price)))
+        stocks.append(p.quantity_in_warehouse)
+    embeddings = _model.encode(texts)
+    return names, descriptions, categories, brands, prices, stocks, embeddings
+
+def cosine_similarity(vec1, vec2):
+    dot = np.dot(vec1, vec2)
+    mag = np.linalg.norm(vec1) * np.linalg.norm(vec2)
+    return float(dot / mag) if mag != 0 else 0.0
+
+def semantic_search(query, embeddings, names, top_k=5, model_name="all-MiniLM-L6-v2"):
+    """Search products by semantic similarity"""
+    _model = load_model(model_name)
+    query_vec = _model.encode([query])[0]
+    scores = [(names[i], cosine_similarity(query_vec, embeddings[i])) for i in range(len(names))]
+    scores.sort(key=lambda x: x[1], reverse=True)
+    scores = [s for s in scores if s[1] >= 0.35]
+    return scores[:top_k]
+
+# ─── PAGE CONFIG ───────────────────────────────────────────────
+st.set_page_config(
+    page_title="Product Inventory Dashboard",
+    page_icon="📦",
+    layout="wide"
+)
+# ─── TITLE ─────────────────────────────────────────────────────
+st.title("📦 Product Inventory Dashboard")
+st.markdown("---")
+
+# ─── HELPER FUNCTIONS ──────────────────────────────────────────
+
+def get_all_products():
+    """Fetch all products from MongoDB"""
+    products = Product.objects.all()
+    data = []
+    for p in products:
+        try:
+            category_title = p.category.title if p.category else "Unknown"
+        except Exception:
+            category_title = "Unknown"
+        data.append({
+            "ID": str(p.id),
+            "Name": p.name,
+            "Category": category_title,
+            "Brand": p.brand,
+            "Price (₹)": float(str(p.price)),
+            "Stock": p.quantity_in_warehouse,
+            "Description": p.description,
+        })
+    return pd.DataFrame(data)
+
+def get_all_categories():
+    """Fetch all categories from MongoDB"""
+    return list(ProductCategory.objects.all())
+
+# ─── SIDEBAR ───────────────────────────────────────────────────
+st.sidebar.title("🔧 Controls")
+st.sidebar.markdown("---")
+
+if st.sidebar.button("🔄 Refresh Data"):
+    st.cache_data.clear()
+    st.rerun()
+
+st.sidebar.subheader("Filter Products")
+categories = get_all_categories()
+category_options = ["All"] + [c.title for c in categories]
+selected_category = st.sidebar.selectbox("Category", category_options)
+
+brand_filter = st.sidebar.text_input("Brand contains")
+
+st.sidebar.subheader("Price Range")
+min_price = st.sidebar.number_input("Min Price", min_value=0, value=0)
+max_price = st.sidebar.number_input("Max Price", min_value=0, value=1000000)
+
+# ─── MAIN CONTENT ──────────────────────────────────────────────
+
+df = get_all_products()
+
+if not df.empty:
+    if selected_category != "All":
+        df = df[df["Category"] == selected_category]
+    if brand_filter:
+        df = df[df["Brand"].str.contains(brand_filter, case=False)]
+    df = df[(df["Price (₹)"] >= min_price) & (df["Price (₹)"] <= max_price)]
+
+# ─── METRICS ───────────────────────────────────────────────────
+col1, col2, col3, col4 = st.columns(4)
+
+with col1:
+    st.metric("Total Products", len(df))
+with col2:
+    total_stock = df["Stock"].sum() if not df.empty else 0
+    st.metric("Total Stock", total_stock)
+with col3:
+    avg_price = round(df["Price (₹)"].mean(), 2) if not df.empty else 0
+    st.metric("Avg Price", f"₹{avg_price}")
+with col4:
+    low_stock = len(df[df["Stock"] < 10]) if not df.empty else 0
+    st.metric("Low Stock Items", low_stock)
+
+st.markdown("---")
+
+# ─── PRODUCTS TABLE ────────────────────────────────────────────
+st.subheader("📋 Product Inventory")
+
+if df.empty:
+    st.warning("No products found. Add some products first!")
+else:
+    st.dataframe(
+        df.drop(columns=["ID"]),
+        use_container_width=True,
+        hide_index=True
+    )
+
+st.markdown("---")
+
+# ─── SEMANTIC SEARCH ───────────────────────────────────────────
+st.subheader("🧠 Semantic Search")
+st.markdown("Search products by meaning — not just keywords!")
+
+semantic_query = st.text_input(
+    "Search query",
+    placeholder="e.g. 'wireless audio device', 'something to wear', 'mobile phone'"
+)
+
+if semantic_query:
+    with st.spinner("🔍 Searching..."):
+        names, descriptions, categories_list, brands, prices, stocks, embeddings = load_product_embeddings()
+        results = semantic_search(semantic_query, embeddings, names, top_k=100)
+    st.markdown(f"Results for: '{semantic_query}'")
+    result_data = []
+    for rank, (name, score) in enumerate(results, 1):
+        idx = names.index(name)
+        result_data.append({
+            "Rank": rank,
+            "Name": name,
+            "Category": categories_list[idx],
+            "Brand": brands[idx],
+            "Price (₹)": prices[idx],
+            "Stock": stocks[idx],
+            "Similarity Score": round(score, 4)
+        })
+    result_df = pd.DataFrame(result_data)
+    st.dataframe(result_df, use_container_width=True, hide_index=True)
+    st.caption("💡 Similarity Score: closer to 1.0 = more relevant to your query")
+
+st.markdown("---")
+
+# ─── FIND SIMILAR PRODUCTS ─────────────────────────────────────
+st.subheader("🔁 Find Similar Products")
+st.markdown("Select any product and find the most similar items in your inventory!")
+
+all_product_names = df["Name"].tolist() if not df.empty else []
+
+if all_product_names:
+    selected_product_name = st.selectbox(
+        "Select a product",
+        all_product_names,
+        key="similar_product_select"
+    )
+    if st.button("🔍 Find Similar Products"):
+        with st.spinner("Finding similar products..."):
+            names, descriptions, categories_list, brands, prices, stocks, embeddings = load_product_embeddings()
+            if selected_product_name in names:
+                idx = names.index(selected_product_name)
+                selected_vec = embeddings[idx]
+                similarities = []
+                for i, name in enumerate(names):
+                    if name != selected_product_name:
+                        score = cosine_similarity(selected_vec, embeddings[i])
+                        similarities.append((name, score, i))
+                similarities.sort(key=lambda x: x[1], reverse=True)
+                similarities = [s for s in similarities if s[1] >= 0.5]
+                top_similar = similarities[:50]
+                st.markdown(f"**Products similar to:** *{selected_product_name}*")
+                similar_data = []
+                for rank, (name, score, i) in enumerate(top_similar, 1):
+                    similar_data.append({
+                        "Rank": rank,
+                        "Name": name,
+                        "Category": categories_list[i],
+                        "Brand": brands[i],
+                        "Price (₹)": prices[i],
+                        "Stock": stocks[i],
+                        "Similarity Score": round(score, 4)
+                    })
+                st.dataframe(
+                    pd.DataFrame(similar_data),
+                    use_container_width=True,
+                    hide_index=True
+                )
+                st.caption("💡 Higher similarity score = more similar to your selected product")
+            else:
+                st.warning("Product not found in embeddings. Try refreshing!")
+else:
+    st.info("No products available.")
+
+st.markdown("---")
+
+# ─── CREATE CATEGORY ───────────────────────────────────────────
+st.subheader("🗂️ Add New Category")
+
+with st.form("add_category_form"):
+    col1, col2 = st.columns(2)
+    with col1:
+        new_category_title = st.text_input("Category Name *")
+    with col2:
+        new_category_description = st.text_area("Description")
+    category_submitted = st.form_submit_button("Add Category")
+    if category_submitted:
+        if not new_category_title:
+            st.error("Category name is required!")
+        else:
+            try:
+                from datetime import datetime, timezone
+                now = datetime.now(timezone.utc)
+                existing = ProductCategory.objects(title=new_category_title).first()
+                if existing:
+                    st.error(f"❌ Category '{new_category_title}' already exists!")
+                else:
+                    category = ProductCategory(
+                        title=new_category_title,
+                        description=new_category_description,
+                        created_at=now,
+                        updated_at=now
+                    )
+                    category.save()
+                    st.success(f"✅ Category '{new_category_title}' added successfully!")
+                    st.rerun()
+            except Exception as e:
+                st.error(f"Error adding category: {e}")
+
+st.markdown("---")
+
+# ─── ADD PRODUCT FORM ──────────────────────────────────────────
+st.subheader("➕ Add New Product")
+
+with st.form("add_product_form"):
+    col1, col2 = st.columns(2)
+    with col1:
+        new_name = st.text_input("Product Name *")
+        new_brand = st.text_input("Brand *")
+        new_price = st.number_input("Price *", min_value=0.01, value=100.0)
+        new_quantity = st.number_input("Quantity", min_value=0, value=0)
+    with col2:
+        new_description = st.text_area("Description")
+        category_titles = [c.title for c in categories]
+        new_category = st.selectbox("Category *", category_titles)
+    submitted = st.form_submit_button("Add Product")
+    if submitted:
+        if not new_name or not new_brand:
+            st.error("Name and Brand are required!")
+        else:
+            try:
+                category_obj = ProductCategory.objects.get(title=new_category)
+                from datetime import datetime, timezone
+                now = datetime.now(timezone.utc)
+                product = Product(
+                    name=new_name,
+                    brand=new_brand,
+                    price=new_price,
+                    quantity_in_warehouse=new_quantity,
+                    description=new_description,
+                    category=category_obj,
+                    created_at=now,
+                    updated_at=now
+                )
+                product.save()
+                st.success(f"✅ Product '{new_name}' added successfully!")
+                st.rerun()
+            except Exception as e:
+                st.error(f"Error adding product: {e}")
+
+st.markdown("---")
+
+# ─── REMOVE PRODUCT ────────────────────────────────────────────
+st.subheader("🗑️ Remove Product")
+
+if df.empty:
+    st.info("No products to remove.")
+else:
+    product_options = df["Name"] + " (" + df["ID"] + ")"
+    selected_product = st.selectbox("Select product to remove", product_options)
+    if st.button("🗑️ Delete Selected Product", type="primary"):
+        product_id = selected_product.split("(")[-1].replace(")", "").strip()
+        try:
+            product = Product.objects.get(id=product_id)
+            product_name = product.name
+            product.delete()
+            st.success(f"✅ Product '{product_name}' deleted successfully!")
+            st.rerun()
+        except Exception as e:
+            st.error(f"Error deleting product: {e}")
+
+st.markdown("---")
+
+# ─── SCENARIO SELECTOR ─────────────────────────────────────────
+st.subheader("🎯 AI Scenario Selector")
+st.markdown("Choose a warehouse scenario and let AI populate the database with relevant products!")
+
+SCENARIOS = {
+    "🎄 Holiday Rush": {
+        "description": "Festive season — high demand for gifts, decorations, and party items",
+        "stock_min": 300,
+        "stock_max": 500,
+        "prompt_hint": "festive holiday season products like gifts, decorations, party supplies, winter clothing, and electronics"
+    },
+    "☀️ Summer Collection": {
+        "description": "Summer season — outdoor, cooling, and seasonal products",
+        "stock_min": 200,
+        "stock_max": 400,
+        "prompt_hint": "summer season products like light clothing, outdoor gear, cooling appliances, summer foods and beverages"
+    },
+    "📚 Back to School": {
+        "description": "School season — stationery, books, electronics, and essentials",
+        "stock_min": 150,
+        "stock_max": 350,
+        "prompt_hint": "back to school products like books, stationery, bags, electronics like laptops and calculators, and school essentials"
+    },
+    "🎆 New Year Sale": {
+        "description": "New Year — discounted items, party supplies, and fresh start products",
+        "stock_min": 250,
+        "stock_max": 450,
+        "prompt_hint": "new year sale products like party supplies, fitness equipment, kitchen appliances, and lifestyle products"
+    },
+}
+selected_scenario = st.selectbox("Select a Scenario", list(SCENARIOS.keys()))
+scenario_info = SCENARIOS[selected_scenario]
+st.info(f"📌 {scenario_info['description']}")
+st.markdown(f"Stock levels will be: {scenario_info['stock_min']} – {scenario_info['stock_max']} units")
+num_products = st.slider("How many products to generate?", min_value=5, max_value=30, value=10)
+
+if st.button("🚀 Generate & Save Products", type="primary"):
+    try:
+        from groq import Groq
+        from pydantic import BaseModel, field_validator
+        from typing import Optional
+        from datetime import datetime, timezone
+        import json
+
+        client = Groq(api_key=os.getenv("GROQ_API_KEY"))
+        all_categories = get_all_categories()
+        category_map = {cat.title: cat for cat in all_categories}
+        category_names = list(category_map.keys())
+
+        prompt = f"""
+Generate exactly {num_products} realistic warehouse products for a "{selected_scenario.replace('🎄 ', '').replace('☀️ ', '').replace('📚 ', '').replace('🎆 ', '')}" scenario as a JSON array.
+Focus on: {scenario_info['prompt_hint']}
+Each product must belong to one of these categories only: {category_names}
+Each product must have exactly these fields:
+- name: string
+- description: string (1-2 sentences)
+- category: string (must be exactly one of: {category_names})
+- price: float (min 0.01)
+- brand: string
+- quantity_in_warehouse: integer (between {scenario_info['stock_min']} and {scenario_info['stock_max']})
+Rules: Return ONLY a valid JSON array, no explanation, no markdown, no extra text
+"""
+
+        with st.spinner("🤖 AI is generating products..."):
+            response = client.chat.completions.create(
+                model="llama-3.3-70b-versatile",
+                messages=[{"role": "user", "content": prompt}],
+                temperature=0.7
+            )
+            raw_output = response.choices[0].message.content
+            cleaned = raw_output.strip()
+            if cleaned.startswith("```json"):
+                cleaned = cleaned[7:]
+            if cleaned.startswith("```"):
+                cleaned = cleaned[3:]
+            if cleaned.endswith("```"):
+                cleaned = cleaned[:-3]
+            cleaned = cleaned.strip()
+            products_data = json.loads(cleaned)
+
+        class ProductSchema(BaseModel):
+            name: str
+            description: Optional[str] = ""
+            category: str
+            price: float
+            brand: str
+            quantity_in_warehouse: int
+
+            @field_validator('price')
+            def price_must_be_positive(cls, v):
+                if v <= 0:
+                    raise ValueError("Price must be greater than 0")
+                return round(v, 2)
+
+            @field_validator('quantity_in_warehouse')
+            def quantity_must_be_non_negative(cls, v):
+                if v < 0:
+                    raise ValueError("Quantity cannot be negative")
+                return v
+
+            @field_validator('category')
+            def category_must_exist(cls, v):
+                if v not in category_names:
+                    raise ValueError(f"Invalid category: {v}")
+                return v
+
+        valid_products = []
+        invalid_count = 0
+        for product in products_data:
+            try:
+                validated = ProductSchema(**product)
+                valid_products.append(validated)
+            except Exception:
+                invalid_count += 1
+
+        saved = 0
+        saved_products = []
+        now = datetime.now(timezone.utc)
+        for vp in valid_products:
+            try:
+                category_obj = category_map.get(vp.category)
+                if not category_obj:
+                    continue
+                Product(
+                    name=vp.name,
+                    description=vp.description,
+                    category=category_obj,
+                    price=vp.price,
+                    brand=vp.brand,
+                    quantity_in_warehouse=vp.quantity_in_warehouse,
+                    created_at=now,
+                    updated_at=now
+                ).save()
+                saved += 1
+                saved_products.append(vp)
+            except Exception:
+                invalid_count += 1
+
+        st.success(f"🎉 Successfully saved {saved} products for '{selected_scenario}' scenario!")
+        if invalid_count > 0:
+            st.warning(f"⚠️ {invalid_count} products were skipped due to validation errors.")
+        preview_data = [{"Name": vp.name, "Category": vp.category, "Brand": vp.brand,
+                         "Price (₹)": vp.price, "Stock": vp.quantity_in_warehouse}
+                        for vp in saved_products]
+        st.dataframe(pd.DataFrame(preview_data), use_container_width=True, hide_index=True)
+        st.info("🔄 Click 'Refresh Data' in the sidebar to see updated products!")
+    except Exception as e:
+        st.error(f"❌ Something went wrong: {e}")
+
+st.markdown("---")
+
+# ─── ASK THE EXPERT (RAG + MONGODB + LANGSMITH) ────────────────
+st.subheader("🤖 Ask the Expert")
+st.markdown("Ask anything about warranties, return policies, stock levels, or vendor info — powered by RAG + Live Database!")
+
+# ── Langsmith setup (module level) ─────────────────────────────
+os.environ["LANGCHAIN_TRACING_V2"] = "true"
+os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGSMITH_API_KEY", "")
+os.environ["LANGCHAIN_PROJECT"] = "inventory-rag-week8"
+
+from langsmith import traceable
+
+# ── RAG Setup (cached) ─────────────────────────────────────────
+@st.cache_resource
+def setup_rag():
+    """Load and chunk documents, embed and store in Chromadb — cached"""
+    import chromadb
+    from langchain_text_splitters import RecursiveCharacterTextSplitter
+
+    notebooks_path = Path(__file__).resolve().parent / "notebooks"
+    files = {
+        "product_manual": "product_manual.txt",
+        "return_policy": "return_policy.txt",
+        "vendor_faq": "vendor_faq.txt"
+    }
+    raw_docs = []
+    for doc_name, filename in files.items():
+        filepath = notebooks_path / filename
+        if filepath.exists():
+            with open(filepath, "r", encoding="utf-8") as f:
+                content = f.read()
+            raw_docs.append({"content": content, "source": doc_name})
+    splitter = RecursiveCharacterTextSplitter(
+        chunk_size=500,
+        chunk_overlap=50,
+        separators=["\n\n", "\n", ".", " "]
+    )
+    all_chunks = []
+    for doc in raw_docs:
+        chunks = splitter.split_text(doc["content"])
+        for chunk in chunks:
+            all_chunks.append({"text": chunk, "source": doc["source"]})
+    rag_model = SentenceTransformer("all-MiniLM-L6-v2")
+    texts = [c["text"] for c in all_chunks]
+    sources = [c["source"] for c in all_chunks]
+    embeddings = rag_model.encode(texts).tolist()
+    chroma_client = chromadb.Client()
+    try:
+        chroma_client.delete_collection("inventory_docs")
+    except:
+        pass
+    collection = chroma_client.create_collection("inventory_docs")
+    collection.add(
+        documents=texts,
+        embeddings=embeddings,
+        metadatas=[{"source": s} for s in sources],
+        ids=[f"chunk_{i}" for i in range(len(all_chunks))]
+    )
+    return collection, rag_model
+
+# ── Traceable functions ─────────────────────────────────────────
+@traceable(name="retrieve_chunks")
+def retrieve_chunks(query, collection, rag_model, top_k=3):
+    """Retrieve relevant chunks from Chromadb"""
+    query_vec = rag_model.encode([query]).tolist()
+    results = collection.query(query_embeddings=query_vec, n_results=top_k)
+    chunks = []
+    for i in range(len(results["documents"][0])):
+        chunks.append({
+            "text": results["documents"][0][i],
+            "source": results["metadatas"][0][i]["source"]
+        })
+    return chunks
+
+@traceable(name="mongodb_product_lookup")
+def lookup_products_in_db(question):
+    """Search MongoDB for products mentioned in the question"""
+    all_products = list(Product.objects.all())
+    matched = []
+    q_lower = question.lower()
+    q_words = [w for w in q_lower.split() if len(w) > 2]  # ← fixed: min 2 chars not 3
+    for p in all_products:
+        product_name_lower = p.name.lower()
+        # Match if ANY word from question appears in product name
+        if any(word in product_name_lower for word in q_words):
+            try:
+                cat = p.category.title if p.category else "Unknown"
+            except:
+                cat = "Unknown"
+            matched.append({
+                "name": p.name,
+                "brand": p.brand,
+                "category": cat,
+                "price": float(str(p.price)),
+                "stock": p.quantity_in_warehouse
+            })
+    return matched[:5]
+
+@traceable(name="ask_expert", run_type="chain")
+def ask_expert(question, collection, rag_model):
+    """
+    Combined RAG + MongoDB pipeline with Langsmith tracing:
+    1. Retrieve relevant doc chunks
+    2. Lookup matching products in MongoDB
+    3. Build combined prompt
+    4. Send to Groq
+    5. Return grounded answer
+    """
+    from groq import Groq
+
+    # Step 1 — Retrieve doc chunks
+    chunks = retrieve_chunks(question, collection, rag_model, top_k=3)
+    doc_context = "\n\n".join([f"[{c['source']}]: {c['text']}" for c in chunks])
+
+    # Step 2 — Lookup MongoDB
+    matched_products = lookup_products_in_db(question)
+    if matched_products:
+        db_context = "Live inventory data from database:\n"
+        for p in matched_products:
+            status = "IN STOCK" if p["stock"] > 0 else "OUT OF STOCK"
+            db_context += (
+                f"- {p['name']} ({p['brand']}) | "
+                f"Category: {p['category']} | "
+                f"Price: ₹{p['price']} | "
+                f"Stock: {p['stock']} units ({status})\n"
+            )
+    else:
+        db_context = "No matching products found in live database for this query."
+
+    # Step 3 — Build combined prompt
+    prompt = f"""You are an expert assistant for a product inventory system.
+You have access to two sources of information:
+1. Documentation (warranties, return policies, vendor FAQ)
+2. Live database (current stock levels, prices)
+Answer the user's question using BOTH sources below.
+If stock info is available, always mention current stock levels.
+If warranty or policy info is available, always include it.
+Only use information from the sources provided — do not make anything up.
+If the answer is not in either source, say "I don't have information about this."
+
+--- DOCUMENTATION ---
+{doc_context}
+
+--- LIVE DATABASE ---
+{db_context}
+
+Question: {question}
+
+Answer:"""
+
-
Step 4 — Send to Groq
... diff truncated: showing 800 of 7649 lines
</details>
<sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit 0f356d3421f782bed791861351ea32776d2052c9. Configure [here](https://www.cursor.com/dashboard/bugbot).</sup>
| preview_data = [{"Name": vp.name, "Category": vp.category, "Brand": vp.brand, | ||
| "Price (₹)": vp.price, "Stock": vp.quantity_in_warehouse} | ||
| for vp in valid_products] | ||
| st.dataframe(pd.DataFrame(preview_data), use_container_width=True, hide_index=True) |
Preview table includes products that failed to save
Low Severity
After saving AI-generated products, the preview table is built from valid_products (all Pydantic-validated entries), not just the successfully saved ones. Products that fail at the category_map.get() check (silently skipped via continue) or raise an exception during .save() still appear in the preview. This makes the table inconsistent with the "saved products" success message above it, misleading users about what actually reached the database.
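One way to keep the preview consistent with the success message is to build it only from entries that passed both the category lookup and the save. The sketch below is a minimal illustration under stated assumptions: `save_products`, `build_preview`, and the `save_fn` callable are hypothetical names standing in for the dashboard's `.save()` logic, and products are plain dicts rather than the PR's Pydantic models.

```python
def save_products(valid_products, category_map, save_fn):
    """Try to save each product; return (saved, skipped_names).

    Hypothetical helper, not code from the PR. `save_fn` stands in for
    whatever persists one product (e.g. a MongoEngine `.save()` call).
    """
    saved, skipped = [], []
    for vp in valid_products:
        category_id = category_map.get(vp["category"])
        if category_id is None:
            # Unknown category: record the skip instead of silently continuing.
            skipped.append(vp["name"])
            continue
        try:
            save_fn({**vp, "category_id": category_id})
        except Exception:
            # Save failed: this product must not appear in the preview.
            skipped.append(vp["name"])
            continue
        saved.append(vp)
    return saved, skipped


def build_preview(saved):
    """Build preview rows from successfully saved products only."""
    return [
        {"Name": p["name"], "Category": p["category"], "Brand": p["brand"],
         "Price (₹)": p["price"], "Stock": p["quantity_in_warehouse"]}
        for p in saved
    ]
```

The rows returned by `build_preview(saved)` could then be passed to `pd.DataFrame` in place of the rows built from `valid_products`, and `skipped` could be surfaced as a warning so users see which entries never reached the database.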
Reviewed by Cursor Bugbot for commit 0f356d3. Configure here.
week9-10 tasks
Review/weekly tasks



Note
Medium Risk
Adds a large new Streamlit app with semantic search, RAG, and Groq-powered data generation, plus changes Django startup to require .env secrets and connect via MongoEngine; these impact local runtime behavior and external API usage. Risk is moderate due to new dependencies and potential misconfiguration around environment variables and Mongo connectivity.
Overview
Adds a new Streamlit-based dashboard.py for managing the product inventory, including filtering/metrics, CRUD for products and categories, semantic search + “similar products” via sentence-transformers, an AI scenario-based product generator using Groq, and a RAG-style “Ask the Expert” chat that combines local docs + live MongoDB data with Langsmith tracing.
Standardizes local configuration via .env: introduces backend/python/.env.example, ignores backend/python/.env, adds a shared db_connection.py for MongoEngine connections, and updates Django settings.py to load env vars, fail fast when SECRET_KEY/MONGO_URI are missing, switch Django’s DB engine to dummy, and connect to MongoDB via mongoengine. Django routing is adjusted to remove the admin/hello endpoints and expose the products API under api/products/.
Minor dev tooling tweaks include a Streamlit config to disable file watching, an extra ignore for .ipynb_checkpoints/, and README updates clarifying venv/pip ordering and Django version expectations.
Reviewed by Cursor Bugbot for commit 0f356d3. Configure here.
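The fail-fast behavior described in the overview can be illustrated with a small sketch. This is not the PR's settings.py: `require_env`, `load_required_settings`, and `MissingSettingError` are hypothetical names, and the sketch assumes the variables have already been loaded into the process environment (how the .env file is read is not shown here).

```python
import os


class MissingSettingError(RuntimeError):
    """Raised when a required environment variable is absent or empty."""


def require_env(name, environ=None):
    """Return the value of `name`, or raise immediately if it is missing."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise MissingSettingError(
            f"{name} must be set (for example in backend/python/.env)"
        )
    return value


def load_required_settings(environ=None):
    """Collect the settings the overview names as mandatory at startup."""
    return {
        "SECRET_KEY": require_env("SECRET_KEY", environ),
        "MONGO_URI": require_env("MONGO_URI", environ),
    }
```

Raising at import time of the settings module means a misconfigured environment stops the server before any request is handled, which is usually easier to diagnose than a connection error surfacing later.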