SmartColleague is an AI-powered assistant designed to improve how employees and clients interact with a company's internal knowledge — from HR processes and employee expertise to documentation and project history.
- Maria Rosaria Di Domenico — Data Scientist
Companies often struggle with scattered internal knowledge:
- Employees are unaware of who's working on what.
- Documentation is hard to navigate.
- HR procedures are opaque or difficult to follow.
This results in reduced productivity, onboarding friction, and internal communication gaps.
SmartColleague is your intelligent workplace assistant — designed to help employees, HR teams, and even prospective clients access and understand company information effortlessly.
It enables users to:
- 🔍 Discover people, projects, or documents using natural language.
- 🧭 Navigate HR processes with clear, step-by-step assistance.
- 📄 Access company documents or find the right expert for a task.
- 📎 Analyze a PDF file uploaded on-the-fly, not included in the company database.
- 💬 Interact with a fallback general-purpose AI model to ask off-topic or general knowledge questions.
The goal is to build an intuitive and helpful assistant with access to different facets of company knowledge, enabling better collaboration, document discovery, HR support, and personalized experiences.
SmartColleague integrates advanced Generative AI and modern NLP technologies:
- LLM Core: Google Gemini (gemini-1.5-flash) for understanding queries, generating responses, and calling tools.
- Agent Framework: LangGraph to manage conversational state and tool orchestration.
- Key Tools & Capabilities:
  - Vector Database (ChromaDB): For semantic document and resume search.
  - Document Parsing: PDF processing to read and extract information from resumes, policies, and company docs.
  - Embedding Models: For embedding document chunks so they can be indexed and searched.
  - Function Calling / Routing: Identifies whether the user wants HR help, document retrieval, team discovery, file analysis, or general QA.
  - Fallback LLM Support: For general-purpose, non-corporate queries (similar to ChatGPT).
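The routing step in the list above can be illustrated with a small sketch. It is not the project's implementation: in SmartColleague the Gemini model decides which tool to call, whereas this example swaps in plain keyword matching so that it runs without an API key. The route names and keyword lists are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical route names and trigger keywords. The real system lets the
# LLM choose a tool through function calling instead of keyword matching.
ROUTES: Dict[str, Tuple[str, ...]] = {
    "file_analysis": ("uploaded", "attached", "this pdf"),
    "hr_help": ("leave", "benefits", "policy", "onboarding", "payroll"),
    "team_discovery": ("who", "someone", "expert", "experience with"),
    "document_retrieval": ("document", "guide", "report", "presentation"),
}


def route_query(query: str) -> str:
    """Return the first route whose keywords appear in the query.

    Falls back to "general_qa" when nothing matches, mirroring the
    fallback LLM described above.
    """
    text = query.lower()
    for route, keywords in ROUTES.items():
        if any(keyword in text for keyword in keywords):
            return route
    return "general_qa"


if __name__ == "__main__":
    for q in (
        "I need someone with NLP experience for a client project.",
        "Where is the latest performance review process document?",
        "What is the capital of Norway?",
    ):
        print(f"{route_query(q)}: {q}")
```

Substring matching is deliberately crude (for instance, "who" also matches "whole"), which is one reason an LLM-based router is the better fit for free-form questions.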
SmartColleague relies on several internal data sources:
- Resumes / CVs: Parsed from PDF files (~1K+ examples).
- HR Documentation: Policies, procedures, benefits guides, etc.
- Company Docs & Projects: Internal knowledge bases, project reports, presentations.
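Before sources like these are indexed in a vector database, they are typically split into overlapping chunks. The sketch below shows one common approach, fixed-size word windows with overlap. The function name, chunk size, and overlap are illustrative assumptions, not values taken from the project.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into windows of chunk_size words that share overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

The overlap keeps sentences that straddle a chunk boundary retrievable from either side, at the cost of some duplicated text in the index.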
Follow these steps to run SmartColleague locally:
- Clone the repository:

      git clone https://github.com/your-username/smartcolleague.git
      cd smartcolleague

- Create and activate a virtual environment:

      python -m venv venv
      source venv/bin/activate # On Windows: venv\Scripts\activate

- Install dependencies:

      pip install -r requirements.txt

  Or manually install key libraries:

      pip install google-generativeai langchain-google-genai langgraph chromadb pypdf openai

- Add API keys:
  Store your Google Gemini API key securely using .env:

      GOOGLE_API_KEY="your-api-key-here"

  Or set it directly via environment variables.
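One way to read the key at startup and fail early when it is missing is sketched below. It uses only the standard library and assumes the variable has already been exported or loaded from .env by whatever tooling you use. The helper name `load_google_api_key` is hypothetical.

```python
import os
from typing import Mapping, Optional


def load_google_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GOOGLE_API_KEY from the environment or raise a clear error."""
    env = os.environ if env is None else env
    # Strip whitespace and stray quotes that can survive a copy-paste.
    key = env.get("GOOGLE_API_KEY", "").strip().strip('"')
    if not key or key == "your-api-key-here":
        raise RuntimeError(
            "GOOGLE_API_KEY is not set. Add it to your .env file or export it "
            "in your shell before starting SmartColleague."
        )
    return key
```

Passing the environment as a parameter makes the check easy to test without touching the real process environment.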
- 📄 Retrieve resume or document content based on semantic search.
- 🧑‍💼 Match employee profiles to a specific user query.
- 📂 List relevant documents or projects by topic.
- 📘 Provide HR policy and procedure assistance.
- 📎 Analyze newly uploaded PDFs (even if not part of the preloaded database).
- 💬 Answer general, non-corporate questions using a fallback LLM (ChatGPT-style).
- 🧠 Understand context and conversation history via chat interface.
- 💾 Summarize or extract insights from project files or onboarding docs.
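To illustrate how matching a query against employee profiles might work, here is a toy ranking sketch. It swaps in bag-of-words cosine similarity for the real embedding model and ChromaDB search so that it runs with the standard library alone. The profile data and function names are made up for the example.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def _vectorize(text: str) -> Counter:
    """Turn text into a bag-of-words count vector (stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_profiles(
    query: str, profiles: Dict[str, str], top_k: int = 2
) -> List[Tuple[str, float]]:
    """Return the top_k (name, score) pairs most similar to the query."""
    query_vec = _vectorize(query)
    scored = [
        (name, _cosine(query_vec, _vectorize(text)))
        for name, text in profiles.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


if __name__ == "__main__":
    # Invented sample profiles, not real employee data.
    sample = {
        "Profile A": "Data scientist with NLP and text classification experience",
        "Profile B": "Frontend developer focused on React and design systems",
        "Profile C": "HR specialist for onboarding and benefits",
    }
    print(rank_profiles("someone with NLP experience", sample))
```

Word-count vectors only match exact terms; a real embedding model would also catch paraphrases such as "language models" for "NLP".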
- “I need someone with NLP experience for a client project.”
- “Where is the latest performance review process document?”
- “Analyze this PDF I just uploaded — what’s the key takeaway?”
- “What is the difference between supervised and unsupervised learning?”
SmartColleague is deployed with a Gradio interface, enabling users to:
- Ask questions like:
  - “Who in the company has experience with AI projects?”
  - “Show me the onboarding guide for interns.”
- Upload files for quick summarization or extraction.
- Ask general-purpose questions unrelated to the company (e.g., “What’s the capital of Norway?”).
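A chat front end along these lines can be wired up in a few lines of Gradio. The sketch below is an assumption about how such an interface might look, not the project's actual UI code: `answer_query` is a hypothetical placeholder where the LangGraph agent would be called, and here it only echoes the message.

```python
from typing import Any, List


def answer_query(message: str, history: List[Any]) -> str:
    """Placeholder chat handler; a real app would invoke the agent here."""
    if not message.strip():
        return "Please enter a question."
    return f"(demo) You asked: {message}"


def build_ui():
    """Create the chat UI.

    Gradio is imported inside the function so the handler above can be
    used and tested without Gradio installed.
    """
    import gradio as gr

    return gr.ChatInterface(fn=answer_query, title="SmartColleague")
```

Calling `build_ui().launch()` starts a local web server for the chat. File upload is left out of this sketch to keep it short.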
- Ensure your API keys are never hardcoded in public repositories.
- For production deployment, consider containerizing with Docker and using secure storage for secrets.
We welcome feature suggestions, bug reports, and contributions. Feel free to fork this repo and open a pull request!