feat(mcp): Add Kapa knowledge search tool #1033

Draft
Aaron ("AJ") Steers (aaronsteers) wants to merge 2 commits into main from devin/1779842438-kapa-replication-mcp

Conversation

Member

@aaronsteers Aaron ("AJ") Steers (aaronsteers) commented May 27, 2026

Summary

Adds the search_airbyte_knowledge_sources(query: str) MCP tool to the Airbyte Replication MCP server using the strict one-argument Kapa MCP-compatible signature approved by AJ Steers.

The tool is registered only when both a Kapa credential env var is present (KAPA_API_KEY, KAPA_DOCS_MCP_BEARER_TOKEN, or KAPA_BEARER_TOKEN) and KAPA_PROJECT_ID is configured. It wraps the Kapa Retrieval REST API, optionally passes KAPA_INTEGRATION_ID, and normalizes responses to [{"source_url": ..., "content": ...}].

The Kapa configuration reads through PyAirbyte's existing secret helper path rather than direct environment access, so env vars, .env, and registered secret managers follow the same lookup behavior as the rest of PyAirbyte.
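
The normalization step mentioned in the summary can be illustrated with a minimal sketch. It assumes the retrieval response is a JSON list of objects that carry string `source_url` and `content` fields; the function name `normalize_kapa_results` and the error messages are hypothetical and not taken from the PR.

```python
from typing import Any


def normalize_kapa_results(payload: Any) -> list[dict[str, str]]:
    """Reduce a retrieval response to [{"source_url": ..., "content": ...}].

    Hypothetical helper: the real module may validate differently.
    """
    if not isinstance(payload, list):
        raise ValueError("Expected the retrieval response to be a JSON list.")

    normalized: list[dict[str, str]] = []
    for item in payload:
        if not isinstance(item, dict):
            raise ValueError("Expected each retrieval result to be a JSON object.")
        source_url = item.get("source_url")
        content = item.get("content")
        if not isinstance(source_url, str) or not isinstance(content, str):
            raise ValueError("Each result needs string 'source_url' and 'content' fields.")
        # Keep only the two expected keys; any extra fields are dropped.
        normalized.append({"source_url": source_url, "content": content})
    return normalized
```

Rejecting unexpected shapes early, rather than passing them through, keeps the tool's return type predictable for MCP clients.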

Review & Testing Checklist for Human

  • Confirm the env-var names match the intended deployment configuration for Cloud Replication MCP.
  • With Kapa credential and KAPA_PROJECT_ID configured, verify search_airbyte_knowledge_sources appears in the Replication MCP tool list with only the query parameter.
  • With Kapa credential or KAPA_PROJECT_ID unset, verify the tool is absent from the Replication MCP tool list.
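
The registration rule exercised by the last two checklist items can be sketched as a pure predicate. This is an illustration only: `should_register_kapa_tool` and the plain mapping standing in for the resolved configuration are assumptions, not the PR's actual code, which reads values through PyAirbyte's secret helpers.

```python
from collections.abc import Mapping

# Credential names listed in the PR summary.
CREDENTIAL_NAMES = ("KAPA_API_KEY", "KAPA_DOCS_MCP_BEARER_TOKEN", "KAPA_BEARER_TOKEN")


def should_register_kapa_tool(config: Mapping[str, str]) -> bool:
    """Return True only if a credential AND a project ID are configured.

    Hypothetical helper; `config` stands in for whatever lookup the server uses.
    """
    has_credential = any((config.get(name) or "").strip() for name in CREDENTIAL_NAMES)
    has_project = bool((config.get("KAPA_PROJECT_ID") or "").strip())
    return has_credential and has_project
```

With this shape, a deployment that sets only a credential, or only `KAPA_PROJECT_ID`, never exposes a tool that would fail at call time.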

Notes

Local verification run:

  • uv run ruff format airbyte/mcp/kapa.py tests/unit_tests/test_mcp_kapa.py
  • uv run pytest tests/unit_tests/test_mcp_kapa.py (8 passed)
  • uv run poe check

Requested by AJ Steers.

Link to Devin session: https://app.devin.ai/sessions/8ed989bc34a840a081d0c94eae01d26c
Requested by: Aaron ("AJ") Steers (@aaronsteers)

@devin-ai-integration
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

@github-actions

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

Testing This PyAirbyte Version

You can test this version of PyAirbyte using the following:

# Run PyAirbyte CLI from this branch:
uvx --from 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1779842438-kapa-replication-mcp' pyairbyte --help

# Install PyAirbyte from this branch for development:
pip install 'git+https://github.com/airbytehq/PyAirbyte.git@devin/1779842438-kapa-replication-mcp'

PR Slash Commands

Airbyte Maintainers can execute the following slash commands on your PR:

  • /fix-pr - Fixes most formatting and linting issues
  • /uv-lock - Updates uv.lock file
  • /test-pr - Runs tests with the updated PyAirbyte
  • /prerelease - Builds and publishes a prerelease version to PyPI
📚 Show Repo Guidance

Helpful Resources

Community Support

Questions? Join the #pyairbyte channel in our Slack workspace.

Comment thread airbyte/mcp/kapa.py Outdated
Comment on lines +24 to +26
os.getenv("KAPA_API_KEY")
or os.getenv("KAPA_DOCS_MCP_BEARER_TOKEN")
or os.getenv("KAPA_BEARER_TOKEN")
Member Author

We already have proper helpers and patterns for getting secrets from env vars (and other sources). Use existing code paths.

Contributor

Updated in commit 7ed7611 to route Kapa config reads through PyAirbyte's existing secret helper path instead of direct os.getenv() access.


Comment thread airbyte/mcp/kapa.py
Comment on lines +30 to +44
def _kapa_auth_headers() -> dict[str, str]:
    api_key = (os.getenv("KAPA_API_KEY") or "").strip()
    if api_key:
        return {"X-API-KEY": api_key}

    bearer_token = (
        (os.getenv("KAPA_DOCS_MCP_BEARER_TOKEN") or os.getenv("KAPA_BEARER_TOKEN")) or ""
    ).strip()
    if bearer_token:
        return {"Authorization": f"Bearer {bearer_token}"}

    raise ValueError(
        "Kapa docs search is not configured. Set KAPA_API_KEY, "
        "KAPA_DOCS_MCP_BEARER_TOKEN, or KAPA_BEARER_TOKEN."
    )
Member Author

Keep this DRY and call the other helper, or take its output.

Contributor

Updated in commit 7ed7611. _kapa_auth_headers() now uses the same _kapa_config_value() helper as registration and payload construction, so the credential lookup and fallback order are centralized.
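
For readers who have not opened the commit, a rough sketch of what "centralized" lookup could look like follows. It is not the code from 7ed7611: `config_value`, `kapa_auth_headers`, and the injected `lookup` callable are hypothetical stand-ins for `_kapa_config_value()` and PyAirbyte's secret helper. The sketch also checks each bearer variable separately, which differs slightly from the original snippet when the first variable contains only whitespace.

```python
from typing import Callable, Optional

# Hypothetical: any callable that maps a config name to a value or None.
ConfigLookup = Callable[[str], Optional[str]]

BEARER_NAMES = ("KAPA_DOCS_MCP_BEARER_TOKEN", "KAPA_BEARER_TOKEN")


def config_value(lookup: ConfigLookup, name: str) -> str:
    """Single place where config values are read and stripped."""
    return (lookup(name) or "").strip()


def kapa_auth_headers(lookup: ConfigLookup) -> dict[str, str]:
    """Prefer the API key, then fall back to the bearer tokens in order."""
    api_key = config_value(lookup, "KAPA_API_KEY")
    if api_key:
        return {"X-API-KEY": api_key}

    for name in BEARER_NAMES:
        token = config_value(lookup, name)
        if token:
            return {"Authorization": f"Bearer {token}"}

    raise ValueError(
        "Kapa docs search is not configured. Set KAPA_API_KEY, "
        "KAPA_DOCS_MCP_BEARER_TOKEN, or KAPA_BEARER_TOKEN."
    )
```

Passing the lookup in as a parameter keeps the fallback order testable with a plain `dict.get`, without touching the process environment.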


@coderabbitai
Contributor

coderabbitai Bot commented May 27, 2026

📝 Walkthrough

Adds a Kapa-backed MCP tool that reads credentials and project configuration from environment variables, builds auth headers and a POST payload, calls Kapa’s retrieval API with a fixed timeout, normalizes the JSON response to a list of {source_url, content} records, and conditionally registers the tool with the MCP server. Tests cover auth, request/response, and registration behavior.
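
The request-building part of this walkthrough can be sketched without making any network call. Everything named here is an assumption for illustration: `build_retrieval_request` is hypothetical, the `base_url` is supplied by the caller, and the `/projects/<id>/retrieval/` path shape should be checked against Kapa's API documentation rather than taken from this sketch.

```python
from typing import Optional
from urllib.parse import quote


def build_retrieval_request(
    base_url: str,
    project_id: str,
    query: str,
    integration_id: Optional[str] = None,
) -> tuple[str, dict[str, str]]:
    """Return (url, json_payload) for a retrieval POST.

    Hypothetical helper; the URL layout is a placeholder, not a confirmed endpoint.
    """
    project = project_id.strip()
    if not project:
        raise ValueError("KAPA_PROJECT_ID is required to build the retrieval URL.")

    # Percent-encode the project ID so it cannot alter the path structure.
    url = f"{base_url.rstrip('/')}/projects/{quote(project, safe='')}/retrieval/"

    payload: dict[str, str] = {"query": query}
    if integration_id:
        # Only include the optional field when it is actually configured.
        payload["integration_id"] = integration_id
    return url, payload
```

The resulting pair would then be handed to an HTTP client with the auth headers and an explicit timeout, followed by a status check, as the walkthrough describes.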

Changes

Kapa Search Tool

Layer / File(s) | Summary
Kapa search tool core (airbyte/mcp/kapa.py) | Module constants and env-var names; helpers to read config and detect credentials; _kapa_auth_headers() preferring API key then bearer token; retrieval URL builder requiring KAPA_PROJECT_ID; request JSON builder with optional integration_id; response normalization enforcing list-of-objects with source_url and content strings; search_airbyte_knowledge_sources() performs POST with timeout and raise_for_status().
MCP server registration (airbyte/mcp/server.py) | Imports and calls register_kapa_tools(app) during MCP server initialization so Kapa tools are conditionally registered when credentials and project id are configured.

Kapa Tool Test Suite

Layer / File(s) | Summary
Auth, request/response, and registration tests (tests/unit_tests/test_mcp_kapa.py) | Autouse fixture clears Kapa env vars before tests. Parameterized unit test verifies auth header selection from supported env vars. responses-backed test asserts POST endpoint, headers, and JSON body, and verifies normalized return value. Tests assert register_kapa_tools() skips registration without credentials and registers with correct mcp_module when credentials are present.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name | Status | Explanation
Title check | ✅ Passed | The title clearly and concisely describes the main change: adding a new Kapa knowledge search tool to the MCP (Model Context Protocol) system.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/unit_tests/test_mcp_kapa.py (1)

88-98: ⚡ Quick win

Could we parametrize the positive registration test across all supported credential env vars (Line 88), not just KAPA_API_KEY, to lock in the full contract from _kapa_credentials_configured()?

That would catch regressions if one credential path stops enabling registration. wdyt?

Suggested patch
-def test_register_kapa_tools_registers_when_credentials_are_configured(
-    monkeypatch: pytest.MonkeyPatch,
-) -> None:
+@pytest.mark.parametrize(
+    "env_name",
+    ["KAPA_API_KEY", "KAPA_DOCS_MCP_BEARER_TOKEN", "KAPA_BEARER_TOKEN"],
+)
+def test_register_kapa_tools_registers_when_credentials_are_configured(
+    monkeypatch: pytest.MonkeyPatch,
+    env_name: str,
+) -> None:
     """Test that Kapa tools are visible when credentials are configured."""
     app = MagicMock()
-    monkeypatch.setenv("KAPA_API_KEY", "secret")
+    monkeypatch.setenv(env_name, "secret")
 
     with patch("airbyte.mcp.kapa.register_mcp_tools") as register:
         kapa.register_kapa_tools(app)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit_tests/test_mcp_kapa.py` around lines 88 - 98, Update the
test_register_kapa_tools_registers_when_credentials_are_configured to
parametrize over all credential environment variable names used by the module
instead of only KAPA_API_KEY: call or import the helper that lists required env
vars (e.g. _kapa_credentials_configured or its underlying constant) and use
pytest.mark.parametrize on those names, then for each param set
monkeypatch.setenv(var, "secret") before calling kapa.register_kapa_tools(app)
and assert register_mcp_tools was called with (app,
mcp_module="airbyte.mcp.kapa"); this ensures register_kapa_tools and
register_mcp_tools are exercised for every supported credential env var.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@airbyte/mcp/kapa.py`:
- Around line 109-112: The register_kapa_tools function currently registers Kapa
tools when _kapa_credentials_configured() is true but still errors later if
KAPA_PROJECT_ID is missing; change register_kapa_tools to require both
credentials and a configured project id (check KAPA_PROJECT_ID via the same
config/env accessor used elsewhere or os.getenv("KAPA_PROJECT_ID")/a helper)
before calling register_mcp_tools; reference register_kapa_tools,
_kapa_credentials_configured, and _kapa_retrieval_url so you add the extra
project-id guard in the same function to avoid registering a tool that will fail
at runtime.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 07354652-62f8-4b0c-9c09-e18d218656ea

📥 Commits

Reviewing files that changed from the base of the PR and between 8427840 and 061924c.

📒 Files selected for processing (3)
  • airbyte/mcp/kapa.py
  • airbyte/mcp/server.py
  • tests/unit_tests/test_mcp_kapa.py

Comment thread airbyte/mcp/kapa.py
@github-actions

github-actions Bot commented May 27, 2026

PyTest Results (Fast Tests Only, No Creds)

475 tests  +8   475 ✅ +8   6m 4s ⏱️ -3s
  1 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 7ed7611. ± Comparison against base commit 8427840.

♻️ This comment has been updated with latest results.

@devin-ai-integration
Contributor

CodeRabbit's remaining docstring coverage item is a non-blocking warning from its optional pre-merge checks, not a concrete behavioral issue. I’m leaving the helper functions terse to match the repo’s style and avoid adding boilerplate docstrings.


@devin-ai-integration
Contributor

Correction to the local verification list: after the review feedback update, the targeted Kapa suite has 8 tests due to the credential-registration parametrization.

uv run pytest tests/unit_tests/test_mcp_kapa.py
8 passed

uv run poe check
exit 0

The PR description was edited independently, so I did not overwrite it.


@devin-ai-integration
Contributor

Addressed the CodeRabbit nitpick about registration coverage in commit 7ed7611. The positive registration test is now parametrized over all supported Kapa credential env vars. I couldn’t reply directly to the outdated inline thread because GitHub no longer exposes the parent comment.


@github-actions

PyTest Results (Full)

545 tests  +8   527 ✅ +8   25m 8s ⏱️ +44s
  1 suites ±0    18 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit 7ed7611. ± Comparison against base commit 8427840.
