Feat: Implement Gemini Interaction API in adk-js #364

Open
AmaadMartin wants to merge 15 commits into google:main from AmaadMartin:feat/interactions-api-2

@AmaadMartin
Collaborator

Please ensure you have read the contribution guide before creating a pull request.

Link to Issue or Description of Change

  1. Link to an existing issue (if applicable):
    N/A
    Closes: #issue_number
    Related: #issue_number
  2. Or, if no issue exists, describe the change:
This PR integrates the stateful Gemini Interactions API into adk-js, mirroring the design and functionality already present in adk-python. It enables stateful, multi-turn conversations by tracking interaction history server-side via interactionId, so later turns can send smaller payloads.

If applicable, please follow the issue templates to provide as much detail as possible.

Problem:
The adk-js core currently supports only stateless execution via the standard generateContent API, so the entire conversation history must be resent on every turn. Payloads therefore grow with the length of the conversation, and server-side interaction history tracking cannot be used.

Solution:

  1. Interface Updates:
    • Added optional previousInteractionId?: string to LlmRequest in core/src/models/llm_request.ts.
    • Added optional interactionId?: string to LlmResponse in core/src/models/llm_response.ts.
  2. Stateful Request Processor:
    • Created InteractionsRequestProcessor under core/src/agents/processors/interactions_request_processor.ts. It automatically traverses the session events history in reverse to find the latest valid interactionId for the current branch and sub-agent name, injecting it as previousInteractionId into the outgoing request.
    • Registered INTERACTIONS_REQUEST_PROCESSOR in LlmAgent request processors, immediately following the CONTENT_REQUEST_PROCESSOR.
  3. Interaction Utility & Payload Transformation:
    • Created core/src/models/interactions_utils.ts containing:
      • getLatestUserContents, which trims the outgoing conversation history so that only the latest continuous user turn is sent when previousInteractionId is present. If that user turn contains a function response, the preceding model turn's function call is retained as well.
      • Request/response converters mapping ADK types (text, function calls, tool results, media data, code execution) to @google/genai Interactions REST schemas (and vice-versa).
      • Core streaming/non-streaming runner generateContentViaInteractions wrapping @google/genai interactions resource calls.
  4. Model Integration:
    • Updated Gemini class (core/src/models/google_llm.ts) to accept useInteractionsApi?: boolean parameter, toggling the flow to delegate to generateContentViaInteractions when enabled.
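
The lookup and trimming steps described above can be sketched roughly as follows. This is a minimal illustration under simplified assumptions, not the code in this PR: the `SessionEvent`, `Content`, and `Part` shapes and the function names `findPreviousInteractionId` and `latestUserContents` are hypothetical stand-ins, and the actual branch-matching rule in InteractionsRequestProcessor may differ from the strict equality used here.

```typescript
// Hypothetical, simplified shapes; the real ADK types carry more fields.
interface Part {
  text?: string;
  functionCall?: {name: string};
  functionResponse?: {name: string};
}
interface Content {
  role: 'user' | 'model';
  parts: Part[];
}
interface SessionEvent {
  author: string; // agent name, or 'user'
  branch?: string;
  interactionId?: string;
}

// Scan the events newest-first and return the first interactionId recorded
// for the given agent on the given branch (assumed: exact branch equality).
function findPreviousInteractionId(
  events: SessionEvent[],
  agentName: string,
  branch?: string,
): string | undefined {
  return [...events]
    .reverse()
    .find(
      (e) =>
        e.author === agentName &&
        e.branch === branch &&
        e.interactionId !== undefined,
    )?.interactionId;
}

// Keep only the trailing run of user contents. If that run contains a
// function response, also keep the model content directly before it when
// that content holds a function call.
function latestUserContents(contents: Content[]): Content[] {
  let start = contents.length;
  while (start > 0 && contents[start - 1]?.role === 'user') {
    start--;
  }
  const tail = contents.slice(start);
  const hasFunctionResponse = tail.some((c) =>
    c.parts.some((p) => p.functionResponse !== undefined),
  );
  const prev = start > 0 ? contents[start - 1] : undefined;
  if (
    hasFunctionResponse &&
    prev !== undefined &&
    prev.parts.some((p) => p.functionCall !== undefined)
  ) {
    return [prev, ...tail];
  }
  return tail;
}
```

In this sketch, a request would carry the trimmed contents only when `findPreviousInteractionId` returns a value; otherwise the full history would be sent as before.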

Testing Plan
Please describe the tests that you ran to verify your changes. This is required for all PRs that are not small documentation or typo fixes.

Unit Tests:

  • I have added or updated unit tests for my change.
  • All unit tests pass locally.

We implemented extensive unit tests targeting the stateful request processor and the payload converters:

  • core/test/agents/processors/interactions_request_processor_test.ts (6 tests)
  • core/test/models/interactions_utils_test.ts (89 tests)

Summary of passed npm test results:

 RUN  v3.2.4 /usr/local/google/home/amaadmartin/Workspace/Agentspaces/feat-interactions-api-2/adk-js

 Test Files  2 passed (2)
      Tests  95 passed (95)
   Start at  13:17:31
    Duration  10.68s

Both new source files in adk-js/core reach 100% statement, branch, function, and line coverage:

  • core/src/agents/processors/interactions_request_processor.ts: 100% Coverage
  • core/src/models/interactions_utils.ts: 100% Coverage

Manual End-to-End (E2E) Tests:
We created a verification script verify_interactions.ts in the root of the workspace. It tests a two-turn conversation:

  1. Turn 1: "My favorite color is deep blue. Remember this." (verifies the model responds and returns a valid interactionId).
  2. Turn 2: "What is my favorite color?" with previousInteractionId set (verifies history is trimmed and the model correctly recalls "blue" from the server-side state).

To execute manual verification:

GEMINI_API_KEY=your_live_api_key npx tsx verify_interactions.ts
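
The two-turn flow can also be illustrated offline. The sketch below uses a hypothetical in-memory stand-in, `FakeInteractionsServer`, rather than the real @google/genai client or the verify_interactions.ts script; it only shows how a second request that carries `previousInteractionId` plus the newest user turn can still be answered against the full server-side history.

```typescript
// Hypothetical in-memory stand-in for a stateful backend.
// This is NOT the @google/genai API; all names here are illustrative.
interface Turn {
  role: 'user' | 'model';
  text: string;
}
interface FakeRequest {
  input: Turn[];
  previousInteractionId?: string;
}
interface FakeResponse {
  interactionId: string;
  text: string;
}

class FakeInteractionsServer {
  private histories = new Map<string, Turn[]>();
  private counter = 0;

  create(req: FakeRequest): FakeResponse {
    // Restore the stored history when the client references an earlier interaction.
    const prior =
      req.previousInteractionId !== undefined
        ? this.histories.get(req.previousInteractionId) ?? []
        : [];
    const history = [...prior, ...req.input];
    // Toy "model": report how many user turns the server knows about.
    const userTurns = history.filter((t) => t.role === 'user').length;
    const text = `I have seen ${userTurns} user turn(s).`;
    const interactionId = `interaction-${++this.counter}`;
    this.histories.set(interactionId, [...history, {role: 'model', text}]);
    return {interactionId, text};
  }
}

const server = new FakeInteractionsServer();

// Turn 1: no previous interaction, so the full input is sent.
const first = server.create({
  input: [{role: 'user', text: 'My favorite color is deep blue. Remember this.'}],
});

// Turn 2: only the newest user turn is sent, plus the id returned by turn 1.
const second = server.create({
  input: [{role: 'user', text: 'What is my favorite color?'}],
  previousInteractionId: first.interactionId,
});

console.log(second.text);
```

The second request contains a single turn, yet the stand-in server counts two user turns, which is the payload reduction the stateful design aims for.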

Checklist

  • I have read the CONTRIBUTING.md document.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have added tests that prove my fix is effective or that my feature works.
  • New and existing unit tests pass locally with my changes.
  • I have manually tested my changes end-to-end.
  • Any dependent changes have been merged and published in downstream modules.

Additional context
TAG=agy
CONV=8a91ed6a-f4db-4160-83d9-68d5e80e066c
