feat: Add LangExtract tool for structured information extraction #91

@its-animay

🔴 Required Information

Is your feature request related to a specific problem?

The ADK community package lacks a native integration for LangExtract, Google's own library for extracting structured information from unstructured text using LLMs with precise source grounding. Users who want to use LangExtract within ADK agents currently have to wrap `lx.extract()` by hand in a custom tool class, which takes significant boilerplate.

Describe the Solution You'd Like

Add a `LangExtractTool` to `google.adk_community.tools` that:

  • Extends `BaseTool` with a clean function declaration exposing `text` and `prompt_description` as LLM-visible parameters
  • Pre-configures extraction settings (examples, model_id, extraction_passes, etc.) at construction time
  • Runs `lx.extract()` via `asyncio.to_thread()` to avoid blocking the event loop
  • Includes a companion `LangExtractToolConfig` for easy programmatic configuration
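
The pattern in the bullets above (settings fixed at construction time, only `text` and `prompt_description` supplied per call, blocking work moved off the event loop) can be sketched without either library installed. This is a minimal illustration, not the implementation offered in this issue: `fake_extract` is a hypothetical stand-in for `lx.extract()`, and `ExtractToolSketch` and its `run_async(args=...)` signature are assumptions made for the example, not the real `BaseTool` interface.

```python
import asyncio
import time
from typing import Any, Callable, Dict


def fake_extract(text: str, prompt_description: str, **settings: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for the blocking lx.extract() call."""
    time.sleep(0.05)  # simulate slow, blocking work such as an LLM request
    return {
        "text": text,
        "prompt_description": prompt_description,
        "settings": settings,
    }


class ExtractToolSketch:
    """Illustrative only: not the proposed LangExtractTool, not a BaseTool subclass."""

    def __init__(
        self,
        name: str,
        description: str,
        extract_fn: Callable[..., Dict[str, Any]] = fake_extract,
        **settings: Any,
    ) -> None:
        self.name = name
        self.description = description
        self._extract_fn = extract_fn
        # Extraction settings are fixed here and never exposed to the LLM.
        self._settings = settings

    async def run_async(self, *, args: Dict[str, Any]) -> Dict[str, Any]:
        # Only the two per-call parameters come from the caller.
        # asyncio.to_thread runs the blocking function in a worker thread,
        # so the event loop stays free while extraction is in progress.
        return await asyncio.to_thread(
            self._extract_fn,
            args["text"],
            args["prompt_description"],
            **self._settings,
        )


async def main() -> Dict[str, Any]:
    tool = ExtractToolSketch(
        "extract_entities",
        "Extract named entities from text.",
        extraction_passes=2,  # pre-configured at construction time
    )
    return await tool.run_async(
        args={"text": "John works at Google.", "prompt_description": "Find people."}
    )


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Accepting the extraction function as a constructor argument is a choice made for this sketch so that it can be exercised without network access; the proposed tool would presumably call `lx.extract()` directly.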

Usage:

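# Agent is used below but was not imported in the original snippet.
from google.adk.agents import Agent
from google.adk_community.tools import LangExtractTool
import langextract as lx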

tool = LangExtractTool(
    name='extract_entities',
    description='Extract named entities from text.',
    examples=[
        lx.data.ExampleData(
            text='John is a software engineer at Google.',
            extractions=[
                lx.data.Extraction(
                    extraction_class='person',
                    extraction_text='John',
                    attributes={'role': 'software engineer', 'company': 'Google'},
                )
            ],
        )
    ],
)

agent = Agent(model='gemini-2.5-flash', name='extraction_agent', tools=[tool])

Impact on your work

Enables ADK agents to perform structured extraction (entities, attributes, relationships) from documents out of the box, a common use case for enterprise AI workflows. Since LangExtract is a Google library, having native ADK community support is a natural fit.

Willingness to contribute

Yes, I have an implementation ready to submit as a PR.


🟡 Recommended Information

Additional Context
