Lingo is a lightweight, type-safe Python framework for building LLM-powered applications. It moves beyond generic "agents" and "chains" to focus on Conversational Modeling: the discipline of defining exactly how a system perceives, processes, and advances a dialogue state.
It unifies three powerful paradigms in a single, typed architecture:
- Procedural Skills (Linear, script-like flows)
- Symbolic States (Deterministic FSMs)
- Reflexive Patterns (Event-driven guardrails)
- Stateful by Default: The Python stack is your state machine. Use await engine.ask() to pause execution and wait for user input naturally.
- Cognitive Architecture: Mix rigid business rules (States) with flexible reasoning (Skills).
- Type-Safe: Built on Pydantic. All inputs, outputs, and tool calls are validated schemas.
- Low-Level Flow Control: Direct access to the underlying Flow graph for complex orchestration (Fork/Join, Retry, Loops).
- Native Tool-Calling (new in 2.0): Pass tools=[...] to LLM.chat() and lingo serializes the schemas, parses the model's tool calls back into Message.tool_calls, and surfaces thinking + stop_reason. The library wires the pipes; you keep control of the loop.
- Native tool-calling wire: LLM.chat(messages, tools=[...]) now serializes @tool schemas to OpenAI's native tools=[...] API field and parses the streamed tool_calls back into Message.tool_calls (a list of ToolCall(id, name, arguments)). New streaming callbacks on_toolcall_start / on_toolcall_delta / on_toolcall_end mirror the existing on_token pattern for live UIs.
- Finalized thinking on every message: reasoning fragments are now accumulated onto Message.thinking in addition to the existing on_reasoning_token stream.
- Message.stop_reason: captures the OpenAI finish_reason (stop / length / tool_calls / content_filter) so consumers can distinguish "the model stopped naturally" from "the model wants to call a tool" from "max tokens hit".
- ToolCall exported at the top level: from lingo import ToolCall.
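The stop_reason values listed above lend themselves to a small branching step after each chat call. The following is a minimal sketch, assuming only the four finish_reason values named above. FakeMessage and next_action are hypothetical stand-ins introduced for illustration (they are not part of lingo), so the snippet runs without the library installed.

```python
from dataclasses import dataclass, field


@dataclass
class FakeMessage:
    """Hypothetical stand-in mirroring the Message fields described above."""
    content: str = ""
    stop_reason: str = "stop"
    tool_calls: list = field(default_factory=list)


def next_action(msg: FakeMessage) -> str:
    """Map a stop_reason to what the caller should do next."""
    if msg.stop_reason == "tool_calls":
        return "run_tools"   # the model wants to call a tool
    if msg.stop_reason == "length":
        return "truncated"   # max tokens hit; the output may be cut off
    if msg.stop_reason == "content_filter":
        return "filtered"    # the provider filtered the response
    return "done"            # "stop": the model finished naturally


print(next_action(FakeMessage(stop_reason="tool_calls")))
print(next_action(FakeMessage(content="Hi", stop_reason="stop")))
```

With the real library, the same branching would presumably read msg.stop_reason from the Message returned by LLM.chat().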
Two paths coexist in 2.0, and you pick based on what you need:
- Native tool-calling (LLM.chat(tools=...) → Message.tool_calls): the LLM decides when and which tools to call, in parallel batches if it wants. You execute and decide whether to loop. Fastest path; built for agent-style applications. See examples/native_tool_call.py.
- Structured-output tool dispatch (Engine.equip / invoke / act): the developer decides which tool gets called and asks the LLM only to fill its arguments via structured output. More controlled; lets you drive the conversation explicitly. See examples/banker.py.
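What separates the two paths is who chooses the tool. The sketch below illustrates the second path in a library-agnostic way: the developer picks the function, and the model only supplies arguments that are then validated against the signature. Everything here is an assumption made for illustration: transfer, fill_arguments (a stand-in for the LLM's structured output), and invoke are hypothetical names, not lingo APIs.

```python
import inspect
import json


def transfer(account: str, amount: float) -> str:
    """Hypothetical tool chosen by the developer, not by the model."""
    return f"moved {amount:.2f} to {account}"


def fill_arguments(schema: dict, user_text: str) -> str:
    """Stand-in for the LLM: returns JSON matching the requested schema."""
    return json.dumps({"account": "savings", "amount": 25})


def invoke(func, user_text: str):
    # Derive a simple name -> type-name schema from the function signature.
    params = inspect.signature(func).parameters
    schema = {name: p.annotation.__name__ for name, p in params.items()}
    raw = json.loads(fill_arguments(schema, user_text))
    # Validate: reject missing or extra keys, then coerce to annotated types.
    if set(raw) != set(params):
        raise ValueError(f"expected {sorted(params)}, got {sorted(raw)}")
    args = {name: params[name].annotation(raw[name]) for name in params}
    return func(**args)


print(invoke(transfer, "put 25 dollars in savings"))  # moved 25.00 to savings
```

In the native path the model would also return the tool name; here the name never leaves the developer's hands, which is the extra control the second bullet describes.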
    pip install lingo-ai

Lingo allows you to model conversations as linear scripts. You don't need to manage session IDs or database steps manually: variables persist in memory across turns.

    import asyncio

    from lingo import Lingo

    # Initialize the application
    app = Lingo("Wizard", description="A helpful setup wizard")


    @app.skill
    async def onboarding(ctx, eng):
        # 1. Output a message
        await eng.reply(ctx, "Welcome to the system.")

        # 2. PAUSE execution and wait for user input
        # Lingo automatically suspends the stack here.
        # The variable 'name' is preserved in memory when the user replies!
        name = await eng.ask(ctx, "What is your name?")

        # 3. Resume and use context from previous turns
        email = await eng.ask(ctx, f"Hi {name}, what is your email?")

        # 4. Use structured decision making (LLM is forced to return bool)
        if await eng.decide(ctx, f"Is {email} a valid corporate email address?"):
            await eng.reply(ctx, "Registration complete.")
        else:
            await eng.reply(ctx, "Personal emails are not allowed.")


    if __name__ == "__main__":
        from lingo.cli import loop

        loop(app)

The runnable version of this quickstart lives at examples/wizard.py. It extends the pattern with engine.choose (LLM picks from a list), engine.create (LLM extracts a typed object), and engine.decide (LLM returns a bool); together with ask, these are the four conversational-modeling primitives.
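A plain Python generator is enough to show how local variables can survive across turns without a session store. This is a conceptual sketch only: it does not show how lingo implements eng.ask, and onboarding_script and run_turns are hypothetical names.

```python
def onboarding_script():
    """Each `yield` hands a prompt to the caller and suspends until a reply arrives."""
    name = yield "What is your name?"
    # `name` is still in scope after the pause; no session store is involved.
    email = yield f"Hi {name}, what is your email?"
    yield f"Registered {name} <{email}>."


def run_turns(script, replies):
    """Drive the script with a list of user replies, collecting its output."""
    gen = script()
    transcript = [next(gen)]                # run up to the first question
    for reply in replies:
        transcript.append(gen.send(reply))  # resume exactly where it paused
    return transcript


for line in run_turns(onboarding_script, ["Ada", "ada@example.com"]):
    print(line)
```

The script reads top to bottom like the skill above, while the driver decides when each reply is delivered.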
The examples/ directory has 10 runnable end-to-end demos:
- hello_world.py: the minimal Lingo() + CLI loop
- wizard.py: conversational-modeling primitives (ask / choose / decide / create)
- banker.py: structured tool dispatch via engine.equip + engine.invoke
- native_tool_call.py: 2.0 native tool-calling with a manual loop
- native_tool_call_streaming.py: same with live streaming callbacks
- fsm.py: StateMachine with per-state tools (triage → billing/tech)
- smart_home.py: nested skills with scoped tools (kitchen, etc.)
- state_rpg.py: persistent State (Pydantic) with state-mutating tools
- when.py: reflexive @app.when guardrail patterns
- injection.py: depends() + hidden underscore params + tool-from-tool

Every example is run end-to-end against MockLLM in CI (tests/test_examples.py), so they don't bitrot.
Lingo gives you the right abstraction for every type of logic.
Best for: Business Logic, Security Boundaries, Multi-Step Workflows.
Use the StateMachine to enforce strict rules about allowed transitions.

    from lingo.fsm import StateMachine

    # 1. Initialize the FSM with the bot's registry
    fsm = StateMachine(app.registry)


    @fsm.state
    async def login(ctx, eng):
        await eng.reply(ctx, "Please log in.")
        # Deterministic transition to the next state
        fsm.goto(dashboard, restart=True)


    @fsm.state
    async def dashboard(ctx, eng):
        await eng.reply(ctx, "Welcome to your dashboard.")
        # Logic restricted to this state...


    # Register the FSM as a skill
    @app.skill
    async def run_workflow(ctx, eng):
        await fsm.execute(ctx, eng)

Best for: Guardrails, Interruptions, Global Commands.
Use @app.when to define high-priority listeners that intercept messages before they reach skills.

    @app.when("User wants to quit or cancel the operation")
    async def emergency_stop(ctx, eng):
        await eng.reply(ctx, "Stopping immediately.")
        eng.stop()  # Terminates the flow and clears the stack

Best for: Parallel Processing, Retries, Complex Orchestration.
You can drop down to the Flow API to build complex execution graphs explicitly.

    from lingo import Flow

    # Define two sub-flows
    research = Flow("Research").reply("Searching for info...")
    draft = Flow("Draft").reply("Drafting content...")

    # Build a flow that runs them in parallel (Fork)
    # and summarizes the result
    complex_flow = (
        Flow("ParallelWorker")
        .fork(
            research,
            draft,
            aggregator="Combine the research and draft into a final report.",
        )
    )

For agent-style applications where the LLM decides which tools to call:

    import asyncio

    from lingo import LLM, Message, tool


    @tool
    async def get_weather(city: str) -> str:
        """Get the current weather for a city."""
        return f"sunny, 22°C in {city}"


    async def main():
        llm = LLM()  # picks up MODEL / BASE_URL / API_KEY from env
        messages = [Message.user("What's the weather in Havana?")]

        # Manual dispatch loop: keep calling until no more tool calls.
        while True:
            msg = await llm.chat(messages, tools=[get_weather])
            messages.append(msg)
            if not msg.tool_calls:
                print(msg.content)
                break
            for call in msg.tool_calls:
                result = await get_weather.run(**call.arguments)
                # tool_call_id is REQUIRED so the model can link the result
                # back to the call it made.
                messages.append(Message.tool(str(result), tool_call_id=call.id))


    asyncio.run(main())

Lingo doesn't loop for you, doesn't execute tools, and doesn't decide what's next. It wires the pipes (schema serialization out, tool-call parsing back, streaming events) and lets you build the agentic surface you want on top. For a complete agentic framework on top of lingo, see apiad/lovelaice.
See examples/native_tool_call.py for the one-shot loop and examples/native_tool_call_streaming.py for the streaming callbacks (live token + tool-call rendering).
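With more than one tool, the inner for loop of the manual dispatch loop has to look up each tool by name, and since the model may request calls in parallel batches, the batch can be run concurrently. The sketch below shows one way to do that under stated assumptions: ToolCall is a stand-in dataclass with the (id, name, arguments) shape described earlier, the tools are plain coroutine functions rather than @tool objects, and TOOLS and run_calls are hypothetical names, not lingo APIs.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class ToolCall:
    """Stand-in with the (id, name, arguments) shape described earlier."""
    id: str
    name: str
    arguments: dict


async def get_weather(city: str) -> str:
    return f"sunny in {city}"


async def get_time(city: str) -> str:
    return f"12:00 in {city}"


# Hypothetical registry: tool name -> coroutine function.
TOOLS = {"get_weather": get_weather, "get_time": get_time}


async def run_calls(calls):
    """Execute a batch of tool calls concurrently; return (call_id, result) pairs."""
    async def run_one(call):
        func = TOOLS.get(call.name)
        if func is None:
            # Report unknown tools back to the model instead of crashing.
            return call.id, f"error: unknown tool {call.name!r}"
        return call.id, await func(**call.arguments)

    # gather preserves the order of the input calls.
    return await asyncio.gather(*(run_one(c) for c in calls))


batch = [
    ToolCall("call_1", "get_weather", {"city": "Havana"}),
    ToolCall("call_2", "get_time", {"city": "Havana"}),
    ToolCall("call_3", "get_stock", {"ticker": "X"}),
]
for call_id, result in asyncio.run(run_calls(batch)):
    print(call_id, result)
```

Each returned pair would then be appended to the conversation as Message.tool(str(result), tool_call_id=call_id), as in the loop above.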
- Context: Mutable ledger of the conversation history.
- Engine: The "actuator" that drives the LLM. It exposes methods like .ask(), .decide(), .choose(), and .create().
- Flow: The underlying graph representation of all skills.
- LLM: One-call interface to the language model. Streams content/reasoning/tool-calls; returns a rich Message (content, tool_calls, thinking, stop_reason, usage).
We welcome contributions! Please see CONTRIBUTING for details on how to set up your development environment and submit pull requests.
This project is licensed under the MIT License - see the LICENSE file for details.
