SAIDA is a Python framework for deterministic data analysis.
Its core contract is simple:
- input: AnalysisPlan
- output: AnalysisResult

I built SAIDA to make BI dashboarding and day-to-day analysis easier to automate, easier to test, and easier to plug into apps.
- canonical AnalysisPlan execution
- deterministic validation before compute
- standardized AnalysisResult output
- pluggable source, compute, and output adapters
- optional prompt and LLM helpers that sit on top of the core
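
To make "pluggable adapters" concrete, here is a minimal sketch of the idea using only the standard library. Every name in it (`SourceAdapter`, `InMemorySource`, `CsvTextSource`, `count_by`) is hypothetical and is not part of SAIDA's actual API; it only shows how a compute step can stay unchanged while the source is swapped.

```python
import csv
import io
from collections import Counter
from typing import Protocol

Rows = list[dict[str, str]]


class SourceAdapter(Protocol):
    """Hypothetical interface: anything that can produce rows."""

    def load(self) -> Rows: ...


class InMemorySource:
    """Serves rows that are already in memory."""

    def __init__(self, rows: Rows) -> None:
        self._rows = rows

    def load(self) -> Rows:
        return list(self._rows)


class CsvTextSource:
    """Parses rows from CSV text with a header line."""

    def __init__(self, text: str) -> None:
        self._text = text

    def load(self) -> Rows:
        return list(csv.DictReader(io.StringIO(self._text)))


def count_by(source: SourceAdapter, column: str) -> dict[str, int]:
    """Compute step that depends only on the SourceAdapter interface."""
    return dict(Counter(row[column] for row in source.load()))


memory_counts = count_by(
    InMemorySource([{"team": "billing"}, {"team": "support"}, {"team": "billing"}]),
    "team",
)
csv_counts = count_by(CsvTextSource("team\nbilling\nsupport\nbilling\n"), "team")
```

Both calls produce the same counts, because `count_by` never needs to know where the rows came from.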
The main runtime flow is:
Dataset -> AnalysisPlan -> Validate -> Execute -> AnalysisResult
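
As a rough illustration of that flow, the sketch below validates a plan before any compute happens and only then executes it. `Plan`, `Result`, `validate`, and `execute` are simplified stand-ins that I made up for this example; SAIDA's real contracts are richer and may differ.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Any

Rows = list[dict[str, Any]]
SUPPORTED_ACTIONS = {"count_rows_by_group"}


# Hypothetical stand-ins; these are not SAIDA's real classes.
@dataclass(frozen=True)
class Plan:
    action: str
    parameters: dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class Result:
    name: str
    value: Any


def validate(rows: Rows, plan: Plan) -> list[str]:
    """Return a list of problems; an empty list means the plan may run."""
    errors = []
    if plan.action not in SUPPORTED_ACTIONS:
        errors.append(f"unsupported action: {plan.action}")
    group_by = plan.parameters.get("group_by") or []
    if not group_by:
        errors.append("group_by must name at least one column")
    columns = set(rows[0]) if rows else set()
    errors.extend(f"unknown column: {c}" for c in group_by if c not in columns)
    return errors


def execute(rows: Rows, plan: Plan) -> Result:
    """Run the plan, refusing to compute anything if validation fails."""
    errors = validate(rows, plan)
    if errors:
        raise ValueError("; ".join(errors))
    keys = plan.parameters["group_by"]
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    # Sort the groups so the same input always yields the same output order.
    value = [dict(zip(keys, group), row_count=n) for group, n in sorted(counts.items())]
    return Result(name="group_row_counts", value=value)


rows = [{"team": "billing"}, {"team": "support"}, {"team": "billing"}]
result = execute(rows, Plan("count_rows_by_group", {"group_by": ["team"]}))
problems = validate(rows, Plan("count_rows_by_group", {"group_by": ["region"]}))
```

Here `problems` reports the unknown `region` column without touching the data, which is the point of validating before executing.
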
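# Load a CSV dataset together with its Markdown context file.
from saida.sources import CSVSource

dataset = CSVSource(
    "examples/datasets/support_tickets_500.csv",
    context_path="examples/contexts/support_tickets_500.md",
).load()

# Build an explicit plan and execute it.
from saida import Saida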
from saida.core.contracts import AnalysisPlan, PlanStep
engine = Saida()
plan = AnalysisPlan(
    task_type="descriptive",
    rationale="Count tickets by team.",
    expected_result_name="group_row_counts",
    expected_result_shape="table",
    steps=[
        PlanStep(
            step_id="count_rows_by_group",
            tool_family="duckdb",
            action="count_rows_by_group",
            method_id="count_rows_by_group",
            family="aggregation_grouping",
            parameters={"group_by": ["team"]},
            description="Count tickets by team.",
        )
    ],
)
result = engine.execute_plan(dataset, plan)
print(result.response["result"]["name"])
print(result.response["result"]["value"])

# Render the same result as JSON or as summary text.
json_payload = engine.render_output(result, output_format="json")
summary_text = engine.render_output(result, output_format="summary")

Prompt handling is supported, but it is optional.
If you want prompt-to-plan generation:
from saida import PromptAnalysisFrontend
frontend = PromptAnalysisFrontend()
plan = frontend.plan(dataset, "How many tickets do we have by team?")
result = frontend.analyze(dataset, "How many tickets do we have by team?")

Prompt and LLM features are not the compute layer. They are convenience tooling that generates candidate plans or optional summaries.
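
One way to picture that separation: the prompt layer only proposes a candidate plan, and the deterministic core decides whether to run it. The sketch below is an assumption-laden illustration, not SAIDA's implementation. `keyword_planner` is a trivial keyword matcher standing in for an LLM, and `run_plan` and `analyze` are hypothetical helpers.

```python
from collections import Counter
from typing import Any, Callable

Rows = list[dict[str, Any]]
CandidatePlan = dict[str, Any]
Planner = Callable[[str, list[str]], CandidatePlan]


def keyword_planner(prompt: str, columns: list[str]) -> CandidatePlan:
    """Stand-in for an LLM: group by the first column named in the prompt."""
    words = prompt.lower().replace("?", "").split()
    group_by = [c for c in columns if c.lower() in words][:1]
    return {"action": "count_rows_by_group", "group_by": group_by}


def run_plan(rows: Rows, plan: CandidatePlan) -> dict[str, int]:
    """Deterministic core: rejects anything it does not recognize."""
    columns = list(rows[0]) if rows else []
    if plan.get("action") != "count_rows_by_group":
        raise ValueError(f"unsupported action: {plan.get('action')!r}")
    group_by = plan.get("group_by") or []
    if len(group_by) != 1 or group_by[0] not in columns:
        raise ValueError(f"invalid group_by: {group_by!r}")
    return dict(Counter(row[group_by[0]] for row in rows))


def analyze(rows: Rows, prompt: str, planner: Planner) -> dict[str, int]:
    """Ask the planner for a candidate plan, then hand it to the core."""
    columns = list(rows[0]) if rows else []
    return run_plan(rows, planner(prompt, columns))


rows = [
    {"team": "billing", "status": "open"},
    {"team": "support", "status": "closed"},
    {"team": "billing", "status": "closed"},
]
counts = analyze(rows, "How many tickets do we have by team?", keyword_planner)
```

Because `run_plan` treats the planner's output as untrusted input, a planner that proposes an unknown action fails with a `ValueError` instead of reaching the data.
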
Supported source adapters:
- CSV
- Excel
- JSON
- pandas
- SQL sources such as SQLite, PostgreSQL, and MySQL

Supported compute adapters:
- DuckDB
- metadata/profile-backed execution
- statsmodels-backed execution

Supported output formats:
- JSON
- summary text
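
For a feel of how one result can feed both output formats, here is a small standard-library sketch. `render_result` and the result dictionary layout are assumptions made for this example; SAIDA's real `render_output` and `AnalysisResult` may be shaped differently.

```python
import json
from typing import Any


def render_result(result: dict[str, Any], output_format: str = "json") -> str:
    """Hypothetical renderer: one result, two output formats."""
    if output_format == "json":
        return json.dumps(result, sort_keys=True)
    if output_format == "summary":
        rows = result["value"]
        lines = [f"{result['name']}: {len(rows)} rows"]
        lines += [", ".join(f"{k}={v}" for k, v in row.items()) for row in rows]
        return "\n".join(lines)
    raise ValueError(f"unsupported output format: {output_format!r}")


# Assumed result layout, loosely modeled on the name/value pair used above.
result = {
    "name": "group_row_counts",
    "value": [
        {"team": "billing", "row_count": 2},
        {"team": "support", "row_count": 1},
    ],
}
json_payload = render_result(result, output_format="json")
summary_text = render_result(result, output_format="summary")
```

The JSON form is meant for apps and tests, while the summary form is meant for people; both are derived from the same result, so they cannot disagree.
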
SAIDA is strongest today as a framework for:
- aggregation and grouping
- tabular retrieval
- ranking
- verification checks
- schema and metadata inspection
- distinct value analysis
- time-based analysis
- period comparison
- statistical inference
- diagnostic workflows
Reserved ML-facing APIs still exist, but forecasting and model training are not the main stable feature area yet.
