Arke

Let LLMs write the kernels and choose the optimizations. Let compilers verify the result.

Overview

Arke is an LLM-native programming language, IR, compiler toolchain, and agent engineering system for GPU/NPU kernels. Its benchmark systems drive the LLM toward kernels that generalize in both operator functionality and performance.

Status & Version Semantics

Release line: v0.1.0 — the current project, Python package, CLI, .ak language, and .akir IR contract all start from this version.
Python package / CLI release: 0.1.0 — exposed by pyproject.toml, arke.__version__, and CLI metadata.
Arke Language schema: 0.1.0 — the canonical .ak surface documented in docs/spec/arke-lang-spec.md.
Arke IR / .akir schema: 0.1.0 — the canonical multi-layer IR contract documented in docs/spec/arke-ir-spec.md.
Repository policy: the active tree describes the current Arke architecture directly as the clean starting point for Arke-Lang, Arke-IR, Arke-Compiler, and Arke-Agent.

Key Features

Project-Level Features

🤖 AI-First Design — Arke treats LLMs as optimization decision makers, not just code generators.
🔗 Semantic/Strategy Separation — "What to compute" and "how to optimize" are represented independently, enabling safe and reversible strategy exploration.
🪙 Minimal-Token Efficiency — The path from kernel definition through optimization and verification is designed to minimize token consumption.
🛡️ Compiler-Verified Optimization — Optimization decisions are validated through deterministic checks, from static legality to numerical correctness and performance.
💬 @rationale as a First-Class Artifact — Decisions carry natural-language explanations that make optimization trajectories auditable, reusable, and learnable.
⚡ Cross-Hardware Performance Ambition — A single semantic definition can lower toward multiple hardware targets while preserving a consistent optimization model.

The LLM-Native Stack

📝 LLM-Native Language

Arke's .ak language is a compact operator description surface for both humans and LLMs. It separates kernel semantics from strategy decisions so the mathematical definition remains stable while optimization policy evolves independently.

🧬 LLM-Native IR

Arke IR makes the split explicit: Semantic IR captures what to compute, while Strategy IR captures how to optimize. This separation is the foundation for bounded action spaces, staged verification, rollback, and multi-backend lowering.
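The split described above can be pictured with a small Python sketch. The class names and fields here (`SemanticIR`, `StrategyIR`, `Decision`) are hypothetical illustrations, not the actual `.akir` schema; the point is only that the semantic side is frozen while the strategy side grows and shrinks by whole decisions.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class SemanticIR:
    """WHAT to compute. Frozen, so no optimization step can alter it (hypothetical shape)."""
    name: str
    ops: Tuple[str, ...]


@dataclass(frozen=True)
class Decision:
    """One optimization decision together with its natural-language rationale."""
    action: str
    params: Tuple[Tuple[str, object], ...]
    rationale: str


@dataclass(frozen=True)
class StrategyIR:
    """HOW to optimize: an ordered list of decisions for one target (hypothetical shape)."""
    target: str
    decisions: Tuple[Decision, ...] = ()

    def apply(self, decision: Decision) -> "StrategyIR":
        # Return a new strategy; the previous object remains usable as a checkpoint.
        return replace(self, decisions=self.decisions + (decision,))

    def rollback(self) -> "StrategyIR":
        # Drop the most recent decision; the semantics are never touched.
        return replace(self, decisions=self.decisions[:-1])


semantic = SemanticIR(name="fused_matmul_relu", ops=("matmul", "relu"))
s0 = StrategyIR(target="nvidia_ampere")
s1 = s0.apply(
    Decision(
        action="fuse",
        params=(("ops", ("matmul", "relu")), ("type", "epilogue")),
        rationale="remove an intermediate write to global memory",
    )
)
s2 = s1.rollback()  # equal to s0 again; `semantic` was never involved
```

Because both sides are immutable values, a rollback is just a return to an earlier strategy object, which is one plausible way to get the "safe and reversible" exploration the design aims for.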

🧰 LLM-Native Compiler Toolchain

The compiler is more than code generation. It enumerates legal actions, checks IR validity, lowers to backend-specific representations, and measures correctness and performance under a structured verification flow.
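As a rough illustration of what "enumerating legal actions" could look like, the sketch below lists tiling actions that pass one simple legality rule. The rule (a tile size must evenly divide the loop extent), the function name, the action dictionary layout, and the loop name `k` are all assumptions made for this example; the real compiler's action space is defined in the specs.

```python
from typing import Dict, List


def enumerate_tile_actions(loop_extents: Dict[str, int], candidate_sizes: List[int]) -> List[dict]:
    """Return every (loop, tile size) pair that passes the assumed legality rule.

    Assumed rule, for illustration only: the tile size must be positive,
    no larger than the loop extent, and divide the extent evenly.
    """
    actions = []
    for loop, extent in sorted(loop_extents.items()):
        for size in candidate_sizes:
            if 0 < size <= extent and extent % size == 0:
                actions.append({"action": "tile", "loop": loop, "size": size})
    return actions


# Loop extents taken from a [1024, 512] x [512, 2048] matmul.
legal = enumerate_tile_actions({"i": 1024, "j": 2048, "k": 512}, [48, 64, 128, 1024])
for entry in legal:
    print(entry)
```

The agent only ever sees the returned list, so an illegal choice such as a tile of 48 is never offered in the first place.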

🤖 AI Agent System

Arke's agent layer drives the optimization loop itself: analyze the kernel, choose legal actions, apply decisions with @rationale, verify outcomes, roll back when necessary, and iterate under compiler-enforced constraints.
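The loop can be sketched in a few lines of Python. Everything below is a toy under stated assumptions: `run_agent_loop`, the action names, the first-option "chooser" standing in for an LLM, and the fake cost model are hypothetical and do not reflect the real agent or ArkeEnv interfaces.

```python
from typing import Callable, List, Tuple

Strategy = Tuple[str, ...]


def run_agent_loop(
    legal_actions: Callable[[Strategy], List[str]],
    choose: Callable[[Strategy, List[str]], Tuple[str, str]],
    verify: Callable[[Strategy], Tuple[bool, float]],
    max_cycles: int,
) -> Tuple[Strategy, List[dict]]:
    """Bounded optimization loop: each step is either kept or rolled back."""
    strategy: Strategy = ()
    _, best_cost = verify(strategy)  # baseline measurement
    rejected = set()                 # actions already tried and rolled back
    log: List[dict] = []
    for cycle in range(max_cycles):
        options = [a for a in legal_actions(strategy) if a not in rejected]
        if not options:
            break                    # nothing legal left to try
        action, rationale = choose(strategy, options)
        checkpoint = strategy        # tuples are immutable, so this is a safe snapshot
        candidate = strategy + (action,)
        ok, cost = verify(candidate)
        if ok and cost < best_cost:
            strategy, best_cost, outcome = candidate, cost, "kept"
        else:
            strategy, outcome = checkpoint, "rolled_back"
            rejected.add(action)
        log.append({"cycle": cycle, "action": action, "rationale": rationale, "outcome": outcome})
    return strategy, log


# Toy stand-ins (all hypothetical).
ACTIONS = ["tile_i", "tile_j", "unroll_k", "fuse_epilogue"]
GAINS = {"tile_i": 20.0, "tile_j": 10.0, "fuse_epilogue": 30.0}


def toy_legal_actions(strategy: Strategy) -> List[str]:
    return [a for a in ACTIONS if a not in strategy]


def toy_choose(strategy: Strategy, options: List[str]) -> Tuple[str, str]:
    return options[0], f"try {options[0]} next"


def toy_verify(strategy: Strategy) -> Tuple[bool, float]:
    if "unroll_k" in strategy:       # pretend this action fails verification
        return False, float("inf")
    return True, 100.0 - sum(GAINS[a] for a in strategy)


final_strategy, trajectory = run_agent_loop(toy_legal_actions, toy_choose, toy_verify, max_cycles=4)
print(final_strategy)
print([step["outcome"] for step in trajectory])
```

In this toy run the `unroll_k` step fails verification, is rolled back, and is not proposed again, while the other three decisions are kept.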

Architecture At A Glance

At a high level: Semantic IR defines what to compute, Strategy IR defines how to optimize, the compiler validates and lowers, and the agent iterates within that structured space.

  Natural language │ Python/Triton │ CUDA/Ascend C │ .ak source │ Benchmarks │ ...
                             │
                             │ LLM translates
                             ▼
  ┌────────────────────────────────────────────────────────────┐
  │  .ak — Arke Language (AI-Native Operator Programming)      │
  │  kernel { semantics }    strategy { @rationale decisions } │
  └────────────────────────────┬───────────────────────────────┘
                               │ parse
                               ▼
  ┌────────────────────────────────────────────────────────────┐
  │            Semantic IR — WHAT to compute                   │
  │         (immutable math, graph structure, correctness)     │
  └────────────────────────────┬───────────────────────────────┘
                               │
  ┌────────────────────────────▼───────────────────────────────┐
  │   LLM(Agent loop) ◄══ Structured Protocol ══► Compiler     │
  │                                                            │
  │   analyze → choose → apply → verify → rollback → iterate   │
  │                                                            │
  │  LLM Agent (Decides)       ArkeEnv (Verifies)              │
  │  ┌──────────────────┐      ┌─────────────────────────────┐ │
  │  │ analyze kernel   │─────►│ enumerate legal_actions     │ │
  │  │ select action    │◄─────│ (bounded decision space)    │ │
  │  │ apply @rationale │─────►│ validate: V0(<1ms)→V1→V2    │ │
  │  │ iterate / stop   │◄─────│ checkpoint / rollback       │ │
  │  └──────────────────┘      └───────────────┬─────────────┘ │
  │                                            │               │
  │  ┌─────────────────────────────────────────▼─────────────┐ │
  │  │           Strategy IR — HOW to optimize               │ │
  │  │    explicit decisions, rationale, backend-aware flow  │ │
  │  └───────────────────────────────────────────────────────┘ │
  └────────────────────────────┬───────────────────────────────┘
                               │
  ┌────────────────────────────▼───────────────────────────────┐
  │  Codegen Backends (progressive depth into hardware)        │
  │                                                            │
  │   Triton   │  MLIR Dialect  │   LLVM IR   │   HW ISA       │
  │  (Phase 1) │   (Phase 3)    │  (Phase 4)  │  (Future)      │
  │                                                            │
  │  ◄── deeper hardware control ── extreme performance ──►    │
  └────────────────────────────┬───────────────────────────────┘
                               │
  ┌────────────────────────────▼───────────────────────────────┐
  │      GPU / NPU Execution: NVIDIA │ Ascend │ AMD │ ...      │
  └────────────────────────────────────────────────────────────┘

Semantic IR is the source of truth for correctness-oriented reasoning.
Strategy IR keeps optimization decisions explicit instead of burying them in free-form backend code.
The compiler owns legality, validation, lowering, and measurement.
The agent operates inside a bounded, inspectable optimization loop instead of an open-ended code generation loop.
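The diagram's `V0 → V1 → V2` validation chain can be read as a cheapest-first pipeline that stops at the first failure. The sketch below assumes V0 is a static legality check, V1 a numerical correctness check, and V2 a performance check, following the "static legality to numerical correctness and performance" wording earlier in this README; that mapping, the function names, and the candidate fields are assumptions, not the documented ArkeEnv contract.

```python
from typing import Callable, List, Tuple

Stage = Tuple[str, Callable[[dict], bool]]


def run_staged_verification(candidate: dict, stages: List[Stage]) -> Tuple[bool, List[str]]:
    """Run stages in order and stop at the first failure, so costly stages are skipped."""
    report: List[str] = []
    for name, check in stages:
        if not check(candidate):
            report.append(f"{name}:fail")
            return False, report
        report.append(f"{name}:pass")
    return True, report


calls: List[str] = []  # records which stages actually ran


def v0_static(c: dict) -> bool:
    calls.append("V0")
    return c["extent"] % c["tile"] == 0          # assumed legality rule


def v1_numeric(c: dict) -> bool:
    calls.append("V1")
    return c["max_abs_err"] <= 1e-3              # assumed tolerance


def v2_perf(c: dict) -> bool:
    calls.append("V2")
    return c["latency_ms"] < c["baseline_ms"]    # must beat the baseline


STAGES: List[Stage] = [("V0", v0_static), ("V1", v1_numeric), ("V2", v2_perf)]

# In a real flow the error and latency would be measured; here they are fixed toy values.
good = {"tile": 64, "extent": 1024, "max_abs_err": 5e-4, "latency_ms": 0.8, "baseline_ms": 1.0}
bad = {"tile": 48, "extent": 1024, "max_abs_err": 5e-4, "latency_ms": 0.8, "baseline_ms": 1.0}

ok_good, report_good = run_staged_verification(good, STAGES)
ok_bad, report_bad = run_staged_verification(bad, STAGES)
print(report_good, report_bad, calls)
```

The illegal candidate is rejected by the static check alone, so neither the numerical nor the performance stage runs for it.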

Minimal Example

Arke separates pure computation from optimization policy. The kernel block says what to compute; the strategy block says how to optimize it for a target.

kernel fused_matmul_relu(
    A: Tensor<[1024, 512], f16>,
    B: Tensor<[512, 2048], f16>
) -> Tensor<[1024, 2048], f16> {
    let C = matmul(A=A, B=B);
    let Y = relu(X=C);
    return Y;
}

strategy fused_matmul_relu for target("nvidia_ampere") {
    tile(loop="i", factors=[64, 16])
        @rationale("align tiles with the target's execution structure");
    tile(loop="j", factors=[128, 8])
        @rationale("improve memory coalescing on the output path");
    fuse(ops=["matmul", "relu"], type=epilogue)
        @rationale("remove an intermediate write to global memory");
}
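To make the `fuse(..., type=epilogue)` decision concrete, here is a plain-Python reference sketch (not Arke code, and not how the backend lowers it). It computes the same `relu(matmul(A, B))` two ways: once with the intermediate matrix materialized, and once with the ReLU applied as each output element is produced. The function names are hypothetical; the point is that the strategy changes how the result is computed, not what it is.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)] for i in range(rows)]


def relu(x: Matrix) -> Matrix:
    return [[max(v, 0.0) for v in row] for row in x]


def unfused(a: Matrix, b: Matrix) -> Matrix:
    c = matmul(a, b)  # intermediate C is fully materialized first
    return relu(c)


def fused_epilogue(a: Matrix, b: Matrix) -> Matrix:
    rows, inner, cols = len(a), len(b), len(b[0])
    y = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for k in range(inner):
                acc += a[i][k] * b[k][j]
            y[i][j] = max(acc, 0.0)  # ReLU applied before the single store
    return y


A = [[1.0, -2.0], [3.0, 4.0]]
B = [[1.0, 0.0], [2.0, -1.0]]
print(unfused(A, B))
print(fused_epilogue(A, B))
```

Both calls return the same matrix, which is the property a semantics-preserving strategy has to keep.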

Why This Design Works

Verifiable — semantics stay separate from optimization, so correctness can be checked against a stable computation definition.
Searchable — optimization is expressed as explicit decisions rather than hidden inside handwritten backend code.
LLM-friendly — the language and IR reduce token-heavy boilerplate while preserving enough structure for planning and validation.
Portable — semantics remain stable while lowering and strategy specialization adapt to hardware targets.

Token Efficiency

Arke is designed to reduce token usage across the full optimization loop, not just the surface syntax of a kernel definition.

Representation	Tokens	Ratio (vs. kernel-only `.ak`)
Arke `.ak` (kernel only)	72	1x
Arke `.ak` (kernel + strategy)	160	2x
LLM direct-write Triton	563	8x
Triton (autotuned, hand-written)	1,102	15x
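The Ratio column can be reproduced from the token counts in the table above by dividing each count by the kernel-only figure and rounding. The short script below does that, and also derives a like-for-like figure (kernel plus strategy against hand-written autotuned Triton) that the table does not list directly. The variable names are only for illustration.

```python
TOKEN_COUNTS = {
    "Arke .ak (kernel only)": 72,
    "Arke .ak (kernel + strategy)": 160,
    "LLM direct-write Triton": 563,
    "Triton (autotuned, hand-written)": 1102,
}

baseline = TOKEN_COUNTS["Arke .ak (kernel only)"]

# Ratio relative to the kernel-only definition, rounded to a whole multiple.
ratios = {name: round(count / baseline) for name, count in TOKEN_COUNTS.items()}

# Like-for-like: a full .ak (kernel + strategy) against hand-written autotuned Triton.
full_ak = TOKEN_COUNTS["Arke .ak (kernel + strategy)"]
like_for_like = round(TOKEN_COUNTS["Triton (autotuned, hand-written)"] / full_ak, 1)

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
print(f"autotuned Triton vs kernel + strategy: {like_for_like}x")
```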

Definition — semantic intent is represented directly instead of backend boilerplate.
Search — optimization steps become compact actions, not whole-program rewrites.
Verification — deterministic checks replace long back-and-forth debugging loops.
Iteration — invalid strategies roll back cleanly without regenerating everything.

For a deeper analysis, see docs/architecture/token-efficiency-analysis.md.

Quick Start

Prerequisites

Linux (tested on Ubuntu / WSL2)
Python 3.10+
NVIDIA GPU and CUDA for the GPU-oriented setup paths

One-Click Setup

git clone https://github.com/arke-lang/arke.git
cd arke
make setup

Other setup profiles:

make setup-cpu
make setup-gpu
make setup-bench

You can also use the bootstrap script directly:

scripts/bootstrap_env.sh cpu-dev
scripts/bootstrap_env.sh gpu-dev
scripts/bootstrap_env.sh bench

Manual Setup

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
python -m pytest tests/ -q

First Verified Commands

The current top-level CLI exposes the compiler-facing path and the Stage 8 MVP optimization path:

arke compile examples/operators/01_matmul.ak
arke optimize examples/operators/01_matmul.ak --output /tmp/arke-opt --cycles 3 --json

To write the resulting .akir JSON from the compiler path to a file:

arke compile examples/operators/01_matmul.ak -o /tmp/matmul.akir
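If you want to drive these commands from a script, a thin wrapper like the one below may be enough. It uses only the subcommands and flags shown above (`compile`, `-o`, `optimize`, `--output`, `--cycles`, `--json`); the helper names are hypothetical, and the wrapper simply skips execution when the `arke` executable is not on `PATH`.

```python
import shutil
import subprocess
from typing import List, Optional


def compile_cmd(source: str, out_path: Optional[str] = None) -> List[str]:
    """Build the argv for `arke compile`, optionally writing the .akir JSON to a file."""
    cmd = ["arke", "compile", source]
    if out_path is not None:
        cmd += ["-o", out_path]
    return cmd


def optimize_cmd(source: str, output_dir: str, cycles: int) -> List[str]:
    """Build the argv for `arke optimize` with the flags shown in this README."""
    return ["arke", "optimize", source, "--output", output_dir, "--cycles", str(cycles), "--json"]


def run_if_installed(cmd: List[str]) -> Optional[subprocess.CompletedProcess]:
    """Run the command only if its executable is on PATH; otherwise return None."""
    if shutil.which(cmd[0]) is None:
        return None
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


print(" ".join(compile_cmd("examples/operators/01_matmul.ak", "/tmp/matmul.akir")))
print(" ".join(optimize_cmd("examples/operators/01_matmul.ak", "/tmp/arke-opt", 3)))
```

Passing either list to `run_if_installed` returns a `CompletedProcess` whose `returncode`, `stdout`, and `stderr` can be inspected, or `None` when Arke is not installed in the current environment.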

For environment details and custom venv paths, see docs/architecture/python-environment-setup.md.

Current CLI

Today, the documented package entry points in the current prerelease distribution are:

arke compile <file.ak> — compile .ak source into Arke IR / .akir JSON
arke optimize <file.ak> — Stage 8 MVP flow: generate bounded StrategyIR, validate/lower, and emit machine-readable optimization artifacts

arke optimize currently accepts .ak file input and uses a deterministic heuristic strategy generator by default. It emits strategy.json, result.akir, trajectory.jsonl, and summary.json so agent workflows can validate the compile→profile→adjust contract before the live LLM provider path is enabled.
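A downstream tool could pick up the four artifacts with a loader along these lines. This is a sketch under assumptions: it treats `strategy.json`, `result.akir`, and `summary.json` as single JSON documents and `trajectory.jsonl` as one JSON object per line, and it makes no claim about the fields inside them. The demo at the bottom writes placeholder files, not real Arke output.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict

JSON_ARTIFACTS = ("strategy.json", "result.akir", "summary.json")


def load_optimize_artifacts(output_dir: str) -> Dict[str, Any]:
    """Load the artifacts written to the `arke optimize --output` directory."""
    root = Path(output_dir)
    artifacts: Dict[str, Any] = {}
    for name in JSON_ARTIFACTS:
        artifacts[name] = json.loads((root / name).read_text(encoding="utf-8"))
    # JSON Lines: one JSON object per non-empty line.
    with (root / "trajectory.jsonl").open(encoding="utf-8") as handle:
        artifacts["trajectory.jsonl"] = [json.loads(line) for line in handle if line.strip()]
    return artifacts


# Demo with placeholder contents (NOT the real schema).
with tempfile.TemporaryDirectory() as tmp:
    for name in JSON_ARTIFACTS:
        (Path(tmp) / name).write_text(json.dumps({"placeholder": name}), encoding="utf-8")
    (Path(tmp) / "trajectory.jsonl").write_text('{"step": 0}\n{"step": 1}\n', encoding="utf-8")
    demo = load_optimize_artifacts(tmp)

print(sorted(demo))
```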

Design documents describe richer optimization flows and agent-driven workflows; read those as architecture and roadmap material unless a specific interface is documented here and implemented in the package entry points.

If you are checking versions: the project, package, language schema, and IR schema are aligned on the v0.1.0 / 0.1.0 starting line. See docs/spec/arke-lang-spec.md#11-versioning and docs/spec/arke-ir-spec.md#15-versioning.
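A version-alignment check of this kind fits in a few lines. The helper names and the dictionary keys below are hypothetical, and the parser only handles plain `MAJOR.MINOR.PATCH` strings with an optional leading `v`; in a real environment the package value would come from `arke.__version__`.

```python
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, ...]:
    """Turn 'v0.1.0' or '0.1.0' into (0, 1, 0)."""
    core = tag[1:] if tag.startswith("v") else tag
    return tuple(int(part) for part in core.split("."))


def versions_aligned(*tags: str) -> bool:
    """True when every tag parses to the same version tuple."""
    return len({parse_version(tag) for tag in tags}) == 1


# Values as stated in this README; the keys are illustrative labels.
REPORTED = {
    "project": "v0.1.0",
    "package": "0.1.0",
    "language_schema": "0.1.0",
    "ir_schema": "0.1.0",
}
aligned = versions_aligned(*REPORTED.values())
print(aligned)
```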

Roadmap Snapshot

Arke is developed in four phases:

Phase 1 — Arke -> Triton -> NVIDIA GPU: validate the SIMT path, language/IR, compiler infrastructure, and benchmark system
Phase 2 — Arke -> Triton -> Ascend NPU: validate cross-architecture generalization on SIMD hardware
Phase 3 — Arke -> MLIR Dialect: gain deeper compiler control beyond Triton's abstraction boundary
Phase 4 — Arke -> LLVM IR: pursue lower-level backend completeness and broader hardware coverage

The active roadmap, Gate criteria, and stage details live in docs/roadmap/plan.md.

Documentation Guide

Start Here

The roadmap, Gate definitions, and benchmark terminology are maintained in the following docs:

docs/roadmap/plan.md — development roadmap, stages, Gates, and project-level status
docs/benchmark/benchmark-design.md — benchmark model and the BL / OT / ST / L terminology used throughout the project
docs/architecture/e2e-flow.md — end-to-end system walkthrough

Active Specs

docs/spec/arke-lang-spec.md — active Arke language specification
docs/spec/arke-ir-spec.md — active multi-layer IR specification
docs/spec/symbolic-dimension-spec.md — symbolic dimension model
docs/spec/pass-infrastructure-spec.md — pass system specification
docs/spec/op-registry-interface.md — operator registry contract

Architecture And Design

docs/architecture/arke-lang-spec-design.md — language design rationale
docs/architecture/arke-ir-spec-design.md — IR design rationale
docs/architecture/arke-compiler-infrastructure.md — compiler infrastructure design
docs/architecture/arke-harness.md — Arke Harness architecture and integration modes (A/B/C)
docs/architecture/naming-system.md — canonical terminology and naming
docs/architecture/python-environment-setup.md — environment bootstrap details
docs/architecture/token-efficiency-analysis.md — token-cost analysis

Benchmark System

docs/benchmark/benchmark-design.md — benchmark overview
docs/benchmark/benchmark-ops.md — operator tiers and catalog
docs/benchmark/benchmark-shapes.md — shape tiers and matrices
docs/benchmark/benchmark-protocol.md — measurement and scoring protocol
docs/benchmark/benchmark-csv-spec.md — benchmark result schema
docs/benchmark/operator-source-registry.md — baseline source registry

Stage Plans

docs/phase1/stage6-plan.md — compiler infrastructure
docs/phase1/stage7-plan.md — Lang and IR v0.1.0
docs/phase1/stage8-plan.md — agent autonomy
docs/phase1/stage9-plan.md — Phase 1 finalization
docs/phase1/design-review.md — design review and risk analysis
docs/phase1/completion-summary.md — summary of the earlier Phase 1 stages that are already complete

Repository History Policy

The active tree documents the current Arke language and IR surfaces only. The repository is treated as a clean starting point for the Arke four-piece architecture: Arke-Lang, Arke-IR, Arke-Compiler, and Arke-Agent.

Project Structure

arke/         core language, IR, compiler, backend, and agent packages
benchmarks/   benchmark runners, baselines, reports, and result artifacts
docs/         roadmap, specs, architecture notes, and stage plans
examples/     example `.ak` operators and walkthrough materials
tests/        automated coverage for language, compiler, benchmarks, and agent-adjacent flows
scripts/      bootstrap and project utility scripts

About The Name

Arke (Ἄρκη) is named after the swift-footed messenger of Greek mythology. In the context of this project, the name reflects a bridge between semantic intent and hardware-specific execution strategy.

License

Apache License 2.0
