Merged
1 change: 1 addition & 0 deletions Cargo.lock


8 changes: 8 additions & 0 deletions benches/programs/vec_hot_loop.ndc
@@ -0,0 +1,8 @@
// Vec dispatch hot loop: a tight loop over many
// `Tuple<Int, Int> + Tuple<Int, Int>` calls.
let n = 200_000;
let acc = (0, 0);
for i in 0..n {
acc = acc + (i, i);
}
print(acc);
104 changes: 104 additions & 0 deletions docs/design/vectorization.md
@@ -0,0 +1,104 @@
# Vectorization

Operator syntax broadcasts element-wise over tuples. `a + b` where both
arguments are `Tuple<Int, Int>` resolves to two `+(Int, Int)` calls and a
tuple build. The mechanism is gated to operator syntax so regular function
calls never accidentally broadcast.

## Background

PR [#140] widened `Binding::Dynamic` return types to `Any` to fix issue
[#139]: the analyser had been LUB-ing declared overload returns, but the
value-level dispatcher could fall through to vec dispatch and produce a
value no declared overload returned. The widening pessimised every dynamic
caller, including ones with no vec path at all. The current design
restores that precision by tracking vec-ness on the binding rather than
on a separate fallback path, and broadens vec to cover n-ary operators
and non-numeric overloads.

[#139]: https://github.com/timfennis/andy-cpp/issues/139
[#140]: https://github.com/timfennis/andy-cpp/pull/140

## Three pieces

### 1. `Expression::OperatorCall` distinguishes operator desugars

The parser emits `Expression::OperatorCall { function, arguments }` for
`a + b`, `-x`, `op=`, and `not x`: same shape as `Call` but a distinct
variant. Downstream layers pattern-match exhaustively: the analyser opts
into vec dispatch on `OperatorCall` only, while `Call` keeps regular
semantics. No flag, no curated list of operator names anywhere outside
the parser.
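
A minimal sketch of that gating, assuming a heavily simplified `Expression` (the real variant carries expression locations, not strings). `CallKind` mirrors the enum the analyser imports; `call_kind` and `may_broadcast` are hypothetical helper names used only for illustration.

```rust
/// Simplified stand-in for the parser's expression type (assumed shape).
#[derive(Debug)]
enum Expression {
    Call { function: String, arguments: Vec<Expression> },
    OperatorCall { function: String, arguments: Vec<Expression> },
    Int(i64),
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CallKind {
    Regular,
    Operator,
}

/// Exhaustive match: adding a new `Expression` variant forces this
/// function to be revisited, so no name-based operator list is needed.
fn call_kind(expr: &Expression) -> Option<CallKind> {
    match expr {
        Expression::Call { .. } => Some(CallKind::Regular),
        Expression::OperatorCall { .. } => Some(CallKind::Operator),
        Expression::Int(_) => None,
    }
}

/// Only operator-form calls are eligible for vec dispatch.
fn may_broadcast(expr: &Expression) -> bool {
    call_kind(expr) == Some(CallKind::Operator)
}
```

Because the decision hangs on the variant rather than on the callee name, a user function that happens to be called `+` through regular call syntax would still take the `Regular` path in this sketch.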

### 2. `Candidate` distinguishes scalar from vec overloads

```rust
pub enum Candidate {
    Scalar(ResolvedVar),
    /// Element-wise tuple broadcast over the scalar that `var()` returns.
    Vec(ResolvedVar),
}
```

`Binding::{Resolved,Dynamic}` carry `Candidate`/`Vec<Candidate>`. The
analyser pins `Resolved(Candidate::Vec(scalar))` when per-position
resolution unanimously picks one scalar; it carries a mixed list as
`Dynamic` when types aren't precise enough.

### 3. Per-position vec resolution

For an operator-form call `op(a₁, …, aₙ)` where at least one `aᵢ` is
statically a non-empty tuple of length `k`, the analyser:

1. Builds a per-position signature for each `i ∈ 0..k`: tuple args
   contribute `arg[i]`, scalar args broadcast unchanged.
2. Looks up scalar overloads for each position signature.
3. **All positions pick the same scalar**: emit
   `Binding::Resolved(Candidate::Vec(scalar))`, result type
   `Tuple<scalar_return; k>`.
4. **Mixed positions**: emit `Binding::Dynamic(merged_candidates)`,
   result type = per-position LUB wrapped as `Tuple<…>`.
5. **Any position has zero candidates**: emit `Binding::None`. The call
   can't succeed at runtime either, so we error at compile time with
   `function_not_found`.
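
The five steps can be sketched roughly as follows. This is not the analyser's code: `Ty`, `Overload`, `Resolution`, and `resolve_vec` are hypothetical names, overload lookup is reduced to exact signature equality (no subtyping), and each position yields at most one overload. The sketch models only the vec path, so "no tuple argument" and "empty tuple" are reported as `NotFound` here.

```rust
/// Toy static types; the real `StaticType` is richer (assumption).
#[derive(Debug, Clone, PartialEq)]
enum Ty {
    Int,
    Float,
    Str,
    Tuple(Vec<Ty>),
}

/// A scalar overload, identified by a display string (hypothetical).
#[derive(Debug, Clone, PartialEq)]
struct Overload {
    id: &'static str,
    params: Vec<Ty>,
}

#[derive(Debug, PartialEq)]
enum Resolution {
    /// Every position picked the same scalar.
    Vec(&'static str),
    /// Positions disagreed; candidates are merged for runtime dispatch.
    Dynamic(Vec<&'static str>),
    /// No broadcast axis, mismatched lengths, or a position without a match.
    NotFound,
}

fn resolve_vec(overloads: &[Overload], args: &[Ty]) -> Resolution {
    // The broadcast axis `k` comes from the tuple arguments, which must agree.
    let lens: Vec<usize> = args
        .iter()
        .filter_map(|a| match a {
            Ty::Tuple(elems) => Some(elems.len()),
            _ => None,
        })
        .collect();
    let Some(&k) = lens.first() else {
        return Resolution::NotFound;
    };
    if k == 0 || lens.iter().any(|&len| len != k) {
        return Resolution::NotFound;
    }

    let mut picked: Vec<&'static str> = Vec::new();
    for i in 0..k {
        // Step 1: tuple args contribute element `i`, scalars broadcast unchanged.
        let sig: Vec<Ty> = args
            .iter()
            .map(|a| match a {
                Ty::Tuple(elems) => elems[i].clone(),
                scalar => scalar.clone(),
            })
            .collect();
        // Step 2: look up a scalar overload; step 5: bail out if there is none.
        let Some(found) = overloads.iter().find(|o| o.params == sig) else {
            return Resolution::NotFound;
        };
        if !picked.contains(&found.id) {
            picked.push(found.id);
        }
    }

    // Step 3 (unanimous) versus step 4 (mixed).
    if picked.len() == 1 {
        Resolution::Vec(picked[0])
    } else {
        Resolution::Dynamic(picked)
    }
}
```

With overloads for `+(Int, Int)` and `+(Float, Float)` only, a pair of `(Int, Int)` tuples pins a single scalar, `(Int, Float) + (Int, Float)` becomes `Dynamic`, and `(Int, Str) + (Int, Str)` is rejected because position 1 has no candidate.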

## Runtime dispatch

Two opcodes carry vec work:

* `CallVec(args)`: the compiler emits this for `Resolved(Vec)`. The
  scalar is loaded directly (no `OverloadSet` wrapper); the VM reads the
  broadcast axis from the tuple args at runtime and calls the known
  scalar `axis_len` times. This is the fast path: it avoids the cost of
  re-probing overloads for every element.

* `Call(args)` with an `OverloadSet` callee: used for `Dynamic`. The
  dispatcher walks candidates in priority order: scalars first
  (first-match-wins), then vec candidates produce a `Callable::Vec`
  carrying the list of scalars that the broadcast loop narrows per
  element pair. The pinned-single-scalar case (one vec candidate) skips
  the per-element probe via the same fast path `CallVec` uses.
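
As an illustration of the `CallVec` fast path, the loop below calls one already-known scalar `axis_len` times without consulting an overload set. The `Value` type, the `Scalar` function-pointer alias, and the `call_vec`/`add_int` names are assumptions made for this sketch; the VM's real value and callable representations are not shown in this document.

```rust
/// Hypothetical runtime value model: integers and tuples only.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Int(i64),
    Tuple(Vec<Value>),
}

/// A pinned scalar implementation, loaded directly by the caller.
type Scalar = fn(&[Value]) -> Result<Value, String>;

/// Stand-in for the `+(Int, Int)` overload.
fn add_int(args: &[Value]) -> Result<Value, String> {
    match args {
        [Value::Int(a), Value::Int(b)] => Ok(Value::Int(a + b)),
        _ => Err("expected two Ints".to_string()),
    }
}

/// Broadcast `scalar` over the tuple arguments, one call per position.
fn call_vec(scalar: Scalar, args: &[Value]) -> Result<Value, String> {
    // Read the broadcast axis from the first tuple argument at runtime.
    let axis_len = args
        .iter()
        .find_map(|a| match a {
            Value::Tuple(elems) => Some(elems.len()),
            _ => None,
        })
        .ok_or_else(|| "no tuple argument to broadcast over".to_string())?;

    let mut out = Vec::with_capacity(axis_len);
    let mut element_args = Vec::with_capacity(args.len());
    for i in 0..axis_len {
        element_args.clear();
        for a in args {
            element_args.push(match a {
                // Tuple args contribute their i-th element.
                Value::Tuple(elems) => elems
                    .get(i)
                    .cloned()
                    .ok_or_else(|| format!("tuple too short at index {i}"))?,
                // Scalar args broadcast unchanged.
                scalar_arg => scalar_arg.clone(),
            });
        }
        out.push(scalar(&element_args)?);
    }
    Ok(Value::Tuple(out))
}
```

Here `call_vec(add_int, ..)` on `(1, 2)` and `(3, 4)` builds `(4, 6)`, and on `(1, 2)` and `5` builds `(6, 7)`, matching the manual's examples. The `Dynamic` path would differ only in choosing the scalar per element instead of receiving it pinned.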

Element-call failures surface with `while vectorising '<name>' at index N`
prefixed to the inner message, so the outer call and failing position
appear in the error.
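
One way to attach that context, sketched under assumptions: errors are plain `String`s here, and the `": "` separator between prefix and inner message is a guess, since only the prefix text is specified above. `with_vec_context`, `broadcast_checked`, and `halve` are hypothetical names.

```rust
/// Prefix an element-call failure with the outer call name and position.
/// The ": " separator is an assumption of this sketch.
fn with_vec_context(name: &str, index: usize, inner: &str) -> String {
    format!("while vectorising '{name}' at index {index}: {inner}")
}

/// Apply `f` to every element, stopping at the first failure and
/// reporting which position failed.
fn broadcast_checked<T, U>(
    name: &str,
    items: &[T],
    f: impl Fn(&T) -> Result<U, String>,
) -> Result<Vec<U>, String> {
    items
        .iter()
        .enumerate()
        .map(|(i, item)| f(item).map_err(|e| with_vec_context(name, i, &e)))
        .collect()
}

/// Example scalar that fails on odd inputs.
fn halve(x: &i64) -> Result<i64, String> {
    if x % 2 == 0 {
        Ok(x / 2)
    } else {
        Err("odd input".to_string())
    }
}
```

Collecting into `Result<Vec<_>, _>` short-circuits on the first `Err`, so the reported index is the first failing position.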

## What changed vs the old design

| Old | New |
|---|---|
| Binary numeric vec only | n-ary, any scalar overload |
| `Binding::Dynamic` widened all returns to `Any` | LUB-d for pure scalar; precise `Tuple<…>` for vec |
| Runtime `try_vectorized_call` post-check | First-class candidate in `OverloadSet` + `CallVec` opcode |
| Mixed-element tuples crashed mid-iteration | Compile-time `function_not_found` |
| Unary `-(1, 2, 3)` errored | Broadcasts to `(-1, -2, -3)` |

## Notes

* **Per-position LUB collapse**: `(Int, Float) + (Float, Int)` infers
`Tuple<Number, Number>` rather than the per-element-precise
`Tuple<Int, Number>`. The simpler uniform return type keeps the
candidate list small; the cost is rare in practice.
* **Empty tuples** decline vec resolution: they have no broadcast axis.
* **Indexing** (`a[i]`) parses as `Call`, not `OperatorCall`: there's no
natural broadcast story for `(list_a, list_b)[i]`.
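
The per-position LUB collapse from the first note can be illustrated with a toy lattice. The lattice below (`Int` and `Float` under `Number`, everything under `Any`) and the names `lub` and `collapsed_tuple` are assumptions for this sketch, not the analyser's actual type hierarchy.

```rust
/// Toy type lattice: Int, Float <: Number <: Any (assumed for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    Int,
    Float,
    Number,
    Any,
}

/// Least upper bound of two types in the toy lattice.
fn lub(a: Ty, b: Ty) -> Ty {
    use Ty::*;
    match (a, b) {
        _ if a == b => a,
        (Any, _) | (_, Any) => Any,
        // Any two distinct types among Int, Float, Number join at Number.
        _ => Number,
    }
}

/// Collapse per-position return types into one uniform element type,
/// repeated once per position. Returns `None` for an empty axis.
fn collapsed_tuple(per_position: &[Ty]) -> Option<Vec<Ty>> {
    let first = *per_position.first()?;
    let joined = per_position.iter().fold(first, |acc, t| lub(acc, *t));
    Some(vec![joined; per_position.len()])
}
```

Given per-position returns `[Int, Number]`, the collapse yields `[Number, Number]`: precision at position 0 is traded for a single uniform element type.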
24 changes: 24 additions & 0 deletions manual/src/reference/types/tuple.md
@@ -43,3 +43,27 @@ assert_eq(b, (1,2,3,4,5));
## Operators

{{#include ../../snippets/list-operators.md}}

## Vectorization

Operators broadcast element-wise over tuples. Both arguments must be tuples
of the same length, or one side may be a scalar that broadcasts:

```ndc
assert_eq((1, 2) + (3, 4), (4, 6));
assert_eq(-(1, 2, 3), (-1, -2, -3));
assert_eq((1, 2) + 5, (6, 7));
assert_eq(("a", "b") ++ ("c", "d"), ("ac", "bd"));
```

Vectorization only kicks in for operator syntax (`a + b`, `-x`, `a ++ b`).
Regular function calls never broadcast, so `f((1, 2, 3))` passes the whole
tuple to `f` and does not call `f` once per element.

Mixed-element tuples or length mismatches error at compile time rather than
silently producing wrong results:

```ndc
(1, 2, 3) + (4, 5) // ERROR: no overload accepts those argument types
(1, "a") + (2, "b") // ERROR: no `+(String, String)` overload
```
176 changes: 84 additions & 92 deletions ndc_analyser/src/analyser.rs
@@ -1,13 +1,13 @@
use std::collections::HashMap;
use std::fmt::Debug;

use crate::scope::{ScopeTree, TypeBinding};
use crate::scope::{CallKind, ResolvedCall, ScopeTree, TypeBinding};
use itertools::{Itertools, izip};
use ndc_core::{StaticType, TypeSignature};
use ndc_lexer::Span;
use ndc_parser::{
Binding, Expression, ExpressionLocation, ForBody, ForIteration, FunctionParameter, Lvalue,
NodeId,
Binding, Candidate, Expression, ExpressionLocation, ForBody, ForIteration, FunctionParameter,
Lvalue, NodeId,
};

/// Side table holding semantic information keyed by AST node identity.
@@ -130,7 +130,7 @@ impl Analyser {
return Ok(StaticType::Any);
};

*resolved = Binding::Resolved(binding);
*resolved = Binding::Resolved(Candidate::Scalar(binding));

Ok(self.scope_tree.get_type(binding).clone())
}
@@ -202,36 +202,41 @@ impl Analyser {
let right_type = self.analyse_or_any(r_value);
let arg_types = vec![left_type, right_type];

*resolved_assign_operation = self
.scope_tree
.resolve_function_binding(&format!("{operation}="), &arg_types);
*resolved_operation = self
// Resolve both `op=` and `op` so we can widen the lvalue
// by the result type of whichever one actually fires.
let ResolvedCall {
binding: assign_binding,
..
} = self.scope_tree.resolve_call(
&format!("{operation}="),
&arg_types,
CallKind::Operator,
);
let ResolvedCall {
binding: op_binding,
return_type: op_return,
} = self
.scope_tree
.resolve_function_binding(operation, &arg_types);
.resolve_call(operation, &arg_types, CallKind::Operator);

if let Binding::None = resolved_operation {
*resolved_assign_operation = assign_binding;
*resolved_operation = op_binding;

// Either form satisfies the call: `op=` mutates in place;
// `op` falls back through `a = a op b`. Only error when both
// are missing, e.g. `Map -= Map` is fine via `-=` even when
// `-` itself has no Map overload.
if matches!(resolved_assign_operation, Binding::None)
&& matches!(resolved_operation, Binding::None)
{
self.emit(AnalysisError::function_not_found(
operation, &arg_types, *span,
));
}

// Determine the result type of the operation
let result_type = match resolved_operation {
Binding::Resolved(res) => {
if let StaticType::Function { return_type, .. } =
self.scope_tree.get_type(*res)
{
Some(return_type.as_ref().clone())
} else {
None
}
}
_ => None,
};

if let Some(result_type) = result_type {
if !matches!(resolved_operation, Binding::None) {
let result_type = op_return;
match l_value {
// Direct variable: widen or reject if annotated
Lvalue::Identifier {
resolved: Some(target),
..
@@ -249,10 +254,9 @@ impl Analyser {
));
}
}
// Index into a container: widen the container's type
Lvalue::Index { value, .. } => {
if let Expression::Identifier {
resolved: Binding::Resolved(target),
resolved: Binding::Resolved(Candidate::Scalar(target)),
..
} = &value.expression
{
@@ -405,25 +409,11 @@ impl Analyser {
Expression::Call {
function,
arguments,
} => {
let mut type_sig = Vec::with_capacity(arguments.len());
for a in arguments {
type_sig.push(self.analyse_or_any(a));
}

let callee_type =
self.resolve_function_with_argument_types(function, &type_sig, *span);

let StaticType::Function { return_type, .. } = callee_type else {
if callee_type == StaticType::Any {
return Ok(StaticType::Any);
}
self.emit(AnalysisError::not_callable(&callee_type, *span));
return Ok(StaticType::Any);
};

Ok(*return_type)
}
} => self.analyse_call(function, arguments, CallKind::Regular, *span),
Expression::OperatorCall {
function,
arguments,
} => self.analyse_call(function, arguments, CallKind::Operator, *span),
Expression::Tuple { values } => {
let mut types = Vec::with_capacity(values.len());
for v in values {
@@ -479,56 +469,48 @@ impl Analyser {
}
}

fn resolve_function_with_argument_types(
/// Resolves a call (regular or operator-form) and returns its result type.
/// Only operator-form calls are eligible for vec dispatch.
fn analyse_call(
&mut self,
ident: &mut ExpressionLocation,
argument_types: &[StaticType],
function: &mut ExpressionLocation,
arguments: &mut [ExpressionLocation],
kind: CallKind,
span: Span,
) -> StaticType {
let ExpressionLocation {
expression: Expression::Identifier { name, resolved },
..
} = ident
else {
// It's possible that we're not trying to invoke an identifier `foo()` but instead we're
// invoking a value like `get_function()()` so in this case we just continue like normal?
return self.analyse_or_any(ident);
};
) -> Result<StaticType, AnalysisError> {
let mut type_sig = Vec::with_capacity(arguments.len());
for arg in arguments {
type_sig.push(self.analyse_or_any(arg));
}

let binding = self
.scope_tree
.resolve_function_binding(name, argument_types);

let out_type = match &binding {
Binding::None => {
self.emit(AnalysisError::function_not_found(
name,
argument_types,
span,
));
return StaticType::Any;
}
Binding::Resolved(res) => self.scope_tree.get_type(*res).clone(),

Binding::Dynamic(_) => {
// Dispatch is decided at runtime, so we have no sound static bound
// on the result. The runtime may pick a declared overload or fall
// through to elementwise (vectorized) dispatch, which can produce
// a value no declared overload returns; treating the LUB of
// declared returns as the result type is unsound and led to issue
// #139, where `let diff = a - b` over tuples was inferred as
// `Number` and a follow-up `diff * diff` then matched the numeric
// overload directly and bypassed dynamic dispatch entirely.
StaticType::Function {
parameters: None,
return_type: Box::new(StaticType::Any),
// Higher-order call shapes like `get_function()()` have a non-identifier
// function position; in that case we just analyse the callee as a value
// and trust the runtime to dispatch.
let Expression::Identifier { name, resolved } = &mut function.expression else {
let callee_type = self.analyse_or_any(function);
return Ok(match callee_type {
StaticType::Function { return_type, .. } => *return_type,
StaticType::Any => StaticType::Any,
other => {
self.emit(AnalysisError::not_callable(&other, span));
StaticType::Any
}
}
});
};

*resolved = binding;
let ResolvedCall {
binding,
return_type,
} = self.scope_tree.resolve_call(name, &type_sig, kind);

out_type
if matches!(binding, Binding::None) {
self.emit(AnalysisError::function_not_found(name, &type_sig, span));
*resolved = binding;
return Ok(StaticType::Any);
}

*resolved = binding;
Ok(return_type)
}

fn resolve_for_iterations(
@@ -660,8 +642,18 @@ impl Analyser {
let get_args = [type_of_index_target.clone(), index_type.clone()];
let set_args = [type_of_index_target.clone(), index_type, StaticType::Any];

*resolved_get = Some(self.scope_tree.resolve_function_binding("[]", &get_args));
*resolved_set = Some(self.scope_tree.resolve_function_binding("[]=", &set_args));
// Indexing isn't operator-form for vec purposes: there's no
// natural broadcast story for `(list_a, list_b)[i]`.
*resolved_get = Some(
self.scope_tree
.resolve_call("[]", &get_args, CallKind::Regular)
.binding,
);
*resolved_set = Some(
self.scope_tree
.resolve_call("[]=", &set_args, CallKind::Regular)
.binding,
);

if let Some(t) = type_of_index_target.index_element_type() {
Ok(t)