v2: implement native x64 backend, part 1 by GGRei · Pull Request #27288 · vlang/v

GGRei · 2026-05-29T10:08:38Z

Summary

This PR introduces the first upstream-ready slice of the V2 native x64 backend.

It adds the foundation needed to compile and run small native x64 programs through the V2 pipeline: ABI lowering, object/linker support, minimal runtime integration, SSA/backend correctness hardening, and a cross-platform runtime smoke suite.

This is intentionally scoped as part 1. Follow-up PRs will extend the supported language surface and progressively add more official V examples to the native x64 smoke coverage.

What is included

Adds initial x64 ABI matrix coverage for SysV and Windows x64 calling conventions.
Extends the native x64 backend for Linux, macOS, and Windows targets.
Adds ELF, Mach-O, and PE/COFF object/linker coverage needed by the current backend slice.
Adds controlled backend diagnostics for unsupported x64 features.
Hardens V2 SSA optimization and verification paths used by the backend.
Adds minimal runtime root handling for native x64 builds.
Improves Windows x64 minimal runtime support, including stdout/stderr and required WinAPI imports.
Adds runtime smoke coverage for scalar control flow, strings, arrays, fixed arrays, structs, module globals, module init, short-circuit logic, integer ops, and exit status behavior.

Unsupported backend features

The native x64 backend is still incomplete, so unsupported cases are now reported explicitly instead of failing silently or crashing later in code generation/linking.

When the backend reaches a feature that is not implemented yet, it returns a controlled diagnostic in this form:

x64: unsupported backend feature: <feature>

The diagnostics are covered by tests, including unsupported ABI/lowering cases and unresolved runtime/linker symbols. This makes missing backend support visible to users and keeps future work incremental: new language/runtime support can be added by replacing a specific diagnostic with an implementation and a matching runtime test.
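
As a rough illustration of the diagnostic contract described above, the following sketch formats and recognizes messages of that shape. Only the message prefix comes from the PR description; the helper names `unsupported` and `parse_unsupported` are hypothetical and are not taken from the backend.

```python
# Message shape quoted in the PR description; everything else here is illustrative.
PREFIX = "x64: unsupported backend feature: "


def unsupported(feature: str) -> str:
    """Build a controlled diagnostic for a feature that is not lowered yet (hypothetical helper)."""
    return PREFIX + feature


def parse_unsupported(message: str):
    """Return the feature name if `message` is a controlled diagnostic, else None."""
    if message.startswith(PREFIX):
        return message[len(PREFIX):]
    return None
```

A test written against this shape can assert on the feature name rather than on a crash or a linker error, which is what makes replacing one diagnostic at a time practical.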

Tests added

This PR adds focused test coverage under:

vlib/v2/abi
vlib/v2/gen/x64
vlib/v2/markused
vlib/v2/ssa/optimize
vlib/v2/ssa

The runtime smoke suite now exercises native x64 binaries on:

Linux x64
macOS x64
Windows x64

The final CI smoke also builds and runs real V examples directly from the repository:

examples/hello_world.v
examples/fizz_buzz.v

Future parts will add more official examples as the backend supports more of the language and runtime.

CI integration

The x64 checks are integrated into the existing official CI workflows:

Linux: linux_ci.yml
macOS: macos_ci.yml
Windows MSVC: windows_ci_msvc.yml

Each platform runs:

v test vlib/v2/abi vlib/v2/gen/x64
v test vlib/v2/ssa/optimize
V2_VERIFY_STRICT=1 v test vlib/v2/ssa/optimize
native x64 build and run of examples/hello_world.v
native x64 build and run of examples/fizz_buzz.v

Local validation

Validated locally with:

git diff --check
v fmt -verify vlib/v2/ssa/builder.v vlib/v2/ssa/types.v
./v test vlib/v2/abi vlib/v2/gen/x64
./v test vlib/v2/ssa/optimize
V2_VERIFY_STRICT=1 ./v test vlib/v2/ssa/optimize
native x64 build and run of examples/hello_world.v
native x64 build and run of examples/fizz_buzz.v

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 77c3b6d0fc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-05-29T10:19:50Z

+	if g.store_sysv_integer_pair_call_result(val_id, ret_class) {
+		return
+	}
+	g.normalize_integer_call_result(val_id)
 	g.store_reg_to_val(0, val_id)


Handle SysV SSE aggregate returns

When a SysV x64 function returns a <=16-byte aggregate classified into SSE eightbytes, such as struct Pair { a f64; b f64 }, store_call_result only special-cases INTEGER/INTEGER pairs and otherwise falls through to the scalar RAX store. The matching direct-return path has the same integer-only special case, so these structs are returned with only the first 8 bytes via RAX instead of XMM0/XMM1, leaving the second field lost/uninitialized for ordinary V-to-V calls and also violating the platform ABI for extern calls.
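
To make the classification rule in this comment concrete, here is a simplified Python model of SysV eightbyte classification for small structs of scalar fields, together with the return registers each class maps to. It is a sketch under stated assumptions, not the PR's implementation: it only handles naturally aligned scalar fields, ignores vector and x87 classes, and the names `classify` and `return_registers` are hypothetical.

```python
INTEGER, SSE, MEMORY = "INTEGER", "SSE", "MEMORY"

# (size in bytes, class) for a few scalar field types; V type names used for illustration.
SCALARS = {"i32": (4, INTEGER), "i64": (8, INTEGER), "f32": (4, SSE), "f64": (8, SSE)}


def classify(fields):
    """Classify a struct of naturally aligned scalar fields into eightbyte classes."""
    offset = 0
    classes = []
    for name in fields:
        size, cls = SCALARS[name]
        offset = (offset + size - 1) // size * size  # align the field to its own size
        idx = offset // 8
        while len(classes) <= idx:
            classes.append(None)
        # INTEGER wins when one eightbyte mixes INTEGER and SSE fields.
        if classes[idx] is None or cls == INTEGER:
            classes[idx] = cls
        offset += size
    if offset > 16:
        return [MEMORY]  # too large for registers in this simplified model
    return classes


def return_registers(classes):
    """Map eightbyte classes to return registers: RAX/RDX for INTEGER, XMM0/XMM1 for SSE."""
    if classes == [MEMORY]:
        return ["hidden pointer (sret)"]
    gpr = iter(["rax", "rdx"])
    sse = iter(["xmm0", "xmm1"])
    return [next(gpr) if c == INTEGER else next(sse) for c in classes]
```

With this model, a struct of two `f64` fields classifies as two SSE eightbytes and maps to XMM0/XMM1, which is the case the comment says falls through to the scalar RAX store.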


chatgpt-codex-connector · 2026-05-29T10:19:50Z

+			.sse {
+				last_sse = state.sse_regs
+				locs << mir.AbiLocation{
+					kind:   .sse_reg
+					index:  state.sse_regs


Avoid direct SysV SSE aggregate argument lowering

When a SysV x64 parameter is an aggregate with SSE eightbytes, such as struct { f64 a; f64 b } or { i64, f64 }, this layout records SSE-register locations but the x64 parameter/call code still ignores abi_param_layouts and splits all <=16-byte aggregates through integer registers from abi_param_class only. That makes calls crossing the real SysV ABI boundary (extern C functions or separately compiled objects) receive the wrong registers instead of XMM/GPR according to the classification.
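
The register assignment this comment expects can be sketched as follows. This is a simplified model, assuming each argument has already been classified into a list of eightbyte classes; `assign_args` is a hypothetical name, and stack offsets are not modeled. It applies the SysV rule that an aggregate goes entirely to the stack if any of its eightbytes cannot get a register.

```python
GPRS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
SSES = ["xmm%d" % i for i in range(8)]


def assign_args(args):
    """Assign each argument (a list of eightbyte classes) to registers or the stack."""
    next_gpr = 0
    next_sse = 0
    out = []
    for classes in args:
        need_gpr = classes.count("INTEGER")
        need_sse = classes.count("SSE")
        fits = next_gpr + need_gpr <= len(GPRS) and next_sse + need_sse <= len(SSES)
        if "MEMORY" in classes or not fits:
            out.append(["stack"])  # all-or-nothing: the whole aggregate goes to memory
            continue
        locs = []
        for c in classes:
            if c == "INTEGER":
                locs.append(GPRS[next_gpr])
                next_gpr += 1
            else:
                locs.append(SSES[next_sse])
                next_sse += 1
        out.append(locs)
    return out
```

For example, an `{ i64, f64 }` aggregate that follows one integer argument lands in RSI and XMM0 in this model, rather than in two integer registers.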


chatgpt-codex-connector · 2026-05-29T10:19:50Z

+					for i in 0 .. arr_len {
+						idx := b.mod.get_or_add_const(i32_t, i.str())
+						gep := b.mod.add_instr(.get_element_ptr, b.cur_block, elem_ptr_type, [
+							alloca,
+							idx,
+						])
+						b.mod.add_instr(.store, b.cur_block, 0, [zero, gep])


Avoid per-element zero stores for large fixed arrays

For empty fixed-array literals with a large constant length, e.g. [1000000]u8{}, this loop now emits a GEP and store for every element before codegen. That can create millions of SSA instructions and make compilation effectively unusable for large zero-initialized fixed arrays; this path needs a bulk zero-initialization representation instead of expanding by arr_len.
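
The cost difference can be seen in a small Python model of the lowering. This is a sketch only: the tuple-based pseudo-instructions, the `zero_fill` operation, and the function name are hypothetical, and the cutoff of 16 mirrors the small-array limit mentioned in a later review comment in this thread.

```python
ELEMENT_STORE_THRESHOLD = 16  # assumed cutoff, for illustration only


def lower_empty_fixed_array(arr_len, threshold=ELEMENT_STORE_THRESHOLD):
    """Return pseudo-SSA instructions that zero an `arr_len`-element alloca."""
    instrs = [("alloca", arr_len)]
    if arr_len <= threshold:
        # Small arrays: one GEP and one store per element.
        for i in range(arr_len):
            instrs.append(("gep", i))
            instrs.append(("store_zero", i))
    else:
        # Large arrays: a single bulk operation (hypothetical) that backends lower themselves.
        instrs.append(("zero_fill", arr_len))
    return instrs
```

In this model `[1000000]u8{}` costs two instructions instead of two million, at the price of requiring every backend to lower the bulk operation.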


JalonSolov · 2026-05-29T11:43:38Z

Please make the code use syscalls where possible (linux: mostly yes, Mac: no, Windows: rofl, ???: ???) rather than libc.

Being able to have the tiniest possible executables will be a selling point. For example, the original -native output from examples/hello_world.v in V produced a 184-byte executable on Linux.

Nowadays, it produces an executable of around 16 KB.

GGRei · 2026-05-29T11:52:02Z

Agreed. This first slice uses a light runtime path mainly to validate the full V2 x64 pipeline across Linux, macOS and Windows without folding too many runtime/linker changes into the same PR.

For Linux, I agree the native path should move toward direct syscalls where practical, especially for the minimal runtime pieces like write/exit and later allocation support.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2467121e5a


chatgpt-codex-connector · 2026-05-29T21:41:08Z

+							op:   .amp
+							expr: ast.IndexExpr{
+								lhs:  lhs
+								expr: start_expr


Avoid reevaluating fixed-array slice bounds

For fixed-array slices whose start bound has side effects, this transformation now reuses start_expr in both the computed length (end - start) and the address expression &lhs[start]; because these AST nodes are emitted as separate call arguments, a slice like arr[next()..end] will call next() more than once and use inconsistent bounds/pointers. This path should materialize the bounds once before constructing new_array_from_c_array, as the dynamic-array slice path passes the bound expression only once.
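
The double evaluation is easy to reproduce outside the compiler. The Python sketch below is an analogy, not the V transformer: `slice_reevaluating` mimics the shape where the start expression appears in both the length and the base address, and `slice_materialized` evaluates it once into a temporary. All names are hypothetical.

```python
def make_counter():
    """Return a side-effecting `next()`-like function and its call log."""
    calls = []

    def nxt():
        calls.append(len(calls))
        return len(calls)  # yields 1, 2, 3, ... on successive calls

    return nxt, calls


def slice_reevaluating(arr, start_fn, end):
    """Buggy shape: the start expression is evaluated for the length and again for the base."""
    length = end - start_fn()
    base = start_fn()
    return arr[base:base + length]


def slice_materialized(arr, start_fn, end):
    """Fixed shape: evaluate the bound once, then reuse the temporary."""
    start = start_fn()
    return arr[start:start + (end - start)]
```

With `arr = list(range(10))` and `end = 5`, the first version calls the counter twice and returns `[2, 3, 4, 5]`, while the second calls it once and returns `[1, 2, 3, 4]`.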


chatgpt-codex-connector · 2026-05-29T21:41:08Z

+	if value_class.mode == .indirect || value_class.classes == [mir.AbiEightbyteClass.memory] {
+		return sysv_stack_value_layout(value_class, mut state)


Account for indirect SysV args when assigning layouts

When a SysV memory-class aggregate appears before a direct aggregate argument, this branch records the memory-class value as stack slots without consuming the integer register that codegen still uses for the indirect pointer. For a signature like f(big_struct, pair_struct), lowering gives pair_struct locations starting at RDI/RSI, but gen_func first consumes RDI for big_struct's pointer and then stores pair_struct from the wrong registers. Indirect aggregate arguments need to advance the SysV location state as one INTEGER argument (or codegen and lowering need to agree on passing them by value on the stack).
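
The disagreement between lowering and codegen can be modeled with a single flag. In this sketch (hypothetical names, integer registers only, register exhaustion not handled), each parameter is either the string `"indirect"` or a count of INTEGER eightbytes, and `indirect_uses_gpr` selects whether an indirect aggregate consumes one integer register for its pointer.

```python
GPRS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def lower_locations(params, indirect_uses_gpr):
    """Assign GPRs to params; each param is "indirect" or a count of INTEGER eightbytes."""
    next_gpr = 0
    out = []
    for p in params:
        if p == "indirect":
            if indirect_uses_gpr:
                out.append([GPRS[next_gpr]])  # pointer to the aggregate
                next_gpr += 1
            else:
                out.append(["stack"])  # by-value stack slots, no register consumed
            continue
        out.append(GPRS[next_gpr:next_gpr + p])
        next_gpr += p
    return out
```

For `f(big_struct, pair_struct)`, the model gives `pair_struct` RDI/RSI when the flag is off and RSI/RDX when it is on. The bug described above corresponds to lowering using one setting while codegen uses the other; either setting works as long as both sides agree.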


chatgpt-codex-connector · 2026-05-29T21:41:08Z

+			if (!w.opts.minimal_runtime_roots || should_mark_ident_as_fn(expr.name))
+				&& !w.is_cast_type_name(expr.name) {
 				w.mark_fn_name(expr.name, mod_name)


Preserve function values in minimal markused

In minimal_runtime_roots mode this filter stops treating ordinary identifiers as possible function references, but mark_fn_value_expr is only called for assignment RHSs, return expressions, and call arguments. Function values stored inside composite literals, such as S{ cb: cleanup } or [cleanup], now fall through walk_expr as plain Idents and are pruned, so Windows x64 minimal-runtime builds can drop the callback target and later fail to link or call a missing function. The composite-literal walkers need to mark field/element values as function values before relying on this filtered Ident path.


GGRei · 2026-05-29T22:53:37Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 38015718ce


chatgpt-codex-connector · 2026-05-29T23:07:48Z

+			if (!w.opts.minimal_runtime_roots || should_mark_ident_as_fn(expr.name))
+				&& !w.is_cast_type_name(expr.name) {
 				w.mark_fn_name(expr.name, mod_name)


Keep const function aliases rooted in minimal mode

When minimal_runtime_roots is enabled, ordinary identifier expressions no longer mark their referenced function unless the name matches should_mark_ident_as_fn. Const declarations are still walked only with walk_expr(field.value), so supported patterns like fn abc() { ... }; const cb = abc; fn main() { cb() } stop marking abc on the Windows x64 native path; used_fn_keys then prunes the function body and the generated const function pointer can become unresolved or unusable. Either call mark_fn_value_expr for const/global initializer values or keep marking identifiers that resolve to function values here.
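
A minimal reachability model shows why the alias has to be resolved before pruning. This Python sketch is not the markused walker: `reachable_fns` and its dictionary-based inputs are hypothetical, and it only follows calls by name.

```python
def reachable_fns(calls, const_aliases, fns, root="main"):
    """Return the set of functions reachable from `root`.

    calls:         function name -> list of called names (may be const aliases)
    const_aliases: const name -> target function name
    fns:           set of all defined function names
    """
    used = set()
    work = [root]
    while work:
        name = work.pop()
        hops = 0
        # Follow alias chains, bounded so a cyclic alias table cannot loop forever.
        while name in const_aliases and hops < len(const_aliases):
            name = const_aliases[name]
            hops += 1
        if name not in fns or name in used:
            continue
        used.add(name)
        work.extend(calls.get(name, []))
    return used
```

With `const cb = abc` recorded in the alias table, a call to `cb` from `main` keeps `abc` alive; with an empty alias table only `main` survives, which matches the pruning described above.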


chatgpt-codex-connector · 2026-05-29T23:07:48Z

-			} else {
-				asm_lea_rax_rbp_disp32(mut g, off)
-			}
+			g.zero_large_fixed_array_alloca(val_id, off)


Probe large Windows stack frames before zeroing them

On Windows x64, this path can now be reached for large fixed-array locals, but the prologue still allocates the whole frame with a single sub rsp, stack_size before this zeroing touches the large slot. Microsoft documents that fixed stack allocations larger than one page must be probed (for example via __chkstk) before use; otherwise a frame over roughly 4096 bytes can jump past the guard page and fault even though smaller smoke tests pass. See https://learn.microsoft.com/en-us/cpp/build/prolog-and-epilog?view=msvc-170
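
The idea behind probing can be sketched as a list of offsets to touch, one per page, before the frame is used. This is a simplified model rather than what `__chkstk` actually does: the 4096-byte page size comes from the comment above, the exact boundary condition for when probing starts is an assumption, and `probe_offsets` is a hypothetical name.

```python
PAGE = 4096  # page size assumed by the comment above


def probe_offsets(frame_size, page=PAGE):
    """Offsets below the entry RSP to touch, in order, before using a large frame."""
    if frame_size <= page:
        return []  # in this model a frame within one page cannot skip the guard page
    return list(range(page, frame_size + 1, page))
```

Touching the pages in increasing-offset order lets each access hit the guard page in turn, so the stack grows one page at a time instead of jumping past it with a single `sub rsp`.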


chatgpt-codex-connector · 2026-05-29T23:07:48Z

+	rt.bytes << [u8(0x48), 0xc7, 0x41, 0x08, 0, 0, 0, 0] // mov qword ptr [rcx+8], 0
+	rt.bytes << [u8(0x48), 0x85, 0xd2] // test rdx, rdx
+	null_array := pe_emit_jcc32(mut rt.bytes, 0x84) // je
+	rt.bytes << [u8(0x48), 0x63, 0x42, 0x10] // movsxd rax, dword ptr [rdx+16]


Use array length, not capacity, for rune string conversion

For Windows x64 programs that convert a []rune whose capacity is larger than its length, this runtime thunk loads [rdx+16], which is the array cap field, and later uses it as the loop count. The V array layout stores len at [rdx+12], so []rune{len: 1, cap: 10}.string() will encode the unused tail instead of only initialized elements, producing corrupt output and potentially reading uninitialized memory.
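
The offset mix-up can be illustrated with Python's `struct` module. The header layout below is an assumption made for the example: a pointer at +0, a 4-byte field at +8, and the `len` and `cap` fields at +12 and +16 as stated in the comment. The helper names are hypothetical.

```python
import struct

# Assumed header layout: data pointer (+0), 4-byte field (+8), len (+12), cap (+16).
HEADER = struct.Struct("<Qiii")


def make_header(data_ptr, length, cap):
    """Pack an array header with the assumed layout."""
    return HEADER.pack(data_ptr, 0, length, cap)


def load_i32(header, offset):
    """Model `movsxd rax, dword ptr [rdx+offset]` on the packed header."""
    return struct.unpack_from("<i", header, offset)[0]
```

For a header built with `make_header(0x1000, 1, 10)`, loading at offset 16 yields the capacity 10, while offset 12 yields the length 1, which is the value the loop count should use.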


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: faaaf15e04


chatgpt-codex-connector · 2026-05-29T23:49:38Z

+	mut classes := []mir.AbiEightbyteClass{len: (size + 7) / 8, init: .no_class}
+	mut visiting := map[int]bool{}
+	if !sysv_classify_type_into(m, typ_id, 0, mut classes, mut visiting) {


Avoid classifying huge SysV aggregates element by element

For SysV x64 signatures involving large fixed arrays, e.g. a parameter or return type like [1000000]u8, this allocates an eightbyte-class array proportional to the full object size and then sysv_classify_array_into walks every element before sysv_post_merge_classes inevitably returns MEMORY for non-vector aggregates larger than 16 bytes. Because ABI lowering runs for every function and call signature, such inputs can make compilation consume huge CPU/memory; bail out to a memory class before per-element classification when the aggregate is too large to be returned/passed in registers.
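
The suggested early exit amounts to checking the size before any per-element work. The sketch below assumes no vector types (which can exceed 16 bytes and still be passed in registers) and uses hypothetical names; `classify_elements` stands in for the existing element-by-element classifier.

```python
def eightbyte_count(size):
    """Number of eightbyte slots the classifier would allocate for `size` bytes."""
    return (size + 7) // 8


def classify_size_first(size, classify_elements):
    """Return MEMORY for oversized aggregates without visiting their elements."""
    if size > 16:
        return ["MEMORY"]
    return classify_elements(size)
```

For `[1000000]u8`, `eightbyte_count` gives 125000 slots that would otherwise be allocated and walked for every signature mentioning the type, only to be merged into MEMORY at the end.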


chatgpt-codex-connector · 2026-05-29T23:49:38Z

+	rt.bytes << [u8(0x4c), 0x8b, 0x44, 0x24, 0x20] // mov r8, [rsp+32]
+	rt.bytes << [u8(0x4c), 0x8b, 0x4c, 0x24, 0x28] // mov r9, [rsp+40]
+	rt.bytes << [u8(0x4d), 0x01, 0xc8] // add r8, r9
+	rt.bytes << [u8(0x49), 0x83, 0xc0, 0x08] // add r8, 8
+	pe_emit_runtime_call_import(mut rt, 'HeapAlloc')


Check aligned allocation size overflow

For Windows x64 native builds, calls that reach the backend _aligned_malloc shim with a very large size can wrap the computed size + alignment + 8 in r8 before HeapAlloc is called. That can return a much smaller block than requested, after which the caller writes as if the original allocation succeeded; the calloc/realloc shims nearby already branch on carry for this case, so this path should do the same before calling HeapAlloc.
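
The wraparound can be modeled with explicit 64-bit arithmetic. This Python sketch mirrors the `size + alignment + 8` computation from the comment and returns `None` where a carry check would branch to the failure path; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def add_u64(a, b):
    """Return (wrapped sum, carry flag), like a 64-bit `add` instruction."""
    total = a + b
    return total & MASK64, total > MASK64


def aligned_alloc_size(size, alignment):
    """Compute size + alignment + 8, or None if any step carries out of 64 bits."""
    total = size
    for extra in (alignment, 8):
        total, carry = add_u64(total, extra)
        if carry:
            return None  # caller should fail the allocation instead of calling HeapAlloc
    return total
```

Without the carry check, a request close to 2^64 bytes would wrap to a small value and the allocator would return a block far smaller than the caller believes it has.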


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 030ca13a4e


chatgpt-codex-connector · 2026-05-30T00:25:44Z

+					if field.value is ast.Ident
+						&& w.ident_resolves_to_fn_value(field.value.name, mod_name) {
+						w.const_fn_value_aliases[const_fn_value_alias_key(mod_name, field.name)] = field.value.name


Record qualified const function aliases

In minimal-runtime builds, top-level const declarations are not walked as roots, so calls through an alias rely on this collection step. The fresh issue is that the new alias table only records ast.Ident initializers; a supported alias like import m; const cb = m.cleanup; fn main() { cb() } leaves cleanup unmarked because mark_call_lhs(cb) finds no alias target and the const initializer is never visited. Please handle ast.SelectorExpr aliases here as well, resolving the module/alias name before storing the target.


GGRei · 2026-05-30T07:11:06Z

Ready.

The next PR will be part 1b.

It will focus on reducing the native x64 runtime/linking footprint, starting with Linux syscall-based paths where practical, in line with JalonSolov's suggestion about making tiny native executables a concrete goal again.

medvednikov · 2026-05-31T02:41:13Z

Conflicts after the recent flat ast commits.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f803dbdd6f


chatgpt-codex-connector · 2026-05-31T07:39:14Z

+	if !b.is_macos_target() && !b.is_linux_target() {
+		return none


Handle Windows errno via the CRT accessor

When targeting the new Windows PE backend, any code that reads or writes C.errno (for example vlib/os/os_windows.c.v) reaches build_c_errno_storage_addr(), but this guard returns none for Windows and falls back to emitting an external global named errno. On MSVC/UCRT errno is TLS behind the _errno() accessor rather than a linkable data symbol, and the PE linker only resolves its built-in kernel32/runtime symbols, so native Windows programs that touch C.errno will either fail with an unresolved external or use the wrong storage. Please add a Windows-specific _errno address path instead of falling through to the raw global.


chatgpt-codex-connector · 2026-05-31T07:39:14Z

 					alloca := b.mod.add_instr(.alloca, b.cur_block, ptr_type, []ValueID{})
-					// For small arrays (<=16 elements), zero-initialize element by element.
-					// For larger arrays, the codegen will bulk-zero the alloca slot.
-					if arr_len <= 16 {
+					if arr_len <= fixed_array_empty_literal_element_store_threshold {


Preserve zeroing for large fixed arrays on all backends

For empty fixed-array literals with more than 16 elements, this now skips the element stores and immediately loads the alloca, which is only safe for backends that added matching bulk-zero alloca handling. The v2 C backend still emits .alloca as an uninitialized local declaration, so compiling something like [17]int{} through that backend reads indeterminate stack data instead of zeros. Please keep an explicit zero-initialization representation that every backend lowers, or gate this shortcut to backends that actually zero these allocas.


GGRei · 2026-05-31T08:14:23Z

Conflict resolved. Codex Review accepted the latest changes.

@medvednikov: Nice work on the big V2 update. It moved a lot of important pieces forward and made the follow-up integration much cleaner.

chatgpt-codex-connector Bot reviewed May 29, 2026


GGRei marked this pull request as draft May 29, 2026 10:21

GGRei added 2 commits May 29, 2026 13:30

v2: implement native x64 backend, part 1

4d2c877

v2: address native x64 backend review feedback

7ed93e2

GGRei force-pushed the v2-x64-native-backend-part1-upstream branch from 77c3b6d to 7ed93e2 Compare May 29, 2026 11:40

GGRei added 6 commits May 29, 2026 14:17

v2: fix minimal PE stdio reachability

1cec180

v2: restore non-Windows stdio path

0db80fe

v2: stabilize native x64 smoke follow-ups

3bbc423

v2: close native x64 review follow-ups

e9ff005

v2: fix Windows x64 slice sret lowering

d2df6d8

v2: prevent numeric literals from inheriting aggregate SSA types

2467121

GGRei marked this pull request as ready for review May 29, 2026 21:30

chatgpt-codex-connector Bot reviewed May 29, 2026


GGRei marked this pull request as draft May 29, 2026 21:45

v2: close native x64 review follow-ups

3801571

GGRei marked this pull request as ready for review May 29, 2026 22:52

chatgpt-codex-connector Bot reviewed May 29, 2026


v2: address native x64 review edge cases

faaaf15

GGRei marked this pull request as draft May 29, 2026 23:34

GGRei marked this pull request as ready for review May 29, 2026 23:34

chatgpt-codex-connector Bot reviewed May 29, 2026


v2: harden native x64 review edge cases

030ca13

GGRei marked this pull request as draft May 30, 2026 00:17

GGRei marked this pull request as ready for review May 30, 2026 00:17

chatgpt-codex-connector Bot reviewed May 30, 2026


v2: preserve qualified const function aliases

4956d41

GGRei marked this pull request as draft May 30, 2026 07:00

GGRei marked this pull request as ready for review May 30, 2026 07:00

Merge upstream master into native x64 backend

f803dbd

GGRei marked this pull request as draft May 31, 2026 07:26

GGRei marked this pull request as ready for review May 31, 2026 07:26

chatgpt-codex-connector Bot reviewed May 31, 2026


GGRei marked this pull request as draft May 31, 2026 07:50

Fix Windows errno and fixed-array zeroing

e34f661

GGRei marked this pull request as ready for review May 31, 2026 08:03

medvednikov merged commit 81a5657 into vlang:master May 31, 2026
63 of 94 checks passed
