This tool parses annotated C++ headers via libclang and emits
backend-specific binding source. Embind is the default backend
(bindings/wasm/mdx_bindings.cpp), and a pybind11 backend is available via
--backend pybind11. Both emitters consume the same backend-neutral IR.

# Embind (default backend) -> bindings/wasm/<module>_bindings.cpp
python -m tools.codegen.codegen mdx
python -m tools.codegen.codegen m2
python -m tools.codegen.codegen m3
# pybind11 -> bindings/python/<module>_bindings.cpp
python -m tools.codegen.codegen mdx --backend pybind11
python -m tools.codegen.codegen m2 --backend pybind11
python -m tools.codegen.codegen m3 --backend pybind11

scripts/build-wasm.ps1 runs the Embind codegen automatically before
cmake --build. scripts/build-python.ps1 does the equivalent for the
pybind11 backend, builds the .pyd, and stages it into bindings/python/.
To add, remove, or alter a binding, edit the C++ header, not the
generated .cpp:
- Add or modify an @bind annotation in include/whiteout/....
- Run python -m tools.codegen.codegen mdx.
- Build: scripts\build-wasm.ps1.
- Test: node --test tests\wasm\smoke.test.js.

mdx_bindings.cpp carries a // AUTOGENERATED banner; never edit it
directly.
Annotations live in C++ doc comments (///, /**…*/, or trailing ///<
on a field). The codegen reads them via libclang's cursor.raw_comment.
/// @bind // include this declaration
/// @bind value_object // bind as Embind value_object
/// @bind rename=isHd // rename the field/JS member
/// @bind skip // exclude this field
/// @bind array_with_view // vector<u8>: also emit *View()
/// @bind js_name=MdxNoParent // override JS-side name
/// @bind cpp_expr=Track<f32>::kFoo // override C++ source expression
/// @bind fields=x;y;z // explicit field list (anon unions)
/// @bind track_template, instantiate=... // template marker (Track<T>)

Multiple modifiers per line are comma-separated:

/// @bind value_object, fields=x;y;z

libclang's raw_comment association rules vary; both leading
(/// @bind) and trailing (///< @bind) forms work, but a field with
both a leading /// and a trailing ///< will only see the trailing
one. So when a field already has a trailing description, put the
annotation in the trailing comment:
std::vector<u8> vertexGroups; ///< @bind array_with_view — Bone groups per vertex

The em-dash (—) or -- separates the annotation from the
human-readable text.
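
The annotation grammar described above (comma-separated modifiers, optional key=value pairs, and an em-dash or -- before the description) can be sketched as a small parser. This is only an illustration of the rules as stated here, not the actual contents of annotations.py; the function name parse_bind_annotation and the dict return shape are assumptions.

```python
import re

# Everything after "@bind" on the same comment line.
_BIND_RE = re.compile(r"@bind\b(.*)")
# An em-dash (U+2014) or "--" surrounded by whitespace ends the annotation.
_SEPARATOR_RE = re.compile(r"\s(?:\u2014|--)\s")


def parse_bind_annotation(raw_comment):
    """Return a dict of modifiers, or None if the comment has no @bind.

    Hypothetical helper: bare modifiers map to True, key=value modifiers
    map to the value string.
    """
    for line in raw_comment.splitlines():
        match = _BIND_RE.search(line)
        if match is None:
            continue
        # Drop the human-readable description after the separator.
        spec = _SEPARATOR_RE.split(match.group(1), maxsplit=1)[0]
        # Tolerate a block-comment terminator on the same line.
        spec = spec.replace("*/", "").strip()
        modifiers = {}
        for part in spec.split(","):
            part = part.strip()
            if not part:
                continue
            key, sep, value = part.partition("=")
            modifiers[key.strip()] = value.strip() if sep else True
        return modifiers
    return None


trailing = "///< @bind array_with_view \u2014 Bone groups per vertex"
combined = "/// @bind value_object, fields=x;y;z"
print(parse_bind_annotation(trailing))
print(parse_bind_annotation(combined))
```

A bare /// @bind yields an empty dict under this sketch, which a caller could treat as "include this declaration with default settings", while a comment without @bind yields None.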
Per-format settings live in tools/codegen/modules/<name>.py:
CONFIG = ModuleConfig(
    name='mdx',
    cpp_namespace='whiteout::mdx',
    js_prefix='Mdx',  # MdxBone, MdxLayer, …
    embind_block='mdx',  # EMSCRIPTEN_BINDINGS(mdx) { … }
    headers=[
        'include/whiteout/vector_types.h',
        'include/whiteout/models/mdx/types.h',
        'include/whiteout/models/mdx/structures.h',
    ],
    output_path='bindings/wasm/mdx_bindings.cpp',
    include_dirs=['include'],
    skip_vector_js_names=['VectorU8', 'VectorString'],  # registered in bindings.cpp
)

When you add a sibling module (M2, M3, WEM), copy modules/mdx.py and
update the module-specific fields (name, cpp_namespace, js_prefix,
embind_block, headers, output_path).
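
To make the module-specific fields concrete, here is a rough sketch of what a ModuleConfig could look like and which fields change for a sibling module. The dataclass definition, the sibling_config helper, the whiteout::m2 namespace, and the M2 header path are all assumptions for illustration; the project itself copies modules/mdx.py and edits the fields by hand.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class ModuleConfig:
    # Field names mirror the keyword arguments in modules/mdx.py;
    # the types and defaults are assumptions.
    name: str
    cpp_namespace: str
    js_prefix: str
    embind_block: str
    headers: List[str]
    output_path: str
    include_dirs: List[str] = field(default_factory=lambda: ["include"])
    skip_vector_js_names: List[str] = field(default_factory=list)


def sibling_config(base, name, js_prefix, headers):
    """Hypothetical helper: derive a sibling config by swapping the
    module-specific fields and keeping everything else from `base`."""
    return replace(
        base,
        name=name,
        cpp_namespace=f"whiteout::{name}",  # assumed namespace pattern
        js_prefix=js_prefix,
        embind_block=name,
        headers=list(headers),
        output_path=f"bindings/wasm/{name}_bindings.cpp",
    )


MDX = ModuleConfig(
    name="mdx",
    cpp_namespace="whiteout::mdx",
    js_prefix="Mdx",
    embind_block="mdx",
    headers=[
        "include/whiteout/vector_types.h",
        "include/whiteout/models/mdx/types.h",
        "include/whiteout/models/mdx/structures.h",
    ],
    output_path="bindings/wasm/mdx_bindings.cpp",
    skip_vector_js_names=["VectorU8", "VectorString"],
)

# The M2 header path below is a placeholder, not a confirmed project path.
M2 = sibling_config(MDX, "m2", "M2", ["include/whiteout/models/m2/types.h"])
print(M2.output_path)
```

Shared settings such as include_dirs and skip_vector_js_names carry over unchanged in this sketch, which matches the idea that only the per-format fields need editing.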
tools/codegen/
├── annotations.py # @bind comment parser
├── ir.py # backend-neutral data classes (BindModule, BindClass…)
├── parser.py # libclang -> IR
├── emit_embind.py # IR -> Embind C++
├── emit_pybind.py # IR -> pybind11 C++
├── codegen.py # CLI: python -m tools.codegen.codegen <module>
└── modules/
    └── mdx.py # per-module config
The IR is intentionally minimal — every binding is a class, an enum, a constant, a Track instantiation, or a vector container. The Embind emitter resolves naming (Mdx prefix, value_object vs class, vector container names) from the IR plus a small set of conventions that match the project's hand-written code.
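
As a rough picture of how small such an IR can be, the sketch below models the five binding kinds as plain dataclasses. Only the names BindModule and BindClass come from this document; BindField, BindEnum, BindConstant, every attribute name, and the sample values are assumptions and will differ from the real ir.py.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class BindField:  # hypothetical
    cpp_name: str
    js_name: str


@dataclass
class BindClass:
    cpp_name: str
    value_object: bool = False
    fields: List[BindField] = field(default_factory=list)


@dataclass
class BindEnum:  # hypothetical
    cpp_name: str
    values: List[str] = field(default_factory=list)


@dataclass
class BindConstant:  # hypothetical
    js_name: str
    cpp_expr: str


@dataclass
class BindModule:
    name: str
    classes: List[BindClass] = field(default_factory=list)
    enums: List[BindEnum] = field(default_factory=list)
    constants: List[BindConstant] = field(default_factory=list)
    track_instantiations: List[str] = field(default_factory=list)
    vector_containers: List[str] = field(default_factory=list)


def binding_counts(module):
    """Count the bindings of each kind held by a module."""
    return {
        "classes": len(module.classes),
        "enums": len(module.enums),
        "constants": len(module.constants),
        "tracks": len(module.track_instantiations),
        "vectors": len(module.vector_containers),
    }


# Illustrative content only; these declarations are not taken from the headers.
module = BindModule(name="mdx")
module.classes.append(
    BindClass("Bone", fields=[BindField(cpp_name="hd", js_name="isHd")])
)
module.constants.append(BindConstant(js_name="MdxNoParent", cpp_expr="kNoParent"))
module.track_instantiations.append("Track<f32>")
print(binding_counts(module))
```

Note that nothing in these classes mentions Embind or pybind11; backend-specific decisions such as prefixes and container names stay in the emitters.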
Both Embind and pybind11 are implemented. The IR is shared; only the emitter differs:
| Concept | Embind | pybind11 |
|---|---|---|
| Class | class_<C>("C").constructor<>() | py::class_<C>(m, "C").def(py::init<>()) |
| Read/write field | .property(name, &C::f) | .def_readwrite(name, &C::f) |
| Value object | value_object<V>("V").field(...) | regular py::class_ (no separate concept) |
| Enum | enum_<E>("E").value(...) | py::enum_<E>(m, "E").value(...) |
| Vector container | register_vector<T>("VecT") | PYBIND11_MAKE_OPAQUE(...) + py::bind_vector(...) |
| Constant | constant("Name", v) | m.attr("Name") = v |
| Bytes view | typed_memory_view(size, ptr) | py::memoryview::from_memory(...) |

Naming: Embind keeps the JS-style Mdx/M2/M3 prefix on every class
name (MdxBone, M2Bone). pybind11 strips the prefix because every type
already lives in a submodule (whiteout.mdx.Bone, whiteout.m2.Bone).
Python keyword names (None, True, False) are renamed (NONE,
TRUE, FALSE) on the Python side.
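
The naming rules above can be illustrated with a small emitter sketch for the enum row of the table. It is a simplified stand-in for emit_embind.py and emit_pybind.py, not their actual code; the function names and the FilterMode enum are hypothetical, and only the three renames listed above are handled.

```python
# Renames applied on the Python side only, as listed in the text.
PYTHON_RESERVED_RENAMES = {"None": "NONE", "True": "TRUE", "False": "FALSE"}


def python_member_name(name):
    """Rename members that would collide with Python keywords."""
    return PYTHON_RESERVED_RENAMES.get(name, name)


def emit_enum(backend, cpp_namespace, js_prefix, cpp_name, values):
    """Emit one enum binding statement for the chosen backend."""
    qualified = f"{cpp_namespace}::{cpp_name}"
    if backend == "embind":
        # Embind keeps the JS-style prefix and the original value names.
        head = f'enum_<{qualified}>("{js_prefix}{cpp_name}")'
        exposed = list(values)
    elif backend == "pybind11":
        # pybind11 drops the prefix; the submodule already scopes the name.
        head = f'py::enum_<{qualified}>(m, "{cpp_name}")'
        exposed = [python_member_name(v) for v in values]
    else:
        raise ValueError(f"unknown backend: {backend}")
    body = "".join(
        f'\n    .value("{name}", {qualified}::{value})'
        for name, value in zip(exposed, values)
    )
    return head + body + ";"


# FilterMode and its values are made up for this example.
for backend in ("embind", "pybind11"):
    print(emit_enum(backend, "whiteout::mdx", "Mdx", "FilterMode", ["None", "Blend"]))
```

Keeping the rename table in one place means a future backend-specific override would only need a check inside the relevant branch, in line with the note that follows.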
The same @bind annotations drive both backends. There are no
backend-specific overrides yet; if you need one (e.g. @bind pybind11_skip),
add a check in the relevant emitter.