niXman/jsonrefl

jsonrefl

Header-only C++14 library for compile-time reflection, zero-copy JSON parsing, and JSON serialization. No external dependencies in C++17 and later (in C++14 mode the header uses Boost for string_view and optional). No external code generation. Just one header.

Features

User-facing capabilities (implementation details live in Internals):

  • Header-only — drop include/jsonrefl/jsonrefl.hpp into your project.
  • Two registration macros
    • JSONREFL_METADATA(type, members...) registers an existing struct;
    • JSONREFL_STRUCT(type, (type, name)...) declares a struct and registers it in one shot.
  • Rich type support — nested structs, std::vector, std::list, std::map, std::unordered_map, jsonrefl::optional_t, std::string, jsonrefl::string_view_t, bool, integers, floats. See Supported Types and portable aliases.
  • Two parsing functions
    • parse() parses any sequence of const chunks (or one whole document) and writes into the target object, without de-escaping;
    • parse_m() parses any sequence of mutable chunks (or one whole document) and de-escapes in place, so jsonrefl::string_view_t members are true zero-copy slices of fully decoded text.
  • Three serialization paths: to_string(), required_bytes() + to_buffer() (one allocation, exact size), and to_chunked_buffer() (fixed-size buffer + flush callback for sockets / constrained memory).
  • Pretty-print — every serialization function takes a pretty flag.
  • Compile-time introspection — query struct name, member count, and member types by name.
  • Compile-time member index — name → setter resolution is constant-cost on the parser hot path, with a clash-detection guarantee that turns name collisions into build errors.

Installation

Single header. Either copy it in:

cp jsonrefl/include/jsonrefl/jsonrefl.hpp /your/project/include/jsonrefl/

and then include it:

#include <jsonrefl/jsonrefl.hpp>

…or add the include path from CMake:

cmake_minimum_required(VERSION 3.5)
project(myapp LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 14)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

include_directories(path/to/jsonrefl/include)
add_executable(myapp main.cpp)

Quick Start

Define a struct, register it, parse and serialize:

#include <jsonrefl/jsonrefl.hpp>
#include <iostream>

struct point {
    double x;
    double y;
};
JSONREFL_METADATA(point, x, y);

int main() {
    point p{};
    auto pp = jsonrefl::make_parser(&p);
    static constexpr char document[] = R"({"x":3.14,"y":2.71})";
    pp.parse(document, sizeof(document) - 1);
    // p.x == 3.14, p.y == 2.71

    std::cout << jsonrefl::to_string(p)       << '\n';   // {"x":3.140000,"y":2.710000}
    std::cout << jsonrefl::to_string(p, true) << '\n';   // pretty-printed
}

Supported Types

Portable aliases for string view and optional

Prefer jsonrefl::string_view_t and jsonrefl::optional_t<T> in reflected structs and in examples so the same client code builds in both language modes supported by this header:

  • C++14 — aliases resolve to boost::string_view and boost::optional<T> (the header includes the corresponding Boost headers when __cplusplus < 201703L).
  • C++17 and later — aliases resolve to std::string_view and std::optional<T>.

| C++ Type | JSON Representation |
| --- | --- |
| bool | true / false |
| int, int64_t, size_t, … | number |
| double, float | number |
| std::string | string |
| jsonrefl::string_view_t | string (zero-copy) |
| jsonrefl::optional_t<T> | value or null |
| std::vector<T> | array |
| std::list<T> | array |
| std::map<K, V> | object |
| std::unordered_map<K, V> | object |
| struct with JSONREFL_METADATA | object |
| nested combinations of the above | nested JSON |

Defining metadata

Two macros cover both common cases.

JSONREFL_METADATA — for an existing struct

Use when the struct is already defined (third-party header, generated code, your own type with a non-trivial constructor, …):

struct config {
    std::string host;
    int port;
};
JSONREFL_METADATA(config, host, port);

JSONREFL_STRUCT — declare and register in one shot

Use when the struct exists only as a JSON shape:

JSONREFL_STRUCT(
    point,
    (double, x),
    (double, y)
);

Nested types and containers

JSONREFL_METADATA composes — nested structs and standard containers Just Work:

struct color {
    std::string name;
    int r, g, b;
};
JSONREFL_METADATA(color, name, r, g, b);

struct palette {
    jsonrefl::string_view_t title;
    std::vector<color> colors;
};
JSONREFL_METADATA(palette, title, colors);

palette p{
    "Sunset",
    {
        {"Coral",   255, 127, 80},
        {"Gold",    255, 215,  0},
        {"Crimson", 220,  20, 60},
    }
};
std::cout << jsonrefl::to_string(p, true) << '\n';

Output:

{
   "title": "Sunset",
   "colors": [
      {"name": "Coral",   "r": 255, "g": 127, "b": 80},
      {"name": "Gold",    "r": 255, "g": 215, "b":  0},
      {"name": "Crimson", "r": 220, "g":  20, "b": 60}
   ]
}

Parsing

make_parser(&obj) returns a jsonrefl::parser<T> bound to obj. The parser exposes two entry points: parse() for general use and parse_m() for decoding escapes in place in a mutable buffer.

CTAD alternative: jsonrefl::parser p{&obj}; and jsonrefl::parser p{&obj, &accum}; work too — there is a deduction guide on the class. make_parser() is just a thin one-liner kept for symmetry with the rest of the API; pick whichever style you prefer.

State codes

Both parse() and parse_m() return jsonrefl::state:

| Value | Meaning |
| --- | --- |
| ok | Document is complete and the target object is fully populated |
| incomplete | More bytes needed: feed the next chunk |
| invalid | JSON is malformed or doesn't match the schema |
| extra_data | A complete document was parsed but the input has trailing non-whitespace bytes |
| no_buffer | An accum scratch buffer was needed (escape decoding or cross-chunk string) but the parser was constructed without accum in make_parser / ctor. Never returned by parse_m(). |
| sv_cross_chunk | A jsonrefl::string_view_t-typed value or key cannot live in the chosen feed shape. From parse() chunked: the value/key was split across two chunks (use std::string, or feed the whole document at once). From parse_m() chunked: contract C2 was violated by the producer. |

parse() — single-shot

state parse(const char *ptr, std::size_t size, flags fl = flags::none) — there is no overload that takes string_view/std::string directly; use ptr and size. If your span is already a jsonrefl::string_view_t sv, forward it as parse(sv.data(), sv.size()). No accum is needed unless the document contains strings with escapes that must be decoded into std::string members (accum is bound in make_parser, see chunked parse()).

config cfg{};
auto pp = jsonrefl::make_parser(&cfg);

static constexpr char document[] = R"({"host":"localhost","port":8080})";
auto st = pp.parse(document, sizeof(document) - 1);
// st == jsonrefl::state::ok
// cfg.host == "localhost", cfg.port == 8080

For jsonrefl::string_view_t members the result points directly into the input buffer — the buffer must outlive the target object.

parse() — streaming (chunked)

Feed chunks one at a time. Bind an accum scratch buffer with make_parser(&cfg, &accum) — the parser uses it to hold partial leaves that span chunk boundaries and to decode escape sequences into std::string members. parse() itself always takes (ptr, size) only:

config cfg{};
std::string accum;
auto pp = jsonrefl::make_parser(&cfg, &accum);

static constexpr char part0[] = R"({"host":"local)";
static constexpr char part1[] = R"(host","port":8080})";
pp.parse(part0, sizeof(part0) - 1);  // -> incomplete
pp.parse(part1, sizeof(part1) - 1);  // -> ok

Realistic loop with recv():

config cfg{};
std::string accum;
auto pp = jsonrefl::make_parser(&cfg, &accum);
std::array<char, 4096> buf{};

for (;;) {
    const auto n = ::recv(fd, buf.data(), buf.size(), 0);
    if (n <= 0) { break; }

    const auto st = pp.parse(buf.data(), static_cast<std::size_t>(n));
    if (st == jsonrefl::state::ok)         { /* cfg fully parsed */         break; }
    if (st != jsonrefl::state::incomplete) { /* invalid / extra_data / … */ break; }
}

Notes on chunked parse():

  • No accum in make_parser ⇒ as soon as the parser would need to buffer cross-chunk bytes or decode escapes, it returns state::no_buffer. accum may stay empty between calls — the parser only writes into it when needed.
  • jsonrefl::string_view_t members under chunked feed — if the value or key happens to span a chunk boundary, the parser cannot represent it as a contiguous slice of input. It returns state::sv_cross_chunk immediately on the chunk where the split is detected (no need to feed the next one to learn this). Use std::string, or pass the whole document in one parse() call, or use parse_m() on a mutable buffer.
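
The role of accum can be illustrated with a toy scanner that is not part of jsonrefl. The sketch below uses hypothetical names (quoted_scanner, scan_state) and ignores escapes entirely. It extracts the first quoted string from a chunked feed, buffers partial bytes in an external accumulator when the string straddles a chunk boundary, and reports no_buffer when no accumulator was supplied:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result codes, loosely modelled on the states described above.
enum class scan_state { ok, incomplete, no_buffer };

// Toy scanner (NOT jsonrefl's parser): finds the first "..." string in a
// chunked input. Escape sequences are deliberately not handled.
class quoted_scanner {
public:
    explicit quoted_scanner(std::string *accum = nullptr) : accum_(accum) {}

    scan_state feed(const char *ptr, std::size_t size) {
        std::size_t start = 0;  // first byte of the string inside THIS chunk
        for (std::size_t i = 0; i < size; ++i) {
            if (ptr[i] != '"') { continue; }
            if (!in_string_) {          // opening quote
                in_string_ = true;
                start = i + 1;
                continue;
            }
            // Closing quote: join any bytes buffered from earlier chunks.
            if (accum_ != nullptr && !accum_->empty()) {
                accum_->append(ptr + start, i - start);
                value_ = *accum_;
                accum_->clear();
            } else {
                value_.assign(ptr + start, i - start);
            }
            in_string_ = false;
            return scan_state::ok;
        }
        if (in_string_) {
            // The string continues in the next chunk: its bytes must be kept.
            if (accum_ == nullptr) { return scan_state::no_buffer; }
            accum_->append(ptr + start, size - start);
        }
        return scan_state::incomplete;
    }

    const std::string &value() const { return value_; }

private:
    std::string *accum_;      // optional scratch buffer owned by the caller
    std::string value_;       // last completed string
    bool in_string_ = false;  // true while between an opening and closing quote
};
```

A string that fits in one chunk never touches the accumulator, which matches the note that accum may stay empty between calls.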

parse_m() — in-source single-shot (zero-copy escapes)

parse_m(char *ptr, std::size_t size) parses a mutable buffer in place. Escape sequences (\n, \t, \", \uXXXX, surrogate pairs) are decoded directly into the same buffer, shrinking each string. jsonrefl::string_view_t members then point at fully-decoded slices — no std::string fallback, no accum, state::no_buffer is never returned:

char buf[] = R"({"id":"42","msg":"hello\nworld","tag":"\uD83D\uDE00"})";

struct evt {
    jsonrefl::string_view_t id;
    jsonrefl::string_view_t msg;
    jsonrefl::string_view_t tag;
};
JSONREFL_METADATA(evt, id, msg, tag);

evt e{};
auto pp = jsonrefl::make_parser(&e);
auto st = pp.parse_m(buf, sizeof(buf) - 1);
// st == jsonrefl::state::ok
// e.id  -> "42"               (slice of buf)
// e.msg -> "hello\nworld"     (decoded \n, slice of buf)
// e.tag -> "\xF0\x9F\x98\x80" (decoded surrogate pair, slice of buf)
// buf must outlive e

parse_m() single-shot return states: ok, incomplete (buffer was shorter than the document), invalid, extra_data. no_buffer is never returned.
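
In-place decoding works because a decoded escape is never longer than its encoded form, so the write cursor can never overtake the read cursor. The following is a minimal standalone sketch of that idea, not jsonrefl's implementation; unescape_in_place, hex4, put_utf8, and bad_escape are hypothetical names, and the input is assumed to be the raw bytes between the quotes of one JSON string:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t bad_escape = static_cast<std::size_t>(-1);

// Parses four hex digits at p; returns -1 if any character is not hex.
inline int hex4(const char *p) {
    int v = 0;
    for (int i = 0; i < 4; ++i) {
        const char c = p[i];
        v <<= 4;
        if (c >= '0' && c <= '9')      { v |= c - '0'; }
        else if (c >= 'a' && c <= 'f') { v |= c - 'a' + 10; }
        else if (c >= 'A' && c <= 'F') { v |= c - 'A' + 10; }
        else { return -1; }
    }
    return v;
}

// Writes code point cp as UTF-8 at out and returns the advanced pointer.
inline char *put_utf8(char *out, std::uint32_t cp) {
    if (cp < 0x80u) {
        *out++ = static_cast<char>(cp);
    } else if (cp < 0x800u) {
        *out++ = static_cast<char>(0xC0u | (cp >> 6));
        *out++ = static_cast<char>(0x80u | (cp & 0x3Fu));
    } else if (cp < 0x10000u) {
        *out++ = static_cast<char>(0xE0u | (cp >> 12));
        *out++ = static_cast<char>(0x80u | ((cp >> 6) & 0x3Fu));
        *out++ = static_cast<char>(0x80u | (cp & 0x3Fu));
    } else {
        *out++ = static_cast<char>(0xF0u | (cp >> 18));
        *out++ = static_cast<char>(0x80u | ((cp >> 12) & 0x3Fu));
        *out++ = static_cast<char>(0x80u | ((cp >> 6) & 0x3Fu));
        *out++ = static_cast<char>(0x80u | (cp & 0x3Fu));
    }
    return out;
}

// Decodes JSON escapes inside s[0, n) in place. Returns the decoded length,
// or bad_escape on a malformed or truncated escape sequence.
inline std::size_t unescape_in_place(char *s, std::size_t n) {
    char *out = s;
    std::size_t i = 0;
    while (i < n) {
        if (s[i] != '\\') { *out++ = s[i++]; continue; }
        if (i + 1 >= n) { return bad_escape; }
        const char e = s[i + 1];
        i += 2;
        switch (e) {
            case '"':  *out++ = '"';  break;
            case '\\': *out++ = '\\'; break;
            case '/':  *out++ = '/';  break;
            case 'b':  *out++ = '\b'; break;
            case 'f':  *out++ = '\f'; break;
            case 'n':  *out++ = '\n'; break;
            case 'r':  *out++ = '\r'; break;
            case 't':  *out++ = '\t'; break;
            case 'u': {
                if (i + 4 > n) { return bad_escape; }
                const int hi = hex4(s + i);
                if (hi < 0) { return bad_escape; }
                i += 4;
                std::uint32_t cp = static_cast<std::uint32_t>(hi);
                if (hi >= 0xD800 && hi <= 0xDBFF) {
                    // High surrogate: a \uXXXX low surrogate must follow.
                    if (i + 6 > n || s[i] != '\\' || s[i + 1] != 'u') { return bad_escape; }
                    const int lo = hex4(s + i + 2);
                    if (lo < 0xDC00 || lo > 0xDFFF) { return bad_escape; }
                    i += 6;
                    cp = 0x10000u + ((cp - 0xD800u) << 10)
                       + (static_cast<std::uint32_t>(lo) - 0xDC00u);
                } else if (hi >= 0xDC00 && hi <= 0xDFFF) {
                    return bad_escape;  // lone low surrogate
                }
                out = put_utf8(out, cp);
                break;
            }
            default: return bad_escape;
        }
    }
    return static_cast<std::size_t>(out - s);
}
```

After the call, a string view over s with the returned length is a slice of the original buffer holding fully decoded text, which is the property the string_view_t members rely on.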

parse_m() — chunked (under contracts C1 + C2)

parse_m() may also be called repeatedly with different mutable buffers, streaming a document through the parser without ever copying. Two contracts must hold:

  • C1 — buffer lifetime. Every buffer fed to parse_m() must outlive any jsonrefl::string_view_t-typed field that ended up pointing into it (typically: lifetime of the target object). Buffers are independent regions of memory — you can recv() each chunk into a fresh std::vector<char> and store them.
  • C2 — atomic leaves. Every leaf JSON value (string, number, true, false, null) and every key, together with the chain of bytes that terminates it, lies entirely inside one buffer. The terminating chain is whatever bytes immediately follow the leaf in serialised form: a single , between siblings, or one or more closing }/] (when the leaf is the last item of one or more nested containers collapsing at once), optionally followed by a trailing ,. Structural pieces in the middle of containers and whitespace between tokens may fall on any byte boundary.

When both contracts hold, parse_m() returns incomplete after each non-final chunk and ok on the last one. jsonrefl::string_view_t members from earlier chunks remain valid because their backing buffers stay alive (C1).
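
One way to satisfy C1 is to keep every received chunk in a container that never relocates its elements. The sketch below is an illustration with a hypothetical chunk_store type, not part of jsonrefl. It relies on the standard guarantee that std::deque::emplace_back does not invalidate references to existing elements:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical chunk store: owns every received buffer for as long as the
// parsed object (and any string views pointing into the buffers) is alive.
class chunk_store {
public:
    // Copies [data, data + n) into a fresh buffer and returns a mutable
    // pointer to it, suitable for handing to an in-place parser.
    char *add(const char *data, std::size_t n) {
        chunks_.emplace_back(data, data + n);
        return chunks_.back().data();
    }

    std::size_t size() const { return chunks_.size(); }

private:
    // Adding to the back of a deque leaves existing elements where they
    // are, so each inner vector, and the bytes it owns, stays at a stable
    // address while further chunks arrive.
    std::deque<std::vector<char>> chunks_;
};
```

Destroying the store releases all buffers at once, so it should live at least as long as the target object.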

to_chunked_buffer() automatically satisfies C2 in compact mode (pretty=false) when both:

  1. buf_size is large enough to hold the longest leaf together with its longest terminating chain (a few bytes for typical schemas; deep collapsing nesting adds one byte per level), and
  2. that terminating chain is no longer than 16 bytes (tail_buf[16] is an internal staging buffer; ~15+ levels collapsing at the same point fall back to separate writes — pathologically deep nesting only).

Internally the writer collects each leaf and its trailing chain into a single span before deciding to flush, so a chunk boundary cannot split <leaf><chain> while both conditions above hold. If the leaf alone is larger than buf_size, the writer falls back to multiple write() calls and atomicity is sacrificed (long-leaf fallback). Pretty mode does not provide the C2 guarantee at all — its multi-byte separators (,\n<indent>) are emitted as three separate writes; if you need chunked parse_m() round-tripping, serialise compact.

The writer/reader pair in this library therefore round-trips trivially:

std::vector<std::vector<char>> chunks;
char tmp[64];                          // > max leaf + framing -> C2 holds
jsonrefl::to_chunked_buffer(tmp, sizeof(tmp), src,
    [&](const void *data, std::size_t n) -> bool {
        chunks.emplace_back(static_cast<const char*>(data),
                            static_cast<const char*>(data) + n);
        return true;
    }
);

std::vector<kv> sink;
auto pp = jsonrefl::make_parser(&sink);
jsonrefl::state st = jsonrefl::state::incomplete;
for ( auto &c : chunks ) {
    st = pp.parse_m(c.data(), c.size());      // each chunk is its own buffer
    // st == incomplete except for the last chunk where st == ok
}

If C2 is violated (a leaf value or key is split across two buffers), parse_m() does not silently corrupt subsequent chunks — it returns state::sv_cross_chunk immediately on the chunk where the split is detected, so you can see the protocol error and reset().

parse_m() chunked return states: ok, incomplete, invalid, extra_data, sv_cross_chunk. no_buffer is never returned.

Resetting the parser

After any terminal state (invalid, extra_data, sv_cross_chunk) — or when you want to reuse a parser for a fresh document — call reset():

auto pp = jsonrefl::make_parser(&obj);
static constexpr char bad_json[] = R"(not json)";
static constexpr char good_json[] = R"({"i":1,"s":"x"})";  // illustration; match your `T`
auto st = pp.parse(bad_json, sizeof(bad_json) - 1);
// st == jsonrefl::state::invalid

pp.reset();
obj = {};

st = pp.parse(good_json, sizeof(good_json) - 1);
// st == jsonrefl::state::ok

Root containers (no wrapping struct)

Standard containers can be parsed at the document root without a wrapping struct:

std::vector<int> nums;
static constexpr char nums_doc[] = "[10, 20, 30]";
jsonrefl::make_parser(&nums).parse(nums_doc, sizeof(nums_doc) - 1);
// nums == {10, 20, 30}

std::map<std::string, std::string> kv;
static constexpr char kv_doc[] = R"({"key":"value"})";
jsonrefl::make_parser(&kv).parse(kv_doc, sizeof(kv_doc) - 1);
// kv == {{"key", "value"}}

Serialization

Three entry points, in order of escalation: convenient → exact-size → streaming.

to_string()

The convenient path. Returns std::string with JSON. Pass true for pretty-printed output:

auto json   = jsonrefl::to_string(obj);         // compact
auto pretty = jsonrefl::to_string(obj, true);   // indented

required_bytes() + to_buffer()

Exact-size path (one allocation, made by the caller): compute the exact size, then write into your own buffer in one shot:

const auto n = jsonrefl::required_bytes(obj);
std::string buf;
buf.resize(n);
// &buf[0] rather than buf.data(): std::string::data() returns const char* before C++17
char *end = jsonrefl::to_buffer(&buf[0], obj);
// end - &buf[0] == n, guaranteed
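
The measure-then-write pattern is not specific to jsonrefl. As a standalone analogy, the sketch below does the same two passes with std::snprintf, which reports the required length when given a null buffer and size 0. The names format_point and point_to_string are hypothetical, and this is not how jsonrefl computes sizes internally:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats a point as compact JSON. With buf == nullptr and cap == 0 nothing
// is written and the return value is the number of bytes that would be
// needed (excluding the terminating '\0').
inline int format_point(char *buf, std::size_t cap, double x, double y) {
    return std::snprintf(buf, cap, "{\"x\":%f,\"y\":%f}", x, y);
}

inline std::string point_to_string(double x, double y) {
    const int n = format_point(nullptr, 0, x, y);            // pass 1: measure
    if (n < 0) { return std::string(); }                     // encoding error
    std::string out(static_cast<std::size_t>(n) + 1, '\0');  // +1 for snprintf's '\0'
    format_point(&out[0], out.size(), x, y);                 // pass 2: write
    out.resize(static_cast<std::size_t>(n));                 // drop the terminator slot
    return out;
}
```

The single allocation happens between the two passes, so the buffer is sized exactly once and never grows.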

to_chunked_buffer()

Streaming serialization into a fixed-size buffer with a flush callback. Ideal for sockets and constrained memory:

char chunk[1472];   // e.g. max UDP payload for a 1500-byte Ethernet MTU
jsonrefl::to_chunked_buffer(chunk, sizeof(chunk), obj,
    [&](const void *data, std::size_t size) -> bool {   // [&] captures fd
        const auto sent = ::send(fd, data, size, 0);    // write chunk
        return sent == static_cast<ssize_t>(size);      // returning false aborts
    }
);

In compact mode this writer is the C2-compatible producer for chunked parse_m() — see parse_m() chunked for the exact conditions.
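
The fixed-buffer-plus-callback shape can be sketched independently of the library. The function below, with the hypothetical name write_chunked, copies a byte range through a caller-supplied buffer and invokes the flush callback whenever the buffer fills, stopping early if the callback returns false. It only illustrates the flush protocol; unlike the writer described above, it makes no attempt to keep leaves atomic:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Streams src[0, n) through buf[0, buf_size). flush(data, size) is called
// for every full buffer and once more for a non-empty remainder.
// Returns false if buf_size is 0 or if flush asked to abort.
template <typename Flush>
bool write_chunked(char *buf, std::size_t buf_size,
                   const char *src, std::size_t n, Flush flush) {
    if (buf_size == 0) { return false; }
    std::size_t used = 0;
    while (n > 0) {
        const std::size_t room = buf_size - used;
        const std::size_t take = n < room ? n : room;
        std::memcpy(buf + used, src, take);
        used += take;
        src  += take;
        n    -= take;
        if (used == buf_size) {
            if (!flush(static_cast<const void *>(buf), used)) { return false; }
            used = 0;  // the buffer is reused for the next chunk
        }
    }
    // Final partial chunk, if any.
    return used == 0 || flush(static_cast<const void *>(buf), used);
}
```

Because the same buffer is reused after every flush, the callback has to consume or copy the bytes before it returns.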

Pretty printing

All three serializers accept the same trailing pretty flag (default false):

jsonrefl::to_string         (obj, /* pretty = */ true);
jsonrefl::required_bytes    (obj, /* pretty = */ true);
jsonrefl::to_buffer    (buf, obj, /* pretty = */ true);
jsonrefl::to_chunked_buffer (chunk, sizeof(chunk), obj, cb, /* pretty = */ true);

Root containers (no wrapping struct)

Standard containers can be serialized at the document root, without a wrapping struct:

std::vector<int>            v = {1, 2, 3};
std::map<std::string, int>  m = {{"a", 1}, {"b", 2}};

jsonrefl::to_string(v);   // [1,2,3]
jsonrefl::to_string(m);   // {"a":1,"b":2}

Internals: hybrid member index

This section is implementation detail — useful for performance-conscious users and for understanding error messages, but not required to use the library.

Resolving an incoming key (e.g. "preventedMatchId") to the right setter happens on the parser's hot path, so jsonrefl invests in keeping that lookup branchless and cache-friendly. The mechanism is selected at compile time per struct, based on the number of JSONREFL_METADATA(...) fields N.

Strategy by N

| N | Strategy | Storage | Lookup |
| --- | --- | --- | --- |
| 0 … 16 | linear_index<N> (true MPHF) | two parallel std::array<…, N> of (hash, setter*) | for (i: 0..N) if (hashes[i] == h) return setters[i]; auto-vectorises into vpcmpeqd + vpmovmskb + tzcnt |
| 17 … ∞ | phf_index<N, M> (sparse PHF) | M = next_pow2(N*4) slots of (hash, setter*) | slot = phf_slot<M>(h, mult, seed); return used[slot] && table[slot].hash == h ? setter : nullptr; |

Both paths key off the same compile-time FNV-1a hash and resolve to the same setter_base*, so the choice is purely about latency.
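
For reference, FNV-1a is small enough to evaluate at compile time in C++14. The sketch below assumes the 32-bit variant (the document only states FNV-1a; the 32-bit width is inferred from the uint32_t lanes mentioned for the linear index), and the function name fnv1a is an illustration rather than the library's own identifier:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a: XOR each byte into the state, then multiply by the prime.
constexpr std::uint32_t fnv1a(const char *s, std::size_t n) {
    std::uint32_t h = 2166136261u;                 // FNV offset basis
    for (std::size_t i = 0; i < n; ++i) {
        h ^= static_cast<unsigned char>(s[i]);
        h *= 16777619u;                            // FNV prime, wraps mod 2^32
    }
    return h;
}

// Convenience overload for string literals (drops the trailing '\0').
template <std::size_t N>
constexpr std::uint32_t fnv1a(const char (&lit)[N]) {
    return fnv1a(lit, N - 1);
}

// Evaluated by the compiler; these are published FNV-1a test vectors.
static_assert(fnv1a("") == 2166136261u, "empty string hashes to the offset basis");
static_assert(fnv1a("a") == 0xE40C292Cu, "FNV-1a 32-bit of \"a\"");
```

Because the hash of a field name is a constant expression, it can be stored in a constexpr table and compared against the hash of an incoming key computed at run time with the same function.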

Why two flavours

N ≤ 16 — linear scan wins on real CPUs. A packed pair-of-arrays beats a sparse table for three reasons:

  • No modular index. Slot i is literally the declaration position of the i-th field, so the index is a minimal perfect hash (get_index(name) == i). No & (M - 1), no probe sequence, no indirection through used[].
  • SIMD-friendly memory layout. All hashes sit in one or two contiguous SIMD registers; the compiler reliably turns the loop into a vectorised compare:
    • AVX2: 8 uint32_t per ymm ⇒ ≤ 2 vector compares for N ≤ 16.
    • AVX-512: 16 uint32_t per zmm ⇒ 1 vector compare for N ≤ 16.
  • Branch-free reduction. vpmovmskb + tzcnt recover the matching slot index without a misprediction-prone scalar loop.

Empirically this beats the sparse PHF up to roughly N = 17 on AVX2 and N = 32 on AVX-512. The threshold is set at 16 so the same binary stays optimal on AVX2 hardware without sacrificing AVX-512 systems.
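
The linear flavour can be pictured as a packed array of hashes whose positions are the declaration indices. The following sketch is a simplified illustration, not the library's linear_index<N>: it stores only hashes, returns the position instead of a setter pointer, and uses hypothetical names (linear_index_sketch, fnv1a32, point_index):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a over a NUL-terminated string (assumed hash width).
constexpr std::uint32_t fnv1a32(const char *s) {
    std::uint32_t h = 2166136261u;
    for (; *s != '\0'; ++s) {
        h ^= static_cast<unsigned char>(*s);
        h *= 16777619u;
    }
    return h;
}

template <std::size_t N>
struct linear_index_sketch {
    std::array<std::uint32_t, N> hashes;  // slot i == declaration position i

    // Returns the declaration position of the matching field, or N if the
    // key is unknown. The fixed, small trip count over contiguous uint32_t
    // values is the kind of loop compilers can turn into vector compares.
    constexpr std::size_t find(std::uint32_t h) const {
        for (std::size_t i = 0; i < N; ++i) {
            if (hashes[i] == h) { return i; }
        }
        return N;
    }
};

// Index for a struct with fields x, y, z, built at compile time.
constexpr linear_index_sketch<3> point_index{
    {{fnv1a32("x"), fnv1a32("y"), fnv1a32("z")}}};

static_assert(point_index.find(fnv1a32("y")) == 1, "y is the second field");
```

Since slot i is the i-th declared field, no separate mapping from slot to member is needed, which is what makes the index minimal.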

N > 16 — sparse PHF wins on cache and on instructions. Once the hash array no longer fits in one or two vector registers, the linear scan starts costing extra cache lines and extra compares. The sparse PHF instead does:

slot = (((hash * mult) >> shift) ^ seed) & (M - 1);
return used[slot] && table[slot].hash == hash ? table[slot].setter : nullptr;

One multiply, one shift, one XOR, one mask, and one table load, independent of N. The cost is memory: M = next_pow2(N*4) slots (e.g. N = 20 ⇒ M = 128).
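
To make the slot formula concrete, here is a runtime sketch of a brute-force parameter search over the same expression. Several details are assumptions rather than facts about jsonrefl: the names (find_phf, slot_of, phf_params), the choice shift = 32 - log2(M), the search range, and the simplification of keeping seed fixed at 0 and varying only the multiplier:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

inline std::size_t next_pow2(std::size_t v) {
    std::size_t p = 1;
    while (p < v) { p <<= 1; }
    return p;
}

// Same shape as the slot expression above.
inline std::uint32_t slot_of(std::uint32_t h, std::uint32_t mult, std::uint32_t seed,
                             unsigned shift, std::uint32_t mask) {
    return (((h * mult) >> shift) ^ seed) & mask;
}

struct phf_params {
    bool found;
    std::uint32_t mult;
    std::uint32_t seed;
    unsigned shift;
    std::uint32_t mask;
};

// Searches odd multipliers until every hash lands in its own slot.
// Requires a non-empty list of distinct hashes.
inline phf_params find_phf(const std::vector<std::uint32_t> &hashes) {
    const std::size_t m = next_pow2(hashes.size() * 4);   // M = next_pow2(N*4)
    unsigned bits = 0;
    while ((std::size_t(1) << bits) < m) { ++bits; }
    // Assumed shift: keep the top log2(M) bits of the product.
    phf_params p{false, 0u, 0u, 32u - bits, static_cast<std::uint32_t>(m - 1)};
    if (hashes.empty()) { return p; }
    for (std::uint32_t mult = 1u; mult < (1u << 16); mult += 2u) {
        std::vector<bool> used(m, false);
        bool injective = true;
        for (std::uint32_t h : hashes) {
            const std::uint32_t s = slot_of(h, mult, p.seed, p.shift, p.mask);
            if (used[s]) { injective = false; break; }   // two keys share a slot
            used[s] = true;
        }
        if (injective) { p.found = true; p.mult = mult; return p; }
    }
    return p;   // search space exhausted
}
```

With four times as many slots as keys, a large share of multipliers is typically collision-free, so the search tends to end after a handful of candidates. Exhausting the range corresponds to the failure path described under hard-mode construction.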

Hard-mode construction

Both flavours treat a name collision as a fatal construction error rather than a silent wrong-setter return:

  • linear_index<N> — after copying (hash, setter*) pairs, the constructor runs an O(N²) pairwise check. Two equal hashes call linear_index_collision_detected().
  • phf_index<N, M> — the constructor brute-force-searches (mult, seed) pairs looking for an injective mapping into M slots. If the search space is exhausted, it calls phf_perfect_search_failed().

Both *_failed() / *_detected() functions are intentionally non-constexpr. The effect depends on how the holder is instantiated:

| Instantiation site | What happens on collision |
| --- | --- |
| JSONREFL_METADATA(...) (expands to namespace-scope inline constexpr auto __jsonrefl_meta_T = …), i.e. the standard usage | Compile error: calling a non-constexpr function from a constant-initialiser is ill-formed. The compiler points at the offending __jsonrefl_meta_T definition. |
| Runtime-built holder (e.g. static const auto h = jsonrefl::object_holder(...) from a factory) | std::abort() at first construction: a one-line message goes to stderr and the process dies. Acts as a backstop for code paths the compiler can't constant-fold. |

The result is the same in either path: a name collision can't slip through into runtime behaviour.
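
The compile-error-or-abort behaviour follows from a general C++14 rule: a constexpr function may contain a call to a non-constexpr function, as long as that call is not reached during constant evaluation. A minimal standalone sketch of the trick, with hypothetical names (collision_detected_sketch, all_unique) and a plain hash array in place of the library's holder:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Deliberately NOT constexpr: reaching this call during constant
// evaluation makes the enclosing initialiser ill-formed.
inline void collision_detected_sketch() { std::abort(); }

// O(N^2) pairwise uniqueness check, usable at compile time and at run time.
constexpr bool all_unique(const std::uint32_t *h, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = i + 1; j < n; ++j) {
            if (h[i] == h[j]) { collision_detected_sketch(); }
        }
    }
    return true;
}

constexpr std::uint32_t good_hashes[] = {10u, 20u, 30u};
static_assert(all_unique(good_hashes, 3), "distinct hashes pass at compile time");

// Uncommenting the next two lines is expected to break the build, because
// the duplicate forces a call to the non-constexpr function:
// constexpr std::uint32_t bad_hashes[] = {10u, 20u, 10u};
// static_assert(all_unique(bad_hashes, 3), "never reached");
```

Called with runtime data, the same function aborts on a duplicate instead, which mirrors the two rows of the table above.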

Inspection

The chosen strategy is exposed for tests and tooling:

const auto &meta = jsonrefl::metadata<my_struct>();
using meta_t = std::remove_reference_t<decltype(meta)>;

if constexpr (meta_t::uses_minimal_index()) {
    // small struct => linear_index<N>: meta.index().hashes[i], meta.index().setters[i]
} else {
    // large struct => phf_index<N, M>: meta.index().table[slot], meta.index().used[slot]
}

meta.dump(std::ostream&) walks both flavours and prints the populated entries.

License

Apache License 2.0 — see LICENSE for details.
