norns converged: crone + matron --> single binary - bring-up & stabilization #1883
|
I will look into these merge conflicts. Likely an artifact of the rebase. |
|
awesome! to fix the conflict, the branch just needs to be rebased against main again (there was a small change in main to device_midi.c this week.) |
|
Ah! Way less fiddly than I suspected. Perfect. Will do. Thank you! |
fix compilation (sidecar)
fix many compiler errors (safe-c, etc), some linker problem remains
add the RELEASE cxx flags to norns/wscript, retire crone/wscript
not sure how often this script is used but figured it should probably default to release builds
assuming the contents of this file were moved to jack_client.h at some point
- fix/rename flag to enable gprof profiling
- add flag to enable debugging/syms
- fix duplicate/conflicting c++ standard flags
- add help strings to some configure flags
performs a single fork for the sidecar process along with:
- set the child process name for visibility in ps
- set the thread name in child process for visibility in gdb
- send child a SIGHUP if parent dies
- disconnect child from parent stdin

the primary motivation for this change is simplicity and to allow easier debugging. the parent process remains the one running crone/matron.
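The four steps in that commit description can be sketched roughly as below. This is a minimal Linux-only illustration, not the code from the branch: `spawn_sidecar`, `reap_sidecar`, `sidecar_noop`, and the name `"norns-sidecar"` are hypothetical, and it assumes `prctl(PR_SET_NAME)` is an acceptable way to set the name (on Linux it names the calling thread, and for the main thread that is the name `ps` shows as the command).

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical child body used only for illustration. */
static int sidecar_noop(void) {
    return 0;
}

/* Fork once; the parent gets the child's pid (or -1), the child never returns. */
static pid_t spawn_sidecar(int (*child_main)(void)) {
    assert(child_main != NULL);
    pid_t parent = getpid();
    pid_t pid = fork();
    if (pid != 0) {
        return pid; /* parent, or fork failure */
    }

    /* Name the child's main thread: visible in ps (comm) and in gdb. */
    prctl(PR_SET_NAME, "norns-sidecar", 0, 0, 0);

    /* Ask the kernel to send SIGHUP to the child when the parent dies. */
    prctl(PR_SET_PDEATHSIG, SIGHUP, 0, 0, 0);
    if (getppid() != parent) {
        _exit(1); /* parent already gone before the request took effect */
    }

    /* Disconnect from the parent's stdin by pointing fd 0 at /dev/null. */
    int fd = open("/dev/null", O_RDONLY);
    if (fd >= 0) {
        dup2(fd, STDIN_FILENO);
        if (fd != STDIN_FILENO) {
            close(fd);
        }
    }

    _exit(child_main());
}

/* Wait for the child; returns its exit code, or -1 if it did not exit normally. */
static int reap_sidecar(pid_t pid) {
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In a parent that keeps running crone/matron, the call would be something like `pid_t pid = spawn_sidecar(sidecar_noop);`, with `reap_sidecar(pid)` called at shutdown.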
- check for all dependencies in wscript
- fix SEGV when ^D is entered
- collapse duplicate rx loop implementations
- increase const'ness of various functions
change to libnng-dev
provides the ability to alter audio routing for jack clients either by specific port(s) or by blocks of ports given a routing table. routing tables for system, softcut, and supercollider simplify connection management for those components. modifications to audio routing are tracked and the default routing for all components is restored on script cleanup. the `audio_post_restore_default_routing` hook allows mods to tweak the routing in a consistent fashion
- removes explicit handling of reverse io
- adds method to temporarily disable change tracking
- capture buffer termination (issue monome#1569)
- moves async command capture logic from lua to c
- ensures sidecar requests from _norns.execute and _norns.system_cmd are not interleaved when called from different threads
- adds progressive capture buffer allocation in server up to maximum of 10MB
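As a rough picture of "progressive allocation up to a maximum" combined with guaranteed termination, here is a small sketch. It is not the sidecar's implementation: `struct capture_buf`, `capture_append`, the 4096-byte starting size, the doubling policy, and reading "10MB" as 10 * 1024 * 1024 bytes are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CAPTURE_MAX_BYTES (10u * 1024u * 1024u) /* assumed reading of "10MB" */
#define CAPTURE_INITIAL_BYTES 4096u             /* assumed starting size */

struct capture_buf {
    char *data; /* NUL-terminated after every successful append */
    size_t len; /* bytes stored, excluding the terminator */
    size_t cap; /* bytes allocated, including room for the terminator */
};

/* Append up to n bytes, growing by doubling until the cap is reached.
 * Returns the number of bytes stored; output beyond the cap is dropped,
 * and 0 is returned if allocation fails. */
static size_t capture_append(struct capture_buf *b, const char *src, size_t n) {
    assert(b != NULL && src != NULL);

    size_t room = CAPTURE_MAX_BYTES - 1 - b->len; /* reserve one byte for '\0' */
    if (n > room) {
        n = room;
    }

    size_t need = b->len + n + 1;
    if (need > b->cap) {
        size_t cap = b->cap ? b->cap : CAPTURE_INITIAL_BYTES;
        while (cap < need) {
            cap *= 2;
        }
        if (cap > CAPTURE_MAX_BYTES) {
            cap = CAPTURE_MAX_BYTES;
        }
        char *p = realloc(b->data, cap);
        if (p == NULL) {
            return 0;
        }
        b->data = p;
        b->cap = cap;
    }

    memcpy(b->data + b->len, src, n);
    b->len += n;
    b->data[b->len] = '\0'; /* the buffer is always a valid C string */
    return n;
}
```

A caller would start from a zeroed `struct capture_buf`, append each chunk read from the command's output, and `free(b.data)` once the result has been handed back.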
giving explicitly created threads names makes it substantially easier to narrow in on a particular thread in gdb
- init vuPoll and tapePoll callbacks in MixerClient to prevent std::bad_function_call on boot.
- expand VU meters from 4 to 12 channels across the crone -> matron -> lua bindings.
- added visual rendering for the 8 new channels in the mix menu.
… prevent UI rendering crashes.
…nt graceful cleanup
* When rotated 90/270 degrees LEDs aren't set appropriately

  When `Grid.rotate()` is called with 1 or 3, the correct quadrant isn't chosen. The issue lies in the function `dev_monome_quad_idx`. It doesn't take rotation state into account, so when called with something like:

  ```c
  dev_monome_quad_idx(md->m, 2, 12)
  ```

  it returns quadrant 2. I think we technically want quadrant 1. We can read the rotation state from the monome device, and use that to correctly calculate the quadrant with a slightly different formula that works for both 128 and 256 grids.
* Fix missing variable
* We need to pass in the md reference
* Possible fix for rows/cols not being reset after a rotation
* Don't query grid for rotation status
* Use Grid.update_devices
* Respond to tehn's comments
* Linter
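To make the quadrant example concrete, here is one formula that reproduces the two numbers quoted above. It is only a guess at the shape of the fix, not the merged code: `quad_idx_rotated` is a hypothetical name, the unrotated numbering (`(y > 7) << 1 | (x > 7)`) is inferred from the "(2, 12) returns quadrant 2" example, and swapping the roles of x and y for 90/270 degrees is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* rotation: 0..3, in units of 90 degrees (the value passed to Grid.rotate()).
 * x, y: coordinates as seen by the script, each in 0..15. */
static uint8_t quad_idx_rotated(uint8_t rotation, uint8_t x, uint8_t y) {
    assert(rotation < 4 && x < 16 && y < 16);

    uint8_t right = x > 7; /* 1 if the point is in the right half */
    uint8_t lower = y > 7; /* 1 if the point is in the lower half */

    if (rotation & 1) {
        /* 90 or 270 degrees: assume the two axes trade places. */
        return (uint8_t)((right << 1) | lower);
    }
    /* 0 or 180 degrees: numbering inferred from the example in the text. */
    return (uint8_t)((lower << 1) | right);
}
```

With these assumptions `quad_idx_rotated(0, 2, 12)` gives 2 and `quad_idx_rotated(1, 2, 12)` gives 1, matching the behaviour described in the commit message.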
|
@Dewb I put together a step-by-step guide that should get all the things going. Let me know if anything is wonky. Happy to help. https://gist.github.com/colinmcardell/3b8982327c009c2c74f7e2a0dff22cab |
|
this is fantastic, thank you! working through testing. |
|
this is working wonderfully in all of my tests so far. maiden, maiden-repl, all great. i don't expect any issues with "hardware variants" as none of this really touched hardware-specific bits, but i'll test on a shield. similarly, i think mods should 'just work' but i'll try a few also (i tend not to use them in my own process) thank you again, this is very encouraging! |
|
one thing which probably needs to be adjusted is the names of the systemd units in maiden. …it uses those to implement `;restart` in the web repls
i can help sort that out if it needs to be updated.
-greg
|
|
I found a "bug" in the dev workflow where calling kill on the main service causes connection to the sidecar to fail, requiring reboot/noodling. I have a patch. Will push this weekend. |
this makes sense, i'll try your new maiden dev setup and see how i do |
|
So far so good for me as well. Are there any particular areas of risk/concern that should get extra attention? Tasks that involve the sidecar, messages that were formerly matron-crone but are now in-process, …? |
That's great! I've been focusing on reliable restarts and reconnection and general stability, trying to ensure audio/UI functions during extended sessions. The command queue is an area that might be a potential issue if a script is firing off bursts of commands to Softcut. It would have to be a burst of >2048 commands within a single process block, which would be incredible: 128 samples at 48kHz = ~2.67ms. 2048 commands is a rather good amount, but if stress testing shows that more would be needed for practical usage, we could increase it at little memory cost. The trade-off there is in the latency of draining the queue, and maybe potential dropouts? |
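For anyone wanting to check the arithmetic, here is a tiny sketch. The function names are made up for the example, and it assumes the queue is fully drained once per process block, which is how I read the comment above.

```c
#include <assert.h>

/* Duration of one audio process block in milliseconds. */
static double block_duration_ms(unsigned frames, unsigned sample_rate_hz) {
    assert(sample_rate_hz > 0);
    return 1000.0 * (double)frames / (double)sample_rate_hz;
}

/* Sustained command rate (per second) needed to fill `capacity` slots
 * every block, assuming one full drain per block. */
static double overflow_rate_per_sec(unsigned capacity, unsigned frames,
                                    unsigned sample_rate_hz) {
    assert(frames > 0);
    return (double)capacity * (double)sample_rate_hz / (double)frames;
}
```

`block_duration_ms(128, 48000)` comes out at about 2.67 ms, and `overflow_rate_per_sec(2048, 128, 48000)` is 768000, so a script would need to sustain roughly 768k commands per second to overrun the queue under that assumption.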
|
maiden |
|
Looking into what specifically is doing a shell out via sidecar.
A few things worth looking into for sidecar that I can think of is large outputs for things like I've been putting together little test scripts here and there, but it makes me think how nice it would be to have something like a diagnostic runner that checks all the things. :) Maybe I can start pulling together all of the little tests into something more comprehensive to be added as a SYSTEM > DIAGNOSTICS at some point. |
Could these become unit tests? |
Definitely. Something between unit and integration testing. I will wait until the dust settles and see what I can pull together. |
|
Do we feel like this is stable enough to merit posting to the BETA upgrade channel? (Basically people could use it by doing SYSTEM > UPDATE while holding K1 and selecting BETA). Reversing out wouldn't be trivial (I'd need to make a shell script if someone wanted to revert, though I'd likely just ask them to re-image instead). But I haven't run into any issues myself. This would allow us some time with a larger userbase and feedback; it's no problem to incrementally edit/fix the BETA release. Edit: if so, I can put one together! |
|
I've noticed some weird behavior around recovery from supercollider fails and restarts -- I'm not sure if the same things wouldn't have happened in the main branch, but given that maiden ;restart needed updating, maybe there is some other kill-code that needs to be aligned with the new service names? |
|
i'm not sure what weirdness you've noticed, but it's not unexpected for things to be out of whack if sclang/scsynth is restarted w/o also restarting matron |
|
one thing which probably needs to be adjusted is the names of the systemd units in maiden. …it uses those to implement `;restart` in the web repls
i can help sort that out if it needs to be updated.
-greg
|
|
@ngwese looks like an email resend? but in case you missed it, i believe i fixed this concern: #1883 (comment) |
|
What do we think about merging this PR in as is ( Additionally, I can share a couple of options I have been scratching out regarding the |
|
I’ve been using this branch for a month or two now without encountering any novel issues. I think we should be talking about what the confidence line ought to be for merging it into main for a beta. |
|
Ah, one issue I’m having is that monome/maiden#228 did not fix |
Oh good call, I can try to repro. |
|
i think it'd be great to merge/flatten and get this ready for a beta! i believe i already made the appropriate changes somewhere for update.sh but they may have gotten lost in the shuffle. it's easy for me to recreate, however. can also test out the maiden service fix and confirm in the next couple days. thanks again for keeping this update alive. |
Interesting. Ok. Got maiden built and reproduced the issue. Two findings.
I can open a PR with fixes for each of these issues tomorrow. |
|
Thanks tons, I'm doing a final round of testing and looking forward to getting this finished up |
|
pushed a change to can prepare a beta release now. last pending issue is the maiden repl rename: monome/maiden#232 which is testing well for me but might prefer some glance at it before i post a release |
* norns converged: crone + matron --> single binary - bring-up & stabilization (#1883)
* Add nng to dev/ci container definition to support work on converged branch
* single-process rewrite (squashed)
  fix compilation (sidecar)
  fix many compiler errors (safe-c, etc), some linker problem remains
  add the RELEASE cxx flags to norns/wscript, retire crone/wscript
* fix jack_client for c++
* change CRLF endings to LF in various source files
* matron: embed lua-cjson module as `json` global
* ensure edge.sh does release builds (#1552)
  not sure how often this script is used but figured it should probably default to release builds
* remove empty jack_cpu.h file (#1553)
  assuming the contents of this file were moved to jack_client.h at some point
* various build improvements
  - fix/rename flag to enable gprof profiling
  - add flag to enable debugging/syms
  - fix duplicate/conflicting c++ standard flags
  - add help strings to some configure flags
* norns: only fork sidecar
  performs a single fork for the sidecar process along with:
  - set the child process name for visibility in ps
  - set the thread name in child process for visibility in gdb
  - send child a SIGHUP if parent dies
  - disconnect child from parent stdin
  the primary motivation for this change is simplicity and to allow easier debugging. the parent process remains the one running crone/matron.
* norns: move sidecar to nng
* ws-wrapper: replace nanomsg with nng
* maiden-repl: replace nanomsg with nng
  - check for all dependencies in wscript
  - fix SEGV when ^D is entered
  - collapse duplicate rx loop implementations
  - increase const'ness of various functions
* Update readme-setup.md
  change to libnng-dev
* audio: connect, disconnect, and inspect audio routing
  provides the ability to alter audio routing for jack clients either by specific port(s) or by blocks of ports given a routing table. routing tables for system, softcut, and supercollider simplify connection management for those components.
  modifications to audio routing are tracked and the default routing for all components is restored on script cleanup. the `audio_post_restore_default_routing` hook allows mods to tweak the routing in a consistent fashion
* audio: simplify handling of system routing
  - removes explicit handling of reverse io
  - adds method to temporarily disable change tracking
* sidecar: serialize command invocation requests
  - capture buffer termination (issue #1569)
  - moves async command capture logic from lua to c
  - ensures sidecar requests from _norns.execute and _norns.system_cmd are not interleaved when called from different threads
  - adds progressive capture buffer allocation in server up to maximum of 10MB
* matron: set thread names for easier debugging
  giving explicitly created threads names makes it substantially easier to narrow in on a particular thread in gdb
* Run clang-format on rebased converged branch
* Add libatomic dependency for nng
* fix: add converged tape and VU bindings and expanded meters
  - init vuPoll and tapePoll callbacks in MixerClient to prevent std::bad_function_call on boot.
  - expand VU meters from 4 to 12 channels across the crone -> matron -> lua bindings.
  - added visual rendering for the 8 new channels in the mix menu.
* refactor: improve crone `Poll` thread lifecycle safety
* fix: Initialize mix menu engine, monitor, cut, and tape parameters to prevent UI rendering crashes.
* fix: Add tape pause, resume, and loop functions
* fix: sidecar - resolves IPC startup race with a sync pipe and implement graceful cleanup
* Adds nng git submodule
* Update test wscript includes and .cc source mapping.
* test: remove unnecessary c-linkage from c++ matron tests
* test: migrate clock test helpers from c to c++
* matron: update atomic types in internal clock
* build: update cmake includes and nng library fallback
* Adds command queue concurrency support and increased capacity + unit tests
* feat: oracle asynchronous i/o for tape and softcut operations to prevent UI blocking for heavy disk i/o
* tidy: moves the git submodules for concurrentqueue and readerwriterqueue from the ./crone/lib path to ./third-party
* refactor: replace custom queue in sidecar with BlockingReaderWriterQueue
* update: adding additional paths to clang-format.sh
* fix: apply clang-format
* fix: wscript include of nng
* refactor: update Tape to defer `sf_open` to the tapes internal disk thread, removes previously introduced `io_thread` in oracle.
* fix: resolves issue with self-kill in system restart
* fix: update no longer calls stop on old norns-crone.service
* maiden-repl: wip - removes nanomsg in favor of nng
  maiden-repl application logic still needs nng registration and dialer setup
* refactor: replace legacy internal OSC comms with direct C calls and cleanup unused handlers
* build: use system libnng-dev and remove third-party/nng submodule
* fix: resolve boot instability from JACK auto-start conflict and cleans up debug logs
* refactor(maiden-repl): force IPv4, thread-safe rendering, input echo, resize support, rename crone→sc
* fix: use systemd Restart=always and self-terminate for reliable restart
* fix(ws-wrapper): add retry loop for WebSocket listener binding
  adds a retry mechanism to gracefully handle momentarily held ports during restarts, preventing crash loops.
* Fix `Grid.rotation` bug seen in #1888 (#1890)
* When rotated 90/270 degrees LEDs aren't set appropriately
  When `Grid.rotate()` is called with 1 or 3, the correct quadrant isn't chosen. The issue lies in the function `dev_monome_quad_idx`. It doesn't take rotation state into account, so when called with something like:
  ```c
  dev_monome_quad_idx(md->m, 2, 12)
  ```
  it returns quadrant 2. I think we technically want quadrant 1. We can read the rotation state from the monome device, and use that to correctly calculate the quadrant with a slightly different formula that works for both 128 and 256 grids.
* Fix missing variable
* We need to pass in the md reference
* Possible fix for rows/cols not being reset after a rotation
* Don't query grid for rotation status
* Use Grid.update_devices
* Respond to tehn's comments
* Linter
* fix(sidecar): unlink stale IPC socket before bind
---------
Co-authored-by: Michael Dewberry <712405+Dewb@users.noreply.github.com>
Co-authored-by: emb <emb@catfact.net>
Co-authored-by: catfact <catfact@users.noreply.github.com>
Co-authored-by: Greg Wuller <greg@afofo.com>
Co-authored-by: brian crabtree <tehn@monome.org>
Co-authored-by: Levi Cole <lckennedy@gmail.com>
* update.sh: converged services
* changelog
* update.sh: source, nng
* version.txt
* update.sh: fix install
* fix linter check
---------
Co-authored-by: Colin McArdell <colin@colinmcardell.com>
Co-authored-by: Michael Dewberry <712405+Dewb@users.noreply.github.com>
Co-authored-by: emb <emb@catfact.net>
Co-authored-by: catfact <catfact@users.noreply.github.com>
Co-authored-by: Greg Wuller <greg@afofo.com>
Co-authored-by: Levi Cole <lckennedy@gmail.com>
what

Based on the work of @catfact and @ngwese on the `norns-converged` branch, which was then rebased forward to the latest on `main` by @Dewb, this PR works through the hardware bring-up and stabilization of the `norns-converged` branch through a variety of fixes and tidying commits.

This PR brings the work over to a new branch on `monome/norns:norns-converged-2026` in an effort to land all this great work to streamline and combine the matron and crone processes into a single process, and use shared memory rather than OSC for the communication between the control and audio threads.

the bring-up has been iterative, but I find it to be quite stable and worth sharing at this time.

generally there is more work, which I'm triaging and starting to draft into a task list. at this moment I'm finishing up work on `maiden-repl` that I will PR to this branch soon, as I tidy it up and test it on a pre-converged system to ensure backwards compatibility. `maiden-repl` is broken in this PR.

notes

- `third-party/nng`, and updated the cmake to try system-installed nng first and fall back to the submodule. It does make sense as a follow-up to remove this submodule in favor of the system-installed library, to follow the trend within the build `wscript`. Also, the submodule is version 2.0 of nng and the system seems to install some version of 1.x, as does macOS homebrew. this is more reason to not use the submodule as a final solution, but also a note to potentially attempt to align on a specific version of nng (which is relevant for `maiden-repl` compilations on macOS/Linux).
- `std::bad_function_call` crash on boot, initializing VU and tape poll callbacks in `MixerClient`. They are no longer null on first invocation
- `crone::Poll` thread lifecycle safety.
- `ConcurrentQueueWorker` (`io_queue`) for tape and softcut operations to prevent UI blocking during heavy disk I/O.
- `io_queue` from oracle, since it was redundant for softcut. Tape operations were still synchronous and blocking so I updated `Tape.h` to defer `sf_open` to the tape's internal disk thread, making it safe to call from `oracle`.
- `concurrentqueue` and `readerwriterqueue` submodules from `crone/lib/` → `third-party/`.
- `BlockingReaderWriterQueue`.
- `Commands.h` to use `ConcurrentQueue` (previously `ReaderWriterQueue`) because it seems like commands need support for multiple producers (Lua, and external OSC messages).
- `Commands.h`

In order to consistently run the system on device, I've removed `norns-matron.service` and `norns-crone.service`, and added `norns-main.service`. I've also updated the `norns.target` with the appropriate changes so that the converged binary is properly started on boot.

/etc/systemd/system/norns-main.service
/etc/systemd/system/norns-sclang.service
/etc/systemd/system/norns.target
Test guide: https://gist.github.com/colinmcardell/3b8982327c009c2c74f7e2a0dff22cab