This is a Rust implementation of Project Babble's Baballonia face tracking software. It is designed as a library that is easy to integrate into a variety of frontend projects.
Libsnout requires the following build dependencies (given as Fedora package names):
- llvm
- llvm-devel
- onnxruntime
- onnxruntime-devel
- rust
Clone the repository:
git clone https://github.com/Darksecond/libsnout.git
and then build the program:
cd libsnout
cargo build --release -p snout-cli
The snout-cli executable will be located under target/release/.
Help on how to use the CLI tool can be obtained with:
snout-cli help
Beware: snout-cli always requires a config file to be passed to it via the --config argument. It does not use internal defaults.
A template config.toml can be found in this repo.
Make sure to edit it to suit your needs.
Tracking can be disabled for specific points by setting their camera value to an empty string, like so:
[eye.right]
camera = ""
# <...>
[eye.left]
camera = ""
# <...>
[face]
camera = "http://192.168.178.162"
# <...>
The above example will disable both eye cameras, leaving only the face camera active.
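As a rough illustration of the camera values this README describes (an empty string, a USB camera name, or an MJPEG URL), the value could be classified as in the sketch below. The `CameraSource` enum and `classify_camera` function are hypothetical names used only for this example, not libsnout's actual API, and accepting `https://` in addition to `http://` is an assumption.

```rust
/// Hypothetical classification of a `camera` value from config.toml.
#[derive(Debug, Clone, PartialEq)]
pub enum CameraSource {
    /// An empty string: tracking is disabled for this point.
    Disabled,
    /// A wireless MJPEG camera given as a URL.
    Url(String),
    /// The full name of a connected USB camera.
    UsbName(String),
}

/// Classify a raw `camera` string.
/// Assumption: anything that is not empty and not a URL is a USB camera name.
pub fn classify_camera(value: &str) -> CameraSource {
    if value.is_empty() {
        CameraSource::Disabled
    } else if value.starts_with("http://") || value.starts_with("https://") {
        CameraSource::Url(value.to_string())
    } else {
        CameraSource::UsbName(value.to_string())
    }
}
```

With this sketch, `classify_camera("")` yields `CameraSource::Disabled`, which matches the disabled eye cameras in the example above.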
The names of connected USB cameras can be found like so:
snout-cli --config <config.toml> list-cameras
Once you have located your desired camera in the output, use the full name of the camera in the configuration file.
[eye.right]
camera = "Bigeye: Bigeye (800x400 @ 90fps)"
Wireless MJPEG cameras can be entered as a URL, like so:
[eye.right]
camera = "http://192.168.178.162"
The OSC endpoint that tracking data is sent to needs to be adjusted for use with VRCFT. The following configuration will work with VRCFT.avalonia:
[output.osc]
destination = "127.0.0.1:8888"
The default endpoint, used when none is supplied in the config, already works with oscavmgr, but it can also be set manually, like so:
[output.osc]
destination = "127.0.0.1:9400"
The paths to the face and eye tracking ONNX models are relative to your current directory. An absolute path may be preferred and can be set by starting the path with a /, like so:
[face]
# <...>
model = "/home/user/libsnout/faceModel.onnx"
Libsnout comes with a working face tracking model. It is the same as the one in the Baballonia repository, but run through onnxsim.
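The difference between relative and absolute model paths can be sketched with the Rust standard library. `resolve_model_path` is a hypothetical helper written for this example, not a function taken from libsnout; it only shows how `Path::join` treats the two kinds of path on a Unix-like system.

```rust
use std::path::{Path, PathBuf};

/// Resolve a `model` value against the directory snout-cli is started from.
/// `Path::join` appends a relative path to the base, but returns an
/// absolute path unchanged, so a leading `/` ignores the current directory.
pub fn resolve_model_path(current_dir: &Path, model: &str) -> PathBuf {
    current_dir.join(model)
}
```

For instance, with a current directory of `/home/user/libsnout`, the relative value `faceModel.onnx` resolves to `/home/user/libsnout/faceModel.onnx`, while the absolute value from the example above resolves to itself regardless of where the command is run.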
Once you have set up your config.toml to point to your cameras and set the output OSC destination to the correct value for your program of choice, you can start tracking with the following command:
snout-cli --config <config.toml> track
This will start recording, along with sending data to the OSC endpoint specified in the configuration file.
Eye models can be trained with the following command:
snout-cli --config <config.toml> train <user_cal.bin> <output.onnx>
The <user_cal.bin> file generated by Baballonia can be found in the installation folder of the Baballonia software, next to the executable.
The resulting <output.onnx> can then be used in the config.toml for the corresponding eye.
A camera frame can be captured and written to a file with the following command, which helps with debugging tracking issues and with aligning your face:
snout-cli --config <config.toml> capture <SOURCE> <OUTPUT.jpeg>
<SOURCE> can be any of the following camera sources: left-eye, right-eye, face.
<OUTPUT.jpeg> is the name of the file that the camera frame gets written to.
Cropping the image works slightly differently: instead of providing top/left/right/bottom coordinates, it uses major/minor shift and scale. A scale of 1 is 100%; increase it to zoom in (1.5 would be 150%). Major shift and minor shift range from -1 to 1.
Major shift moves the crop along the longest axis, and minor shift moves it along the shortest axis. Minor shift only has an effect when zoomed in. If your input is square, both shifts only have an effect when zoomed in.
The camera stream is always cropped into a square. On a 16:9 image the sides are trimmed off along the longest axis, and major shift then lets you move the crop left or right. If you then zoom in on the cropped image, minor shift lets you move the crop up or down.
I designed it this way to prevent users from squishing their face: the model always wants a 240x240 pixel input, and the image pipeline simply squishes the cropped image to fit that, which distorts your face if the crop is not perfectly square.
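The relationship between scale, major shift, and minor shift can be sketched as follows. This is one plausible mapping that matches the behaviour described above, not libsnout's actual implementation: the `Crop` struct and `square_crop` function are hypothetical, and treating a shift of 0 as centered, -1 as the left/top edge, 1 as the right/bottom edge, and clamping scale to at least 1 are all assumptions.

```rust
/// Hypothetical square crop rectangle in source-image pixel coordinates.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Crop {
    pub x: f32,
    pub y: f32,
    pub side: f32,
}

/// Compute a square crop from scale and major/minor shift.
/// Assumptions: shift 0 is centered, -1 and 1 reach the edges,
/// and a scale below 1 is treated as 1.
pub fn square_crop(width: f32, height: f32, scale: f32, major_shift: f32, minor_shift: f32) -> Crop {
    let scale = scale.max(1.0);
    let major = major_shift.clamp(-1.0, 1.0);
    let minor = minor_shift.clamp(-1.0, 1.0);

    let short = width.min(height);
    let long = width.max(height);

    // The crop is always square; zooming in shrinks its side.
    let side = short / scale;

    // Free room along each axis. Along the short axis this is zero
    // at scale 1, so minor shift has no effect until zoomed in.
    let major_offset = (long - side) / 2.0 * (1.0 + major);
    let minor_offset = (short - side) / 2.0 * (1.0 + minor);

    if width >= height {
        // Landscape or square: the major axis is horizontal.
        Crop { x: major_offset, y: minor_offset, side }
    } else {
        // Portrait: the major axis is vertical.
        Crop { x: minor_offset, y: major_offset, side }
    }
}
```

Under these assumptions, a 1280x720 frame at scale 1 with both shifts at 0 gives a centered 720x720 crop starting at x = 280, and changing the minor shift does nothing until the scale is raised above 1. Because the crop is always square, resizing it to the 240x240 model input does not change its aspect ratio.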
Right now it is licensed under the same license as Project Babble's Baballonia, since this is a derivative work.