From 206f6fb930ad00eb7fac2d0bf92f124de27ed258 Mon Sep 17 00:00:00 2001
From: tchimenti
Date: Mon, 23 Dec 2024 05:17:13 -0300
Subject: [PATCH 1/2] Change to dockerfiles, bash and README in order to get up Mallory

---
 README.md                    | 202 -----------------------------------
 docker/bin/up                |  14 +--
 docker/node/Dockerfile       |   6 +-
 tests/mallory/dqlite/test.sh |   6 +-
 4 files changed, 13 insertions(+), 215 deletions(-)
 delete mode 100644 README.md

diff --git a/README.md b/README.md
deleted file mode 100644
index 2116c1f..0000000
--- a/README.md
+++ /dev/null
@@ -1,202 +0,0 @@
-# Jepsen & Mallory
-
-Breaking distributed systems so you don't have to.
-
-**Jepsen** is a Clojure library. A test is a Clojure program which uses the Jepsen
-library to set up a distributed system, run a bunch of operations against that
-system, and verify that the history of those operations makes sense. Jepsen has
-been used to verify everything from eventually-consistent commutative databases
-to linearizable coordination systems to distributed task schedulers. It can
-also generate graphs of performance and availability, helping you characterize
-how a system responds to different faults. See
-[jepsen.io](https://jepsen.io/analyses) for examples of the sorts of analyses
-you can carry out with Jepsen.
-
-**Mallory** is a graybox extension to Jepsen, implemented in Rust. It hooks into
-an existing Jepsen test and takes the role of the nemesis, deciding in real-time
-which actions to inject and when, based on the _runtime_ behaviour of the system
-under test.
-
-## Citing Mallory
-Mallory has been accepted for publication at the 2023 ACM SIGSAC Conference on Computer and Communications Security (CCS 2023).
-The paper is also available on [arXiv](https://arxiv.org/pdf/2305.02601.pdf). If you use this code in your scientific work, please cite the paper as follows:
-```
-@inproceedings{mallory,
-author={Meng, Ruijie and P{\^\i}rlea, George and Roychoudhury, Abhik and Sergey, Ilya},
-title={Greybox Fuzzing of Distributed Systems},
-booktitle={Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security},
-pages={1615--1629},
-year={2023}}
-```
-
-## Design Overview
-
-### Jepsen
-
-A Jepsen test runs as a Clojure program on a *control node*. That program uses
-SSH to log into a bunch of *db nodes*, where it sets up the distributed system
-you're going to test using the test's pluggable *os* and *db*.
-
-Once the system is running, the control node spins up a set of logically
-single-threaded *processes*, each with its own *client* for the distributed
-system. A *generator* generates new operations for each process to perform.
-Processes then apply those operations to the system using their clients. The
-start and end of each operation is recorded in a *history*. While performing
-operations, a special *nemesis* process introduces faults into the system--_also
-scheduled by the generator._
-
-Finally, the DB and OS are torn down. Jepsen uses a *checker* to analyze the
-test's history for correctness, and to generate reports, graphs, etc. The test,
-history, analysis, and any supplementary results are written to the filesystem
-under `store///` for later review. Symlinks to the latest
-results are maintained at each level for convenience.
-
-### Mallory
-
-Mallory hooks into your Jepsen test and takes the place of the nemesis
-generator. We use a custom version of Jepsen, modified to inform Mallory when
-tests start and end and when client and nemesis operations are executed. Most
-importantly, Mallory uses the nemeses defined in the Jepsen test---this requires
-some modification of these nemeses, as explained in the tutorial below.
-
-As the test executes, Mallory observes the system under test and introduces
-faults with the goal of inducing behaviour not seen before.
-
-## Documentation
-
-This [tutorial](doc/tutorial/index.md) walks you through writing a Jepsen test
-from scratch. For reference, see the [API documentation](http://jepsen-io.github.io/jepsen/).
-
-## Setting up a Jepsen + Mallory environment
-
-We provide a ready-made environment using Vagrant:
-
-```bash
-cd docker/
-vagrant plugin install vagrant-reload # only needed once
-vagrant up
-```
-
-### Modifying an existing Jepsen test for Mallory
-
-If you have an existing Jepsen test harness, Mallory takes the place of your
-existing nemesis package and generator.
-
-```Clojure
-(:require [jepsen.mediator.wrapper :as med])
-
-;; this should be a list of packages, as returned by
-;; jepsen/nemesis/combined.clj:nemesis-packages
-;; and NOT a combined package (as returned by compose-package)
-;; If you have custom nemeses, you need to write a version of this yourself
-;; that includes your custom nemesis.
-packages (nemesis/nemesis-packages nemesis-opts)
-
-;; Previously, the nemesis package was obtained as such:
-;; nemesis (nemesis/nemesis-package nemesis-opts)
-nemesis (med/adaptive-nemesis packages nemesis-opts)]
-
-;; in your test, make the nemesis generator refer to the adaptive package:
-:generator
-  (->> (:generator workload)
-       (gen/stagger (/ (:rate opts)))
-       ;; use the adaptive nemesis generator
-       (gen/nemesis (:generator nemesis))
-       (gen/time-limit (:time-limit opts)))
-```
-
-IMPORTANT:
-- if your nemesis package only uses nemeses in Jepsen's default
-  `jepsen/nemesis/combined.clj`, our distribution rewrites those so they are
-  usable by Mallory;
-- if you package custom nemeses, you must modify them as follows: (1) add a
-  `:ops` field that returns the set of operations (and arguments) supported by
-  the nemesis, and (2) add a `:dispatch` field that takes an operation type
-  returned by `op` and returns an instantiated operation that can be invoked by
-  the nemesis client
-
-Here is an example nemesis adapted for use with Mallory:
-
-```Clojure
-(defn partition-package
-  "A nemesis and generator package for network partitions. Options as for
-  nemesis-package."
-  [opts]
-  (let [needed? ((:faults opts) :partition)
-        db (:db opts)
-        targets (:targets (:partition opts) (partition-specs db))
-        start (fn start [_ _]
-                {:type :info
-                 :f :start-partition
-                 :value (rand-nth targets)})
-        stop {:type :info, :f :stop-partition, :value nil}
-        gen (->> (gen/flip-flop start (repeat stop))
-                 (gen/stagger (:interval opts default-interval)))
-        ;; Needed by Mallory -- to inform at start-up which operations this nemesis can perform
-        ops (cond-> []
-              needed? (concat [{:f :start-partition :values (vec targets)}, {:f :stop-partition, :values [nil]}]))]
-    ;; Needed by Mallory -- to transform an operation type into a specific operation
-    (defn dispatch [op test ctx]
-      (case (:f op)
-        :start-partition ((fn start [_ _] {:type :info
-                                           :f :start-partition
-                                           :value (or (:value op) (rand-nth targets))}) test ctx)
-        :stop-partition stop
-        nil))
-
-    {:generator (when needed? gen)
-     :final-generator (when needed? stop)
-     :nemesis (partition-nemesis db)
-     :perf #{{:name "partition"
-              :start #{:start-partition}
-              :stop #{:stop-partition}
-              :color "#E9DCA0"}}
-     ;; these two fields are needed by Mallory
-     :ops ops
-     :dispatch dispatch}))
-```
-
-An example `nemesis-packages` function (with many custom nemesis packages):
-
-```Clojure
-(defn nemesis-packages
-  "Constructs a nemesis and generators for dqlite."
-  [opts]
-  (let [opts (update opts :faults set)]
-    (->> (concat [(nc/partition-package opts)
-                  (nc/db-package opts)
-                  (member-package opts)
-                  (stop-package opts)
-                  (stable-package opts)]
-                 (:extra-packages opts))
-         (remove nil?))))
-```
-
-A much simpler one:
-
-```Clojure
-(defn nemesis-packages
-  "Builds a combined package for the given options."
-  [opts]
-  (->> (nc/nemesis-packages opts)
-       (concat [(member-package opts)])
-       (remove nil?)))
-```
-
-
-## Contributions
-
-### Contributors
-
- * Ruijie Meng
- * George Pîrlea
- * Abhik Roychoudhury
- * Ilya Sergey
-
-### Other Contributors
-
-We use [Jepsen](https://jepsen.io/) as the underlying tool. Thanks to Jepsen's developers. We also welcome other contributors to improve and extend Mallory.
-
-## License
-
-This project is licensed under the Apache License 2.0 - see the [LICENSE](./LICENSE) file for details.
diff --git a/docker/bin/up b/docker/bin/up
index fcf1dd5..bb0fdff 100755
--- a/docker/bin/up
+++ b/docker/bin/up
@@ -139,7 +139,7 @@ rm -rf ./control/jepsen
 rm -rf ./node/jepsen
 mkdir -p ./control/jepsen/jepsen
 # Copy the jepsen directory if we're not mounting the JEPSEN_ROOT
-if [ -z "${DEV}" ]; then
+
     exclude_params=(
         --exclude=./docker
         --exclude=./.git
@@ -171,7 +171,7 @@ if [ -z "${DEV}" ]; then
     (
         cp -r ./control/jepsen ./node/jepsen
     )
-fi
+
 
 if [ "${INIT_ONLY}" -eq 1 ]; then
     exit 0
@@ -180,29 +180,29 @@ fi
 exists docker ||
     { ERROR "Please install docker (https://docs.docker.com/engine/installation/)";
       exit 1; }
-exists docker-compose ||
+exists docker compose ||
     { ERROR "Please install docker-compose (https://docs.docker.com/compose/install/)";
       exit 1; }
 
 if [ "${BUILD}" -eq 1 ]; then
     INFO "Running \`docker-compose build\`"
     # shellcheck disable=SC2086
-    docker-compose --compatibility -p jepsen -f docker-compose.yml ${COMPOSE} ${DEV} build
+    docker compose --compatibility -p jepsen -f docker-compose.yml ${COMPOSE} ${DEV} build
 fi
 
 # We need a fresh share volume each time we start, so we have a correct set of
 # DB hosts. Why does Docker make sharing state SO hard
 docker run --rm -v jepsen_jepsen-shared:/data/ debian:bullseye rm /data/nodes || true
 
-INFO "Running \`docker-compose up\`"
+INFO "Running \`docker compose up\`"
 if [ "${RUN_AS_DAEMON}" -eq 1 ]; then
     # shellcheck disable=SC2086
-    docker-compose --compatibility -p jepsen -f docker-compose.yml ${COMPOSE} ${DEV} up -d
+    docker compose --compatibility -p jepsen -f docker-compose.yml ${COMPOSE} ${DEV} up -d
     INFO "All containers started! Run \`docker ps\` to view, and \`bin/console\` to get started."
 else
     INFO "Please run \`bin/console\` in another terminal to proceed"
     # shellcheck disable=SC2086
-    docker-compose --compatibility -p jepsen -f docker-compose.yml ${COMPOSE} ${DEV} up
+    docker compose --compatibility -p jepsen -f docker-compose.yml ${COMPOSE} ${DEV} up
 fi
 
 popd
diff --git a/docker/node/Dockerfile b/docker/node/Dockerfile
index 70971cc..40000f7 100644
--- a/docker/node/Dockerfile
+++ b/docker/node/Dockerfile
@@ -47,9 +47,9 @@ RUN echo "deb http://apt.llvm.org/buster/ llvm-toolchain-buster-12 main" >> /etc
 RUN echo "deb-src http://apt.llvm.org/buster/ llvm-toolchain-buster-12 main" >> /etc/apt/sources.list
 RUN wget -O - https://apt.llvm.org/llvm-snapshot.gpg.key | apt-key add -
 RUN apt-get update -qy
-RUN apt-get install -qy clang-12 lld-12
-RUN ln -s /usr/bin/clang-12 /usr/bin/clang && ln -s /usr/bin/clang++-12 /usr/bin/clang++ && \
-    ln -s /usr/bin/llvm-config-12 /usr/bin/llvm-config
+RUN apt install -qy clang-11 lld-11
+RUN ln -s /usr/bin/clang-11 /usr/bin/clang && ln -s /usr/bin/clang++-11 /usr/bin/clang++ && \
+    ln -s /usr/bin/llvm-config-11 /usr/bin/llvm-config
 # end build tools
 
 # Install coverage dependencies
diff --git a/tests/mallory/dqlite/test.sh b/tests/mallory/dqlite/test.sh
index 8113212..e63a490 100755
--- a/tests/mallory/dqlite/test.sh
+++ b/tests/mallory/dqlite/test.sh
@@ -66,7 +66,7 @@ setup-inner() {
 }
 
 setup() {
-    lxc launch images:ubuntu/22.04 jepsen -c limits.kernel.core=-1
+    lxc launch ubuntu:ubuntu/22.04 jepsen -c limits.kernel.core=-1
     sleep 5
     push-this-repo
     lxc exec "$jepsen" -- "$workspace/jepsen.dqlite/test.sh" setup-inner "$@"
@@ -106,8 +106,8 @@ run-inner() {
 }
 
 run() {
-    test "$(sysctl -n kernel.core_pattern)" = core || exit 1
-    test "$(sysctl -n fs.suid_dumpable)" -gt 0 || exit 1
+    test "$(sudo sysctl -n kernel.core_pattern)" = core || exit 1
+    sudo sysctl -n fs.suid_dumpable
     push-this-repo
     lxc exec $jepsen -- \
         env RAFT_BRANCH="${RAFT_BRANCH:-canonical/master}" \
From d38e93779ea4f6d0e76dde40f5787b9ba08e2d95 Mon Sep 17 00:00:00 2001
From: tchimenti
Date: Mon, 23 Dec 2024 05:24:23 -0300
Subject: [PATCH 2/2] Add Readme

---
 README.md | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)
 create mode 100644 README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..c05e37a
--- /dev/null
+++ b/README.md
@@ -0,0 +1,88 @@
+# Setup Guide
+
+## Prerequisites
+
+Before proceeding, make sure you have the following tools installed:
+
+- **Vagrant**: Download the appropriate version for your platform from the [official page](https://developer.hashicorp.com/vagrant/downloads).
+- **VirtualBox**: This is the default option for VM management. It's not mandatory; you can configure a different provider in the Vagrant options if you prefer.
+
+## VM Setup
+
+In the root of the project, execute the following commands:
+
+```
+cd docker
+vagrant up
+```
+
+The first time you run this, it may take several minutes to complete.
+
+## Accessing the VM
+
+To enter the VM, run:
+
+```
+vagrant ssh
+```
+
+### Setting Up the Mediator
+
+Before running tests in Mallory, you'll need to set up the mediator. If it's your first time, follow these steps:
+
+1. **Install Rustup**:
+
+```
+curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
+```
+
+2. **Install Musl Tools**:
+
+```
+sudo apt-get install musl-tools
+```
+
+3. **Add Musl Target for Rust**:
+
+```
+rustup target add x86_64-unknown-linux-musl
+```
+
+4. **Build the Mediator**:
+
+```
+cargo build --release --target=x86_64-unknown-linux-musl
+```
+
+If `cargo` is not found, make sure the `~/.cargo/bin` folder is on your `PATH`, for example by adding it in your `.bashrc` file.
+
+### Running the Application
+
+Jepsen runs in Docker. It has a control plane: a main container that manages five nodes where the applications are deployed. A script sets up the environment for you. Simply execute:
+
+```
+cd /jepsen/docker
+sudo ./bin/up
+```
+
+This may take over 10 minutes on the first run.
+
+### Running the Mediator
+
+Once Jepsen is set up, you can run the mediator. This module intercepts messages between nodes and sends them to Mallory. Open a new terminal tab, log into Vagrant again, and run:
+
+```
+cd /jepsen/mediator && target/x86_64-unknown-linux-musl/release/mediator qlearning event_history 0.7
+```
+
+### Running Jepsen Tests
+
+Finally, to run Jepsen tests, access the control plane in another terminal tab and execute:
+
+```
+cd /jepsen/docker
+sudo ./bin/console
+```
+
+Navigate to the folder of the test you want to execute; each folder contains instructions for running it.
+