34 changes: 34 additions & 0 deletions .github/workflows/release.yml
@@ -0,0 +1,34 @@
name: Release OCI artifact
on:
  push:
    tags:
      - "v*.*.*"
permissions:
  contents: read
  packages: write
jobs:
  release:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Install Flux CLI
        uses: fluxcd/flux2/action@main
      - name: Login to GitHub Container Registry
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Push ring0 Flux manifests as OCI artifact
        run: |
          flux push artifact \
            "oci://ghcr.io/${{ github.repository }}:${{ github.ref_name }}" \
            --path=ring0/flux \
            --source="${{ github.repositoryUrl }}" \
            --revision="${{ github.ref_name }}@sha1:${{ github.sha }}"
      - name: Tag artifact as latest
        run: |
          flux tag artifact \
            "oci://ghcr.io/${{ github.repository }}:${{ github.ref_name }}" \
            --tag latest
2 changes: 2 additions & 0 deletions .gitignore
@@ -8,5 +8,7 @@ ring0/core-services/pki/files
*.csr

# Tooling
.claude
CLAUDE.md
.vscode
.task
17 changes: 11 additions & 6 deletions README.md
@@ -62,8 +62,12 @@ UpdateLayoutConfig($c4ShapeInRow="3", $c4BoundaryInRow="1")
```plaintext
micro-cloud/
├── ring0/ # Low dependency services and core infrastructure (PKI, Netboot, Management)
│ ├── core-services/ # Manifests and templates for ring0 Kubernetes workloads
│ ├── flux/ # FluxCD Operator manifests (HelmRepositories, HelmReleases, Kustomizations)
│ └── scripts/ # Bootstrap and day-2 shell scripts
├── ring1/ # Experimental environments (Kubernetes clusters, etc.)
├── docs/ # Schematics, documentation, articles
├── docs/ # Schematics, documentation, examples
├── .github/ # GitHub Actions workflows (OCI artifact release)
├── LICENSE # Project license (Apache 2.0)
└── README.md # This file
```
@@ -83,10 +87,11 @@ The project relies on a home server rack composed of:

- **Network / VPN:** Tailscale
- **Containerization:** Incus (LXC / KVM)
- **PKI:** cfssl, cert-manager, openbao
- **PKI:** cfssl, cert-manager, OpenBao
- **Bootstrapping:** kea, matchbox, Talos
- **Orchestration:** Kubernetes, Kamaji, Tinkerbell
- **Middleware:** Netbox, Authentik
- **GitOps:** FluxCD Operator (OCI artifact releases via GitHub Actions)
- **Orchestration:** Kubernetes, Kamaji, Tinkerbell, Cluster API
- **Middleware:** Netbox, Authentik, External Secrets Operator, Zot

## Getting Started

@@ -109,9 +114,9 @@ incus remote add headnode-0 headnode-0
incus remote switch headnode-0
```

### 3. Start the bootstrap sequence
### 3. Bootstrap sequence

Refer to [ring0/README.md](ring0/README.md) for step-by-step operations (PKI, netboot, management node).
Refer to [ring0/README.md](ring0/README.md) for the full step-by-step sequence.

## Contributions

40 changes: 40 additions & 0 deletions docs/cluster-config-example.yaml
@@ -0,0 +1,40 @@
# cluster-config — example ConfigMap consumed by Flux HelmRelease.valuesFrom references.
#
# Apply this before bootstrapping Flux (or let task flux create it):
# kubectl apply -f docs/cluster-config-example.yaml
#
# Values marked <RUNTIME:...> depend on live infrastructure state.
# Quick reference:
# ts_suffix : tailscale dns status | awk '/suffix =/ {gsub(")","");print $NF}'
# pki_endpoint : https://pki.<ts_suffix>
# openbao_ca_bundle : base64 < dist/bundle.crt | tr -d '\n'
# announcements_iface : talosctl ... get addresses \
# | grep "$INSTANCE_MANAGEMENT_SERVICES_IPADDR_CIDR" \
# | awk '{print $4}' | tail -n1 | awk -F/ '{print $1}'
# dhcp_bind_interface : interface on the management node facing the bootstrap VLAN
# bootstrap_endpoint : http://<bootstrap-incus-ip>/ (matchbox assets HTTP server)
# artifacts_file_server : http://<bootstrap-incus-ip>/ (Flatcar/Talos artifacts)
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-config
  namespace: flux-system
data:
  # ---- PKI ----------------------------------------------------------------
  pki_org: "My Org"
  pki_endpoint: "https://pki.my-cloud.ts.net"
  openbao_ca_bundle: "<RUNTIME: base64 < dist/bundle.crt | tr -d '\\n'>"
  # ---- Tailscale ----------------------------------------------------------
  ts_suffix: "my-cloud.ts.net"
  idp_hostname: "idp.my-cloud.ts.net"
  # ---- Cilium L2 announcements --------------------------------------------
  announcements_iface: "eth1"
  # ---- BMaaS — service IPs assigned via CiliumLoadBalancerIPPool ----------
  registry_ip: "192.168.3.4"
  tinkerbell_ip: "192.168.3.5"
  hookos_ip: "192.168.3.6"
  dns_ip: "192.168.3.7"
  # ---- Tinkerbell ---------------------------------------------------------
  dhcp_bind_interface: "eth0"
  bootstrap_endpoint: "http://192.168.2.2/"
  artifacts_file_server: "http://192.168.2.2/"
79 changes: 63 additions & 16 deletions ring0/README.md
@@ -230,22 +230,67 @@ export BMAAS_NAMESPACE=bmaas-system
task management
```

## Installing the middlewares
## Bootstrapping Flux (GitOps)

### Installing the IDP service
After `task management` completes, bootstrap the FluxCD Operator so that it takes over all platform workloads.
The script creates the prerequisite Secrets and the `cluster-config` ConfigMap, then applies the `FluxInstance`.

```shell
task idp
# Required variables (add to your variables.sh / environment)
export TS_OPERATOR_CLIENT_ID=xxxxxx
export TS_OPERATOR_CLIENT_SECRET=xxxxxx
export PKI_ENDPOINT=https://pki.<ts_suffix>
export PKI_ORG="My Cloud"
export DNS_IP=192.168.3.7
export HOOKOS_IP=192.168.3.6
export REGISTRY_IP=192.168.3.4
export TINKERBELL_IP=192.168.3.5
export ARTIFACTS_FILE_SERVER=http://<bootstrap-ip>/
export DHCP_BIND_INTERFACE=eth0
export BOOTSTRAP_ENDPOINT=http://<bootstrap-ip>/assets/tinkerbell
export ANNOUNCEMENTS_IFACE=eth1 # interface on management node facing the services VLAN
export GHCR_TOKEN=<GitHub PAT with read:packages>

task flux
```
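
A pre-flight check can catch a missing variable before `task flux` starts creating Secrets. The following is a minimal sketch, assuming bash and assuming the variable list above is complete; `required_vars` and `check_required_vars` are hypothetical names that do not exist in the repository.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not part of the repository): reports which
# of the variables listed above are unset or empty.
required_vars=(
  TS_OPERATOR_CLIENT_ID TS_OPERATOR_CLIENT_SECRET
  PKI_ENDPOINT PKI_ORG
  DNS_IP HOOKOS_IP REGISTRY_IP TINKERBELL_IP
  ARTIFACTS_FILE_SERVER DHCP_BIND_INTERFACE BOOTSTRAP_ENDPOINT
  ANNOUNCEMENTS_IFACE GHCR_TOKEN
)

check_required_vars() {
  local missing=0 name
  for name in "${required_vars[@]}"; do
    # ${!name:-} is bash indirect expansion: the value of the variable named by $name.
    if [[ -z "${!name:-}" ]]; then
      echo "missing: ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It could be chained in front of the task, for example `check_required_vars && task flux`, so that nothing is applied while a value is still missing.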

> [!WARNING]
> Installing Authentik can be quite long because of the database initialization.
From this point Flux manages the following components from the OCI artifact (`ghcr.io/mgrzybek/micro-cloud`):

| Flux path | Component | Namespace |
| --- | --- | --- |
| `infrastructure/01-cilium` | Cilium CNI | kube-system |
| `infrastructure/01-cert-manager` | cert-manager | cert-manager |
| `infrastructure/02-cnpg-operator` | CloudNative PG operator | cnpg-system |
| `infrastructure/02-pg-cluster` | PostgreSQL cluster | platform-management |
| `infrastructure/02-tailscale-operator` | Tailscale Operator | tailscale |
| `apps/03-idp` | Authentik (IDP) | platform-management |
| `apps/04-cmdb` | Netbox (CMDB) | platform-management |
| `apps/04-eso` | External Secrets Operator | external-secrets |
| `apps/05-bmaas/zot` | Zot OCI registry | tinkerbell-system |
| `apps/05-bmaas/kamaji` | Kamaji | kamaji-system |
| `apps/05-bmaas/tinkerbell` | Tinkerbell | tinkerbell-system |

Monitor reconciliation with:

```shell
flux get all -A
```

## Post-Flux setup

### Exposing the IDP service

Once Flux has deployed Authentik, expose it on the tailnet:

```shell
task idp
```

Now you are ready to populate your directory as needed. Please note that Netbox uses two groups by default: `staff` and `superusers`. You have to add some users to these groups to be able to manage Netbox.

If you want to use Authentik's API to provision resources, create a token with the admin account at [https://idp.your-tailscale-suffix/if/admin/#/core/tokens](https://idp/if/admin/#/core/tokens).

### Configuring the Netbox provider
### Exposing the CMDB service

The SSO integration between Authentik and Netbox is [described in the official documentation](https://integrations.goauthentik.io/documentation/netbox/).

@@ -262,15 +307,6 @@ task cmdb
> [!WARNING]
> Installing Netbox can take a long time because of the database initialization.

### Installing External Secrets Operator

Deploys ESO and creates a `ClusterSecretStore` backed by OpenBao (KV v2, AppRole auth).
The AppRole credentials are generated automatically during `task intermediate-fullchain`.

```shell
task eso
```

### Configuring Tinkerbell

Some pre-configuration is needed to make CoreDNS use Netbox as an IPAM. You must create a `coredns` service account able to read IP addresses from the IPAM section.
@@ -285,6 +321,17 @@ export DNS_IP=192.168.3.7
task bmaas
```

## Releasing a new version

Push a semver tag to trigger the GitHub Actions workflow that publishes the Flux manifests as an OCI artifact:

```bash
git tag v1.2.3
git push origin v1.2.3
```

Flux picks up the new tag automatically. To pin a specific version instead, set it in `flux/clusters/management/flux-system/flux-instance.yaml`.
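
The workflow trigger `v*.*.*` is a glob rather than a strict semver check, so a tag such as `v1.2.3-rc1` would most likely also start a release. A minimal sketch of a local guard, assuming bash; `is_release_tag` and `release` are hypothetical helpers, not part of the repository:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): accepts only tags shaped
# like v<major>.<minor>.<patch>, e.g. v1.2.3.
is_release_tag() {
  [[ "$1" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]
}

# Creates and pushes the tag only when its name passes the check above.
release() {
  local tag="$1"
  if ! is_release_tag "$tag"; then
    echo "refusing to tag: '${tag}' is not of the form vX.Y.Z" >&2
    return 1
  fi
  git tag "$tag" && git push origin "$tag"
}
```

With these functions sourced, `release v1.2.3` would run the two `git` commands shown earlier, while `release v1.2` stops before touching the repository.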

## Day-2: Adding a physical worker node

A physical machine can be added to the management cluster as a worker node via PXE boot through Matchbox.
25 changes: 25 additions & 0 deletions ring0/core-services/pki/debian-pki-cloud-init.sh
@@ -8,6 +8,16 @@ if [[ -z "$SUFFIX" ]]; then
    exit 1
fi

if [[ -z "$SERVER_ADDR" ]]; then
    echo "SERVER_ADDR must be present in /etc/cloud.sh"
    exit 1
fi

if [[ -z "$SERVER_CIDR" ]]; then
    echo "SERVER_CIDR must be present in /etc/cloud.sh"
    exit 1
fi

function main() {
    set -e

@@ -38,6 +48,21 @@ function prepare() {
    apt -y install jq unzip wget
    export HOME=/root
    export pki=/var/lib/pki/files

    echo "#####################"
    echo "👷 Configuring the netboot iface"

    cat <<EOF | tee /etc/systemd/system/bootstrap-network.service
[Unit]
Description=Bootstrap network configuration
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
ExecStart=/bin/bash -c "if ! ip addr show | grep -q $SERVER_CIDR ; then ip addr add dev eth1 $SERVER_CIDR ; fi"
EOF

}

function install_go() {
53 changes: 53 additions & 0 deletions ring0/flux/apps/03-idp/helmrelease.yaml
@@ -0,0 +1,53 @@
# authentik-secret must exist before reconciliation.
# task flux creates it with a random secret_key (generated once, stored in dist/).
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: idp
  namespace: platform-management
spec:
  interval: 1h
  chart:
    spec:
      chart: authentik
      sourceRef:
        kind: HelmRepository
        name: authentik
        namespace: flux-system
  valuesFrom:
    - kind: Secret
      name: authentik-secret
      valuesKey: secret_key
      targetPath: authentik.secret_key
    # Flux does not support string templating, so the full ingress hostname is
    # stored under its own key. task flux writes it when creating the ConfigMap:
    #   kubectl create configmap cluster-config ... --from-literal=idp_hostname=idp.<ts_suffix>
    # NOTE: the HelmRelease v2 API documents valuesFrom references as living in
    # the HelmRelease's own namespace; check that the namespace field below is
    # accepted, or copy cluster-config into platform-management.
    - kind: ConfigMap
      name: cluster-config
      namespace: flux-system
      valuesKey: idp_hostname
      targetPath: server.ingress.hosts[0]
  values:
    authentik:
      error_reporting:
        enabled: true
    server:
      ingress:
        enabled: true
        hosts: []
    postgresql:
      enabled: false
    redis:
      enabled: true
    global:
      env:
        - name: AUTHENTIK_POSTGRESQL__HOST
          value: tooling-rw
        - name: AUTHENTIK_POSTGRESQL__NAME
          value: authentik
        - name: AUTHENTIK_POSTGRESQL__USER
          value: app
        - name: AUTHENTIK_POSTGRESQL__PASSWORD
          valueFrom:
            secretKeyRef:
              name: tooling-app
              key: password
4 changes: 4 additions & 0 deletions ring0/flux/apps/03-idp/kustomization.yaml
@@ -0,0 +1,4 @@
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- helmrelease.yaml
63 changes: 63 additions & 0 deletions ring0/flux/apps/04-cmdb/helmrelease.yaml
@@ -0,0 +1,63 @@
# cmdb-netbox-remote-auth secret must exist before reconciliation.
# task flux creates it by rendering netbox-remote-auth.py.j2 with the Authentik OAuth credentials.
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: cmdb
  namespace: platform-management
spec:
  interval: 1h
  chart:
    spec:
      chart: netbox
      sourceRef:
        kind: HelmRepository
        name: netbox-community
        namespace: flux-system
  values:
    defaultLanguage: fr-fr
    postgresql:
      enabled: false
    externalDatabase:
      existingSecretName: tooling-app
      existingSecretKey: password
      host: tooling-rw
      port: 5432
      username: app
      database: netbox
    valkey:
      architecture: standalone
    extraConfig:
      - secret:
          secretName: cmdb-netbox-remote-auth
          items:
            - netbox-remote-auth.py
          optional: false
    extraDeploy:
      - |
        apiVersion: v1
        kind: ConfigMap
        metadata:
          name: cmdb-netbox-sso-pipeline-roles
          namespace: platform-management
        data:
          sso_pipeline_roles.py: |
            from netbox.authentication import Group
            class AuthFailed(Exception):
                pass
            def add_groups(response, user, backend, *args, **kwargs):
                try:
                    groups = response['groups']
                except KeyError:
                    # No groups claim in the token: nothing to add.
                    return
                for group in groups:
                    group, created = Group.objects.get_or_create(name=group)
                    user.groups.add(group)
            def remove_groups(response, user, backend, *args, **kwargs):
                try:
                    groups = response['groups']
                except KeyError:
                    # No groups claim in the token: drop every group and stop.
                    user.groups.clear()
                    return
                user_groups = [item.name for item in user.groups.all()]
                delete_groups = list(set(user_groups) - set(groups))
                for delete_group in delete_groups:
                    group = Group.objects.get(name=delete_group)
                    user.groups.remove(group)
            def set_roles(response, user, backend, *args, **kwargs):
                user.is_superuser = False
                user.is_staff = False
                try:
                    groups = response['groups']
                except KeyError:
                    # No groups claim in the token: save the user without roles.
                    user.save()
                    return
                user.is_superuser = True if 'superusers' in groups else False
                user.is_staff = True if 'staff' in groups else False
                user.save()
    extraVolumes:
      - name: remote-auth
        secret:
          secretName: cmdb-netbox-remote-auth
      - name: sso-pipeline-roles
        configMap:
          name: cmdb-netbox-sso-pipeline-roles
          defaultMode: 0755
      - name: internal-ca-chain
        configMap:
          name: internal-ca-chain
    extraVolumeMounts:
      - name: remote-auth
        mountPath: /etc/netbox/config/extra.py
        subPath: netbox-remote-auth.py
        readOnly: true
      - name: sso-pipeline-roles
        mountPath: /opt/netbox/netbox/netbox/custom_pipeline.py
        subPath: sso_pipeline_roles.py
        readOnly: true
      - name: internal-ca-chain
        mountPath: /usr/share/ca-certificates/
        readOnly: true
    extraEnvs:
      - name: REQUESTS_CA_BUNDLE
        value: /usr/share/ca-certificates/internal-ca.crt