4 changes: 2 additions & 2 deletions .github/workflows/MainDistributionPipeline.yml
Original file line number Diff line number Diff line change
@@ -18,13 +18,13 @@ jobs:
with:
duckdb_version: v1.5.3
ci_tools_version: v1.5-variegata
extension_name: duck_diff
extension_name: table_diff

code-quality-check:
name: Code Quality Check
uses: duckdb/extension-ci-tools/.github/workflows/_extension_code_quality.yml@v1.5-variegata
with:
duckdb_version: v1.5.3
ci_tools_version: v1.5-variegata
extension_name: duck_diff
extension_name: table_diff
format_checks: 'format;tidy'
4 changes: 2 additions & 2 deletions CMakeLists.txt
@@ -1,7 +1,7 @@
cmake_minimum_required(VERSION 3.5)

# Set extension name here
set(TARGET_NAME duck_diff)
set(TARGET_NAME table_diff)

set(EXTENSION_NAME ${TARGET_NAME}_extension)
set(LOADABLE_EXTENSION_NAME ${TARGET_NAME}_loadable_extension)
@@ -13,7 +13,7 @@ set(CMAKE_CXX_STANDARD_REQUIRED ON)

include_directories(src/include)

set(EXTENSION_SOURCES src/duck_diff_extension.cpp)
set(EXTENSION_SOURCES src/table_diff_extension.cpp)

build_static_extension(${TARGET_NAME} ${EXTENSION_SOURCES})
build_loadable_extension(${TARGET_NAME} " " ${EXTENSION_SOURCES})
2 changes: 1 addition & 1 deletion Makefile
@@ -1,7 +1,7 @@
PROJ_DIR := $(dir $(abspath $(lastword $(MAKEFILE_LIST))))

# Configuration of extension
EXT_NAME=duck_diff
EXT_NAME=table_diff
EXT_CONFIG=${PROJ_DIR}extension_config.cmake

# Include the Makefile from extension-ci-tools
36 changes: 18 additions & 18 deletions README.md
@@ -1,4 +1,4 @@
# duck_diff
# table_diff

A DuckDB extension for diffing two relations (tables, SQL queries, etc.) off a primary key. Given a
"left" and a "right" relation, it reports — per key — whether the row is
@@ -10,7 +10,7 @@ subset of columns to diff or ignore.

## Quick start

Get a DuckDB shell with `duck_diff` loaded (see [Install](#install) or
Get a DuckDB shell with `table_diff` loaded (see [Install](#install) or
[Building](#building)), then create two sample snapshots and diff them:

```sql
@@ -69,23 +69,23 @@ FROM table_diff_summary('FROM users_v1', 'FROM users_v2', pk := 'id'); -- fals

## Install

Each [GitHub Release](https://github.com/avaitla/duck_diff/releases) attaches a
Each [GitHub Release](https://github.com/avaitla/duckdb-table-diff/releases) attaches a
signed binary per platform. Download the one matching your DuckDB version and
platform, **saved as `duck_diff.duckdb_extension`** (DuckDB derives the
platform, **saved as `table_diff.duckdb_extension`** (DuckDB derives the
extension name from the filename), then load it under `-unsigned` (the binaries
are signed with a third-party key, so unsigned extensions must be enabled — a
launch flag, not a `SET`):

```sh
curl -L -o duck_diff.duckdb_extension \
https://github.com/avaitla/duck_diff/releases/download/v0.1.0/duck_diff-v1.5.2-osx_arm64.duckdb_extension
curl -L -o table_diff.duckdb_extension \
https://github.com/avaitla/duckdb-table-diff/releases/download/v0.1.0/table_diff-v1.5.2-osx_arm64.duckdb_extension
duckdb -unsigned
```

Load it with the full filepath:

```sql
LOAD '/path/to/duck_diff.duckdb_extension';
LOAD '/path/to/table_diff.duckdb_extension';
SELECT * FROM table_diff('FROM a', 'FROM b', pk := 'id');
```

@@ -200,13 +200,13 @@ single query you can also force one shared scan with a `WITH x AS MATERIALIZED
## Building

The repo vendors DuckDB and the build tooling as submodules, so a clone +
`make` produces a DuckDB shell with `duck_diff` preloaded:
`make` produces a DuckDB shell with `table_diff` preloaded:

```sh
git clone --recurse-submodules https://github.com/avaitla/duck_diff
cd duck_diff
git clone --recurse-submodules https://github.com/avaitla/duckdb-table-diff
cd duckdb-table-diff
GEN=ninja make # first build compiles DuckDB; needs cmake + ninja
./build/release/duckdb # this shell already has duck_diff loaded
./build/release/duckdb # this shell already has table_diff loaded

build/release/test/unittest "test/sql/*" # run the SQL test suite
```
@@ -218,26 +218,26 @@ bundled `json` extension is required (built in automatically for tests).
### Using it in another DuckDB

The build also emits a loadable binary at
`build/release/extension/duck_diff/duck_diff.duckdb_extension`. It's locally
`build/release/extension/table_diff/table_diff.duckdb_extension`. It's locally
built (unsigned), so load it with unsigned extensions enabled:

```sh
duckdb -unsigned
```
```sql
LOAD 'build/release/extension/duck_diff/duck_diff.duckdb_extension';
LOAD 'build/release/extension/table_diff/table_diff.duckdb_extension';
SELECT * FROM table_diff('FROM a', 'FROM b', pk := 'id');
```

> **Installing without building:** each [GitHub Release](https://github.com/avaitla/duck_diff/releases)
> **Installing without building:** each [GitHub Release](https://github.com/avaitla/duckdb-table-diff/releases)
> attaches signed, per-platform `.duckdb_extension` binaries (see
> [docs/DISTRIBUTION.md](docs/DISTRIBUTION.md)). Download the one for your
> platform, **saved as `duck_diff.duckdb_extension`** (DuckDB derives the
> platform, **saved as `table_diff.duckdb_extension`** (DuckDB derives the
> extension name from the filename), then `LOAD` it under `-unsigned`:
> ```sh
> curl -L -o duck_diff.duckdb_extension \
> https://github.com/avaitla/duck_diff/releases/download/v0.1.0/duck_diff-v1.5.2-osx_arm64.duckdb_extension
> duckdb -unsigned -c "LOAD 'duck_diff.duckdb_extension'; SELECT * FROM table_diff('FROM a','FROM b', pk:='id');"
> curl -L -o table_diff.duckdb_extension \
> https://github.com/avaitla/duckdb-table-diff/releases/download/v0.1.0/table_diff-v1.5.2-osx_arm64.duckdb_extension
> duckdb -unsigned -c "LOAD 'table_diff.duckdb_extension'; SELECT * FROM table_diff('FROM a','FROM b', pk:='id');"
> ```

## TODO
2 changes: 1 addition & 1 deletion docs/DESIGN.md
@@ -1,4 +1,4 @@
# duck_diff — Design
# table_diff — Design

A focused DuckDB extension that diffs two relations on a primary key and
reports, per key, whether it is identical, different, or exists only on one side.
22 changes: 11 additions & 11 deletions docs/DISTRIBUTION.md
@@ -1,6 +1,6 @@
# Distribution — signed binaries on GitHub Releases

`.github/workflows/Release.yml` builds `duck_diff` for every native platform on
`.github/workflows/Release.yml` builds `table_diff` for every native platform on
each GitHub Release, signs each binary, and attaches the signed
`.duckdb_extension` files (plus `SHA256SUMS`) to the release as assets.

@@ -19,7 +19,7 @@ simpler, fully self-contained route.

```sh
openssl genrsa -out private.pem 2048
openssl rsa -in private.pem -pubout -out duck_diff-signing-key.pub # public half (committed)
openssl rsa -in private.pem -pubout -out table_diff-signing-key.pub # public half (committed)
```
Keep `private.pem` out of the repo.

@@ -47,13 +47,13 @@ stamps the extension version from the tag (`git describe`). So a release is:
3. **Cut the release** (tag + publish in one step):
```sh
git checkout main && git pull
gh release create v0.2.0 --target main --title "duck_diff v0.2.0" --notes "see workflow"
gh release create v0.2.0 --target main --title "table_diff v0.2.0" --notes "see workflow"
```
(Or GitHub UI → **Releases → Draft a new release** → create tag `v0.2.0` on
`main` → **Publish**.)
4. **Done** — publishing fires `Release.yml`, which builds every platform, signs
each binary, attaches them as
`duck_diff-<duckdb_version>-<platform>.duckdb_extension` + `SHA256SUMS`, and
`table_diff-<duckdb_version>-<platform>.duckdb_extension` + `SHA256SUMS`, and
rewrites the release notes with install instructions, the source commit, and
the checksums. Watch it with `gh run watch` if you like.
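The asset naming in step 4 and the "save it as `<name>.duckdb_extension`" rule can be sketched in a few lines of Python. This is only an illustration of the naming pattern quoted above: the helper names (`asset_name`, `download_url`, `local_name`) are hypothetical and are not part of the repository or of `Release.yml`.

```python
# Illustrative helpers (hypothetical names, not part of the repo) that follow
# the "<extension>-<duckdb_version>-<platform>.duckdb_extension" pattern.


def asset_name(extension: str, duckdb_version: str, platform: str) -> str:
    """File name of a release asset, as described in step 4."""
    return f"{extension}-{duckdb_version}-{platform}.duckdb_extension"


def download_url(owner: str, repo: str, tag: str, extension: str,
                 duckdb_version: str, platform: str) -> str:
    """GitHub release download URL for one asset."""
    asset = asset_name(extension, duckdb_version, platform)
    return f"https://github.com/{owner}/{repo}/releases/download/{tag}/{asset}"


def local_name(extension: str) -> str:
    """Name to save the download under so that LOAD sees the expected name."""
    return f"{extension}.duckdb_extension"


if __name__ == "__main__":
    print(download_url("avaitla", "duckdb-table-diff", "v0.1.0",
                       "table_diff", "v1.5.2", "osx_arm64"))
    print(local_name("table_diff"))
```

Note that the repository name (`duckdb-table-diff`) and the extension name (`table_diff`) differ after this rename, which is why the sketch takes them as separate arguments.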

@@ -69,17 +69,17 @@ stamps the extension version from the tag (`git describe`). So a release is:
## Installing (as a user)

Download the `*.duckdb_extension` matching your DuckDB version and platform from
the release assets and **save it as `duck_diff.duckdb_extension`** — DuckDB
the release assets and **save it as `table_diff.duckdb_extension`** — DuckDB
derives the extension name and entrypoint from the filename, so the name matters.
It's signed with a third-party key, so launch with `-unsigned`:

```sh
curl -L -o duck_diff.duckdb_extension \
https://github.com/<owner>/duck_diff/releases/download/v0.1.0/duck_diff-v1.5.2-osx_arm64.duckdb_extension
curl -L -o table_diff.duckdb_extension \
https://github.com/<owner>/duckdb-table-diff/releases/download/v0.1.0/table_diff-v1.5.2-osx_arm64.duckdb_extension
duckdb -unsigned
```
```sql
LOAD 'duck_diff.duckdb_extension';
LOAD 'table_diff.duckdb_extension';
FROM table_diff('FROM a', 'FROM b', pk := 'id');
```
From a client library, enable unsigned extensions in the connection config (e.g.
@@ -89,7 +89,7 @@ Python: `duckdb.connect(config={'allow_unsigned_extensions': True})`).

Each release ships `SHA256SUMS` (also inlined in the notes) and is signed with
the key whose public half is committed at
[`duck_diff-signing-key.pub`](../duck_diff-signing-key.pub):
[`table_diff-signing-key.pub`](../table_diff-signing-key.pub):

```
-----BEGIN PUBLIC KEY-----
@@ -113,15 +113,15 @@ payload is the SHA256 composite of everything before it (1 MiB chunks each
hashed, then the concatenation hashed — DuckDB's `compute-extension-hash.sh`):

```sh
F=duck_diff-v1.5.2-osx_arm64.duckdb_extension
F=table_diff-v1.5.2-osx_arm64.duckdb_extension
size=$(wc -c < "$F")
head -c $((size - 256)) "$F" > body
tail -c 256 "$F" > sig
: > chunks
split -b 1M body seg_
for f in seg_*; do openssl dgst -binary -sha256 "$f" >> chunks; rm "$f"; done
openssl dgst -binary -sha256 chunks > hash
openssl pkeyutl -verify -pubin -inkey duck_diff-signing-key.pub \
openssl pkeyutl -verify -pubin -inkey table_diff-signing-key.pub \
-sigfile sig -in hash -pkeyopt digest:sha256
# -> Signature Verified Successfully
```
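For readers who prefer to see the composite hash without `split` and temporary files, here is a minimal Python sketch of the same computation, assuming the layout described above (a 256-byte signature trailer, 1 MiB chunks). The function names are hypothetical. It only reproduces the `hash` file from the shell snippet; the RSA verification itself is still left to `openssl pkeyutl`, since the Python standard library has no RSA verifier.

```python
import hashlib

SIGNATURE_BYTES = 256        # trailing signature size, per the text above
CHUNK_BYTES = 1024 * 1024    # 1 MiB, matching `split -b 1M`


def split_signed_extension(data: bytes, signature_bytes: int = SIGNATURE_BYTES):
    """Return (body, signature): everything before the trailer, and the trailer."""
    if len(data) < signature_bytes:
        raise ValueError("file is shorter than the signature trailer")
    cut = len(data) - signature_bytes
    return data[:cut], data[cut:]


def composite_sha256(body: bytes, chunk_bytes: int = CHUNK_BYTES) -> bytes:
    """Hash each chunk, concatenate the raw digests, then hash the concatenation."""
    digests = b"".join(
        hashlib.sha256(body[offset:offset + chunk_bytes]).digest()
        for offset in range(0, len(body), chunk_bytes)
    )
    return hashlib.sha256(digests).digest()


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as fh:
        body, signature = split_signed_extension(fh.read())
    # Equivalent of the `hash` file produced by the shell snippet.
    print(composite_sha256(body).hex())
```

Run it as `python composite_hash.py table_diff-v1.5.2-osx_arm64.duckdb_extension` (the script name is arbitrary) and compare the hex digest with `xxd -p hash` from the shell version.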
2 changes: 1 addition & 1 deletion docs/functions.md
@@ -1,4 +1,4 @@
# duck_diff function reference
# table_diff function reference

`table_diff`, `table_diff_summary`, and `schema_diff` are table functions. The
two relations are passed as **query strings**, written the way you
6 changes: 3 additions & 3 deletions examples/README.md
@@ -1,7 +1,7 @@
# Testing your own SQL with `table_diff` — `duckdb` CLI only

A copy-paste-into-your-project demonstration of how to write your own
regression tests with duck_diff's `table_diff`, in the [sqllogictest][slt]
regression tests with the extension's `table_diff` function, in the [sqllogictest][slt]
format, needing **nothing but the `duckdb` CLI** — no source build, no
`unittest` binary.

@@ -10,7 +10,7 @@ make setup # checks that duckdb is on PATH
make test # runs every tests/*.test
```

The examples assume the `duck_diff` extension is installed (see the
The examples assume the `table_diff` extension is installed (see the
[top-level README](../README.md#install)); each test `LOAD`s it.

## The examples
@@ -25,7 +25,7 @@ A test defines a golden table, runs your transformation, and asserts that

```
statement ok
LOAD duck_diff;
LOAD table_diff;

statement ok
CREATE TABLE actual_revenue AS
2 changes: 1 addition & 1 deletion examples/run_sqllogictest.sh
@@ -26,7 +26,7 @@ DUCKDB="${DUCKDB:-duckdb}"
TAB="$(printf '\t')"

# `-unsigned` is harmless when no extension is used; it lets your tests
# `LOAD duck_diff;` (or any installed extension) if you want richer assertions.
# `LOAD table_diff;` (or any installed extension) if you want richer assertions.

# Run a statement; output (incl. errors) on stdout, exit code preserved.
slt_stmt() { "$DUCKDB" "$1" -unsigned -batch -init /dev/null -c "$2" 2>&1; }
4 changes: 2 additions & 2 deletions extension_config.cmake
@@ -1,10 +1,10 @@
# This file is included by DuckDB's build system. It specifies which extension to load

# Extension from this repo
duckdb_extension_load(duck_diff
duckdb_extension_load(table_diff
SOURCE_DIR ${CMAKE_CURRENT_LIST_DIR}
)

# duck_diff generates SQL that uses json_object / json_merge_patch, so the json
# table_diff generates SQL that uses json_object / json_merge_patch, so the json
# extension must be available. Build it in so tests can `require json`.
duckdb_extension_load(json)
Original file line number Diff line number Diff line change
@@ -4,7 +4,7 @@

namespace duckdb {

class DuckDiffExtension : public Extension {
class TableDiffExtension : public Extension {
public:
void Load(ExtensionLoader &db) override;
std::string Name() override;
16 changes: 8 additions & 8 deletions src/duck_diff_extension.cpp → src/table_diff_extension.cpp
@@ -1,6 +1,6 @@
#define DUCKDB_EXTENSION_MAIN

#include "duck_diff_extension.hpp"
#include "table_diff_extension.hpp"
#include "duckdb.hpp"
#include "duckdb/common/exception.hpp"
#include "duckdb/common/string_util.hpp"
@@ -735,17 +735,17 @@ void LoadInternal(ExtensionLoader &loader) {

} // namespace

void DuckDiffExtension::Load(ExtensionLoader &loader) {
void TableDiffExtension::Load(ExtensionLoader &loader) {
LoadInternal(loader);
}

std::string DuckDiffExtension::Name() {
return "duck_diff";
std::string TableDiffExtension::Name() {
return "table_diff";
}

std::string DuckDiffExtension::Version() const {
#ifdef EXT_VERSION_DUCK_DIFF
return EXT_VERSION_DUCK_DIFF;
std::string TableDiffExtension::Version() const {
#ifdef EXT_VERSION_TABLE_DIFF
return EXT_VERSION_TABLE_DIFF;
#else
return "";
#endif
@@ -755,7 +755,7 @@ std::string DuckDiffExtension::Version() const {

extern "C" {

DUCKDB_CPP_EXTENSION_ENTRY(duck_diff, loader) {
DUCKDB_CPP_EXTENSION_ENTRY(table_diff, loader) {
duckdb::LoadInternal(loader);
}
}
File renamed without changes.
2 changes: 1 addition & 1 deletion test/sql/schema_diff.test
@@ -2,7 +2,7 @@
# description: schema_diff -- compare column names and types of two relations
# group: [sql]

require duck_diff
require table_diff

require json

2 changes: 1 addition & 1 deletion test/sql/table_diff.test
@@ -11,7 +11,7 @@ SELECT * FROM table_diff('FROM l', 'FROM r', pk := 'id');
----
Catalog Error

require duck_diff
require table_diff

require json

2 changes: 1 addition & 1 deletion test/sql/table_diff_composite.test
@@ -2,7 +2,7 @@
# description: composite primary key (pk as a list) and join-back
# group: [sql]

require duck_diff
require table_diff

require json

2 changes: 1 addition & 1 deletion test/sql/table_diff_context.test
@@ -2,7 +2,7 @@
# description: context columns -- 'context' pulls extra (non-compared) columns into the <c>_left/<c>_right expansion
# group: [sql]

require duck_diff
require table_diff

require json

2 changes: 1 addition & 1 deletion test/sql/table_diff_errors.test
@@ -2,7 +2,7 @@
# description: v1 error cases (required pk, missing key, duplicate keys)
# group: [sql]

require duck_diff
require table_diff

require json

2 changes: 1 addition & 1 deletion test/sql/table_diff_expand.test
@@ -2,7 +2,7 @@
# description: per-column expansion -- compared columns always emit <c>_left/<c>_right (native types) + <c>_diff_status, context columns emit <c>_left/<c>_right
# group: [sql]

require duck_diff
require table_diff

require json

2 changes: 1 addition & 1 deletion test/sql/table_diff_normalize.test
@@ -2,7 +2,7 @@
# description: value-normalization flags (numeric_tolerance, timestamp_precision, null_equals_empty)
# group: [sql]

require duck_diff
require table_diff

require json
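
Taken together, the hunks in this change rename one snake_case extension name and everything derived from it: the C++ class, the version macro, the source and header file names, and the loadable binary. As a rough checklist aid, the mapping can be sketched in Python. The function `derived_names` and its dictionary keys are hypothetical and simply mirror the identifiers visible in this diff; they do not describe how DuckDB's build system derives anything.

```python
# Hypothetical checklist helper: lists the identifiers that this diff changes
# for a given snake_case extension name. Mirrors the hunks above only.


def derived_names(extension: str) -> dict:
    pascal = "".join(part.capitalize() for part in extension.split("_"))
    return {
        "target": extension,                              # CMake TARGET_NAME / EXT_NAME
        "class": f"{pascal}Extension",                    # C++ Extension subclass
        "version_macro": f"EXT_VERSION_{extension.upper()}",
        "source": f"src/{extension}_extension.cpp",
        "header": f"{extension}_extension.hpp",
        "binary": f"{extension}.duckdb_extension",
    }


if __name__ == "__main__":
    old, new = derived_names("duck_diff"), derived_names("table_diff")
    for key in old:
        print(f"{key:14} {old[key]}  ->  {new[key]}")
```

Searching the tree for each old value in the left column is one way to confirm that no reference to the previous name is left behind.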
