Introduce permutation entropy (perm_entropy) as an end-to-end aggregate
UDF example demonstrating correct two-tier memory management in TDengine.
Memory ownership rules:
- Framework buffer (interBuf->buf / newInterBuf->buf): pre-allocated by
udfd; UDF must write state via memcpy, never replace the pointer.
- UDF heap (state->values): allocated by UDF via realloc, freed by UDF
in finish() and on every error path.
Changes:
- docs/examples/udf/perm_entropy.c: clean educational implementation
- docs/zh/07-develop/09-udf.md: new section with memory ownership table
- docs/examples/udf/compile_udf.sh: add perm_entropy build step
- source/libs/function/test/perm_entropy.c: CI version with trace log
- source/libs/function/test/CMakeLists.txt: add libperm_entropy.so target
- test/cases/12-UDFs/test_udf_restart_taosd.py: correctness test and
RSS leak detection test for perm_entropy
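The two ownership rules above can be illustrated with a minimal sketch. It is not the code from perm_entropy.c: DemoInterBuf is a hypothetical stand-in for the framework's SUdfInterBuf (whose real layout may differ), and DemoState, demo_ensure_capacity, demo_append, and demo_release are illustrative names. The sketch assumes the framework buffer is large enough for the small state header and that only the header is copied into it, while the sample array stays on the UDF heap.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the framework's intermediate buffer.
 * The real SUdfInterBuf is declared by TDengine and may differ. */
typedef struct {
  char   *buf;    /* owned by the framework (udfd); never reassigned here */
  int32_t bufLen; /* bytes available in buf */
} DemoInterBuf;

/* UDF-private state header; only this header is copied into buf. */
typedef struct {
  double *values;   /* owned by the UDF heap (realloc/free) */
  size_t  count;
  size_t  capacity;
} DemoState;

/* Grow st->values to hold at least `needed` doubles.
 * Returns 0 on success, -1 on overflow or OOM; old data stays valid on failure. */
static int demo_ensure_capacity(DemoState *st, size_t needed) {
  if (needed <= st->capacity) return 0;
  size_t cap = st->capacity ? st->capacity : 16;
  while (cap < needed) {
    if (cap > SIZE_MAX / 2) return -1; /* doubling would overflow */
    cap *= 2;
  }
  if (cap > SIZE_MAX / sizeof(double)) return -1; /* byte count would overflow */
  double *p = realloc(st->values, cap * sizeof(double));
  if (p == NULL) return -1; /* st->values is still owned and valid */
  st->values = p;
  st->capacity = cap;
  return 0;
}

/* Append one sample, then publish the state header into the framework
 * buffer with memcpy. out->buf itself is never replaced. */
static int demo_append(DemoInterBuf *out, DemoState *st, double v) {
  if (out == NULL || out->buf == NULL) return -1;
  if (out->bufLen < (int32_t)sizeof(DemoState)) return -1;
  if (demo_ensure_capacity(st, st->count + 1) != 0) return -1;
  st->values[st->count++] = v;
  memcpy(out->buf, st, sizeof(DemoState));
  return 0;
}

/* Release the UDF heap; the framework buffer is left untouched. */
static void demo_release(DemoState *st) {
  free(st->values);
  st->values = NULL;
  st->count = 0;
  st->capacity = 0;
}
```

In this sketch a caller would invoke demo_release() both when the aggregate finishes and on any path where demo_append() returns -1, mirroring the "freed by UDF in finish() and on every error path" rule.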
Add permutation entropy as an official aggregate UDF example, with fixes from PR review, CI test, and build system corrections.
Changes:
- docs/examples/udf/perm_entropy.c: overflow guards, n_windows guard, upfront TINYINT/SMALLINT type validation, correct ensure_capacity
- source/libs/function/test/perm_entropy.c: same fixes, no logging
- docs/zh/07-develop/09-udf.md: add -lm flag, add supertable DDL
- test/cases/12-UDFs/test_udf_restart_taosd.py: add test_perm_entropy and test_perm_entropy_rss_leak; remove ASAN overhead; Windows branch in prepare_perm_entropy_so()
- source/libs/function/CMakeLists.txt: always build UDF example SO files regardless of BUILD_TEST so stale ASAN-built .so files are not left behind
- source/libs/function/test/CMakeLists.txt: guard only runUdf executable behind BUILD_TEST; UDF example libraries always built
Mirror the Chinese aggregate function example 4 (permutation entropy) into the English documentation, including memory ownership table, callback responsibilities, DDL, compile command, and code include.
Code Review
This pull request introduces a Permutation Entropy aggregate UDF example to demonstrate the 'accumulate-all-data-then-compute' pattern, including C source code, documentation in English and Chinese, and integration tests. The review feedback identifies potential null pointer dereferences and missing buffer size validations in the UDF implementation, an invalid CMake library type definition, and a build configuration inconsistency where the test directory is processed regardless of the test build flag.
Pull request overview
Adds a new aggregate UDF example (perm_entropy, permutation entropy) to the codebase, along with build integration, documentation, and CI/system-test coverage to exercise the “accumulate all data then compute” aggregate-UDF pattern.
Changes:
- Introduce perm_entropy C aggregate UDF source (docs + function test build copy) and build it via CMake.
- Add new Python system tests that register/query perm_entropy, including interval/partition scenarios and a taosd restart check.
- Document the pattern and add a compile script entry for the new example.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/cases/12-UDFs/test_udf_restart_taosd.py | Adds perm_entropy registration/query/restart tests and an RSS sampling helper. |
| source/libs/function/test/perm_entropy.c | Adds a CI-built copy of the perm_entropy aggregate UDF. |
| source/libs/function/test/CMakeLists.txt | Builds the perm_entropy module and links libm where needed; gates runUdf on BUILD_TEST. |
| source/libs/function/CMakeLists.txt | Always includes the function test subdir when UDF is enabled. |
| docs/zh/07-develop/09-udf.md | Adds a Chinese documentation section for the new aggregate example and memory-ownership rules. |
| docs/en/07-develop/09-udf.md | Adds an English documentation section for the new aggregate example and memory-ownership rules. |
| docs/examples/udf/perm_entropy.c | Adds the documented/canonical perm_entropy implementation. |
| docs/examples/udf/compile_udf.sh | Adds compilation of libperm_entropy.so into /tmp/udf. |
- perm_entropy.c (docs + test): return NAN instead of 0.0 when calloc fails in compute_perm_entropy(), so OOM is not silently masked as a valid entropy result
- test_udf_restart_taosd.py: fail fast with tdLog.exit when libperm_entropy.so is not found, instead of silently passing an empty path to CREATE AGGREGATE FUNCTION
- compile_udf.sh: fix cleanup list by replacing libsqrsum.so with libl2norm.so to match the actual artifact compiled by the script
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
    DLL_EXPORT int32_t perm_entropy_start(SUdfInterBuf *interBuf) {

    int *counts = (int *)calloc(n_patterns, sizeof(int));
    if (counts == NULL) return NAN;

    # 30 rows @ 1s interval → three 10s windows; each should return a value
    tdSql.query(
        "select perm_entropy(val) from perm_t0 interval(10s)"
    )
    tdSql.checkRows(3)
    tdLog.info("test3 pass: interval window returns 3 rows")

    if len(rss_samples) >= 2:
        first = next((v for v in rss_samples if v > 0), 0)
        last = rss_samples[-1]
        tdLog.info("taosudf RSS: start=%d KB end=%d KB growth=%d KB (%.1f MB)"
                   % (first, last, last - first, (last - first) / 1024.0))
    else:
        tdLog.info("taosudf not observed via /proc – skipping RSS log")

    -if(${BUILD_TEST})
    -  add_subdirectory(test)
    -endif()
    +add_subdirectory(test)

Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.