Feat/udf perm entropy agg example2 #35271

Open
facetosea wants to merge 6 commits into main from feat/udf-perm-entropy-agg-example2

@facetosea
Contributor

Description

Issue(s)

  • Close/close/Fix/fix/Resolve/resolve: Issue Link

Checklist

Please check the items in the checklist if applicable.

  • Is the user manual updated?
  • Are the test cases passed and automated?
  • Is there no significant decrease in test coverage?

    Introduce permutation entropy (perm_entropy) as an end-to-end aggregate
    UDF example demonstrating correct two-tier memory management in TDengine.

    Memory ownership rules:
    - Framework buffer (interBuf->buf / newInterBuf->buf): pre-allocated by
      udfd; UDF must write state via memcpy, never replace the pointer.
    - UDF heap (state->values): allocated by UDF via realloc, freed by UDF
      in finish() and on every error path.
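
The two ownership rules above can be sketched as follows. This is a minimal illustration, not the code from this PR: `DemoInterBuf` is a hypothetical stand-in for the framework's `SUdfInterBuf` (declared in `taosudf.h`), and `DemoState`, `demo_start`, `demo_append`, and `demo_finish` are made-up names. The final sum is only a placeholder for the real entropy computation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the framework buffer (not the real SUdfInterBuf). */
typedef struct {
  int32_t bufLen;
  char   *buf;      /* framework-owned: write through it, never reassign it */
} DemoInterBuf;

/* Aggregate state, stored by value inside the framework buffer. */
typedef struct {
  double *values;   /* UDF-owned heap block */
  size_t  count;
  size_t  capacity;
} DemoState;

static int buf_ok(const DemoInterBuf *b) {
  return b != NULL && b->buf != NULL && b->bufLen >= (int32_t)sizeof(DemoState);
}

/* start: memcpy an empty state into the pre-allocated framework buffer. */
int demo_start(DemoInterBuf *interBuf) {
  if (!buf_ok(interBuf)) return -1;
  DemoState empty = {NULL, 0, 0};
  memcpy(interBuf->buf, &empty, sizeof empty);
  return 0;
}

/* accumulate: read state from `in`, append one value, write state to `out`.
 * The caller must treat `out` as the next `in`, because realloc may move
 * the heap block. Overflow guards are omitted here for brevity. */
int demo_append(DemoInterBuf *in, double v, DemoInterBuf *out) {
  if (!buf_ok(in) || !buf_ok(out)) return -1;
  DemoState st;
  memcpy(&st, in->buf, sizeof st);
  if (st.count == st.capacity) {
    size_t newCap = st.capacity ? st.capacity * 2 : 16;
    double *grown = realloc(st.values, newCap * sizeof *grown);
    if (grown == NULL) {
      /* Error path: release the UDF heap and leave no dangling pointer. */
      free(st.values);
      DemoState empty = {NULL, 0, 0};
      memcpy(in->buf, &empty, sizeof empty);
      return -1;
    }
    st.values = grown;
    st.capacity = newCap;
  }
  st.values[st.count++] = v;
  memcpy(out->buf, &st, sizeof st);
  return 0;
}

/* finish: compute the result, then free the UDF heap and clear the state. */
int demo_finish(DemoInterBuf *interBuf, double *result) {
  if (!buf_ok(interBuf) || result == NULL) return -1;
  DemoState st;
  memcpy(&st, interBuf->buf, sizeof st);
  double sum = 0.0;   /* placeholder result */
  for (size_t i = 0; i < st.count; i++) sum += st.values[i];
  free(st.values);
  DemoState empty = {NULL, 0, 0};
  memcpy(interBuf->buf, &empty, sizeof empty);
  *result = sum;
  return 0;
}
```

In this sketch the framework buffer only ever receives bytes via `memcpy`, while the growing array lives on the UDF's own heap and is released in `demo_finish` and on the allocation-failure path.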

    Changes:
    - docs/examples/udf/perm_entropy.c: clean educational implementation
    - docs/zh/07-develop/09-udf.md: new section with memory ownership table
    - docs/examples/udf/compile_udf.sh: add perm_entropy build step
    - source/libs/function/test/perm_entropy.c: CI version with trace log
    - source/libs/function/test/CMakeLists.txt: add libperm_entropy.so target
    - test/cases/12-UDFs/test_udf_restart_taosd.py: correctness test and
      RSS leak detection test for perm_entropy
Add permutation entropy as an official aggregate UDF example, with fixes
from PR review, CI test, and build system corrections.

Changes:
- docs/examples/udf/perm_entropy.c: overflow guards, n_windows guard,
  upfront TINYINT/SMALLINT type validation, correct ensure_capacity
- source/libs/function/test/perm_entropy.c: same fixes, no logging
- docs/zh/07-develop/09-udf.md: add -lm flag, add supertable DDL
- test/cases/12-UDFs/test_udf_restart_taosd.py: add test_perm_entropy
  and test_perm_entropy_rss_leak; remove ASAN overhead; Windows branch
  in prepare_perm_entropy_so()
- source/libs/function/CMakeLists.txt: always build UDF example SO files
  regardless of BUILD_TEST so stale ASAN-built .so files are not left behind
- source/libs/function/test/CMakeLists.txt: guard only runUdf executable
  behind BUILD_TEST; UDF example libraries always built
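
As an illustration of the overflow guards mentioned for `ensure_capacity`, a capacity-growth helper might look roughly like the sketch below. The name `demo_ensure_capacity`, its signature, and the initial capacity of 16 are assumptions for this example and may differ from the helper in `perm_entropy.c`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: make sure *values can hold at least `needed` doubles.
 * Returns 0 on success, -1 on size overflow or allocation failure.
 * On failure *values and *capacity are left unchanged, so the caller can
 * still free the old block on its error path. */
int demo_ensure_capacity(double **values, size_t *capacity, size_t needed) {
  if (needed <= *capacity) return 0;

  /* Guard: needed * sizeof(double) must not wrap around size_t. */
  const size_t maxElems = SIZE_MAX / sizeof(double);
  if (needed > maxElems) return -1;

  size_t newCap = *capacity ? *capacity : 16;
  while (newCap < needed) {
    /* Guard: doubling must not exceed the element limit. */
    if (newCap > maxElems / 2) {
      newCap = needed;
      break;
    }
    newCap *= 2;
  }

  double *grown = realloc(*values, newCap * sizeof(double));
  if (grown == NULL) return -1;

  *values = grown;
  *capacity = newCap;
  return 0;
}
```

Assigning the `realloc` result to a temporary first keeps the original pointer valid when allocation fails, which is what lets the caller free it afterwards.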
Mirror the Chinese aggregate function example 4 (permutation entropy)
into the English documentation, including memory ownership table,
callback responsibilities, DDL, compile command, and code include.
Copilot AI review requested due to automatic review settings April 30, 2026 03:20
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a Permutation Entropy aggregate UDF example to demonstrate the 'accumulate-all-data-then-compute' pattern, including C source code, documentation in English and Chinese, and integration tests. The review feedback identifies potential null pointer dereferences and missing buffer size validations in the UDF implementation, an invalid CMake library type definition, and a build configuration inconsistency where the test directory is processed regardless of the test build flag.

Comment thread docs/examples/udf/perm_entropy.c
Comment thread source/libs/function/test/perm_entropy.c
Comment thread source/libs/function/CMakeLists.txt
Comment thread source/libs/function/test/CMakeLists.txt
Contributor

Copilot AI left a comment

Pull request overview

Adds a new aggregate UDF example (perm_entropy, permutation entropy) to the codebase, along with build integration, documentation, and CI/system-test coverage to exercise the “accumulate all data then compute” aggregate-UDF pattern.

Changes:

  • Introduce perm_entropy C aggregate UDF source (docs + function test build copy) and build it via CMake.
  • Add new Python system tests that register/query perm_entropy, including interval/partition scenarios and a taosd restart check.
  • Document the pattern and add a compile script entry for the new example.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Summary per file

| File | Description |
| --- | --- |
| test/cases/12-UDFs/test_udf_restart_taosd.py | Adds perm_entropy registration/query/restart tests and an RSS sampling helper. |
| source/libs/function/test/perm_entropy.c | Adds a CI-built copy of the perm_entropy aggregate UDF. |
| source/libs/function/test/CMakeLists.txt | Builds the perm_entropy module and links libm where needed; gates runUdf on BUILD_TEST. |
| source/libs/function/CMakeLists.txt | Always includes the function test subdir when UDF is enabled. |
| docs/zh/07-develop/09-udf.md | Adds a Chinese documentation section for the new aggregate example and memory-ownership rules. |
| docs/en/07-develop/09-udf.md | Adds an English documentation section for the new aggregate example and memory-ownership rules. |
| docs/examples/udf/perm_entropy.c | Adds the documented/canonical perm_entropy implementation. |
| docs/examples/udf/compile_udf.sh | Adds compilation of libperm_entropy.so into /tmp/udf. |

Comment thread test/cases/12-UDFs/test_udf_restart_taosd.py
Comment thread docs/examples/udf/perm_entropy.c Outdated
Comment thread source/libs/function/test/perm_entropy.c Outdated
Comment thread docs/examples/udf/compile_udf.sh Outdated
- perm_entropy.c (docs + test): return NAN instead of 0.0 when calloc
  fails in compute_perm_entropy(), so OOM is not silently masked as a
  valid entropy result
- test_udf_restart_taosd.py: fail fast with tdLog.exit when
  libperm_entropy.so is not found, instead of silently passing an empty
  path to CREATE AGGREGATE FUNCTION
- compile_udf.sh: fix cleanup list - replace libsqrsum.so with
  libl2norm.so to match the actual artifact compiled by the script
Copilot AI review requested due to automatic review settings May 2, 2026 04:01
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.


Comment on lines +98 to +99

DLL_EXPORT int32_t perm_entropy_start(SUdfInterBuf *interBuf) {
Comment on lines +98 to +99
int *counts = (int *)calloc(n_patterns, sizeof(int));
if (counts == NULL) return NAN;
Comment on lines +763 to +768
# 30 rows @ 1s interval → three 10s windows; each should return a value
tdSql.query(
    "select perm_entropy(val) from perm_t0 interval(10s)"
)
tdSql.checkRows(3)
tdLog.info("test3 pass: interval window returns 3 rows")
Comment on lines +886 to +892
if len(rss_samples) >= 2:
    first = next((v for v in rss_samples if v > 0), 0)
    last = rss_samples[-1]
    tdLog.info("taosudf RSS: start=%d KB end=%d KB growth=%d KB (%.1f MB)"
               % (first, last, last - first, (last - first) / 1024.0))
else:
    tdLog.info("taosudf not observed via /proc – skipping RSS log")
-if(${BUILD_TEST})
-  add_subdirectory(test)
-endif()
+add_subdirectory(test)