RFC-0025 Derived column by ScrapCodes · Pull Request #61 · prestodb/rfcs

ScrapCodes · 2026-04-24T07:35:00Z

What is a derived column?

A column created by applying a SQL expression or a UDF to an existing column in a table.

Why do we need that, since we can always apply a UDF to a column during project, filter or join?

Indeed, a derived column consumes O(N) extra storage, where N is the number of rows in the table. We still need derived columns because the performance benefits outweigh the cost of that extra storage. Consider the following use case:

A compute engine like Presto can easily push down a filter predicate, e.g. SELECT col1, col2 FROM T1 WHERE col1='constant_value'. The pushed-down filter WHERE col1='constant_value' prunes the rows that TableScan has to read. This does not hold when a UDF appears in the filter predicate, e.g. SELECT col1, col2 FROM T1 WHERE lower(col1)='constant_value'. The optimizer can still push the predicate down, but the predicate cannot be evaluated against lower and upper bound metrics such as Iceberg manifest statistics and Parquet row group statistics, because those metrics describe col1 and not lower(col1). As a result, we end up scanning a large number of rows.
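The pruning argument above can be illustrated with a small, self-contained Python sketch. It is not Presto, Iceberg, or Parquet code: `RowGroup`, `make_row_group`, the two scan functions, and the derived column name `col1_lower` are hypothetical names chosen for illustration, and the min/max check is a simplified stand-in for real lower and upper bound metrics.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Hypothetical row group: rows plus per-column (min, max) statistics."""
    rows: list
    stats: dict  # column name -> (min value, max value)


def make_row_group(rows):
    # Compute min/max statistics for every column, as a writer would.
    columns = rows[0].keys()
    stats = {c: (min(r[c] for r in rows), max(r[c] for r in rows)) for c in columns}
    return RowGroup(rows, stats)


def scan_with_udf_filter(groups, column, udf, value):
    # No statistics describe udf(column), so every row group must be read.
    scanned = list(groups)
    matches = [r for g in scanned for r in g.rows if udf(r[column]) == value]
    return len(scanned), matches


def scan_with_stats(groups, column, value):
    # Skip row groups whose [min, max] range cannot contain the value.
    scanned = [g for g in groups if g.stats[column][0] <= value <= g.stats[column][1]]
    matches = [r for g in scanned for r in g.rows if r[column] == value]
    return len(scanned), matches


# Three row groups; col1_lower is the (assumed) persisted derived column lower(col1).
raw = [["Apple", "Avocado"], ["BANANA", "Blueberry"], ["cherry", "Cranberry"]]
groups = [
    make_row_group([{"col1": v, "col1_lower": v.lower()} for v in values])
    for values in raw
]

# WHERE lower(col1) = 'banana': all row groups are scanned.
scanned_udf, rows_udf = scan_with_udf_filter(groups, "col1", str.lower, "banana")

# WHERE col1_lower = 'banana': statistics on the derived column prune two groups.
scanned_derived, rows_derived = scan_with_stats(groups, "col1_lower", "banana")

print(scanned_udf, scanned_derived)  # 3 1
```

Both scans return the same matching row, but the scan on the derived column reads one row group instead of three. Note that checking 'banana' against the statistics of col1 itself would be incorrect here: the row group holding 'BANANA' has the range ['BANANA', 'Blueberry'], which does not contain 'banana' under case-sensitive comparison.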

Derived columns therefore enable pushdown of certain predicates that contain UDFs and reduce the amount of data scanned, which can bring significant performance improvements. Derived columns are already proven in RDBMS systems, e.g. DB2 [1], and we now intend to bring them to Presto.

jja725

Agreed that how writes would work is the main concern here, in terms of compatibility with all the engines.

ScrapCodes · 2026-05-07T16:17:55Z

@tdcmeehan has volunteered to be a co-author! Yay!

aditi-pandit · 2026-05-19T19:19:25Z

+   {
+            "udfSpecList" : [ {
+               "derivedColumnType" : "PERSISTENT",
+               "derivedColumnExpression" : "SQL expression",


Can you give more info about the SQL dialect of this expression? It seems like you want at least Presto and Spark to understand it.

To be clear, deriving a common subset of expressions that both Spark and Presto can interpret is hard and likely outside the scope of this RFC. I think the most straightforward approach is to treat them like views, which defer on cross-platform interpretability and force any consumer of the view SQL to understand Presto's dialect. Cross-platform expressions can be considered an orthogonal yet important task.

prestodb-ci added the from:IBM (PRs from IBM) label Apr 24, 2026

prestodb-ci requested review from a team, BryanCutler and infvg and removed request for a team April 24, 2026 07:35

ScrapCodes marked this pull request as draft April 24, 2026 07:35

ScrapCodes removed request for BryanCutler and infvg April 24, 2026 07:35

ScrapCodes force-pushed the derived-column-spec branch 4 times, most recently from 474cf06 to 221e8c0 April 24, 2026 11:37

ScrapCodes force-pushed the derived-column-spec branch from 221e8c0 to 8690c4e May 4, 2026 09:02

ScrapCodes changed the title from "[WIP] RFC-0025 Derived column" to "RFC-0025 Derived column" May 4, 2026

ScrapCodes marked this pull request as ready for review May 4, 2026 09:58

prestodb-ci requested review from a team, infvg and wanglinsong and removed request for a team May 4, 2026 09:58

ScrapCodes force-pushed the derived-column-spec branch 3 times, most recently from 3bdd1ef to 2448a60 May 4, 2026 12:23

aditi-pandit reviewed May 4, 2026

jja725 self-requested a review May 5, 2026 18:25

ScrapCodes force-pushed the derived-column-spec branch from 2448a60 to 8c6c4a4 May 6, 2026 16:39

jja725 reviewed May 7, 2026

ScrapCodes force-pushed the derived-column-spec branch from 8c6c4a4 to 03935be May 7, 2026 16:11

tdcmeehan reviewed May 7, 2026

tdcmeehan reviewed May 8, 2026

ScrapCodes force-pushed the derived-column-spec branch from 03935be to 6629e4e May 18, 2026 10:54

ScrapCodes requested review from aditi-pandit, jja725 and tdcmeehan May 18, 2026 17:52

tdcmeehan reviewed May 18, 2026

ScrapCodes requested a review from tdcmeehan May 19, 2026 06:47

ScrapCodes force-pushed the derived-column-spec branch from 82feae1 to 334810b May 19, 2026 10:35

aditi-pandit reviewed May 19, 2026

ScrapCodes commented May 20, 2026

ScrapCodes force-pushed the derived-column-spec branch from e9533da to 7b6ee54 May 25, 2026 14:00

RFC-0025 Derived column

73e85c3

ScrapCodes force-pushed the derived-column-spec branch from 7b6ee54 to 73e85c3 May 26, 2026 06:11

ScrapCodes requested a review from aditi-pandit May 26, 2026 06:14
