Number of rows in each column should be the same, but got [ArrayBuffer(8192, 0)] #4211

@andygrove

Describe the bug

I was running TPC-H benchmarks at 1 TB scale to compare the performance of native_datafusion and native_iceberg_compat. Some queries fail with native_iceberg_compat, with the following error:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 9 in stage 18.0 failed 4 times, most recent failure: Lost task 9.3 in stage 18.0 (TID 24604) (240.56.187.86 executor 9): org.apache.spark.SparkException: Number of rows in each column should be the same, but got [ArrayBuffer(8192, 0)]
	at org.apache.comet.vector.NativeUtil.exportBatch(NativeUtil.scala:173)
	at org.apache.comet.vector.NativeUtil.exportBatchToAddresses(NativeUtil.scala:97)
	at org.apache.comet.NativeColumnarToRowConverter.convert(NativeColumnarToRowConverter.scala:90)
	at org.apache.spark.sql.comet.CometNativeColumnarToRowExec.$anonfun$doExecute$3(CometNativeColumnarToRowExec.scala:211)
	at scala.collection.Iterator$$anon$10.nextCur(Iterator.scala:594)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:608)
	at scala.collection.Iterator$$anon$9.hasNext(Iterator.scala:583)
	at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:179)
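
The message appears to list the row count of each column in the batch being exported: one column reports 8192 rows and another reports 0. As an illustration only, here is a minimal Python sketch of that kind of consistency check. `check_row_counts` is a hypothetical helper written for this note, not Comet's `NativeUtil.exportBatch` implementation, and it assumes the row counts have already been collected into a list.

```python
def check_row_counts(row_counts):
    """Return the shared row count, or raise if the columns disagree.

    row_counts: list of ints, one per column in a batch (hypothetical input).
    """
    # More than one distinct value means at least two columns disagree.
    if len(set(row_counts)) > 1:
        raise ValueError(
            "Number of rows in each column should be the same, "
            f"but got {row_counts}"
        )
    # An empty batch (no columns) is treated as having zero rows.
    return row_counts[0] if row_counts else 0


if __name__ == "__main__":
    print(check_row_counts([8192, 8192]))
    try:
        check_row_counts([8192, 0])  # the counts reported in this issue
    except ValueError as err:
        print(err)
```

Under this reading, the failing batch has at least one column that was never populated (0 rows) while another holds a full 8192-row batch, which would point at how the batch is produced rather than at the check itself.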

Steps to reproduce

No response

Expected behavior

No response

Additional context

No response
