HoodieSchema collapses TINYINT/SMALLINT into INT and loses engine type width needed by writer paths

### Task Description

**Describe the problem**

`HoodieSchema` currently collapses engine-level small integer types into a single `INT` type, which loses integer width information needed to faithfully reconstruct engine-native schemas.

Examples:
- Spark: `ByteType | ShortType -> HoodieSchemaType.INT`
- Flink: `TINYINT | SMALLINT -> HoodieSchemaType.INT`

This is a problem because writer paths are built from `HoodieSchema` and later reconstruct engine-native schemas from it.

On Spark, the path is roughly:

1. `HoodieSparkSchemaConverters.toHoodieType(...)` maps `ByteType/ShortType/IntegerType` to `HoodieSchemaType.INT`
2. `HoodieSparkFileWriterFactory` reconstructs `StructType` from `HoodieSchema`
3. `HoodieSparkSchemaConverters.toSqlType(...)` maps `HoodieSchemaType.INT` back to `IntegerType`
4. `HoodieRowParquetWriteSupport` makes type-dependent row accessor decisions from that reconstructed `StructType`

At that point, the original Spark type width is already lost. For example, an original `ShortType` field is reconstructed as `IntegerType`, and writer code may go through `row.getInt(...)` instead of `row.getShort(...)`.
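The four steps above can be modeled with a small toy example. This is a minimal sketch, not Hudi or Spark code: `EngineType`, `SchemaType`, and the helper methods are hypothetical stand-ins for Spark's data types, `HoodieSchemaType`, and the converters, and accessor names are returned as plain strings purely for illustration.

```java
// Toy model only: these enums stand in for the engine types and for
// HoodieSchemaType. None of these names exist in Hudi or Spark.
class WidthCollapseSketch {
    enum EngineType { BYTE, SHORT, INTEGER }
    enum SchemaType { INT }

    // Step 1: every small integer width maps to the single INT schema type.
    static SchemaType toSchemaType(EngineType t) {
        switch (t) {
            case BYTE:
            case SHORT:
            case INTEGER:
                return SchemaType.INT;
            default:
                throw new IllegalArgumentException("unsupported type: " + t);
        }
    }

    // Steps 2-3: INT can only map back to one engine type, so the
    // original width cannot be recovered.
    static EngineType toEngineType(SchemaType t) {
        return EngineType.INTEGER;
    }

    // Step 4: the writer picks a row accessor from the engine type it sees.
    static String accessorFor(EngineType t) {
        switch (t) {
            case BYTE:
                return "getByte";
            case SHORT:
                return "getShort";
            default:
                return "getInt";
        }
    }

    static EngineType roundTrip(EngineType t) {
        return toEngineType(toSchemaType(t));
    }

    public static void main(String[] args) {
        for (EngineType original : EngineType.values()) {
            EngineType rebuilt = roundTrip(original);
            System.out.println(original + " -> " + rebuilt
                + " | accessor matching original: " + accessorFor(original)
                + " | accessor chosen by writer: " + accessorFor(rebuilt));
        }
    }
}
```

In this model, the accessor chosen after the round trip differs from the one matching the original type for both `BYTE` and `SHORT`, which is the mismatch described above.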

Flink has the same issue in principle:
- `TINYINT/SMALLINT` are also collapsed into `HoodieSchemaType.INT`
- writer code later reconstructs `RowType` from `HoodieSchema`
- the original integer width is no longer recoverable from `HoodieSchema` alone

So this is not just a schema round-trip fidelity issue. `HoodieSchema` currently does not preserve enough information for engine-native writer construction.

**To Reproduce**

1. Create a Spark `StructType` containing a `ByteType` or `ShortType` field, or a Flink `RowType` containing a `TINYINT` or `SMALLINT` field
2. Convert it to `HoodieSchema`
3. Reconstruct engine-native schema from that `HoodieSchema`
4. Observe that the field has already been widened to `INT` / `IntegerType`

**Expected behavior**

`HoodieSchema` should preserve integer width information so that the following engine types no longer collapse into the same schema type as `INT`:
- Spark `ByteType`, `ShortType`
- Flink `TINYINT`, `SMALLINT`

**Environment Description**

Hudi version: current master / current branch

**Possible solutions**

1. Add new types:
   add native integer-width types such as `TINYINT` and `SMALLINT` to `HoodieSchema`, and preserve them across Spark/Flink converters.

2. Fix the writer builders:
   avoid reconstructing engine writer schema from `HoodieSchema` alone in paths that need engine-native getter semantics, and instead pass the original engine schema through the writer boundary.

The first direction seems more consistent with the long-term schema-system design. `rfc-99` already describes primitive integer widths such as `TINYINT` and `SMALLINT` as first-class logical types.
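As a rough illustration of the first direction, the sketch below uses a one-to-one mapping so that every width survives the round trip. It is a toy model under stated assumptions: `EngineType`, `SchemaType`, `register`, and `roundTrip` are hypothetical names, and the actual `HoodieSchemaType` additions and converter changes may look different.

```java
import java.util.EnumMap;
import java.util.Map;

// Toy model only: hypothetical enums, not the real Hudi or engine classes.
class WidthPreservingSketch {
    enum EngineType { BYTE, SHORT, INTEGER }
    // Assumed extension: TINYINT and SMALLINT sit next to INT as distinct types.
    enum SchemaType { TINYINT, SMALLINT, INT }

    private static final Map<EngineType, SchemaType> TO_SCHEMA = new EnumMap<>(EngineType.class);
    private static final Map<SchemaType, EngineType> TO_ENGINE = new EnumMap<>(SchemaType.class);

    // Register both directions together so the mapping stays one-to-one.
    private static void register(EngineType engine, SchemaType schema) {
        TO_SCHEMA.put(engine, schema);
        TO_ENGINE.put(schema, engine);
    }

    static {
        register(EngineType.BYTE, SchemaType.TINYINT);
        register(EngineType.SHORT, SchemaType.SMALLINT);
        register(EngineType.INTEGER, SchemaType.INT);
    }

    static SchemaType toSchemaType(EngineType t) {
        return TO_SCHEMA.get(t);
    }

    static EngineType toEngineType(SchemaType t) {
        return TO_ENGINE.get(t);
    }

    static EngineType roundTrip(EngineType t) {
        return toEngineType(toSchemaType(t));
    }

    public static void main(String[] args) {
        for (EngineType original : EngineType.values()) {
            System.out.println(original + " -> " + toSchemaType(original)
                + " -> " + roundTrip(original));
        }
    }
}
```

With distinct schema types, the writer side can reconstruct the original engine type from the schema alone, so no extra engine schema needs to cross the writer boundary.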

### Task Type

Code improvement/refactoring

### Related Issues

**Parent feature issue:** (if applicable)
**Related issues:**
NOTE: Use `Relationships` button to add parent/blocking issues after issue is created.


HoodieSchema collapses TINYINT/SMALLINT into INT and loses engine type width needed by writer paths #18974
