[server] TabletServer leaves orphan replica directories after restart during table/partition deletion #3387

@gyang94

Search before asking

  • I searched in the issues and found nothing similar.

Fluss version

0.9.0 (latest release)

Please describe the bug 🐞

Summary

If a TabletServer (TS) is offline when dropTable or dropPartition is issued and is restarted afterwards, the on-disk replica directories on that TabletServer are never removed. This is a permanent disk leak: the directories have no in-memory owner and nothing ever reaps them.

Reproduction

  1. Start a Fluss cluster with 3 TabletServers.
  2. Create a table with replication-factor = 3.
  3. Stop TabletServer-1.
  4. Drop the table (or partition).
  5. Restart TabletServer-1.
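  6. Observe the leftovers under <dataDir>/<db>/<table>-<tableId>/ on TabletServer-1: after a partition drop the tablet directory (for example log-0/) still exists; after a table drop the tablet directories are removed at startup but the empty parent directory remains.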

Root cause

Two independent gaps in the TabletServer's cleanup logic produce the leak, depending on whether the deleted entity is a table or a partition.

Gap 1: stopReplica for NoneReplica is a no-op (affects partition deletion)

When a TabletServer restarts, ReplicaManager.allReplicas is empty — buckets are only registered when NotifyLeaderAndIsr arrives. However, LogManager.startup() calls loadAllLogs(), which walks the data directories and loads every on-disk log tablet into LogManager.currentLogs.

For partition deletion, the table metadata still exists in ZK, so loadAllLogs succeeds and the orphan log tablet enters currentLogs. When the coordinator re-sends stopReplica(delete=true) after the TS reconnects, the bucket resolves to NoneReplica (not in allReplicas), and the current code does nothing:

if (hostedReplica instanceof NoneReplica) {
    // do nothing for this case.
    result.add(new StopReplicaResultForBucket(tb));
}

The TS responds "success" without touching the on-disk directory. The log tablet sits in currentLogs with no way to be cleaned up.
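One possible direction for closing this gap is sketched below as a self-contained simulation, not as Fluss code. The class `OrphanReplicaSweepSketch`, its two fields, the string bucket keys, and `deleteRecursively` are all hypothetical stand-ins: the fields only model `LogManager.currentLogs` and `ReplicaManager.allReplicas` as described above. The idea is that the NoneReplica branch, on `delete=true`, looks the bucket up among the logs loaded from disk at startup and removes the orphan directory instead of returning immediately.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Stream;

/** Hypothetical stand-in for illustration; this is not Fluss's ReplicaManager. */
class OrphanReplicaSweepSketch {
    // Models LogManager.currentLogs: bucket key -> tablet directory loaded from disk at startup.
    final Map<String, Path> currentLogs = new ConcurrentHashMap<>();
    // Models ReplicaManager.allReplicas: buckets registered via NotifyLeaderAndIsr.
    final Set<String> allReplicas = ConcurrentHashMap.newKeySet();

    /** Returns true only if an orphan tablet directory was found and removed. */
    boolean stopReplica(String bucket, boolean delete) {
        if (allReplicas.contains(bucket)) {
            // Hosted replica: the regular stop/delete path applies (elided in this sketch).
            return false;
        }
        // "NoneReplica" case: no in-memory replica, but a log may have been loaded from disk.
        if (!delete) {
            return false;
        }
        Path orphan = currentLogs.remove(bucket);
        if (orphan == null) {
            return false; // nothing was loaded for this bucket
        }
        deleteRecursively(orphan);
        return true;
    }

    /** Deletes a directory tree, children before parents. */
    static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the real server the sweep would presumably also need to close the loaded log tablet and handle the matching kv-N/ directory; the sketch leaves both out.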

Gap 2: SchemaNotExistException handler leaves empty parent directories (affects table deletion)

For table deletion, dropTable removes table metadata (including schemas) from ZK synchronously, before sending stopReplica. When the offline TS restarts:

  1. loadAllLogs → loadLog → getTableInfo → SchemaNotExistException (schema gone from ZK)
  2. The existing handler deletes log-N/ and kv-N/ tablet directories
  3. But the parent directory (<db>/<table>-<tableId>/ or <db>/<table>-<tableId>/<partition>-p<partitionId>/) is left behind empty

Since the log tablet was already cleaned up at startup, currentLogs does not contain the bucket. Even if a stopReplica(delete=true) arrives later and the NoneReplica branch were changed to sweep orphans out of currentLogs, it would find nothing for this bucket, so the empty parent directory still leaks.
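A minimal sketch of the missing step, written against plain `java.nio.file` rather than any Fluss API: after a tablet directory has been deleted, walk upwards and remove each ancestor that has become empty, stopping at a boundary directory that must be kept. The class `EmptyParentDirCleaner`, the method `removeEmptyParents`, and the choice of the database directory as the boundary are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of Fluss. */
class EmptyParentDirCleaner {

    /**
     * Removes empty ancestors of an already-deleted tablet directory.
     * stopAt is an exclusive boundary (for example the database directory) and is never deleted.
     * Returns the number of directories removed.
     */
    static int removeEmptyParents(Path stopAt, Path tabletDir) throws IOException {
        Path boundary = stopAt.toAbsolutePath().normalize();
        Path current = tabletDir.toAbsolutePath().normalize().getParent();
        int removed = 0;
        while (current != null && current.startsWith(boundary) && !current.equals(boundary)) {
            if (!Files.isDirectory(current) || !isEmpty(current)) {
                break; // other buckets or partitions still live here
            }
            // Files.delete throws DirectoryNotEmptyException if a file appears concurrently,
            // so a racing writer cannot lose data to this cleanup.
            Files.delete(current);
            removed++;
            current = current.getParent();
        }
        return removed;
    }

    /** True if the directory has no entries. */
    static boolean isEmpty(Path dir) throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            return !entries.iterator().hasNext();
        }
    }
}
```

For a partitioned table this removes <partition>-p<partitionId>/ first and then <table>-<tableId>/ once the last partition directory is gone. It stops as soon as it reaches a directory that still holds other tablets.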

Solution

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
