Skip to content

HDFS-17927. Make TestDiskError.testShutdown use deterministic volume failure injection#8517

Open
smengcl wants to merge 2 commits into
apache:trunkfrom
smengcl:HDFS-17927-test
Open

HDFS-17927. Make TestDiskError.testShutdown use deterministic volume failure injection#8517
smengcl wants to merge 2 commits into
apache:trunkfrom
smengcl:HDFS-17927-test

Conversation

@smengcl
Copy link
Copy Markdown
Contributor

@smengcl smengcl commented May 27, 2026

Description of PR

TestDiskError.testShutdown() currently drives DataNode shutdown by repeatedly creating files until dn.isDatanodeUp() becomes false. If shutdown does not happen promptly, the test can loop for a long time and generate excessive logs.

This change replaces the file-create loop with deterministic failure injection for both DataNode data dirs, then explicitly runs dn.checkDiskError() and waits boundedly for the DataNode to exit.

How was this patch tested?

  • The test TestDiskError.testShutdown itself.

For code changes:

  • Does the title or this PR starts with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

AI Tooling

If an AI tool was used:

@smengcl smengcl requested a review from adoroszlai May 27, 2026 02:43
@smengcl smengcl added the test test code changes label May 27, 2026
@hadoop-yetus

This comment was marked as outdated.

@adoroszlai adoroszlai requested a review from boky01 May 27, 2026 07:16
@smengcl smengcl marked this pull request as ready for review May 30, 2026 04:56
Copilot AI review requested due to automatic review settings May 30, 2026 04:56
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR makes TestDiskError.testShutdown() deterministic by replacing repeated file creation with direct DataNode volume failure injection and a bounded wait for shutdown.

Changes:

  • Uses DataNodeTestUtils.injectDataDirFailure() on both DataNode storage dirs.
  • Calls dn.checkDiskError() directly and waits up to 30 seconds for the DataNode to report down.
  • Cleans up injected failures in a finally block and updates imports accordingly.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

return;
}
// Bring up two more datanodes
cluster.startDataNodes(conf, 2, true, null, null);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure the 2 additional DNs are not required neither in this test case nor to keep the following test cases stable (in the setup phase we are creating a single DN cluster)?

DataNodeTestUtils.injectDataDirFailure(dir1, dir2);
dn.checkDiskError();
GenericTestUtils.waitFor(() -> !dn.isDatanodeUp(), 100, 30000);
assertFalse(dn.isDatanodeUp(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to reach this line?
I mean if dn.isDatanodeUp() is true, then GenericTestUtils.waitFor(() will throw an exception, so if GenericTestUtils.waitFor() did not throw exception that means the DN is down.

@boky01
Copy link
Copy Markdown
Contributor

boky01 commented May 30, 2026

Hi @smengcl,
Thanks for the patch, it's a better version then the previous one. I left a few comments, please check them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

HDFS test test code changes trunk

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants