
Crypto Module Migration#104

Merged
harvey0100 merged 2 commits into
avocado-framework:mainfrom
harvey0100:crypto
Mar 16, 2026

Conversation

@harvey0100
Contributor

Migrates the crypto module from avocado to autils. Adds hash_file() for file hashing, plus unit and functional tests, metadata, and docs. The implementation matches the avocado version, with the deprecation block removed and imports updated for autils.

@harvey0100 harvey0100 self-assigned this Feb 24, 2026
@mr-avocado mr-avocado Bot moved this to Review Requested in Default project Feb 24, 2026
@gemini-code-assist

Summary of Changes

Hello @harvey0100, I'm Gemini Code Assist[1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request migrates a cryptographic hashing utility into the autils library and introduces a general I/O utility module. The crypto module provides file hashing, while the genio module offers common file operations (reading, writing, and pattern matching) plus a file comparison feature built on the new hashing function. The change is accompanied by unit and functional tests and updated documentation.

Highlights

  • Crypto Module Migration: The crypto module, including the hash_file function, has been migrated from the avocado project to autils.file. The implementation matches the original avocado version, with deprecation blocks removed and imports updated for autils.
  • New GenIO Module: A new genio module has been introduced under autils.file, providing a collection of generic I/O utilities for file operations such as reading, writing, appending, and pattern matching. This module integrates with the new crypto module for file comparison.
  • Comprehensive Testing: Both the crypto and genio modules are accompanied by new unit and functional tests. These tests cover various scenarios, including core algorithm functionality, size limits, edge cases, error handling, and real-world use cases like download verification and log analysis.
  • Documentation and Metadata: New documentation entries in docs/source/utils.rst and metadata files (metadata/file/crypto.yml, metadata/file/genio.yml) have been added to properly integrate and describe the new crypto and genio modules within the autils project.

Changelog

  • autils/file/crypto.py
    • Added the crypto module, including the hash_file function for cryptographic file hashing.
  • autils/file/genio.py
    • Added the genio module, providing various file I/O utilities and integrating crypto.hash_file for file comparison.
  • docs/source/utils.rst
    • Updated the documentation to include the newly added crypto and genio modules.
  • metadata/file/crypto.yml
    • Added metadata for the new crypto module, specifying its description, categories, maintainers, supported platforms, and test files.
  • metadata/file/genio.yml
    • Added metadata for the new genio module, specifying its description, categories, maintainers, supported platforms, and test files.
  • tests/functional/modules/file/crypto.py
    • Added functional tests for the crypto.hash_file function, covering real-world scenarios like download verification and tampering detection.
  • tests/functional/modules/file/genio.py
    • Added functional tests for the genio module, validating file read/write, pattern matching, and comparison operations in practical scenarios.
  • tests/unit/modules/file/crypto.py
    • Added unit tests for the crypto.hash_file function, covering various algorithms, size limits, edge cases, and error handling.
  • tests/unit/modules/file/genio.py
    • Added unit tests for the genio module's functions, including file read/write, line operations, pattern matching, and error handling.

Activity

  • No specific activity (comments, reviews, progress) has been recorded for this pull request yet.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

The PR successfully migrates the crypto and genio modules to autils, providing good test coverage and documentation. I've identified a few areas for improvement, primarily focusing on memory efficiency when handling large files and a bug in hash_file when dealing with special files (like /dev/zero) where os.path.getsize returns 0.

I am having trouble creating individual review comments, so my feedback is listed below.

autils/file/crypto.py (57-60)

high

Using os.path.getsize() to cap the size parameter causes issues with special files (e.g., /dev/zero, pipes, or some files in /proc) whose reported size is 0 even though they contain data. In that case the function returns the hash of empty input even if a positive size was requested. It also introduces a time-of-check to time-of-use (TOCTOU) race if the file size changes between the call to getsize() and the subsequent read(). It is more robust to read until EOF or until the requested number of bytes is reached.

    if not size or size <= 0:
        size = None

autils/file/crypto.py (71-78)

high

This loop should be updated to handle the case where size is None (meaning hash the entire file), ensuring it works correctly for both regular and special files without relying on a pre-calculated file size.

        while size is None or size > 0:
            read_amt = chunksize if size is None else min(chunksize, size)
            data = file_to_hash.read(read_amt)
            if not data:
                if size is not None:
                    LOG.debug("Nothing left to read but size=%d", size)
                break
            hash_obj.update(data)
            if size is not None:
                size -= len(data)
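
For context, the two crypto.py suggestions above can be combined into one standalone function. This is only a sketch under assumptions: the name hash_file_sketch, the sha256 default, and the chunksize parameter are illustrative and are not taken from autils/file/crypto.py, whose actual signature and defaults are not shown in this conversation.

```python
import hashlib
import logging

LOG = logging.getLogger(__name__)


def hash_file_sketch(filename, size=None, algorithm="sha256", chunksize=4096):
    """Hash up to `size` bytes of `filename`; hash the whole file if size is falsy.

    Hypothetical helper: name, defaults, and chunksize are assumptions.
    """
    if not size or size <= 0:
        size = None  # None means "read until EOF", no getsize() call needed

    hash_obj = hashlib.new(algorithm)
    with open(filename, "rb") as file_to_hash:
        while size is None or size > 0:
            read_amt = chunksize if size is None else min(chunksize, size)
            data = file_to_hash.read(read_amt)
            if not data:
                # EOF reached before the requested number of bytes was read
                if size is not None:
                    LOG.debug("Nothing left to read but size=%d", size)
                break
            hash_obj.update(data)
            if size is not None:
                size -= len(data)
    return hash_obj.hexdigest()
```

Because the loop stops on either EOF or an exhausted byte budget, a positive size on a special file reads real data instead of being capped by a reported size of 0.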

autils/file/genio.py (122)

medium

Calling file_obj.readlines() creates an intermediate list of all lines in memory before the list comprehension runs. Iterating directly over the file object is more memory-efficient.

            contents = [line.rstrip("\n") for line in file_obj]
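
A minimal standalone illustration of this suggestion, assuming a hypothetical helper named read_lines_sketch (the actual genio function around line 122 is not shown in this conversation):

```python
def read_lines_sketch(filename):
    """Return the lines of `filename` without trailing newlines.

    Hypothetical helper. Iterating over the file object avoids the
    intermediate list that readlines() would build first.
    """
    with open(filename, "r", encoding="utf-8") as file_obj:
        return [line.rstrip("\n") for line in file_obj]
```

The returned list still holds every line, so the saving is the second, temporary copy rather than the result itself.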

autils/file/genio.py (150)

medium

Using file_obj.readlines() loads the entire file into memory. For large files (like logs), this can lead to high memory consumption. Iterating over the file object directly processes it line by line.

        for line in file_obj:
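
As a rough sketch of what line-by-line processing could look like for a log search, assuming a hypothetical function lines_matching_sketch that is not part of genio:

```python
import re


def lines_matching_sketch(filename, pattern):
    """Return the lines of `filename` that match the regex `pattern`.

    Hypothetical helper. Only one line is held in memory at a time,
    plus whatever matches are collected.
    """
    regex = re.compile(pattern)
    matches = []
    with open(filename, "r", encoding="utf-8") as file_obj:
        for line in file_obj:  # streams the file instead of loading it whole
            if regex.search(line):
                matches.append(line.rstrip("\n"))
    return matches
```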

autils/file/genio.py (286)

medium

Reading the entire file into memory with content_file.read() can be problematic for very large files. While re.MULTILINE is used, if the pattern doesn't actually span multiple lines, iterating line by line would be much safer. If multi-line matching is required, consider reading in chunks or using a memory-mapped file for large inputs.
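
One way the memory-mapped option might look is sketched below. The function name pattern_in_file_sketch is an assumption, and the pattern is matched against bytes (encoded as UTF-8 here), which differs from matching against decoded text.

```python
import mmap
import os
import re


def pattern_in_file_sketch(filename, pattern):
    """Return True if the regex `pattern` matches anywhere in `filename`.

    Hypothetical helper. The file is memory-mapped so the OS pages data
    in on demand instead of the whole content being read into a str.
    """
    regex = re.compile(pattern.encode("utf-8"), re.MULTILINE)
    with open(filename, "rb") as content_file:
        # mmap cannot map a zero-length file, so handle that case directly
        if os.fstat(content_file.fileno()).st_size == 0:
            return regex.search(b"") is not None
        with mmap.mmap(content_file.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return regex.search(mapped) is not None
```

Patterns that span several lines still work, since the regex sees the mapped file as one contiguous buffer.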

autils/file/genio.py (310-311)

medium

Comparing file sizes before computing cryptographic hashes is a significant optimization. If the sizes differ, the files cannot be equal, and we can avoid the expensive hashing process entirely.

    if os.path.getsize(filename) != os.path.getsize(other):
        return False
    hash_1 = crypto.hash_file(filename)
    hash_2 = crypto.hash_file(other)
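
A self-contained sketch of the size-first comparison follows. It uses hashlib directly instead of crypto.hash_file so it runs without autils; the names are_files_equal_sketch and _sha256_of are hypothetical. The size check assumes regular files, since special files may report a size of 0, as noted in the crypto.py feedback above.

```python
import hashlib
import os


def _sha256_of(path, chunksize=65536):
    """Hash a file in chunks (stand-in for crypto.hash_file in this sketch)."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunksize), b""):
            digest.update(chunk)
    return digest.hexdigest()


def are_files_equal_sketch(filename, other):
    """Return True if both regular files have identical content."""
    # Cheap check first: files of different size cannot be equal
    if os.path.getsize(filename) != os.path.getsize(other):
        return False
    return _sha256_of(filename) == _sha256_of(other)
```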

- Add autils/file/crypto.py (hash_file without deprecation block)
- Add unit and functional tests with updated imports
- Add metadata/file/crypto.yml
- Add crypto to docs under File section

Reference: avocado-framework#43
Assisted-By: Cursor-Claude-4-Sonnet
Signed-off-by: Harvey Lynden <hlynden@redhat.com>

Remove test_file_tampering_detection and test_symlink_follows_to_target
to match avocado selftests/functional/utils/crypto.py line for line.

Made-with: Cursor
@harvey0100 harvey0100 merged commit 3745ffd into avocado-framework:main Mar 16, 2026
5 checks passed
@github-project-automation github-project-automation Bot moved this from Review Requested to Done 114 in Default project Mar 16, 2026
@codecov

codecov Bot commented Mar 16, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 0.00%. Comparing base (07bdbd3) to head (6488869).
⚠️ Report is 4 commits behind head on main.
