attachments includes phantom multipart/body parts (classifier returns non-attachments) #22

@kurok

Description

Finding (High). extract_mail_parts pushes the parent part after recursing into its children (post-order, mail_parser.rs:86), so multipart/* container nodes and body parts all land in attachments, each with an empty filename (mail_parser.rs:49–57). A one-image email therefore yields 4 entries instead of 1, and test_attachments.py:6 asserts len == 4, which encodes the bug as expected behavior.

Fix. Treat a part as an attachment only if it has a real filename, taken first from Content-Disposition: attachment and second from the name= parameter of Content-Type. Do not push multipart containers, and do not call get_body_raw() on containers, because it re-includes the children's bytes. See the sketch in the audit.
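To make the intended rule concrete, here is a minimal sketch that models the classification with Python's standard-library email package. It is not the Rust code in mail_parser.rs: the function names attachment_filename, collect_attachments and build_sample are hypothetical, and the sample message is invented for illustration. It also assumes that a part with a Content-Disposition filename but no attachment disposition and no Content-Type name= is not an attachment, which the issue does not settle.

```python
from email.message import EmailMessage


def attachment_filename(part):
    """Return the filename if the part counts as an attachment, else None."""
    # Multipart containers only group other parts; never treat them as attachments.
    if part.is_multipart():
        return None
    # First choice: Content-Disposition: attachment; filename="..."
    if part.get_content_disposition() == "attachment":
        name = part.get_param("filename", header="content-disposition")
        if name:
            return name
    # Second choice: Content-Type: ...; name="..."
    name = part.get_param("name")
    return name or None


def collect_attachments(msg):
    """Walk every part and keep only leaves that carry a real filename."""
    found = []
    for part in msg.walk():
        name = attachment_filename(part)
        if name is not None:
            # Only leaf payloads are read, so container bytes are never duplicated.
            found.append((name, part.get_payload(decode=True)))
    return found


def build_sample():
    """Hypothetical one-image email: plain and HTML bodies plus one PNG."""
    msg = EmailMessage()
    msg["Subject"] = "one image"
    msg.set_content("plain body")
    msg.add_alternative("<p>html body</p>", subtype="html")
    msg.add_attachment(
        b"\x89PNG fake bytes", maintype="image", subtype="png", filename="logo.png"
    )
    return msg


if __name__ == "__main__":
    sample = build_sample()
    for name, data in collect_attachments(sample):
        print(name, len(data))
```

In this model the containers and both body parts are skipped, so only the PNG is reported. The real fix still has to make the equivalent decisions inside extract_mail_parts.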

Acceptance

  • len(attachment_mail.attachments) == 1 for the single-attachment fixture
  • attachments[0].filename is populated
  • .pyi stub and README reflect the corrected semantics

⚠️ Breaking change — requires a minor version bump. See Open Question on whether current shape is relied upon.

Audit ref: M2-T1. Related: #DISPOSITION.

Metadata

Labels

  • audit (Surfaced by the repo-audit)
  • breaking-change (Changes public API/behavior)
  • bug (Something isn't working)
  • priority: high (High priority)
