Why am I seeing repeated diff_urls? #45

@jose

Hi,

Take, for example, the falcon Python project from the falconry organization and this diff_url. The BugSwarm dataset contains 13 images corresponding to this diff_url, each with its own Job ID, Build ID, etc.:

falconry-falcon-30823226071
falconry-falcon-30823226301
falconry-falcon-30823227282
falconry-falcon-30823227361
falconry-falcon-30823227617
falconry-falcon-30823227692
falconry-falcon-30823227759
falconry-falcon-30823227836
falconry-falcon-30823227929
falconry-falcon-30823227988
falconry-falcon-30823228066
falconry-falcon-30823228169
falconry-falcon-30823228234

Overall, there are 2,073 unique diff_urls out of the 3,947 entries listed here.

My question is: should one treat this as (a) a single bug with multiple artifacts or (b) multiple bugs, each with its own artifact? I would say the former...
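
For reference, here is a minimal sketch of how entries can be grouped by diff_url to find the repeats. It assumes each entry is a plain dict with hypothetical `image_tag` and `diff_url` keys; the real field names in the dataset may differ, and the URLs and the third record below are made-up placeholders, not actual data.

```python
from collections import defaultdict


def group_by_diff_url(artifacts):
    """Map each diff_url to the list of image tags that share it.

    Assumes every record has 'diff_url' and 'image_tag' keys
    (hypothetical names chosen for this illustration).
    """
    groups = defaultdict(list)
    for artifact in artifacts:
        groups[artifact["diff_url"]].append(artifact["image_tag"])
    return dict(groups)


# Illustrative records only: the URLs are placeholders and
# "other-project-1" is an invented tag.
sample = [
    {"image_tag": "falconry-falcon-30823226071",
     "diff_url": "https://example.invalid/diff/A"},
    {"image_tag": "falconry-falcon-30823226301",
     "diff_url": "https://example.invalid/diff/A"},
    {"image_tag": "other-project-1",
     "diff_url": "https://example.invalid/diff/B"},
]

groups = group_by_diff_url(sample)

# Keep only the diff_urls that are shared by more than one image.
repeated = {url: tags for url, tags in groups.items() if len(tags) > 1}

print(len(sample), "entries,", len(groups), "unique diff_urls")
for url, tags in repeated.items():
    print(url, "->", len(tags), "images")
```

Under interpretation (a), the number of keys in `groups` would be the bug count; under (b), it would be the number of entries.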
