Surprisingly low perf on code.Debug #31

@seyuboglu

Thanks for the work on this benchmark.

I was wondering why the baseline accuracies on code.Debug are so low.

code.Debug | 37.06% | < 5% | 17.77% | < 5% | 9.14% | 13.96% | 7.36%

Since it's multiple choice with four options, random guessing should give about 25% in expectation, yet all but one of the scores above fall below that.
Have you released the outputs from your evaluation runs anywhere?
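
For reference, here is a rough sanity check of the 25% chance level, written as a minimal sketch. The function name `random_guess_accuracy` and the question count are my own placeholders, not anything from the benchmark code, and it assumes every question has exactly one correct answer among four options.

```python
import random


def random_guess_accuracy(num_questions, num_options=4, seed=0):
    """Simulate uniform random guessing on a multiple-choice task.

    Hypothetical helper for illustration only; it does not use the
    benchmark's data or evaluation code.
    """
    rng = random.Random(seed)
    correct = 0
    for _ in range(num_questions):
        answer = rng.randrange(num_options)  # position of the correct option
        guess = rng.randrange(num_options)   # uniformly random guess
        correct += guess == answer
    return correct / num_questions


# The question count is an arbitrary placeholder, not the real task size.
accuracy = random_guess_accuracy(num_questions=100_000)
print(f"random-guess accuracy: {accuracy:.2%}")
```

With enough questions this should land close to 1/4, so scores well below that suggest something other than model ability (for example answer parsing or truncated inputs) may be affecting the numbers.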
