Smooth-BLEU bug fixed by Aktsvigun · Pull Request #1 · AlexGidiotis/Bayesian-Summarization

Aktsvigun · 2022-06-29T15:36:23Z

Alexios, hi once again!

I think I've found a flaw in the current implementation: the smoothing method is not ideal at the moment. It still cuts the values for sentences shorter than 4 tokens (please see attached image 1). This results in unreasonably high BLEUVar scores for short sentences even when the model is confident about the instance: even when the outputs are identical across 20 MC runs, we get a minimum of 0.468 for two-token summaries.

This is especially severe for the AESLC dataset, where the summaries are extremely short (please see the distribution of gold summary lengths after tokenization in image 2). To be precise, 49.5% of the summaries in the train sample are shorter than 4 tokens after tokenization, which means that the current implementation of the BLEUVar score calculation is highly biased towards short summaries.
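To make the effect concrete, here is a minimal, self-contained sketch. It is not the repository's code. It assumes a toy sentence-level BLEU that replaces the precision of any n-gram order without matches by a floor of 0.1, and a BLEUVar defined as the mean of (1 - BLEU)^2 over all ordered pairs of MC samples. Both are assumptions, chosen only because they reproduce the 0.468 figure for identical two-token outputs. The `cap_order` flag is a hypothetical remedy (limit the n-gram order to the sentence length) and is not necessarily what this PR does.

```python
import math
from collections import Counter
from itertools import permutations


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hyp, ref, max_order=4, floor=0.1, cap_order=False):
    """Toy sentence-level BLEU (illustrative, not the repository's function).

    Assumption: an order with no matching n-grams gets precision `floor`.
    Hypothetical remedy: `cap_order` limits the order to the sentence length.
    """
    if cap_order:
        max_order = max(1, min(max_order, len(hyp), len(ref)))
    log_sum = 0.0
    for n in range(1, max_order + 1):
        hyp_counts, ref_counts = ngrams(hyp, n), ngrams(ref, n)
        matches = sum((hyp_counts & ref_counts).values())  # clipped matches
        total = sum(hyp_counts.values())
        precision = matches / total if matches > 0 else floor
        log_sum += math.log(precision)
    # Standard brevity penalty for hypotheses shorter than the reference.
    if len(hyp) >= len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(log_sum / max_order)


def bleuvar(samples, **bleu_kwargs):
    """Mean of (1 - BLEU)^2 over all ordered pairs of samples (assumed form)."""
    pairs = list(permutations(samples, 2))
    return sum((1 - sentence_bleu(h, r, **bleu_kwargs)) ** 2 for h, r in pairs) / len(pairs)


# 20 identical two-token outputs: the model is fully consistent.
samples = [["meeting", "tomorrow"]] * 20
print(round(bleuvar(samples), 3))                  # 0.468
print(round(bleuvar(samples, cap_order=True), 3))  # 0.0
```

Under these assumptions the 3-gram and 4-gram orders do not exist for a two-token sentence, so each contributes the floor, BLEU becomes 0.1^(2/4) ≈ 0.316, and every pair adds (1 - 0.316)^2 ≈ 0.468 although all 20 outputs agree.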

This PR aims to fix this bug.

Kind regards,
Akim Tsvigun

AlexGidiotis

@Aktsvigun nice work 🚀 Could we add a unit test for this function under test/test_bleuvar.py?

Aktsvigun added 2 commits June 29, 2022 18:34

Smooth-BLEU bug fixed

67834e1

Smooth-BLEU bug fixed

ce80d73

AlexGidiotis requested changes Jul 6, 2022

Aktsvigun wants to merge 2 commits into AlexGidiotis:main from Aktsvigun:main
