Smooth-BLEU bug fixed #1

Open
Aktsvigun wants to merge 2 commits into AlexGidiotis:main from Aktsvigun:main
Conversation

@Aktsvigun

Alexios, hi once again!

I think I've found an issue in the current implementation: the smoothing method is not ideal at the moment. It still cuts the values for sentences shorter than 4 tokens (please see attached image 1). This results in unreasonably high BLEUVar scores for short sentences even when the model is confident about the instance: even when the outputs are identical across 20 MC runs, we get a minimum of 0.468 for two-token summaries.
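The 0.468 figure can be reproduced by hand under two assumptions that are not confirmed anywhere in this thread: that BLEUVar is the mean of `(1 - BLEU(y_i, y_j))**2` over all ordered pairs of MC samples, and that the smoothing replaces an empty or zero n-gram precision with a floor of 0.1. The sketch below is only an illustration of that reading; `floored_bleu` and `bleuvar` are hypothetical names and are not the functions from this repository.

```python
import math
from collections import Counter
from itertools import permutations


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def floored_bleu(hyp, ref, max_order=4, floor=0.1):
    """Sentence BLEU where an empty or zero precision is replaced by `floor`.

    ASSUMPTION: the floor of 0.1 is a guess that happens to match the
    0.468 reported above; the real smoothing may differ.
    """
    if not hyp:
        return 0.0
    log_sum = 0.0
    for n in range(1, max_order + 1):
        hyp_counts = ngram_counts(hyp, n)
        ref_counts = ngram_counts(ref, n)
        total = sum(hyp_counts.values())
        matches = sum((hyp_counts & ref_counts).values())  # clipped matches
        precision = matches / total if total and matches else floor
        log_sum += math.log(precision)
    brevity = 1.0 if len(hyp) >= len(ref) else math.exp(1 - len(ref) / len(hyp))
    return brevity * math.exp(log_sum / max_order)


def bleuvar(samples):
    """ASSUMED definition: mean squared (1 - BLEU) over ordered sample pairs."""
    n = len(samples)
    total = sum((1 - floored_bleu(a, b)) ** 2 for a, b in permutations(samples, 2))
    return total / (n * (n - 1))


# 20 identical MC samples: the model is fully "confident" in both cases.
two_tokens = [["thank", "you"]] * 20
four_tokens = [["meeting", "moved", "to", "friday"]] * 20

print(round(bleuvar(two_tokens), 3))   # 3-gram and 4-gram precisions are floored
print(round(bleuvar(four_tokens), 3))  # all four precisions equal 1
```

With two tokens, the 3-gram and 4-gram precisions have nothing to count, so BLEU becomes `(1 * 1 * 0.1 * 0.1) ** 0.25`, roughly 0.316, and every pair contributes `(1 - 0.316) ** 2`, roughly 0.468, even though all samples are identical. With four tokens the same identical samples give a BLEUVar of 0.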

This is especially severe for the AESLC dataset, where the summaries are extremely short (please see the distribution of gold-summary lengths after tokenization in image 2). To be precise, 49.5% of the summaries in the train sample are shorter than 4 tokens after tokenization, which means that the current BLEUVar calculation is heavily biased whenever summaries are short.

This pull request aims to fix this bug.

Kind regards,
Akim Tsvigun

image
image

@AlexGidiotis AlexGidiotis left a comment

Owner

@Aktsvigun nice work 🚀 Could we add a unit test for this function under test/test_bleuvar.py?