I was unable to achieve the result shown in the UDOP paper.
I used the udop-unimodel-large-224 checkpoint.
My ANLS score is 0.407903.
This is nowhere near the 0.461 reported in the paper's results table.

Since I noticed that the batch size, warmup steps, and weight decay given in https://github.com/microsoft/i-Code/blob/main/i-Code-Doc/scripts/finetune_duebenchmark.sh differ from those reported in the paper, I also tried changing the finetuning script to use the paper's settings.
Lastly, I also tried adding the task prompt prefix, since the existing code does not add one. I followed the approach from #71 (comment).
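For clarity, the prefix change I made amounts to something like the sketch below; the exact prefix string is my own placeholder, not the wording from #71.

```python
# Hypothetical sketch: prepend a task prompt prefix to each question before
# tokenization. The prefix text here is an assumption for illustration only;
# the exact string should follow #71 (comment).
def add_task_prefix(question: str, prefix: str = "question answering.") -> str:
    return f"{prefix} {question}"

print(add_task_prefix("What is the invoice total?"))
```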
Results of the 3 different finetuning configurations:
| Task prefix | Hyperparameter settings | ANLS score |
| --- | --- | --- |
| No | Unchanged finetuning script | 0.407903 |
| No | Paper's settings | 0.40174 |
| Yes | Unchanged finetuning script | 0.408355 |
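For reference, the scores above are ANLS values. A minimal sketch of the metric as commonly defined for DocVQA-style evaluation (average normalized Levenshtein similarity with threshold 0.5) is below; the lowercase/strip normalization is my assumption, and the exact evaluator in due-benchmark may differ in details.

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def anls(predictions, gold_answer_lists, tau=0.5):
    # Per question, take the best normalized similarity over all gold
    # answers; similarities below tau count as 0; average over questions.
    total = 0.0
    for pred, golds in zip(predictions, gold_answer_lists):
        best = 0.0
        for gold in golds:
            p, g = pred.strip().lower(), gold.strip().lower()
            nl = levenshtein(p, g) / max(len(p), len(g), 1)
            best = max(best, 1.0 - nl)
        total += best if best >= tau else 0.0
    return total / len(predictions)

print(anls(["paris"], [["Paris"]]))  # exact match after normalization -> 1.0
```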
Other changes I made:
- Changed to use PyTorch's AdamW, based on "loss does not have a grad fn" #63 (comment)
- Within `baselines-master` in the due-benchmark repo (`baselines-master/benchmarker/data/utils.py`), I changed the `dtype` of `label_name` from `U100` to `U1024` to prevent truncation of questions during display

Please assist.
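For context on the `dtype` change: NumPy fixed-width unicode arrays silently clip strings longer than the declared width, which is what truncated long questions. A minimal demonstration (`long_question` is a stand-in, not actual benchmark data):

```python
import numpy as np

# Fixed-width unicode dtypes silently truncate anything longer than the
# declared width; widening from U100 to U1024 keeps long questions intact.
long_question = "q" * 200

truncated = np.array([long_question], dtype="U100")
widened = np.array([long_question], dtype="U1024")

print(len(truncated[0]))  # 100 -- the tail of the string is lost
print(len(widened[0]))    # 200 -- the full string survives
```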