Method comparison: Image benchmark shows sample images #3309
Open
BenjaminBossan wants to merge 2 commits into
Conversation
The Gradio space for the PEFT benchmarks already contains logic to show the image generation results once the experiments have been run and pushed to the repo. However, it doesn't contain any logic to show generated sample images. Those are quite important, as the metrics are not sufficient to estimate the quality of the fine-tuned model.
This PR adds a gallery element below the dataframe that shows sample images. By default, it just shows the training images from the dataset to give viewers an impression of what they're dealing with. If a user clicks on a row in the dataframe, e.g. the row for LoRA, the five sample images for that PEFT method are shown (if they can be found in the HF bucket). There is a toggle for the user to switch between showing generated sample images and training images.
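The selection rule described above can be sketched as a small pure function, independent of the Gradio wiring. This is only an illustration, not the code from this PR: the function name, its parameters, and the choice to return an empty list when no samples are found for a method are all assumptions.

```python
def pick_gallery_images(show_generated, selected_method, generated_images, training_images):
    """Return the list of image paths the gallery should display.

    All names here are hypothetical; they are not taken from the PR.
    """
    # Default view: training images, also used when the toggle is off
    # or when no dataframe row has been clicked yet.
    if not show_generated or selected_method is None:
        return training_images
    # Assumption: if no samples exist for the method, show nothing.
    return generated_images.get(selected_method, [])


training = ["train_0.png", "train_1.png"]
generated = {"lora": [f"lora_{i}.png" for i in range(5)]}
lora_images = pick_gallery_images(True, "lora", generated, training)
```

In a Gradio app, a function like this would typically be called from the dataframe's select event and from the toggle's change event, with its return value passed to the gallery component.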
Moreover, in this PR, I made a change to use `json.dumps` to put the configs into the dataframe instead of simply coercing them to strings. That makes the data easier to parse.
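To illustrate why this matters, here is a minimal sketch comparing the two approaches. The config dict is made up for the example and is not a real config from the benchmark.

```python
import json

# Hypothetical config values, for illustration only.
config = {"peft_type": "LORA", "r": 8, "use_dora": False, "target_modules": None}

as_str = str(config)          # Python repr: single quotes, False, None
as_json = json.dumps(config)  # valid JSON: double quotes, false, null

# The JSON string round-trips back to the original dict.
restored = json.loads(as_json)

# The repr string is not valid JSON, so json.loads rejects it.
try:
    json.loads(as_str)
    repr_is_valid_json = True
except json.JSONDecodeError:
    repr_is_valid_json = False
```

The `str` version can only be recovered with something like `ast.literal_eval`, whereas the JSON version can be parsed by any JSON-aware consumer.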
I also noticed that the dataframe showed the union of the metrics across all tasks, including metrics that are not shared between them. So, for example, the image gen task would show a test accuracy column, and MetaMath would show a DINO similarity column. The PR changes the logic to only show the metrics relevant to the selected task.
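One way to express this kind of filtering is sketched below, using plain column-name lists instead of the actual dataframe. The task keys, column names, and the `TASK_METRICS` mapping are assumptions made for the example, not identifiers from the PR.

```python
# Hypothetical mapping from task to its task-specific metric columns.
TASK_METRICS = {
    "metamath": ["test_accuracy"],
    "image_gen": ["dino_similarity"],
}


def relevant_columns(task, all_columns, task_metrics=TASK_METRICS):
    """Drop metric columns that belong only to other tasks."""
    own = set(task_metrics[task])
    # Metrics listed for other tasks, minus any that this task also uses.
    foreign = {m for t, ms in task_metrics.items() if t != task for m in ms} - own
    return [c for c in all_columns if c not in foreign]


columns = ["peft_type", "test_accuracy", "dino_similarity", "train_time"]
image_gen_columns = relevant_columns("image_gen", columns)
```

Columns that are not task-specific metrics, such as the hypothetical `train_time`, pass through unchanged.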
Moreover, the image gen metrics were on the far right of the dataframe, making them hard to view. Now the order puts them further to the left.
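A reordering of this kind could look like the sketch below. The helper name and the column names are hypothetical, and the actual ordering used in the PR may differ.

```python
def move_to_front(columns, priority, keep_first=1):
    """Place `priority` columns right after the first `keep_first` columns."""
    head = columns[:keep_first]
    # Only promote columns that exist and are not already in the head.
    promoted = [c for c in priority if c in columns and c not in head]
    rest = [c for c in columns[keep_first:] if c not in promoted]
    return head + promoted + rest


cols = ["peft_type", "train_time", "file_size", "dino_similarity"]
new_order = move_to_front(cols, ["dino_similarity"])
```

With pandas, the resulting list can be applied by indexing the dataframe with it, e.g. `df[new_order]`.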
Finally, I added a short task description above the dataframe so that viewers can understand at a glance what they're looking at. It includes a link to the checkpoints on HF buckets.