Hi, thank you for open-sourcing the training data.
While inspecting the cold-start data, I noticed that most samples do not seem to include an image input field: 63,636 of the 65,807 samples. Of the remaining 2,171 samples that do include images, only 356 appear to contain the original question image; in the other 1,815, the only images seem to come from tool-call outputs rather than from the input question itself.
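For anyone who wants to reproduce this kind of tally, here is a minimal sketch. The field names (`images`, `messages`, `role`, `content`) and the `<image>` placeholder convention are assumptions for illustration only, not the confirmed schema of the released data, so they would need to be adapted to the actual files.

```python
from collections import Counter


def classify(sample):
    """Put one sample into 'no_image', 'question_image', or 'tool_image_only'.

    All keys used here are hypothetical; adjust them to the real schema.
    """
    images = sample.get("images") or []  # hypothetical image input field
    if not images:
        return "no_image"
    # Assumed convention: a question image is referenced by an "<image>"
    # placeholder inside a user turn.
    for message in sample.get("messages", []):
        if message.get("role") == "user" and "<image>" in str(message.get("content", "")):
            return "question_image"
    # Images exist, but none is referenced from the user question.
    return "tool_image_only"


def summarize(samples):
    """Count how many samples fall into each category."""
    return Counter(classify(sample) for sample in samples)
```

Calling `summarize` on the loaded list of samples returns a `Counter` with one entry per category, which can be compared against the numbers above.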
Could you clarify whether this is intentional, or whether the uploaded cold-start data might be incomplete or not the intended version?
Thank you.