Hi,
I collected the simulation data using your repositories as they are. When I reproduce the results with your bash script, I get the following:
for pi0:
Indep:
seen: \mstd{81.52}{3.26}
unseen: \mstd{80.23}{10.78}
LSTM:
seen: \mstd{85.19}{1.68}
unseen: \mstd{81.09}{10.26}
for open-pi0:
Indep:
seen: \mstd{95.87}{1.07}
unseen: \mstd{82.14}{7.05}
LSTM:
seen: \mstd{92.88}{0.91}
unseen: \mstd{74.26}{7.41}
For the openvla experiments, the unseen score drops by around 5.
What could be the reason for these drastic changes? Am I missing something? I thought that for pi0 it could be due to a newer version of the fine-tuned models, but after seeing the change in the simplerenv version as well, I'm not so sure. Could the seed used during data collection still cause the collected data to differ?