Thanks for your effort. I am trying to reproduce the results with your code, but I have a few questions. The matadata.txt file contains a weight input for the value function r. When I trained the walk model without any modification, the final result was reasonably good. But when I trained the run model or other models, training did not reach a normal result even after a long time.