How to make the model consider immediate reward? #275

I want to use the immediate reward from the environment to train my RL model. As described in the documentation, I implemented the "reward" function in the "Environment" class.

However, when I checked the loss calculation flow in train.py, losses['v'] seems to consider only the value output by the model and the outcome from the environment. I also found that losses['r'] is the term that takes the rewards from the environment into account.

Does this mean that my model also needs to output a "return" value?
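
To make the question concrete, here is a minimal sketch of how I read the flow. It is not the actual train.py code: the function names `compute_losses` and `discounted_returns`, the target keys, and the plain-list arithmetic are all my own assumptions for illustration. The point is only that a loss term seems to be added when the model's output contains the matching key.

```python
# Simplified sketch of my reading of the loss flow; NOT the real train.py.
# All names here (compute_losses, discounted_returns, the target keys) are
# hypothetical and chosen only to illustrate the question.

def discounted_returns(rewards, gamma):
    """Discounted reward-to-go for each step of one episode."""
    returns, acc = [], 0.0
    for r in reversed(rewards):
        acc = r + gamma * acc
        returns.append(acc)
    return returns[::-1]


def compute_losses(outputs, targets):
    """Build a dict of losses, one term per output the model actually produced."""
    losses = {}
    if 'value' in outputs:
        # The value head is compared with the final outcome of the episode.
        losses['v'] = sum((v - t) ** 2
                          for v, t in zip(outputs['value'], targets['outcome'])) / 2
    if 'return' in outputs:
        # The return head is compared with the discounted immediate rewards.
        losses['r'] = sum(abs(r - t)
                          for r, t in zip(outputs['return'], targets['return']))
    return losses


rewards = [1.0, 0.0, 2.0]  # immediate rewards from the environment
targets = {
    'outcome': [1.0, 1.0, 1.0],
    'return': discounted_returns(rewards, gamma=0.5),
}

# A model with only a value head never touches the rewards.
print(compute_losses({'value': [0.0, 0.0, 0.0]}, targets))

# A model that also outputs "return" gets the reward-based term.
print(compute_losses({'value': [0.0, 0.0, 0.0],
                      'return': [1.5, 1.0, 1.0]}, targets))
```

If this reading is right, implementing "reward" in the environment alone would have no effect on training unless the model also produces a "return" output.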
