How to make the model consider immediate reward? #275

Description

@glitter2626

I want to use the immediate reward from the environment to train my RL model. As described in the documentation, I implemented the "reward" function in the "Environment" class.

However, when I traced the loss calculation in train.py, losses['v'] seems to consider only the value output by the model and the outcome from the environment. I also found that losses['r'] takes the rewards from the environment into account.
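To make the distinction concrete, here is a minimal sketch of the two kinds of regression target described above. It is not copied from train.py: the helper names `discounted_returns` and `squared_error`, the discount factor `gamma`, and the sample numbers are all assumptions made for illustration.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return-to-go for each step: G_t = r_t + gamma * G_{t+1}.

    Hypothetical helper; `gamma` is an assumed discount factor.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each step can reuse the next step's return.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def squared_error(predictions, targets):
    """Half the sum of squared errors (hypothetical stand-in for a loss term)."""
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / 2


# Illustrative episode: per-step rewards plus a single final outcome.
rewards = [0.0, 1.0, 0.0, 2.0]
outcome = 1.0

# A "value"-style target repeats the final outcome at every step,
# so the immediate rewards never enter it.
value_targets = [outcome] * len(rewards)

# A "return"-style target is built from the immediate rewards.
return_targets = discounted_returns(rewards, gamma=0.5)

# Made-up model outputs, only to show that each head is fit to its own target.
value_loss = squared_error([0.5, 0.5, 0.5, 0.5], value_targets)
return_loss = squared_error([1.0, 1.0, 1.0, 1.0], return_targets)
```

In this sketch the immediate rewards influence training only through `return_targets`, which is why the question of a separate "return" output comes up.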

Does this mean that my model also needs to output a "return" value?
