update_q_table question #4

Hi there,

I noticed that the `max_future_q` value in this function `def update_q_table(self, LEARNING_RATE, DISCOUNT, old_paths, new_paths)` might be incorrect. It appears that `max_future_q` is still using the state of `old_paths.path_queues`, whereas it should be using `new_paths.path_queues`. Could you please confirm if the following correction is valid? Thank you very much!

    # Compute indices for the new state (next state)
    future_indices = [
        math.ceil(min(10, new_paths.path_queues[0] / 10)),  # New state (queue 1)
        math.ceil(min(10, new_paths.path_queues[1] / 10))   # New state (queue 2)
    ]

    # Get the best Q-value for the new state
    max_future_q = np.max(self.q_table[:, future_indices[0], future_indices[1]])
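For context, here is a minimal standalone sketch of how I understand the full update would look with this change. It assumes the standard tabular Q-learning rule and a Q-table laid out as `(action, queue 1 bucket, queue 2 bucket)`, which is what the `self.q_table[:, i, j]` indexing suggests. The helper names `state_indices` and `q_update`, and the `action` and `reward` arguments, are hypothetical and not taken from the repository.

```python
import math
import numpy as np


def state_indices(path_queues):
    """Bucket the first two queue lengths into indices 0..10 (same formula as above)."""
    return tuple(math.ceil(min(10, q / 10)) for q in path_queues[:2])


def q_update(q_table, action, old_queues, new_queues, reward, learning_rate, discount):
    """Apply one tabular Q-learning update in place and return the new Q-value.

    Assumes q_table has shape (n_actions, 11, 11); this layout is a guess.
    """
    old_i, old_j = state_indices(old_queues)   # state the action was taken in
    new_i, new_j = state_indices(new_queues)   # state reached after the action

    current_q = q_table[action, old_i, old_j]
    # The best future value is read from the NEW state, not the old one.
    max_future_q = np.max(q_table[:, new_i, new_j])

    new_q = (1 - learning_rate) * current_q + learning_rate * (reward + discount * max_future_q)
    q_table[action, old_i, old_j] = new_q
    return new_q
```

With the old behaviour, `max_future_q` would be read from the same cell group as `current_q`, so the update would bootstrap from the state the agent just left rather than the one it arrived in.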
