In Listing 3.7, we use both memory replay and a target network to improve stability.
However, in the memory loop:
if len(replay) > batch_size:
    minibatch = random.sample(replay, batch_size)
    ...
    action_batch = torch.Tensor([a for (s1,a,r,s2,d) in minibatch])
The interpreter raises this error:
---> 42 action_batch = torch.Tensor([a for (s1,a,r,s2,d) in minibatch])
too many dimensions 'str'
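I suppose that when we store an experience in memory, the action is recorded as a character (a string). Here, however, the corresponding integer index is needed, because torch.Tensor cannot be built from a list of strings.
So I propose adding a reverse action set, a mapping from each action character back to its index, to perform this conversion.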
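A minimal sketch of that idea, assuming the forward mapping is a dict named action_set that maps indices 0..3 to the characters 'u', 'd', 'l', 'r' (the issue does not show it, so treat the name and contents as assumptions). The minibatch below is hypothetical placeholder data, used only to show the conversion:

```python
# Assumed forward mapping from action index to move character (not shown in the issue).
action_set = {0: 'u', 1: 'd', 2: 'l', 3: 'r'}

# Reverse action set: move character -> action index.
reverse_action_set = {move: idx for idx, move in action_set.items()}

# Hypothetical minibatch of (s1, a, r, s2, d) tuples in which the action was stored as a character.
minibatch = [
    (None, 'u', -1, None, False),
    (None, 'r', 10, None, True),
]

# Convert each stored character back to its integer index before building the tensor.
action_indices = [reverse_action_set[a] for (s1, a, r, s2, d) in minibatch]
print(action_indices)  # [0, 3]
```

The resulting list of integers can then be passed to torch.Tensor in place of the list of characters. An alternative that avoids the lookup entirely would be to store the integer index, rather than the character, when appending to replay.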