RL examples have bugs on the current gym version #1213

## Context


* PyTorch version:
* Operating System and version: Ubuntu 20

## Your Environment

* Installed using source? [yes/no]:
* Are you planning to deploy it using docker container? [yes/no]:
* Is it a CPU or GPU environment?:
* Which example are you using: reinforcement_learning
* Link to code or data to repro [if any]:

## Expected Behavior
The example scripts (reinforce.py and actor_critic.py) should run without errors.


## Current Behavior
Running either script (reinforce.py or actor_critic.py) raises the following error:
```text
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-263240bbee7e> in <cell line: 1>()
----> 1 main()

<ipython-input-4-6af08085b221> in main()
     87     running_reward = 10
     88     for i_episode in count(1):
---> 89         state, _ = env.reset()
     90         ep_reward = 0
     91         for t in range(1, 10000):  # Don't infinite loop while learning

ValueError: too many values to unpack (expected 2)
```
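
A likely reading of this traceback, offered as an assumption rather than a confirmed diagnosis: under gym 0.25.2, `env.reset()` returns only the observation by default, whereas gym 0.26 and later return an `(observation, info)` pair. If the observation holds four values (as CartPole's does), unpacking it into two names fails with this kind of message. The sketch below reproduces the failure without gym installed; `observation` is a hypothetical stand-in for the value returned by `env.reset()`.

```python
# Hypothetical stand-in for what env.reset() returns under gym 0.25.2:
# a bare four-value observation rather than an (observation, info) pair.
observation = [0.01, -0.02, 0.03, 0.04]

try:
    # Same unpacking pattern as line 89 in the traceback above.
    state, _ = observation
    message = None
except ValueError as exc:
    message = str(exc)

# Prints the ValueError text produced by the failed unpacking.
print(message)
```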

## Possible Solution
I opened a pull request with a fix that runs on my system (gym version 0.25.2):
https://github.com/pytorch/examples/pull/1212
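
Separately from whatever the linked pull request changes (its contents are not shown here), one possible way to make a training loop tolerate both gym API generations is to normalize the return values of `reset()` and `step()`. This is a minimal sketch under stated assumptions: `reset_env` and `step_env` are hypothetical helper names, not part of gym or of the example scripts, and the dict check on the second element is a heuristic that could misfire for environments whose observations are themselves two-element tuples ending in a dict.

```python
def reset_env(env):
    """Return only the observation, whichever reset() convention env follows."""
    result = env.reset()
    # gym >= 0.26 returns (observation, info); older releases return the
    # observation alone by default. The dict check is a heuristic.
    if isinstance(result, tuple) and len(result) == 2 and isinstance(result[1], dict):
        return result[0]
    return result


def step_env(env, action):
    """Return (observation, reward, done) for 4-value and 5-value step() results."""
    result = env.step(action)
    if len(result) == 5:
        # Newer API: (observation, reward, terminated, truncated, info).
        obs, reward, terminated, truncated, _info = result
        return obs, reward, bool(terminated or truncated)
    # Older API: (observation, reward, done, info).
    obs, reward, done, _info = result
    return obs, reward, bool(done)
```

With helpers like these, a loop could call `state = reset_env(env)` and `state, reward, done = step_env(env, action)` instead of unpacking the raw return values directly.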

## Steps to Reproduce


1. Go to the reinforcement_learning folder.
2. Run actor_critic.py or reinforce.py with gym version 0.25.2.
...

## Failure Logs [if any]



```[tasklist]
### Tasks
- [ ] https://github.com/pytorch/examples/pull/1212
```
