@MeFredFeng

No description provided.

Add environment configuration for macOS and IntelliJ IDEA gitignore
Implement value updates and action selection in Value Iteration
Implement policy evaluation and update in Policy Iteration; optimize value updates in Value Iteration
@MeFredFeng MeFredFeng closed this Sep 21, 2025
@MeFredFeng MeFredFeng reopened this Sep 21, 2025
@MeFredFeng
Author

This pull request adds a macOS environment file, implements the core steps of the Policy Iteration and Value Iteration solvers, and updates the IntelliJ IDEA ignore rules. The changes are grouped below by theme.

Environment Setup:

  • Added a new environment_mac_mod.yml file to specify Python and package dependencies for the project, making it easier to set up the development environment on macOS.

IDE Configuration:

  • Updated .idea/.gitignore to ignore default IDE files and folders, reducing noise from editor-specific artifacts in version control.

Policy Iteration Algorithm Improvements (Solvers/Policy_Iteration.py):

  • Implemented the policy improvement step in train_episode by selecting the best action using one-step lookahead and updating the policy accordingly.
  • Added a matrix-based policy evaluation method in policy_eval, constructing transition and reward matrices and solving for state values using linear algebra.
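
For reviewers, the two steps above can be sketched roughly as follows. This is a minimal illustration and not the code in Solvers/Policy_Iteration.py: the toy MDP, the Gym-style `P[s][a] = [(prob, next_state, reward, done), ...]` layout, and the helper names `solve_linear`, `one_step_lookahead`, and `policy_iteration` are all assumptions made for the example. Evaluation solves the linear system (I - gamma * P_pi) V = R_pi exactly; the hand-written `solve_linear` only keeps the sketch dependency-free, and a routine such as `numpy.linalg.solve` would normally take its place.

```python
# Hypothetical 2-state, 2-action MDP (assumed Gym-style layout):
# P[s][a] = [(prob, next_state, reward, done), ...]
P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 1.0, False)]},
    1: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 2.0, False)]},
}
GAMMA = 0.9


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def policy_eval(policy, P, gamma):
    """Exact policy evaluation: solve (I - gamma * P_pi) V = R_pi."""
    n = len(P)
    A = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    R = [0.0] * n
    for s in range(n):
        for prob, s2, reward, done in P[s][policy[s]]:
            R[s] += prob * reward
            if not done:  # terminal transitions contribute no future value
                A[s][s2] -= gamma * prob
    return solve_linear(A, R)


def one_step_lookahead(s, V, P, gamma):
    """Return the action values Q(s, a) for every action a in state s."""
    return [
        sum(prob * (reward + (0.0 if done else gamma * V[s2]))
            for prob, s2, reward, done in P[s][a])
        for a in sorted(P[s])
    ]


def policy_iteration(P, gamma):
    """Alternate exact evaluation and greedy improvement until stable."""
    policy = [0] * len(P)
    while True:
        V = policy_eval(policy, P, gamma)
        new_policy = []
        for s in range(len(P)):
            q = one_step_lookahead(s, V, P, gamma)
            new_policy.append(q.index(max(q)))  # ties go to the lowest index
        if new_policy == policy:
            return policy, V
        policy = new_policy
```

On this toy MDP, `policy_iteration(P, GAMMA)` settles on action 1 in both states, with state values of about 19 and 20.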

Value Iteration Algorithm Improvements (Solvers/Value_Iteration.py):

  • Implemented value updates in train_episode using one-step lookahead and selecting the best action value for each state.
  • Improved policy_fn to select the best action for a state based on the computed state values.
  • Enhanced prioritized sweeping in train_episode by updating the priority queue based on value changes and performing targeted value updates.
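
As a rough illustration of the three items above, the sketch below shows a full-sweep value update, greedy action selection, and one possible form of prioritized sweeping. It is not the code in Solvers/Value_Iteration.py: the toy MDP, the Gym-style `P[s][a] = [(prob, next_state, reward, done), ...]` layout, and the function names are assumptions for the example. The priority queue here uses Python's `heapq` with negated Bellman errors and tolerates stale entries by rechecking the error on pop; the PR's own queue may work differently.

```python
import heapq

# Hypothetical 2-state, 2-action MDP (assumed Gym-style layout):
# P[s][a] = [(prob, next_state, reward, done), ...]
P = {
    0: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 1.0, False)]},
    1: {0: [(1.0, 0, 0.0, False)], 1: [(1.0, 1, 2.0, False)]},
}
GAMMA = 0.9


def one_step_lookahead(s, V, P, gamma):
    """Return the action values Q(s, a) for every action a in state s."""
    return [
        sum(prob * (reward + (0.0 if done else gamma * V[s2]))
            for prob, s2, reward, done in P[s][a])
        for a in sorted(P[s])
    ]


def value_iteration(P, gamma, theta=1e-8):
    """Sweep all states, backing each up to its best action value."""
    V = [0.0] * len(P)
    while True:
        delta = 0.0
        for s in range(len(P)):
            best = max(one_step_lookahead(s, V, P, gamma))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < theta:
            return V


def greedy_action(s, V, P, gamma):
    """Pick the action with the highest one-step lookahead value."""
    q = one_step_lookahead(s, V, P, gamma)
    return q.index(max(q))


def prioritized_sweeping(P, gamma, theta=1e-8):
    """Update the state with the largest Bellman error first."""
    n = len(P)
    V = [0.0] * n
    # Predecessors: states that can reach s2 in one step.
    preds = {s: set() for s in range(n)}
    for s in range(n):
        for a in P[s]:
            for prob, s2, _, _ in P[s][a]:
                if prob > 0:
                    preds[s2].add(s)

    def bellman_error(s):
        return abs(max(one_step_lookahead(s, V, P, gamma)) - V[s])

    # heapq is a min-heap, so priorities are stored negated.
    heap = [(-bellman_error(s), s) for s in range(n)]
    heapq.heapify(heap)
    while heap:
        _, s = heapq.heappop(heap)
        if bellman_error(s) < theta:
            continue  # stale or already converged entry
        V[s] = max(one_step_lookahead(s, V, P, gamma))
        for p in preds[s]:  # a change in V[s] can change predecessors' errors
            err = bellman_error(p)
            if err >= theta:
                heapq.heappush(heap, (-err, p))
    return V
```

On this toy MDP both `value_iteration(P, GAMMA)` and `prioritized_sweeping(P, GAMMA)` approach state values of 19 and 20, and `greedy_action` then selects action 1 in each state.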
