
Reinforcement Learning_Code_Policy Gradient

2023-04-10 23:35 Author: 別叫我小紅

The following results and code are an implementation of policy gradient, namely REINFORCE, in Gymnasium's Cart Pole environment.

RESULTS:

Visualizations of (i) changes in scores and losses, and (ii) animation results.

Since REINFORCE relies on Monte Carlo estimation, it converges slowly, and it had not converged after 10,000 steps.
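To illustrate why Monte Carlo estimation is slow, here is a minimal sketch of the two quantities REINFORCE computes from one finished episode: the discounted return at every step and the resulting loss. It is not the code from this post. The function names and the discount factor `gamma=0.98` are assumptions chosen for illustration, and plain Python floats stand in for the tensors a real agent would use.

```python
def discounted_returns(rewards, gamma=0.98):
    """Compute G_t = r_t + gamma * r_{t+1} + ... for every step t of one episode.

    gamma=0.98 is an illustrative default, not a value taken from the post.
    """
    returns = [0.0] * len(rewards)
    g = 0.0
    # Walk backwards so each return reuses the one that follows it.
    for t in reversed(range(len(rewards))):
        g = rewards[t] + gamma * g
        returns[t] = g
    return returns


def reinforce_loss(log_probs, returns):
    """Monte Carlo policy gradient loss: -sum_t log pi(a_t | s_t) * G_t."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))
```

The returns can only be computed once the episode has ended, and each one is a single noisy sample of the true expected return. This gives unbiased but high-variance gradient estimates, which is consistent with the slow convergence reported here.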

However, the result is reasonably good, and the agent would hopefully score more than 200 points given more steps.

Fig. 1. Changes in scores and losses.

Fig. 2. Animation results.


CODE:

NetWork.py


REINFORCEAgent.py


train_and_test.py


The above code is mainly based on Chapter 9 of Hands-on Reinforcement Learning [1] and my previous implementation of value function approximation with Monte Carlo [2].


Reference

[1] https://hrl.boyuai.com/

[2] https://www.bilibili.com/read/cv22924612


