minimaxir 282 days ago [-]
This developed-and-maintained package is a good approach towards furthering RL development; as the writeups state, the biggest problem in RL is subtle bugs from an implementation which don't cause an error but tank learning performance. (+ loggers/utils to help debug things)

Granted, a lot of RL thought pieces/examples on places like take an existing RL implementation without many tweaks, run it on a new task, and see what happens. A better RL library might make this workflow more prevalent; hence why it's very important for researchers to make their pipelines transparent.
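
To make the "subtle bugs that don't cause an error" point concrete, here is a minimal sketch in plain Python. It is not taken from Spinning Up; the function names and the `dones` episode-end flags are assumptions chosen for illustration. Both functions run without complaint, but the second one lets reward from the next episode leak backwards across an episode boundary, which is the kind of mistake that can quietly hurt learning.

```python
def discounted_rewards_to_go(rewards, dones, gamma=0.99):
    """Discounted return from each timestep, reset at episode ends.

    `dones[t]` is True when step t is the last step of an episode
    (hypothetical convention for this sketch).
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        if dones[t]:
            running = 0.0  # nothing from the following episode may leak in
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def buggy_rewards_to_go(rewards, dones, gamma=0.99):
    """Same loop, but `dones` is ignored: no error, silently wrong values."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0, 1.0]
    dones = [False, True, False, True]  # two episodes of two steps each
    print(discounted_rewards_to_go(rewards, dones, gamma=0.5))
    print(buggy_rewards_to_go(rewards, dones, gamma=0.5))
```

A small hand-computed check like the one in the `__main__` block is one cheap way to catch this class of bug before it shows up only as a flat learning curve.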

jrx 282 days ago [-]
I've made some effort to provide a set of similar high-quality implementations available in PyTorch:

In my opinion, PyTorch code is easier for newcomers to understand and debug. The code is definitely lacking in documentation, but whenever there was a tradeoff between clarity and modularity, in the end I've chosen modularity. Ideally I would like others to be able to take bits and pieces and incorporate them into their projects to speed up time to delivery of their ideas.

mike_mg 282 days ago [-]
+1 on that, that's a great project.

PyTorch, with its explicit state that can be easily examined by hand in the PyCharm debugger, will be way easier for people coming into the field.

dimitry12 282 days ago [-]
This is awesome and I hope will allow more people to experiment with algorithms, instead of only re-applying OpenAI's baselines. Baselines are great, but are very hard (for me, at least) to tinker with.

It helps me to understand something new if I can controllably break it. In other words, I progress by predicting the edge conditions under which something shouldn't work, and then testing whether the algorithm indeed experienced the expected type of failure. A transparent algorithm implementation is key for this.
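
A toy version of this "predict the failure, then test it" workflow, as a hedged sketch: it uses a two-armed bandit with an epsilon-greedy agent in plain Python, not any Spinning Up algorithm. The function `run_bandit`, the payout values, and the tie-breaking rule are all assumptions made up for illustration. The prediction is that with `epsilon=0` the agent never explores, locks onto the first arm it tries, and never discovers the better one.

```python
import random


def run_bandit(epsilon, steps=1000, seed=0):
    """Epsilon-greedy agent on a two-armed bandit with fixed payouts.

    Returns the final value estimates and how often each arm was pulled.
    """
    rng = random.Random(seed)
    true_rewards = [0.1, 1.0]  # deterministic payouts; arm 1 is better
    q = [0.0, 0.0]             # value estimates, both start at zero
    counts = [0, 0]
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(2)            # explore: pick an arm at random
        else:
            arm = 0 if q[0] >= q[1] else 1    # exploit: ties go to arm 0
        reward = true_rewards[arm]
        counts[arm] += 1
        q[arm] += (reward - q[arm]) / counts[arm]  # incremental mean update
    return q, counts


if __name__ == "__main__":
    # Predicted failure: no exploration, so arm 1 is never pulled.
    print(run_bandit(epsilon=0.0))
    # With some exploration the agent should find and prefer arm 1.
    print(run_bandit(epsilon=0.1))
```

If the `epsilon=0` run did pull arm 1, that would mean the mental model of the algorithm (or the implementation) is wrong, which is exactly the signal this kind of experiment is meant to produce.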

One thing I immediately checked in the spinningup repo is whether it uses TF Eager, and it doesn't. @OpenAI what's your reasoning for that?

jachiam 282 days ago [-]
Hi! Primary developer for Spinning Up here. The code for this was developed mostly in June and July this year, and Eager still felt relatively new to me. I wanted to wait for Eager to stabilize and hit maturity before investing in it. I also wanted to see how TF would change on the road to TF 2.0, since that could change the picture even more.

At the six-month review in 2019, we'll evaluate whether it makes sense to rewrite the implementation examples for TF 2.0. I'll speculate that the answer will be "yes, it does." Since Eager execution will be a central feature of TF 2.0, the (probable) revamp for Spinning Up will include it.

Good luck with your experiments! And please let us know about your experience with Spinning Up---we want to make sure it fulfills the mission of helping anyone learn about deep RL, and user feedback is vital for that.

dimitry12 282 days ago [-]
Thank you for sharing your thought process!
browsercoin 282 days ago [-]
whenever I see a high quality submission I bookmark it and promise myself to come back and spend time learning it.

this time...I promise myself it's different

pretty_dumm_guy 282 days ago [-]
Hi! I really appreciate you sharing this with the community. The documents and code look really clear and concise. I do have one question. Is it possible to change the dependency on the Mujoco engine to something else (e.g. Roboschool)?

I don't have access to a computer with a GPU and I am currently using Google Colab to do my DL projects. I tried installing Mujoco on Colab but unfortunately, the computer id generated seems invalid. Any help is highly appreciated.

Thank you!

jachiam 282 days ago [-]

A few people had this question on Twitter also. Our response: "Several of us at OpenAI are thinking seriously about how to make something like this happen! I can't promise anything, but we definitely want to remove barriers to entry."

In the meantime, you can still use Spinning Up with the Classic Control and Box2d envs in Gym (which don't require any licenses at all). And what's more: for most of these environments you don't need a GPU! CPU is fine.

pretty_dumm_guy 282 days ago [-]
Thank you for replying promptly. I am willing to help with such a change, if planned. Meanwhile I'll get started with running spinning-up on my laptop.
nshr 282 days ago [-]
This looks like an awesome initiative! I think it will be very valuable for people trying to enter the field. I particularly like the clear advice on how to get started doing RL research. Have you considered setting up a forum for the community to share their experiences?
rcshubhadeep 282 days ago [-]
Discovered two small issues in the doc. Where can I send feedback?
jachiam 282 days ago [-]
Let us know by opening an issue on Github:
rcshubhadeep 282 days ago [-]
Perfect. Thanks
wnevets 282 days ago [-]
is there a Dockerfile with everything set up already?
jachiam 282 days ago [-]
No, but please open up an issue on Github and we'll look into making one! :)

dcdulin 282 days ago [-]
are we expecting there to be a growing demand for RL skills in industry over time? anyone know projections on this?
enygmata 282 days ago [-]
I thought this was something about roguelikes. :(