Welcome to LearnRL’s community !¶
LearnRL is a library to use and learn reinforcement learning. It’s also a community of supportive enthusiasts who love to share and build RL-based AI projects ! We would love to help you make projects with LearnRL, so join us on Discord !
About LearnRL¶
LearnRL is a framework to use and learn reinforcement learning, with a wandb integration for rich visualisation ! Our motto is clean, shareable and readable Agents ! As such, you can plug and play agents on any environment, but also look at how agents are built to learn !
- Also, LearnRL is cross-platform compatible ! That’s why no agents are built into learnrl itself, but you can check the tutorials listed under Get started below.
You can build and run your own Agent in a clear and shareable manner !
import learnrl as rl
import gym

class MyAgent(rl.Agent):

    def act(self, observation, greedy=False):
        """ How the Agent acts given an observation """
        ...
        return action

    def learn(self):
        """ How the Agent learns from its experiences """
        ...
        return logs

    def remember(self, observation, action, reward, done, next_observation=None, info={}, **param):
        """ How the Agent will remember experiences """
        ...

env = gym.make('FrozenLake-v0', is_slippery=True)  # This could be any gym Environment !
agent = MyAgent(env.observation_space, env.action_space)

pg = rl.Playground(env, agent)
pg.fit(2000, verbose=1)
Note that ‘learn’ and ‘remember’ are optional, so this framework can also be used for baselines !
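For instance, a pure baseline only needs act. Here is a minimal sketch of such an agent; the __init__ shown is an assumption (the base rl.Agent constructor is not documented here), it simply mirrors the call in the example above:

import learnrl as rl
import gym

class RandomAgent(rl.Agent):
    """ A baseline Agent: no learn or remember, it only acts """

    def __init__(self, observation_space, action_space):
        self.action_space = action_space

    def act(self, observation, greedy=False):
        # Ignore the observation and sample a uniformly random action
        return self.action_space.sample()

env = gym.make('FrozenLake-v0', is_slippery=True)
agent = RandomAgent(env.observation_space, env.action_space)

pg = rl.Playground(env, agent)
pg.fit(2000, verbose=1)  # runs even though the agent never learns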
Of course, you can log any custom metrics that your Agent/Env gives you, and even choose how to aggregate them through episodes or cycles: see the metric codes for more details.
metrics=[
    ('reward~env-rwd', {'steps': 'sum', 'episode': 'sum'}),
    ('handled_reward~reward', {'steps': 'sum', 'episode': 'sum'}),
    'value_loss~vloss',
    'actor_loss~aloss',
    'exploration~exp'
]

pg.fit(2000, verbose=1, metrics=metrics)
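Where do these values come from? A plausible sketch, assuming the Playground picks metric values such as 'value_loss' and 'actor_loss' out of the logs dict returned by learn (see the metric codes for the exact rules):

import learnrl as rl

class MyActorCriticAgent(rl.Agent):
    """ Sketch of an agent whose learn() logs feed the metrics above """

    def act(self, observation, greedy=False):
        return 0  # placeholder action

    def learn(self):
        # Assumption: the Playground reads 'value_loss', 'actor_loss' and
        # 'exploration' from this returned logs dictionary.
        return {
            'value_loss': 0.042,   # placeholder values
            'actor_loss': 0.017,
            'exploration': 0.1,
        }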
- The Playground will allow you to have clean logs adapted to your will with the verbose parameter (a one-line usage example follows this list):
  - Verbose 1 (episodes cycles) - If your environment makes a lot of quick episodes.
  - Verbose 2 (episode) - To log each individual episode.
  - Verbose 3 (steps cycles) - If your environment makes a lot of quick steps but has long episodes.
  - Verbose 4 (step) - To log each individual step.
  - Verbose 5 (detailed step) - To debug each individual step (with observations, actions, …).
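For instance, with the Playground from the example above, logging cycles of steps is just:

pg.fit(2000, verbose=3)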
The Playground also allows you to add Callbacks with ease, for example the WandbCallback to have a nice dashboard ! TODO: Show wandb logging
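Until that example is written, here is a rough sketch of what it could look like; the learnrl.callbacks import path, the WandbCallback arguments and the callbacks parameter of fit are assumptions to check against the callbacks reference:

import gym
import wandb
import learnrl as rl
from learnrl.callbacks import WandbCallback  # assumed import path

run = wandb.init(project="learnrl-demo")  # hypothetical project name

env = gym.make('FrozenLake-v0', is_slippery=True)
agent = MyAgent(env.observation_space, env.action_space)  # the agent defined above

pg = rl.Playground(env, agent)
pg.fit(2000, verbose=1, callbacks=[WandbCallback(run)])  # assumed signature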
Features¶
Use this API to create your own agents and environments (even multiplayer!) with great compatibility and visualisation.
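On the environment side, anything that follows the gym.Env interface should work; below is a toy sketch (name, spaces and reward scheme are made up) that could be passed to the Playground exactly like FrozenLake above:

import gym
from gym import spaces

class GuessTheNumberEnv(gym.Env):
    """ Toy environment: guess a hidden number between 0 and 9 """

    def __init__(self):
        # Observations: 0 = no feedback yet, 1 = too low, 2 = correct, 3 = too high
        self.observation_space = spaces.Discrete(4)
        self.action_space = spaces.Discrete(10)
        self.target = None

    def reset(self):
        self.target = self.action_space.sample()
        return 0

    def step(self, action):
        # Classic gym API: return observation, reward, done, info
        if action == self.target:
            return 2, 1.0, True, {}
        return (1 if action < self.target else 3), -0.1, False, {}

It can then be used as rl.Playground(GuessTheNumberEnv(), agent), just like FrozenLake in the first example.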
Get started¶
- Create:
TODO: Numpy DQN tutorial
TODO: Tensorflow tutorials
TODO: Pytorch tutorials
- Visualize:
TODO: Tensorboard visualisation tutorial
TODO: Wandb visualisation tutorial
TODO: Wandb sweep tutorial
Table Of Contents¶
Contribute¶
Support¶
If you are having issues, please contact us on Discord.