Atari¶
Installation¶
Install Sample Factory with its Atari dependencies from PyPI:
pip install sample-factory[atari]
Running Experiments¶
Run Atari experiments with the scripts in sf_examples.atari.
The default parameters were chosen to match CleanRL's configuration (see the reports below) and are not tuned for throughput. Faster settings are listed at the end of this document.
To train a model in the BreakoutNoFrameskip-v4 environment:
python -m sf_examples.atari.train_atari --algo=APPO --env=atari_breakout --experiment="Experiment Name"
To visualize the training results, use the enjoy_atari script:
python -m sf_examples.atari.enjoy_atari --algo=APPO --env=atari_breakout --experiment="Experiment Name"
Multiple experiments can be run in parallel with the launcher module. atari_envs is an example launcher script that runs Atari environments with 4 seeds.
python -m sample_factory.launcher.run --run=sf_examples.atari.experiments.atari_envs --backend=processes --max_parallel=8 --pause_between=1 --experiments_per_gpu=10000 --num_gpus=1
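As a rough illustration of what such a launcher run amounts to, the sketch below expands a list of environments and seeds into individual training commands. It is not the launcher's implementation: the environment subset, the seed values, the use of a `--seed` flag, the experiment naming scheme, and the helper name `build_commands` are all assumptions made for this example. Only the `train_atari` invocation itself is taken from the commands above.

```python
from itertools import product

# Hypothetical values for illustration; the real atari_envs script defines its own grid.
ENVS = ["atari_breakout", "atari_pong"]
SEEDS = [0, 1, 2, 3]  # four seeds per environment, as described above


def build_commands(envs, seeds, algo="APPO"):
    """Return one training command string per (env, seed) pair."""
    commands = []
    for env, seed in product(envs, seeds):
        experiment = f"{env}_seed{seed}"  # assumed naming scheme
        commands.append(
            "python -m sf_examples.atari.train_atari "
            f"--algo={algo} --env={env} --seed={seed} --experiment={experiment}"
        )
    return commands


if __name__ == "__main__":
    for command in build_commands(ENVS, SEEDS):
        print(command)
```

With two environments and four seeds this yields eight commands, which is the kind of workload that a setting such as --max_parallel=8 would allow to run at the same time.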
List of Supported Environments¶
Specify the environment to run with the --env command line parameter. The following Atari v4 environments are supported out of the box.
APPO models trained on these Atari environments have been uploaded to the HuggingFace Hub. Each model was trained for 2 billion steps with 3 seeds per experiment. Videos of the agents after training can also be found on the HuggingFace Hub.
Atari Command Line Parameter | Atari Environment Name | Model Checkpoints |
---|---|---|
atari_alien | AlienNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_amidar | AmidarNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_assault | AssaultNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_asterix | AsterixNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_asteroid | AsteroidsNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_atlantis | AtlantisNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_bankheist | BankHeistNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_battlezone | BattleZoneNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_beamrider | BeamRiderNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_berzerk | BerzerkNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_bowling | BowlingNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_boxing | BoxingNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_breakout | BreakoutNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_centipede | CentipedeNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_choppercommand | ChopperCommandNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_crazyclimber | CrazyClimberNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_defender | DefenderNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_demonattack | DemonAttackNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_doubledunk | DoubleDunkNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_enduro | EnduroNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_fishingderby | FishingDerbyNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_freeway | FreewayNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_frostbite | FrostbiteNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_gopher | GopherNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_gravitar | GravitarNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_hero | HeroNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_icehockey | IceHockeyNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_jamesbond | JamesbondNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_kangaroo | KangarooNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_krull | KrullNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_kongfumaster | KungFuMasterNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_montezuma | MontezumaRevengeNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_mspacman | MsPacmanNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_namethisgame | NameThisGameNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_phoenix | PhoenixNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_pitfall | PitfallNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_pong | PongNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_privateye | PrivateEyeNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_qbert | QbertNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_riverraid | RiverraidNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_roadrunner | RoadRunnerNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_robotank | RobotankNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_seaquest | SeaquestNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_skiing | SkiingNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_solaris | SolarisNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_spaceinvaders | SpaceInvadersNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_stargunner | StarGunnerNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_surround | SurroundNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_tennis | TennisNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_timepilot | TimePilotNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_tutankham | TutankhamNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_upndown | UpNDownNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_venture | VentureNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_videopinball | VideoPinballNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_wizardofwor | WizardOfWorNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_yarsrevenge | YarsRevengeNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
atari_zaxxon | ZaxxonNoFrameskip-v4 | 🤗 Hub Atari-2B checkpoints |
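Note that the command line parameter cannot always be derived mechanically from the environment name: atari_asteroid maps to AsteroidsNoFrameskip-v4, atari_kongfumaster to KungFuMasterNoFrameskip-v4, and atari_montezuma to MontezumaRevengeNoFrameskip-v4. For scripts that need to translate between the two, an explicit lookup is therefore safer than string manipulation. The following is a minimal sketch using a handful of rows copied from the table; `ENV_IDS` and `gym_env_id` are hypothetical names introduced here and are not part of Sample Factory.

```python
# A few rows copied from the table above; extend with further rows as needed.
ENV_IDS = {
    "atari_asteroid": "AsteroidsNoFrameskip-v4",
    "atari_breakout": "BreakoutNoFrameskip-v4",
    "atari_kongfumaster": "KungFuMasterNoFrameskip-v4",
    "atari_montezuma": "MontezumaRevengeNoFrameskip-v4",
    "atari_pong": "PongNoFrameskip-v4",
}


def gym_env_id(env_param):
    """Translate an --env value into the environment name listed in the table."""
    try:
        return ENV_IDS[env_param]
    except KeyError:
        # Fail loudly instead of guessing a name from the parameter string.
        raise ValueError(f"unsupported --env value: {env_param!r}") from None


if __name__ == "__main__":
    print(gym_env_id("atari_montezuma"))
```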
Reports¶
- Sample Factory was benchmarked on Atari against CleanRL and Baselines. Using the same parameters, it achieved sample efficiency similar to both.
Better parameters (more envs, double buffering, async learning)¶
--experiment=breakout_faster
--env=atari_breakout
--summaries_use_frameskip=False
--num_workers=16
--num_envs_per_worker=8
--worker_num_splits=2
--train_for_env_steps=100000000
--rollout=32
--normalize_input=True
--normalize_returns=True
--serial_mode=False
--async_rl=True
--batch_size=1024
--wandb_user=<user>
--wandb_project=sf2_atari_breakout
--wandb_group=breakout_w16v8r32
--with_wandb=True
Report: https://wandb.ai/apetrenko/sf2_atari_breakout/reports/sf2-breakout-w16v8r32--Vmlldzo0MjM1MTQ4
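To make the relationship between these settings more concrete, the sketch below joins the non-WandB flags into a single command and works out the sampling arithmetic. It assumes the flags are passed to the same `train_atari` module used earlier, and it assumes that one collection round gathers `num_workers × num_envs_per_worker × rollout` samples; this is a simplified reading of the parameters rather than a statement about Sample Factory internals. The helper names `build_command` and `samples_per_round` are hypothetical.

```python
import shlex

# Flags copied from the list above (WandB options omitted).
FLAGS = {
    "experiment": "breakout_faster",
    "env": "atari_breakout",
    "summaries_use_frameskip": False,
    "num_workers": 16,
    "num_envs_per_worker": 8,
    "worker_num_splits": 2,
    "train_for_env_steps": 100_000_000,
    "rollout": 32,
    "normalize_input": True,
    "normalize_returns": True,
    "serial_mode": False,
    "async_rl": True,
    "batch_size": 1024,
}


def build_command(flags, module="sf_examples.atari.train_atari"):
    """Join the flags into one shell-safe command string."""
    args = [f"--{name}={value}" for name, value in flags.items()]
    return shlex.join(["python", "-m", module, *args])


def samples_per_round(flags):
    """Samples gathered when every environment contributes one full rollout."""
    return flags["num_workers"] * flags["num_envs_per_worker"] * flags["rollout"]


if __name__ == "__main__":
    print(build_command(FLAGS))
    total = samples_per_round(FLAGS)
    print(f"{total} samples per round, {total // FLAGS['batch_size']} batches")
```

Under these assumptions, 16 workers with 8 environments each give 128 environments; at a rollout length of 32 that is 4096 samples per round, or four batches of 1024.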