Our results indicate that the Football Benchmarks are interesting research problems of varying difficulty. In particular, the Football Easy Benchmark appears to be suitable for research on single-machine algorithms, while the Football Hard Benchmark proves challenging even for massively distributed RL algorithms. Based on the nature of the environment and the difficulty of the benchmarks, we expect them to be useful for investigating current scientific challenges such as sample-efficient RL, sparse rewards, or model-based RL.