Learning locomotion policies on real robots can be risky. Free exploration in the real world is necessary for policy learning, but it is dangerous because it may lead to catastrophic failures, especially for large, heavy, and complicated robots. In this work, we aim to learn policies for locomotion tasks on such risky robots while minimizing their interaction with the environment. Although learning on large, heavy, and complicated robots tends to be risky, learning policies on smaller, lighter, and simpler robots carries much lower cost and risk. We...