The ReWiND method consists of three phases: learning a reward function, pre-training, and using the reward function ...