Enhancing Language Model Reasoning with Expert Iteration: Bridging the Gap Through Reinforcement Learning
