Improving RLHF (Reinforcement Learning from Human Feedback) with Critique-Generated Reward Models
