User Comment Replies — AI Alignment Forum

Nice work! I agree with many of the opinions. Nevertheless, the survey is still missing several key citations.

Personally, I have made several contributions to RLHF:

Reinforcement learning for bandit neural machine translation with simulated human feedback (Nguyen et al., 2017) is the first paper that shows the potential of using noisy rating feedback to train text generators.
Interactive learning from activity description (Nguyen et al., 2021) is one of the first frameworks for learning from descriptive language feedback with theoretical guarante

...
