VLM-RM: Specifying Rewards with Natural Language — AI Alignment Forum