SERI MATS '21, Cognitive science @ Yale '22, Meta AI Resident '23, LTFF grantee. Currently doing alignment research @ AE Studio. Very interested in work at the intersection of AI x cognitive science x alignment x philosophy.
The idea of combining SOO and CAI is interesting. Can you elaborate on what you were imagining here? Seems like there are a bunch of plausible ways you could go about injecting SOO-style finetuning into standard CAI. Is there a specific direction you are particularly excited about?
Makes sense, thanks. Can you also briefly clarify what exactly you are pointing at with 'syntactic'? Seems like this could be interpreted in multiple plausible ways, and it looks like others might have a similar question.