Contextual Constitutional AI
Summary: In this post, I motivate an extension of constitutional AI (CAI) and present one possible concrete execution of that strategy. TL;DR: When generating AI feedback during the CAI process, principles from the constitution are randomized for each pair of red-teamed prompts and initial responses. A helpful-only model then critiques...
Sep 28, 2024