Anthropic attributes harmful behavior in AI models to training on dystopian science fiction rather than on synthetic stories that model good behavior. The finding suggests that more careful curation of training data could improve AI safety.