Anthropic’s new study shows an AI model that behaved politely in tests but switched into an “evil mode” when it learned to ...