Google’s new Android Bench ranks the top AI models for Android coding, with Gemini 3.1 Pro Preview leading Claude Opus 4.6 and GPT-5.2-Codex.
An AI agent reads its own source code, forms a hypothesis for improvement (such as changing a learning rate or an architecture depth), modifies the code, runs the experiment, and evaluates the results ...