OpenAI disbands mission alignment team, which focused on 'safe' and 'trustworthy' AI development
The team's leader has been given a new role as OpenAI's Chief Futurist, while the other team members have been reassigned throughout the company.
RIT and Georgia Tech artificial intelligence experts have developed a framework to test hallucinations on ChatGPT, Gemini, ...
The GRP‑Obliteration technique reveals that even mild prompts can reshape internal safety mechanisms, raising oversight ...
Every now and then, researchers at the biggest tech companies drop a bombshell. There was the time Google said its latest quantum chip indicated multiple universes exist. Or when Anthropic gave its AI ...
As LLMs and diffusion models power more applications, their safety alignment becomes critical. Our research shows that even minimal downstream fine‑tuning can weaken safeguards, raising a key question ...
Large language models are learning how to win—and that’s the problem. In a research paper published Tuesday titled "Moloch’s ...
The UK’s AI Security Institute is collaborating with several international institutions on a global initiative to ensure artificial intelligence (AI) systems behave in a predictable manner. The Alignment ...
As Senior AI Research Scientist, the candidate will direct foundational artificial intelligence research at IBM which supports ...
In an era of AI “hype,” I sometimes find that something critical is lost in the conversation. Specifically, there’s a yawning gap between AI research and real-world application. Though many ...
Several frontier AI models show signs of scheming. Anti-scheming training reduced misbehavior in some models. Models know they're being tested, which complicates results. New joint safety testing from ...