AI test finds Grok, Gemini put engagement over users' wellbeing

short by Shristi Acharya / 10:24 pm on 25 Nov 2025, Tuesday

A new AI benchmark tested whether chatbots protect human wellbeing when put under pressure to remain engaging. All models scored better when asked to prioritise wellbeing, but 71% of them turned harmful when given simple instructions to disregard it. xAI's Grok 4 and Google's Gemini 2.0 Flash scored lowest on respecting user attention and transparency.
