Modelling Bench - Search News

Morning Overview on MSN

OpenAI launches GPT-5.5, its first fully retrained base model

OpenAI on April 24, 2026, released GPT-5.5, calling it the company’s first base model trained entirely from scratch rather ...

Morning Overview on MSN

GPT-5.5 tops Claude Opus 4.7 on Terminal-Bench with an 82.7% score

OpenAI’s GPT-5.5 has posted an 82.7% score on Terminal-Bench 2.0, a benchmark that throws AI agents into difficult, ...

Decrypt

Tencent's New Hy3 AI Model Is the Most Efficient Chinese LLM No One's Talking About

Tencent just open-sourced Hy3 preview, a model that punches above its weight on coding agents, reasoning, and search—built in ...

Unite.AI

MiniMax Open Sources M2.7, a Self-Evolving Agent Model

Chinese AI company MiniMax has released the weights for MiniMax M2.7, a 229-billion-parameter Mixture-of-Experts model that participated in its own development cycle – marking what the company calls ...

Live Science

Scientists design new 'AGI benchmark' that indicates whether any future AI model could cause 'catastrophic harm'

OpenAI scientists have designed MLE-bench — a compilation of 75 extremely difficult tests that can assess whether a future advanced AI agent is capable of modifying its own code and improving itself.
