olmo-eval: An evaluation workbench for the model development loop
Hugging Face outlined updates on olmo-eval: An evaluation workbench for the model development loop.
The IPO market is back, and it’s not the same companies leading the charge. FAANG had a good run, but a new acronym is taking over: MANGOS —…
AgentPerf from Artificial Analysis, the industry’s first agentic AI benchmark, gives developers, enterprises and infrastructure providers a clear way to compare systems…
Google DeepMind is funding research into the potential dangers of millions of different AI agents interacting with each other online. According to Rohin…
Anthropic Tuesday publicly released Claude Fable 5, its first "Mythos-class" model that it says surpasses its previous frontier Opus models in overall…
Hugging Face outlined updates on Introducing North Mini Code: Cohere’s First Model For Developers.
NVIDIA GPUs with Confidential Computing are now used for confidential inference in Apple’s Private Cloud Compute (PCC), as it expands beyond Apple’s…
Google DeepMind Blog — introducing Gemma 4 12B: a unified, encoder-free multimodal model
Today, Google DeepMind released DiffusionGemma — an experimental open model built for exceptionally fast text generation. NVIDIA has optimized DiffusionGemma to run even faster across NVIDIA GeForce RTX GPUs, the NVIDIA RTX PRO platform and NVIDIA…
ZDNET AI — known primarily for containers and servers, Alpine Linux isn't always considered for traditional desktop use. I think this lightweight…