insights-and-analytics

Posts in insights-and-analytics.

LangSmith, RAGAS & LLM-as-a-Judge: The State of LLM Eval in 2026

We swept 156,928 job postings and 435,000+ blog and news articles for ~90 LLM-eval frameworks and ~55 evaluation methodologies. The result: hiring names a tiny tool set (LangSmith + Langfuse = 56% of all eval-tool mentions; no framework over 1%), practitioners are converging on LLM-as-a-judge with a rubric, and the benchmarks the press argues about — SWE-bench, MMLU, GPQA — show up in roughly zero job descriptions.

OpenAI Isn't Building a Phone Like Apple. They're Building an AI OS Like Google.

A press rumor said OpenAI is building a phone. We pulled all 746 OpenAI postings; 19 sit on a team called "Consumer Devices." But the team is 90% software and AI research, with no in-house industrial design or mechanical engineering. The hardware roles are procurement and integration. The shape of the team is Google-in-2007 (build an OS, ship it on a partner's hardware), not Apple-in-2007 (build it all yourself).