New category of LLM benchmarks.