Claude daily benchmark for tracking model capability degradation

(Source marginlab.ai... )
by kholin | 0 replies

More discussion: https://news.ycombinator.com/item?id=46810282