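AI-assisted content, reviewed by humans before publication. Mindber aggregates public data and does not provide investment, legal, or procurement advice.

Mindber Score™, Mindber Innovation Index™, Mindber Functionality Score™, and Mindber Activity Score™ are trademarks of Mindber.

© 2026 Mindber. All rights reserved. v2.5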
Mindber Model Index — Methodology

How we combine quality, speed, latency, and price into a single score for ranking AI models.

Data sources

Arena.ai — Crowd ELO

Real user pairwise comparisons across 12 boards: Agent, Text, Search, Vision, Document, Code (WebDev + Image-to-WebDev). Each model's position on each board contributes a vote-weighted quality signal.

Artificial Analysis — Objective benchmarks

Provider-agnostic measurements of Intelligence Index, output speed (t/s), latency (TTFT), and blended price per 1M tokens — updated continuously across 200+ model–provider combinations.

What we exclude

Image and video generation boards (text-to-image, image-edit, text-to-video, image-to-video, video-edit) are excluded from the MMI. These boards rank creative generation models on aesthetic preference — a different task class from language reasoning. Mixing them with text LLMs produces a misleading composite.

Formula

Step 1 — Percentile normalization

Every raw metric is converted to a percentile rank within its own pool so different scales are comparable. Output speed (t/s) is higher-is-better and is ranked directly. Latency and price are lower-is-better, so their ranks are inverted (lower latency = higher percentile; cheaper = higher percentile). Result: all signals live in [0, 1].
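The normalization step can be sketched as follows. This is a minimal illustration, not Mindber's implementation: the function name `percentile_ranks`, the mid-rank treatment of ties, and the `(n - 1)` scaling are all assumptions, since the page does not specify them.

```python
from bisect import bisect_left, bisect_right


def percentile_ranks(values, lower_is_better=False):
    """Map each value to a percentile in [0, 1] within its own pool.

    Hypothetical helper: tie handling (mid-rank) and scaling by (n - 1)
    are assumptions, not taken from the methodology page.
    """
    n = len(values)
    if n == 1:
        return [0.5]  # a single-item pool has no meaningful rank
    ordered = sorted(values)
    ranks = []
    for v in values:
        # Tied values share the average of their first and last sorted positions.
        first = bisect_left(ordered, v)
        last = bisect_right(ordered, v) - 1
        p = ((first + last) / 2) / (n - 1)
        # Invert for lower-is-better metrics such as latency and price.
        ranks.append(1.0 - p if lower_is_better else p)
    return ranks


speed_pct = percentile_ranks([40.0, 120.0, 80.0])                    # tokens/s
price_pct = percentile_ranks([0.3, 0.9, 0.6], lower_is_better=True)  # $ per 1M tokens
```

With these inputs the fastest model and the cheapest model each receive 1.0 in their respective pools.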

Step 2 — Quality fusion (Q)

Arena and AA each produce a quality estimate. Confidence in each source is weighted before fusing:

cA = totalVotes / (totalVotes + 2000) // Arena confidence; saturates at ≈1 for high-vote models

cAA = 1 if model appears in AA data, else 0

Q = (cA × qArena + cAA × qAA) / (cA + cAA)

qArena is the vote-weighted mean of the model's per-board Arena percentiles. qAA is its AA Intelligence Index percentile. Models with no quality signal from either source are excluded.
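A short sketch of the fusion formulas above, assuming percentiles have already been computed. The function names and the `(percentile, votes)` tuple layout are hypothetical; only the constant 2000 and the formulas themselves come from this page.

```python
ARENA_PRIOR_VOTES = 2000  # constant from the cA formula


def arena_quality(boards):
    """Vote-weighted mean of per-board percentiles.

    boards: list of (percentile, votes) pairs (assumed layout).
    Returns None when the model has no Arena votes.
    """
    total_votes = sum(votes for _, votes in boards)
    if total_votes == 0:
        return None
    return sum(p * votes for p, votes in boards) / total_votes


def fuse_quality(q_arena, total_votes, q_aa):
    """Confidence-weighted blend of Arena and AA quality.

    Pass None for a source the model does not appear in.
    Returns None when neither source has a signal (model is excluded).
    """
    c_a = total_votes / (total_votes + ARENA_PRIOR_VOTES) if q_arena is not None else 0.0
    c_aa = 1.0 if q_aa is not None else 0.0
    if c_a + c_aa == 0:
        return None
    return (c_a * (q_arena or 0.0) + c_aa * (q_aa or 0.0)) / (c_a + c_aa)


# 2000 votes gives cA = 0.5, so AA counts twice as much as Arena here.
q = fuse_quality(q_arena=0.8, total_votes=2000, q_aa=0.6)
```

In this example Q = (0.5 × 0.8 + 1 × 0.6) / 1.5 ≈ 0.667, which sits closer to the AA estimate because Arena confidence is only 0.5.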

Step 3 — Composite score (raw)

Efficiency metrics (speed, latency, price) are blended with Q using the Overall preset weights. Missing metrics are dropped and remaining weights are renormalized, so a model with no price data isn't penalized.

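raw = (W_q×Q + W_s×spd + W_l×lat + W_p×prc) / (W_q + present_weights) // present_weights = sum of W_s, W_l, W_p over the metrics that exist; a missing metric's term is also dropped from the numerator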
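The renormalization can be illustrated with a small sketch. The dictionary keys and the function name `composite_raw` are hypothetical; the weights are the Overall preset values listed on this page.

```python
# Overall preset weights from the "Rank by presets" table.
OVERALL = {"quality": 0.60, "speed": 0.12, "latency": 0.08, "price": 0.20}


def composite_raw(scores, weights=OVERALL):
    """Weighted mean over the metrics that are present.

    scores: dict of sub-scores in [0, 1]; use None for a missing metric.
    Quality must be present, since models without it are excluded earlier.
    """
    present = {k: v for k, v in scores.items() if k in weights and v is not None}
    if "quality" not in present:
        raise ValueError("quality score is required")
    total_weight = sum(weights[k] for k in present)
    return sum(weights[k] * v for k, v in present.items()) / total_weight


# No price data: the 20% price weight is dropped and the rest rescaled to sum to 1.
raw = composite_raw({"quality": 0.8, "speed": 0.5, "latency": 0.5, "price": None})
```

Here the result is (0.48 + 0.06 + 0.04) / 0.80 = 0.725, the same value the model would get if its price percentile happened to equal its weighted average on the other metrics.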

Step 4 — Coverage shrinkage

Models with thin data (few votes, only one source) are shrunk toward the median to prevent a handful of votes from catapulting an obscure model to the top.

presence = ((hasArena ? 1 : 0) + (hasAA ? 1 : 0)) / 2

coverage = presence × (0.5 + 0.5 × cA) // clamped [0, 1]

MMI = 100 × (raw × coverage + median_raw × (1 − coverage))

A model with millions of votes and AA data has coverage ≈ 1, so its MMI is approximately its raw score × 100. A model with 50 votes and no AA data has cA ≈ 0.024 and presence = 0.5, so coverage ≈ 0.26 and roughly three quarters of its MMI comes from the pack median.
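The shrinkage step, sketched under the same assumptions as the formulas above. The function names are hypothetical, and the raw score and median used in the example are made-up inputs chosen only to show the effect.

```python
def coverage(has_arena, has_aa, total_votes):
    """Coverage in [0, 1] from source presence and Arena vote confidence."""
    c_a = total_votes / (total_votes + 2000) if has_arena else 0.0
    presence = (int(has_arena) + int(has_aa)) / 2
    return min(1.0, max(0.0, presence * (0.5 + 0.5 * c_a)))


def mmi(raw, median_raw, cov):
    """Shrink the raw composite toward the pool median, then scale to 0-100."""
    return 100 * (raw * cov + median_raw * (1 - cov))


# Illustrative inputs (not real data): a high raw score backed by only 50 votes.
thin = coverage(has_arena=True, has_aa=False, total_votes=50)
thin_score = mmi(raw=0.90, median_raw=0.50, cov=thin)

# Same raw score, but with 5 million votes and AA data.
full = coverage(has_arena=True, has_aa=True, total_votes=5_000_000)
full_score = mmi(raw=0.90, median_raw=0.50, cov=full)
```

With these inputs the thinly covered model lands near 60 while the well-covered one stays near 90, even though both have the same raw composite.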

Rank by presets

The Rank by toggle on the Overall board re-weights and re-ranks client-side without a new data fetch. Sub-scores (quality, speed, latency, price) are shipped with each row for this purpose.

Preset     Quality (Q)   Speed   Latency   Price   Use case
Overall    60%           12%     8%        20%     Default; balanced across all signals
Frontier   90%           3%      3%        4%      Quality-dominant; cost and speed carry minimal weight
Value      45%           10%     5%        40%     Intelligence per dollar
Speed      45%           25%     20%       10%     Latency-sensitive applications

Weights are renormalized over present metrics per model — missing efficiency data does not count against a model.
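Re-ranking from the shipped sub-scores might look like the sketch below. The row layout and the `rank_by` function are hypothetical. The page also does not say whether coverage shrinkage is reapplied after re-weighting, so this sketch orders rows by the re-weighted composite only.

```python
# Preset weights from the table above.
PRESETS = {
    "overall":  {"quality": 0.60, "speed": 0.12, "latency": 0.08, "price": 0.20},
    "frontier": {"quality": 0.90, "speed": 0.03, "latency": 0.03, "price": 0.04},
    "value":    {"quality": 0.45, "speed": 0.10, "latency": 0.05, "price": 0.40},
    "speed":    {"quality": 0.45, "speed": 0.25, "latency": 0.20, "price": 0.10},
}


def rank_by(rows, preset):
    """Sort rows best-first under a preset, renormalizing over present metrics.

    Each row is assumed to carry a non-missing "quality" sub-score.
    """
    weights = PRESETS[preset]

    def score(row):
        present = {k: row[k] for k in weights if row.get(k) is not None}
        total_weight = sum(weights[k] for k in present)
        return sum(weights[k] * v for k, v in present.items()) / total_weight

    return sorted(rows, key=score, reverse=True)


# Made-up rows: A is strong but slow and expensive, B is weaker but cheap and fast.
rows = [
    {"name": "A", "quality": 0.90, "speed": 0.20, "latency": 0.30, "price": 0.10},
    {"name": "B", "quality": 0.60, "speed": 0.70, "latency": 0.60, "price": 0.95},
]
frontier_order = [r["name"] for r in rank_by(rows, "frontier")]
value_order = [r["name"] for r in rank_by(rows, "value")]
```

Under Frontier, A ranks first; under Value, B overtakes it because the 40% price weight rewards its cheaper pricing.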

Attribution

  • Arena.ai — crowd preference data, Agent Arena methodology, pairwise comparison infrastructure
  • Artificial Analysis — objective intelligence benchmarks, speed, latency, and pricing data

Mindber does not claim ownership of source data. MMI is a derived composite, computed on read from publicly available leaderboard snapshots. Last methodology revision: 2026-06-14.

Questions about this methodology? Contact support@mindber.com.
