scoring-calibration


Skill for venue-calibrated scoring, score weighting formulas, decision rules, anti-bias mechanisms, and score interpretation across different venue tiers.

By Vahidrostami. Updated 3/4/2026

name: scoring-calibration
description: >
  Skill for venue-calibrated scoring, score weighting formulas, decision rules, anti-bias mechanisms, and score interpretation across different venue tiers.

Scoring Calibration

Use this skill when computing review scores, applying decision rules, or calibrating review standards to a specific venue.

Score Dimensions

Every review scores these 6 dimensions plus confidence:

| Dimension | Range | Description |
| --- | --- | --- |
| Overall | 1-10 | Holistic assessment |
| Soundness | 1-10 | Technical correctness |
| Novelty | 1-10 | Originality of contribution |
| Clarity | 1-10 | Writing and presentation quality |
| Significance | 1-10 | Impact and importance |
| Reproducibility | 1-10 | Can results be reproduced? |
| Confidence | 1-5 | Reviewer's self-assessed expertise |
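As a minimal sketch of how these ranges could be enforced, the following Python dataclass rejects out-of-range values at construction time. The class name `ReviewScores` and the constant names are illustrative assumptions, not part of the skill itself.

```python
from dataclasses import dataclass, fields

# Ranges taken from the dimension table; all names here are hypothetical.
SCORE_RANGE = (1, 10)
CONFIDENCE_RANGE = (1, 5)


@dataclass(frozen=True)
class ReviewScores:
    overall: int
    soundness: int
    novelty: int
    clarity: int
    significance: int
    reproducibility: int
    confidence: int

    def __post_init__(self) -> None:
        # Confidence uses the 1-5 scale; every other dimension uses 1-10.
        for f in fields(self):
            value = getattr(self, f.name)
            low, high = CONFIDENCE_RANGE if f.name == "confidence" else SCORE_RANGE
            if not low <= value <= high:
                raise ValueError(f"{f.name}={value} is outside {low}-{high}")
```

Constructing `ReviewScores(7, 8, 6, 7, 6, 5, 4)` succeeds, while a confidence of 6 raises `ValueError`.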

Venue-Calibrated Interpretation

Top-Tier (NeurIPS, Nature, Science, ICML)

| Score | Meaning |
| --- | --- |
| 8-10 | Strong accept: top 10% of submissions |
| 6-7 | Weak accept: above threshold, some issues |
| 5 | Borderline: could go either way |
| 3-4 | Weak reject: below threshold, significant issues |
| 1-2 | Strong reject: fundamental flaws |

Acceptance threshold: Mean ≥ 7, no critical issues

Mid-Tier (AAAI, ECML, PLOS ONE)

| Score | Meaning |
| --- | --- |
| 7-10 | Strong accept |
| 5-6 | Accept with revisions |
| 4 | Borderline |
| 2-3 | Reject |
| 1 | Strong reject |

Acceptance threshold: Mean ≥ 6, critical issues addressed

Workshop / Preprint

| Score | Meaning |
| --- | --- |
| 6-10 | Accept |
| 4-5 | Accept with minor revisions |
| 3 | Borderline |
| 1-2 | Reject |

Acceptance threshold: Mean ≥ 5, no fatal flaws
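The three interpretation tables can be expressed as a single lookup. The sketch below is one possible encoding in Python; the tier keys (`"top"`, `"mid"`, `"workshop"`) and the function name `interpret` are assumptions made for illustration.

```python
# Each band is (lowest score in the band, label), ordered from high to low.
# Labels are copied from the tables; tier keys and names are hypothetical.
BANDS = {
    "top": [(8, "Strong accept"), (6, "Weak accept"), (5, "Borderline"),
            (3, "Weak reject"), (1, "Strong reject")],
    "mid": [(7, "Strong accept"), (5, "Accept with revisions"),
            (4, "Borderline"), (2, "Reject"), (1, "Strong reject")],
    "workshop": [(6, "Accept"), (4, "Accept with minor revisions"),
                 (3, "Borderline"), (1, "Reject")],
}


def interpret(score: int, tier: str) -> str:
    """Return the meaning of an integer 1-10 score for the given venue tier."""
    if not 1 <= score <= 10:
        raise ValueError(f"score {score} is outside 1-10")
    for floor, label in BANDS[tier]:
        if score >= floor:
            return label
    raise AssertionError("unreachable: every tier has a band starting at 1")
```

The same score reads differently by venue: `interpret(6, "top")` gives "Weak accept", while `interpret(6, "workshop")` gives "Accept".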

Score Weighting Formula

The weighted final score combines five dimensions using default weights that sum to 1.0 (Overall and Confidence are not part of the formula):

final_score = (
    0.30 × mean(soundness) +
    0.20 × mean(novelty) +
    0.20 × mean(significance) +
    0.15 × mean(clarity) +
    0.15 × mean(reproducibility)
)
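A runnable version of this formula might look like the following Python sketch. It assumes each review is a plain dict keyed by dimension name; the function name `final_score` and the sample reviews are illustrative only.

```python
from statistics import mean

# Default weights from the formula; they sum to 1.0.
WEIGHTS = {
    "soundness": 0.30,
    "novelty": 0.20,
    "significance": 0.20,
    "clarity": 0.15,
    "reproducibility": 0.15,
}


def final_score(reviews, weights=WEIGHTS):
    """Average each dimension across reviewers, then take the weighted sum."""
    return sum(w * mean(r[dim] for r in reviews) for dim, w in weights.items())


# Hypothetical scores from three reviewers.
reviews = [
    {"soundness": 6, "novelty": 5, "significance": 6, "clarity": 7, "reproducibility": 5},
    {"soundness": 8, "novelty": 7, "significance": 7, "clarity": 6, "reproducibility": 6},
    {"soundness": 7, "novelty": 6, "significance": 5, "clarity": 8, "reproducibility": 7},
]
score = final_score(reviews)
```

Here the per-dimension means are 7, 6, 6, 7, and 6, so the result is 0.30×7 + 0.20×6 + 0.20×6 + 0.15×7 + 0.15×6 = 6.45.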

These weights can be overridden in .review-config.yaml:

review:
  score_weights:
    soundness: 0.30
    novelty: 0.20
    significance: 0.20
    clarity: 0.15
    reproducibility: 0.15
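When overrides are applied, it may be worth checking that the resulting weights still sum to 1.0 so that the final score stays on the 1-10 scale. The source does not state such a requirement, so the check below is an assumption. The sketch takes the already-parsed `score_weights` mapping as a dict (how the YAML file is loaded is left out), and `resolve_weights` is a hypothetical helper name.

```python
import math
from typing import Dict, Optional

DEFAULT_WEIGHTS = {
    "soundness": 0.30,
    "novelty": 0.20,
    "significance": 0.20,
    "clarity": 0.15,
    "reproducibility": 0.15,
}


def resolve_weights(overrides: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Merge overrides onto the defaults and validate the result."""
    overrides = overrides or {}
    unknown = set(overrides) - set(DEFAULT_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    weights = {**DEFAULT_WEIGHTS, **overrides}
    # Assumption: weights must sum to 1.0; the skill does not say so explicitly.
    if not math.isclose(sum(weights.values()), 1.0, abs_tol=1e-9):
        raise ValueError(f"weights sum to {sum(weights.values()):.2f}, expected 1.0")
    return weights
```

For example, `resolve_weights({"soundness": 0.40, "novelty": 0.10})` is accepted, whereas overriding only `soundness` to 0.50 raises because the total becomes 1.20.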

Decision Rules

| Condition | Decision |
| --- | --- |
| All reviewers ≥ 7, no critical weaknesses | Accept |
| All reviewers ≥ 6, only minor weaknesses | Accept with Minor Revision |
| Mean ≥ 5, no more than 1 reviewer below 5 | Major Revision |
| Mean < 5 or 2+ reviewers below 4 | Reject |
| Strong disagreement (spread ≥ 4 points) | Discussion round before decision |
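The rules can be sketched as an ordered chain of checks. The source does not specify precedence, so the order used here (disagreement first, then accept, minor revision, reject, major revision) is an assumption, as are the function name `decide` and the `worst_weakness` labels. Combinations that match no rule are reported as "Undetermined" rather than guessed.

```python
from statistics import mean


def decide(scores, worst_weakness="none"):
    """Apply the decision rules to per-reviewer overall scores.

    worst_weakness is one of "none", "minor", "major", "critical"
    (hypothetical labels for the most severe weakness raised).
    """
    if max(scores) - min(scores) >= 4:
        return "Discussion round before decision"
    if min(scores) >= 7 and worst_weakness != "critical":
        return "Accept"
    if min(scores) >= 6 and worst_weakness in ("none", "minor"):
        return "Accept with Minor Revision"
    if mean(scores) < 5 or sum(s < 4 for s in scores) >= 2:
        return "Reject"
    if sum(s < 5 for s in scores) <= 1:
        return "Major Revision"
    # The table leaves some combinations uncovered, e.g. scores of 4, 4, 7.
    return "Undetermined"
```

For instance, `decide([5, 7, 6], "major")` yields "Major Revision", and `decide([3, 8, 7])` triggers a discussion round because the spread is 5.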

Venue-Adjusted Thresholds

The decision rules above are written with Top-Tier values. For other venues, substitute the thresholds for the relevant tier:

| Rule Parameter | Top-Tier | Mid-Tier | Workshop |
| --- | --- | --- | --- |
| Accept threshold | ≥ 7 | ≥ 6 | ≥ 5 |
| Accept-minor threshold | ≥ 6 | ≥ 5 | ≥ 4 |
| Major revision threshold | ≥ 5 | ≥ 4 | ≥ 3 |
| Reject threshold | < 5 | < 4 | < 3 |
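One way to carry these thresholds in code is a small per-tier record. The sketch below only classifies a single score against the thresholds and deliberately ignores the per-reviewer and weakness conditions of the full decision rules; `Thresholds`, `THRESHOLDS`, and `band` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    accept: int
    accept_minor: int
    major_revision: int  # scores below this value fall in the reject band


# Values copied from the venue-adjusted threshold table.
THRESHOLDS = {
    "top": Thresholds(accept=7, accept_minor=6, major_revision=5),
    "mid": Thresholds(accept=6, accept_minor=5, major_revision=4),
    "workshop": Thresholds(accept=5, accept_minor=4, major_revision=3),
}


def band(score: float, tier: str) -> str:
    """Return the threshold band a score falls into for the given tier."""
    t = THRESHOLDS[tier]
    if score >= t.accept:
        return "accept"
    if score >= t.accept_minor:
        return "accept_minor"
    if score >= t.major_revision:
        return "major_revision"
    return "reject"
```

A score of 5.5 lands in `major_revision` at a top-tier venue, `accept_minor` at a mid-tier venue, and `accept` at a workshop.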

Anti-Bias Mechanisms

Anchoring Prevention

  • Reviewers assign scores BEFORE writing detailed comments
  • Score-first protocol prevents narrative from biasing quantitative assessment

Confirmation Bias Mitigation

  • Reviewer γ (Generalist) has no domain priors and so provides a perspective independent of field conventions
  • If all reviews are uniformly positive (all ≥ 7), flag for confirmation bias check
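The uniform-positivity flag is simple to express in code. This is a minimal Python sketch, assuming the check runs on the reviewers' overall scores; the function name and the `floor` parameter are illustrative.

```python
def needs_confirmation_bias_check(overall_scores, floor=7):
    """Flag a review set when every reviewer scored at or above the floor."""
    return len(overall_scores) > 0 and all(s >= floor for s in overall_scores)
```

With scores of 7, 8, and 9 the flag is raised; a single 6 in the set clears it.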

Authority Bias Prevention

  • Author identity optionally stripped in double-blind mode
  • Reviewer profiles focus on expertise, not prestige

Positivity Bias Prevention

  • EIC (editor-in-chief) prompt emphasizes that rejection is a valid and useful outcome
  • Decision rules explicitly model rejection conditions

Novelty Bias Prevention

  • Score weights rank soundness (0.30) above novelty (0.20)
  • All else equal, a technically correct but incremental paper scores higher than a novel but unsound one
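A small worked comparison illustrates the effect of the default weights. The two papers below are hypothetical: they swap soundness and novelty scores (8 and 4) and share a score of 6 on every other dimension.

```python
# Default weights from the Score Weighting Formula section.
WEIGHTS = {"soundness": 0.30, "novelty": 0.20, "significance": 0.20,
           "clarity": 0.15, "reproducibility": 0.15}


def weighted(scores):
    """Weighted sum of one set of dimension scores."""
    return sum(WEIGHTS[dim] * value for dim, value in scores.items())


shared = {"significance": 6, "clarity": 6, "reproducibility": 6}
sound_incremental = weighted({"soundness": 8, "novelty": 4, **shared})
novel_unsound = weighted({"soundness": 4, "novelty": 8, **shared})
gap = sound_incremental - novel_unsound
```

The sound but incremental paper comes out at 6.2 and the novel but unsound one at 5.8, a gap of (0.30 − 0.20) × 4 = 0.4.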

Score Trajectory Tracking

Track scores across revision rounds to detect convergence or stalling:

score_trajectory:
  round_1:
    alpha: 5
    beta: 7
    gamma: 6
    mean: 6.0
    weighted: 5.95
  round_2:
    alpha: 7
    beta: 8
    gamma: 7
    mean: 7.3
    weighted: 7.25
    delta: +1.3
  convergence_status: "improving"
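The `mean` and `delta` fields in a trajectory like this can be derived from the per-reviewer scores. The sketch below assumes `delta` is the change in the mean between consecutive rounds (the source does not say whether it is based on the mean or the weighted score) and rounds to one decimal; `summarize` is a hypothetical helper.

```python
from statistics import mean


def summarize(rounds):
    """Compute the rounded mean per round and the delta to the previous round."""
    summary = []
    previous = None
    for scores in rounds:
        current = mean(scores.values())
        entry = {"mean": round(current, 1)}
        if previous is not None:
            # Delta uses the unrounded means to avoid compounding rounding error.
            entry["delta"] = round(current - previous, 1)
        summary.append(entry)
        previous = current
    return summary


trajectory = summarize([
    {"alpha": 5, "beta": 7, "gamma": 6},
    {"alpha": 7, "beta": 8, "gamma": 7},
])
```

For the two rounds shown, this reproduces the means of 6.0 and 7.3 and a delta of 1.3.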

Diminishing Returns

If delta ≤ 0.3 for 2 consecutive rounds:
  → Flag DIMINISHING_RETURNS
  → Consider declaring EXHAUSTED
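The stalling check can be sketched in Python as follows, assuming the per-round deltas are kept in a list ordered from oldest to newest. The function name and the `epsilon` and `window` parameters are illustrative; their defaults mirror the 0.3 and 2 in the rule.

```python
def diminishing_returns(deltas, epsilon=0.3, window=2):
    """True when the last `window` round-over-round deltas are all <= epsilon."""
    if len(deltas) < window:
        return False
    return all(d <= epsilon for d in deltas[-window:])
```

A history of `[1.3, 0.2, 0.1]` would be flagged, while `[1.3, 0.2]` would not, since only one of the last two rounds stalled.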

Install via CLI

npx skills add https://github.com/Vahidrostami/delaunay --skill scoring-calibration