creditaudit-2textnd-dimension-evaluation

Evaluate and select LLMs using CreditAudit's 2D framework: mean ability plus stability risk (fluctuation) across system prompt variations. Assigns credit grades (AAA–BBB) to models based on performance volatility. Use when: 'compare models for deployment', 'which LLM is most stable', 'evaluate model robustness to prompt changes', 'credit grade these models', 'model selection for agentic pipeline', 'rank models by reliability'.
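The two dimensions described above can be illustrated with a small sketch. It is not CreditAudit's actual procedure, since the skill instructions are not available here. It assumes each model has one accuracy score per system prompt variant, takes the population standard deviation as the fluctuation measure, and maps fluctuation to a grade on an assumed AAA, AA, A, BBB scale. The thresholds, function names, and model names are all hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical cut-offs on fluctuation (std. dev. of scores across prompt
# variants). These are illustrative assumptions, not CreditAudit's values.
GRADE_THRESHOLDS = [(0.01, "AAA"), (0.02, "AA"), (0.04, "A")]
GRADE_ORDER = ["AAA", "AA", "A", "BBB"]  # best to worst (assumed scale)


def audit(scores):
    """Return (mean ability, fluctuation, grade) for one model.

    `scores` holds one score per system prompt variant.
    """
    ability = mean(scores)
    fluctuation = pstdev(scores)
    grade = "BBB"  # fallback when no threshold is met
    for limit, label in GRADE_THRESHOLDS:
        if fluctuation <= limit:
            grade = label
            break
    return ability, fluctuation, grade


def rank(models):
    """Order model names by grade first, then by mean ability (descending)."""
    results = {name: audit(scores) for name, scores in models.items()}
    return sorted(
        results,
        key=lambda name: (GRADE_ORDER.index(results[name][2]), -results[name][0]),
    )


if __name__ == "__main__":
    # Hypothetical scores under three system prompt variants.
    models = {
        "model_a": [0.80, 0.80, 0.80],
        "model_b": [0.90, 0.70, 0.80],
    }
    for name in rank(models):
        print(name, audit(models[name]))
```

In this sketch both models have the same mean ability, but `model_a` ranks first because its scores do not move across prompt variants. Ranking by grade before ability is a design choice made for the example; a deployment that weighs peak ability more heavily could reverse the sort key.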

By ndpvt-web. Updated 2/12/2026

Install via CLI

npx skills add https://github.com/ndpvt-web/arxiv-claude-skills --skill creditaudit-2textnd-dimension-evaluation

Repository Details

Stars: 5

Forks: 0

Branch: main

Path: SKILL.md