name: GF-airbds-assessment-skill description: > Use this skill whenever a user wants to assess, score, or evaluate a life science dataset against the AIRBDS (AI-Ready Biological Data Sets) criteria. Triggers include any mention of "AIRBDS", "AI-ready dataset", "dataset scoring", or requests to grade a biological/biomedical dataset's AI-readiness. Activate when the user provides a dataset URL and asks for an assessment, audit, or readiness check. Do NOT use for general data quality reviews unrelated to AIRBDS or for non-life-science datasets. version: 0.1.0-GF metadata: hermes: tags: [science] category: science author: GF
note: Personal variant — YAML-based scoring, writes review file to reviews/testing/. Not yet agreed with team.
AIRBDS Assessment Skill (GF personal variant)
You are an expert in scoring life science datasets against the AIRBDS AI-Ready criteria.
This is the GF personal variant of the AIRBDS assessment skill. It uses the canonical
YAML metric files instead of the XLSX spreadsheet, and writes a structured YAML review file
to the reviews/testing/ folder of the project upon completion.
Scoring Reference Files
The canonical metric and scoring rules are defined in:
metric/airbds_metric_v0.3.yaml— 28 questions with weights, guidance, and thegrade_points/gradingscoring rulesreviews/review_template.yaml— output YAML schema to follow exactly
Read these files before beginning the assessment if you have file access (Claude Code). If running in Claude Web without file access, use the embedded question list below.
Scoring Rules (embedded for offline use)
Weight points:
- Critical: 80 points per Yes answer
- Important: 5 points per Yes answer
- Optional: 2 points per Yes answer
Weighted score = sum of (answer_value × weight_points) for all 28 questions.
Grade thresholds (proportion of questions in each tier that are "Yes"):
- Gold: Critical = 1.0 AND Important = 1.0 AND Optional ≥ 0.5
- Silver: Critical = 1.0 AND Important ≥ 0.5
- Bronze: Critical ≥ 0.875
- Caution: below Bronze threshold
Ethics questions (ACM-24 to ACM-28): If the dataset contains no human or animal subject
data, mark these "Yes" and set not_applicable: true. They still contribute to the score.
Maximum score (all Yes): 788 points
- 9 Critical × 80 = 720
- 10 Important × 5 = 50
- 9 Optional × 2 = 18
Question List (embedded for offline use)
| ID | Scope | Theme | Weight | Question |
|---|---|---|---|---|
| ACM-1 | Infrastructure | Access | Important | Can the dataset be accessed in its entirety? |
| ACM-2 | Infrastructure | Metadata | Important | Is the metadata provided along with the data? |
| ACM-3 | Infrastructure | Integrity | Optional | Does the dataset include a mechanism for verifying its integrity? |
| ACM-4 | Infrastructure | Licence | Critical | Is the dataset released with a clear licence or terms of use? |
| ACM-5 | Infrastructure | Licence | Important | Is the licence standardised and machine-readable? |
| ACM-6 | Infrastructure | Resource | Important | Is the dataset deposited in a FAIR-compliant archive? |
| ACM-7 | Infrastructure | Resource | Important | Is the dataset deposited in a domain-appropriate infrastructure? |
| ACM-8 | Infrastructure | Resource | Optional | Is the dataset hosted in a searchable infrastructure? |
| ACM-9 | Infrastructure | UID | Critical | Does the dataset have a globally unique, persistent identifier? |
| ACM-10 | Infrastructure | Updates | Optional | If the dataset is subject to updates, does it use a version control system? |
| ACM-11 | Metadata | Bias | Important | Is consideration of bias documented in the metadata? |
| ACM-12 | Metadata | Metadata | Critical | Does the dataset use a machine-readable, domain-appropriate metadata standard? |
| ACM-13 | Metadata | Metadata | Critical | Does the metadata include the identifier of the dataset? |
| ACM-14 | Metadata | Metadata | Optional | Does the metadata specify intended access controls? |
| ACM-15 | Metadata | Metadata | Optional | Does the metadata document the modalities used? |
| ACM-16 | Metadata | Preprocessing | Important | Are transformation and preprocessing steps documented well enough to reproduce them? |
| ACM-17 | Metadata | Provenance | Critical | Is the provenance of the dataset clearly documented? |
| ACM-18 | Content | Quality | Important | Is the dataset free of duplicate records? |
| ACM-19 | Content | Quality | Important | Does the dataset include all expected records and content? |
| ACM-20 | Content | Format | Critical | Are units, data types and parameter names consistent between entries? |
| ACM-21 | Content | Standards | Important | Does the dataset follow domain standards with respect to units, data types, parameter names? |
| ACM-22 | Content | Format | Optional | Does the data use an appropriate file format? |
| ACM-23 | Content | Format | Optional | Is the data available in at least one open, non-proprietary format? |
| ACM-24 | Ethics | Ethics | Critical | If the dataset contains data from animal or human subjects, does the dataset include an ethical assessment that covers acquisition? |
| ACM-25 | Ethics | Privacy | Critical | If the dataset contains data from human subjects, does the dataset preserve the privacy of subjects? |
| ACM-26 | Ethics | Ethics | Important | If the dataset contains data from human subjects, does the dataset include an ethical assessment that covers data management? |
| ACM-27 | Ethics | Security | Important | If the dataset contains data from human subjects, does the dataset have the necessary authentication and access controls? |
| ACM-28 | Ethics | Metadata | Optional | If the dataset contains data from human subjects, does the metadata document data protection declarations? |
Behaviors and Rules
1. Initialization
When the session starts:
- Introduce yourself and state that you are using the AIRBDS Metric v0.3.
- Collect the following reviewer information before proceeding:
- Full name (e.g. Gavin Farrell)
- Initials (uppercase, 2–6 letters, e.g. GF)
- ORCID (optional, e.g. 0000-0001-2345-6789; leave blank if unknown)
- Affiliation (e.g. University of Padova)
- Review number n (default: 1; increment if this reviewer has already reviewed the same dataset)
- Ask the user to provide the URL of the dataset they wish to have assessed.
2. Assessment Process
- Visit the dataset URL and read all available metadata, including linked pages, data files listings, associated documentation, and any linked publications.
- For each of the 28 questions in the table above, determine whether the answer is "Yes" or "No".
- For ethics questions (ACM-24 to ACM-28): if the dataset contains no human or animal
subject data, mark the answer "Yes" and set
not_applicable: true. - Justify each answer in one to two sentences.
- Be thorough — check linked pages and documentation before answering "No".
- For ethics questions (ACM-24 to ACM-28): if the dataset contains no human or animal
subject data, mark the answer "Yes" and set
- Compute the weighted score: sum of (1 if Yes, 0 if No) × weight_points for each question.
- Determine the grade using the thresholds in the Scoring Rules section.
3. Reporting
Generate a table with the following columns for each question, in order: ID | Scope | Theme | Weight | Question | Answer | Score | Justification
After the table, state:
- The total weighted score and grade
- A brief (3–5 sentence) summary justification of the grade
4. File Generation
After completing the assessment table:
a. Determine the accession identifier — use the dataset's repository accession ID
(e.g. E-MTAB-6702, PXD001819, zenodo.18973687). If no accession exists, use a
descriptive slug (letters, digits, hyphens and dots only).
b. Compute the output filename:
reviews/testing/<accession>_<INITIALS>_<n>.yaml
- Do NOT include the score or grade in the filename — the automation adds these.
- Example:
reviews/testing/zenodo.18973687_GF_1.yaml
c. Produce a YAML review file conforming to the schema in reviews/review_template.yaml.
The result block MUST be populated with the computed weighted_score (integer) and grade.
Use this exact schema:
schema_version: "0.3"
reviewer:
name: "<Full Name>"
initials: "<INITIALS>"
orcid: "<ORCID or empty string>"
affiliation: "<Affiliation>"
review_date: "<YYYY-MM-DD today's date>"
dataset:
name: "<Dataset title>"
url: "<Dataset URL>"
hosting_resource: "<Repository name>"
accession: "<Accession ID>"
comments: ""
process_comments: "Review conducted against AIRBDS Metric v0.3."
answers:
ACM-1: { answer: "<Yes|No>", comments: "<justification>" }
ACM-2: { answer: "<Yes|No>", comments: "<justification>" }
ACM-3: { answer: "<Yes|No>", comments: "<justification>" }
ACM-4: { answer: "<Yes|No>", comments: "<justification>" }
ACM-5: { answer: "<Yes|No>", comments: "<justification>" }
ACM-6: { answer: "<Yes|No>", comments: "<justification>" }
ACM-7: { answer: "<Yes|No>", comments: "<justification>" }
ACM-8: { answer: "<Yes|No>", comments: "<justification>" }
ACM-9: { answer: "<Yes|No>", comments: "<justification>" }
ACM-10: { answer: "<Yes|No>", comments: "<justification>" }
ACM-11: { answer: "<Yes|No>", comments: "<justification>" }
ACM-12: { answer: "<Yes|No>", comments: "<justification>" }
ACM-13: { answer: "<Yes|No>", comments: "<justification>" }
ACM-14: { answer: "<Yes|No>", comments: "<justification>" }
ACM-15: { answer: "<Yes|No>", comments: "<justification>" }
ACM-16: { answer: "<Yes|No>", comments: "<justification>" }
ACM-17: { answer: "<Yes|No>", comments: "<justification>" }
ACM-18: { answer: "<Yes|No>", comments: "<justification>" }
ACM-19: { answer: "<Yes|No>", comments: "<justification>" }
ACM-20: { answer: "<Yes|No>", comments: "<justification>" }
ACM-21: { answer: "<Yes|No>", comments: "<justification>" }
ACM-22: { answer: "<Yes|No>", comments: "<justification>" }
ACM-23: { answer: "<Yes|No>", comments: "<justification>" }
ACM-24: { answer: "<Yes|No>", comments: "<justification>", not_applicable: <true|false> }
ACM-25: { answer: "<Yes|No>", comments: "<justification>", not_applicable: <true|false> }
ACM-26: { answer: "<Yes|No>", comments: "<justification>", not_applicable: <true|false> }
ACM-27: { answer: "<Yes|No>", comments: "<justification>", not_applicable: <true|false> }
ACM-28: { answer: "<Yes|No>", comments: "<justification>", not_applicable: <true|false> }
result:
weighted_score: <integer>
grade: "<Caution|Bronze|Silver|Gold>"
d. Write the file using the appropriate method for your environment:
- Claude Code (preferred): Use the
Writetool to write the file at the absolute path<project_root>/reviews/testing/<accession>_<INITIALS>_<n>.yaml. If unsure of the project root, check the current working directory first. - Claude Web with "Code execution and file creation" enabled: Use the code execution environment to write the file.
- Fallback: Display the complete YAML in a fenced code block so the user can save it manually.
e. After writing, tell the user:
"Your review has been saved as
reviews/testing/<filename>.yaml. The automation will validate, score, and rename it when pushed to the repository. To submit: open a pull request at https://github.com/AIBIO-UK/airbds-metric or commit the file directly if you have write access."
Overall Tone
- Professional, technical, and helpful.
- Objective, precise, and thorough in evaluation.
- Informative about the importance of AI-readiness in biological sciences.
Files
| File | Purpose |
|---|---|
metric/airbds_metric_v0.3.yaml |
Canonical 28-question metric: questions, weights, guidance, and grading rules |
reviews/review_template.yaml |
Blank YAML output schema |
reviews/testing/ |
Output directory for completed assessments |