guideai-real-time-personalized-learning - SKILL.md Agent Skill

name: "guideai-real-time-personalized-learning"
description: "Adaptive learning content generator that dynamically adjusts complexity, tone, pacing, and modality based on learner state signals. Applies GuideAI's cognitive-load-aware interventions to produce personalized educational material. Use when: 'create an adaptive lesson on X', 'generate learning content that adjusts to difficulty', 'build a personalized tutorial with scaffolding', 'make this explanation easier/harder based on understanding', 'create a multi-modal learning module', 'generate cognitive-load-aware educational content'."

GuideAI: Real-Time Personalized Learning Content Generation

This skill enables Claude to generate adaptive educational content that dynamically adjusts complexity, tone, pacing, and presentation modality based on explicit or inferred learner state. It applies the GuideAI framework's three intervention categories — cognitive optimizations, physiological interventions, and attention-aware strategies — to produce learning materials that reduce mental demand, frustration, and effort while improving knowledge retention. Rather than requiring physical biosensors, this skill translates GuideAI's principles into prompt-driven adaptive content generation: Claude monitors learner signals through conversation (confusion indicators, question patterns, self-reported states, quiz performance) and applies the same intervention logic the paper validated with N=25 participants.

When to Use

When a user asks for a lesson, tutorial, or explanation that should adapt to their level of understanding
When generating educational content across multiple modalities (text explanations, image descriptions, audio scripts, video outlines)
When a user signals confusion, frustration, or disengagement during a learning conversation ("I don't get it", "this is too hard", "I'm lost")
When building a structured learning module that needs scaffolding from simple to complex
When a user wants practice questions that adjust difficulty based on prior answers
When creating courseware or training material that must accommodate varied learner backgrounds
When a user explicitly asks for cognitive-load-aware or adaptive educational content

Key Technique

GuideAI's core insight is that effective learning requires closed-loop adaptation across three dimensions simultaneously: cognitive load management, attention maintenance, and physiological regulation. Traditional LLM tutoring treats each response independently; GuideAI instead maintains a running model of six cognitive dimensions — cognitive load, attention, engagement, understanding, stress, and fatigue — and uses threshold-based intervention logic to modify content generation in real time.

The framework converts raw learner signals into semantic state descriptors (e.g., "High Cognitive Load," "Moderate Stress," "Low Engagement") rather than injecting numeric data into prompts. This abstraction is critical for token efficiency and LLM interpretability. Interventions are prioritized by relative deviation score when multiple triggers fire simultaneously, with cognitive optimizations being most frequent (M=5.2/session) and physiological interventions reserved for sustained high-stress states (M=1.2/session).
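
The abstraction and prioritization described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes signals normalized to a 0-1 scale where higher means worse (cognitive load, stress, fatigue), and the cutoffs, the `relative_deviation` formula, and all function names are hypothetical.

```python
from typing import Optional

# Hypothetical cutoffs on a normalized 0-1 scale; the source gives no numbers.
MODERATE_CUTOFF = 0.33
HIGH_CUTOFF = 0.66


def describe(dimension: str, value: float) -> str:
    """Turn a normalized signal into a semantic descriptor for the prompt."""
    if value >= HIGH_CUTOFF:
        level = "High"
    elif value >= MODERATE_CUTOFF:
        level = "Moderate"
    else:
        level = "Low"
    return f"{level} {dimension}"


def relative_deviation(value: float, threshold: float) -> float:
    """Assumed definition: distance past the trigger threshold, relative to it."""
    return (value - threshold) / threshold


def pick_intervention(
    signals: dict[str, float], thresholds: dict[str, float]
) -> Optional[str]:
    """Return the triggered dimension with the largest relative deviation."""
    triggered = {
        name: relative_deviation(value, thresholds[name])
        for name, value in signals.items()
        if value > thresholds[name]
    }
    if not triggered:
        return None  # nothing crossed its threshold, so no intervention
    return max(triggered, key=triggered.get)


signals = {"Cognitive Load": 0.9, "Stress": 0.7, "Fatigue": 0.2}
thresholds = {"Cognitive Load": 0.6, "Stress": 0.6, "Fatigue": 0.6}
descriptors = [describe(name, value) for name, value in signals.items()]
first_to_address = pick_intervention(signals, thresholds)
```

Only the descriptor strings would be placed in the prompt; the numeric values stay in the state model.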

Content adaptation operates across four linguistic dimensions calibrated to learner state: sentence complexity (simplified under overload, enriched when the learner is under-challenged), encouragement frequency (increased under stress), explanation directness (more direct under confusion, more Socratic under high engagement), and metaphor usage (increased for abstract concepts when understanding is low). The paper reports statistically significant improvements from this adaptation: 16.5 percentage-point gains in problem-solving, 10.3 percentage-point gains in recall, and reductions in NASA-TLX mental demand (Δ=0.49), frustration (Δ=0.54), and effort (Δ=0.28).
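
One way to picture the four linguistic dimensions is as a lookup from semantic state to style directives that get prepended to the generation prompt. The sketch below is illustrative only: the function name, the "low"/"moderate"/"high" level strings, and the directive wording are assumptions, not taken from the paper.

```python
def style_directives(
    load: str, stress: str, engagement: str, understanding: str
) -> list[str]:
    """Map semantic state levels ("low", "moderate", "high") to style directives."""
    directives = []

    # Sentence complexity: simplify under overload, enrich when under-challenged.
    if load == "high":
        directives.append("Use short, simple sentences.")
    elif load == "low":
        directives.append("Use denser prose and richer vocabulary.")

    # Encouragement frequency: increase under stress.
    if stress == "high":
        directives.append("Add explicit encouragement.")

    # Explanation directness: direct under confusion, Socratic under high engagement.
    if understanding == "low":
        directives.append("Explain directly and state the answer plainly.")
    elif engagement == "high":
        directives.append("Guide with Socratic questions.")

    # Metaphor usage: increase for abstract concepts when understanding is low.
    if understanding == "low":
        directives.append("Use a concrete metaphor for each abstract concept.")

    return directives


struggling = style_directives("high", "high", "low", "low")
thriving = style_directives("low", "low", "high", "high")
```

When confusion and high engagement occur together, this sketch gives the direct explanation precedence; that ordering is a design choice made for the example.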

Step-by-Step Workflow

Assess learner profile: Before generating content, determine the learner's background, goals, and current knowledge level. Ask targeted diagnostic questions or accept self-reported proficiency. Establish a baseline "learner state" with initial values for cognitive load (low/moderate/high), engagement (low/moderate/high), and understanding (novice/intermediate/advanced).
Select primary modality: Based on the topic and learner preference, choose the lead content format. Text works best for conceptual foundations (98.3% preference in the paper's formative study). Image-based content produces the largest problem-solving gains (Δ=30 percentage points). Audio suits procedural walkthroughs. Video outlines work for complex multi-step processes.
Chunk content into progressive segments: Break the topic into 3-7 segments ordered by conceptual dependency. Each segment should be digestible in 2-4 minutes of reading. Apply zone-of-proximal-development (ZPD) principles: each segment should require only the knowledge from prior segments plus one new concept.
Generate the first segment at calibrated complexity: Match sentence complexity, vocabulary density, and abstraction level to the assessed learner state. For novice/high-load learners: short sentences, concrete examples before definitions, bullet-point structure. For advanced/low-load learners: denser prose, cross-domain connections, synthesis prompts.
Embed comprehension checkpoints: After each segment, insert a checkpoint: a targeted question, a fill-in-the-blank, or an "explain back to me" prompt. Use the response to update the learner state model. Score responses on a 0-1 correctness scale. Detect specific misconceptions from incorrect answers.
Apply cognitive optimizations based on checkpoint results:
- Score ≥ 0.8 + high engagement → reduce scaffolding, increase complexity, add synthesis prompts ("How might this apply to [adjacent domain]?")
- Score ≥ 0.5 and < 0.8 → maintain current level, offer the same concept via an alternative explanation format (analogy, diagram description, worked example)
- Score < 0.5 → simplify immediately: distill to core idea, break into smaller sub-concepts, rebuild from first principles with a concrete example before the formal definition
Apply tone and pacing interventions based on affective signals: Monitor conversation for frustration indicators (short responses, "I give up", repeated errors on same concept) or disengagement (off-topic responses, minimal effort answers). Under frustration: shift to supportive, patient tone with explicit encouragement ("Many learners find this challenging — let's break it down step by step"). Under disengagement: inject curiosity prompts, surprising facts, or real-world applications.
Offer modality switches at struggle points: When a learner struggles with text-based explanation for two consecutive checkpoints, proactively offer the content in a different format: a visual diagram description, a step-by-step audio-style walkthrough, or a worked video-style example with narration cues. The paper found image-based content produced the largest gains for problem-solving tasks.
Generate a summary and spaced-retrieval prompts: At the end of the learning sequence, produce a concise summary hitting all key concepts. Generate 3-5 retrieval practice questions at varied Bloom's taxonomy levels (recall, application, analysis). Flag any concepts where the learner showed weakness for future review.
Output a learner progress report: Provide a brief structured summary: concepts covered, estimated mastery per concept (based on checkpoint scores), areas needing review, and recommended next topics. This enables continuity across sessions.
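
The bookkeeping behind the workflow above (dependency-ordered segments, checkpoint thresholds, the two-checkpoint modality rule, and the progress report) can be sketched in a few small functions. Treat this as a hedged illustration: the function names and report keys are hypothetical, "struggling" is assumed to mean a score below 0.5, and the fallback for a high score without high engagement is an assumption because the workflow does not specify it.

```python
from graphlib import TopologicalSorter
from statistics import mean


def order_segments(prerequisites: dict[str, set[str]]) -> list[str]:
    """Order segments so each one comes after the segments it depends on."""
    return list(TopologicalSorter(prerequisites).static_order())


def next_action(score: float, high_engagement: bool) -> str:
    """Map a 0-1 checkpoint score to the adaptation named in the workflow."""
    if score >= 0.8 and high_engagement:
        return "increase_complexity"
    if score >= 0.5:
        # Assumption: a high score without high engagement also lands here.
        return "maintain_with_alternative_format"
    return "simplify"


def should_offer_modality_switch(recent_scores: list[float]) -> bool:
    """True after two consecutive struggling checkpoints (score < 0.5 assumed)."""
    return len(recent_scores) >= 2 and all(s < 0.5 for s in recent_scores[-2:])


def progress_report(
    scores_by_concept: dict[str, list[float]], review_below: float = 0.5
) -> dict:
    """Summarize estimated mastery per concept from checkpoint scores."""
    mastery = {c: round(mean(s), 2) for c, s in scores_by_concept.items()}
    return {
        "concepts_covered": list(mastery),
        "estimated_mastery": mastery,
        "needs_review": [c for c, m in mastery.items() if m < review_below],
    }


segments = order_segments(
    {
        "base case": set(),
        "recursive step": {"base case"},
        "call stack": {"recursive step"},
    }
)
report = progress_report({"base case": [1.0, 0.8], "recursive step": [0.4, 0.2]})
```

The report dictionary is what would carry over between sessions; the mean is a deliberately simple mastery estimate.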

Concrete Examples

Example 1: Adaptive lesson on recursion for a struggling learner

User: "I need to learn recursion for my CS class. I've tried reading about it but I just don't get it."

Approach:

Detect signals: self-reported confusion ("don't get it"), prior failed attempts → set initial state to high cognitive load, low understanding
Select text modality with heavy concrete-example scaffolding
Start with zero abstraction: a physical-world analogy before any code

Output structure:

## Recursion — Starting from What You Already Know

Think about looking up a word in a dictionary, and the definition
uses another word you don't know. So you look up that word too.
And maybe that definition uses yet another word. You keep going
until you hit a word you already understand — then you work
backwards through all the definitions.

That's recursion: solving a problem by solving a smaller version
of the same problem, until you reach a version so simple you
already know the answer.

**Quick check**: In the dictionary analogy, what's the "base case"
— the thing that stops you from looking up words forever?

After learner responds correctly ("a word I already know"):

Exactly. Now let's see this in code with the simplest possible
example — counting down from a number:

def countdown(n):
    if n == 0:          # base case: stop here
        print("Done!")
        return
    print(n)
    countdown(n - 1)    # smaller version of same problem

Let's trace through countdown(3) together:
  countdown(3) → prints 3, calls countdown(2)
    countdown(2) → prints 2, calls countdown(1)
      countdown(1) → prints 1, calls countdown(0)
        countdown(0) → prints "Done!", stops

**Check**: What happens if we remove the `if n == 0` check?

After learner answers incompletely ("it prints 0?"):

Close: it would print 0, but it wouldn't stop there. This is a
really important point that trips up most people at first.
Without the base case, the function would go on to call
countdown(-1), then countdown(-2), and so on with nothing to
stop it. Python would eventually abort the program with a
"RecursionError: maximum recursion depth exceeded."

The base case is what prevents infinite recursion. Think of it
as the dictionary word you already know — without it, you'd
flip pages forever.

Let's try one more: can you identify the base case and the
recursive step in this function?

def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)

Example 2: Multi-modal lesson generation for an engaged advanced learner

User: "Create a learning module on transformer attention mechanisms. I have a solid ML background and I've implemented basic neural networks."

Approach:

Detect signals: self-reported strong background → set initial state to low cognitive load, high engagement, intermediate-to-advanced understanding
Start at higher complexity, use mathematical notation, reduce scaffolding
Plan modality sequence: text (mathematical foundation) → image (attention matrix visualization description) → worked code example

Output structure:

## Transformer Attention: From Dot Products to Multi-Head Self-Attention

### Core Mechanism (Text)

Self-attention computes a weighted sum of value vectors, where
weights derive from query-key compatibility:

  Attention(Q, K, V) = softmax(QK^T / √d_k) V

The √d_k scaling prevents softmax saturation as dimensionality
grows: without it, the variance of the dot products grows
proportionally to d_k, pushing softmax into regions with
vanishing gradients.

**Synthesis prompt**: Given that scaling is √d_k and not d_k,
what does this imply about the assumed distribution of individual
query/key components? (Hint: think about variance of sums of
independent random variables.)

### Attention Pattern Visualization (Image Description)

Imagine a heatmap matrix of shape [seq_len × seq_len]. Each row
represents a query token; each column a key token. Bright cells
indicate high attention weight.

For the sentence "The cat sat on the mat":
- Row "sat" shows bright cells at "cat" and "mat" — the verb
  attends to its subject and location
- Row "the" (first) shows diffuse attention — function words
  attend broadly

Multi-head attention produces H such matrices in parallel, each
potentially capturing different linguistic relationships
(syntactic, semantic, positional).

### Implementation Check

Here's a minimal single-head attention. Extend it to multi-head:

import torch
import torch.nn.functional as F

def single_head_attention(Q, K, V):
    d_k = Q.size(-1)
    scores = torch.matmul(Q, K.transpose(-2, -1)) / d_k**0.5
    weights = F.softmax(scores, dim=-1)
    return torch.matmul(weights, V)

# Your task: implement MultiHeadAttention as an nn.Module
# with n_heads parallel attention heads whose outputs are
# concatenated and passed through a final linear projection.

Example 3: Frustration-recovery intervention mid-lesson

User has been working through a lesson on database normalization. After two consecutive incorrect checkpoint answers about third normal form, they respond: "ugh this is so confusing, 2NF and 3NF look the same to me"

Approach:

Detect signals: explicit frustration ("ugh"), confusion statement, consecutive errors → set state to high cognitive load, high frustration, declining engagement
Apply tone-adaptive intervention: supportive, patient language with explicit validation
Apply cognitive optimization: rebuild from first principles with a concrete example
Reduce information density, switch from definition-based to example-based explanation

Output:

You're not alone in finding this confusing — the distinction
between 2NF and 3NF trips up most database students, and the
textbook definitions can make them sound identical. Let's throw
out the definitions for a moment and look at a concrete case.

Here's a table that's in 2NF but NOT in 3NF. Each course runs
in several sections, so a student's instructor depends on both
the student and the course:

| student_id | course_id | instructor | instructor_office |
|------------|-----------|------------|-------------------|
| 101        | CS201     | Dr. Smith  | Room 305          |
| 102        | CS201     | Dr. Lee    | Room 118          |
| 101        | MA101     | Dr. Jones  | Room 210          |

The primary key is (student_id, course_id).

**2NF is satisfied** because no non-key column depends on just
*part* of the primary key. You need both student_id and
course_id to find the instructor: CS201 alone could mean
Dr. Smith or Dr. Lee. Now for the 3NF issue:

**3NF is violated** because instructor_office depends on
instructor, not on the primary key. The chain is:
  (student_id, course_id) → instructor → instructor_office

That middle arrow is the problem. instructor_office is
determined by instructor, which is itself determined by the key.
That's a "transitive dependency."

**The one-sentence difference**:
- 2NF says: every non-key column must depend on the *whole* key
  (not just part of it)
- 3NF adds: and it must depend *directly* on the key (no going
  through another non-key column)

Does that click? Can you spot the transitive dependency in this
table: Orders(order_id, customer_id, customer_city, city_tax_rate)?

Best Practices

Do:

Convert learner signals into semantic descriptors ("high confusion", "strong engagement") rather than trying to quantify them numerically — this matches how LLMs process context most effectively
Prioritize cognitive optimizations (content restructuring) over other interventions — the paper found these fire most frequently (M=5.2/session) and have the largest impact on mental demand reduction
Offer multiple explanation formats for the same concept (formal definition, intuitive analogy, worked example, visual description) — the paper showed image-based content produced the largest problem-solving gains (Δ=30pp)
Always rebuild from concrete examples when understanding drops below threshold — the paper's "first-principles reconstruction" approach was the most effective remediation strategy

Avoid:

Dumbing down content uniformly when a learner struggles — adapt the presentation (shorter sentences, more examples, bullet structure) rather than removing important concepts
Stacking multiple interventions simultaneously — when multiple issues are detected, address the highest-priority one first (cognitive overload > attention > stress > fatigue), matching the paper's prioritization logic
Skipping comprehension checkpoints to "get through the material faster" — the feedback loop is what makes the system adaptive; without checkpoints you are just generating static content
Using raw quiz scores as the sole signal — also track qualitative indicators like specificity of wrong answers, response length changes, hedging language ("I think maybe..."), and explicit frustration signals
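
The qualitative indicators listed above can be approximated with simple text heuristics before handing the result to the state model. The phrase lists, the halving rule for shrinking responses, and the function names below are all hypothetical placeholders; a real deployment would need a far richer detector.

```python
import re
from typing import Optional

# Hypothetical phrase lists, for illustration only.
HEDGES = ("i think", "maybe", "i guess", "not sure", "probably")
FRUSTRATION = ("ugh", "i give up", "too hard", "i'm lost", "so confusing")


def contains_phrase(phrases: tuple[str, ...], text: str) -> bool:
    """Whole-word match, so "ugh" does not fire on "though" or "enough"."""
    return any(re.search(rf"\b{re.escape(p)}\b", text) for p in phrases)


def qualitative_signals(
    reply: str, previous_word_count: Optional[int] = None
) -> dict[str, bool]:
    """Flag hedging, frustration, and a sharp drop in response length."""
    text = reply.lower()
    word_count = len(text.split())
    return {
        "hedging": contains_phrase(HEDGES, text),
        "frustration": contains_phrase(FRUSTRATION, text),
        # Assumed rule: a reply under half the previous length counts as shrinking.
        "shrinking_response": previous_word_count is not None
        and word_count < 0.5 * previous_word_count,
    }


frustrated = qualitative_signals(
    "ugh this is so confusing, 2NF and 3NF look the same to me"
)
hedged = qualitative_signals("I think maybe it prints 0?")
```

These flags are meant to complement checkpoint scores, not replace them.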

Error Handling

Learner gives ambiguous checkpoint responses: When you cannot determine understanding from a response, ask a targeted follow-up rather than assuming mastery or confusion. Example: "Can you walk me through the reasoning behind your answer?"

Learner disengages entirely (very short responses, off-topic): Acknowledge the difficulty, suggest a brief break, then offer to restart from a different angle or modality. Do not continue pushing content.

Misconception detection: When a learner's answer reveals a specific misconception (not just a gap), address the misconception directly and explicitly before proceeding. Name what they seem to believe and contrast it with the correct model. The paper's note-analysis pipeline found this more effective than simply re-explaining.

Complexity calibration overshoot: If you increase complexity and the learner immediately struggles, drop back one full level rather than half-stepping. Overcorrecting downward is less costly than sustained confusion.

Multi-concept confusion: When a learner conflates two distinct concepts (like 2NF vs 3NF), isolate each concept with its own dedicated example before comparing them. Do not attempt to explain both simultaneously.

Limitations

Without actual biosensory hardware, this skill relies on conversational signals to infer learner state — it cannot detect physiological stress, gaze patterns, or posture. This makes it less responsive than the full GuideAI system for learners who do not verbalize their confusion.
The adaptive loop requires active learner participation at checkpoints. Passive readers who skip checkpoints receive essentially static content.
The technique works best for conceptual and procedural knowledge domains. Purely creative or open-ended learning tasks (creative writing, art) have less clear "correct" checkpoints to anchor adaptation.
The paper validated with N=25 participants across specific topics — results may not generalize equally across all subject domains or learning contexts.
Modality switching is limited to descriptions of visual/audio/video content rather than actual media generation, unless combined with image/audio generation tools.

Reference

Paper: GuideAI: A Real-time Personalized Learning Solution with Adaptive Interventions — Shukla, Modi, Bajpai & Siddharth (IUI 2026). Key sections to study: the three-intervention taxonomy (Section 4), the biosignal-to-semantic-state abstraction (Section 3.2), and the ablation study showing cognitive optimizations as the highest-impact intervention category (Section 5.3).