name: "personality-as-relational-infrastructure"
description: "Design LLM messaging systems that infuse Big Five personality traits for sustained user engagement. Uses aggregate-exposure personality alignment rather than per-message optimization. Trigger phrases: 'personality-aligned messages', 'BFPT messaging', 'adaptive notification system', 'personality-infused prompts', 'behavior change messaging', 'JITAI system design'"
Personality as Relational Infrastructure
This skill enables Claude to design and implement LLM-powered messaging systems that embed Big Five Personality Traits (BFPT) — Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism — into system-initiated messages for behavior change, notifications, coaching, and sustained user engagement. The core insight, drawn from Hofer et al. (2026), is that personality-based personalization works through aggregate exposure (cumulative tone consistency across many messages) rather than per-message optimization. This means the system architecture should prioritize consistent personality alignment across an entire message stream, not perfecting each individual message in isolation.
When to Use
- When the user is building a notification, coaching, or nudge system that sends repeated messages to users over time (e.g., fitness apps, habit trackers, learning platforms)
- When the user asks to personalize LLM-generated messages using personality profiles or Big Five traits
- When designing JITAI (Just-In-Time Adaptive Intervention) systems that deliver context-aware prompts at opportune moments
- When the user wants to implement prompt strategies (baseline, few-shot, fine-tuned, or RAG) for personality-aligned text generation
- When building A/B testing or evaluation frameworks for personalized messaging systems
- When the user asks how to make AI-generated messages feel more natural, appropriate, or less annoying over repeated exposure
- When implementing Communication Accommodation Theory principles in chatbot or messaging UX
Key Technique
Aggregate Exposure over Per-Message Optimization. The paper tested four LLM generation strategies — baseline prompting, few-shot prompting, fine-tuned models, and retrieval-augmented generation (RAG) — each with and without Big Five Personality Trait infusion. The surprising finding: infusing personality traits into a single message produced no measurable improvement on that specific message's perceived quality (no trial-level effect). However, participants who received a higher proportion of personality-informed messages across their entire experience rated the overall message stream as more personalized, more appropriate, and reported less negative affect. This is a person-level exposure effect.
Communication Accommodation Theory (CAT) explains why. CAT describes how communicators adjust their style to converge with or diverge from their audience. When an LLM consistently mirrors a user's personality traits across messages — matching the energy level of an extravert, the precision preferences of a conscientious person, or the warmth expectations of an agreeable person — users perceive the system as more attuned to them. This convergence effect is cumulative: it builds through pattern recognition across interactions, not through any single message being noticeably "better."
Practical implication for system design: Instead of investing engineering effort into making each message maximally personality-optimized (diminishing returns), invest in ensuring personality alignment is consistently present across the message stream. Because the reported effect tracks the proportion of personality-informed messages a user receives, a messaging system that infuses personality traits into 80-100% of its outputs should, by this account, outperform one that perfectly optimizes 20% of messages. The architecture should treat personality as infrastructure (a persistent, system-prompt-level concern), not as a per-request feature.
Step-by-Step Workflow
Define the user personality model. Collect or infer Big Five trait scores for each user. Represent each trait on a continuous scale (e.g., 1-5 or 0-100). Store as a structured profile:
{ openness: 4.2, conscientiousness: 3.8, extraversion: 2.1, agreeableness: 4.5, neuroticism: 1.9 }
If direct assessment is unavailable, infer from user behavior patterns or allow self-reporting via a brief questionnaire (e.g., the Ten-Item Personality Inventory, TIPI).
Translate trait scores into communication style directives. Map each trait dimension to concrete language properties. For example:
- High Openness → use creative metaphors, varied vocabulary, exploratory framing
- High Conscientiousness → use specific numbers, structured lists, clear action items
- High Extraversion → use enthusiastic tone, social references, exclamation energy
- High Agreeableness → use warm language, collaborative framing, empathetic acknowledgments
- High Neuroticism → use reassuring tone, minimize pressure language, emphasize safety and incremental progress
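The trait-to-style mapping above can be kept as data rather than branching logic. The following is a minimal sketch under stated assumptions: the `STYLE_TABLE` name, the low-trait wording (borrowed from the fitness example later in this document), and the neutral band around the midpoint are illustrative choices, not something the paper prescribes.

```python
from typing import Dict, List, Tuple

# (directive when the trait is high, directive when the trait is low).
# High-side wording follows the list above; the low-side wording and the
# neutral band are illustrative assumptions.
STYLE_TABLE: Dict[str, Tuple[str, str]] = {
    "openness": (
        "Use creative metaphors, varied vocabulary, and exploratory framing.",
        "Use straightforward, familiar language.",
    ),
    "conscientiousness": (
        "Use specific numbers, structured lists, and clear action items.",
        "Keep it flexible and low-pressure.",
    ),
    "extraversion": (
        "Use an enthusiastic tone and social references.",
        "Use a calm, reflective tone focused on personal growth.",
    ),
    "agreeableness": (
        "Use warm language, collaborative framing, and empathetic acknowledgments.",
        "Use direct, results-focused language.",
    ),
    "neuroticism": (
        "Use a reassuring tone, minimize pressure language, and emphasize incremental progress.",
        "Use confident, challenge-oriented framing.",
    ),
}


def style_directives(
    profile: Dict[str, float], midpoint: float = 3.0, margin: float = 0.5
) -> List[str]:
    """Return one directive per trait that is clearly high or clearly low.

    Scores strictly inside (midpoint - margin, midpoint + margin) yield no
    directive, so near-average traits leave the tone neutral.
    """
    directives = []
    for trait, (high, low) in STYLE_TABLE.items():
        score = profile[trait]
        if score >= midpoint + margin:
            directives.append(high)
        elif score <= midpoint - margin:
            directives.append(low)
    return directives
```

Widening `margin` (for example to 1.0) restricts trait-specific language to more extreme scores, which is one way to tone down the style if users find it heavy-handed.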
Embed personality directives at the system-prompt level. Place personality alignment instructions in the system prompt (not the user message), so that every generation from the model inherits the personality frame. This is the critical architectural decision: personality is infrastructure, not a per-message add-on.
Choose a generation strategy based on available resources:
- Baseline prompting: Include personality directives directly in the system prompt. Lowest cost, easiest to implement. Sufficient for most applications.
- Few-shot prompting: Provide 3-5 example messages that exemplify the target personality style in the prompt. Better style consistency.
- Fine-tuned model: Train on a corpus of messages labeled by personality style. Best consistency but highest upfront cost.
- RAG: Retrieve personality-matched example messages from a vector store at generation time. Good balance of quality and flexibility.
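For the few-shot and RAG options, the retrieval step can be sketched without a vector store by ranking stored example messages by how close their author-assigned trait profile is to the user's. This is an assumption-laden sketch: the in-memory `library`, the Euclidean distance over raw trait scores, and the function names are hypothetical stand-ins for whatever store and similarity measure a real system uses.

```python
import math
from typing import Dict, List, Tuple

TRAITS = ("openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism")

Profile = Dict[str, float]


def profile_distance(a: Profile, b: Profile) -> float:
    """Euclidean distance between two Big Five profiles on the same scale."""
    return math.sqrt(sum((a[t] - b[t]) ** 2 for t in TRAITS))


def retrieve_examples(user: Profile, library: List[Tuple[Profile, str]], k: int = 3) -> List[str]:
    """Return the k example messages whose target profile is nearest the user's."""
    ranked = sorted(library, key=lambda item: profile_distance(user, item[0]))
    return [text for _, text in ranked[:k]]


def build_few_shot_block(examples: List[str]) -> str:
    """Format retrieved examples as a block to append to the system prompt."""
    lines = ["Example messages in the target style:"]
    lines += [f"{i}. {text}" for i, text in enumerate(examples, start=1)]
    return "\n".join(lines)
```

Appending the block to the system prompt, rather than to individual user messages, keeps the examples part of the persistent personality frame.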
Design the JITAI message schema. Each message needs: (a) situational context (time of day, user activity state, recent behavior), (b) intervention goal (encourage, remind, celebrate, redirect), (c) the personality-aligned tone directives produced by the trait-to-style mapping above. Structure this as a generation request template.
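One way to express the message schema described above is a small request object that keeps the three parts separate and renders them into a system prompt and a user prompt. The class and field names here are hypothetical, and the neutral fallback wording is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Goal(Enum):
    ENCOURAGE = "encourage"
    REMIND = "remind"
    CELEBRATE = "celebrate"
    REDIRECT = "redirect"


@dataclass(frozen=True)
class SituationalContext:
    time_of_day: str
    activity_state: str
    recent_behavior: str


@dataclass(frozen=True)
class GenerationRequest:
    context: SituationalContext
    goal: Goal
    tone_directives: Tuple[str, ...]

    def system_prompt(self) -> str:
        # Tone lives in the system prompt so every message inherits it.
        style = " ".join(self.tone_directives) or "Use a neutral, moderately warm tone."
        return (
            "You write short system-initiated messages. "
            f"Style: {style} Never mention personality traits."
        )

    def user_prompt(self) -> str:
        # Situation and goal change per message; tone does not.
        c = self.context
        return (
            f"Time of day: {c.time_of_day}. Activity state: {c.activity_state}. "
            f"Recent behavior: {c.recent_behavior}. Goal: {self.goal.value}. "
            "Generate one message."
        )
```

Keeping tone in `system_prompt()` and situation in `user_prompt()` mirrors the split between persistent personality frame and per-message context.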
Ensure high personality-infusion coverage. Based on the paper's findings, target 80-100% of messages in the stream to carry personality alignment. Do not selectively apply personality only to "important" messages — the effect is cumulative and depends on proportion of exposure.
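Coverage is easy to lose silently, for example when a transactional template bypasses the generation pipeline. A minimal sketch of a coverage check follows, assuming a message log of `(user_id, was_infused)` pairs; the log shape and function names are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def infusion_coverage(log: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Share of each user's messages that carried personality alignment."""
    sent: Dict[str, int] = defaultdict(int)
    infused: Dict[str, int] = defaultdict(int)
    for user_id, was_infused in log:
        sent[user_id] += 1
        infused[user_id] += int(was_infused)
    return {user_id: infused[user_id] / sent[user_id] for user_id in sent}


def users_below_target(coverage: Dict[str, float], target: float = 0.8) -> List[str]:
    """Users whose coverage falls under the target proportion."""
    return sorted(user_id for user_id, share in coverage.items() if share < target)
```

Running this per user, rather than over the whole message volume, matters because the effect described here is about each person's own exposure proportion.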
Implement message variation to prevent fatigue. Within the personality-consistent frame, vary message structure, length, and specific content. Personality alignment should constrain tone and style, not make messages repetitive.
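Variation can be handled as a separate layer that picks a message format while the personality directives stay fixed. The sketch below assumes a small, hypothetical set of format labels and simply avoids reusing any of the most recent ones; a real system would likely vary content sources as well.

```python
import random
from collections import deque
from typing import Optional, Sequence

# Hypothetical format labels; replace with whatever structures your messages use.
FORMATS = ("question", "statement", "progress-summary", "suggestion")


class VariationPicker:
    """Pick a message format that was not used in the last `memory` messages."""

    def __init__(self, formats: Sequence[str] = FORMATS, memory: int = 2,
                 seed: Optional[int] = None) -> None:
        if memory >= len(formats):
            raise ValueError("memory must be smaller than the number of formats")
        self._formats = tuple(formats)
        self._recent = deque(maxlen=memory)
        self._rng = random.Random(seed)

    def next_format(self) -> str:
        candidates = [f for f in self._formats if f not in self._recent]
        choice = self._rng.choice(candidates)
        self._recent.append(choice)
        return choice
```

The chosen format would be passed in the per-message prompt (for example "Write this as a question"), leaving the system-prompt tone untouched.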
Build evaluation around aggregate metrics, not per-message scores. Measure perceived personalization, appropriateness, and affective response at the session/week level, not per individual message. Use within-between decomposition: separate whether a specific message scores well (trial-level) from whether the user's overall experience improves (person-level).
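The within-between decomposition can be prepared with person-mean centering: each trial's infusion indicator is split into the person's overall infusion proportion (between-person) and the trial's deviation from that proportion (within-person). The sketch below only builds the two predictors; the record types are hypothetical, and fitting a multilevel model on them is left to a statistics package.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Trial(NamedTuple):
    user_id: str
    infused: int   # 1 if the message was personality-infused, else 0
    rating: float  # the per-message rating collected for this trial


class DecomposedTrial(NamedTuple):
    user_id: str
    between: float  # person-level: share of this user's messages that were infused
    within: float   # trial-level: this message's deviation from that share
    rating: float


def decompose(trials: List[Trial]) -> List[DecomposedTrial]:
    """Split the infusion indicator into between- and within-person parts."""
    totals: Dict[str, List[int]] = defaultdict(lambda: [0, 0])
    for t in trials:
        totals[t.user_id][0] += t.infused
        totals[t.user_id][1] += 1
    rows = []
    for t in trials:
        infused_count, n = totals[t.user_id]
        person_mean = infused_count / n
        rows.append(DecomposedTrial(t.user_id, person_mean, t.infused - person_mean, t.rating))
    return rows
```

Entering `between` and `within` as separate predictors is what lets an analysis distinguish "this message was infused" from "this person saw mostly infused messages".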
Implement A/B testing with personality-infusion proportion as the variable. Compare groups receiving 0%, 50%, and 100% personality-infused messages. Measure person-level outcomes: overall perceived personalization, message appropriateness, and user affect over the exposure period.
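Assignment to the proportion arms can be made deterministic by hashing the user ID, so a user stays in the same arm across sessions without stored state. This is a sketch under assumptions: the experiment key, the three arms, and the per-message bucketing are illustrative, and hash bucketing yields the target proportion only approximately for any single user.

```python
import hashlib

ARMS = (0.0, 0.5, 1.0)  # target proportion of personality-infused messages


def _hash_int(key: str) -> int:
    """Stable 64-bit integer derived from a string key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def assign_arm(user_id: str, experiment: str = "bfpt-proportion-v1") -> float:
    """Map a user to one arm; the same inputs always give the same arm."""
    return ARMS[_hash_int(f"{experiment}:{user_id}") % len(ARMS)]


def should_infuse(user_id: str, message_index: int, arm: float) -> bool:
    """Decide per message whether to apply personality infusion for this arm."""
    bucket = _hash_int(f"msg:{user_id}:{message_index}") % 100
    return bucket < round(arm * 100)
```

If exact proportions per user matter, a fixed schedule (for example alternating messages in the 50% arm) is a simpler alternative to hashing each message.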
Iterate on trait-to-style mappings using CAT principles. Monitor for over-accommodation (style feels patronizing) or under-accommodation (style feels generic). Adjust the intensity of personality expression based on user feedback signals.
Concrete Examples
Example 1: Fitness App Notification System
User: "I'm building a fitness app that sends daily motivation messages. I want to personalize them based on user personality. How should I architect this?"
Approach:
- Define a personality profile schema stored per user
- Create a system prompt template with personality slot injection
- Generate messages with personality as persistent infrastructure
Output — System prompt template:
You are a fitness coaching assistant. You send one motivational message
per day to help the user stay active.
USER PERSONALITY PROFILE:
- Openness ({openness}/5): {"Use creative analogies and novel framing" if openness >= 3.5 else "Use straightforward, familiar language"}
- Conscientiousness ({conscientiousness}/5): {"Include specific metrics, times, and action steps" if conscientiousness >= 3.5 else "Keep it flexible and low-pressure"}
- Extraversion ({extraversion}/5): {"Use energetic, enthusiastic tone with social references" if extraversion >= 3.5 else "Use calm, reflective tone focused on personal growth"}
- Agreeableness ({agreeableness}/5): {"Use warm, supportive language and collaborative framing" if agreeableness >= 3.5 else "Use direct, results-focused language"}
- Neuroticism ({neuroticism}/5): {"Emphasize safety, self-compassion, and small wins" if neuroticism >= 3.5 else "Use confident, challenge-oriented framing"}
CONTEXT: {time_of_day}, user's last activity was {days_since_active} days ago,
current streak: {streak_days} days.
Generate a short motivational message (1-3 sentences). Match the personality
style consistently. Do not mention personality traits directly.
Sample output for a user with high Conscientiousness (4.5) and low Extraversion (1.8):
Your 3pm window is open — a 20-minute walk would put you at 7,200 steps
for the day. You've hit your target 4 of the last 5 days. Quiet consistency
is building something real.
Sample output for a user with high Extraversion (4.7) and high Agreeableness (4.3):
Hey! Your walking crew hit the park yesterday — maybe today's your day to
join them? Even a quick 15 minutes together would be awesome. You've got this!
Example 2: Learning Platform Nudge System
User: "We have a language learning app. Users drop off after week 2. Can we use personality-based messaging to improve retention?"
Approach:
- Collect Big Five scores during onboarding (10-item TIPI questionnaire)
- Build a message generation pipeline with personality as system-level config
- Ensure all automated messages carry personality alignment (not just special ones)
- Track retention at the cohort level grouped by personality-infusion proportion
Output — Implementation architecture:
# Personality-infused message generation pipeline
from dataclasses import dataclass


@dataclass
class PersonalityProfile:
    openness: float  # each trait on a 1.0 - 5.0 scale
    conscientiousness: float
    extraversion: float
    agreeableness: float
    neuroticism: float


def call_llm(system_prompt: str, user_prompt: str) -> str:
    """Placeholder for your LLM client (hypothetical; not a real library call)."""
    raise NotImplementedError("Wire this to your LLM provider.")


def build_style_directives(profile: PersonalityProfile) -> str:
    """Translate trait scores into concrete language style instructions."""
    directives = []

    if profile.openness >= 3.5:
        directives.append("Use creative metaphors and playful language.")
    else:
        directives.append("Use clear, literal explanations.")

    if profile.conscientiousness >= 3.5:
        directives.append("Include progress stats and specific next steps.")
    else:
        directives.append("Keep it casual; avoid rigid structure.")

    if profile.extraversion >= 3.5:
        directives.append("Reference community, leaderboards, social features.")
    else:
        directives.append("Focus on personal mastery and quiet achievement.")

    if profile.agreeableness >= 3.5:
        directives.append("Use encouraging, empathetic tone.")
    else:
        directives.append("Use matter-of-fact, efficient tone.")

    if profile.neuroticism >= 3.5:
        directives.append(
            "Normalize setbacks. Emphasize that any progress counts."
        )
    else:
        directives.append("Use confident, forward-looking framing.")

    return " ".join(directives)


def generate_nudge(profile: PersonalityProfile, context: dict) -> str:
    """Generate a personality-aligned nudge message.

    Key design decision: personality directives live in the system prompt,
    ensuring EVERY message carries the alignment. This is infrastructure,
    not a per-message feature.
    """
    style = build_style_directives(profile)

    # Persistent frame: identical for every message sent to this user.
    system_prompt = (
        "You write short retention nudges for a language learning app. "
        f"Style: {style} "
        "Keep messages to 1-2 sentences. Never mention personality."
    )

    # Per-message situation: changes with each nudge.
    user_prompt = (
        f"User context: learning {context['language']}, "
        f"current streak: {context['streak']} days, "
        f"last session: {context['days_inactive']} days ago, "
        f"skill level: {context['level']}. "
        "Generate one nudge message."
    )

    return call_llm(system_prompt=system_prompt, user_prompt=user_prompt)
Example 3: Evaluating a Personality-Infused Messaging System
User: "How should I measure whether personality-based messages are working?"
Approach:
- Do NOT evaluate individual messages in isolation — the effect is aggregate
- Set up within-between measurement framework
- Compare cohorts by personality-infusion proportion
Output — Evaluation framework:
WRONG approach (per-message evaluation):
"Rate this message on a scale of 1-5 for personalization."
→ Expect no significant effect here: the paper found no trial-level effect of personality infusion.
RIGHT approach (aggregate exposure evaluation):
After N messages over T days, measure:
1. Perceived Personalization (person-level):
"Overall, how well did the messages match your communication style?"
Scale: 1 (not at all) to 7 (extremely well)
2. Perceived Appropriateness (person-level):
"Overall, how appropriate were the messages you received?"
Scale: 1 (not at all) to 7 (extremely)
3. Affective Response (person-level):
"How did the messages make you feel overall?"
Scale: 1 (very negative) to 7 (very positive)
Comparison groups:
- Control: 0% personality-infused messages
- Partial: 50% personality-infused messages
- Full: 100% personality-infused messages
Analysis: Compare person-level means across groups.
Expected: Full > Partial > Control on all three metrics.
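As a first descriptive pass over the framework above, person-level means can be tabulated per group and metric before any formal modeling. This sketch assumes survey responses arrive as `(group, metric, score)` tuples with one score per participant; the group labels and function names are illustrative, and a check on ordered means is not a substitute for a significance test on ordinal ratings.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Sequence, Tuple


def group_means(responses: Iterable[Tuple[str, str, float]]) -> Dict[Tuple[str, str], float]:
    """Mean person-level score for every (group, metric) pair."""
    buckets: Dict[Tuple[str, str], List[float]] = defaultdict(list)
    for group, metric, score in responses:
        buckets[(group, metric)].append(score)
    return {key: mean(scores) for key, scores in buckets.items()}


def follows_expected_order(
    means: Dict[Tuple[str, str], float],
    metric: str,
    order: Sequence[str] = ("control", "partial", "full"),
) -> bool:
    """True if the metric's group means strictly increase along `order`."""
    values = [means[(group, metric)] for group in order]
    return all(a < b for a, b in zip(values, values[1:]))
```

Because all three scales run from negative to positive, the same increasing-order check applies to personalization, appropriateness, and affective response.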
Best Practices
- Do: Place personality alignment in the system prompt so it persists across all messages by default. Treat it as infrastructure, not a feature toggle.
- Do: Map personality traits to concrete language behaviors (word choice, sentence structure, framing), not to vague "be more X" instructions.
- Do: Maintain high personality-infusion coverage (80%+ of messages). The effect depends on cumulative proportion, not individual message quality.
- Do: Evaluate at the session/week level, not per-message. Use within-between decomposition to separate trial-level noise from person-level signal.
- Avoid: Over-accommodating — exaggerating personality style to the point of caricature (e.g., excessive exclamation marks for extraverts). Subtle, consistent alignment beats heavy-handed single-message personalization.
- Avoid: Selectively applying personality only to "high-stakes" messages while sending generic messages the rest of the time. This breaks the aggregate exposure pattern that drives the effect.
Error Handling
- No personality data available: Fall back to a neutral, moderately warm baseline style (mid-range on all five traits). Offer an onboarding questionnaire (the ten-item TIPI takes under 2 minutes) or infer traits from behavioral signals over time.
- Conflicting trait combinations: Some profiles create tension (e.g., high Neuroticism + high Extraversion). Prioritize the trait most relevant to the message context: Neuroticism when addressing setbacks, Extraversion when suggesting social activities.
- User reports messages feel "off": This likely indicates over-accommodation. Reduce the intensity of personality expression by moving the trait cutoffs further from the scale midpoint, so that only more pronounced scores trigger trait-specific language (e.g., only apply strong trait-specific language at scores above 4.0 instead of 3.5).
- Message fatigue despite personality alignment: Personality alignment constrains tone, not content. Increase content variation (different topics, framings, information types) while maintaining the personality-consistent style layer.
- Evaluation shows no effect: Verify that you are measuring at the person level (aggregate), not the trial level (per-message). The paper found no trial-level effects but significant person-level effects. If you are already measuring correctly, check personality-infusion coverage, because the effect requires consistent exposure.
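Two of the cases above, missing personality data and conflicting traits, lend themselves to a small amount of code. The sketch below assumes a 1-5 scale with 3.0 as the neutral midpoint and a hypothetical mapping from message context to the trait that should lead; the context labels and the tie-breaking rule (highest-scoring trait) are illustrative assumptions.

```python
from typing import Dict, Optional

TRAITS = ("openness", "conscientiousness", "extraversion", "agreeableness", "neuroticism")

# Mid-range on all five traits, used when no profile is stored.
NEUTRAL_PROFILE: Dict[str, float] = {trait: 3.0 for trait in TRAITS}

# Hypothetical context labels mapped to the trait that should lead the tone.
CONTEXT_PRIORITY = {"setback": "neuroticism", "social_suggestion": "extraversion"}


def resolve_profile(stored: Optional[Dict[str, float]]) -> Dict[str, float]:
    """Return a copy of the stored profile, or the neutral fallback if none exists."""
    return dict(stored) if stored else dict(NEUTRAL_PROFILE)


def lead_trait(profile: Dict[str, float], message_context: str,
               threshold: float = 3.5) -> Optional[str]:
    """Pick the trait whose style should dominate this message, if any."""
    preferred = CONTEXT_PRIORITY.get(message_context)
    if preferred is not None and profile[preferred] >= threshold:
        return preferred
    high = [trait for trait in TRAITS if profile[trait] >= threshold]
    return max(high, key=lambda trait: profile[trait]) if high else None
```

Raising `threshold` here has the same effect as the cutoff adjustment suggested for messages that feel "off": fewer traits qualify as pronounced enough to shape the tone.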
Limitations
- The paper's evidence comes from a retrospective evaluation, not a longitudinal in-situ study. Real-world deployment may produce different effect sizes.
- The study used physical activity as the application domain. Generalization to other domains (finance, education, mental health) is plausible but unvalidated.
- Big Five traits are broad dimensions. Finer-grained personality models (e.g., HEXACO, or domain-specific preference profiles) might yield stronger effects but lack the same empirical backing from this study.
- The approach assumes users have relatively stable personality profiles. For contexts where user state changes rapidly (crisis support, acute illness), state-based personalization may be more appropriate than trait-based.
- The study tested with 90 participants. Effect sizes were meaningful but the sample limits generalizability, particularly for underrepresented personality profiles.
Reference
Hofer, D. P., Haag, D., Islambouli, R., & Smeddinck, J. D. (2026). Personality as Relational Infrastructure: User Perceptions of Personality-Trait-Infused LLM Messaging. arXiv:2602.06596v1. https://arxiv.org/abs/2602.06596v1
Key takeaway: Personality-based personalization works through aggregate exposure proportion, not per-message optimization. Look for the within-between decomposition results (Section on ordinal multilevel models) and the Communication Accommodation Theory analysis.