name: llm
description: Guidelines for implementing LLM (Large Language Model) functionality in the application
LLM Implementation Guidelines
Directory Structure
LLM-related code is organized in specific directories:
- apps/web/utils/ai/ - Main LLM implementations
- apps/web/utils/llms/ - Core LLM utilities and configurations
- apps/web/__tests__/ - LLM-specific tests
Key Files
- utils/llms/index.ts - Core LLM functionality
- utils/llms/model.ts - Model definitions and configurations
- utils/llms/use-cases.ts - Product use-case to model-role routing
- utils/usage.ts - Usage tracking and monitoring
Model Routing
For product features with a static model choice, use getModelForUseCase(emailAccount.user, LlmUseCase.FeatureName) from utils/llms/use-cases.ts. Keep direct getModel(user, modelType) calls for generic helpers where the model role is intentionally passed from upstream. When adding or changing a use case, update utils/llms/use-cases.test.ts.
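The routing rule above can be pictured as a lookup from product use case to model role. The sketch below is self-contained and purely illustrative: it does not reproduce utils/llms/use-cases.ts, and the names `ExampleUseCase`, `ModelRole`, `USE_CASE_ROLES`, and `roleForUseCase` are hypothetical stand-ins, as are the use cases and roles themselves.

```typescript
// Hypothetical sketch: the real use cases and model roles are defined in
// utils/llms/use-cases.ts and utils/llms/model.ts and may look different.
type ModelRole = "default" | "economy" | "chat";
type ExampleUseCase = "draft-reply" | "categorize" | "assistant";

// Typing the table as Record<ExampleUseCase, ModelRole> makes the compiler
// reject a newly added use case that has no role assigned.
const USE_CASE_ROLES: Record<ExampleUseCase, ModelRole> = {
  "draft-reply": "default",
  categorize: "economy",
  assistant: "chat",
};

// Feature code asks for a role by use case instead of hard-coding a model.
function roleForUseCase(useCase: ExampleUseCase): ModelRole {
  return USE_CASE_ROLES[useCase];
}
```

Keeping the mapping in one table is what makes a dedicated test file such as utils/llms/use-cases.test.ts practical: a routing change shows up as a one-line diff with a matching test update.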
Implementation Pattern
Follow this standard structure for LLM-related functions:
import { z } from "zod";
import { createScopedLogger } from "@/utils/logger";
import type { EmailAccountWithAI } from "@/utils/llms/types";
import { createGenerateObject } from "@/utils/llms";
import { getModelForUseCase, LlmUseCase } from "@/utils/llms/use-cases";

// Scoped logger so log lines from this feature are easy to filter
const logger = createScopedLogger("feature-name");

export async function featureFunction(options: {
  inputData: InputType;
  emailAccount: EmailAccountWithAI;
}) {
  const { inputData, emailAccount } = options;

  // Return early on invalid input; add other validation conditions here
  if (!inputData) {
    logger.warn("Invalid input for feature function");
    return null;
  }

  // The system prompt defines the role and task; the user prompt carries the data
  const system = `[Detailed system prompt that defines the LLM's role and task]`;
  const prompt = `[User prompt with context and specific instructions]
<data>
...
</data>
${emailAccount.about ? `<user_info>${emailAccount.about}</user_info>` : ""}`;

  // Resolve the model from the product use case rather than hard-coding it
  const modelOptions = getModelForUseCase(
    emailAccount.user,
    LlmUseCase.FeatureName,
  );

  const generateObject = createGenerateObject({
    userEmail: emailAccount.email,
    label: "Feature Name",
    modelOptions,
  });

  // The Zod schema both guides and validates the structured response
  const result = await generateObject({
    ...modelOptions,
    system,
    prompt,
    schema: z.object({
      field1: z.string(),
      field2: z.number(),
      nested: z.object({
        subfield: z.string(),
      }),
      array_field: z.array(z.string()),
    }),
  });

  return result.object;
}
Best Practices
System and User Prompts:
- Keep system prompts and user prompts separate
- System prompt should define the LLM's role and task specifications
- User prompt should contain the actual data and context
Schema Validation:
- Always define a Zod schema for response validation
- Make schemas as specific as possible to guide the LLM output
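To illustrate what "as specific as possible" can mean in practice, here is a small sketch of a schema that constrains values with an enum, a numeric range, and field descriptions. The feature (email categorization), the field names, and the category values are assumptions made up for this example; only the Zod calls themselves are standard.

```typescript
import { z } from "zod";

// Hypothetical schema for an illustrative categorization feature.
const categorizationSchema = z.object({
  // An enum restricts the model to known values instead of free text.
  category: z
    .enum(["newsletter", "receipt", "personal", "other"])
    .describe("The single best-fitting category for the email"),
  // A bounded number rejects out-of-range confidence values.
  confidence: z.number().min(0).max(1).describe("Confidence between 0 and 1"),
  // A non-empty string forces the model to justify its choice.
  reason: z.string().min(1).describe("One short sentence explaining the choice"),
});

// The TypeScript type is derived from the schema, so the two cannot drift apart.
type Categorization = z.infer<typeof categorizationSchema>;

// safeParse returns a result object instead of throwing on invalid data.
function parseCategorization(raw: unknown): Categorization | null {
  const result = categorizationSchema.safeParse(raw);
  return result.success ? result.data : null;
}
```

A loose schema such as `z.object({ category: z.string() })` would accept any label the model invents; the enum version turns that into a validation failure that the caller can handle.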
Logging:
- Use descriptive scoped loggers for each feature
- Log inputs and outputs with appropriate log levels
- Include relevant context in log messages
Error Handling:
- Implement early returns for invalid inputs
- Use proper error types and logging
- Implement fallbacks for AI failures
- Add retry logic for transient failures using withRetry
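The retry and fallback points above can be sketched as two small helpers. This is not the project's withRetry, whose signature and policy are not shown here; `retryTransient`, `withFallback`, and their option names are hypothetical and exist only to illustrate the idea.

```typescript
// Hypothetical helper for illustration; the project's own withRetry may have
// a different signature and retry policy.
async function retryTransient<T>(
  fn: () => Promise<T>,
  options: {
    retries: number;
    delayMs: number;
    isTransient: (error: unknown) => boolean;
  },
): Promise<T> {
  let attempt = 0;
  while (true) {
    try {
      return await fn();
    } catch (error) {
      // Give up on non-transient errors or once the retry budget is spent.
      if (attempt >= options.retries || !options.isTransient(error)) throw error;
      attempt += 1;
      // Exponential backoff: delayMs, 2 * delayMs, 4 * delayMs, ...
      const waitMs = options.delayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, waitMs));
    }
  }
}

// Returns a safe default when the call still fails, so the feature degrades
// gracefully instead of surfacing an AI error to the user.
async function withFallback<T>(fn: () => Promise<T>, fallback: T): Promise<T> {
  try {
    return await fn();
  } catch {
    return fallback;
  }
}
```

Separating the two concerns keeps the policy explicit: retries apply only to errors classified as transient (for example rate limits or timeouts), while the fallback decides what the feature returns when the model is unavailable.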
Input Formatting:
- Use XML-like tags to structure data in prompts
- Remove excessive whitespace and truncate long inputs
- Format data consistently across similar functions
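A minimal sketch of these formatting rules follows. The helper names `cleanText` and `formatEmailForPrompt`, the `EmailForPrompt` shape, the tag names, and the length limits are all assumptions for illustration, not existing utilities in the codebase.

```typescript
// Collapses repeated spaces and tabs, limits blank lines to one, and
// truncates the result so a long input cannot dominate the prompt.
function cleanText(text: string, maxLength: number): string {
  const collapsed = text
    .replace(/[ \t]+/g, " ")
    .replace(/\n{3,}/g, "\n\n")
    .trim();
  return collapsed.length > maxLength
    ? `${collapsed.slice(0, maxLength)}...`
    : collapsed;
}

// Hypothetical input shape for this example.
interface EmailForPrompt {
  from: string;
  subject: string;
  content: string;
}

// Wraps each field in an XML-like tag so the model can tell data apart
// from instructions; reusing one formatter keeps prompts consistent.
function formatEmailForPrompt(email: EmailForPrompt, maxLength = 2000): string {
  return [
    "<email>",
    `<from>${email.from}</from>`,
    `<subject>${cleanText(email.subject, 200)}</subject>`,
    `<body>${cleanText(email.content, maxLength)}</body>`,
    "</email>",
  ].join("\n");
}
```

Sharing a single formatter across similar functions is one way to satisfy the consistency rule: every feature that sends an email to the model then uses the same tags and the same truncation behavior.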
Type Safety:
- Use TypeScript types for all parameters and return values
- Define clear interfaces for complex input/output structures
Code Organization:
- Keep related AI functions in the same file or directory
- Extract common patterns into utility functions
- Document complex AI logic with clear comments
AI-First Behavior:
- Prefer generic prompt instructions, structured outputs, and model choice over brittle lexical heuristics that imitate model reasoning
- Only add deterministic filters when the product truly needs a hard rule outside the model
- Do not add prompt examples that closely mirror eval fixtures just to make a test pass
Draft Attribution Versioning:
- When changing draft-generation prompt inputs, retrieval context, model routing behavior, or post-processing, bump DRAFT_PIPELINE_VERSION in apps/web/utils/ai/reply/draft-attribution.ts
- Do not bump it for behavior-preserving refactors that keep the same prompt, context, model role, and output processing
- Treat that version as analytics attribution for reply-draft quality comparisons
Testing
See llm-test.mdc