name: documenso-incident-runbook
description: |
  Incident response procedures for Documenso integration issues.
  Use when diagnosing production incidents, handling outages, or responding to Documenso service disruptions.
  Trigger with phrases like "documenso incident", "documenso outage", "documenso down", "documenso troubleshooting".
allowed-tools: Read, Bash(curl:), Bash(kubectl:), Grep
version: 1.0.0
license: MIT
author: Jeremy Longshore jeremy@intentsolutions.io
Documenso Incident Runbook
Overview
Step-by-step procedures for responding to Documenso integration incidents.
Prerequisites
- Access to monitoring dashboards
- kubectl access (if using Kubernetes)
- Documenso dashboard access
- On-call escalation contacts
Incident Severity Levels
| Severity | Description | Response Time | Examples |
|---|---|---|---|
| P1 Critical | Complete service outage | 15 minutes | API down, all signing blocked |
| P2 High | Significant degradation | 30 minutes | High error rates, slow responses |
| P3 Medium | Partial impact | 2 hours | Some operations failing |
| P4 Low | Minor issues | 24 hours | Cosmetic issues, warnings |
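The response-time targets in the table can be turned into a concrete deadline for each alert. The following is a minimal sketch; `RESPONSE_MINUTES` and `responseDeadline` are illustrative names, not part of any Documenso tooling.

```typescript
// Response-time targets copied from the severity table above.
type Severity = "P1" | "P2" | "P3" | "P4";

const RESPONSE_MINUTES: Record<Severity, number> = {
  P1: 15,
  P2: 30,
  P3: 2 * 60,
  P4: 24 * 60,
};

// Returns the latest time by which the incident should have received a response.
function responseDeadline(severity: Severity, alertedAt: Date): Date {
  return new Date(alertedAt.getTime() + RESPONSE_MINUTES[severity] * 60000);
}
```

For example, `responseDeadline("P1", new Date())` yields a timestamp 15 minutes from now, which could be attached to the page or ticket.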
Quick Diagnostic Commands
# Check Documenso API status
curl -s https://status.documenso.com/api/v2/status.json | jq '.status'
# Test API connectivity
curl -v -H "Authorization: Bearer $DOCUMENSO_API_KEY" \
  "https://app.documenso.com/api/v2/documents?perPage=1"
# Check your service health
curl -s https://your-app.com/health | jq .
# View recent error logs
kubectl logs -l app=signing-service --since=10m | grep -i error
# Check pod status
kubectl get pods -l app=signing-service
Incident Response Procedures
Procedure 1: Documenso API Unresponsive
Symptoms:
- Connection timeouts
- 5xx errors from Documenso
- Health checks failing
Steps:
Confirm the issue (2 min)
# Check Documenso status page
curl -s https://status.documenso.com/api/v2/status.json | jq .
# Direct API test
curl -v --max-time 10 \
  -H "Authorization: Bearer $DOCUMENSO_API_KEY" \
  "https://app.documenso.com/api/v2/documents?perPage=1"

Enable graceful degradation (5 min)
# If using feature flags
curl -X POST https://your-feature-flags.com/api/flags \
  -H "Content-Type: application/json" \
  -d '{"flag": "documenso_enabled", "value": false}'
# Or via kubectl
kubectl set env deployment/signing-service DOCUMENSO_ENABLED=false

Notify stakeholders
Subject: [P1] Documenso Integration Degraded
Status: Documenso API is unresponsive
Impact: Document signing temporarily unavailable
Action: Graceful degradation enabled
ETA: Monitoring for recovery

Monitor for recovery
# Watch for recovery
watch -n 30 'curl -s https://status.documenso.com/api/v2/status.json | jq .status'

Re-enable after recovery
kubectl set env deployment/signing-service DOCUMENSO_ENABLED=true
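Toggling `DOCUMENSO_ENABLED` only helps if the service reads it. Below is a minimal sketch of how a signing service might interpret that variable. The helper names, the accepted "off" values, and the choice to treat an unset variable as enabled are all assumptions, not documented behavior of any particular service.

```typescript
// Hypothetical helper: interprets the DOCUMENSO_ENABLED variable set via kubectl.
// Assumption: an unset variable means the integration is enabled.
type Env = Record<string, string | undefined>;

function isDocumensoEnabled(env: Env): boolean {
  const raw = env["DOCUMENSO_ENABLED"];
  if (raw === undefined) return true;
  const offValues = ["false", "0", "off", "no"];
  return offValues.indexOf(raw.trim().toLowerCase()) === -1;
}

// Hypothetical degraded path: report signing as unavailable with a clear message
// instead of calling Documenso while the flag is off.
function signingAvailability(env: Env): { available: boolean; message: string } {
  return isDocumensoEnabled(env)
    ? { available: true, message: "ok" }
    : { available: false, message: "Document signing is temporarily unavailable" };
}
```

In a Node service this would typically be called as `signingAvailability(process.env)`. Because `kubectl set env` changes the pod template, it triggers a rollout, so new pods pick up the value at startup.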
Procedure 2: High Error Rate (4xx/5xx)
Symptoms:
- Error rate > 5%
- Mixed successful and failed requests
- Specific operations failing
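The 5% threshold can be checked mechanically once status codes have been extracted from logs or metrics. This sketch assumes you already have an array of HTTP status codes for recent Documenso calls; `errorRate` and `exceedsThreshold` are illustrative names.

```typescript
// Fraction of responses that are 4xx or 5xx; returns 0 for an empty sample.
function errorRate(statusCodes: number[]): number {
  if (statusCodes.length === 0) return 0;
  const errors = statusCodes.filter((code) => code >= 400 && code <= 599).length;
  return errors / statusCodes.length;
}

// Mirrors the "Error rate > 5%" symptom; the default threshold comes from this runbook.
function exceedsThreshold(statusCodes: number[], threshold = 0.05): boolean {
  return errorRate(statusCodes) > threshold;
}
```

The comparison is strict, so a sample sitting exactly at 5% does not trip the check.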
Steps:
Identify error pattern (5 min)
# Count recent errors by status code
kubectl logs -l app=signing-service --since=30m | \
  grep -oE "statusCode[^0-9]*[45][0-9]{2}" | \
  sort | uniq -c | sort -rn | head -20

Check specific error types
# 401 Unauthorized - API key issue
# 403 Forbidden - Permission issue
# 404 Not Found - Resource deleted
# 422 Validation - Bad request data
# 429 Rate Limited - Too many requests
# 500+ Server Error - Documenso issue

For 401/403 errors:
# Verify API key is set (prints only the first 10 characters)
echo "$DOCUMENSO_API_KEY" | head -c 10
# Test key directly
curl -H "Authorization: Bearer $DOCUMENSO_API_KEY" \
  "https://app.documenso.com/api/v2/documents?perPage=1"

For 429 errors:
# Check request rate
kubectl logs -l app=signing-service --since=5m | \
  grep "documenso" | wc -l
# Reduce concurrency
kubectl set env deployment/signing-service DOCUMENSO_CONCURRENCY=1

For 5xx errors:
- Check Documenso status page
- Enable circuit breaker
- Implement retry with backoff
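Retry with backoff, as listed above, can be sketched as follows. This is one possible shape, not a Documenso SDK feature: `withRetry` and its `call` parameter are hypothetical, and the choice to retry only 429 and 5xx responses is an assumption based on the error table in this procedure.

```typescript
// Assumption: 429 and 5xx responses are worth retrying; other 4xx responses are not.
function isRetryable(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Exponential backoff: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical wrapper: `call` performs one request and resolves with its HTTP status.
// Makes at most maxAttempts calls in total and returns the last status seen.
async function withRetry(
  call: () => Promise<number>,
  maxAttempts = 4,
  baseMs = 500,
): Promise<number> {
  let status = await call();
  for (let attempt = 0; attempt < maxAttempts - 1 && isRetryable(status); attempt++) {
    await sleep(backoffDelayMs(attempt, baseMs));
    status = await call();
  }
  return status;
}
```

With the defaults, the waits between calls are 500 ms, 1 s, and 2 s. Production code usually adds random jitter to each delay so that many clients do not retry in lockstep.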
Procedure 3: Document Stuck in Pending
Symptoms:
- Documents not completing
- Recipients not receiving emails
- Webhooks not firing
Steps:
Check document status
curl -H "Authorization: Bearer $DOCUMENSO_API_KEY" \
  https://app.documenso.com/api/v2/documents/{documentId}

Check recipient status
# Look for recipient signing status
curl -H "Authorization: Bearer $DOCUMENSO_API_KEY" \
  https://app.documenso.com/api/v2/documents/{documentId} | \
  jq '.recipients[] | {email, signingStatus}'

Verify webhook delivery
- Check Documenso dashboard for webhook logs
- Check your webhook endpoint logs
- Test webhook endpoint manually
Manual intervention
# Resend document
curl -X POST \
  -H "Authorization: Bearer $DOCUMENSO_API_KEY" \
  https://app.documenso.com/api/v2/documents/{documentId}/resend
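Before resending, it can help to know exactly which recipients are still outstanding. The sketch below works on the `recipients` array shown in the jq filter of this procedure. The field names come from that filter; the value `"SIGNED"` for a completed recipient is an assumption and should be confirmed against a real API response.

```typescript
// Shape assumed from the jq filter in this procedure: email and signingStatus per recipient.
interface Recipient {
  email: string;
  signingStatus: string;
}

// Assumption: signedValue ("SIGNED" by default) marks a recipient who has completed signing.
function pendingRecipients(recipients: Recipient[], signedValue = "SIGNED"): string[] {
  return recipients
    .filter((r) => r.signingStatus !== signedValue)
    .map((r) => r.email);
}
```

An empty result for a document that is still pending suggests the problem lies in completion processing or webhooks rather than with a recipient.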
Procedure 4: Webhook Not Receiving Events
Symptoms:
- No webhook events received
- Events delayed significantly
- Partial event delivery
Steps:
Verify webhook configuration
- Check Documenso dashboard webhook settings
- Verify URL is correct and HTTPS
- Confirm events are subscribed
Test webhook endpoint
curl -X POST https://your-app.com/webhooks/documenso \
  -H "Content-Type: application/json" \
  -H "X-Documenso-Secret: your-secret" \
  -d '{"event":"test","payload":{}}'

Check webhook logs in Documenso
- Dashboard -> Team Settings -> Webhooks
- View delivery attempts and responses
Check your endpoint
# Verify endpoint is responding
curl -I https://your-app.com/webhooks/documenso
# Check for errors in logs
kubectl logs -l app=signing-service | grep webhook

Resend failed webhooks
- Use Documenso dashboard to retry failed deliveries
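A common cause of silently dropped events is the endpoint rejecting the secret. The sketch below shows one way an endpoint might check the `X-Documenso-Secret` header used in the test request above. The function names and the lower-cased header map are assumptions about your handler, not Documenso code.

```typescript
// Compares two strings without returning early on the first mismatch.
function safeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of a string; `| 0` turns that into 0.
    diff |= (a.charCodeAt(i) | 0) ^ (b.charCodeAt(i) | 0);
  }
  return diff === 0;
}

// Hypothetical check: `headers` is a header map with lower-cased names,
// as Node's http module provides.
function isAuthenticWebhook(
  headers: Record<string, string | undefined>,
  expectedSecret: string,
): boolean {
  const received = headers["x-documenso-secret"];
  return received !== undefined && expectedSecret.length > 0 && safeEqual(received, expectedSecret);
}
```

In a Node service, `crypto.timingSafeEqual` on equal-length buffers is the usual choice for the comparison; the hand-written loop here only keeps the sketch free of imports.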
Circuit Breaker Implementation
// Emergency circuit breaker
class DocumensoCircuitBreaker {
  private failures = 0;
  private lastFailure = 0;
  private state: "closed" | "open" | "half-open" = "closed";
  private readonly threshold = 5;
  private readonly timeout = 60000; // 1 minute

  async execute<T>(operation: () => Promise<T>): Promise<T> {
    if (this.state === "open") {
      if (Date.now() - this.lastFailure > this.timeout) {
        this.state = "half-open";
      } else {
        throw new Error("Circuit breaker is open");
      }
    }
    try {
      const result = await operation();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  private onSuccess(): void {
    this.failures = 0;
    this.state = "closed";
  }

  private onFailure(): void {
    this.failures++;
    this.lastFailure = Date.now();
    if (this.failures >= this.threshold) {
      this.state = "open";
      console.error("Circuit breaker opened!");
    }
  }

  getState(): string {
    return this.state;
  }
}
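Callers usually want a degraded result rather than an exception when the circuit is open. The following sketch wraps any object with the same `execute` method as the breaker above. `Breaker` and `executeWithFallback` are illustrative names introduced here, and returning a fixed fallback value is only one possible policy.

```typescript
// Minimal structural type matching the execute() method of the breaker above.
interface Breaker {
  execute<T>(operation: () => Promise<T>): Promise<T>;
}

// Runs the operation through the breaker and returns `fallback` if it throws,
// either because the operation failed or because the circuit is open.
async function executeWithFallback<T>(
  breaker: Breaker,
  operation: () => Promise<T>,
  fallback: T,
): Promise<T> {
  try {
    return await breaker.execute(operation);
  } catch {
    return fallback;
  }
}
```

A call such as `executeWithFallback(breaker, fetchDocuments, [])` would return an empty list while Documenso is unavailable, where `fetchDocuments` stands for whatever function your service uses to call the API.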
Post-Incident Review Template
## Incident Report: [Title]
**Date:** YYYY-MM-DD
**Duration:** X hours Y minutes
**Severity:** P1/P2/P3/P4
**Author:** [Name]
### Summary
Brief description of what happened.
### Timeline
- HH:MM - Initial alert triggered
- HH:MM - Investigation started
- HH:MM - Root cause identified
- HH:MM - Mitigation applied
- HH:MM - Service restored
### Root Cause
Technical explanation of why the incident occurred.
### Impact
- X documents affected
- Y users impacted
- Z minutes of downtime
### Resolution
What was done to resolve the incident.
### Action Items
- [ ] Implement additional monitoring
- [ ] Update runbook
- [ ] Add circuit breaker
### Lessons Learned
What we learned from this incident.
Emergency Contacts
| Role | Contact | Escalation Time |
|---|---|---|
| On-call Engineer | [Slack/Phone] | Immediate |
| Team Lead | [Slack/Phone] | 15 minutes |
| Documenso Support | support@documenso.com | 30 minutes |
| Engineering Manager | [Slack/Phone] | 1 hour |
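The escalation times in the table can be encoded so that tooling or a bot can say who should already have been contacted. This is a minimal sketch with illustrative names; "Immediate" is interpreted as 0 minutes, which is an assumption.

```typescript
// Escalation thresholds in minutes, copied from the Emergency Contacts table.
const ESCALATION: { role: string; afterMinutes: number }[] = [
  { role: "On-call Engineer", afterMinutes: 0 },
  { role: "Team Lead", afterMinutes: 15 },
  { role: "Documenso Support", afterMinutes: 30 },
  { role: "Engineering Manager", afterMinutes: 60 },
];

// Returns every role whose escalation time has been reached.
function rolesToContact(elapsedMinutes: number): string[] {
  return ESCALATION
    .filter((entry) => elapsedMinutes >= entry.afterMinutes)
    .map((entry) => entry.role);
}
```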
Output
- Incident diagnosed
- Mitigation applied
- Stakeholders notified
- Post-incident review scheduled
Resources
- Documenso Status: https://status.documenso.com
- Documenso Support: support@documenso.com
- [Internal Wiki - Signing Service]
Next Steps
For data handling procedures, see documenso-data-handling.