containerd

star 4

"Containerd in Cloud-Native Engineering - An open and reliable container" runtime

paulpas By paulpas schedule Updated 6/4/2026

name: containerd compatibility: opencode completeness: 95 content-types:

  • guidance
  • examples
  • do-dont
  • config description: '"Containerd in Cloud-Native Engineering - An open and reliable container" runtime' license: MIT maturity: stable metadata: domain: cncf output-format: manifests related-skills: null role: reference scope: infrastructure triggers: cloud-native, containerd, engineering, reliable archetypes:
    • educational
    • strategic anti_triggers:
    • brainstorming
    • vague ideation
    • non-containerized architecture response_profile: verbosity: medium directive_strength: low abstraction_level: strategic version: "1.0.0"

Containerd in Cloud-Native Engineering

Category: cncf
Status: Active
Stars: 20,615
Last Updated: 2026-04-22
Primary Language: Go
Documentation: https://containerd.io/docs/


Purpose and Use Cases

Containerd is a core component of the cloud-native ecosystem, serving as a industry-standard container runtime focused on simplicity, robustness, and portability for running containers.

What Problem Does It Solve?

The need for a stable, production-grade container runtime that implements theContainer Runtime Interface (CRI) for Kubernetes while being cloud-native and extensible.

When to Use This Project

Use containerd as your container runtime when running Kubernetes (it's the default CRI implementation), or when you need a stable, production-grade runtime for container management.

Key Use Cases

  • Kubernetes container runtime (default)
  • Container image management
  • Container lifecycle management
  • Integration with container orchestration platforms

Architecture Design Patterns

Core Components

  • Container Runtime: Manages container lifecycle
  • Image Manager: Handles image pulling and extraction
  • Snapshotter: Filesystem layering and storage
  • Task Service: Manages process execution
  • Metadata Service: Stores container and image metadata
  • GRPC API: Interface for external clients

Component Interactions

  1. Kubelet → Containerd GRPC: Container management requests
  2. Containerd → Snapshotter: Filesystem layering
  3. Containerd → Image Service: Image pulling and extraction
  4. Image Service → Registry: Pulls images from registries
  5. Task Service → Containerd: Process execution

Data Flow Patterns

  1. Container Creation: Request → Metadata service → Image service → Snapshotter → Task service → Container
  2. Image Pull: Registry API → Image download → Snapshot extraction → Metadata update
  3. Process Execution: Task start → Process creation → Stdio pipes → Exit status

Design Principles

  • Simplicity: Focused, minimal API
  • Robustness: Production-grade reliability
  • Portability: Works across platforms
  • Extensibility: Plugin system for features
  • CRI-Compliant: Kubernetes compatibility

Integration Approaches

Integration with Other CNCF Projects

  • Kubernetes: CRI implementation for container management
  • CNI: Container networking interface
  • Containerd: Image management integration
  • Buildkit: Building images
  • nerdctl: Container CLI

API Patterns

  • GRPC API: Client interface
  • CRI: Container Runtime Interface
  • Image Service: Image management API
  • Task Service: Process execution API
  • Events API: Event streaming

Configuration Patterns

  • TOML Configuration: Main config file
  • CRI Configuration: Container runtime config
  • Plugins: Plugin-specific configuration
  • Runtime Class: Per-pod runtime config

Extension Mechanisms

  • Plugins: Add functionality
  • Runtime V2: Custom runtimes
  • Snapshotters: Custom storage backends
  • Content Store: Custom image storage

Common Pitfalls and How to Avoid Them

Misconfigurations

  • Runtime Class: Incorrect runtime specification
  • Image Pull Policy: Misconfigured pull policies
  • Snapshotter: Incorrect snapshotter configuration
  • Storage Path: Insufficient disk space
  • Plugin Configuration: Incorrect plugin settings

Performance Issues

  • Image Pull Time: Slow registry access
  • Snapshot Performance: Storage backend issues
  • Metadata Database: Large metadata size
  • GRPC Latency: High API server latency

Operational Challenges

  • Version Compatibility: Kubernetes compatibility
  • Upgrade Process: Rolling node upgrades
  • Storage Management: Snapshot and image cleanup
  • Networking: CNI integration

Security Pitfalls

  • Privileged Containers: Running containers with root
  • Image Signing: Not validating signed images
  • Network Exposure: Unnecessary port exposure
  • Secrets: Insecure secret handling

Coding Practices

Idiomatic Configuration

  • TOML Format: Human-readable config
  • Plugin Blocks: Organized configuration
  • CRI Configuration: Container runtime settings
  • Environment Override: Runtime overrides

API Usage Patterns

  • ctr CLI: Debugging and testing
  • crictl CLI: CRI-compatible CLI
  • GRPC API: Programmatic control
  • CRI API: Kubernetes integration

Observability Best Practices

  • Metrics: Runtime metrics
  • Audit Logging: Container events
  • Debug Commands: Container inspection
  • Logs: Container stdout/stderr

Testing Strategies

  • Unit Tests: Component tests
  • Integration Tests: CRI compliance
  • E2E Tests: Full container lifecycle
  • Compatibility Tests: Kubernetes compatibility

Development Workflow

  • Development: containerd development setup
  • Testing: Go tests, integration tests
  • Debugging: ctr, crictl, debug commands
  • Deployment: System packages, containers
  • CI/CD: GitHub Actions, integration tests
  • Tools: ctr, crictl, runc

Fundamentals

Essential Concepts

  • Container: Isolated process environment
  • Image: Read-only template for containers
  • Snapshot: Filesystem layer
  • Task: Running process in container
  • Execution: Process execution
  • Event: Container lifecycle events
  • Namespace: Isolated container storage
  • Content: Image content storage
  • Metadata: Container and image metadata
  • GRPC: Interface for external clients

Terminology Glossary

  • Container: Isolated process
  • Image: Container template
  • Snapshot: Filesystem layer
  • Task: Running process
  • Execution: Process execution
  • Event: Lifecycle event
  • Namespace: Isolated storage
  • Content: Image content
  • Metadata: Container metadata
  • GRPC API: Client interface

Data Models and Types

  • Container: Container definition
  • Image: Image definition
  • Snapshot: Snapshot metadata
  • Task: Task definition
  • Event: Event data
  • Namespace: Namespace configuration
  • Metadata: Container metadata
  • Content: Content descriptor

Lifecycle Management

  • Container Lifecycle: Create → Start → Stop → Delete
  • Image Lifecycle: Pull → Extract → Run
  • Task Lifecycle: Start → Wait → Delete
  • Snapshot Lifecycle: Prepare → Mount → Unmount
  • Event Lifecycle: Emit → Buffer → Process

State Management

  • Container State: Running/paused/stopped
  • Image State: Pull status
  • Snapshot State: Mount state
  • Task State: Process status
  • Metadata State: Database state

Scaling and Deployment Patterns

Horizontal Scaling

  • Node Scaling: Multiple containerd instances
  • Image Pull Parallelization: Concurrent image pulls
  • Snapshotter Scaling: Storage backend scaling

High Availability

  • Node HA: Multiple nodes with containerd
  • Image Caching: Local image cache
  • Snapshotter HA: Storage backend HA

Production Deployments

  • Production Configuration: Optimized for workload
  • Storage Configuration: Production storage backend
  • Security: Rootless mode, seccomp, AppArmor
  • Monitoring: Runtime metrics collection
  • Upgrade Strategy: Rolling node upgrades

Upgrade Strategies

  • Node Upgrade: Rolling node upgrade
  • Version Compatibility: Kubernetes version support
  • Storage Migration: Snapshotter migration
  • Graceful Shutdown: Container task draining

Resource Management

  • Image Storage: Image layer storage management
  • Snapshot Storage: Snapshot space management
  • Memory Configuration: Container memory limits
  • CPU Configuration: Container CPU limits

Additional Resources


Examples

Configure Registry Mirrors

# Configure registry mirrors in /etc/containerd/config.toml
version = 2

[plugins."io.containerd.grpc.v1.cri".registry]
  config_path = "/etc/containerd/certs.d"

[plugins."io.containerd.grpc.v1.cri".registry.mirrors]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
    endpoint = ["https://registry-1.docker.io"]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."gcr.io"]
    endpoint = ["https://gcr.io"]
  [plugins."io.containerd.grpc.v1.cri".registry.mirrors."ghcr.io"]
    endpoint = ["https://ghcr.io"]

Configure Snapshotster

# Configure different snapshotter for better performance
version = 2

[plugins."io.containerd.grpc.v1.cri".containerd]
  snapshotter = "overlayfs"
  default_runtime_name = "runc"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
    runtime_type = "io.containerd.runc.v2"
    runtime_engine = ""
    runtime_root = ""

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
  BinaryName = "containerd-shim-runc-v2"

Enable Content Trust

# Enable content trust for image verification
# Set environment variable
export DOCKER_CONTENT_TRUST=1

# Or configure in containerd config
version = 2

[plugins."io.containerd.grpc.v1.cri".containerd]
  disable_snapshot_annotations = false
  discard_unpacked_layers = false

[plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
  tls_cert_file = "/etc/containerd/certs/client.crt"
  tls_key_file = "/etc/containerd/certs/client.key"

Troubleshooting

Containerd fails to start

Cause: Incorrect config.toml syntax or missing directories

Solution:

# Validate configuration
containerd config dump

# Check for syntax errors
containerd config validate

# Create missing directories
sudo mkdir -p /etc/containerd/certs.d
sudo mkdir -p /var/lib/containerd

# Restart service
sudo systemctl restart containerd

Image pull failures

Cause: Registry authentication or network issues

Solution:

# Configure registry credentials
sudo mkdir -p /etc/containerd/certs.d/<registry>
sudo cat > /etc/containerd/certs.d/<registry>/hosts.toml <<EOF
[host."https://<registry>"]
  capabilities = ["pull", "resolve"]
  credentials = { username = "<user>", password = "<pass>" }
EOF

# Restart containerd
sudo systemctl restart containerd

# Test image pull
ctr images pull <registry>/<image>

Snapshotter errors (overlayfs/unix)

Cause: File system compatibility or disk space issues

Solution:

# Check available space
df -h /var/lib/containerd

# Change snapshotter in config
# Try 'native' snapshotter for compatibility
version = 2
[plugins."io.containerd.grpc.v1.cri".containerd]
  snapshotter = "native"

# Or try 'stargz' for lazy pulling
[plugins."io.containerd.grpc.v1.cri".containerd.snapshotter]
  name = "stargz"

CRI gRPC server not starting

Cause: Port conflicts or socket file issues

Solution:

# Check for port conflicts
sudo ss -tlnp | grep 10010

# Remove stale socket file
sudo rm -f /var/run/containerd/containerd.sock

# Restart containerd
sudo systemctl restart containerd

# Verify gRPC is working
ctr version

Image garbage collection not working

Cause: Disk usage thresholds not properly configured

Solution:

# Configure garbage collection
version = 2

[plugins."io.containerd.grpc.v1.cri"]
  [plugins."io.containerd.grpc.v1.cri".containerd]
    discard_unpacked_layers = true

[plugins."io.containerd.grpc.v1.cri".containerd.snapshotter]
  name = "overlayfs"

[plugins."io.containerd.grpc.v1.cri".containerd]
  [plugins."io.containerd.grpc.v1.cri".containerd.garbage_collection_policy]
    min_reclaim = "10Gi"
    threshold = 85

Content generated automatically. Verify against official documentation before production use.

Troubleshooting

Deployment failures

Cause: Configuration errors or resource constraints

Solution: Check pod logs with kubectl logs <pod>, verify configuration values, and ensure adequate cluster resources.

Performance issues

Cause: Resource limits or configuration bottlenecks

Solution: Monitor resource usage with kubectl top, adjust resource limits, and optimize configuration settings.

Configuration errors

Cause: YAML syntax or missing required fields

Solution: Validate YAML syntax, check required configuration fields, and verify environment-specific settings.

Integration problems

Cause: API compatibility or version mismatches

Solution: Verify API compatibility, check dependency versions, and review integration documentation.

Connectivity issues

Cause: Network policies or service discovery problems

Solution: Check network policies, verify service endpoints, and ensure proper DNS resolution.


When to Use

Use this skill when:

  • Integrating a CNCF project into Kubernetes infrastructure — You need to configure, deploy, or troubleshoot a cloud-native tool within a cluster
  • Designing cloud-native architecture — You are selecting and integrating CNCF tools to solve specific infrastructure challenges
  • Resolving operational issues — A CNCF component is misbehaving, underperforming, or needs configuration changes

Core Workflow

  1. Assess Requirements — Understand the use case, scale, integration needs, and existing infrastructure. Checkpoint: Document requirements, constraints, and success criteria.

  2. Design Architecture — Plan component interactions, data flow, and deployment strategy using cloud-native best practices. Checkpoint: Verify the architecture addresses all requirements and follows CNCF conventions.

  3. Implement & Configure — Create manifests, configurations, and deployment scripts. Include resource limits, health checks, and observability hooks. Checkpoint: Validate all YAML against schema and test in a staging environment.

  4. Deploy & Monitor — Apply manifests to the cluster, verify component health, and confirm observability is working. Checkpoint: Confirm all pods/services are running, probes passing, and metrics/alerts configured.


Constraints

MUST DO

  • Include at least one complete working YAML manifest example
  • Note when content is auto-generated vs. manually verified
  • Reference relevant CNCF project documentation

MUST NOT DO

  • Deploy manifests without testing in a staging environment first
  • Use deprecated API versions (e.g., apps/v1beta1)
  • Omit resource limits and requests in Kubernetes manifests

Live References

These links are resolved at load time — content is fetched and inlined to provide the latest documentation.

Install via CLI
npx skills add https://github.com/paulpas/agent-skill-router --skill containerd
Repository Details
star Stars 4
call_split Forks 0
navigation Branch main
article Path SKILL.md
More from Creator