name: juicefs-skill
description: Work with JuiceFS, a high-performance POSIX file system for cloud-native environments. Use when dealing with distributed file systems, object storage backends (S3, Azure, GCS), metadata engines (Redis, MySQL, TiKV), or when users mention JuiceFS, cloud storage, big data, or ML training storage.
license: Apache-2.0
compatibility: Requires JuiceFS client, metadata engine (Redis/MySQL/TiKV/SQLite), and object storage access
metadata:
  author: Herald Yu & GitHub Copilot
  version: 1.0
  based_on: JuiceFS Community Edition
JuiceFS Skill
Prerequisites
JuiceFS Client Installation
The initialization script can install JuiceFS automatically if needed.
Standard Installation (Recommended)
curl -sSL https://d.juicefs.com/install | sh -
This installs to /usr/local/bin/juicefs (accessible system-wide).
Manual Installation
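# Release archives carry the version in their file name, so look up the latest tag first
JFS_LATEST_TAG=$(curl -s https://api.github.com/repos/juicedata/juicefs/releases/latest | grep 'tag_name' | cut -d '"' -f 4 | tr -d 'v')
wget "https://github.com/juicedata/juicefs/releases/download/v${JFS_LATEST_TAG}/juicefs-${JFS_LATEST_TAG}-linux-amd64.tar.gz"
tar -zxf "juicefs-${JFS_LATEST_TAG}-linux-amd64.tar.gz"
sudo install juicefs /usr/local/bin/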
Verify Installation
juicefs version
Using the Initialization Script
The initialization script will:
- Check if JuiceFS is in your PATH
- Offer to install it automatically if not found
- Guide you through the process
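The script itself is not reproduced here, so the following is only a minimal sketch of how such a PATH check could look. The helper name check_cmd is hypothetical and is not taken from juicefs-init.sh.

```sh
#!/bin/sh
# check_cmd NAME: print "found" if NAME resolves on PATH, otherwise "missing".
# Hypothetical helper for illustration; not part of the JuiceFS tooling.
check_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found"
  else
    echo "missing"
  fi
}

# Report whether the JuiceFS client is available before going any further.
echo "juicefs: $(check_cmd juicefs)"
```

When the result is "missing", the installation commands above apply.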
Overview
JuiceFS is a high-performance POSIX file system designed for cloud-native environments. It separates data and metadata storage:
- Data: Stored in object storage (S3, GCS, Azure Blob, local disk, etc.)
- Metadata: Stored in databases (Redis, MySQL, PostgreSQL, TiKV, SQLite, etcd)
- Client: Mounts the file system and coordinates data/metadata
When to Use This Skill
Use this skill when:
- Setting up or managing JuiceFS file systems
- Integrating JuiceFS with Kubernetes, Hadoop, or Docker
- Optimizing JuiceFS performance for specific workloads
- Troubleshooting JuiceFS issues
- Migrating data to/from JuiceFS
- Configuring JuiceFS for big data, ML training, or shared storage
Core Concepts
Architecture
┌─────────────┐
│   JuiceFS   │
│   Client    │
└──────┬──────┘
       │
    ┌──┴───────────┐
    │              │
┌───▼────┐     ┌───▼────────┐
│Metadata│     │Object      │
│Engine  │     │Storage     │
│(Redis) │     │(S3, etc.)  │
└────────┘     └────────────┘
Data Organization
- Files → Chunks (max 64 MiB) → Slices (variable length) → Blocks (up to 4 MiB by default) → Object Storage
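To make the layout concrete, here is a rough arithmetic sketch of how many chunks and blocks a file would occupy. It assumes the file is written sequentially in a single pass (one slice per chunk), the 4 MiB block size, and no compression. The function name layout is made up for this example.

```sh
#!/bin/sh
# layout SIZE_BYTES: estimate chunk and block counts for one file.
# Assumes a single sequential write, 64 MiB chunks and 4 MiB blocks.
layout() {
  size=$1
  chunk=$((64 * 1024 * 1024))
  block=$((4 * 1024 * 1024))
  chunks=$(( (size + chunk - 1) / chunk ))   # round up
  blocks=$(( (size + block - 1) / block ))   # round up
  echo "chunks=${chunks} blocks=${blocks}"
}

# A 150 MiB file
layout $((150 * 1024 * 1024))   # prints "chunks=3 blocks=38"
```

Each block becomes one object in the object storage backend, which is why large files turn into many objects of at most 4 MiB.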
Metadata Engines
- Redis: Best for production, fast, supports Sentinel/Cluster for HA
- MySQL/PostgreSQL: Good for production with existing infrastructure
- TiKV: Excellent for large-scale distributed deployments
- SQLite: Simple, single-node, good for testing/development
- etcd: Small to medium scale
🔒 Security: Protecting Sensitive Credentials
IMPORTANT FOR AI AGENTS: When working with JuiceFS in AI agent environments, credentials (AK/SK, passwords) should NOT be exposed to the AI model to prevent data leakage.
SKILL Responsibility Boundary
What This SKILL Provides:
- Security guidance for AI agents working with JuiceFS
- Method to prevent AI agents from accessing sensitive credentials
- Secure initialization process with binary compilation
- Clear separation between admin setup (root) and agent usage (non-root)
What This SKILL Does NOT Handle:
- How AI agents are deployed or run
- Host system security configuration
- Network security setup
- General system administration
Design Philosophy: This SKILL assumes the AI agent runs as a non-root user and provides maximum isolation between the agent and sensitive information. Security recommendations under root/admin mode are ineffective as root has unrestricted access.
When Credential Protection is Required
Use the secure initialization approach when using:
- ✅ Object storage with access keys (S3, OSS, Azure Blob, GCS, etc.)
- ✅ Databases with passwords (Redis, MySQL, PostgreSQL with auth)
- ✅ Any configuration containing sensitive information
NOT required for:
- ❌ Local storage (--storage file) + SQLite3 (no password)
- ❌ Unauthenticated metadata engines
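The rules above can be approximated with a small heuristic. This is a sketch only: the function name needs_secure_init is hypothetical, and it detects just two cases (a non-local storage type, or a password embedded in the metadata URL). Credentials supplied in other ways would not be caught.

```sh
#!/bin/sh
# needs_secure_init STORAGE META_URL
# Prints "secure" when the setup likely involves credentials, else "plain".
# Heuristic for illustration only.
needs_secure_init() {
  storage=$1
  meta_url=$2
  case "$storage" in
    file) ;;                         # local disk: no access keys involved
    *) echo "secure"; return 0 ;;    # S3, OSS, GCS, ...: assume access keys
  esac
  case "$meta_url" in
    *://*:*@*) echo "secure" ;;      # user:password@host embedded in the URL
    *) echo "plain" ;;
  esac
}

needs_secure_init file 'sqlite3:///tmp/jfs.db'      # prints "plain"
needs_secure_init s3 'redis://localhost:6379/1'     # prints "secure"
```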
Secure Initialization Process
Instead of running juicefs format and juicefs mount directly, which would expose credentials to the agent, use the initialization script described below.
IMPORTANT: The initialization script MUST be run with root/administrator privileges (sudo)
Why root is required:
- To install shc (Shell Script Compiler) if not present
- To compile scripts into secure binaries
- To set proper ownership and permissions
- To ensure AI agent user cannot access credentials
Run the initialization script:
# MUST run as root/admin
sudo ./scripts/juicefs-init.sh
# Script will prompt for AI agent username
Re-running the script: The script is designed to be re-runnable and will:
- Detect and prompt before overwriting existing binary
- Check if filesystem already exists (skip formatting if so)
- Allow you to update configuration without reformatting
This interactive script will:
- Prompt for AI agent username
- Prompt for all sensitive configuration (AK/SK, passwords, URLs)
- Install shc (Shell Script Compiler) if not present
- Format the filesystem if needed
- Generate wrapper script with embedded credentials
- Compile wrapper into binary using shc
- Name binary after filesystem for easy identification
- Verify binary functionality
- Clean up intermediate files (wrapper script, C source)
- Set proper permissions and ownership (root:AI_AGENT_USER group, 750)
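The last step in the list (ownership root:AI_AGENT_USER, mode 750) could look roughly like the sketch below. The function name lock_down is hypothetical, and the script's actual implementation is not shown in this document. The chown is skipped when not running as root so the snippet can be tried safely on a scratch file.

```sh
#!/bin/sh
# lock_down FILE [GROUP]
# Sets mode 750 (owner rwx, group r-x, others none). When run as root and a
# group is given, also sets ownership to root:GROUP.
lock_down() {
  chmod 750 "$1" || return 1
  if [ "$(id -u)" -eq 0 ] && [ -n "${2:-}" ]; then
    chown "root:$2" "$1" || return 1
  fi
}

# Try it on a throwaway file
scratch=$(mktemp)
lock_down "$scratch"
ls -l "$scratch"
rm -f "$scratch"
```

With mode 750 and group ownership set to the agent's group, the agent can execute the binary but cannot modify or replace it.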
Generated binary (in juicefs-scripts/ directory):
- <filesystem-name>: Compiled binary wrapper (e.g., prod-data)
The binary:
- Contains embedded credentials (compiled into binary format, obfuscated)
- Accepts any JuiceFS command and parameters
- Named after filesystem for easy identification and management
- One filesystem = one binary program
- Owned by root, executable by AI agent user
AI Agent usage:
# Switch to AI agent user first
su - aiagent
# Show available commands
./juicefs-scripts/myfs
# Mount filesystem
./juicefs-scripts/myfs mount /mnt/jfs
# Mount with custom options
./juicefs-scripts/myfs mount --cache-size 204800 /mnt/jfs
# Check filesystem status
./juicefs-scripts/myfs status
# Show statistics
./juicefs-scripts/myfs stats /mnt/jfs
# Unmount filesystem
./juicefs-scripts/myfs umount /mnt/jfs
# Run benchmark
./juicefs-scripts/myfs bench /mnt/jfs
Example: Secure Setup Flow
Step 1: Admin initializes (one-time setup as root):
cd /path/to/juicefs-skill
sudo ./scripts/juicefs-init.sh
# Enter AI agent username: aiagent
# Follow prompts to enter:
# - Filesystem name: prod-data
# - Redis: localhost:6379 with password
# - S3 bucket + AWS credentials
# - Compression option (for format)
# Binary compiled: juicefs-scripts/prod-data
# Owned by root, executable by aiagent
Step 2: AI agent user uses the binary:
# Switch to AI agent user
su - aiagent
# Mount filesystem (specify mount point at runtime)
./juicefs-scripts/prod-data mount /mnt/jfs
# Check status
./juicefs-scripts/prod-data status
# Work with filesystem
ls /mnt/jfs
cp data.csv /mnt/jfs/
# Unmount when done
./juicefs-scripts/prod-data umount /mnt/jfs
For AI Agents: Detection and Guidance
When a user asks to format or mount JuiceFS:
Check if using sensitive resources:
# User mentions: S3, OSS, Azure, GCS, Redis password, MySQL password, etc.
Provide guidance:
⚠️ SECURITY WARNING: This operation requires sensitive credentials.
To prevent credential exposure to the AI model:
1. Please run the initialization script yourself (MUST use sudo):
sudo ./scripts/juicefs-init.sh
# Script will prompt for AI agent username
2. Follow the prompts to configure your filesystem
- The script will install shc if needed
- Script will compile credentials into a binary using shc
- Binary will be named after your filesystem
- Binary owned by root, executable by AI agent
3. Once complete, I can help you use the generated binary:
- Show commands: ./juicefs-scripts/<name>
- Mount: ./juicefs-scripts/<name> mount <mountpoint>
- Status: ./juicefs-scripts/<name> status
- Unmount: ./juicefs-scripts/<name> umount <mountpoint>
This keeps your AK/SK and passwords secure from the AI model.
The binary contains compiled credentials that cannot be read with simple commands.
Note: Root privileges are required for shc installation, binary compilation,
and setting proper ownership/permissions.
Insecure Setup (Local Development Only)
For local development without sensitive data:
# This is safe for AI agents - no credentials involved
juicefs format \
--storage file \
--bucket /tmp/jfs-storage \
sqlite3:///tmp/jfs.db \
dev-fs
juicefs mount sqlite3:///tmp/jfs.db /mnt/jfs-dev
Essential Commands
1. Format a File System
Create a new JuiceFS file system:
# Basic format with Redis and S3
juicefs format \
--storage s3 \
--bucket https://mybucket.s3.amazonaws.com \
redis://localhost:6379/1 \
my-juicefs
# With compression
juicefs format \
--storage s3 \
--bucket https://mybucket.s3.amazonaws.com \
--compress lz4 \
redis://localhost:6379/1 \
my-juicefs
# Local development with SQLite
juicefs format \
--storage file \
--bucket /data/storage \
sqlite3://myjfs.db \
dev-fs
2. Mount a File System
# Basic mount
juicefs mount redis://localhost:6379/1 /mnt/jfs
# Production mount with cache optimization
juicefs mount \
--cache-dir /ssd/cache \
--cache-size 204800 \
--writeback \
-d \
redis://localhost:6379/1 \
/mnt/jfs
# Mount with prefetch for read-heavy workloads
juicefs mount \
--cache-dir /nvme/cache \
--cache-size 409600 \
--prefetch 3 \
redis://localhost:6379/1 \
/mnt/jfs
Key Mount Options:
- --cache-dir: Cache directory (default: ~/.juicefs/cache, or /var/jfsCache when running as root)
- --cache-size: Cache size in MiB (default: 102400, i.e. 100 GiB)
- --writeback: Enable write-back cache for better write performance
- --prefetch N: Prefetch N blocks in parallel (default: 1)
- --buffer-size: Total read/write buffer size in MiB (default: 300)
- -d: Run in background (daemon mode)
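Because --cache-size is given in MiB, the values used in this document (102400, 204800, 409600) are simply 100, 200, and 400 GiB multiplied by 1024. A tiny conversion helper, with the hypothetical name gib_to_mib, shows the arithmetic:

```sh
#!/bin/sh
# gib_to_mib N: convert a size in GiB to the MiB value --cache-size expects.
gib_to_mib() {
  echo $(( $1 * 1024 ))
}

gib_to_mib 200   # prints 204800
```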
3. Unmount
# Graceful unmount
juicefs umount /mnt/jfs
# Force unmount
juicefs umount -f /mnt/jfs
4. Sync Data
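# Sync local to JuiceFS without mounting (JuiceFS v1.1+): in jfs://NAME/path,
# NAME is an environment variable that holds the metadata URL
myfs=redis://localhost:6379/1 juicefs sync /local/path/ jfs://myfs/remote/path/
# Sync between JuiceFS file systems
srcfs=redis://localhost:6379/1 dstfs=redis://localhost:6379/2 juicefs sync jfs://srcfs/src/ jfs://dstfs/dst/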
# Sync from S3 to JuiceFS
juicefs sync s3://bucket/path/ /mnt/jfs/path/
# Dry run
juicefs sync --dry-run /source/ /dest/
5. Status and Monitoring
# Show file system status
juicefs status redis://localhost:6379/1
# Real-time statistics
juicefs stats /mnt/jfs
# Profile operations
juicefs profile /mnt/jfs
# Benchmark
juicefs bench /mnt/jfs
6. Configuration
# View configuration
juicefs config redis://localhost:6379/1
# Set trash retention
juicefs config redis://localhost:6379/1 --trash-days 7
# Set capacity quota (value in GiB; 1048576 GiB = 1 PiB)
juicefs config redis://localhost:6379/1 --capacity 1048576
7. Maintenance
# Garbage collection scan (report only; nothing is deleted without --delete)
juicefs gc redis://localhost:6379/1
# Actual garbage collection: delete leaked objects
juicefs gc redis://localhost:6379/1 --delete
# Dump metadata for backup
juicefs dump redis://localhost:6379/1 backup.json
# Load metadata from backup
juicefs load redis://localhost:6379/1 backup.json
8. S3 Gateway
# Start S3-compatible gateway
export MINIO_ROOT_USER=admin
export MINIO_ROOT_PASSWORD=12345678
juicefs gateway redis://localhost:6379/1 localhost:9000
Configuration by Workload
Big Data Processing (Hadoop/Spark)
juicefs mount \
--cache-dir /ssd/cache \
--cache-size 204800 \
--writeback \
redis://redis:6379/1 \
/mnt/jfs
Machine Learning Training
juicefs mount \
--cache-dir /nvme/cache \
--cache-size 409600 \
--prefetch 3 \
--buffer-size 600 \
redis://redis:6379/1 \
/mnt/ml-data
Shared Development Environment
juicefs mount \
--cache-size 102400 \
redis://redis:6379/1 \
/mnt/shared
Backup/Archive (Write-heavy)
juicefs mount \
--writeback \
--buffer-size 600 \
redis://redis:6379/1 \
/mnt/backup
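The four profiles above can be collected into one lookup so that a script picks the flags by workload name. This is a sketch: the function workload_opts and the short workload names are invented here, while the flag values are copied from the profiles above.

```sh
#!/bin/sh
# workload_opts NAME: print the mount flags for one of the profiles above.
workload_opts() {
  case "$1" in
    bigdata) echo "--cache-dir /ssd/cache --cache-size 204800 --writeback" ;;
    ml)      echo "--cache-dir /nvme/cache --cache-size 409600 --prefetch 3 --buffer-size 600" ;;
    shared)  echo "--cache-size 102400" ;;
    backup)  echo "--writeback --buffer-size 600" ;;
    *)       echo "unknown workload: $1" >&2; return 1 ;;
  esac
}

# Print the command for review instead of running it
echo "juicefs mount $(workload_opts ml) redis://redis:6379/1 /mnt/ml-data"
```

Printing the command first keeps the example free of side effects. Once the output looks right, the same line can be run directly.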
Kubernetes Integration
Basic PersistentVolume
apiVersion: v1
kind: PersistentVolume
metadata:
  name: juicefs-pv
spec:
  capacity:
    storage: 10Pi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  csi:
    driver: csi.juicefs.com
    volumeHandle: juicefs-volume
    fsType: juicefs
    nodePublishSecretRef:
      name: juicefs-secret
      namespace: default
Troubleshooting
Mount Fails
Check metadata engine:
# For Redis
redis-cli -h localhost -p 6379 ping
Check credentials: Verify access keys for object storage
Check logs:
# Default log path for root mounts; non-root mounts log to ~/.juicefs/juicefs.log
tail -f /var/log/juicefs.log
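When the metadata URL is not simply localhost, the host and port for the connectivity check have to be pulled out of the URL first. The sketch below does that with plain parameter expansion. The function name meta_hostport is hypothetical. It assumes a single-host URL of the form scheme://[user:password@]host[:port][/db], falls back to the Redis default port 6379, and does not handle Sentinel-style or multi-host URLs.

```sh
#!/bin/sh
# meta_hostport URL: print "HOST PORT" for a single-host metadata URL.
meta_hostport() {
  rest=${1#*://}       # drop the scheme
  rest=${rest##*@}     # drop optional user:password@
  rest=${rest%%/*}     # drop the /DB suffix
  host=${rest%%:*}
  port=${rest##*:}
  if [ "$port" = "$rest" ]; then
    port=6379          # no explicit port: assume the Redis default
  fi
  echo "$host $port"
}

meta_hostport 'redis://:secret@myhost:6380/2'   # prints "myhost 6380"
```

The two fields can then be passed to redis-cli -h and -p for the ping test shown above.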
Slow Performance
Check cache hit rate:
juicefs stats /mnt/jfs
Increase cache:
juicefs umount /mnt/jfs
juicefs mount --cache-size 204800 redis://localhost:6379/1 /mnt/jfs
Enable prefetch for sequential reads:
juicefs mount --prefetch 3 redis://localhost:6379/1 /mnt/jfs
No Space Left on Device
Clean cache:
rm -rf ~/.juicefs/cache/*
Increase free-space-ratio:
juicefs mount --free-space-ratio 0.2 redis://localhost:6379/1 /mnt/jfs
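To see how close the cache disk is to the threshold, the free percentage of the file system holding the cache directory can be read from df. A minimal sketch follows. The function name cache_free_pct is made up, and /tmp stands in for the real cache directory.

```sh
#!/bin/sh
# cache_free_pct DIR: print the free space of DIR's file system as an
# integer percentage, derived from the "Capacity" (used %) column of df -P.
cache_free_pct() {
  df -P "$1" | awk 'NR==2 { print 100 - $5 }'
}

free=$(cache_free_pct /tmp)   # /tmp is a stand-in for the cache directory
if [ "$free" -lt 20 ]; then
  echo "only ${free}% free: below the 20% that --free-space-ratio 0.2 asks for"
else
  echo "${free}% free"
fi
```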
Common Patterns
Production Setup with HA
# Format with Redis Sentinel
juicefs format \
--storage s3 \
--bucket https://prod-bucket.s3.amazonaws.com \
redis://mymaster,sentinel1,sentinel2,sentinel3:26379/1 \
prod-fs
# Mount with optimized settings
juicefs mount \
--cache-dir /ssd/cache \
--cache-size 204800 \
--writeback \
-d \
redis://mymaster,sentinel1,sentinel2,sentinel3:26379/1 \
/mnt/jfs
Development Setup
# Format with SQLite (local)
juicefs format \
--storage file \
--bucket /tmp/jfs-storage \
sqlite3:///tmp/jfs.db \
dev-fs
# Mount
juicefs mount sqlite3:///tmp/jfs.db /mnt/jfs-dev
Data Migration
# Step 1: Mount source and destination
juicefs mount redis://source:6379/1 /mnt/source
juicefs mount redis://dest:6379/1 /mnt/dest
# Step 2: Sync data
juicefs sync /mnt/source/ /mnt/dest/
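# Or sync directly without mounting (in jfs://NAME/, NAME is an environment variable holding the metadata URL)
srcfs=redis://source:6379/1 dstfs=redis://dest:6379/1 juicefs sync jfs://srcfs/ jfs://dstfs/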
Performance Tuning Quick Guide
| Workload | Cache Size | Cache Dir | Extra Options |
|---|---|---|---|
| Read-heavy | 200-400GB | SSD/NVMe | --prefetch 3 |
| Write-heavy | 100-200GB | SSD | --writeback --buffer-size 600 |
| ML Training | 400GB+ | NVMe | --prefetch 3 --cache-size 409600 |
| Mixed | 100-200GB | SSD | Default |
| Small files | 100GB | SSD | --prefetch 1 |
Security Best Practices
🔒 Protect credentials in AI agent environments:
- Use ./scripts/juicefs-init.sh to create compiled binary with embedded credentials
- The script uses shc (Shell Script Compiler) to protect sensitive information
- Binary is named after filesystem for easy management
- Credentials are compiled into binary format (obfuscated by shc)
- This prevents AI models from easily accessing AK/SK, passwords, and sensitive URLs
- See the "Security: Protecting Sensitive Credentials" section above for details
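Enable encryption (--encrypt-secret encrypts the object storage secret key saved in the metadata engine; encrypting file data requires --encrypt-rsa-key):
juicefs format --encrypt-secret redis://localhost:6379/1 secure-fs
Use TLS for metadata engine: Connect via rediss:// instead of redis://
Use HTTPS for object storage: Always use HTTPS endpoints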
IAM roles: Use IAM roles instead of static access keys when possible
Network isolation: Use VPC/private networks for production
Advanced Security Recommendations
For production environments requiring maximum security:
1. Secret Management Services:
- AWS Secrets Manager / Parameter Store
- HashiCorp Vault
- Azure Key Vault
- Benefits: Centralized rotation, auditing, time-limited access
2. IAM-Based Authentication:
- AWS: Use IAM roles with EC2 instance profiles
- Azure: Use Managed Identity
- GCP: Use Workload Identity
- Benefits: No static credentials, automatic rotation
3. Certificate-Based Authentication:
- Use TLS client certificates for Redis/databases
- Benefits: No passwords to protect, automatic validation
4. Configuration File Encryption:
- age (modern encryption tool)
- SOPS (Secrets OPerationS)
- Benefits: Version-controllable configs, separate key management
See scripts/SECURITY_MODEL.md for detailed implementation guidance.
Environment Variables
The initialization script does NOT export sensitive environment variables. Instead, credentials are compiled into secure binaries.
For reference, JuiceFS supports these environment variables:
# Custom cache (✓ Safe - no credentials)
export JUICEFS_CACHE_DIR=/ssd/cache
# Debug logging (✓ Safe - no credentials)
export JUICEFS_LOGLEVEL=debug
# AWS credentials (⚠️ NOT RECOMMENDED - exposes to AI agent)
# export AWS_ACCESS_KEY_ID=your-key
# export AWS_SECRET_ACCESS_KEY=your-secret
# Redis password (⚠️ NOT RECOMMENDED - exposes to AI agent)
# export REDIS_PASSWORD=your-password
Recommended approach: Use the initialization script which compiles credentials into binaries rather than using environment variables.
Quick Decision Trees
Choosing a Metadata Engine
- Redis: Fast, production-ready, supports HA (Sentinel/Cluster)
- MySQL/PostgreSQL: Already have infrastructure, need SQL features
- TiKV: Large scale, need horizontal scalability
- SQLite: Development, testing, single node
- etcd: Small to medium scale, already using etcd
Choosing Cache Size
- Working set < 100GB: 100GB cache (102400 MiB)
- Working set 100-500GB: 200-400GB cache
- Working set > 500GB: 400GB+ cache
- Rule of thumb when the working set is too large to cache in full: cache at least 10-20% of it
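The tiers above translate directly into a lookup. In this sketch the function name recommend_cache_mib is hypothetical, and for the middle and top tiers it returns the lower bound of the suggested range (200 GiB and 400 GiB), which is a choice made for the example and not a JuiceFS recommendation.

```sh
#!/bin/sh
# recommend_cache_mib WORKING_SET_GIB: print a --cache-size value in MiB
# following the tiers listed above (lower bound of each range).
recommend_cache_mib() {
  ws_gib=$1
  if [ "$ws_gib" -lt 100 ]; then
    echo 102400     # 100 GiB
  elif [ "$ws_gib" -le 500 ]; then
    echo 204800     # 200 GiB
  else
    echo 409600     # 400 GiB
  fi
}

recommend_cache_mib 300   # prints 204800
```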
References
For detailed information, see the references:
- Comprehensive Reference - Complete JuiceFS documentation
- Quick Start Guide - Task patterns and troubleshooting flowcharts
- Table of Contents - Index of all topics
Resources
- Official Documentation: https://juicefs.com/docs/community/introduction
- GitHub Repository: https://github.com/juicedata/juicefs
- Quick Start: https://juicefs.com/docs/community/quick_start_guide
- Command Reference: https://juicefs.com/docs/community/command_reference
- Community: https://github.com/juicedata/juicefs/discussions
Installation
# Linux AMD64
curl -sSL https://d.juicefs.com/install | sh -
# macOS (Homebrew)
brew install juicefs
# Docker
docker pull juicedata/juicefs
Tips for AI Agents
- Always check metadata engine connectivity first
- Cache is critical - allocate sufficient space on fast storage
- Use --writeback for write-heavy, --prefetch for read-heavy workloads
- Monitor with juicefs stats regularly
- Test with juicefs bench before production
- Plan for metadata engine HA in production
- Use compression (--compress lz4) to reduce costs
- Enable trash (--trash-days 7) for safety
- Run juicefs gc regularly
- Keep JuiceFS client updated