VM Deployment with Redis Enterprise¶
Deploy the Redis SRE Agent on a Linux VM using Redis Enterprise as the application database (for agent state, vector search, and task queues).
This guide covers running the agent directly on the VM, without containers.
Prerequisites¶
System Requirements¶
- OS: Linux (Ubuntu 22.04+, Debian 11+, RHEL 8+, or similar)
- CPU: 4+ cores recommended (2 minimum)
- RAM: 8GB+ recommended (4GB minimum)
- Disk: 20GB+ free space (for application, artifacts, and Redis data)
- Network: Outbound HTTPS access for OpenAI API and package downloads
Software Requirements¶
- Python: 3.12 or 3.11 (3.12 recommended)
- Redis Enterprise: 7.4+ with RediSearch module enabled
- uv: Package manager (installed below)
- Git: For cloning the repository
- curl: For health checks and testing
Access Requirements¶
- OpenAI API key (or OpenAI-compatible endpoint)
- Redis Enterprise connection URL with credentials
- Optional: Prometheus, Loki endpoints for tool providers
1. Install System Dependencies¶
Ubuntu/Debian¶
sudo apt-get update
sudo apt-get install -y \
git \
curl \
ca-certificates \
python3.12 \
python3.12-venv \
python3.12-dev \
build-essential \
redis-tools
RHEL/Rocky/AlmaLinux¶
sudo dnf install -y \
git \
curl \
ca-certificates \
python3.12 \
python3.12-devel \
gcc \
gcc-c++ \
make \
redis
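Verify the interpreter and Redis CLI are available (a quick sanity check on either distribution):
python3.12 --version
redis-cli --version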
2. Install uv Package Manager¶
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.cargo/env # Add uv to PATH
Verify installation:
uv --version
3. Clone and Install the Agent¶
# Clone repository
git clone https://github.com/redis-applied-ai/redis-sre-agent.git
cd redis-sre-agent
# Install dependencies in a virtual environment
uv sync
# Verify installation
uv run redis-sre-agent --help
4. Configure Redis Enterprise¶
Create a Database¶
In Redis Enterprise, create a database with:
- Modules: RediSearch enabled
- Memory: 2GB+ recommended
- Eviction policy: noeviction (recommended for operational data)
- Persistence: AOF or RDB (recommended)
Get Connection Details¶
Note your Redis Enterprise connection URL:
redis://username:password@redis-enterprise-host:port/0
Or for TLS:
rediss://username:password@redis-enterprise-host:port/0
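Before configuring the agent, it is worth a quick connectivity and module check from the VM with redis-cli (substitute your actual connection URL; this is only a sanity check):
redis-cli -u "redis://username:password@redis-enterprise-host:port/0" PING
redis-cli -u "redis://username:password@redis-enterprise-host:port/0" MODULE LIST
PING should return PONG, and MODULE LIST should include the search module.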
5. Configure Environment¶
Create .env file:
cp .env.example .env
Edit .env with your configuration:
# Required: Redis Enterprise connection
REDIS_URL=redis://username:password@your-redis-enterprise:12000/0
# Required: OpenAI API key
OPENAI_API_KEY=sk-your-openai-api-key
# Recommended: Master key for secret encryption (32-byte base64)
# Generate with: python3 -c 'import os, base64; print(base64.b64encode(os.urandom(32)).decode())'
REDIS_SRE_MASTER_KEY=your-32-byte-base64-key
# Optional: Tool provider endpoints
TOOLS_PROMETHEUS_URL=http://your-prometheus:9090
TOOLS_LOKI_URL=http://your-loki:3100
# Optional: Observability
# OTEL_EXPORTER_OTLP_ENDPOINT=http://your-collector:4318/v1/traces
# LANGCHAIN_TRACING_V2=true
# LANGCHAIN_API_KEY=your-langsmith-key
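Because .env contains credentials, restrict its permissions and confirm the required values are present (a simple check; adapt as needed):
chmod 600 .env
grep -E '^(REDIS_URL|OPENAI_API_KEY)=' .env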
6. Prepare Knowledge Base Artifacts¶
The agent needs runbooks and documentation ingested into the knowledge base.
Option A: Use Pre-built Artifacts (Faster)¶
If you have pre-built artifacts from another environment:
# Copy artifacts to ./artifacts directory
scp -r user@build-server:/path/to/artifacts ./artifacts
Option B: Build Artifacts Locally¶
# Prepare source documents (scrape docs, generate runbooks)
uv run redis-sre-agent pipeline prepare-sources \
--source-dir ./source_documents \
--artifacts-path ./artifacts \
--prepare-only
# This creates a batch in ./artifacts/YYYY-MM-DD/
7. Ingest Knowledge Base¶
Ingest the prepared artifacts into Redis Enterprise:
# Find the latest batch
ls -lt ./artifacts/ | head -n 5
# Ingest the batch (replace with your batch date)
uv run redis-sre-agent pipeline ingest \
--batch-date 2025-01-15 \
--artifacts-path ./artifacts
# Expected output:
# ✅ Ingestion completed!
# Batch date: 2025-01-15
# Documents processed: 450
# Chunks created: 12500
# Chunks indexed: 12500
This creates vector embeddings and indexes them in Redis Enterprise using RediSearch.
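To confirm the index exists after ingestion, list the RediSearch indexes on the database (index names depend on configuration, so FT._LIST is the safest check; the same command appears under Troubleshooting below):
redis-cli -u "redis://username:password@your-redis-enterprise:12000/0" FT._LIST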
8. Start the Agent and Worker¶
The agent consists of two processes:
- API: FastAPI server (port 8000)
- Worker: Background task processor
Start the Worker (Terminal 1)¶
cd redis-sre-agent
uv run redis-sre-agent worker --concurrency 4
Expected output:
✅ SRE tasks registered with Docket
✅ Worker started, waiting for SRE tasks... Press Ctrl+C to stop
Prometheus metrics server started on :9101
Start the API (Terminal 2)¶
cd redis-sre-agent
uv run uvicorn redis_sre_agent.api.app:app --host 0.0.0.0 --port 8000
Expected output:
INFO: Started server process
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000
Verify Health¶
# Quick health check
curl -fsS http://localhost:8000/
# Detailed health check
curl -fsS http://localhost:8000/api/v1/health | jq
# Expected: status "healthy" or "degraded" (degraded is OK if workers just started)
9. Create a Target Redis Instance¶
Create an instance record for a Redis you want to monitor/triage:
uv run redis-sre-agent instance create \
--name "Production Cache" \
--connection-url "redis://prod-redis:6379/0" \
--environment production \
--usage cache \
--description "Primary production cache" \
--monitoring-identifier "redis-prod-cache" \
--logging-identifier "redis-prod-cache"
# Output: ✅ Created instance redis-production-1234567890
List instances to get the instance ID:
uv run redis-sre-agent instance list
10. Test the Agent with a Health Check¶
Run a triage query against your target instance:
# Replace <instance-id> with the ID from step 9
uv run redis-sre-agent query \
"Check memory pressure and identify any slow operations" \
--redis-instance-id <instance-id>
The agent will:
- Create a task and thread
- Use the triage agent with parallel research tracks
- Query Prometheus metrics (if configured)
- Run Redis command diagnostics (INFO, SLOWLOG, etc.)
- Analyze findings and provide recommendations
Expected output:
✅ Task created: task_01JXXXXXXXXXXXXXXXXXXXXX
📋 Thread: thread_01JXXXXXXXXXXXXXXXXXXXXX
🔄 Status: running
[Agent output with analysis and recommendations]
✅ Task completed
11. Production Deployment¶
Run as Systemd Services¶
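The unit files below assume a dedicated sre-agent user with uv installed in its home directory and the repository (including .env and artifacts) at /opt/redis-sre-agent. A minimal setup sketch, assuming you cloned into your home directory; adjust paths, shell, and ownership to your environment:
# Create a service user and move the checkout where the units expect it
sudo useradd --system --create-home --shell /bin/bash sre-agent
sudo cp -r ~/redis-sre-agent /opt/redis-sre-agent
sudo chown -R sre-agent:sre-agent /opt/redis-sre-agent
# Install uv for the service user (the units reference /home/sre-agent/.cargo/bin/uv;
# the install location may differ with newer uv releases)
sudo -iu sre-agent sh -c 'curl -LsSf https://astral.sh/uv/install.sh | sh'
sudo -iu sre-agent sh -c 'cd /opt/redis-sre-agent && ~/.cargo/bin/uv sync'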
Create /etc/systemd/system/redis-sre-worker.service:
[Unit]
Description=Redis SRE Agent Worker
After=network.target
[Service]
Type=simple
User=sre-agent
WorkingDirectory=/opt/redis-sre-agent
Environment="PATH=/home/sre-agent/.cargo/bin:/usr/local/bin:/usr/bin:/bin"
ExecStart=/home/sre-agent/.cargo/bin/uv run redis-sre-agent worker --concurrency 4
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
Create /etc/systemd/system/redis-sre-api.service:
[Unit]
Description=Redis SRE Agent API
After=network.target redis-sre-worker.service
[Service]
Type=simple
User=sre-agent
WorkingDirectory=/opt/redis-sre-agent
Environment="PATH=/home/sre-agent/.cargo/bin:/usr/local/bin:/usr/bin:/bin"
ExecStart=/home/sre-agent/.cargo/bin/uv run uvicorn redis_sre_agent.api.app:app --host 0.0.0.0 --port 8000
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
Enable and start:
sudo systemctl daemon-reload
sudo systemctl enable redis-sre-worker redis-sre-api
sudo systemctl start redis-sre-worker redis-sre-api
# Check status
sudo systemctl status redis-sre-worker
sudo systemctl status redis-sre-api
# View logs
sudo journalctl -u redis-sre-worker -f
sudo journalctl -u redis-sre-api -f
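The services are expected to pick up .env from their WorkingDirectory (as configured in step 5). If you prefer to have systemd export those variables into the service environment instead, a drop-in is one option (a sketch only; it assumes .env lives at /opt/redis-sre-agent/.env with plain KEY=value lines and is readable by the sre-agent user):
sudo mkdir -p /etc/systemd/system/redis-sre-worker.service.d
printf '[Service]\nEnvironmentFile=/opt/redis-sre-agent/.env\n' | sudo tee /etc/systemd/system/redis-sre-worker.service.d/env.conf
sudo systemctl daemon-reload
sudo systemctl restart redis-sre-worker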
Reverse Proxy (Optional)¶
Put the API behind nginx or similar:
server {
    listen 443 ssl;
    server_name sre-agent.example.com;

    ssl_certificate /etc/ssl/certs/sre-agent.crt;
    ssl_certificate_key /etc/ssl/private/sre-agent.key;

    location / {
        proxy_pass http://127.0.0.1:8000;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    location /api/v1/ws/ {
        proxy_pass http://127.0.0.1:8000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
12. Monitoring the Agent¶
Prometheus Metrics¶
The agent exposes metrics at:
- API: http://localhost:8000/api/v1/metrics
- Worker: http://localhost:9101/
Add to your Prometheus scrape config:
scrape_configs:
  - job_name: 'redis-sre-agent'
    static_configs:
      - targets: ['sre-agent-vm:8000']
    metrics_path: /api/v1/metrics
  - job_name: 'redis-sre-worker'
    static_configs:
      - targets: ['sre-agent-vm:9101']
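A quick curl against both endpoints confirms they respond before Prometheus starts scraping (the worker's metrics path may be / or /metrics depending on how its metrics server exposes them):
curl -fsS http://localhost:8000/api/v1/metrics | head
curl -fsS http://localhost:9101/ | head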
Logs¶
- Worker logs: sudo journalctl -u redis-sre-worker -f
- API logs: sudo journalctl -u redis-sre-api -f
- Set LOG_LEVEL=DEBUG in .env for verbose logging
Troubleshooting¶
Agent can't connect to Redis Enterprise¶
- Verify connection URL and credentials
- Check network connectivity: redis-cli -u "redis://user:pass@host:port" PING
- Ensure RediSearch module is enabled: redis-cli -u "..." MODULE LIST
Knowledge base ingestion fails¶
- Check Redis memory: redis-cli -u "..." INFO memory
- Verify RediSearch index: redis-cli -u "..." FT._LIST
- Check that the artifacts directory exists and has content
Worker not processing tasks¶
- Verify worker is running: ps aux | grep redis-sre-agent
- Check Redis connectivity from the worker
- Review worker logs for errors
High memory usage¶
- Reduce worker concurrency: --concurrency 2
- Limit LLM context in .env: MAX_ITERATIONS=15
- Monitor with: ps aux | grep redis-sre-agent
Next Steps¶
- Configure schedules for recurring health checks: docs/how-to/scheduling-flows.md
- Set up tool providers (Prometheus, Loki): docs/how-to/tool-providers.md
- Enable observability (OTel, LangSmith): docs/operations/observability.md
- Explore the API: http://localhost:8000/docs