Configuration
HYDRA — .env file
# Required
GROQ_API_KEY=gsk_your_key_here
# SSH server
SSH_PORT=2222
SSH_HOST=0.0.0.0
# Logging
LOG_DIR=logs
LOG_LEVEL=INFO
# LLM settings
LLM_CACHE_SIZE=200
LLM_CACHE_TTL=300
# Anti-fingerprinting
AUTH_DELAY_MIN=0.5
AUTH_DELAY_MAX=2.0
SSH_BANNER=SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.6
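As a rough illustration of how these variables might be consumed, the sketch below reads them from a mapping (the process environment by default) and falls back to the values listed above. The `Settings` class and the `load_settings` and `pick_auth_delay` helpers are hypothetical names, not part of HYDRA. The sketch also assumes the `.env` file has already been exported into the environment and that the two delay values are seconds.

```python
import os
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical container; fallbacks mirror the values in the .env listing."""
    groq_api_key: str
    ssh_host: str = "0.0.0.0"
    ssh_port: int = 2222
    log_dir: str = "logs"
    log_level: str = "INFO"
    llm_cache_size: int = 200
    llm_cache_ttl: int = 300
    auth_delay_min: float = 0.5
    auth_delay_max: float = 2.0
    ssh_banner: str = "SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.6"


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping of variable names to strings."""
    env = os.environ if env is None else env
    key = env.get("GROQ_API_KEY", "")
    if not key:
        # GROQ_API_KEY is the only variable marked as required.
        raise ValueError("GROQ_API_KEY is required")
    d = Settings(groq_api_key=key)  # carries the fallback values
    s = Settings(
        groq_api_key=key,
        ssh_host=env.get("SSH_HOST", d.ssh_host),
        ssh_port=int(env.get("SSH_PORT", d.ssh_port)),
        log_dir=env.get("LOG_DIR", d.log_dir),
        log_level=env.get("LOG_LEVEL", d.log_level).upper(),
        llm_cache_size=int(env.get("LLM_CACHE_SIZE", d.llm_cache_size)),
        llm_cache_ttl=int(env.get("LLM_CACHE_TTL", d.llm_cache_ttl)),
        auth_delay_min=float(env.get("AUTH_DELAY_MIN", d.auth_delay_min)),
        auth_delay_max=float(env.get("AUTH_DELAY_MAX", d.auth_delay_max)),
        ssh_banner=env.get("SSH_BANNER", d.ssh_banner),
    )
    if s.auth_delay_min > s.auth_delay_max:
        raise ValueError("AUTH_DELAY_MIN must not exceed AUTH_DELAY_MAX")
    return s


def pick_auth_delay(settings: Settings, rng=random) -> float:
    """Draw a random delay between the configured minimum and maximum."""
    return rng.uniform(settings.auth_delay_min, settings.auth_delay_max)
```

Validating the delay range at load time means a misconfigured pair fails at startup rather than during an authentication attempt.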
HYDRA — hydra_link.yaml
Configures the PDX ↔ HYDRA bridge:
hydra:
  install_path: "../hydra-honeypot"
  logs_dir: "logs"
  training_output_dir: "training_output"
  feedback_store: "data/feedback.yaml"
  auto_collect:
    enabled: true
    min_sessions: 10
    pdx_destination: "training_output/hydra_collected"
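To make the role of these keys more concrete, here is a minimal sketch of how a bridge might use them. Everything about the behavior is an assumption: that `auto_collect` sits under `hydra`, that `logs_dir` is relative to `install_path`, and that `min_sessions` is the number of sessions required before collection starts. The helper names `hydra_logs_path` and `should_collect` are hypothetical, and the config is written as a plain dict to avoid depending on a YAML parser.

```python
from pathlib import Path

# Plain-dict mirror of the keys shown above; the nesting is an assumption.
HYDRA_LINK = {
    "hydra": {
        "install_path": "../hydra-honeypot",
        "logs_dir": "logs",
        "training_output_dir": "training_output",
        "feedback_store": "data/feedback.yaml",
        "auto_collect": {
            "enabled": True,
            "min_sessions": 10,
            "pdx_destination": "training_output/hydra_collected",
        },
    }
}


def hydra_logs_path(config: dict, base: str) -> Path:
    """Join base, install_path and logs_dir (assumes logs_dir is relative)."""
    hydra = config["hydra"]
    return Path(base) / hydra["install_path"] / hydra["logs_dir"]


def should_collect(config: dict, session_count: int) -> bool:
    """Collect only when enabled and enough sessions exist (assumed meaning)."""
    auto = config["hydra"]["auto_collect"]
    return bool(auto["enabled"]) and session_count >= auto["min_sessions"]
```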
PDX — Scope policy
scope_policy.yaml defines which targets PDX is allowed to scan (used by the Burp bridge):
scope:
  include:
    - "*.target.com"
    - "api.target.com"
  exclude:
    - "*.google.com"
    - "*.cloudflare.com"
policy:
  max_requests_per_second: 10
  respect_robots_txt: true
  follow_redirects: true
  max_depth: 3
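The include and exclude lists above can be read as glob patterns over hostnames. The sketch below shows one plausible way to evaluate them with Python's standard `fnmatch` module. It is not PDX's actual matcher: the `in_scope` helper is a hypothetical name, and the rule that an exclude match overrides an include match is an assumption.

```python
from fnmatch import fnmatchcase

# Patterns copied from the scope section above.
SCOPE = {
    "include": ["*.target.com", "api.target.com"],
    "exclude": ["*.google.com", "*.cloudflare.com"],
}


def in_scope(host: str, scope: dict = SCOPE) -> bool:
    """Return True if host matches an include pattern and no exclude pattern."""
    host = host.lower().rstrip(".")  # normalize case and a trailing root dot
    if any(fnmatchcase(host, pat) for pat in scope["exclude"]):
        return False  # assumed precedence: exclude wins
    return any(fnmatchcase(host, pat) for pat in scope["include"])
```

Under glob semantics, `*.target.com` matches subdomains at any depth but not the bare `target.com`, so the apex domain would need its own include entry if it should be scanned.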
Session classifier thresholds
Adjust these in code or via a config dict:
| Parameter | Default | Description |
|---|---|---|
| ephemeral_threshold_s | 5 | Sessions shorter than this many seconds are classified as bot_ephemeral |
| recon_threshold_s | 20 | Duration threshold in seconds, combined with the command count, for bot_recon |
| recon_max_cmds | 3 | Maximum number of commands to still qualify as recon-only |
| human_min_duration_s | 20 | Minimum duration in seconds for likely_human |
| human_min_non_disc | 1 | Minimum number of non-discovery commands for likely_human |
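The thresholds above suggest a simple rule cascade. The following sketch is one possible reading of the table, not the classifier's actual code: the order of the checks, the comparison directions for `bot_recon`, and the `unclassified` fallback label are all assumptions, and `classify_session` is a hypothetical name. It also shows overriding defaults via a config dict.

```python
DEFAULT_THRESHOLDS = {
    "ephemeral_threshold_s": 5,
    "recon_threshold_s": 20,
    "recon_max_cmds": 3,
    "human_min_duration_s": 20,
    "human_min_non_disc": 1,
}


def classify_session(duration_s, cmd_count, non_discovery_cmds, overrides=None):
    """Label a session using the table's thresholds (assumed rule order)."""
    t = {**DEFAULT_THRESHOLDS, **(overrides or {})}
    if duration_s < t["ephemeral_threshold_s"]:
        return "bot_ephemeral"
    # Assumption: recon means a short session with few commands.
    if duration_s < t["recon_threshold_s"] and cmd_count <= t["recon_max_cmds"]:
        return "bot_recon"
    if (duration_s >= t["human_min_duration_s"]
            and non_discovery_cmds >= t["human_min_non_disc"]):
        return "likely_human"
    return "unclassified"  # hypothetical fallback label
```

For example, `classify_session(2, 1, 0, {"ephemeral_threshold_s": 1})` no longer counts a two-second session as ephemeral, so it falls through to the recon check.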
PromptGuard thresholds
| Parameter | Default | Description |
|---|---|---|
| warn_threshold | 0.5 | Scores above this are logged as a warning |
| block_threshold | 0.8 | Scores above this are logged as a block; nothing is actually blocked |
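A short sketch of how the two thresholds could map a score to a logged action follows. The function names, the `allow` label, and the logger name are hypothetical. The comparisons are strict because the table says "above", and, matching the table, the sketch only logs and never rejects anything.

```python
import logging

logger = logging.getLogger("promptguard_sketch")  # hypothetical logger name


def score_action(score, warn_threshold=0.5, block_threshold=0.8):
    """Map a score to 'allow', 'warn' or 'block' using strict comparisons."""
    if score > block_threshold:
        return "block"
    if score > warn_threshold:
        return "warn"
    return "allow"


def log_score(score, **thresholds):
    """Log the action for a score; log-only, the input is never rejected."""
    action = score_action(score, **thresholds)
    if action == "block":
        logger.error("score %.2f exceeds block threshold (log only)", score)
    elif action == "warn":
        logger.warning("score %.2f exceeds warn threshold", score)
    return action
```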
Quality pipeline
| Parameter | Default | Description |
|---|---|---|
| min_quality | 0.3 | Minimum quality score for a sample to be kept |
| dedup_threshold | 0.85 | Trigram similarity threshold for deduplication |
| min_tokens | 50 | Minimum output length in tokens |
| max_tokens | 2000 | Maximum output length in tokens |
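To illustrate how these four parameters could interact, the sketch below filters `(text, quality)` pairs. Several details are assumptions rather than the pipeline's documented behavior: trigram similarity is taken to be Jaccard similarity over character trigrams, tokens are approximated by whitespace-separated words, quality scores are supplied by the caller, and a sample at or above `dedup_threshold` against an already kept sample is dropped. All function names are hypothetical.

```python
def trigrams(text):
    """Set of character trigrams after lowercasing and collapsing whitespace."""
    s = " ".join(text.lower().split())
    return {s[i:i + 3] for i in range(len(s) - 2)}


def trigram_similarity(a, b):
    """Jaccard similarity of the two trigram sets (assumed definition)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def filter_samples(samples, min_quality=0.3, dedup_threshold=0.85,
                   min_tokens=50, max_tokens=2000):
    """Keep texts that pass the quality, length and near-duplicate checks."""
    kept = []
    for text, quality in samples:
        n_tokens = len(text.split())  # crude whitespace token count
        if quality < min_quality:
            continue
        if not (min_tokens <= n_tokens <= max_tokens):
            continue
        # Drop the sample if it is too similar to one already kept.
        if any(trigram_similarity(text, k) >= dedup_threshold for k in kept):
            continue
        kept.append(text)
    return kept
```

Comparing each candidate against every kept sample is quadratic in the number of samples, which is acceptable for a sketch but would likely need an index for large collections.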
Fine-tuning defaults
| Parameter | Default | Description |
|---|---|---|
| Model | Qwen 2.5 7B | Base model for LoRA |
| LoRA rank | 16 | Adapter rank |
| Epochs | 3 | Training epochs |
| Learning rate | 2e-4 | Managed by Unsloth |
| Batch size | Auto | Based on available VRAM |