Skip to content

Configuration

HYDRA — .env file

# Required
GROQ_API_KEY=gsk_your_key_here

# SSH server
SSH_PORT=2222
SSH_HOST=0.0.0.0

# Logging
LOG_DIR=logs
LOG_LEVEL=INFO

# LLM settings
LLM_CACHE_SIZE=200
LLM_CACHE_TTL=300

# Anti-fingerprinting
AUTH_DELAY_MIN=0.5
AUTH_DELAY_MAX=2.0
SSH_BANNER=SSH-2.0-OpenSSH_8.9p1 Ubuntu-3ubuntu0.6

HYDRA — hydra_link.yaml

Configures the PDX ↔ HYDRA bridge:

hydra:
  install_path: "../hydra-honeypot"
  logs_dir: "logs"
  training_output_dir: "training_output"
  feedback_store: "data/feedback.yaml"

  auto_collect:
    enabled: true
    min_sessions: 10
    pdx_destination: "training_output/hydra_collected"

PDX — Scope policy

scope_policy.yaml defines what targets PDX is allowed to scan (for the Burp bridge):

scope:
  include:
    - "*.target.com"
    - "api.target.com"
  exclude:
    - "*.google.com"
    - "*.cloudflare.com"

policy:
  max_requests_per_second: 10
  respect_robots_txt: true
  follow_redirects: true
  max_depth: 3

Session classifier thresholds

Adjust in code or via config dict:

Parameter Default Description
ephemeral_threshold_s 5 Sessions shorter than this = bot_ephemeral
recon_threshold_s 20 Combined with cmd count for bot_recon
recon_max_cmds 3 Max commands to still qualify as recon-only
human_min_duration_s 20 Minimum duration for likely_human
human_min_non_disc 1 Min non-discovery commands for likely_human

PromptGuard thresholds

Parameter Default Description
warn_threshold 0.5 Score above this is logged as warning
block_threshold 0.8 Score above this is logged as block (no actual blocking)

Quality pipeline

Parameter Default Description
min_quality 0.3 Minimum quality score to keep
dedup_threshold 0.85 Trigram similarity threshold
min_tokens 50 Minimum output length
max_tokens 2,000 Maximum output length

Fine-tuning defaults

Parameter Default Description
Model Qwen 2.5 7B Base model for LoRA
LoRA rank 16 Adapter rank
Epochs 3 Training epochs
Learning rate 2e-4 Managed by Unsloth
Batch size Auto Based on available VRAM