Skip to main content

How Moonlet Monitors Solana Validator Health

Moonlet operates institutional-grade validator clusters on Solana. To protect delegators and maximize rewards, we combine conventional observability tools with an AI-driven monitoring layer that detects anomalies before they affect uptime or vote-credits.


1 Key Metrics We Track 24 / 7

CategoryExamplesWhy It Matters to You
ConsensusSlot vote rate, skip-rate, delinquent status, voted creditsDirectly determines epoch rewards.
Proof-of-HistoryTPU/RPC PoH drift, slot-leading lagEnsures our node stays synchronized with cluster time.
NetworkPacket loss, UDP RTT, gossip peer countHigh latency → missed leader slots and lower rewards.
SystemCPU %, RAM, NVMe I/O, GPU loadSolana is hardware-intensive; saturation can stall vote signing.
SecuritySSH login anomalies, unsigned kernel modules, validator identity mismatchPrevents key-compromise and slashable offenses.
Blockchain APIRPC health, rate-limit errors, block-commit timesGuarantees users get real-time data in Moonlet UI.

2 Our Monitoring Stack

LayerTooling
CollectionPrometheus exporters (systemd, Solana-Exporter, node-exporter), Loki log ingestion
AI Anomaly EngineCustom LSTM & Prophet models flag outliers in vote-credits, slot times, and resource trends (24-hour look-back)
AlertingPagerDuty, Slack, and on-call SMS for P1 events
DashboardsGrafana & internal Moonlet “Validator Pulse” panel—visible to ops and exposed read-only in the user dashboard

3 Automated Self-Healing

  1. Slot-Skip Spike > 3 % (5 min window)

    • AI engine triggers restart of the validator TPU and toggles leader-schedule voting.
  2. PoH drift > 150 ms

    • Clock resync with Chrony NTP; if unresolved, fail over to standby node in another region.
  3. RPC 5xx Error Burst

    • Traffic is routed to redundant RPC pool; faulty container is auto-replaced via Kubernetes.

Average MTTR (mean-time-to-recovery) last quarter: 58 seconds.


4 What Users See in the Moonlet Dashboard

IndicatorMeaning
Green “Healthy” badgeVote-credits ≥ 99 % of cluster average, skip-rate ≤ network median.
Yellow “Degraded” badgeMomentary issue detected; self-healing in progress. Rewards typically unaffected.
Red “Action” badgePersistent performance drop (> 2 epochs). We notify delegators via in-app banner and email.

Click Validator Details → Health to view live skip-rate, vote credits, commission, and historical uptime charts.


5 Compliance & Transparency

  • SOC 2 Type II & ISO 27001 controls govern our monitoring pipeline (log integrity, access control, incident response).

  • Weekly performance snapshots are published to a public JSON feed so delegators can audit our vote-credit history.

  • All validator binaries are built from reproducible sources; checksums are logged and signed for every upgrade.


6 FAQ

QuestionAnswer
Will I lose rewards if the validator restarts?No. Restarts are staggered outside leader slots; vote-credits stay ≥ 99 %.
How often do you upgrade the validator client?Within 24 h of an official Solana stable release—earlier if it’s a critical security patch.
Do you slash delegators?Solana currently has limited slashing; our architecture (geo-redundant, AI-monitored) is built to avoid slashable events if they’re enabled in the future.

Moonlet’s AI-backed observability ensures high vote-credit performance and near-zero downtime—so your SOL keeps compounding, epoch after epoch.