How does Metoro Guardian achieve a 100% capture rate without sampling?
Metoro Guardian uses eBPF hooks to read signals directly from the kernel. This lets it capture every syscall and network packet, giving a complete, unsampled view of all requests, errors, and timeouts in the Kubernetes environment with negligible overhead.
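To see why unsampled capture matters, consider how a conventional head-sampled tracer handles rare failures. The sketch below is a simulation with illustrative numbers (the error rate and sample rate are assumptions, not Metoro figures): at a 1% sample rate, a failure affecting 0.1% of requests is almost entirely invisible, while full capture records every occurrence.

```python
import random

def capture_counts(n_requests: int, error_rate: float, sample_rate: float, seed: int = 7):
    """Simulate how many errors a head-sampled tracer sees vs. full capture.

    error_rate and sample_rate are illustrative assumptions, not Metoro numbers.
    """
    rng = random.Random(seed)
    errors = sampled_errors = 0
    for _ in range(n_requests):
        if rng.random() < error_rate:
            errors += 1
            # Head-based sampling decides up front whether to keep a trace,
            # so rare errors are dropped in proportion to the sample rate.
            if rng.random() < sample_rate:
                sampled_errors += 1
    return errors, sampled_errors

total, seen = capture_counts(100_000, error_rate=0.001, sample_rate=0.01)
print(f"errors: {total}, seen by 1% sampler: {seen}, seen by full capture: {total}")
```

With roughly 100 errors in 100,000 requests, a 1% sampler is expected to record only about one of them; an unsampled pipeline records all of them.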
What is the 'secret' behind Metoro's accurate root cause analysis, and how does it differ from other 'AI-powered' tools?
Metoro's secret is bridging runtime telemetry with code-repository insight. Unlike tools that only see 'what happened' (telemetry) or rely on incomplete data, Metoro combines real-time metrics, logs, traces, and profiling with direct source-code access, recent changes, and deployment history. This 'telemetry + code' approach lets its AI pinpoint the exact broken line and generate targeted fixes, avoiding the sampling gaps, manual instrumentation, and unstructured data that limit other AI tools.
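The 'telemetry + code' idea can be sketched as correlating the file named in an error trace with recent commits that are actually running in the deployment. This is a simplified illustration, not Metoro's implementation, and all commit SHAs and file names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Commit:
    sha: str
    files: list[str]
    deployed: bool  # whether this commit is in the running deployment

def suspect_commits(error_file: str, recent_commits: list[Commit]) -> list[str]:
    """Return SHAs of deployed commits that touched the file named in the error trace."""
    return [c.sha for c in recent_commits if c.deployed and error_file in c.files]

# Hypothetical data: a trace points at handlers/checkout.py, and one recent
# deployed commit modified that file.
commits = [
    Commit("a1b2c3", ["handlers/checkout.py"], deployed=True),
    Commit("d4e5f6", ["README.md"], deployed=True),
    Commit("0f1e2d", ["handlers/checkout.py"], deployed=False),
]
print(suspect_commits("handlers/checkout.py", commits))  # → ['a1b2c3']
```

Filtering on both the file and the deployment state is what narrows 'what happened' (the trace) down to 'what changed' (the commit) — the intersection that telemetry-only tools cannot compute.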
Does Metoro Guardian replace an engineering team or make autonomous decisions like deploying fixes or rolling back changes?
No. Metoro Guardian is designed as an AI SRE teammate that augments, rather than replaces, engineering teams. It automatically finds issues, root-causes them, and raises PRs with suggested fixes, but every deployment, rollback, or applied fix requires explicit human review and approval, so engineers retain full control over critical actions.
How does Metoro handle existing alerts, and can it detect issues that don't have monitors configured?
Metoro Guardian can investigate your existing alerts when they fire, providing root cause analysis. Additionally, it autonomously monitors your cluster and detects anomalies, performance degradations, and errors even in areas where you don't have specific monitors or alerts configured, effectively uncovering blind spots before they escalate into incidents.
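Detecting issues without a configured monitor amounts to learning a baseline from the data itself and flagging deviations from it. The sketch below is a deliberately simple stand-in (a rolling z-score over latency samples), not Metoro's detection algorithm; all thresholds and data are illustrative.

```python
from statistics import mean, stdev

def detect_latency_anomaly(samples: list[float], window: int = 20, threshold: float = 3.0) -> bool:
    """Flag the latest latency sample if it deviates strongly from the recent baseline.

    No alert rule is configured anywhere: the baseline is learned from
    the trailing window of observations.
    """
    if len(samples) <= window:
        return False  # not enough history to establish a baseline
    baseline = samples[-window - 1:-1]
    mu, sigma = mean(baseline), stdev(baseline)
    if sigma == 0:
        return samples[-1] != mu
    return abs(samples[-1] - mu) / sigma > threshold

steady = [100.0 + (i % 5) for i in range(25)]    # ~100 ms with mild jitter
print(detect_latency_anomaly(steady))            # normal traffic: False
print(detect_latency_anomaly(steady + [450.0]))  # sudden degradation: True
```

A production system would track many such baselines per service and endpoint, but the principle is the same: the blind spot is covered because the threshold is derived from behavior, not configured by hand.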
What kind of context does Metoro use to improve its root cause analysis over time?
Metoro combines layered runtime and engineering context. This includes eBPF kernel signals (traces, logs, metrics, profiling), custom OTel metrics and traces, code repositories and deploy history (commits, diffs, deploys), and incident memory (past incidents, runbooks, Slack threads, tickets). By continuously learning from this unified context, its investigation quality improves with each new signal and incident.
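The 'incident memory' layer can be pictured as retrieval: a new incident's symptoms are matched against records distilled from past incidents, runbooks, and Slack threads. The sketch below uses simple keyword-set similarity as a hypothetical stand-in for whatever matching Metoro actually performs; the incident names and keywords are invented.

```python
def score(new_symptoms: set[str], past_incident: set[str]) -> float:
    """Jaccard similarity between two symptom keyword sets."""
    union = new_symptoms | past_incident
    return len(new_symptoms & past_incident) / len(union) if union else 0.0

def most_similar(new_symptoms: set[str], memory: dict[str, set[str]]) -> str:
    """Return the past incident whose recorded symptoms best match the new ones."""
    return max(memory, key=lambda name: score(new_symptoms, memory[name]))

# Hypothetical incident memory: keyword sets distilled from past postmortems.
memory = {
    "INC-101 payment timeouts": {"timeout", "payments", "db", "latency"},
    "INC-205 OOM in ingest":    {"oom", "ingest", "memory", "restart"},
}
print(most_similar({"latency", "timeout", "checkout"}, memory))
# → INC-101 payment timeouts
```

Each resolved incident adds another record to the memory, which is one concrete sense in which investigation quality can improve with every new signal.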