Site Reliability Engineer
- Investigated and resolved production issues across Kubernetes infrastructure, improving reliability through root-cause analysis.
- Built internal tooling to analyze cluster alert trends, exposing recurring failure patterns earlier.
- Developed an internal log-analysis workflow that uses ML-based pattern detection to surface repeat failures faster.
2025 - Present
TP-Link Systems