Now

What I'm working on and learning right now. Last updated: June 2026.

Building

I'm leading the architecture and evolution of DaVinci, Severstal's shared GenAI platform.

My current focus areas are:

designing multi-data-center inference, degraded operation, failover, and recovery
establishing model lifecycle, release gates, safe rollout, and rollback
improving inference observability, SLOs, load testing, and GPU capacity planning
evolving the AI gateway, model routing, quotas, and fallback policies
improving the performance and reliability of production LLM serving

The platform currently runs on 24 NVIDIA H200 GPUs and is expanding to 48 H200 and 8 H100 GPUs across two data centers.

Learning

I'm currently going deeper into:

distributed systems and multi-data-center architecture
LLM inference performance and GPU serving
Kubernetes networking, scheduling, and reliability
Go for infrastructure and platform services
performance engineering and advanced systems design

Writing

I'm writing about production LLM inference, AI infrastructure, distributed systems, and practical lessons from operating real systems.

Looking ahead

My long-term focus is Staff-level roles in AI infrastructure, LLM inference, and distributed systems, where I can combine hands-on engineering with cross-team technical direction.