About
Hi, I'm Andrey
I'm a Staff Software Engineer focused on production LLM inference, AI infrastructure, and distributed systems. I like turning complex systems into something boring, reliable, and easy to reason about.
At Severstal, I lead the architecture and evolution of DaVinci, a shared GenAI platform supporting enterprise AI products, coding agents, and agentic workflows. Its inference foundation currently runs on 24 NVIDIA H200 GPUs and is expanding to 48 H200 and 8 H100 GPUs across two data centers.
My work covers Kubernetes and vLLM model serving, traffic management, performance and reliability engineering, observability, model lifecycle, capacity planning, and multi-data-center failover and recovery.
Background
I've spent 15+ years building backend, cloud, distributed, and SaaS systems at startups and large companies across Germany and Russia.
Before working on AI infrastructure, I:
- led the modernization of a $3M+ ARR SaaS platform, improving the performance of critical backend paths by 2–10x and reaching 99.998% availability;
- designed and launched a content platform that grew to more than 20 million monthly active users;
- built systems in fintech, data privacy, payments, media, and B2B SaaS;
- worked across senior individual-contributor, technical-leadership, and CTO roles.
The common thread has been taking responsibility for systems that are important, technically complicated, and expected to work reliably in production.
What I care about
I'm especially interested in:
- production LLM serving and inference performance;
- reliable AI platforms and model-delivery systems;
- observability, evaluation, capacity planning, and failure modes;
- distributed systems that remain understandable and maintainable;
- developer tooling that improves delivery without hiding operational complexity.
I prefer measurable reliability, explicit trade-offs, and production evidence over impressive demos.
Writing and contact
This site is where I write about AI infrastructure, LLM systems, distributed systems, and software engineering.
You can also find me on GitHub and LinkedIn, download my résumé, or contact me by email.
Outside work, I spend time with my family, read, and explore new places whenever I can.