Andrey Krisanov

Production LLM inference, AI infrastructure, and distributed systems

About

Hi, I'm Andrey

I'm a Staff Software Engineer focused on production LLM inference, AI infrastructure, and distributed systems. I like turning complex systems into something boring, reliable, and easy to reason about.

At Severstal, I lead the architecture and evolution of DaVinci, a shared GenAI platform supporting enterprise AI products, coding agents, and agentic workflows. Its inference foundation currently runs on 24 NVIDIA H200 GPUs and is expanding to 48 H200 and 8 H100 GPUs across two data centers.

My work covers model serving on Kubernetes with vLLM, traffic management, performance and reliability engineering, observability, model lifecycle management, capacity planning, and multi-data-center failover and recovery.

Background

I've spent 15+ years building backend, cloud, distributed, and SaaS systems at startups and large companies across Germany and Russia.

Before working on AI infrastructure, I:

  • led the modernization of a $3M+ ARR SaaS platform, improving critical backend paths by 2–10x and reaching 99.998% availability;
  • designed and launched a content platform that grew to more than 20 million monthly active users;
  • built systems in fintech, data privacy, payments, media, and B2B SaaS;
  • worked across senior individual-contributor, technical-leadership, and CTO roles.

The common thread has been taking responsibility for systems that are important, technically complicated, and expected to work reliably in production.

What I care about

I'm especially interested in:

  • production LLM serving and inference performance
  • reliable AI platforms and model-delivery systems
  • observability, evaluation, capacity planning, and failure modes
  • distributed systems that remain understandable and maintainable
  • developer tooling that improves delivery without hiding operational complexity

I prefer measurable reliability, explicit trade-offs, and production evidence over impressive demos.

Writing and contact

This site is where I write about AI infrastructure, LLM systems, distributed systems, and software engineering.

You can also find me on GitHub and LinkedIn, download my résumé, or contact me by email.

Outside work, I spend time with my family, read books, and explore new places whenever I can.