vLLM Metrics in Production

January 28, 2026

A hands-on guide to vLLM monitoring: the key Prometheus metrics (TTFT, TPOT, queueing, KV cache, swapping), Grafana panels, and alert rules that help you debug latency and plan capacity.

Choosing Apache Kafka For A New Project – A Questionnaire

August 29, 2023

You're starting a new project with Apache Kafka. Before setting up broker parameters and writing producers and consumers, what questions should you ask yourself? To ensure a smooth start, I have prepared the following checklist/questionnaire.

My "It's not DNS" story

August 12, 2023

A story about a DNS resolver, Linux VMs, an experienced infrastructure team, and me troubleshooting an incident that happened on a Sunday morning.