The Severities We Refuse to Name
Why every severity scale is shorter than the thing it measures
Every severity scale is a map of what an organisation is willing to admit.
The top of the ladder is well-lit and well-trodden. The bottom, less so. Walk down it slowly and you notice the lighting getting worse.
SEV-0 is the incident that ends one company and starts another in its place - the same logo, the same office, the same payroll, but a different company, the way a building is a different building after a fire even if the bricks are the same. You learn about a SEV-0 the way you learn about anything serious in this industry: late, indirectly, and from someone who would rather not be telling you. The principal engineer at the bar who says "we don't deploy on Fridays anymore" and does not explain why. The staff engineer who flinches, fractionally, at the mention of a particular subsystem. The runbook with a section so over-engineered it could only have been written by someone who watched the previous version fail. SEV-0 is the inheritance nobody hands you. The architecture remembers. The taxonomy does not.
SEV-1 is the one everyone understands. The site is down. The money has stopped. Your company is mentioned by name on a news site. Someone senior is awake who should not be awake, and someone junior is typing with the terrible precision of a person who knows their commit history will be read aloud in a room next week. SEV-1 is loud, expensive, and - because of the noise and the cost - honest. You cannot hide a SEV-1. The category works because the incident refuses to be ignored.
SEV-2 is the incident that does not sleep, and arranges for you not to either. It is too big to ignore and too small to escalate to someone important. It is real enough that the channel stays open through the night. So you hold the line for four hours, sometimes eight, and you watch the clock the whole time, because the longer it runs the more likely it becomes that someone important will have to be woken anyway, and at that point the incident is no longer a SEV-2. SEV-2 is the severity that is partly defined by how quickly you can make it stop being one. It is where you learn that incident management is a clock-management problem.
SEV-3 is the workhorse. It is where most of incident management actually lives - the elevated error rates, the latency creep, the integration partner who has chosen today to have feelings about their API contract. It is also, by volume and by neglect, the severity most likely to be ignored. Not rejected. Not triaged and deprioritised. Ignored. Left in the channel like a glass on a counter that someone will get to eventually. And then four hours pass, and the glass is still there, and the customers who were patient at hour one are no longer patient at hour four, and the SEV-3 is no longer a SEV-3. It has become a SEV-2 by sheer laziness - not because the incident got worse, but because nobody made it better while making it better was still cheap. If you want to know whether an organisation's incident management is real or performative, watch how it handles a SEV-3 on a Friday afternoon. The answer is usually: it doesn't.
SEV-4 is the severity that half the industry claims to have and nobody actually runs. It is the incident too small to mobilise for and too real to dismiss - the queue that backed up for six minutes, the endpoint that five-hundred'd for a fraction of a percent of traffic, the alert that fired and resolved before the channel filled. In theory this is where the organisation learns. In practice it is where the organisation files and forgets, because the cost of taking a SEV-4 seriously is higher than the cost of shipping something else instead. So the category quietly empties. And in some places - I worked inside one - it never existed to begin with. The scale goes one, two, three, and then straight to the end. A house with no ground floor. Everyone who worked there understood why without ever quite saying it.
SEV-5 is the category we do not have, because having it would mean admitting what it contains. It is the documentation that went stale in 2023 and is still being cited in 2026. It is the monitoring nobody trusts, because the thresholds were set by someone who left three reorgs ago. It is the single engineer who understands the billing pipeline and is currently interviewing at a competitor. It is the runbook that has been wrong for fourteen months, and the team that has learned to work around the wrongness, and the new hire who will inherit the workaround as the thing itself. None of this will page you. All of it will kill you. The reason we do not have a severity for slow erosion is that a severity implies a response, and the response to slow erosion is structural, and structural responses require someone willing to say aloud that the house is on fire even though nothing is visibly burning.
The ladder does not end at SEV-3. It does not end at SEV-4. It ends somewhere below, in a category we have decided not to name.
It will wait there, whether we name it or not.

Interesting, we were just talking with @adhorn about severity and categorisation of incidents in general - if severity is even the "right" measure.
Also - awesome graphics across posts!!