Friday, 27 September 2024

High availability (HA) and the 9's

High availability (HA) is what everyone wants nowadays - the higher, the better.

The target availability depends on the business need: it is different for, e.g., a system that controls a roadside billboard than for a system controlling a nuclear power plant.




Systems require regular maintenance, so availability is affected by both scheduled and unscheduled downtime. For example, a service may be down because of a regular operating system patch or a software deployment, but it could also become unavailable when a non-redundant power supply fails in a single-point-of-failure component.

The term "availability" is not the same as "uptime", but it is trivial to calculate from it:
  • Each year has about 365.25 days, which is 8,766 hours.
  • Let us say that a service had 8 hours of downtime over a year.
  • Then the uptime is 8,766 – 8 = 8,758 hours.
  • And the availability (in %) in this case is calculated as 8,758 hours / 8,766 hours * 100 % = 99.9087 % ≈ 99.91 %, i.e. three 9's.
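
The same arithmetic can be written as a small Python sketch. The function name availability_percent and the constant HOURS_PER_YEAR are only illustrative names chosen here; the 8,766-hour year is the assumption from the list above.

```python
# Average year length used above: 365.25 days * 24 hours = 8,766 hours.
HOURS_PER_YEAR = 365.25 * 24


def availability_percent(downtime_hours, period_hours=HOURS_PER_YEAR):
    """Return availability in percent for a given downtime within a period."""
    uptime_hours = period_hours - downtime_hours
    return uptime_hours / period_hours * 100


# 8 hours of downtime over one year, as in the example above.
print(f"{availability_percent(8):.4f} %")  # prints "99.9087 %"
```

Passing a different period_hours (for example 30 * 24 for a 30-day month) gives the availability over that window instead of a year.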

What are "three 9's"?

System availability is often measured by the "number of nines", that is, how many leading "9" digits the availability percentage has.
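
One possible way to count the nines in code is sketched below. The function name count_nines is hypothetical, and the percentage is passed as a string so that Decimal can avoid the rounding problems a float-based logarithm would have right at the 99.9 % or 99.999 % boundaries.

```python
from decimal import Decimal


def count_nines(availability_percent: str) -> int:
    """Count the leading nines of an availability percentage such as "99.9087"."""
    unavailability = 1 - Decimal(availability_percent) / 100
    if unavailability <= 0:
        raise ValueError("100 % availability has no finite number of nines")
    nines = 0
    # Each additional nine cuts the tolerated unavailability by a factor of 10.
    while unavailability <= Decimal(10) ** -(nines + 1):
        nines += 1
    return nines


print(count_nines("99.9087"))  # prints 3
print(count_nines("99.999"))   # prints 5
```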

Here are some approximate calculations for availabilities for different downtimes:
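
As a rough sketch, the yearly downtime budget for a given availability can be computed as follows. The function name allowed_downtime_hours is an illustrative choice, and the 8,766-hour year from the earlier calculation is assumed.

```python
# Average year length used earlier: 365.25 days * 24 hours = 8,766 hours.
HOURS_PER_YEAR = 365.25 * 24


def allowed_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Return the downtime (in hours) that a given availability leaves per period."""
    return period_hours * (1 - availability_percent / 100)


# One to five nines.
for pct in (90.0, 99.0, 99.9, 99.99, 99.999):
    hours = allowed_downtime_hours(pct)
    print(f"{pct:>7} %  {hours:10.2f} h  ({hours * 60:10.1f} min)")
```

With these assumptions, three 9's leave about 8.77 hours of downtime per year, while five 9's leave only about 5.26 minutes.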



When a system is designed for a customer and they are asked what availability they would like to have, it is only natural that they say 99.999 % or even better.

It is only when they hear the price of the desired uptime that they remember their ROI and start to consider other options.

The problem with those costs is that they do not grow linearly as more 9's are added; the growth is closer to exponential.

It is not an easy task to calculate those expenses, as many factors influence the costs.

Let me try to mention some:
  • Architecture (both system and software)
  • Team (Operations / DevOps / Developers)
  • Monitoring
  • Maintenance
  • Testing & Fire drills
  • Processes implemented
  • Documentation



For a discussion regarding this matter, please join/visit the "Mission Critical Systems Forum" group, this post.



