SLA, SLO, SLI defined (the short version)

Over the years I have worked with many companies, either as an employee or as a consultant. During that time, I have observed that many of the organizations and teams I work with are not as familiar as they should be with these important aspects of designing, building, deploying, and supporting production systems. In fact, I have often seen organizations that have not defined any of them at all, and that therefore have no clear expectations of what counts as acceptable for uptime, performance, continuity, time to resolve issues, data redundancy, and so on, or of how to hold the responsible parties accountable. As a result, the technical stakeholders have no idea what is expected of them or what the experience needs to be for the users of the application.

Here is a quick outline of Service Level Agreements, Objectives, and Indicators.

  1. Service Level Agreements
    1. An agreement between provider and client about measurable metrics such as uptime and responsiveness, and about each party's responsibilities. For example, an SLA may promise that teams will resolve reported issues with Product X within 24 hours, but that same SLA doesn’t spell out what happens if the client takes 24 hours to send the answers or screenshots your team needs to diagnose the problem.
  2. Service Level Objectives
    1. If the SLA is the formal agreement between you and your customer, SLOs are the individual promises you’re making to that customer within that SLA, each tied to a specific metric such as uptime or response time. SLOs set customer expectations and tell IT and DevOps teams which goals they need to hit and measure themselves against.
  3. Service Level Indicators
    1. An SLI measures compliance with an SLO. For example, if the SLO is 99.95% uptime, the SLI is the uptime percentage actually measured over the period, which is compared against that target to confirm the SLO was met.
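
To make the relationship between an SLO and its SLI concrete, here is a minimal Python sketch. It assumes uptime is tracked in minutes over a fixed measurement window. The names `Slo`, `availability_sli`, `slo_met`, and `allowed_downtime_minutes` are hypothetical and chosen only for illustration; real monitoring tools define and collect these measurements in their own ways.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slo:
    """A single objective: a named metric and its target (hypothetical type)."""
    name: str
    target_percent: float  # for example 99.95


def availability_sli(total_minutes: float, downtime_minutes: float) -> float:
    """Return the measured uptime as a percentage of the measurement window."""
    if total_minutes <= 0:
        raise ValueError("total_minutes must be positive")
    if not 0 <= downtime_minutes <= total_minutes:
        raise ValueError("downtime_minutes must be between 0 and total_minutes")
    return 100.0 * (total_minutes - downtime_minutes) / total_minutes


def slo_met(sli_percent: float, slo: Slo) -> bool:
    """The SLO is met when the measured SLI reaches or exceeds the target."""
    return sli_percent >= slo.target_percent


def allowed_downtime_minutes(total_minutes: float, slo: Slo) -> float:
    """Downtime the window can absorb before the SLO is missed."""
    return total_minutes * (1.0 - slo.target_percent / 100.0)


if __name__ == "__main__":
    window = 30 * 24 * 60  # a 30-day window, in minutes (assumed window length)
    uptime_slo = Slo(name="uptime", target_percent=99.95)

    for downtime in (15, 30):  # illustrative downtime figures, not real data
        sli = availability_sli(window, downtime)
        status = "met" if slo_met(sli, uptime_slo) else "missed"
        print(f"{downtime} min down: SLI {sli:.3f}% -> SLO {status}")

    budget = allowed_downtime_minutes(window, uptime_slo)
    print(f"Allowed downtime: {budget:.1f} minutes")
```

Over a 30-day window (43,200 minutes), a 99.95% uptime target leaves room for about 21.6 minutes of downtime. In this sketch, 15 minutes of downtime stays within the objective and 30 minutes does not. The SLA would then describe what happens between provider and client when the objective is missed.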