What Is Site Reliability Engineering?

Site reliability engineering (SRE) is an increasingly popular approach to app and website delivery that started at Google and has since spread to many large technology‑driven enterprises. Although definitions of SRE vary widely, from part DevOps to part networking with a dash of customer experience thrown in, most SRE teams are responsible for the following:

  • Site maintenance – The SRE team owns the back‑end infrastructure and the customer experience it provides.
  • Site infrastructure engineering – Sites with successful customer experiences are highly automated and robust, largely due to the work of the SRE team.
  • Relationships – In many cases, SRE teams are also charged with empowering development teams to operate faster and collaborate more efficiently, usually by defining guiding principles and providing self‑service tools.
  • Performance tracking – SRE teams identify a company’s critical systems and establish ways to measure how effective their work on those systems is, to ensure there’s a satisfactory return on investment (ROI). The areas measured can include website traffic, customer data, and website durability, among others.
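
As an illustration of the performance tracking described above, here is a minimal Python sketch of one way a team might measure a critical system: it computes availability as the share of successful requests in a measurement window and compares it to a target. The ServiceWindow type, the "checkout" service, the request counts, and the 99.9% target are all hypothetical values chosen for the example, not figures from any real deployment.

```python
from dataclasses import dataclass


@dataclass
class ServiceWindow:
    """Request counts for one service over one measurement window (hypothetical shape)."""
    name: str
    total_requests: int
    failed_requests: int


def availability(window: ServiceWindow) -> float:
    """Return the fraction of requests that succeeded; 1.0 if there was no traffic."""
    if window.total_requests == 0:
        return 1.0
    return (window.total_requests - window.failed_requests) / window.total_requests


def meets_target(window: ServiceWindow, target: float) -> bool:
    """Return True when measured availability is at or above the target fraction."""
    return availability(window) >= target


# Hypothetical numbers, used only to show the calculation.
checkout = ServiceWindow(name="checkout", total_requests=200_000, failed_requests=150)
print(f"{checkout.name}: {availability(checkout):.4%} available")
print("meets 99.9% target:", meets_target(checkout, 0.999))
```

In practice the counts would come from whatever monitoring the team already runs, and the target would be agreed with the business rather than hard-coded.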

The SRE team is usually part of a larger development or customer experience (CX) team. In either case, a primary objective of the “parent” team is to maintain and improve customer experience on their organization’s website in the service of achieving business goals.

How Can NGINX Help?

F5 NGINX provides a suite of products that together form the core of what you need to create apps and APIs with performance, reliability, security, and scale.

To explore SRE in more depth, watch this video from NGINX Conf 2019 in which leaders from top companies discuss how to determine whether your organization needs an SRE team and – if you already have one – how best to scale it and establish key performance indicators (KPIs).

Resources