Site Reliability Engineering

Fundamental transformation of IT service delivery and management: Technical Application Management (TAM) becomes a software discipline.

Site Reliability Engineering (SRE) marks a fundamental transformation in IT service delivery and management. Traditional Technical Application Management (TAM) is evolving into a software discipline. The TAM professional of the future is not only a system administrator but also a full-fledged software engineer, familiar with a wide range of software development tools and practices.

Site Reliability Engineers (SREs) no longer only administer systems; they are software engineers, well versed in development tools and practices. This shift makes the development environment a pivotal component of the C2 Platform methodology.

An integral part of SRE is bringing the Integrated Development Environment (IDE) into the management process. For the C2 Platform, the IDE of choice is Visual Studio Code. SRE teams use Continuous Integration (CI) and Continuous Deployment (CD) pipelines, together with rigorous unit testing, to uphold system quality and reliability.
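As a minimal sketch of what unit testing can look like for operations code, the Python example below tests a small decision function of the kind an SRE team might run in a CI pipeline. The function names `disk_usage_ratio` and `needs_cleanup`, the 80% threshold, and the cleanup scenario are assumptions made for illustration; they are not part of the C2 Platform.

```python
import unittest


def disk_usage_ratio(used_bytes: int, total_bytes: int) -> float:
    """Return the used fraction of a volume as a value between 0.0 and 1.0."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if not 0 <= used_bytes <= total_bytes:
        raise ValueError("used_bytes must be between 0 and total_bytes")
    return used_bytes / total_bytes


def needs_cleanup(used_bytes: int, total_bytes: int, threshold: float = 0.8) -> bool:
    """Decide whether a (hypothetical) cleanup job should run for a volume.

    The default threshold of 0.8 is an illustrative assumption.
    """
    return disk_usage_ratio(used_bytes, total_bytes) >= threshold


class NeedsCleanupTest(unittest.TestCase):
    """Unit tests that a CI pipeline could run on every commit."""

    def test_below_threshold(self):
        self.assertFalse(needs_cleanup(50, 100))

    def test_at_threshold(self):
        self.assertTrue(needs_cleanup(80, 100))

    def test_invalid_total_is_rejected(self):
        with self.assertRaises(ValueError):
            needs_cleanup(1, 0)
```

Saved under a hypothetical filename such as `test_cleanup.py`, the tests can be run with `python -m unittest test_cleanup.py`. Keeping the decision logic in a pure function, separate from any code that actually reads disk usage or deletes files, is what makes it straightforward to test without touching a real system.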

Google, a pioneer in SRE, suggests that an ideal SRE team consists half of people with a software background and half of people with a systems engineering background. This mix enables the team to address the complexity of modern IT systems using software engineering principles. All team members share a passion for programming and strive for automated solutions.

There is a strong relationship between SRE and DevOps. While DevOps focuses on collaboration between development and operations, SRE takes it a step further. SRE team members are responsible for designing and implementing software-driven solutions that enhance the reliability, scalability, and efficiency of IT systems. A frequently cited statement in the DevOps world is “automation is the key to DevOps,” and this also holds true for SRE.

The Atlassian article “Love DevOps? Wait until you meet SRE” provides in-depth insights into the concepts and benefits of SRE and how it relates to DevOps.

“In Conversation with Ben Treynor” is an engaging discussion with Ben Treynor, one of the founders of Site Reliability Engineering at Google. He discusses the origin of SRE and its impact on the IT industry.

Fundamentally, it’s what happens when you ask a software engineer to design an operations function.
"In Conversation with Ben Treynor" 

Automation is the key to DevOps … and SRE
Love DevOps? Wait until you meet SRE | Atlassian 



Last modified September 21, 2024: scripts commits_blame.py RWS-272 (78a9266)