Expert Advice on How to Make Yourself the Best Site Reliability Engineer Candidate

Experts Share Their Advice

According to payscale.com, the average salary of a site reliability engineer in August 2020 is around $117,000. That’s an attractive salary, but not quite as attractive as the more than $156,000 paid to the very best SREs. And on top of this, there could be bonuses, profit sharing, retirement benefits, and other non-cash benefits such as healthcare.

To be hired into the most lucrative SRE roles, you need to make sure that you’re the stand-out candidate during the hiring process. Here are a few tips from Allee Clark (Site Reliability Engineer), Caleb Hurd (Director of Site Reliability Engineering), and Michael Bidak (Senior Executive Recruiter here at Kofi Group).

What Do Employers Want from Their SREs?

SREs are critical members of staff in an organization. They’re responsible for the performance and efficiency of systems and processes. They improve productivity and reliability and reduce error by automating tasks. They bridge the gap between development and operations.

Great SREs have specific skillsets. You’ll have a combination of soft skills, technical skills, and the personality to match. For example:

  • You’ll be a tenacious problem solver with an eye for detail, but also understand the big picture and how the work you do contributes to the bigger effort
  • You’ll be a keen learner, inquisitive, and a creative solution finder
  • You’ll bring expert knowledge to the table, whatever your background
  • You may have experience of programming or developing systems, or be a systems administrator with a flair for finding gaps left by incomplete code
  • You’ll be analytical, pragmatic, and a confident communicator who can influence people’s ways of thinking and doing
  • Sometimes you’ll need to ‘sell’ your solution to employees, their managers, and perhaps to the C-suite
  • You’ll also need to be strong enough to say no when others would find it easier to fall into line and agree a solution that you believe is not the right one

Hiring organizations want passionate, intelligent, curious SREs. As you can imagine, finding people with the right combination of skills is difficult. To be the stand-out candidate, you must demonstrate that you are all of the above, that you collaborate well with colleagues, and that you can be trusted with large-scale decision-making.

Allee says:

(There are) three lenses to consider in SRE: the lens of the industry, the need for the role, and what SRE means. Always provide the best artifacts for what the company needs. A large area of SRE experience may only be obtained through hands-on experience. Include details of hands-on experience scaling out systems to improve reliability, and how you effectively supported that system across your domain.

Caleb Hurd believes that SRE is a transient field that may go away in the next 5 years. “The reason there’s so much demand is that the industry is just now getting to a state with tech like Kubernetes that empowers developers. Developers want to be able to control their own destiny. But because the tooling is so bad, you need a group of people to focus on these technologies.”

Consequently, when he is hiring, he looks for someone who has soft skills as well as hard skills. “They have to be sharp, able to pick up new tech quickly, strong development skills, adaptability,” he says. “In addition to that, they need a strange balance between being collaborative, humble, and also not being a doormat… having the backbone to address how teams and applications are run. It’s difficult to find engineers who have that consultative personality. If I had to choose, I would rather train hard skills over soft skills.”

Tips to Stand Out as an SRE Candidate

Michael Bidak has been filling Site Reliability Engineering positions for the past two years. His experience in working with growing startups and engineers shows that there are five key areas in which SREs must show they excel:

  1. Communication
  2. Learning/Curiosity
  3. Measuring
  4. Architecture and system design
  5. Fitting detail into the big picture

1.    Demonstrate Your Communication Skills

Michael says:

SREs are typically the first line of defense for when ‘shit hits the fan’, and having strong communication and listening skills is key. Engineers are known to be incredible problem solvers, which is why Ben Treynor puts them on operational issues over at Google.

Being able to solve a problem is great, but being able to effectively communicate the thought process behind your solution is where the magic happens.

Allee agrees, telling us that:

Great SREs communicate risk and prioritize effectively. Some risks are worse than others, and there is typically a finite amount of engineering resources to address them. You need a process to communicate the relative importance of risks and to provide guidance on which risks should be addressed first, through application reliability reviews and carrying out the work. An example of communicating risk externally to users is Laura Nolan’s post on a Slack outage.

Caleb’s tips include reading the book ‘Crucial Conversations’. “It will help you learn how to have hard conversations…like going into a system and telling the architect that things need to be changed, without making them feel attacked.”

2.    Show That You Are a Learner

As an SRE, your work is never done. When you have automated the simple things, you’ll move on to the next level of complexity in the system or process. You’ll need to understand how things work to drive efficiency and make improvements. To achieve this, you’ll be a keen learner. Michael puts it succinctly by advising candidates to:

Be a sponge! Touch everything and practice coding often.

Caleb says, “There’s an SRE manual and an SRE guide online for free by Google. I would suggest memorizing it from front to back.”

3.    Measure Everything

“Measure everything!” says Michael. “If it isn’t being measured, how do you know it is working?”
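To make that concrete, here is a minimal sketch in Python of one common thing to measure: availability as the share of successful requests, and how much of an error budget remains against a target. The function names, the request counts, and the 99.9% target are illustrative assumptions, not figures from Michael, Allee, or Caleb.

```python
def availability(good_events: int, total_events: int) -> float:
    """Fraction of events that succeeded. Treated as 1.0 when nothing was served."""
    if total_events == 0:
        return 1.0
    return good_events / total_events


def error_budget_remaining(good_events: int, total_events: int, slo_target: float) -> float:
    """Share of the error budget still unspent.

    1.0 means no failures yet, 0.0 means the budget is exactly spent,
    and a negative value means the target has been missed.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events == 0:
        return 1.0
    allowed_failures = (1 - slo_target) * total_events
    actual_failures = total_events - good_events
    return 1 - actual_failures / allowed_failures


# Hypothetical month: 1,000,000 requests, 500 of which failed, against a 99.9% target.
print(round(availability(999_500, 1_000_000), 6))
print(round(error_budget_remaining(999_500, 1_000_000, 0.999), 3))
```

In an interview, numbers like these give you a way to say how well a system was working rather than simply that it was.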

Allee believes that statistical analysis skills will become increasingly important to SREs:

“For the future, statistics for SREs will become a common skill,” he says. “Possessing the skills to perform statistical analysis on running systems and predict capacity will not only feel technically rewarding, but will also help the company’s cost management plan as systems continue to grow.”

(Here’s a great resource recommended by Allee: ‘Using statistics of the extremes for software reliability analysis of safety critical systems’.)
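As a much simpler starting point than the extreme-value methods in that resource, capacity prediction can be sketched as a straight-line trend fitted to daily usage. The Python below is only an illustration under that linear-growth assumption. The function names and the sample figures are made up for the example, and real systems often grow in ways a straight line will not capture.

```python
from typing import Optional, Sequence, Tuple


def fit_line(values: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares line through (0, values[0]), (1, values[1]), ...

    Returns (slope, intercept). Needs at least two observations.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def days_until_full(daily_usage: Sequence[float], capacity: float) -> Optional[float]:
    """Days after the last observation until the fitted trend reaches capacity.

    Returns None when usage is flat or shrinking, since the trend never gets there.
    """
    slope, intercept = fit_line(daily_usage)
    if slope <= 0:
        return None
    day_full = (capacity - intercept) / slope
    last_day = len(daily_usage) - 1
    return max(0.0, day_full - last_day)


# Hypothetical disk usage in GB over five days, on a 200 GB volume.
print(days_until_full([100, 110, 120, 130, 140], capacity=200))
```

Even a rough forecast like this turns "we might run out of space" into a date that can be planned and budgeted for.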

4.    Increase Your Expertise in Architecture and Systems Design

“Know architecture and system design like the back of your hand,” Michael says. “I have never worked a job that hasn’t had a systems architecture and design question. Having these skills polished and ready to be battle-tested may be the make or break for your dream role.”

“One skill I look for is understanding of operating systems, networking, cloud-based infrastructure,” says Caleb. “Your typical developer will know the basics after a few years, but I want someone who has experienced an outage and brought a system back online. I want them to have gone through that baptism via trial by fire. They should also be able to work their way down the stack.”

“If they’ve experienced outages or designed systems, that’s a big help.”

You should prepare answers to interview questions that demonstrate your knowledge and experience in this area. In your answer, provide relatable context and the benefits that your work has led to.

5.    Show That You Are Detail-Oriented with a Big Picture Focus

Michael believes that the best SRE candidates show they are detail-oriented and use metrics to describe their experience.

“Hiring managers want to know that you are a strong engineer and that you understand how your work affects the business and at what scale,” he says. “An example answer in an interview might be: ‘We built an automation tool (written in Golang) to help migrate AWS regions when service availability falls below 70%. This saved our company over 100 engineering hours in the first month of being implemented and brought another nine into our uptime.’”
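If you quote "nines" in an answer like this, it helps to know what they mean in minutes. The short Python sketch below converts an availability figure into permitted downtime per year and into a count of nines. It assumes a 365-day year, and the function names are illustrative rather than taken from any particular tool.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def downtime_minutes_per_year(availability: float) -> float:
    """Minutes of downtime per year permitted at the given availability (0 to 1)."""
    if not 0 <= availability <= 1:
        raise ValueError("availability must be between 0 and 1")
    return (1 - availability) * MINUTES_PER_YEAR


def nines(availability: float) -> float:
    """Number of nines, e.g. roughly 3 for 99.9%. Undefined at exactly 100%."""
    if not 0 <= availability < 1:
        raise ValueError("availability must be at least 0 and below 1")
    return -math.log10(1 - availability)


# Going from three nines to four cuts permitted downtime by a factor of ten.
for target in (0.999, 0.9999):
    print(target, round(nines(target), 2), round(downtime_minutes_per_year(target), 2))
```

Under these assumptions, three nines allows roughly 526 minutes of downtime a year and four nines roughly 53, which is the scale of improvement "another nine" implies.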

Caleb says that SREs would be wise to refocus their attention on their resumes. “I really like resumes that humanize people. I’m looking for individuals, not just a code monkey. I’m not saying people should be quirky, but they should find a way to tell the reader something about themselves.”

“At the top of the resume (the part the people actually read), I would talk about a specific project that you were part of that had a significant business impact. I think every hiring manager wants someone who will make an impact, not just take orders.”

Of interviews, Caleb says, “When you interview someone, it’s the equivalent of going on 3-4 dates and getting married. I can go on your LinkedIn and you could go to our company website and we can both find basic details. But that’s no different from sitting down with a person on a first date and noticing that they’re attractive. I’m more interested in why not…so tell me who you really are as a person.”

So, there they are: our five top tips to help you become the star SRE candidate. To benefit from a confidential conversation about your career and learn about some of the best opportunities for SREs, contact Kofi Group today.