Site Reliability Engineer

--Cary, NC, USA--

Job Description

Overview

We are looking for a candidate to join a multi-functional SRE team. You should have cloud engineering experience and be able to act as the SME on operations automation and monitoring: identifying toil in the team's existing systems and processes, then recommending and implementing automated solutions that reduce toil and improve the team's efficiency and effectiveness.

Essential Skills and Experience (MUST HAVE)

  • Hands-on experience defining and creating CUJs, SLOs, SLIs, and error budgets based on NFRs
  • Strong knowledge of IaC (infrastructure as code): Terraform, GitHub, Docker images
  • Strong scripting and automation skills: Bash, PowerShell, Python, and Ansible
  • Good knowledge of container platforms such as Kubernetes
  • Design and implementation of automated workflows
  • Experience reducing toil in an SDLC or IT operations environment
  • Good understanding of SCM and code quality tools: Git, GitHub, SonarQube
  • Fair understanding of ITSM processes
  • Proactive and analytical mindset

Familiarity with the following (Nice to Have)

  • Fair understanding of build and release tools such as Maven, Ant, Gradle, Puppet, Jenkins, TeamCity, and uDeploy
  • Knowledge of microservices
  • Any programming language, such as Java or C#
  • Understanding of CI/CD pipelines
  • Understanding of the architecture and implementation of three-tier web applications