Sr. Site Reliability Engineer Job Description

Our company is looking for a Sr. Site Reliability Engineer to join our team.

Responsibilities:

  • Develop software to deploy, configure, and monitor cloud-hosted applications;
  • Help our software engineering team build accurate monitoring and metrics into applications before they go to production;
  • Participate in the on-call rotation;
  • Drive and improve the whole lifecycle of operational readiness – from inception and design, through deployment, operation and refinement;
  • Maintain up-to-date documentation on deployments, processes and standard operating procedures/run-books;
  • Investigate failures, identify root causes, and implement remedies that drive continued improvement;
  • Be on-call as needed to ensure uptime, customer SLAs, and requirements around responsiveness are met on an ongoing basis;
  • Develop automation and tools to improve engineering lead time from commit to production;
  • Diagnose and solve problems in our highly available production systems, and build solutions and automation to prevent future issues;
  • Build and improve monitoring for our systems and infrastructure;
  • Design, develop, and implement software that improves the stability, scalability, availability, and latency of our enterprise SaaS products.

Requirements:

  • Experience deploying, operating and debugging server software on Linux at scale;
  • Experience with virtualized environments (AWS experience a plus);
  • Strong computer science fundamentals: data structures, algorithms, programming languages, distributed systems, and information retrieval;
  • Experience with functional or imperative programming languages, e.g., PHP, Python, Ruby, Go, C, or Java (used without frameworks);
  • Bachelor’s degree in Computer Science, Engineering or related field, or equivalent training, fellowship, or work experience;
  • Experience with deployment automation and configuration management tools, especially Chef;
  • Professional experience in web application engineering, working in a team environment.