Site Reliability Engineer (SRE) Job Description Template
Our company is looking for a Site Reliability Engineer (SRE) to join our team.
Responsibilities:
- You partner with Engineering stakeholders to design and deliver a reliable, scalable, secure, and performant platform;
- You stay current on technical trends in order to suggest innovative tools and approaches to interesting problems;
- You share your expertise with the entire Engineering organization;
- You participate in a 24/7 on-call rotation. And yes, we use PagerDuty to manage our on-call schedules.
Requirements:
- A passion for problem solving with strong analytical capabilities;
- Comfort with Linux/Unix command line;
- Exceptional and demonstrable web development experience;
- Intermediate-level knowledge of at least one of Python, Ruby, Java, C++, C#, or Go;
- Experience with relational and NoSQL databases;
- 3+ years of AWS administration;
- Experience with release automation, continuous integration/delivery systems, and relevant tools (e.g. Jenkins, CircleCI, Travis CI, Buildkite);
- Excellent knowledge of a scripting language such as Ruby, Python, or Go;
- Experience with Docker in a production environment, including container orchestration (e.g. Nomad, Mesos, Kubernetes);
- Experience with AWS-based, cloud-native infrastructure and managed services such as Redshift, EC2, S3 and other storage options, VPCs, and IAM;
- Experience working on cloud-based infrastructure (e.g. AWS, GCP, Azure);
- Experience with infrastructure as code (Terraform or CloudFormation);
- You are empathetic: You take others’ opinions into account and clearly communicate your thoughts to reach technical solutions quickly;
- Knowledge of configuration management systems like Ansible, Chef or Puppet;
- You consider it important to understand and appreciate your customers, and enjoy seeing your work improve the work of others.