Senior Site Reliability / DevOps Engineer Job Description Template
Our company is looking for a Senior Site Reliability / DevOps Engineer to join our team.
Responsibilities:
- Support the business infrastructure to ensure service availability, occasionally outside business hours;
- Monitor database performance;
- Research and adapt to emerging technologies;
- Maintain continuous integration environments;
- Help direct the efforts of engineering team members;
- Identify solutions to scalability challenges;
- Perform root cause analysis of application deployment and performance problems;
- Monitor software applications for uptime, performance, trends, and anomalies;
- Identify areas of technology maintenance where human effort can be automated;
- Reinforce the intended technical and cultural norms;
- Help the team build resilient infrastructure and tooling solutions to support Engineering;
- Recommend and collaborate on process and practice improvements;
- Deploy software through staging environments to multiple production sites;
- Support and enhance Windows-based application server infrastructure;
- Cultivate interpersonal relationships through superior communication skills.
Requirements:
- 3+ years of experience with cloud environments and provisioning automation;
- Deep understanding of common scripting languages (Ruby, Python, Bash); PowerShell is a plus;
- Experience with distributed systems and the challenges of operating them as they scale;
- Working knowledge of networking and web concepts, and the ability to debug issues down to the packet level;
- Bachelor's or Master's degree in Computer Science or equivalent, plus 2+ years of work experience;
- Intimate familiarity with the DevOps toolkit (Terraform, Ansible, Chef, and other tools);
- Strong programming and problem-solving skills;
- Demonstrated understanding of security best practices;
- Experience with Prometheus;
- Mastery of infrastructure build and configuration automation technologies (like Terraform, Ansible, Puppet, CodeDeploy, Chef);
- Expertise in container/container-fleet-orchestration technologies (like Docker, Kubernetes, AWS ECS);
- Expertise with continuous integration and continuous deployment (CI/CD) software development lifecycles in the cloud;
- Significant experience troubleshooting concurrent and distributed system interactions;
- Cloud- and container-native Linux administration, build, and management skills (AWS AMIs, Packer, etc.);
- Experience with Octopus Deploy.