Site Reliability Engineer (SRE)
Ova Technologies
We are seeking a highly skilled Site Reliability Engineer (SRE) to ensure the reliability, scalability, performance, and availability of mission-critical applications and infrastructure. The ideal candidate will combine software engineering and operations expertise to build automated solutions, improve system resilience, and minimize service disruptions. The SRE will work closely with development, DevOps, cloud, and support teams to enhance system stability and operational excellence.
Key Responsibilities:
- Design, implement, and maintain highly available and scalable infrastructure solutions.
- Monitor application and infrastructure performance to ensure optimal system health.
- Develop automation tools to streamline deployment, monitoring, incident response, and operational tasks.
- Define and manage Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs).
- Perform root cause analysis (RCA) for production incidents and implement preventive measures.
- Collaborate with development teams to improve application reliability and performance.
- Manage capacity planning and infrastructure scaling strategies.
- Build and maintain observability solutions including monitoring, logging, and alerting systems.
- Participate in incident management, on-call rotations, and disaster recovery planning.
- Implement security, compliance, and operational best practices.
- Drive continuous improvement initiatives to reduce operational overhead through automation.
Required Skills:
- Strong understanding of Linux/Unix systems administration.
- Expertise in monitoring, alerting, and observability practices.
- Experience with cloud platforms and distributed systems.
- Strong troubleshooting and performance optimization skills.
- Knowledge of networking, security, and system architecture.
- Excellent problem-solving and communication abilities.
Technical Skills:
- Operating Systems: Linux, Unix, Windows Server
- Cloud Platforms: AWS, Azure, Google Cloud Platform (GCP)
- Containerization: Docker, Kubernetes, OpenShift
- Infrastructure as Code (IaC): Terraform, CloudFormation, Ansible
- Monitoring Tools: Prometheus, Grafana, Datadog, New Relic, Dynatrace
- Logging Tools: ELK Stack (Elasticsearch, Logstash, Kibana), Splunk
- CI/CD Tools: Jenkins, GitHub Actions, GitLab CI/CD, Azure DevOps
- Programming/Scripting: Python, Go, Bash, PowerShell
- Databases: PostgreSQL, MySQL, MongoDB, Redis
- Version Control: Git, GitHub, GitLab, Bitbucket
Qualifications:
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field.
- Relevant certifications are preferred:
  - AWS Certified DevOps Engineer
  - Google Professional Cloud DevOps Engineer
  - Microsoft Azure DevOps Engineer Expert
  - Certified Kubernetes Administrator (CKA)
Experience:
- 4–8 years of experience in Site Reliability Engineering, DevOps, Cloud Engineering, or Infrastructure Operations.
- Hands-on experience supporting production environments and cloud-native applications.
- Experience with Kubernetes, container orchestration, and automation frameworks.
- Experience implementing monitoring and observability solutions.
Preferred Qualifications:
- Experience managing large-scale distributed systems and microservices architectures.
- Knowledge of chaos engineering and reliability testing practices.
- Experience with performance tuning and capacity planning.
- Familiarity with security best practices and compliance standards.
- Experience with serverless and event-driven architectures.
Preferred Qualities:
- Strong ownership mindset and accountability.
- Ability to remain calm and effective during critical incidents.
- Excellent analytical and debugging skills.
- Strong collaboration and cross-functional communication abilities.
- Passion for automation, reliability, and continuous improvement.
Employment Type:
Full-Time
Location:
Remote / Hybrid / On-site
Nice to Have:
- Experience with SaaS, FinTech, Healthcare, E-commerce, or HR Tech platforms.
- Knowledge of AI-driven observability and incident management tools.
- Experience implementing self-healing infrastructure and automated remediation.
- Familiarity with cost optimization strategies in cloud environments.
- Experience mentoring engineers and driving reliability best practices across teams.


