Senior Site Reliability Engineer

Iru

About Iru

Iru is the AI-powered security & IT platform used by the world's fastest-growing companies to secure their users, apps, and devices. Built for the AI era, Iru unifies identity & access, endpoint security & management, and compliance automation, collapsing the stack and giving IT & security teams time and control back.

Iru is backed by some of the smartest investors in tech: General Catalyst, Tiger Global, Felicis, Greycroft, and First Round Capital. In July 2024, Iru raised $100 million from General Catalyst, valuing the company at $850 million. Customers include Notion, Cursor, Lovable, Replit, and Mercor, and Iru partners with industry leaders such as ServiceNow and AWS. Iru was named to Forbes' America's Best Startup Employers 2025 list for employee engagement and satisfaction.

The Opportunity

We are looking for a Senior SRE to own how we detect, respond to, and learn from incidents, and to drive consistent observability across services and teams. This role sits at the intersection of reliability engineering and cross-team enablement: you will work alongside our Infrastructure team, complementing their platform-building work with a sharp focus on operational excellence and measurable reliability. You will partner with engineering and platform teams to reduce MTTD and MTTR, and to make reliability measurable, repeatable, and ultimately team-owned.

What You Will Do

  • Lead and refine the incident lifecycle: detection, triage, communication, mitigation, resolution, and post-incident review.
  • Define and maintain severity models, escalation paths, on-call expectations, and runbooks/playbooks, keeping them current and usable under pressure.
  • Facilitate blameless postmortems; turn findings into tracked remediations and shared learning that reduces repeat incidents.
  • Improve coordination during major incidents: roles, tooling, customer/stakeholder updates, and handoffs.
  • Partner with security, support, and product on incident communications and regulatory or contractual obligations where applicable.
Observability Standardization & SLI/SLO Evangelism
  • Establish and maintain organization-wide standards for metrics, logs, and traces in Datadog, including naming conventions, cardinality, retention, and sampling, so teams can instrument consistently and confidently.
  • Define and drive adoption of SLOs, SLIs, and error budgets across engineering teams. Meet teams where they are, bootstrapping SLI/SLO programs for teams starting from scratch and improving rigor for teams that already have them, with the long-term goal of teams owning their own observability.
  • Build and maintain reusable Datadog dashboard templates, monitor templates, and alerting patterns that teams can adopt and adapt, reducing the activation energy for doing observability well.
  • Champion golden signals and RED/USE-style alerting philosophies; align alerts with user-impacting symptoms, not just low-level infrastructure noise.
  • Partner with the Infrastructure team on observability stack decisions, multi-tenancy, cost controls, and data lifecycle.
  • Continuously reduce alert noise through threshold tuning, ownership assignment, and on-call load management.
Reliability Culture
  • Mentor engineers on operational excellence, safe deployment practices, and production readiness; help engineering teams grow their own reliability instincts.
  • Contribute to capacity planning, chaos/game-day exercises, and reliability reviews for critical changes.
  • Serve as a connective layer between the SRE and Infrastructure teams, aligning on tooling, standards, and shared goals.
Requirements
  • Experience: 5+ years in SRE, production engineering, or equivalent, including on-call responsibility for customer-facing systems.
  • Incidents: Proven experience running or significantly improving incident response (process, tooling, or both) in a distributed systems environment.
  • Observability: Deep, hands-on experience with Datadog, including building dashboards, monitors, and instrumentation standards across multiple teams or services. Experience with metrics, logging, and tracing at scale.
  • SLI/SLO Programs: Demonstrated experience defining SLOs/SLIs and error budget policies in production; comfortable working with teams to codify the metrics their reliability posture is based on.
  • Systems: Strong understanding of Linux, networking, distributed systems failure modes, and cloud or hybrid infrastructure (Kubernetes, load balancers, databases, queues).
  • Automation: Proficiency in at least one of Go, Python, or similar for tooling and automation; comfort with IaC concepts (Terraform or equivalent).
  • Communication: Clear written and verbal communication; ability to facilitate discussions during high-pressure incidents and deliberate postmortems alike.
  • Collaboration: Track record of influencing without direct authority and driving adoption across engineering teams.
Nice to Have
  • Experience with OpenTelemetry or similar vendor-neutral instrumentation strategies.
  • Familiarity with PagerDuty, Incident.io, Opsgenie, or similar; Statuspage or equivalent for external communications.
  • Experience in a hyper-growth startup environment.
  • Experience in regulated or high-compliance environments.
  • Contributions to internal developer platforms or shared reliability tooling.
What Success Looks Like
  • Fewer repeated incidents and clearer, actionable postmortem outcomes that teams act on.
  • Engineering teams across the org have well-defined SLIs/SLOs they own and actively use to drive reliability decisions.
  • A shared Datadog observability layer with consistent signals, templated dashboards, and actionable alerts tied to user impact.
  • Engineers know how to instrument, where to look, and how to respond, with sustainable, well-supported on-call.

Benefits & Perks

  • Competitive salary
  • Hybrid work environment (3 days in office per week)
  • 100% individual and dependent medical + dental + vision coverage
  • 401(K) with a 4% company match
  • 20 days PTO
  • Iru Wellness Week the first week in July
  • Equity for full-time employees
  • In-office lunch stipend provided
  • Up to 16 weeks of paid leave for new parents
  • Paid Family and Medical Leave
  • Modern Health mental health benefits for individuals and dependents
  • Fertility benefits
  • Working Advantage employee discounts
  • Onsite fitness center
  • Free parking
  • Exciting opportunities for career growth

We are excited to be serving a significant need for a fast-growing market, and are proud of the high-performing team we have brought together so far. If you're someone who wants to engage in new, exciting projects that will challenge your skills in the best way possible, we would love to connect with you.

At Iru, we believe in fostering an inclusive environment in which employees feel encouraged to share their unique perspectives, leverage their strengths, and act authentically. We know that diverse teams are strong teams, and welcome those from all backgrounds and varying experiences.

Iru is proud to be an equal opportunity employer committed to diversity and inclusion in the workplace. Qualified applicants will be considered for employment without regard to race, color, religion, national origin, age, sex, sexual orientation, gender identity, physical or mental disability, protected veteran or military status, or any other status protected by applicable law.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
Vacancy posted 14 hours ago