Senior Site Reliability Engineer

$175k - $200k

Order.co

Order.co is the System of Action for the Office of the CFO, transforming the way businesses purchase and pay into an intuitive, B2C-like shopping experience. Order.co leverages embedded AI agents and embedded financial products to reinvent the way businesses connect with their vendors. End users enjoy a seamless, zero-training buying experience, while finance and procurement leaders gain a single platform to orchestrate how the business “should operate”. The result is an all-in-one solution that serves as a gravitational pull for spend and data, automating and eliminating procurement and finance workflows from requisition to reconciliation along the way. Order.co is on the cutting edge of B2B Agentic Commerce, poised to be the market leader in creating a more predictive, prescriptive, and personalized experience for users.

Founded in 2016 and headquartered in New York City, Order.co oversees nearly half a billion in annualized spend across hundreds of customers like WeWork, SoulCycle, Lume, and solidcore. Order.co has raised $75M in funding from industry-leading investors like MIT, Stage 2 Capital, Rally Ventures, 645 Ventures, and more. Order.co has been proudly named a 50 to Watch by Spend Matters and a Best Place to Work by BuiltIn and Inc. Magazine.

The Role

As a Senior Site Reliability Engineer on the Platform team, you will ensure that software systems are reliable, scalable, performant, and operationally efficient. You blend software engineering skills with infrastructure and operations expertise to keep critical systems running smoothly while enabling rapid product development.

Responsibilities

Reliability Engineering & Infrastructure Ownership
Design, build, and operate highly available, scalable, and fault-tolerant infrastructure and platform services
Own reliability, availability, latency, and operational excellence for critical production systems and services
Define and maintain service level objectives (SLOs), service level indicators (SLIs), and error budgets across platform systems
Lead incident response efforts for complex production outages; drive root-cause analysis and long-term remediation actions
Build resilient systems that gracefully handle failures, traffic spikes, dependency degradation, and regional outages
Continuously improve system reliability through automation, observability, performance tuning, and capacity planning
Develop infrastructure automation and self-service tooling to reduce operational toil and improve engineering velocity
Build and maintain CI/CD pipelines, deployment automation, and release engineering workflows
Implement infrastructure as code (IaC) practices using tools such as Terraform, CloudFormation, and container orchestration
Improve developer experience by building reliable internal platforms, operational tooling, and standardized deployment patterns
Drive adoption of GitOps, immutable infrastructure, and automated remediation patterns

Observability & Operational Excellence
Design and maintain comprehensive monitoring, logging, tracing, and alerting systems for distributed services
Establish actionable alerting standards that reduce noise while improving incident detection and response times
Analyze production trends, system bottlenecks, and failure patterns to proactively prevent incidents
Lead operational readiness reviews, disaster recovery planning, and game-day exercises
Improve mean time to detect (MTTD) and mean time to recovery (MTTR) through tooling, automation, and process refinement
Participate actively in architecture and infrastructure design reviews
Propose scalable and reliable platform designs that account for multi-region deployment, redundancy, failover, and security considerations
Evaluate trade-offs between reliability, scalability, operational complexity, and engineering velocity
Identify systemic risks and operational gaps before they become production incidents
Partner with engineering teams to ensure services are designed with operability, observability, and resilience in mind from day one

Security & Compliance
Approach infrastructure and operational practices with a strong security mindset
Implement and maintain secure cloud networking, secrets management, IAM policies, and infrastructure hardening standards
Partner with Security and Compliance teams to ensure systems meet organizational and regulatory requirements
Drive operational best practices around vulnerability management, patching, and production access controls

End-to-End Ownership & Collaboration
Scope and estimate infrastructure and reliability initiatives accurately
Coordinate production rollouts, maintenance events, and reliability improvements across teams
Communicate operational risks, dependencies, and incident impacts clearly to technical and non-technical stakeholders
Collaborate closely with Software Engineering, Security, Product, and Operations teams to improve platform reliability and scalability
Serve as a trusted escalation point during critical production incidents

Mentorship & Technical Leadership
Mentor junior and mid-level engineers on reliability engineering principles, operational excellence, and infrastructure best practices
Raise the operational maturity of the engineering organization through documentation, reviews, and technical guidance
Drive improvements in team standards around observability, incident management, automation, and infrastructure design
Influence technical decisions through credibility, operational expertise, and strong engineering judgment

Qualifications

You are motivated by accountability: you own outcomes, not just tasks
You are results-oriented and measure success by shipped, working software
You are motivated by correctness in code that touches money: the consequences of a bug land on real customer balances, and you take that seriously
You love helping people on your team grow and improve
Writing tests is an integral part of your development process, not an afterthought
You know how to design and build software incrementally; you don't need a complete spec to make progress
Collaborating with the people around you to achieve a goal motivates you
You are collaborative, open-minded, and actively developing your craft
You are curious and pragmatic about AI-driven solutions: you apply them where they add real value and stay skeptical where they don't
Familiarity with AI-assisted development tools: you understand how they work, where they help, and where they fail. Prior hands-on use is a plus; intellectual curiosity and the instinct to evaluate AI output critically are what matter

Technical Skills

Strong foundation in computer science fundamentals: data structures, algorithms, and system design
Familiarity with building production-grade applications and services using Ruby and Ruby on Rails
Deep expertise with Linux systems administration and production troubleshooting
Strong experience operating cloud infrastructure at scale, particularly within AWS environments
Experience with Kubernetes, container orchestration, and cloud-native infrastructure patterns
Proficiency with infrastructure as code tools such as Terraform or CloudFormation
Expertise designing and operating CI/CD pipelines and deployment automation systems
Deep understanding of observability tooling including Datadog, OpenTelemetry, or similar platforms
Strong knowledge of distributed systems reliability patterns including redundancy, failover, autoscaling, rate limiting, and graceful degradation
Experience building automation and operational tooling using languages such as Python, Go, Bash, or Ruby
Strong understanding of networking fundamentals including DNS, load balancing, TLS, VPNs, firewalls, and service discovery
Hands-on experience with incident response, root-cause analysis, and production operations in high-availability environments
Familiarity with SRE methodologies including SLOs, SLIs, error budgets, capacity planning, and operational maturity modeling
Experience implementing secure infrastructure and cloud security best practices including IAM, secrets management, and vulnerability remediation
Proven ability to design scalable, resilient, and maintainable platform systems and APIs
Experience supporting distributed microservices architectures and event-driven systems
Strong understanding of operational excellence principles including automation-first engineering and toil reduction
Experience using AI-assisted engineering tools (e.g., Claude, GitHub Copilot) as force multipliers while applying sound operational and engineering judgment
Excellent debugging and systems thinking skills across infrastructure, networking, application, and platform layers

What Great Looks Like

A Senior Site Reliability Engineer on the Platform team who is thriving at this level demonstrates:
Reliable delivery of complex work: consistently ships multi-part solutions on time with low defect rates
Low defects in owned areas: proactively monitors and improves the quality of the systems they own; that means incident-free quarters in code paths that move funds and clean reconciliation against vendor reports
Measurable mentorship impact: engineers around you write better code because of your reviews and guidance
Someone we can depend on for the work that matters, especially the work that touches money.

Failure Modes We Screen Against

We actively evaluate candidates for the following anti-patterns during the interview process (failure mode, then what it looks like):
Strong coder, weak owner: ships code but doesn't manage to the task (owns the merge, not the outcome); hands off and moves on without monitoring or fixing post-release issues
Hoards knowledge instead of sharing: becomes a single point of failure and blocks team growth
Proposes solutions without considering trade-offs: jumps to conclusions, resists alternative approaches
Produces AI-generated output without verifying it against the codebase, tests, or business context

Interview Process

Our 5-round process is designed to evaluate you across all competency areas. AI tools are permitted in technical rounds. Each round is listed with its format, followed by what we evaluate:
Round 1 (60 min, conversational): Career trajectory, mentorship philosophy, technical influence examples, communication style
Round 2, Take-Home + PR Discussion (72h take-home + 60 min live): Navigating unfamiliar code, ownership and decomposition discipline visible in your PR, root-cause judgment, AI tool usage
Round 3, System Design: Requirements gathering, schema/API design, trade-off articulation, calibrated code-review judgment on a teammate's PR
Round 4, Team Interview, conditional (30 min, behavioral): Collaboration patterns, mentorship behavior, negotiation behavior with cross-functional partners
Round 5, Culture Add (30 min, People Team): Organizational values alignment

Round 4 is conditional: it runs when the team needs additional behavioral signal after Rounds 2 and 3, and is otherwise skipped. Your recruiter will tell you whether it's scheduled before your loop is finalized. The Round 2 (Take-Home + PR Discussion) and Round 3 (System Design) exercises are drawn from real problems so the technical evaluation is grounded in the work you'd actually be doing.

What You’ll Receive

Competitive compensation including base salary, bonus, and equity
Employer-sponsored 401(k) with match
Comprehensive medical, dental, and vision coverage
Flexible time off and hybrid work environment

The anticipated annual salary range for this role is $175,000 - $200,000. Actual compensation and title will be commensurate with experience, qualifications, knowledge, and skills.

Vacancy posted 2 days ago

Similar jobs that could be interesting for you
Based on the Senior Site Reliability Engineer in New York, NY vacancy

Senior Lead Site Reliability Engineer
...professionals for this role. JOB DESCRIPTION Elevate your engineering prowess to unprecedented levels by joining a team of... ...professionals and position yourself among the top echelon in site reliability. As a Sr Lead Site Reliability Engineer at JPMorgan Chase...
Senior
J.P. Morgan
Jersey City, NJ
7 days ago
Senior Manager of Site Reliability Engineering - Securitized Products, Production Management - NA
...Guide and shape the future of technology at a globally recognized firm, driven by pride in ownership. As a Senior Manager of Site Reliability Engineering at JPMorgan Chase within the Corporate Investment Bank, Markets team, you are the non-functional requirement owner...
Senior
Bank staff
Shift work
J.P. Morgan
New York, NY
4 days ago
Senior Site Reliability Engineer
...critical services in a new public cloud platform? Join our IaaS Site Reliability Engineering (SRE) team. We design, develop, and operate infrastructure... ...of a clear career path in our SRE team: SRE I → SRE II → Senior → Senior II → Principal → Senior Principal. Each step...
Senior
Work at office
Remote work
Akamai
New York, NY
4 days ago
Senior Site Reliability Engineer
$150k - $170k
...Senior Site Reliability Engineer – Zip Co Join to apply for the Senior Site Reliability Engineer role at Zip Co At Zip, we build cloud‑native software applications that serve millions of customers and process billions of dollars in payments. We’re looking for a seasoned...
Senior
Casual work
Work at office
Remote work
Flexible hours
ZIP
New York, NY
6 days ago
Senior Site Reliability Engineer
$65 - $75 per hour
...Confluence, and IT Service Management tools. Description: As an Engineer 2, you will collaborate with management, departments, and... ...event management, and automation across the IT organization. Seniority level Mid-Senior level Employment type Contract Job function Information...
Senior
Contract work
Remote work
SBS Creatix
New York, NY
4 days ago
Senior Site Reliability Engineer, Fleet Management
$127k - $249k
...The Team Platform Engineering is the department within SRE that is responsible for a range of critical infrastructure and operational... ...fleet, alongside the critical components that ensure cluster reliability and security (e.g., CoreDNS, cert-manager, and Gatekeeper). As...
Senior
Work at office
Local area
Remote work
Worldwide
Flexible hours
MongoDB
New York, NY
4 days ago
Senior Site Reliability Engineer
...Senior Site Reliability Engineer – Azure Cloud Join to apply for the Senior Site Reliability Engineer role at Concord Technologies Concord Technologies is growing! Currently seeking a full‑time Senior Site Reliability Engineer (Sr. SRE) , with experience engineering solutions...
Senior
Full time
Local area
Immediate start
Remote work
Flexible hours
Concord Technologies
New York, NY
4 days ago
Remote Senior Site Reliability Engineer
$141k
...part of our journey! About the role We are committed to providing our customers with reliable and secure services so we are expanding our central Site Reliability Engineering team. You will be responsible for building and leading processes to ensure the reliability...
Senior
Local area
Remote work
Home office
Flexible hours
GrabJobs
New York, NY
5 hours ago
Senior Site Reliability Engineer
$150k - $200k
...Join to apply for the Senior Site Reliability Engineer role at Gradle Inc. Develocity is a first‑of‑its‑kind toolchain observability and acceleration platform that helps software teams adopt and improve DORA capabilities (including continuous delivery) in order to achieve...
Senior
Full time
Local area
Remote work
Work from home
Gradle Inc.
New York, NY
4 days ago
Senior Site Reliability Engineer
$185k - $227k
...by this common purpose and we are hiring the world’s best engineers, scientists, designers, product managers, operations experts... ...compelling, read on for more details. ROLE AND RESPONSIBILITIES A Senior Site Reliability Engineer (SRE) is expected to own the operational stability...
Senior
Remote work
JUUL Labs
New York, NY
4 days ago
Senior Software Engineer - Site Reliability Engineering
$130k - $165k
...Job Title: Senior Software Engineer Company: Snapsheet Job Location: USA, Remote Job Type: Full-time, direct hire Job Department: Technology Team: Site Reliability Engineering About Snapsheet Snapsheet exists to simplify claims. We leverage...
Senior
Full time
Temporary work
Local area
Remote work
Visa sponsorship
Work visa
Flexible hours
Snapsheet
New York, NY
3 days ago
Senior Site Reliability Engineer
$157.5k - $254.35k
...signature and contract lifecycle management (CLM). What you’ll do We are looking for a self‑motivated, driven and creative Senior Site Reliability Engineer to join the Site Reliability team. Metrics and analytics drive engineering at DocuSign and ensure that we are...
Senior
Contract work
Work at office
Local area
Remote work
DocuSign
New York, NY
5 days ago
Senior Site Reliability Engineer — Observability & CI/CD
jobr.pro is seeking a Senior Site Reliability Engineer in New York, NY, to enhance platform reliability and engineering excellence. You will be instrumental in implementing observability, security, and CI/CD practices. This role involves coaching teams and optimizing workflows...
Senior
jobr.pro
New York, NY
3 days ago
Senior Site Reliability Engineer
$150k - $175k
...Site Reliability Engineer At ASAPP, our mission is simple: deliver the best AI-powered customer experience—faster than anyone else. To achieve that, we're guided by principles that shape how we think, build, and execute. We value customer obsession, purposeful speed...
Senior
Remote work
ASAPP
New York, NY
4 days ago
Senior Site Reliability Engineer, Commodities Tech
$175k - $245k
A leading asset management firm in New York is seeking a Site Reliability Engineer to ensure high availability of technology services. The ideal candidate will have experience with AWS, Docker, and various operating systems. This role includes responsibilities like streamlining...
Senior
Point72 Asset Management, L.P
New York, NY
2 days ago
Senior Site Reliability Engineer - Scalable AI Infra
Tavily Inc. in New York City is seeking a Senior Site Reliability Engineer to manage Kubernetes clusters and own the full infrastructure. You will improve CI/CD pipelines and ensure systems are reliable and scalable. This role offers the chance to work on real scaling...
Senior
Tavily Inc.
New York, NY
2 days ago
Senior Site Reliability Engineer — Scale & Observability
Tabs in New York is seeking a Staff Site Reliability Engineer to evolve their platform, manage AWS infrastructure, and enhance incident response processes. This role offers a chance to shape operational excellence and reduce toil through automation. Ideal candidates should...
Senior
Tabs
New York, NY
1 day ago
Senior Site Reliability Engineer — NYC Onsite
Legora-Ab is seeking a Senior Site Reliability Engineer to join our NYC engineering hub. You will own critical services, enhancing reliability across our platform and collaborating closely with engineering teams in Stockholm. This is a full-time, in-office position focused...
Senior
Full time
Work at office
Legora-Ab
New York, NY
4 days ago
Senior Site Reliability Engineer
...requirements unforgiving, and the impact immediate. This isn’t a reactive firefighting role. It’s proactive, engineering-focused SRE where you’ll automate reliability, engineer for performance, and shape infrastructure strategy at the firm level. What they’re looking for:...
Senior
Immediate start
Campbell North Ltd.
New York, NY
2 days ago
Senior Site Reliability Engineer - Scalable Workflows
$180k - $200k
Parabola is looking for a Senior Site Reliability Engineer to improve performance and reliability of its software systems in New York. This role requires 5+ years of SRE or DevOps experience and expertise in AWS and containerization tools. Offering a salary of $180,000...
Senior
Work at office
3 days per week
Parabola
New York, NY
4 days ago
Senior Site Reliability Engineer
...acquisition, and Connor was a machine learning research engineer at Scale AI. The rest of our team comes from... ...redefining go-to-market with state-of-the-art AI. As a Senior SRE, you'll tackle the scaling and reliability challenges that come with adding terabytes of data...
Senior
Unify
New York, NY
4 days ago
Senior Site Reliability Engineer (SRE)
New York, United States | Posted on 11/13/2025 Title: Senior Site Reliability Engineer (SRE) Location: Remote About January At January, we’re transforming the lives of borrowers by bringing humanity to consumer finance. Our data-driven products empower financial institutions...
Senior
Remote work
Govserviceshub
New York, NY
4 days ago
Senior Site Reliability Engineer
...the future of legal tech — we’re defining it. Ready to join us in building the intelligent future of law? The role As a Senior Site Reliability Engineer you'll join the founding SRE team at our new NYC engineering hub, sitting within Foundations. You'll own critical...
Senior
Work at office
Dormont Manufacturing Co
New York, NY
2 days ago
Senior Site Reliability Engineer (Agentic Search)
$156k - $262k
Senior Site Reliability Engineer (Agentic Search) New York City, New York, United States About Tavily We're building the infrastructure layer for agentic web interaction at scale. Our API is designed from the ground up to power Retrieval-Augmented Generation (RAG) and...
Senior
Temporary work
Immediate start
Remote work
Tavily Inc.
New York, NY
2 days ago
Senior Site Reliability Engineer II (New York)
$104.9k - $174.7k
...Management. You can learn more about LexisNexis Risk at the link below, About the Role: We are hiring a hands-on Senior Site Reliability Engineer (SRE) to actively build, operate, and improve the reliability of our production systems. This is not a purely advisory...
Senior
Full time
Work at office
Local area
Remote work
Flexible hours
LexisNexis® Risk Solutions
New York, NY
17 hours ago
Senior DevOps Engineer / Site Reliability Engineer
...Job Description A major financial services company in NYC is growing its team rapidly, and they are looking for a Senior DevOps Engineer / Site Reliability Engineer who can join. If you’re passionate about high-availability, reliability, automation, we’d be excited...
Senior
The Greene Group
New York, NY
a month ago
Site Reliability Engineer (Senior or Staff), Atlas New York City; United States
$127k - $249k
...Eastern or Central time zones. We are looking for an experienced Senior Engineer for our SRE, Atlas team to support, maintain and grow the... ...workloads. Role Overview We are seeking a talented Site Reliability Engineer (SRE) with a strong infrastructure background. This...
Senior
Local area
Remote work
MongoDB
New York, NY
5 days ago
Site Reliability Engineer (Senior or Staff)
$127k - $249k
Platform Engineering is the department within SRE that is responsible for a range of critical infrastructure and operational functions that... ...maintains our continuous delivery infrastructure, ensuring reliable code deployment from development through production for all engineering...
Senior
Local area
Worldwide
Flexible hours
AlleyCorp
New York, NY
1 day ago
Senior DevOps Engineer/Site Reliability Engineer-East Coast
$165k - $215k
...collaboration, fostering creativity and innovation that drives real impact in the market. We are seeking a highly skilled Senior DevOps / Site Reliability Engineer (SRE) to join our globally distributed engineering organization. This is a hands‑on senior‑level role focused on...
Senior
Stellar Cyber
New York, NY
4 days ago
Senior Site Reliability and Infrastructure Engineer
$160k - $215k
...team members based closer to our customer sites (i.e. Bay Area). We strongly support our employees (including software engineers) to visit customer sites — ask us about this... .... You will build the observability and reliability foundations that let us run this system confidently...
Senior
Full time
Work experience placement
Work at office
2 days per week
Treeswift Inc
New York, NY
2 days ago
