Software Engineer, Site Reliability NYC

$160k - $300k

Hebbia, Inc.

About Hebbia

The AI platform for investors and bankers that generates alpha and drives upside. Founded in 2020 by George Sivulka and backed by Peter Thiel and Andreessen Horowitz, Hebbia powers investment decisions for BlackRock, KKR, Carlyle, Centerview, and 40% of the world’s largest asset managers. Our flagship product, Matrix, delivers industry-leading accuracy, speed, and transparency in AI-driven analysis. It is trusted to help manage over $30 trillion in assets globally.

We deliver the intelligence that gives finance professionals a definitive edge. Our AI uncovers signals no human could see, surfaces hidden opportunities, and accelerates decisions with unmatched speed and conviction. We do not just streamline workflows. We transform how capital is deployed, how risk is managed, and how value is created across markets. Hebbia is not a tool. Hebbia is the competitive advantage that drives performance, alpha, and market leadership.

The Role

We are looking for a Site Reliability Engineer who thinks like a software engineer first. You will own critical production systems end-to-end, designing, building, and improving them rather than simply operating them. You will write production-quality code that keeps the platform reliable at scale, embed with product engineering teams to influence architecture from the start, and build the internal tooling that every engineer at Hebbia depends on.

This is not a ticket-driven ops role. You will spend most of your time writing code: instrumenting services, eliminating performance bottlenecks, building deployment platforms, and translating incident post-mortems into lasting architectural improvements.

Responsibilities
  • Own critical production services end-to-end, from design and code review through deployment, operation, and incident response
  • Profile, benchmark, and rewrite hot paths to eliminate bottlenecks as Hebbia scales
  • Lead incident response and drive post-mortem culture, translating findings into code changes and architectural improvements rather than runbooks
  • Design and build observability frameworks from scratch, writing custom instrumentation, alerting logic, and debugging tooling that surfaces production issues before customers feel them
  • Define and enforce SLOs across platform services and build the feedback loops that keep engineering teams accountable to them
  • Own capacity planning and cost efficiency: model growth, right-size infrastructure, and write automation that prevents over-provisioning and resource exhaustion
  • Build robust, well-tested internal platforms and deployment tooling held to the same engineering standards as customer-facing code
  • Own and continuously improve CI/CD systems so engineering teams can ship safely and quickly
  • Embed with product engineering teams as a peer software engineer, contributing directly to production codebases and co-designing systems for reliability from the start
  • Partner on infrastructure security through threat modeling, hardening, and automated compliance tooling

Who You Are
  • 5+ years of software development with a track record of writing, shipping, and maintaining production services, not just operating infrastructure
  • Production-grade proficiency in at least one systems or backend language: Go, Python, C++, or Rust
  • Proven experience as a Production Engineer, SRE, or software engineer with a deep infrastructure focus, comfortable owning services end-to-end across the full stack
  • Deep understanding of distributed systems
  • Container orchestration expertise and hands-on experience debugging complex distributed failures in production
  • Working knowledge of OS-level concepts
  • Cloud platform fluency (AWS preferred)
  • Experience in building and maintaining observability stacks
  • Strong CI/CD pipeline expertise and a track record of improving developer velocity without sacrificing safety
  • Background at a company with a Production Engineering or software-focused SRE culture is a strong plus
  • Experience building platforms for AI/ML workloads or high-throughput document processing pipelines is a plus

Compensation

The salary range for this role is $160,000 to $300,000. This range may be inclusive of several career levels at Hebbia and will be narrowed during the interview process based on the candidate’s experience and qualifications. Adjustments outside of this range may be considered for candidates whose qualifications significantly differ from those outlined in the job description.

Life @ Hebbia
  • PTO: Unlimited
  • Insurance: Medical + Dental + Vision + 401K
  • Eats: Catered lunch daily + DoorDash dinner credit if you ever need to stay late
  • Parental leave policy: 3 months for non-birthing parent, 4 months for birthing parent
  • Fertility benefits: $15k lifetime benefit
  • New hire equity grant: competitive equity package with unmatched upside potential


Vacancy posted 5 days ago
