Software Engineer, Site Reliability NYC
$160k - $300k
Hebbia, Inc.

About Hebbia

The AI platform for investors and bankers that generates alpha and drives upside. Founded in 2020 by George Sivulka and backed by Peter Thiel and Andreessen Horowitz, Hebbia powers investment decisions for BlackRock, KKR, Carlyle, Centerview, and 40% of the world’s largest asset managers. Our flagship product, Matrix, delivers industry-leading accuracy, speed, and transparency in AI-driven analysis. It is trusted to help manage over $30 trillion in assets globally.

We deliver the intelligence that gives finance professionals a definitive edge. Our AI uncovers signals no human could see, surfaces hidden opportunities, and accelerates decisions with unmatched speed and conviction. We do not just streamline workflows. We transform how capital is deployed, how risk is managed, and how value is created across markets. Hebbia is not a tool. Hebbia is the competitive advantage that drives performance, alpha, and market leadership.

The Role

We are looking for a Site Reliability Engineer who thinks like a software engineer first. You will own critical production systems end-to-end, designing, building, and improving them rather than simply operating them. You will write production-quality code that keeps the platform reliable at scale, embed with product engineering teams to influence architecture from the start, and build the internal tooling that every engineer at Hebbia depends on.

This is not a ticket-driven ops role. You will spend most of your time writing code: instrumenting services, eliminating performance bottlenecks, building deployment platforms, and translating incident post-mortems into lasting architectural improvements.

Responsibilities

- Own critical production services end-to-end, from design and code review through deployment, operation, and incident response
- Profile, benchmark, and rewrite hot paths to eliminate bottlenecks as Hebbia scales
- Lead incident response and drive post-mortem culture, translating findings into code changes and architectural improvements rather than runbooks
- Design and build observability frameworks from scratch, writing custom instrumentation, alerting logic, and debugging tooling that surfaces production issues before customers feel them
- Define and enforce SLOs across platform services and build the feedback loops that keep engineering teams accountable to them
- Own capacity planning and cost efficiency: model growth, right-size infrastructure, and write automation that prevents over-provisioning and resource exhaustion
- Build robust, well-tested internal platforms and deployment tooling held to the same engineering standards as customer-facing code
- Own and continuously improve CI/CD systems so engineering teams can ship safely and quickly
- Embed with product engineering teams as a peer software engineer, contributing directly to production codebases and co-designing systems for reliability from the start
- Partner on infrastructure security through threat modeling, hardening, and automated compliance tooling

Who You Are

- 5+ years of software development with a track record of writing, shipping, and maintaining production services, not just operating infrastructure
- Production-grade proficiency in at least one systems or backend language: Go, Python, C++, or Rust
- Proven experience as a Production Engineer, SRE, or software engineer with a deep infrastructure focus, comfortable owning services end-to-end across the full stack
- Deep understanding of distributed systems
- Container orchestration expertise and hands-on experience debugging complex distributed failures in production
- Working knowledge of OS-level concepts
- Cloud platform fluency (AWS preferred)
- Experience in building and maintaining observability stacks
- Strong CI/CD pipeline expertise and a track record of improving developer velocity without sacrificing safety
- Background at a company with a Production Engineering or software-focused SRE culture is a strong plus
- Experience building platforms for AI/ML workloads or high-throughput document processing pipelines is a plus

Compensation

The salary range for this role is $160,000 to $300,000. This range may be inclusive of several career levels at Hebbia and will be narrowed during the interview process based on the candidate’s experience and qualifications. Adjustments outside of this range may be considered for candidates whose qualifications significantly differ from those outlined in the job description.

Life @ Hebbia

- PTO: Unlimited
- Insurance: Medical + Dental + Vision + 401K
- Eats: Catered lunch daily + DoorDash dinner credit if you ever need to stay late
- Parental leave policy: 3 months for non-birthing parent, 4 months for birthing parent
- Fertility benefits: $15k lifetime benefit
- New hire equity grant: competitive equity package with unmatched upside potential



