Senior Site Reliability Engineer, Fleet Management

$127k - $249k

MongoDB HQ

The Team

Platform Engineering is the department within SRE that is responsible for a range of critical infrastructure and operational functions that support the broader engineering organization. Among these are our multi-cloud-provider Kubernetes infrastructure, networking, load balancing (including our public-facing edge and internal service mesh), and observability and alerting systems.

The Fleet Management team provides the core runtime environment that empowers our developers to build and ship products to delight our customers. We manage the end-to-end lifecycle of our Kubernetes fleet, alongside the critical components that ensure cluster reliability and security (e.g., CoreDNS, cert-manager, and Gatekeeper). As our infrastructure scales to support new use cases and products, we are spearheading a migration from Terraform-based Infrastructure as Code (IaC) to an Operator-driven lifecycle management model.

This role can be based in our Austin, Boston, Los Angeles, New York City, Raleigh, or San Francisco offices, in our European office in Dublin, or remotely within the United States.

Responsibilities

  • Contribute to developing and maintaining a scalable and secure runtime environment on top of Kubernetes that supports product needs across MongoDB

  • Provide internal support for our Kubernetes ecosystem, partnering with engineering teams to help them solve domain-specific problems

  • Participate in a 24/7 on-call rotation to resolve critical issues

  • Prioritize blameless post-mortems and dedicate engineering time to systemic fixes, ensuring you aren't paged for the same issue twice

You may be a good fit if you

  • Have 6+ years of experience in software development and operating distributed systems

  • Are proficient in Go, Python, or a similar language, with a strong commitment to code quality and testing practices (writing unit, integration, and E2E tests)

  • Have deep experience using and extending containerization technologies, preferably Kubernetes

  • Have a solid understanding of Linux operating system internals and networking concepts (e.g., filesystems, TCP/IP, DNS, TLS)

  • Possess a customer-focused mindset, treating internal developers as your primary users

  • Have strong operational ownership, including a track record of debugging complex production issues and driving them to resolution

  • Prefer automation over manual processes ("allergic to ops work"); we are a small team of software engineers with a strong bias toward building software solutions to eliminate toil

Strong candidates may also have experience with

  • Designing and implementing secure, multi-tenant runtime environments from first principles

  • Proficiency with Kubernetes ecosystem tools such as Helm, Kustomize, Gatekeeper, and Kyverno, as well as CRDs/Operators, CRI, and CSI

  • Expertise in cloud infrastructure platforms, including AWS, GCP, or Azure

  • Proficiency in provisioning infrastructure using tools like Terraform, Crossplane, and AWS Controllers for Kubernetes (ACK)

  • Advanced Linux systems internals and networking concepts specifically relevant to containers, such as namespaces and cgroups

About MongoDB

MongoDB is built for change, empowering our customers and our people to innovate at the speed of the market. We have redefined the data platform for the AI era, enabling builders to create, transform, and disrupt industries with software. MongoDB's unified data platform, the most widely available, globally distributed data platform on the market, helps organizations modernize legacy workloads, embrace innovation, and unleash AI. Our cloud-native platform, MongoDB Atlas, is the only globally distributed, multi-cloud data platform and is available across AWS, Google Cloud, and Microsoft Azure.

With offices worldwide and over 67,000 customers, including 75% of the Fortune 100 and AI-native startups, relying on MongoDB for their most important applications, we're powering the next era of software.

Our compass at MongoDB is our Leadership Commitment, guiding how and why we make decisions, show up for each other, and win. It's what makes us MongoDB.

To drive the personal growth and business impact of our employees, we're committed to developing a supportive and enriching culture for everyone. From employee affinity groups, to fertility assistance and a generous parental leave policy, we value our employees' wellbeing and want to support them along every step of their professional and personal journeys. Learn more about what it's like to work at MongoDB, and help us make an impact on the world!

MongoDB is committed to providing any necessary accommodations for individuals with disabilities within our application and interview process. To request an accommodation due to a disability, please inform your recruiter.

MongoDB, Inc. provides equal employment opportunities to all employees and applicants for employment and prohibits discrimination and harassment of any type and makes all hiring decisions without regard to race, color, religion, age, sex, national origin, disability status, genetics, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.

Req ID: 426182

MongoDB's base salary range for this role is posted below. Compensation at the time of offer is unique to each candidate and based on a variety of factors such as skill set, experience, qualifications, and work location. Salary is one part of MongoDB's total compensation and benefits package. Other benefits for eligible employees may include: equity, participation in the employee stock purchase program, flexible paid time off, 20 weeks fully-paid gender-neutral parental leave, fertility and adoption assistance, 401(k) plan, mental health counseling, access to transgender-inclusive health insurance coverage, and health benefits offerings. Please note, the base salary range listed below and the benefits in this paragraph are only applicable to U.S.-based candidates.

MongoDB's base salary range for this role in the U.S. is:

$127,000-$249,000 USD

Vacancy posted 1 day ago