Staff Software Engineer - ML Observability

$234k - $300k

The ML Observability team builds cutting-edge tools to monitor, explain, and improve AI systems in production, particularly those leveraging Large Language Models (LLMs) and generative AI. We provide robust, scalable observability for AI workloads, including drift detection, model evaluation, and behavior tracing, enabling customers to ship AI with confidence.

As a Staff Engineer, you’ll lead the development of new features and foundational capabilities within Datadog’s LLM Observability product. You will shape product direction, drive experimentation, and apply your deep understanding of both AI systems and software engineering to solve open-ended problems in the fast-moving AI landscape. Your work will directly impact how our customers monitor, troubleshoot, and optimize LLM-based applications in production. Join us in building the foundational tools that make AI systems observable, understandable, and reliable in the real world.

At Datadog, we place value in our office culture – the relationships and collaboration it builds and the creativity it brings to the table. We operate as a hybrid workplace to ensure our Datadogs can create a work-life harmony that best fits them.

What You’ll Do:
  • Drive design and implementation of LLM observability features.
  • Ideate, prototype, and scale new product features to provide insights and drive improvements for generative AI systems.
  • Work cross-functionally with other engineering teams, product, UX, and applied science to iterate fast and find product-market fit.
  • Develop and extend tools for tracing, evaluating, and debugging LLMs.
  • Influence architecture decisions and mentor engineers to build resilient, high-performance systems.
  • Stay close to customer pain points and use those insights to guide product and engineering priorities.
  • Stay current with industry trends and advancements in machine learning and observability, driving innovation within the team.

Who You Are:
  • You have a BS/MS/PhD in Computer Science, Engineering, or a related scientific field, or equivalent experience.
  • You have a deep understanding of distributed systems and scalable backend architectures.
  • You have hands-on experience building and shipping LLM-powered or GenAI applications.
  • You understand model internals, inference pipelines, evaluation techniques, and prompt engineering.
  • You thrive in ambiguous, fast-changing spaces and have a product-oriented mindset.
  • You’re excited to shape the next generation of AI observability tools from the ground up.
  • You communicate clearly, think rigorously, and take pride in clean, maintainable code.
  • You have experience with observability tools/platforms.

Datadog values people from all walks of life. We understand not everyone will meet all the above qualifications on day one. That’s okay. If you’re passionate about technology and want to grow your skills, we encourage you to apply.

Benefits and Growth:
  • Get to build tools for software engineers, just like yourself, and use the tools we build to accelerate our development.
  • Have a lot of influence on product direction and impact on the business.
  • Work with skilled, knowledgeable, and kind teammates who are happy to teach and learn.
  • Competitive global benefits.
  • Continuous professional development.

Benefits and Growth listed above may vary based on the country of your employment and the nature of your employment with Datadog.

Datadog offers a competitive salary and equity package, and may include variable compensation. Actual compensation is based on factors such as the candidate's skills, qualifications, and experience. In addition, Datadog offers a wide range of best-in-class, comprehensive, and inclusive employee benefits for this role, including healthcare, dental, parental planning, and mental health benefits, a 401(k) plan and match, paid time off, fitness reimbursements, and a discounted employee stock purchase plan.

The reasonably estimated yearly salary for this role at Datadog is: $234,000 - $300,000 USD

Equal Opportunity at Datadog: Datadog is an affirmative action and equal opportunity employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements. Here are our candidate legal notices for your reference.

Vacancy posted 1 day ago
Similar jobs that could be interesting for you
Based on the Staff Software Engineer - ML Observability in Boston, MA vacancy
  • $170k - $200k

     ...empower small businesses. Our engineering team isn't just a support...  ...party vendors to fuel our AI/ML initiatives and create a more...  ...ecosystem. We are seeking a Staff Software Engineer to be a pivotal...  ...user experience. Performance & observability: Tunes for real-user performance... 
    Suggested
    Work at office
    Work from home
    Flexible hours

    Forward Financing

    Boston, MA
    9 hours ago
  • $230k - $280k

     ...inclusion, respect, and accountability. Staff Software Applied AI Engineer Location: Seattle, WA; Austin,...  ...Establish meaningful metrics, observability, evaluation frameworks, and continuous...  ...Experience with cloud-based AI/ML services (AWS Bedrock, GCP Vertex AI... 
    Suggested
    Apprenticeship
    Work at office
    Local area
    Remote work
    Flexible hours
    Shift work
    1 day per week

    HackerOne

    Boston, MA
    3 days ago
  •  ...award-winning, AI-First global digital engineering company that helps the world’s leading Fortune...  ...across the industry, including: 3 AWS AI/ML Partner of the Year awards, 3 NVIDIA...  ...visit: Website or LinkedIn Page. Role: DevOps/Observability Engineer. Experience Level: 8+... 
    Suggested
    Remote work
    Worldwide

    Quantiphi

    Boston, MA
    2 days ago
  • $146.6k - $215.1k

     ...Staff Software Engineer, ML Infrastructure We’re a high-tech home security company that’s passionate about protecting the life you’ve built and...  ...whether that's serving reliability, deployment friction, observability gaps, scaling, or cost. Build and operate real‑time CV... 
    Suggested
    Work at office

    SimpliSafe Wireless Home Security

    Boston, MA
    8 hours ago
  •  ...at the intersection of data, software engineering, and applied machine...  ...rigorous environment. As a Staff Software Engineer, you’ll...  ...software, data, and applied ML teams to ensure systems are...  ...environments Improve reliability, observability, and CI/CD practices across... 

    Evolution USA

    Boston, MA
    2 days ago
  • $106.61k - $284.28k

     ...professionals. We are looking for a Staff Software Development Engineer with expert-level backend...  ...deployment pipelines, and production observability infrastructure. Your work will directly...  ...implementation Experience with managed AI/ML cloud services, modern JavaScript... 
    Hourly pay
    Full time
    Temporary work
    Local area

    CVS Health

    Boston, MA
    2 days ago
  • $99.6k - $223.4k

     ...design for scalability, reliability, and observability. Stay hands-on with coding, debugging, and production delivery. Drive engineering excellence through code reviews and best...  ...experience (OCI, AWS, Azure, or GCP). ~ AI/ML or AIOps production experience. ~... 
    Full time
    Temporary work
    Remote work
    Flexible hours

    Oracle

    Boston, MA
    2 days ago
  •  ...Staff Software Engineer (Backend) Axiomatic AI is building a new class of AI systems designed to...  ...routing, agent runtime, persistence, observability) Drive engineering excellence through...  ...APIs: REST, WebSockets, SSE, MCP AI/ML: Anthropic Claude, Google Gemini,... 
    Work at office
    Remote work

    Axiomatic_AI

    Boston, MA
    1 day ago
  •  ...our internal processes. We’re hiring Staff and Senior Software Engineers to work on one of the teams that powers...  ..., secure, scalable, and easy to observe Build and operate distributed databases...  ...planes to support rapid product and ML growth Design inference infrastructure... 
    Full time
    Work at office
    Local area

    SUNO

    Boston, MA
    9 hours ago
  • $200k - $325k

     ...foundation for predictive intelligence. We're looking for a Staff Software Engineer to own the design and implementation of the core systems that...  ...and implementation, between integration engineering and ML infrastructure, between defining technical strategy and writing... 

    MD Ally

    Cambridge, MA
    9 hours ago
  • $172.8k - $251.65k

     ...analyses to evaluate autonomous driving software performance across the autonomy stack....  ...functional efforts with autonomy, systems engineering, simulation, and data teams to embed evaluation...  ...Invent and drive new statistical and ML methods, and ML introspection techniques,... 
    Local area
    Remote work
    Work from home
    Relocation
    Relocation package
    Flexible hours

    General Motors

    Boston, MA
    4 days ago
  • $253.9k - $298.7k

     ...and reliability. The Role We are looking for a Senior Staff Software Engineer to serve as Coinbase's Solana Staking Protocol CTO — the...  ...infrastructure at scale — bare metal, cloud, networking, observability. Strategic Vision: You can define year-long technical... 
    Local area

    Coinbase

    Boston, MA
    3 days ago
  • $242k - $333k

     ...methodology (RBV - Risk Balanced Verification) for evaluating hardware/software changes in varied environments. Familiarity with diverse...  ...Qualifications Experience in performance optimization to fit complex ML stack to low‑power low‑cost edge compute (e.g., Nvidia Thor,... 
    Odd job
    Temporary work

    Zoox

    Boston, MA
    1 day ago
  • Staff Full-Stack Software Engineer Financial institutions - banks and credit unions - have begun a seismic shift in how they operate and serve their customers...  ...pipelines, vector store migrations, orchestration of ML utility services Optimize applications for reliability... 
    Remote work
    Work from home
    Shift work

    Roberts Recruiting, LLC

    Boston, MA
    3 days ago
  • $254k - $336k

     ...assembled a diverse team of experts in software, robotics, artificial intelligence,...  ...defense capability. ABOUT THE JOB Staff Robotics Engineers lead the delivery of vehicle...  ...the security of our candidates. We've observed a rise in sophisticated phishing and... 
    Full time
    Work experience placement
    Immediate start
    Flexible hours

    Anduril Industries

    Boston, MA
    5 days ago
  • $166k - $265k

     ...Job Description: Our Opportunity Chewy is seeking a Staff Software Engineer to lead the Practice Hub engineering team, part of the Vet...  ...aggregated data pipeline that ingests signals from predictive ML models and recommendation engines. You will work with... 
    Local area
    Flexible hours

    Chewy

    Boston, MA
    1 day ago
  • $218.03k - $256.5k

     ...Coinbase is seeking an experienced backend engineer to join our Advanced Trading team to...  ...have at least 8 years of experience in software engineering. You’ve designed, built,...  ...customer obsession with comprehensive observability You empower and cultivate your teammates... 
    Local area
    Worldwide

    Coinbase

    Boston, MA
    4 days ago
  • $133.65k - $220.68k

     ...Job Summary The Red Hat Observability Service team is looking for a Senior Software Engineer to join us in the USA. The role collaborates with peers and senior engineers to ensure the delivery of high-quality software features and solutions, resolving routine and semi... 
    Permanent employment
    Full time
    Contract work
    Work experience placement
    Work at office
    Remote work
    Flexible hours

    Red Hat

    Boston, MA
    9 hours ago
  • $175k - $197.8k

     ...pushing the frontier of AI-driven software development, using cutting-...  ...scale. As a Sr. Software Engineer at OpenGov, you will develop...  ...workflow automation (direct ML experience not required) BA/BS...  ...modern tools like Grafana for observability and performance monitoring Proficiency... 
    Contract work
    Work at office
    Local area
    Flexible hours

    OpenGov

    Boston, MA
    1 day ago
  •  ...role fundamental scientific and engineering research play in developing...  ...at MIT) and our integrated software platform. We care deeply about...  ...css Experience with analytics/observability systems and products...  ...frameworks Experience deploying ML models in production and familiarity... 
    Flexible hours

    QuantAQ

    Somerville, MA
    5 days ago
  • $286.2k - $326.7k

     ...Overview Sr. Distinguished Software Engineer - AML (Remote - Eligible) As a Sr. Distinguished...  ...system characteristics, such as observability, resiliency, and operational excellence...  ...professional experience implementing AI/ML strategies for anomaly detection, entity... 
    Full time
    Part time
    Local area
    Remote work

    Capital One

    Boston, MA
    3 days ago
  •  ...Overview of Job Function: As a Senior Software Engineer, you will take deep technical ownership...  ...standards. Production Support and Observability Lead triage and resolution of Tier-2...  ...Principal Engineer and Tech Leads. AI/ML Integration and Innovation Integrate... 
    Contract work
    Local area
    Shift work

    Verint Systems

    Boston, MA
    5 days ago
  •  ...Senior Software Engineer About Datalign: Datalign Advisory is a Cambridge-based fintech building...  ...Senior Software Engineers with deep AI/ML expertise to join a small, high-impact...  ...inference/services and production observability (logs, metrics, traces) ~ Strong communication... 
    Work at office
    Flexible hours

    Datalign Advisory, Inc.

    Cambridge, MA
    3 days ago
  •  .... Overview of Job Function: As a Software Engineer, you will be a core contributor to Verint...  .... Monitor application health using observability tooling (logs, metrics, traces);...  ...documentation for supported features. AI/ML Integration and Continuous Improvement... 
    Local area
    Worldwide
    Shift work

    Verint Systems

    Boston, MA
    5 days ago
  •  ...A leading open-source software company is looking for a Senior Software Engineer to support their Observability Service team. This role involves developing high-quality software features, collaborating with engineers, and optimizing user experience in large-scale distributed... 

    Red Hat

    Boston, MA
    9 hours ago
  • $121k - $157k

     ...Software Engineer II, ML Ops Somerville, MA About Generate:Biomedicines Generate:Biomedicines is a new kind of therapeutics company...  ...learning at Generate. You will develop reliable, scalable, and observable systems that enable efficient model training, evaluation,... 
    Shift work

    Generate Biomedicines

    Somerville, MA
    22 days ago
  • $99.6k - $223.4k

     ...Description - OCI Enterprise Engineering IC4 Principal Developer -...  ...on Agent AI, AI-assisted software engineering, and Harness-based...  ..., model orchestration, observability, governance, and production-ready...  ...enterprise architecture, AI/ML engineering, developer platforms... 
    Temporary work
    Flexible hours

    Oracle

    Boston, MA
    3 days ago
  • $133.65k - $220.68k

     ...Red Hat, LLC is seeking a Senior Software Engineer to join their Observability Service team in Boston, USA. This role focuses on high-quality software delivery, supporting integration, troubleshooting, and optimizing user experience for their logging stack. The ideal candidate... 

    Red Hat

    Boston, MA
    1 day ago
  • Staff Software Engineer - Virtual Warehouse Simulation. Locations: Waltham, MA. Time type: Full time. Posted on: Posted Yesterday. Job requisition...  ...with synthetic data generation for training perception or ML models. Understanding of navigation and localization concepts... 
    Long distance

    Boston Dynamics

    Waltham, MA
    8 hours ago
