Software Engineer, ML & Data Infra

$180k
Full-time

xAI

About xAI


xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. All employees are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.

About the Role


The ML and Data Infrastructure team is responsible for building the foundational infrastructure that powers frontier AI models and truth-seeking agents—from petabyte-scale data acquisition and multimodal crawling, to web-scale search/retrieval systems, reliable high-throughput inference serving, low-level GPU/kernel optimizations, compiler/runtime innovations, and high-speed interconnect fabrics for massive clusters. In this role, you will collaborate across pre-training, multimodal, reasoning, and product teams in a fast-paced, meritocratic environment where you will tackle ambiguous, high-stakes problems with first-principles thinking and rigorous execution.

Responsibilities



  • Design, build, and operate petabyte-to-exabyte scale distributed systems for data acquisition, web crawling, preprocessing, filtering/classification, and multimodal pipelines (CPU/GPU workloads).

  • Architect high-performance search/retrieval engines (vector/hybrid/semantic) at trillion-document scale, integrating with LLMs/agents for truth-seeking, low-hallucination reasoning, and real-time knowledge access.

  • Develop reliable inference serving infrastructure: load balancing, autoscaling, KV cache, batching, fault-tolerance, monitoring (Prometheus/Grafana), CI/CD (Buildkite/ArgoCD), and benchmarking for 100% uptime and optimal tail latency.

  • Optimize low-level performance: CUDA kernels (GeMM, attention), Triton/CUTLASS extensions, quantization/distillation/speculative decoding, GPU memory hierarchy, and model-hardware co-design for next-gen architectures.

  • Innovate on compilers/runtimes (JAX/XLA/MLIR, custom features for Hopper/Blackwell), distributed profiling/debugging tools, and interconnect fabrics (copper/optical, 1.6T+, SerDes/photonics, topology simulation, vendor roadmaps).

  • Manage complex workloads across clouds/clusters: orchestration (Kubernetes), data bookkeeping/verifiability, high-speed interconnect validation, failure analysis, and telemetry/automation for production reliability.

Required Qualifications



  • Strong systems engineering skills with proven impact on large-scale distributed infrastructure (data processing, search, inference, or cluster networking).

  • Proficiency in Python and at least one compiled language (Rust, C++, Go, Java); experience building bespoke libraries, optimizing performance, and debugging complex systems.

  • Hands-on experience with at least one key area: petabyte-scale data pipelines/crawling (Spark/Ray/Kubernetes), web-scale search/retrieval (vector DBs, ranking, RAG), inference optimization (SGLang, kernels, batching), compiler features (JAX/XLA), or high-speed interconnects (optical/copper, SerDes, signal integrity).

  • Deep understanding of distributed systems challenges: high throughput (ops/sec), latency/throughput tradeoffs, fault-tolerance, monitoring, and scaling production systems to billions of users or 100k+ GPUs.

  • Passion for AI infrastructure: keeping up with SOTA techniques, first-principles problem-solving, meticulous organization/bookkeeping, and delivering rigorous, high-quality results.

Preferred Qualifications



  • Experience with multimodal data (images/video/audio), epistemics/truth-seeking in retrieval, or agentic systems (long-horizon reasoning, feedback loops).

  • Low-level optimizations: CUDA kernel development (Tensor cores, attention), GPU profiling (Nsight), low-precision numerics, or interconnect pathfinding (LPO/LRO/CPO, photonics).

  • Production expertise in inference reliability (0% error target), CI/CD for ML, or cluster networking (topology, vendor collaboration, failure root-cause).

  • Track record owning end-to-end projects in hyperscale environments, with strong debugging, vendor management, or open-source contributions (e.g., SGLang).

Annual Salary Range


$180,000 - $440,000 USD

Benefits


Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short & long-term disability insurance, life insurance, and various other discounts and perks.

xAI is an equal opportunity employer. For details on data processing, view our 

Vacancy posted 1 day ago