
Member of Technical Staff, Data Infrastructure

Inception LLC

Inception creates the world's fastest, most efficient AI models. Our Mercury model is the world's fastest reasoning LLM and first commercially available diffusion LLM, delivering 5x greater speed and efficiency than today's LLMs, with best-in-class quality. We are the AI researchers and engineers behind such breakthrough AI technologies as diffusion models, flash attention, and DPO.

The Role

We seek experienced engineers to architect and scale the core infrastructure behind distributed training pipelines and petabyte-scale data catalogs. You'll work directly with researchers to accelerate experiments, develop new datasets, improve infrastructure efficiency, and enable key insights across our data assets.

Key Responsibilities
  • Design, build, and operate scalable, fault-tolerant infrastructure for LLM research: distributed compute, data orchestration, and storage across modalities.
  • Develop high-throughput systems for data ingestion, processing, and transformation — including training data catalogs, deduplication, quality checks, and search.
  • Build systems for web crawling, data ingestion, and real-time data processing to support model training operations.
  • Develop tools and frameworks for efficient data storage, retrieval, and versioning across distributed systems.
  • Ensure data collection adheres to privacy regulations.

Qualifications
  • BS/MS/PhD in Computer Science, Machine Learning, or a related field (or equivalent experience).
  • 3+ years of experience building data processing pipelines at scale, particularly with AI/ML applications.
  • Strong proficiency in Python and experience with data processing frameworks (Apache Spark, Beam, Airflow).
  • Familiarity with synthetic data generation techniques and data augmentation strategies.
  • Familiarity with web scraping, crawling technologies, and Common Crawl datasets.
  • Solid understanding of machine learning fundamentals and experience with ML frameworks (PyTorch, TensorFlow).
  • Experience with SQL and NoSQL databases for managing structured and unstructured data.

Preferred Skills
  • Experience with large language models and understanding of tokenization, embeddings, and model architectures.
  • Experience managing human annotation workflows and quality control processes.
  • Experience with vector databases and embedding-based retrieval systems.
  • Knowledge of data privacy regulations and ethical AI practices.
  • Experience with distributed computing and large-scale data storage systems (HDFS, S3, BigQuery).

Why Join Inception
  • Work with World-Class Talent: Collaborate with the inventors of diffusion models and leading AI researchers.
  • Shape Foundational Technology: Your decisions will influence how the next generation of AI products is built and used.
  • Immediate Impact: Join at the ground floor, where your contributions directly shape product direction and company trajectory.

Perks & Benefits
  • Competitive salary and equity in a rapidly growing startup
  • Flexible vacation and paid time off (PTO)
  • Health, dental, and vision insurance
  • Catered meals (breakfast, lunch, & dinner)
  • Commuter subsidies
  • A collaborative and inclusive culture

Inception creates the world's fastest, most efficient AI models. Today's autoregressive LLMs generate tokens sequentially, which makes them painfully slow and expensive. Inception's diffusion-based LLMs (dLLMs) generate answers in parallel. They are 5x faster and more efficient, while delivering best-in-class quality.

Inception was co-founded by Stanford professor Stefano Ermon, who co-invented such breakthrough AI technologies as diffusion models, flash attention, and DPO; UCLA professor Aditya Grover, who co-invented node2vec, decision transformers, and d1 reasoning; and Cornell professor and Afresh co-founder Volodymyr Kuleshov, who co-invented MDLM and Block Diffusion. We pioneered the application of diffusion to language with the world's first (and only) commercially available dLLM, Mercury. We are currently deploying our large-scale diffusion LLMs at Fortune 500 companies. Diffusion is the technology behind today's image and video AI, and we're making it the standard for LLMs as well.

Our team includes engineers from AWS, Google DeepMind, Meta AI, Microsoft, HashiCorp, and OpenAI. Based in Palo Alto, CA, we are backed by top-tier venture capitalists, including Menlo Ventures, Mayfield, M12 (Microsoft's venture fund), Snowflake Ventures, Databricks, and Innovation Endeavors, and by tech luminaries such as Andrew Ng, Andrej Karpathy, and Eric Schmidt.

If you are talented, innovative, and ambitious, come help us invent the future of AI. We are an equal opportunity employer and encourage candidates of all backgrounds to apply.

Vacancy posted 1 hour ago
