Member of Technical Staff, Data Infrastructure
Inception LLC
Inception creates the world's fastest, most efficient AI models. Our Mercury model is the world's fastest reasoning LLM and first commercially available diffusion LLM, delivering 5x greater speed and efficiency than today's LLMs, with best-in-class quality. We are the AI researchers and engineers behind such breakthrough AI technologies as diffusion models, flash attention, and DPO.
The Role
We seek experienced engineers to architect and scale the core infrastructure behind distributed training pipelines and petabyte-scale data catalogs. You'll work directly with researchers to accelerate experiments, develop new datasets, improve infrastructure efficiency, and enable key insights across our data assets.
Key Responsibilities
- Design, build, and operate scalable, fault-tolerant infrastructure for LLM research: distributed compute, data orchestration, and storage across modalities.
- Develop high-throughput systems for data ingestion, processing, and transformation, including training data catalogs, deduplication, quality checks, and search.
- Build systems for web crawling, data ingestion, and real-time data processing to support model training operations.
- Develop tools and frameworks for efficient data storage, retrieval, and versioning across distributed systems.
- Ensure data collection adheres to privacy regulations.
Qualifications
- BS/MS/PhD in Computer Science, Machine Learning, or a related field (or equivalent experience).
- 3+ years of experience building data processing pipelines at scale, particularly with AI/ML applications.
- Strong proficiency in Python and experience with data processing frameworks (Apache Spark, Beam, Airflow).
- Familiarity with synthetic data generation techniques and data augmentation strategies.
- Familiarity with web scraping, crawling technologies, and Common Crawl datasets.
- Solid understanding of machine learning fundamentals and experience with ML frameworks (PyTorch, TensorFlow).
- Experience with SQL and NoSQL databases for managing structured and unstructured data.
Preferred Skills
- Experience with large language models and understanding of tokenization, embeddings, and model architectures.
- Experience managing human annotation workflows and quality control processes.
- Experience with vector databases and embedding-based retrieval systems.
- Knowledge of data privacy regulations and ethical AI practices.
- Experience with distributed computing and large-scale data storage systems (HDFS, S3, BigQuery).
Why Join Inception
- Work with World-Class Talent: Collaborate with the inventors of diffusion models and leading AI researchers.
- Shape Foundational Technology: Your decisions will influence how the next generation of AI products is built and used.
- Immediate Impact: Join at the ground floor, where your contributions directly shape product direction and company trajectory.
Perks & Benefits
- Competitive salary and equity in a rapidly growing startup
- Flexible vacation and paid time off (PTO)
- Health, dental, and vision insurance
- Catered meals (breakfast, lunch, & dinner)
- Commuter subsidies
- A collaborative and inclusive culture
Inception creates the world's fastest, most efficient AI models. Today's autoregressive LLMs generate tokens sequentially, which makes them painfully slow and expensive. Inception's diffusion-based LLMs (dLLMs) generate answers in parallel. They are 5x faster and more efficient, while delivering best-in-class quality.
Inception was co-founded by Stanford professor Stefano Ermon, who co-invented such breakthrough AI technologies as diffusion models, flash attention, and DPO; UCLA professor Aditya Grover, who co-invented node2vec, decision transformers, and d1 reasoning; and Cornell professor and Afresh co-founder Volodymyr Kuleshov, who co-invented MDLM and Block Diffusion. We pioneered the application of diffusion to language with the world's first (and only) commercially available dLLM, Mercury. We are currently deploying our large-scale diffusion LLMs at Fortune 500 companies. Diffusion is the technology behind today's image and video AI, and we're making it the standard for LLMs as well.
Our team includes engineers from AWS, Google DeepMind, Meta AI, Microsoft, HashiCorp, and OpenAI. Based in Palo Alto, CA, we are backed by top-tier venture capitalists, including Menlo Ventures, Mayfield, M12 (Microsoft's venture fund), Snowflake Ventures, Databricks, and Innovation Endeavors, and by tech luminaries such as Andrew Ng, Andrej Karpathy, and Eric Schmidt.
If you are talented, innovative, and ambitious, come help us invent the future of AI. We are an equal opportunity employer and encourage candidates of all backgrounds to apply.

