Tech Lead, Data & Inference Engineer

Catalyst Labs, LLC

San Jose, California, United States

About the Job

Our client is a fast-moving, venture-backed advertising technology startup based in San Francisco. They have raised $12 million in funding and are transforming how business-to-business (B2B) marketers reach their ideal customers. Their identity resolution technology blends business and consumer signals to convert static audience lists into high-match, cross-channel segments without the use of cookies. By transforming first-party and third-party data into precisely targetable audiences across platforms such as Meta, Google and YouTube, they enable marketing teams to achieve higher match rates, reduce wasted advertising spend and accelerate pipeline growth. With a strong understanding of how business buyers behave in channels traditionally focused on business-to-consumer (B2C) activity, they are redefining how B2B brands scale demand generation and account-based efforts.

About Us

Catalyst Labs is a leading talent agency with a specialized vertical in Applied AI, Machine Learning, and Data Science. We stand out as an agency that is deeply embedded in our clients' recruitment operations.

We collaborate directly with Founders, CTOs, and Heads of AI in these fields who are driving the next wave of applied intelligence, from model optimization to productized AI workflows. We take pride in facilitating conversations that align with your technical expertise, creative problem-solving mindset, and long-term growth trajectory in the evolving world of intelligent systems.

Location

San Francisco

Work Type

Full Time

Compensation

Above-market base + bonus + equity

Roles & Responsibilities

Lead the design, development and scaling of an end-to-end data platform, from ingestion to insights, ensuring that data is fast, reliable and ready for business use.
Build and maintain scalable batch and streaming pipelines, transforming diverse data sources and third-party APIs into trusted, low-latency systems.
Take full ownership of reliability, cost and service level objectives (SLOs), including achieving 99.9 percent uptime, maintaining minutes-level latency and optimizing cost per terabyte. Conduct root cause analysis and deliver lasting fixes.
Operate inference pipelines that enrich data, including scoring and quality assurance using large language models (LLMs) and retrieval-augmented generation (RAG). Manage version control, caching and evaluation loops.
Work across teams to deliver data as a product by creating clear data contracts, ownership models, lifecycle processes and usage-based decision making.
Guide architectural decisions across the data lake and the entire pipeline stack. Document lineage, trade-offs and reversibility while making practical build-versus-buy decisions.
Scale integrations with APIs and internal services while ensuring data consistency, high data quality and support for both real-time and batch-oriented use cases.
Mentor engineers, review code and raise the overall technical standard across teams. Promote data-driven best practices throughout the organization.

Qualifications

Bachelor's or Master's degree in Computer Science, Computer Engineering, Electrical Engineering, or Mathematics.
Excellent written and verbal communication; proactive and collaborative mindset.
Comfortable in hybrid or distributed environments with strong ownership and accountability.
A founder-level bias for action: able to identify bottlenecks, automate workflows, and iterate rapidly based on measurable outcomes.
Demonstrated ability to teach, mentor, and document technical decisions and schemas clearly.

Core Experience

6 to 12 years of experience building and scaling production-grade data systems, with deep expertise in data architecture, modeling, and pipeline design.
Expert SQL (query optimization on large datasets) and Python skills.
Hands-on experience with distributed data technologies (Spark, Flink, Kafka) and modern orchestration tools (Airflow, Dagster, Prefect).
Familiarity with dbt, DuckDB, and the modern data stack; experience with IaC, CI/CD, and observability.
Exposure to Kubernetes and cloud infrastructure (AWS, GCP, or Azure).
Bonus: Strong Node.js skills for faster onboarding and system integration.
Previous experience at a high-growth startup (10 to 200 people) or early-stage environment with a strong product mindset.

Vacancy posted 19 hours ago
