Sr. Lead AI Engineer (Inference Optimization, FM Hosting, AI Platform)

Capital One

At Capital One, we are creating responsible and reliable AI systems, changing banking for good. For years, Capital One has been an industry leader in using machine learning to create real-time, personalized customer experiences. Our investments in technology infrastructure and world-class talent, along with our deep experience in machine learning, position us at the forefront of enterprises leveraging AI. From informing customers about unusual charges to answering their questions in real time, our applications of AI and ML are bringing humanity and simplicity to banking. We are committed to continuing to build world-class applied science and engineering teams to deliver our industry-leading capabilities with breakthrough product experiences and scalable, high-performance AI infrastructure. At Capital One, you will help bring the transformative power of emerging AI capabilities to reimagine how we serve our customers and businesses who have come to love the products and services we build.

The Intelligent Foundations and Experiences (IFX) team is at the center of bringing our vision for AI at Capital One to life. We work hand-in-hand with our partners across the company to advance the state of the art in science and AI engineering, and we build and deploy proprietary solutions that are central to our business and deliver value to millions of customers. Our AI models and platforms empower teams across Capital One to enhance their products with the transformative power of AI, in responsible and scalable ways for the highest leverage impact.

In this role, you will:

Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products that change how our associates work and how our customers interact with Capital One.
Design, develop, test, deploy, and support AI software components, including foundation model training, large language model inference, similarity search, guardrails, model evaluation, experimentation, governance, and observability.
Leverage a broad stack of open-source and SaaS AI technologies such as AWS UltraClusters, Hugging Face, VectorDBs, NeMo Guardrails, PyTorch, and more.
Invent and introduce state-of-the-art LLM optimization techniques to improve the performance (scalability, cost, latency, throughput) of large-scale production AI systems.
Contribute to the technical vision and the long-term roadmap of foundational AI systems at Capital One.

The Ideal Candidate:

You love to build systems, take pride in the quality of your work, and also share our passion to do the right thing. You want to work on problems that will help change banking for good.
You are passionate about staying abreast of the latest research, and you can intuitively understand scientific publications and judiciously apply novel techniques in production.
You adapt quickly and thrive on bringing clarity to big, undefined problems. You love asking questions and digging deep to uncover the root of problems, and you can articulate your findings concisely and clearly. You have the courage to share new ideas even when they are unproven.
You are deeply technical. You possess a strong foundation in engineering and mathematics, and your expertise in hardware, software, and AI enables you to see and exploit optimization opportunities that others miss.
You are a resilient trailblazer who can forge new paths to achieve business goals when the route is unknown.

Basic Qualifications:

Bachelor's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 6 years of experience developing AI and ML algorithms or technologies, or a Master's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 4 years of experience developing AI and ML algorithms or technologies
At least 6 years of experience programming with Python, Go, Scala, or Java

Preferred Qualifications:

7 years of experience deploying scalable and responsible AI solutions on cloud platforms (e.g. AWS, Google Cloud, Azure, or equivalent private cloud)
Experience designing, developing, integrating, delivering, and supporting complex AI systems
Demonstrated ability to lead and mentor an engineering team and influence cross-functional stakeholders
Experience developing AI and ML algorithms or technologies (e.g. LLM Inference, Similarity Search and VectorDBs, Guardrails, Memory) using Python, C++, C#, Java, or Golang
Experience developing and applying state-of-the-art techniques for optimizing training and inference software to improve hardware utilization, latency, throughput, and cost
Passion for staying abreast of the latest AI research and AI systems, and for judiciously applying novel techniques in production
Excellent communication and presentation skills, with the ability to articulate complex AI concepts to peers

Vacancy posted 21 hours ago