Sr. Lead AI Engineer (Inference Optimization, FM Hosting, AI Platform)
Capital One
At Capital One, we are creating responsible and reliable AI systems, changing banking for good. For years, Capital One has been an industry leader in using machine learning to create real-time, personalized customer experiences. Our investments in technology infrastructure and world-class talent, along with our deep experience in machine learning, position us to be at the forefront of enterprises leveraging AI. From informing customers about unusual charges to answering their questions in real time, our applications of AI & ML are bringing humanity and simplicity to banking. We are committed to continuing to build world-class applied science and engineering teams to deliver our industry-leading capabilities with breakthrough product experiences and scalable, high-performance AI infrastructure. At Capital One, you will help bring the transformative power of emerging AI capabilities to reimagine how we serve our customers and businesses who have come to love the products and services we build.
The Intelligent Foundations and Experiences (IFX) team is at the center of bringing our vision for AI at Capital One to life. We work hand-in-hand with our partners across the company to advance the state of the art in science and AI engineering, and we build and deploy proprietary solutions that are central to our business and deliver value to millions of customers. Our AI models and platforms empower teams across Capital One to enhance their products with the transformative power of AI, in responsible and scalable ways for the highest leverage impact.
In this role, you will:
- Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products that change how our associates work and how our customers interact with Capital One.
- Design, develop, test, deploy, and support AI software components including foundation model training, large language model inference, similarity search, guardrails, model evaluation, experimentation, governance, and observability.
- Leverage a broad stack of open-source and SaaS AI technologies such as AWS UltraClusters, Hugging Face, vector databases, NeMo Guardrails, PyTorch, and more.
- Invent and introduce state-of-the-art LLM optimization techniques to improve the performance (scalability, cost, latency, throughput) of large-scale production AI systems.
- Contribute to the technical vision and the long-term roadmap of foundational AI systems at Capital One.
The Ideal Candidate:
- You love to build systems, take pride in the quality of your work, and also share our passion to do the right thing. You want to work on problems that will help change banking for good.
- You have a passion for staying abreast of the latest research, and an ability to intuitively understand scientific publications and judiciously apply novel techniques in production.
- You adapt quickly and thrive on bringing clarity to big, undefined problems. You love asking questions and digging deep to uncover the root of problems, and you can articulate your findings concisely and clearly. You have the courage to share new ideas even when they are unproven.
- You are deeply technical. You possess a strong foundation in engineering and mathematics, and your expertise in hardware, software, and AI enables you to see and exploit optimization opportunities that others miss.
- You are a resilient trailblazer who can forge new paths to achieve business goals when the route is unknown.
Basic Qualifications:
- Bachelor's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 6 years of experience developing AI and ML algorithms or technologies, or a Master's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 4 years of experience developing AI and ML algorithms or technologies
- At least 6 years of experience programming with Python, Go, Scala, or Java
Preferred Qualifications:
- 7 years of experience deploying scalable and responsible AI solutions on cloud platforms (e.g. AWS, Google Cloud, Azure, or equivalent private cloud)
- Experience designing, developing, integrating, delivering, and supporting complex AI systems
- Demonstrated ability to lead and mentor an engineering team and influence cross-functional stakeholders
- Experience developing AI and ML algorithms or technologies (e.g. LLM Inference, Similarity Search and VectorDBs, Guardrails, Memory) using Python, C++, C#, Java, or Golang
- Experience developing and applying state-of-the-art techniques for optimizing training and inference software to improve hardware utilization, latency, throughput, and cost
- Passion for staying abreast of the latest AI research and AI systems, and for judiciously applying novel techniques in production
- Excellent communication and presentation skills, with the ability to articulate complex AI concepts to peers
$229.9k - $262.4k


