Senior Software Engineer, Inference

PIKA Inc

About the Role

We are seeking a Senior Inference Engineer to accelerate the performance of Pika's AI-driven products. In this highly technical role, you will operate at the intersection of cutting‑edge inference acceleration, GPU parallelism, advanced model deployment, and video generation technologies. Your expertise will drive significant improvements to model speed and efficiency, ensuring our creative AI systems deliver industry‑leading user experiences at scale.

You will design and optimize inference pipelines, implement state‑of‑the‑art acceleration techniques, and work closely with researchers and engineers across the team to push the boundaries of what’s possible in real‑time AI deployment. Your efforts will play a foundational role in powering the next generation of Pika’s video and language models.

What You’ll Do

Accelerate Inference: Lead and implement advanced inference acceleration techniques, including attention optimization and quantization for efficient model serving.
Maximize GPU Parallelism: Engineer and optimize GPU strategies across tensor, sequence, and pipeline parallelism (TP, SP, PP) for maximal efficiency and scalability.
Programming for Performance: Develop and optimize high‑performance computing kernels and distributed workloads using CUDA and NCCL.
Advance AI Deployment: Collaborate with research and engineering teams to bring state‑of‑the‑art videogen and large language models into production.
Improve Training Efficiency (Bonus): Contribute to improvements in model training speed, stability, and resource utilization as part of our deployment lifecycle.
Technical Excellence: Drive rigorous code reviews, participate in technical discussions, and mentor fellow engineers on best practices in inference and GPU programming.

What We’re Looking For

Experience: 5+ years engineering experience, with a strong track record in inference acceleration and model deployment at scale.
Inference Mastery: Proven expertise in inference optimization, including quantization, attention acceleration, and deep learning compiler stacks.
GPU & Parallelism: Deep knowledge of GPU programming (CUDA, NCCL) and experience with SP, TP, PP, and other forms of parallelism for distributed inference.
AI Domain Knowledge: Familiarity with video generation (videogen) models and large language models (LLMs).
Collaboration: Strong cross‑discipline communication skills; able to drive shared goals across research and engineering functions.
Ownership Mindset: Self‑driven, solutions‑oriented, and capable of managing ambiguity in a fast‑paced startup environment.
Bonus: Experience in enhancing training efficiency, stability, or resource optimization for large models.

Nice to Have

Experience with high‑throughput video or real‑time streaming model deployment
Familiarity with distributed training and optimization toolkits
Contributions to open source projects in AI infrastructure or deep learning compilers
Startup or rapid prototyping experience

What We Offer

Competitive salary in the AI industry
Equity in a fast‑growing startup shaping the future of AI
Comprehensive health benefits, monthly stipends, company retreats
A supportive and collaborative office culture: we’re all building and launching together

About Pika

At Pika, we're crafting a future where video creation is seamless, intuitive, and universally accessible. Our mission is to empower creativity by breaking down technical barriers using the transformative power of AI. We’re a tight‑knit, energetic team based in Palo Alto, CA, valuing efficiency, curiosity, and the ambition to make a meaningful impact on the world. We work from our Palo Alto office 3–5 days a week and welcome applicants who are eager to contribute onsite.


Vacancy posted 2 days ago

Similar jobs that could be interesting for you
Based on the Senior Software Engineer, Inference in Palo Alto, CA vacancy

Senior Software Engineer, Inference Platform Palo Alto
We’re looking for a Senior Engineer to help build the next-generation inference platform that supports embedding models used for semantic search, retrieval, and... ...backend or infrastructure systems at scale Strong software engineering skills in languages such as Go, Rust,...
Senior
Local area
Worldwide
MongoDB
Palo Alto, CA
2 days ago
Senior Software Engineer, Deep Learning Inference - TensorRT
$152k - $241.5k
Senior Software Engineer - Deep Learning Inference What you’ll be doing: Craft and develop robust inferencing software that can be scaled to multiple platforms for functionality and performance Develop components of TensorRT, NVIDIA’s SDK for high-performance deep learning...
Senior
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior Software Engineer - AI Inference
$152k - $241.5k
NVIDIA is the platform upon which every new AI‑powered application is built. We are seeking a Senior Software Engineer - AI Inference to advance open‑source LLM serving by contributing directly to upstream inference engines like vLLM and SGLang, ensuring they run best‑in...
Senior
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior Software Development Engineer - SGLang and Inference Stack
...RL training and SOTA LLM and Multimodal inference at scale across multi-GPU and multi-node... ...You will collaborate across internal GPU software teams and engage with open-source... ...software ecosystem. THE PERSON: Skilled engineer with strong technical and analytical expertise in...
Senior
Advanced Micro Devices
Santa Clara, CA
13 hours ago
Senior Staff Software Engineer - High Performance GPU Inference Systems
$248.71k - $292.6k
About Groq Groq delivers fast, efficient AI inference. Our LPU-based system powers GroqCloud™, giving businesses and developers the... ...within reach, anything is possible. Build fast. Sr. Staff Software Engineer - High Performance GPU Inference Systems Mission Push the limits...
Senior
I did my part and supported the Regular Toilet
Palo Alto, CA
3 days ago
Senior Software Engineer, AI Inference Systems
$184k - $287.5k
Position Overview We are seeking highly skilled and motivated software engineers to join us and build AI inference systems that serve large-scale models with extreme efficiency. You’ll architect and implement high-performance inference stacks, optimize GPU kernels and...
Senior
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior Inference Platform Engineer — Low-Latency, Multi-Tenant
A leading data platform company in Palo Alto seeks a Senior Engineer to develop a cutting-edge inference platform supporting semantic search and AI-native experiences. The ideal candidate will have over five years of experience in backend systems and proficiency in languages...
Senior
MongoDB
Palo Alto, CA
2 days ago
Senior ML Inference Platform Engineer (Remote)
Israelvcforum is looking for a Senior ML Infrastructure Engineer in Mountain View, California. This position... ...and scale robust platforms for ML inference workflows supporting GM’s AI efforts... ...serving strategies and handle backend software components. The position demands 5+...
Senior
Remote job
Israelvcforum
Mountain View, CA
3 days ago
Senior Software Engineer
$180k - $258.75k
...Diffusion Policy and Large Behavior Models. We are looking for a Senior Software Engineer to join our end-to-end automated driving team, supporting... ...and Python, that supports ML training, evaluation, and inference workflows. Build and maintain ML tooling for dataset...
Senior
Local area
Shift work
Toyota Research Institute
Los Altos, CA
23 hours ago
Senior ML Inference Engineer - Platform
$128.7k - $261.3k
About the Team The Model Deployment & Inference Solutions team in GM AV deploys machine learning... ...currently performed manually by engineers. Build the developer experience that ML... ...Experience designing clean, well‑tested software with clear interfaces and good abstractions...
Senior
Local area
Remote work
Flexible hours
Shift work
General Motors
Mountain View, CA
2 days ago
Senior ML Infrastructure Engineer, Inference Platform
$155.42k - $395.9k
...Description About the Team: The ML Inference Platform is part of the AV ML Infrastructure... .... About the Role: We are seeking a Senior ML Infrastructure engineer to help build and scale robust... ...and implement core platform backend software components. Collaborate with ML...
Senior
Local area
Remote work
Relocation
Relocation package
Flexible hours
Israelvcforum
Mountain View, CA
3 days ago
Senior Software Engineer, Inference
$152k - $204k
...Nasdaq: CRWV) in March 2025. Learn more at What You'll Do: Senior engineers are area owners who lead designs, raise engineering... ...orchestration, and hardware teams to evolve our Kubernetes-native inference platform and meet strict P99 SLAs at scale. About the role...
Senior
Permanent employment
Temporary work
Casual work
Work at office
Flexible hours
Shift work
CoreWeave
Sunnyvale, CA
7 days ago
Senior GPU AI Inference Engineer - Triton & Dynamo
A leading technology company is seeking a Senior System Software Engineer to develop GPU-accelerated AI inference serving software. The ideal candidate will have over 5 years of experience with deep learning software, strong skills in Rust and C++, and a collaborative approach...
Senior
NVIDIA Corporation
Santa Clara, CA
4 days ago
Senior AI Kernel & Inference Engineer
A leading technology company is seeking a Senior AI Software Engineer to join their team in Santa Clara, California. In this role, you will innovate and develop groundbreaking AI systems software for inference applications including deep learning framework optimizations...
Senior
NVIDIA
Santa Clara, CA
23 hours ago
Senior AI Inference Kernel Engineer
$184k - $287.5k
NVIDIA Gruppe in Santa Clara is seeking an AI Systems Engineer to innovate and develop cutting-edge technologies in the AI inference software stack. Candidates should hold a Master's degree and possess over 6 years of experience in ML/DL systems development. The role involves...
Senior
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior AI Inference Performance Engineer (GPU/Cluster)
$152k - $241.5k
...seeking a talented individual to optimize and benchmark GenAI inference using the latest acceleration technologies. The role involves... ...Required qualifications include a relevant degree and significant software development experience in Python or C++. A deep understanding...
Senior
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior AI Systems Engineer: Inference Kernels & Runtimes
$184k - $287.5k
NVIDIA Gruppe is seeking talented AI systems engineers to advance innovative technologies in AI inference systems software. This role involves developing cutting-edge libraries, code generators, and kernel technologies for NVIDIA's architecture, emphasizing high-impact...
Senior
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior AI Systems Performance Engineer: Drive SOTA Inference
A leader in AI technology in Palo Alto is seeking a Senior AI Systems Performance Engineer to optimize the latest foundation models on their innovative platform. This role involves collaborating with cross-functional teams to push the performance limits of AI systems....
Senior
SambaNova
Palo Alto, CA
23 hours ago
Senior Software Engineer, Behavior Planning
$167.2k - $250.8k
...with our generalized AI‑first self‑driving software. Built to learn and improve through data... .... We are looking for strong software engineers to research, develop, and implement technologies... ...systems, distributed training, or inference optimization. Equal‑Opportunity...
Senior
Immediate start
Kindredventures
Mountain View, CA
2 days ago
Senior Software Engineer - Zero Trust for Agentic AI
$140k - $200k
Senior Software Engineer - Zero Trust for Agentic AI Cyberattacks on critical infrastructure, government, and private enterprises are at an all... ...Guard, NeMo Guardrails), and implementing low‑latency inference at the proxy/gateway level. AI/LLM Integration & Protocols...
Senior
Contract work
Worldwide
Visa sponsorship
Xage
Palo Alto, CA
3 days ago
Senior AI Inference Engineer - High-Performance LLM Serving
$152k - $241.5k
NVIDIA Gruppe is seeking a Senior Software Engineer - AI Inference in Santa Clara, California. This role involves enhancing open-source LLM serving optimizations and implementing high-performance runtime capabilities. Candidates should have 5+ years of experience in building...
Senior
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior Staff Engineer — AI Inference & Cloud Infra
$230k - $250k
Cerebras Systems is seeking a Sr. Member of Technical Staff in Sunnyvale, CA. This role involves designing resilient software features for cloud-based AI inference, leveraging AWS tools and services. Candidates should have a Master’s degree in Computer Science and experience...
Senior
Cerebras Systems
Sunnyvale, CA
2 days ago
Senior AI Inference Systems Engineer: GPU-Optimized, Cloud
$184k - $356.5k
NVIDIA Gruppe is looking for skilled software engineers to develop AI inference systems that operate with high efficiency. The role involves architecting high-performance inference frameworks and optimizing GPU processes. Ideal candidates should have extensive programming...
Senior
NVIDIA Gruppe
Santa Clara, CA
3 days ago
Senior Inference Engineer: Real-Time Video AI on GPUs
Pika is looking for a Senior Inference Engineer in Palo Alto to enhance the performance of AI-driven products. This pivotal role involves designing and optimizing inference pipelines, applying advanced techniques to improve model speed and efficiency. The ideal candidate...
Senior
Pika
Palo Alto, CA
2 days ago
Senior Platform Engineer, Inference & Kubernetes
Cerebras is seeking a Software Engineer to join our Inference Platform team in Sunnyvale, California. This role involves developing and leading projects that integrate cloud and ML components. You will contribute to shaping the technical direction and improve system performance...
Senior
Cerebras
Sunnyvale, CA
3 days ago
Simulation Runtime Software Engineer (Senior)
...structured physics data Running billion-voxel inference in production Tier-1 semiconductor and... ...problem to a runtime environment is the engine of our product. Making our simulations... ...such as Ray Engineering Expectations Software engineering fundamentals Comfortable...
Senior
Vinci4d
Palo Alto, CA
23 hours ago
Senior Systems Engineering
...infrastructure company in California is seeking a Member of Technical Staff — Inference to design and optimize large-scale AI inference systems. The role demands 5+ years in systems engineering and expertise in large-scale inference systems. Successful candidates will...
Senior
Flexible hours
RadixArk
Palo Alto, CA
1 day ago
Senior Software Engineer
...Description Our Data-infra team is looking for a Senior Backend Developer with a passion for solving complex scaling problems with... ...and frameworks. Requirements 5+ years of experience in backend engineering in an agile environment Experienced with traffic intensive systems...
Senior
CTERA Networks Ltd
Palo Alto, CA
4 days ago
Senior Software Engineering Manager
...Your Role Design, develop, test, deploy, maintain, and enhance software as part of an interdisciplinary team. Manage individual project... ...peers in a constructive manner. Collaborate with 219ers across engineering disciplines during development. Advise less experienced engineers...
Senior
Flexible hours
219 Design
Mountain View, CA
1 day ago
Senior Software Engineer
...by the end of the 2024-2025 academic school year. Our team is engineering driven and product first. In order to maximize our product iteration... ..., and secure Setup tools and processes that promote great software development practices Imagine, design, deploy, and iterate new...
Senior
Immediate start
I did my part and supported the Regular Toilet
Palo Alto, CA
23 hours ago
