Senior Software Engineer, Inference
PIKA Inc
About the Role

We are seeking a Senior Inference Engineer to accelerate the performance of Pika's AI-driven products. In this highly technical role, you will operate at the intersection of cutting-edge inference acceleration, GPU parallelism, advanced model deployment, and video generation technologies. Your expertise will drive significant improvements to model speed and efficiency, ensuring our creative AI systems deliver industry-leading user experiences at scale.

You will design and optimize inference pipelines, implement state-of-the-art acceleration techniques, and work closely with researchers and engineers across the team to push the boundaries of what’s possible in real-time AI deployment. Your efforts will play a foundational role in powering the next generation of Pika’s video and language models.

What You’ll Do

- Accelerate Inference: Lead and implement advanced inference acceleration techniques, including attention optimization and quantization for efficient model serving.
- Maximize GPU Parallelism: Engineer and optimize GPU strategies across tensor, sequence, and pipeline parallelism (TP, SP, PP) for maximal efficiency and scalability.
- Programming for Performance: Develop and optimize high-performance computing kernels and distributed workloads using CUDA and NCCL.
- Advance AI Deployment: Collaborate with research and engineering teams to bring state-of-the-art videogen and large language models into production.
- Improve Training Efficiency (Bonus): Contribute to improvements in model training speed, stability, and resource utilization as part of our deployment lifecycle.
- Technical Excellence: Drive rigorous code reviews, participate in technical discussions, and mentor fellow engineers on best practices in inference and GPU programming.

What We’re Looking For

- Experience: 5+ years engineering experience, with a strong track record in inference acceleration and model deployment at scale.
- Inference Mastery: Proven expertise in inference optimization, including quantization, attention acceleration, and deep learning compiler stacks.
- GPU & Parallelism: Deep knowledge of GPU programming (CUDA, NCCL) and experience with SP, TP, PP, and other forms of parallelism for distributed inference.
- AI Domain Knowledge: Familiarity with video generation (videogen) models and large language models (LLMs).
- Collaboration: Strong cross-discipline communication skills; able to drive shared goals across research and engineering functions.
- Ownership Mindset: Self-driven, solutions-oriented, and capable of managing ambiguity in a fast-paced startup environment.
- Bonus: Experience in enhancing training efficiency, stability, or resource optimization for large models.

Nice to Have

- Experience with high-throughput video or real-time streaming model deployment
- Familiarity with distributed training and optimization toolkits
- Contributions to open source projects in AI infrastructure or deep learning compilers
- Startup or rapid prototyping experience

What We Offer

- Competitive salary in the AI industry
- Equity in a fast-growing startup shaping the future of AI
- Comprehensive health benefits, monthly stipends, company retreats
- A supportive and collaborative office culture: we’re all building and launching together

About Pika

At Pika, we're crafting a future where video creation is seamless, intuitive, and universally accessible. Our mission is to empower creativity by breaking down technical barriers using the transformative power of AI. We’re a tight-knit, energetic team based in Palo Alto, CA, valuing efficiency, curiosity, and the ambition to make a meaningful impact on the world. We work from our Palo Alto office 3–5 days a week and welcome applicants who are eager to contribute onsite.

