Principal Software Engineer - CoreAI Model Inference & Serving

$139.9k - $274.8k

Microsoft Corporation

Overview
Join our team within CoreAI, where we are building the AI data plane that powers all LLM inferencing workloads across Microsoft and Azure customers, from cutting-edge startups to Fortune 500 enterprises. Our converged AI fabric delivers inference capabilities for all LLMs in the Microsoft catalog, including OpenAI, Anthropic, Mistral, Cohere, Llama, and more. As a Principal Software Engineer, you will shape the future of one of the largest and fastest-growing services in Azure, foundational to Microsoft's AI strategy. Our mission is to serve models at scale, reliably, efficiently, and with ultra-low latency, enabling a rich set of AI-powered product experiences. This is a rapidly evolving space with immense opportunities to learn, innovate, and drive industry-wide impact!


Responsibilities
  • Be a hands-on technical leader, designing, coding, and shipping core serving systems, smart routing, and request distribution for a broad portfolio of LLMs, including OpenAI, Mistral, Grok, DeepSeek, and others.
  • Build large-scale AI services and platform capabilities that power new products and customer experiences.
  • Drive cutting-edge innovation in AI systems alongside world-class engineers and cross-functional partners.
  • Lead through architecture, code reviews, mentorship, and technical excellence while staying close to implementation.
  • Improve reliability, scalability, observability, efficiency, and performance across mission-critical services.

Qualifications
Required/Minimum Qualifications:
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Java
    • OR equivalent experience.
Other Requirements:
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
    • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Preferred/Additional Qualifications:
  • 4+ years of design and problem-solving experience, with understanding of system performance, scalability, and engineering best practices.
  • Understanding of distributed systems, specifically request serving at scale (e.g., inferencing, L7 gateways, high-performance storage, distributed databases across global-scale infrastructure).
  • Demonstrated experience in building high-quality, reliable systems at scale.
  • Experience using modern AI-assisted development tools and workflows to move faster, improve quality, and amplify engineering impact.
  • Customer-obsessed approach to problem solving, with empathy and a drive to deliver impactful solutions.
#AIPLATFORM #azureai #coreai #genai #aiinference


Software Engineering IC5 - The typical base pay range for this role across the U.S. is USD $139,900 - $274,800 per year. A different range applies to specific work locations within the San Francisco Bay area and the New York City metropolitan area; the base pay range for this role in those locations is USD $188,000 - $304,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.

Vacancy posted 3 days ago
