Principal Software Engineer, CoreAI Workload Engines

$142.8k - $274.8k

Microsoft Corporation

Overview

The CoreAI Workloads team builds the foundational inference engines and APIs that power large-scale AI inference across Azure, from cutting-edge startups to Fortune 500 enterprises and Microsoft Copilots and agents. Our mission is to deliver secure, reliable, and highly efficient GPU inference that enables multitenant AI systems at global scale while maximizing utilization, performance, and developer productivity. We own inference serving and performance for OpenAI and other state-of-the-art large language models (LLMs) and work directly with OpenAI, serving some of the largest workloads on the planet with trillions of inferences per day. Our converged AI fabric and engines deliver inference capabilities for all LLMs in the Microsoft catalog, including OpenAI, Anthropic, Mistral, Cohere, Llama, and more.

This role sits at the intersection of LLM inference fleets, serving efficiency, rapid experimentation, cloud infrastructure, and systems software, working closely with CoreAI data plane, compute, and partner teams to deliver end-to-end efficiencies and platform capabilities.

In this role, you will have the opportunity to work on multiple levels of the AI software stack, including the fundamental abstractions, programming models, OpenAI and OSS engine runtimes, libraries, and application programming interfaces (APIs), to enable large-scale model inference.

You will drive production-grade inference serving improvements for OpenAI and open-source models across Azure, including benchmarking, performance measurement, and disciplined experimentation to improve latency, throughput, availability, and cost at scale. You will both (1) make hands-on engine changes and (2) contribute to the experimentation capabilities that make those changes measurable, safe to ship, and repeatable across teams.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Responsibilities

As the Principal Engineer on the team, your responsibilities include:

Optimize inference engines for OpenAI and open-source models by implementing and shipping performance/efficiency improvements across runtime, scheduling, and serving paths (latency, throughput, utilization, availability, and cost).

Run experiments end-to-end: formulate hypotheses, implement engine changes (including Python/PyTorch integration points where relevant), analyze results, and ship improvements behind guardrails.

Build and use experimentation capabilities for large-scale AI inference (experiment lifecycle, tracking, metric modeling, comparability standards, automated analysis) so the team can iterate quickly and safely.

Own serving availability and efficiency for Azure OpenAI Service workloads through tiered experimentation, lean segmentation, and multi-modal utilization across heterogeneous fleets, turning findings into shipped engine improvements.

Design and evolve inference serving architectures to improve utilization and latency using techniques such as disaggregated serving, multi-token prediction, KV offload/retrieval, and quantization, validated via staged rollouts and production guardrails.

Extend AI infrastructure abstractions to support elastic, heterogeneous inference engines reliably at scale (e.g., dynamic scaling across model families, modalities, and workload classes while maintaining isolation and SLOs).

Tune and scale inference engines across NVIDIA GPU generations (A100, H100, H200) for state-of-the-art OpenAI models, focusing on serving efficiency, utilization, and reliability (not hardware bring-up).

Partner with networking and storage teams to leverage high-performance interconnects (e.g., RDMA-capable fabrics such as InfiniBand or RoCE) for distributed inference, without owning low-level kernel/driver enablement.

Drive end-to-end features from design through production: observability, diagnostics, performance regression detection, and operational excellence for inference serving.

Influence platform architecture and technical direction across teams through design reviews, clear metrics, and technical leadership focused on experimentation velocity and production reliability.

Additional Responsibilities

Work across multiple layers of the AI software stack (abstractions, programming models, engine runtimes, libraries, and APIs) to enable large-scale model inference.

Benchmark OpenAI and other LLMs for performance across Azure OpenAI Service workload tiers and segments, and translate results into production improvements.

Debug, profile, and optimize production inference performance across the stack (abstractions, runtime, scheduling, and serving pipelines) to improve latency, throughput, and utilization.

Monitor performance regressions and drive continuous improvements to reduce time-to-deploy and hardware footprint.

Collaborate across engineering teams to deliver scalable, production-ready serving efficiency and availability improvements, using experimentation results to guide prioritization and rollout.

Build durable engine interfaces that enable fast experimentation and safe shipping of new strategies for quality of service (QoS), replica load balancing, KV management (including offload/retrieval), quantization, and sampling (e.g., multi-token prediction and constrained sampling).

Out of Scope (This role does not focus on)

Novel hardware bring-up or first-party silicon enablement (e.g., Microsoft chips) or expanded support for non-NVIDIA platforms (e.g., AMD).

Low-level kernel, driver, or CUDA optimization as a primary responsibility.

Model pre-training, fine-tuning, or model architecture customization.

Qualifications

Bachelor's Degree in Computer Science or related technical field and 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, Python, or equivalent experience.

Other Requirements:

Proven ability to design and operate large-scale, production inference services with high reliability and performance requirements, and to ship performance improvements safely via disciplined experimentation.

Strong skills in performance analysis: benchmarking, profiling, diagnosing regressions, and turning results into concrete engine/runtime changes.

Strong problem-solving skills and the ability to debug complex, cross-layer systems issues.

Demonstrated technical leadership, including mentoring engineers, driving cross-team architectural alignment, and leveraging AI tools and AI-assisted workflows to accelerate engineering velocity and quality.

Hands-on experience with Kubernetes (building and operating services on k8s), including debugging production issues and designing platform abstractions (e.g., custom resources/controllers) and scheduling-aware deployments (e.g., node affinity, taints/tolerations, resource requests/limits).

Strong collaboration and communication skills, with the ability to work across organizational boundaries.

Preferred Qualifications:

Experience optimizing LLM inference in practice (e.g., PyTorch inference, serving runtimes, model execution, or inference orchestration) in production environments.

Familiarity with high-performance networking and low-latency communication stacks.

Familiarity with GPU-accelerated inference stacks (e.g., CUDA at the application/runtime level, device plugins, or runtime integration).

Experience building or using experimentation systems (A/B, canarying, tiered rollout), including metric definition and comparability for performance and reliability.

Familiarity with distributed inference stacks (e.g., NCCL-style collectives, model/tensor parallelism) and performance tradeoffs in large-scale serving.

Impact & Growth:

Work on mission-critical infrastructure that directly powers large-scale AI systems.

Influence the future of cloud GPU platforms used by internal and external customers.

Collaborate with experts across OS, hardware, networking, and AI platform teams.

Opportunity to grow as a technical leader, shaping long-term platform strategy.

Software Engineering IC5 - The typical base pay range for this role across the U.S. is USD $142,800 - $274,800 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $188,000 - $304,200 per year.

Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here:

Software Engineering IC6 - The typical base pay range for this role across the U.S. is USD $165,600 - $296,400 per year. There is a different range applicable to specific work locations, within the San Francisco Bay area and New York City metropolitan area, and the base pay range for this role in those locations is USD $220,800 - $331,200 per year.

This position will be open for a minimum of 5 days, with applications accepted on an ongoing basis until the position is filled.

Microsoft is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to age, ancestry, citizenship, color, family or medical care leave, gender identity or expression, genetic information, immigration status, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran or military status, race, ethnicity, religion, sex (including pregnancy), sexual orientation, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance with religious accommodations and/or a reasonable accommodation due to a disability during the application process, read more about requesting accommodations.

Vacancy posted 3 days ago
