Principal AI Agent / ML Software Engineer (OCI)
Oracle
The Principal AI Agent / ML Software Engineer is a Senior Staff-level, hands-on technical leadership role responsible for defining, building, and operating next-generation AI systems on Oracle Cloud Infrastructure (OCI). This person will set architecture and engineering direction for production-grade agentic AI platforms, autonomous workflows, scalable inference infrastructure, and enterprise AI applications used in large-scale, business-critical environments.
This role requires a proven engineer who can translate ambiguous product and platform goals into durable technical strategy, lead multi-team execution without direct authority, and remain deeply hands-on in design, code, reviews, operations, and incident follow-up. The ideal candidate combines deep distributed systems experience with practical AI-native engineering, including orchestration of LLMs, tools, APIs, memory, retrieval, evaluation, guardrails, and cloud services. The expectation is to ship, scale, and operate reliable, secure, observable, and cost-aware AI platform systems while raising the technical bar for engineers across the organization.
Responsibilities
- Serve as a senior technical owner for OCI AI platform capabilities, including agent execution, inference systems, model serving, AI workflow orchestration, evaluation, and observability.
- Design, architect, and deliver scalable agentic AI systems capable of reasoning, planning, tool use, workflow execution, multi-step task orchestration, and safe human-in-the-loop escalation.
- Build production-grade services for tool calling, agent memory, context management, Model Context Protocol (MCP) integration, vector retrieval, multi-agent coordination, policy enforcement, and evaluation.
- Lead architecture across distributed services optimized for low latency, high throughput, GPU efficiency, reliability, cost, operability, and secure multi-tenant operation.
- Define service boundaries, APIs, data models, state management, consistency tradeoffs, failure modes, SLIs/SLOs, rollout strategies, and operational readiness criteria for AI platform services.
- Drive technical strategy across infrastructure, platform, security, data, and application engineering teams, converting broad goals into executable multi-quarter plans and measurable milestones.
- Integrate AI agents securely and reliably with enterprise APIs, cloud services, databases, identity systems, secrets management, and external systems.
- Establish AgentOps and LLMOps practices for tracing, monitoring, eval suites, regression testing, experimentation, safety guardrails, prompt/tool versioning, and production reliability.
- Evaluate and operationalize emerging technologies in generative AI, agentic workflows, inference optimization, long-context systems, reasoning models, AI developer tooling, and agentic-first development.
- Drive engineering excellence through code reviews, design reviews, test strategy, deployment automation, incident analysis, documentation, and AI-assisted development practices using tools such as Codex, Claude Code, Cursor, Copilot, or similar systems.
- Mentor Staff and senior engineers, raise architectural standards, and influence engineering practices across OCI without requiring direct management authority.
- Own critical production outcomes, including reliability, performance, security posture, cost efficiency, and supportability for the systems delivered.
Required Qualifications
- Bachelor's, Master's, or Ph.D. in Computer Science, AI/ML, Engineering, or a related field, or equivalent practical experience.
- 6-10 years of professional software engineering experience, including significant ownership of production systems; or equivalent experience demonstrating Senior Staff / Principal-level impact.
- Proven track record as a Staff, Senior Staff, Principal, or equivalent technical leader influencing architecture and execution across multiple teams.
- Deep experience designing, building, and operating high-scale distributed systems, cloud services, infrastructure platforms, or AI/ML platform services.
- Practical experience with orchestration frameworks such as LangGraph, LangChain, CrewAI, AutoGen, LlamaIndex, or similar ecosystems.
- Deep understanding of LLM application patterns, including prompt design, structured outputs, function/tool calling, context management, RAG, memory, tool safety, and evaluation.
- Strong programming skills in Python and ability to contribute high-quality production code, reviews, tests, and debugging in complex distributed environments.
- Strong expertise with Kubernetes, Docker, cloud-native infrastructure, service-to-service communication, scalability, fault tolerance, observability, and performance analysis.
- Experience defining SLIs/SLOs, production readiness criteria, incident response practices, monitoring, tracing, experiments, and reliability programs for AI or distributed systems.
- Strong understanding of AI safety, governance, security, and operational risks for autonomous or semi-autonomous systems, including data handling, access control, auditability, and human accountability.
- Excellent written and verbal communication, with demonstrated ability to lead technical direction, resolve ambiguity, and influence senior stakeholders.
Preferred Qualifications
- Experience optimizing large-scale GPU inference or training workloads for latency, throughput, utilization, availability, and cost.
- Experience building or operating model serving, inference gateways, agent runtimes, workflow engines, developer platforms, or internal AI productivity platforms.
- Experience integrating AI systems with enterprise APIs, databases, cloud services, vector databases, embeddings, retrieval systems, identity systems, and policy enforcement layers.
- Experience with LLM fine-tuning, long-context systems, reasoning models, model routing, caching, batching, quantization, or emerging generative AI research.
- Experience building evaluation frameworks for agentic systems, including offline evals, online experiments, golden tasks, adversarial testing, regression gates, and observability dashboards.
- Experience using AI-assisted software development tools such as Codex, Claude Code, Cursor, Copilot, or similar systems in large-scale engineering environments.
- Track record of defining architectural standards, platform capabilities, or engineering practices adopted across multiple teams or organizations.
- Experience in enterprise, cloud infrastructure, regulated, security-sensitive, or mission-critical environments.

