Principal AI Agent / ML Software Engineer (OCI)

Oracle

The Principal AI Agent / ML Software Engineer is a Senior Staff-level, hands-on technical leadership role responsible for defining, building, and operating next-generation AI systems on Oracle Cloud Infrastructure (OCI). This person will set architecture and engineering direction for production-grade agentic AI platforms, autonomous workflows, scalable inference infrastructure, and enterprise AI applications used in large-scale, business-critical environments.

This role requires a proven engineer who can translate ambiguous product and platform goals into durable technical strategy, lead multi-team execution without direct authority, and remain deeply hands-on in design, code, reviews, operations, and incident follow-up. The ideal candidate combines deep distributed systems experience with practical AI-native engineering, including orchestration of LLMs, tools, APIs, memory, retrieval, evaluation, guardrails, and cloud services. The expectation is to ship, scale, and operate reliable, secure, observable, and cost-aware AI platform systems while raising the technical bar for engineers across the organization.

Responsibilities
  • Serve as a senior technical owner for OCI AI platform capabilities, including agent execution, inference systems, model serving, AI workflow orchestration, evaluation, and observability.
  • Design, architect, and deliver scalable agentic AI systems capable of reasoning, planning, tool use, workflow execution, multi-step task orchestration, and safe human-in-the-loop escalation.
  • Build production-grade services for tool calling, agent memory, context management, Model Context Protocol (MCP) integration, vector retrieval, multi-agent coordination, policy enforcement, and evaluation.
  • Lead architecture across distributed services optimized for low latency, high throughput, GPU efficiency, reliability, cost, operability, and secure multi-tenant operation.
  • Define service boundaries, APIs, data models, state management, consistency tradeoffs, failure modes, SLIs/SLOs, rollout strategies, and operational readiness criteria for AI platform services.
  • Drive technical strategy across infrastructure, platform, security, data, and application engineering teams, converting broad goals into executable multi-quarter plans and measurable milestones.
  • Integrate AI agents securely and reliably with enterprise APIs, cloud services, databases, identity systems, secrets management, and external systems.
  • Establish AgentOps and LLMOps practices for tracing, monitoring, eval suites, regression testing, experimentation, safety guardrails, prompt/tool versioning, and production reliability.
  • Evaluate and operationalize emerging technologies in generative AI, agentic workflows, inference optimization, long-context systems, reasoning models, AI developer tooling, and agentic-first development.
  • Drive engineering excellence through code reviews, design reviews, test strategy, deployment automation, incident analysis, documentation, and AI-assisted development practices using tools such as Codex, Claude Code, Cursor, Copilot, or similar systems.
  • Mentor Staff and senior engineers, raise architectural standards, and influence engineering practices across OCI without requiring direct management authority.
  • Own critical production outcomes, including reliability, performance, security posture, cost efficiency, and supportability for the systems delivered.
Required Qualifications
  • Bachelor's, Master's, or Ph.D. in Computer Science, AI/ML, Engineering, or a related field, or equivalent practical experience.
  • 6-10 years of professional software engineering experience, including significant ownership of production systems; or equivalent experience demonstrating Senior Staff / Principal-level impact.
  • Proven track record as a Staff, Senior Staff, Principal, or equivalent technical leader influencing architecture and execution across multiple teams.
  • Deep experience designing, building, and operating high-scale distributed systems, cloud services, infrastructure platforms, or AI/ML platform services.
  • Practical experience with orchestration frameworks such as LangGraph, LangChain, CrewAI, AutoGen, LlamaIndex, or similar ecosystems.
  • Deep understanding of LLM application patterns, including prompt design, structured outputs, function/tool calling, context management, RAG, memory, tool safety, and evaluation.
  • Strong programming skills in Python and ability to contribute high-quality production code, reviews, tests, and debugging in complex distributed environments.
  • Strong expertise with Kubernetes, Docker, cloud-native infrastructure, service-to-service communication, scalability, fault tolerance, observability, and performance analysis.
  • Experience defining SLIs/SLOs, production readiness criteria, incident response practices, monitoring, tracing, experiments, and reliability programs for AI or distributed systems.
  • Strong understanding of AI safety, governance, security, and operational risks for autonomous or semi-autonomous systems, including data handling, access control, auditability, and human accountability.
  • Excellent written and verbal communication, with demonstrated ability to lead technical direction, resolve ambiguity, and influence senior stakeholders.
Preferred Qualifications
  • Experience optimizing large-scale GPU inference or training workloads for latency, throughput, utilization, availability, and cost.
  • Experience building or operating model serving, inference gateways, agent runtimes, workflow engines, developer platforms, or internal AI productivity platforms.
  • Experience integrating AI systems with enterprise APIs, databases, cloud services, vector databases, embeddings, retrieval systems, identity systems, and policy enforcement layers.
  • Experience with LLM fine-tuning, long-context systems, reasoning models, model routing, caching, batching, quantization, or emerging generative AI research.
  • Experience building evaluation frameworks for agentic systems, including offline evals, online experiments, golden tasks, adversarial testing, regression gates, and observability dashboards.
  • Experience using AI-assisted software development tools such as Codex, Claude Code, Cursor, Copilot, or similar systems in large-scale engineering environments.
  • Track record of defining architectural standards, platform capabilities, or engineering practices adopted across multiple teams or organizations.
  • Experience in enterprise, cloud infrastructure, regulated, security-sensitive, or mission-critical environments.
Vacancy posted 2 hours ago