
AI Reliability Engineer (AI SRE)

$50 - $175 per hour

DeWinter Group

Title: AI Reliability Engineer (AI SRE)
Job Type: Contract
Contract Length: 12 Months
Pay Range: $50/hr – $175/hr
Start Date: ASAP
Location: Remote

About the Opportunity:
Our client, a leader in AI testing and Generative AI solutions, is looking for a skilled AI Reliability Engineer (AI SRE) to join their team for a 12-month engagement. This project involves ensuring the reliability, availability, and performance of mission-critical AI systems by defining SLOs, implementing automated resilience measures, and leading incident response. This is a high-impact role that requires a self-motivated professional who can hit the ground running and deliver results quickly.

Key Responsibilities & Deliverables:
  • Defining and maintaining Service Level Objectives (SLOs) for AI inference latency and availability.
  • Building automated "circuit breakers" and fallback logic (e.g., switching to a smaller model if the primary fails).
  • Leading incident response and root-cause analysis (RCA) for complex AI system failures.
  • Developing stress-testing and chaos engineering scenarios specifically for AI agent swarms.
  • Optimizing the "cold start" and scaling time for serverless AI functions.

Required Skills & Experience:
  • 4+ years of experience in Site Reliability Engineering (SRE).
  • Deep expertise in system monitoring, incident management, and cloud resilience. This isn't a learning role; you need to be a subject matter expert.
  • Demonstrated ability to work autonomously and manage your own time effectively to meet project goals.
  • Experience with Python/Go, Kubernetes, and observability stacks (Datadog, New Relic).
  • Strong communication skills to provide clear and concise status updates to the project team.

Vacancy posted 3 days ago
Similar jobs that could be interesting for you
Based on the AI Reliability Engineer (AI SRE) in Campbell, CA vacancy
  • A leading firm in AI solutions is looking for an experienced AI Reliability Engineer (AI SRE) for a 12-month remote contract role. In this high-impact position, you will focus on ensuring the reliability and performance of critical AI systems by defining SLOs, implementing... 
    Suggested
    Contract work
    Remote work

    DeWinter Group

    Campbell, CA
    3 days ago
  • $100k - $170k

     .... We are seeking a  DevOps Engineer who is eager to have an immediate...  ...on Amazon EKS with focus on reliability and performance Build and...  ...~5+ years of experience in SRE, DevOps, or Platform Engineering...  ...use artificial intelligence (AI) tools to support parts of the... 
    Suggested
    Full time
    Work at office
    Immediate start
    Visa sponsorship
    Night shift

    E-Space

    Saratoga, CA
    13 days ago
  • $147k - $237.5k

     ...Execution, Integrity, and Inclusion. We weave AI into the fabric of everything we do and...  ...we hire. We’re looking for a Principal SRE to join our InfoSec SRE team that owns...  ...a US Citizen. BS/MS in Computer Science/Engineering or equivalent training, education, and experience... 
    Suggested
    Full time
    Work at office
    Visa sponsorship
    Work visa

    Palo Alto Networks, Inc.

    Santa Clara, CA
    4 days ago
  • $120k - $180k

     ...with the world’s most advanced AI-native platform. We work on...  ...About the Role: CrowdStrike's engineering organization depends on shared...  ...engineering ownership to operate reliably, scale safely, harden for...  ...Collaborate with Infrastructure, SRE, and Data Services on shared operational... 
    Suggested
    Full time
    Work experience placement
    Work at office
    Local area
    2 days per week

    CrowdStrike

    Sunnyvale, CA
    2 days ago
  •  ...leader is looking for an experienced SRE software engineer in Cupertino, California, to build and...  ...services. The role involves developing AI-powered tooling, automating deployment...  ...at least 8 years of experience in site reliability engineering, a strong background in... 
    Suggested

    Apple Inc.

    Cupertino, CA
    2 days ago
  •  ...Job Description Java SRE Engineer Onsite San Francisco Bay Area Infrastructure Engineer (2 Positions) We are...  ...platforms. This role is focused on infrastructure, reliability, and automation, with Java exposure as a supporting skill.... 

    EITACIES

    Santa Clara, CA
    9 hours ago
  •  ...please send me a copy of your updated resumes Title: Sr. SRE / DevOps Engineer Location: Sunnyvale, CA (Only Local candidate) Client...  ...DevOps Engineer at Sunnyvale, California location. As Site Reliability Engineer, the individual will work closely with multi-... 
    Local area
    Immediate start

    Donato Technologies Inc

    Sunnyvale, CA
    9 hours ago
  • $172.1k - $305.6k

     ...Software and Services The Apple Services Engineering team is one of the most exciting...  ...secure, end-to-end solutions. The Service Reliability Engineering (SRE) team is responsible for service...  ...methodologies. Do you believe that generative AI has the power to change how we live,... 
    Relocation

    Apple Inc.

    Cupertino, CA
    1 day ago
  • A technology company is seeking an experienced SRE DevOps with AWS for a hybrid role in Sunnyvale, CA. The ideal candidate will have at least 8 years of experience focused on reliability engineering, with a strong understanding of programming in languages like Python and... 

    Blockchain Technologies. LLC

    Sunnyvale, CA
    13 hours ago
  • Google Inc. is seeking a Software Engineer for their Applied AI/ML team in Sunnyvale, CA. The role involves developing next-generation technologies and designing AI/ML models to solve complex infrastructure challenges. Candidates should have a Bachelor’s degree, along with... 

    Google Inc.

    Sunnyvale, CA
    4 days ago
  • $120.2k - $155.5k

    A global cybersecurity company seeks an intermediate SRE Specialist to enhance service reliability for high-traffic services. The role requires strong Linux administration skills, 3+ years of experience with automation tools, and expertise in network management. Ideal candidates... 
    Full time

    Fortinet, Inc.

    Sunnyvale, CA
    13 hours ago
  • A leading technology company is looking for a Java SRE Engineer to support large-scale cloud migrations and production systems on AWS and...  ...team members and collaborating with various teams to ensure reliability. This position is onsite in the San Francisco Bay Area. 

    EITACIES Inc.

    Santa Clara, CA
    3 days ago
  • $126k - $250k

     ..., Inc. is powering the future of physical AI. Founded in 2017 and now valued at $15 billion...  ...family commitments. Meet our software engineers! Meet some of our software engineers...  ...such as background coding agents, AI SRE tooling Work directly with customers to... 
    Full time
    For contractors
    For subcontractor
    Casual work
    Work at office
    Remote work
    Day shift

    Applied Intuition

    Sunnyvale, CA
    13 hours ago
  •  ...Job Description Site Reliability Engineer Onsite - Bay Area, CA Skills...  ...interaction. Must-Have: ~4+ years in SRE, DevOps, or Infrastructure Engineering...  ...not just maintaining) Experience with scalable GPU infrastructure for AI/ML... 

    Amiri Recruiting

    Mountain View, CA
    9 hours ago
  • $150k - $180k

     ...traditional and variable uptime workloads (e.g., AI/ML). Verrus builds and capitalizes its...  ...serve as software-focused Senior Site Reliability Engineer at Verrus. This is a full‑time position...  ...to mid‑level engineers, championing SRE best practices and software engineering... 
    Full time
    Work at office
    Local area
    Flexible hours

    Verrus, LLC

    Mountain View, CA
    13 hours ago
  • $126k - $250k

    Software Engineer - AI Engineering - Remote Eligible Location: Sunnyvale, California, United States Compensation: $126,000 - 250,000 USD /...  ...and ship agentic systems such as background coding agents, AI SRE tooling Work directly with customers to understand their needs... 
    Full time
    For contractors
    For subcontractor
    Casual work
    Work at office
    Remote work
    Day shift

    jobs.frontdoordefense.com - Jobboard

    Sunnyvale, CA
    4 days ago
  • $148k - $235.75k

     ...into the unlimited potential of AI to define the next era of...  ...world. Join our team of innovative engineers who are building an AI Data...  ...raw, high-volume telemetry into reliable, job-centric insights and automation...  ...distributed systems as SRE/DevOps/Platform Ops. Proven ownership... 

    NVIDIA Corporation

    Santa Clara, CA
    1 day ago
  • $150k - $180k

    A technology-focused data center developer in Mountain View, CA is looking for a Senior Site Reliability Engineer to manage software infrastructure. This full-time position requires experience in Software Engineering or DevOps, with strong proficiency in Golang. The role... 
    Full time

    Verrus, LLC

    Mountain View, CA
    13 hours ago
  • $207k - $300k

    Google Inc. is looking for a Staff Software Engineer specializing in Site Reliability Engineering in Sunnyvale, CA. This role combines software and systems engineering to build and manage distributed systems, ensuring high reliability and uptime. The ideal candidate should... 

    Google Inc.

    Sunnyvale, CA
    13 hours ago
  •  ...operational resilience. Powered by the Illumio AI Security Graph, our breach containment...  ...Headquarters. Our Team's Vision: Our Engineering team is shaping the future of...  ...looking for an experienced Senior Site Reliability Engineer (SRE) with a strong background in AWS & Azure... 
    Work experience placement

    Illumio

    Sunnyvale, CA
    2 days ago
  • $168k - $270.25k

    Senior Site Reliability Engineer. Locations: US, CA,...  ...tapping into the unlimited potential of AI to define the next era of computing. An...  ...motivated Senior Site Reliability Engineer (SRE) to join our team in Santa Clara, CA.... 

    NVIDIA Corporation

    Santa Clara, CA
    4 days ago
  • $250k

    eGain is the leader in AI knowledge management solutions for enterprises...  ...source of truth—explainable, reliable, and maintainable—that serves...  ...Director of Site Reliability Engineering, you will ensure that eGain’s...  ...Build and lead a world-class SRE organization that ensures... 
    Work at office

    eGain Corporation

    Sunnyvale, CA
    3 days ago
  • $168.93k - $192.5k

     ...visit Role Overview We are seeking a Site Reliability Engineer to join our Core Platform Engineering organization. The SRE team builds the automation, observability, and...  ...high standards for resilience and security in an AI-accelerated development environment. You'll... 
    Full time
    Temporary work
    Work at office
    Remote work
    Flexible hours

    ID.me

    Mountain View, CA
    9 hours ago
  • $200k - $322k

    Senior Manager, Site Reliability Engineering. Locations...  ...leveraging the immense potential of AI to usher in the next era of computing,...  ...strong operational execution with an SRE attitude, facilitating the move from... 

    NVIDIA Corporation

    Santa Clara, CA
    13 hours ago
  • $147.4k - $272.1k

    Site Reliability Engineer (Edge Services), Infrastructure Services Sunnyvale, California, United States...  .... Description As a key member of the SRE team, your mission is to treat operations...  ...fluency in applying Generative AI tools within SRE and software engineering... 
    Relocation
    Shift work

    Apple Inc.

    Sunnyvale, CA
    1 day ago
  • $151.6k - $245.3k

     ...Execution, Integrity, and Inclusion. We weave AI into the fabric of everything we do and...  ...of the largest GCP customers. As a Site Reliability Engineer, you will be part of a team supporting...  ...Impact Contribute to the success of SRE and DevOps Develop expertise in new technologies... 
    Full time
    Work at office
    Visa sponsorship
    Work visa

    Palo Alto Networks, Inc.

    Santa Clara, CA
    1 day ago
  • $174k - $252k

    A leading tech company is seeking a Senior Software Engineer for Site Reliability Engineering based in Sunnyvale, CA. The role involves ensuring service reliability, leading technical projects, and enhancing systems performance. Candidates should have at least 5 years of... 

    Google Inc.

    Sunnyvale, CA
    3 days ago
  • $90k - $140k

    Tata Consultancy Services Limited is looking for a Site Reliability Engineer in Sunnyvale, CA, with 8-10 years of experience in application support across multiple environments. The role involves end-to-end ownership of production environments, ensuring reliability, and... 

    Tata Consultancy Services Limited

    Sunnyvale, CA
    3 days ago
  • Apple Inc. is seeking a Site Reliability Engineer for its Enterprise Technology Services in Sunnyvale, California. In this role, you will collaborate with application teams to automate operations, optimize infrastructure, and ensure systems are reliable and high-performing... 

    Apple Inc.

    Sunnyvale, CA
    13 hours ago
  • $210k - $265k

    A leading hardware startup in Saratoga, California is seeking a Hardware Engineer to drive innovations in AI infrastructure solutions. The ideal candidate will have a Bachelor's degree in Electrical or Computer Engineering, with at least 10 years of experience in system... 

    Eridu Corporation

    Saratoga, CA
    2 days ago
