Site Reliability Engineer (SRE)

AceStack LLC

Title: Site Reliability Engineer (SRE)


Location: Sunnyvale, CA (onsite 3x/week)


Contract

Responsibilities:

  • Engage with our product teams to understand requirements, design and implement resilient and scalable infrastructure solutions.
  • Operate, monitor, and triage all aspects of our production and non-production environments.
  • Collaborate on code, infrastructure, design reviews, and process enhancements.
  • Evaluate and integrate new technologies to improve system reliability, security, and performance.
  • Develop and implement automation to provision, configure, deploy, and monitor services.
  • Participate in an on-call rotation, providing hands-on technical expertise during service-impacting events.
  • Contribute to capacity planning, scale testing, and disaster recovery exercises.
  • Approach operational problems with a software engineering mindset.
Minimum Qualifications:
  • 5+ years in an Infrastructure Ops, Site Reliability Engineering, or DevOps-focused role.
  • BS degree in computer science or equivalent field with 5+ years of experience.
  • Knowledge of Linux operating system principles, networking fundamentals, and systems management.
  • Demonstrable fluency in at least one of the following languages: Java, Python, or Go.
  • Experience in managing and scaling distributed systems in a public, private, or hybrid cloud environment.
  • Familiarity with micro-services architecture and container orchestration with Kubernetes.
  • Awareness of key security principles, including encryption and keys (types and exchange protocols).
  • Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and automation.
  • Strong sense of ownership, with a desire to communicate and collaborate with other engineers and teams.
  • Ability to identify and communicate technical and architectural problems, while working with partners and their team to iteratively find solutions.
Experience implementing automation


Scripting experience in Python


Enjoy building partnerships


Apple experience is preferred

Role Description: We are seeking a DevOps Site Reliability Engineer (SRE) with strong experience in containerization, orchestration, and automation. The ideal candidate will have hands-on expertise in Kubernetes, Docker, and Python and will be responsible for building scalable infrastructure, automating operations, and ensuring high availability of production systems.

Key Responsibilities
  • Design, deploy, and maintain containerized applications using Docker and Kubernetes.
  • Build and maintain automated infrastructure and deployment pipelines.
  • Develop automation scripts and tools using Python.
  • Manage and optimize Kubernetes clusters in production environments.
  • Implement CI/CD pipelines to streamline build, test, and deployment processes.
  • Monitor system performance and reliability using observability and monitoring tools.
  • Troubleshoot production issues and participate in incident response and root cause analysis.
  • Work closely with development teams to improve system reliability and deployment efficiency.
  • Implement security and best practices for container and cloud infrastructure.



Desirable Skills:


Skills:
  • Cloud DevOps
  • Python
  • DevOps Continuous Integration and Continuous Delivery (CI/CD)
  • Kubernetes
  • Site Reliability Engineering (SRE) experience
Vacancy posted 1 day ago