Staff Site Reliability Engineer
$227k - $290k
Moveworks
Who We Are
Moveworks is the universal AI copilot for search and automation across all your business applications. We give employees one place to go to find information and get support while reducing costs for your business. The Moveworks Copilot is powered by an industry-leading Reasoning Engine that uses a combination of public and proprietary language models to understand employee queries, then build and execute multi-step plans that achieve them. It does this by linking into systems (like ITSM, HRIS, ERP, identity management, and more) with native and custom-built integrations that turn natural language into powerful automations for employees.
The world’s most innovative brands like Databricks, Broadcom, Hearst, and Palo Alto Networks trust Moveworks to eliminate repetitive support issues, deliver instant knowledge, and empower employees to work faster across applications.
Founded in 2016, Moveworks has raised $315 million in funding, at a valuation of $2.1 billion, thanks to our award-winning product and team. In 2023, we were included in the Forbes Cloud 100 list as well as the Forbes AI 50 for the fifth consecutive year. We were also recognized by the 2023 Edison Awards for AI Optimized Productivity, and were included on Fast Company's Most Innovative Companies list for 2024!
Moveworks has over 500 employees in six offices around the world, and is backed by some of the world's most prominent investors, including Kleiner Perkins, Lightspeed, Bain Capital Ventures, Sapphire Ventures, Iconiq, and more.
Come join one of the most innovative teams on the planet!
What You Will Do
As a site reliability engineer, you will own the overall health, performance, and capacity of the Moveworks AI infrastructure and services. In addition to helping engineering teams resolve operational issues, you will design and implement solutions, tools, and practices that help us improve operational efficiency and product SLA. This role is a blend of SRE, infrastructure, and software development.
We’re building a team that indexes on moving fast, solving challenging product/engineering problems and providing value to our customers. To be successful, you'll be partnering with and enabling machine learning, search, product, data, and full stack teams to design and build fault tolerant and scalable infrastructure, services and features. This is an opportunity to play an integral role at the fastest-growing AI startup in its space.
- Design, develop, and evolve site reliability and chaos engineering for Moveworks infrastructure and services.
- Work closely with machine learning, search, product, infrastructure, data, and frontend teams to understand their infrastructure and operational needs and build solutions that are optimal, fault-tolerant, and scalable.
- Author and advocate for reliability through proven distributed system design patterns (error handling, retries, rate limiting, circuit breaking, etc.). Participate in design discussions and ensure operational readiness of infrastructure, services, and features.
- Design and build tools, libraries, and frameworks that allow engineering teams to rapidly deploy and scale Moveworks infrastructure and applications.
- Review and participate in application performance analysis / tuning and capacity planning.
- Set up and maintain monitoring, metrics, and reporting systems for observability and actionable alerting.
- Define internal and customer-facing key SLA metrics, and implement solutions and practices with different teams to improve those metrics.
- Own the engineering on-call process and setup. Drive discussions for outages, root cause analysis, and action items.
- Participate in the on-call rotation for second-tier escalation (at Moveworks, each engineer participates in the team-specific first-tier on-call rotation). Help diagnose and resolve complex operational issues.
What You Bring To The Table
- 7+ years of experience in authoring and operating complex distributed infrastructure and applications
- Strong experience with a container orchestration platform such as Kubernetes and cloud infrastructure such as AWS / GCP / Azure
- Very high proficiency with Unix/Linux, TCP/IP, DNS, load balancers, autoscaling, file systems and different types of data stores.
- Software development proficiency with Python, Golang, Java, or C++
- Experience working across teams and implementing solutions, tools, and practices to improve observability, reliability, and scalability
- Desire to work at a startup pace in a small company with a high degree of ownership
- Strong motivation, gumption, and an appetite for continuous, incremental changes and completing challenging projects fast
- High level of curiosity about engineering outside of your immediate discipline and an incessant desire to learn
- BS+ in computer science or a related field
Compensation Range: $227,000 - $290,000
*Our compensation package includes a market competitive salary, equity for all full time roles, exceptional benefits, and, for applicable roles, commissions or bonus plans.
Ultimately, in determining pay, final offers may vary from the amount listed based on geography, the role’s scope and complexity, the candidate’s experience and expertise, and other factors.
Moveworks Is An Equal Opportunity Employer
*Moveworks is proud to be an equal opportunity employer. We provide employment opportunities without regard to age, race, color, ancestry, national origin, religion, disability, sex, gender identity or expression, sexual orientation, veteran status, or any other characteristics protected by law.


