Principal Site Reliability Engineer
$163.62k - $212.71k
iSpot
Job Description
Immigration / Work Authorization Notice: Applicants must be currently authorized to work in the United States. iSpot is not able to sponsor or take over sponsorship of an employment visa for this position at this time.
iSpot competes for the best talent. Our compensation packages consist of salary and equity in one of Seattle's hottest start-ups, as well as other standard benefits. Most importantly, we provide a really interesting working experience, and the chance to contribute to the success of something great.
What You'll Be Part Of:
iSpot.tv is changing how brands, agencies, and networks measure and assess the impact of TV advertising. We deal with BIG data, operating mainly in AWS with multiple Kubernetes clusters and thousands of servers. We are looking for an experienced SRE leader with the skills and passion to make a significant impact on our ecosystem. You will have a wide array of projects to tackle, with ample opportunities for growth.
You will be a key member of our SRE leadership team, focused on empowering developers to build, test, and deploy applications faster and more efficiently. You will both lead the team and remain hands-on in designing, building, and maintaining the tools, platforms, and processes that improve our engineering teams' productivity and streamline the software development lifecycle. Your work will directly impact developer happiness and the speed at which we can deliver innovative features to our customers.
Responsibilities:
We are seeking a seasoned and strategic Lead/Principal Site Reliability Engineer to drive the reliability, scalability, and performance of our core production systems while significantly enhancing the internal developer experience. This role sits at the intersection of operations and development, requiring deep technical expertise, strong leadership, and a passion for optimizing the entire software development lifecycle (SDLC).
Our team consists of senior engineers who work together with minimal supervision to attain those goals. Candidates must possess deep operational experience with AWS and Kubernetes to support teams utilizing these systems. You will lead the technical direction of the team while remaining a key individual contributor. You will be responsible for creating a culture of engineering excellence, designing self-service platforms, and fostering alignment across all engineering teams to accelerate product delivery and maintain world-class service stability.
The key responsibilities are:
I. System Reliability and Operations (SRE Focus)
- Platform Design and Management: Architect, build, and maintain scalable, highly available, and reliable cloud infrastructure in AWS leveraging modern container orchestration technologies.
- Data Pipeline Reliability: Serve as the reliability and cost optimization expert for high-volume, data-intensive workloads. Focus on optimizing and ensuring the stability of distributed data processing engines, specifically Apache Spark and related ecosystems (e.g., EMR, Databricks, Glue).
- Observability and Monitoring: Establish comprehensive observability practices by defining SLIs/SLOs and implementing advanced monitoring, alerting, and logging solutions to quickly identify and resolve system anomalies.
- Automation: Drive automation across all operational aspects, including infrastructure provisioning (Terraform), scaling, deployment, and incident response, minimizing toil and manual effort.
- Incident Management: Lead and participate in the incident response lifecycle, performing thorough post-mortems to derive actionable insights and implement preventative measures to improve system resilience.
- AIOps: Define and champion the strategic roadmap for AI/ML integration within SRE, establishing organizational best practices for AIOps, automated incident remediation, toil reduction via LLMs, automated root cause analysis (RCA), and the governance of LLM-driven tooling to enhance system observability and resilience.
II. Developer Experience and Productivity (DevEx Focus)
- Platform Strategy: Design, implement, and champion self-service tools, internal developer portals, and services that empower engineering teams to manage their infrastructure and deployments independently and efficiently.
- AI Developer Tools: Lead the standardization of AI developer assistants by architecting and maintaining global 'steering files' and context-configuration standards, ensuring AI-generated code aligns with our specific patterns, security protocols, and architectural guardrails.
- CI/CD Optimization: Own and continuously improve the CI/CD pipelines, reducing build times, streamlining deployment workflows, and integrating best practices for testing, security (Shift Left), and code quality. Maintain and improve our container orchestration and deployment tools, leveraging Kubernetes, Helm, and ArgoCD to create seamless developer workflows.
- KPIs: Develop, implement, and maintain a set of key performance indicators (KPIs) to measure and improve the developer experience across all of Engineering.
- Mentorship and Documentation: Guide and mentor senior engineers, promoting SRE/DevEx principles. Develop clear, comprehensive documentation and tutorials to ensure seamless adoption of new tools and platforms.
- Cost and Efficiency: Strategically identify and implement opportunities for cloud cost optimization and resource efficiency without compromising reliability or performance.
III. Strategic Leadership and Cross-Team Alignment
- Architecting the Roadmap: Define, champion, and communicate the long-term technical roadmap for the SRE and DevEx platforms, balancing immediate operational needs with strategic, future-state goals.
- Driving Cross-Team Alignment: Act as a critical liaison between infrastructure, security, and product development teams. Proactively drive cross-team alignment on architectural standards, tooling choices, and development workflows to ensure consistency and shared accountability for system health.
- Bottleneck Identification and Mitigation: Systematically identify engineering bottlenecks, friction points, and points of organizational toil within the SDLC. Implement targeted solutions—whether technical, process-based, or organizational—to mitigate these constraints and enhance overall engineering velocity.
- Planning and Execution: Collaborate with engineering leadership to transform the strategic roadmap into actionable, prioritized plans, securing cross-functional buy-in and resources for successful execution.
Qualifications and Education Requirements:
- Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
- 10+ years of relevant experience in software engineering, cloud architecture, and/or Site Reliability Engineering, with at least 3 years in a leadership or lead contributor role.
- Deep expertise in AWS, including EKS, ECR, RDS, SQS/SNS, VPC, MWAA, and S3.
- Strong proficiency in Infrastructure as Code (IaC) tools (e.g., Terraform, CloudFormation).
- Specialized experience in optimizing large-scale data platforms, specifically with Apache Spark. Proven ability to profile, troubleshoot, and tune Spark jobs for performance, cost, and reliability.
- 5+ years of experience with Kubernetes and containerization in general, including associated tools (kubectl, Helm, ArgoCD).
- Strong knowledge of AWS cost optimization.
- Solid understanding of TCP/IP networking, including routing and AWS security groups.
- Excellent knowledge of CI/CD concepts and experience developing associated pipelines in CircleCI.
- Proficient in high-level scripting languages, including shell scripting, Python, and/or JavaScript.
- Experience with OTel and monitoring tools such as Splunk or DataDog. Experience with native AI observability tools is a plus.
- Experience with evaluating and rolling out GenAI tools for improving developer efficiency.
- Excellent communication, collaboration, and stakeholder management skills, with proven experience driving technical initiatives across multiple teams.
- Experience with researching and selecting new/modern developer toolsets and assisting teams in adopting them, including vendor assessments, security assessments, and the procurement process.
- Experience in an Ad-Tech or "BIG Data" processing organization is highly preferred.
Target cash compensation range: $163,620 - $212,710 USD Annually
We are committed to providing competitive, market-informed compensation. The cash compensation above includes base salary, variable commission for employees in eligible roles, and annual bonus targets for eligible roles. In addition to cash compensation, all full-time iSpotters are eligible to participate in iSpot's equity plan to receive stock options. Non-exempt roles will also be eligible for (pre-approved) overtime pay. Individual compensation packages are influenced by factors unique to each candidate, including their skills, experience, qualifications, and other job-related reasons.
For more information on total rewards package, go HERE
Hybrid & Flexible Workplace Policy
iSpot supports a hybrid and flexible workplace. Depending on location and work responsibilities, employees may be designated as full-time office-based, part-time office-based, or fully remote. A hybrid work schedule means that you work in the office some days and from home on others. The best hybrid workplaces allow for flexibility while also encouraging consistency.
Those living near one of our offices (Bellevue, WA or New York, NY) will work a hybrid schedule, coming into their local office 1-3 days a week, while those in a role that is not office-based and who are located farther from our offices will work a fully remote schedule. If you have questions about the exact details of our hybrid & flexible workplace policy, please let your recruiter know and they will discuss them with you further.
If you don't feel you meet every single requirement for the role, don't rule yourself out. Please apply anyway!
iSpot is an equal opportunity employer. All applicants will receive consideration for employment without regard to race, ethnicity, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. If you need assistance and/or a reasonable accommodation due to a disability during the application or the recruiting process, please contact our HR team.
California Residents applying for positions at iSpot can access our California Consumer Privacy Act here.


