Workplace Platforms - Site Reliability Engineer (SRE) Lead - Dallas
Goldman Sachs
Endpoint Compute SRE Lead
The Workplace Engineering organization is responsible for the reliability, resilience, and operational integrity of the firm's endpoint compute platforms and services, including:
- Corporate-owned physical devices
- Virtual and cloud-hosted desktops
- Core endpoint services such as device lifecycle management, access and identity integration, profile and session services, and application delivery frameworks
The Endpoint Compute SRE function applies Site Reliability Engineering (SRE) principles to ensure these platforms and services are highly available, observable, scalable, and recoverable, while meeting operational and regulatory expectations.
We are seeking an Endpoint Compute SRE Lead to own reliability engineering and operational excellence across endpoint compute platforms and their foundational services.
This role is focused on systems and services, not applications, and covers the reliability of:
- Endpoint compute platforms (physical, virtual, cloud desktops)
- Device and desktop lifecycle services
- Access and sign-in dependency platforms
- Profile, policy, and session services
- Application delivery and execution frameworks (packaging, deployment, and availability, not app functionality)
The successful candidate will define service-level objectives, observability strategies, failure models, and operational practices that ensure a predictable and resilient end-user compute experience at enterprise scale.
Job Responsibilities
Reliability Engineering Across Endpoint Services
- Own end-to-end reliability of endpoint compute platforms and supporting services
- Define service boundaries, dependencies, and critical paths from user sign-in through productive desktop use
- Model failure modes and blast radius across lifecycle, access, and delivery services
- Drive designs that support graceful degradation and fast recovery
Observability & Telemetry
- Establish observability standards across endpoint compute services, including:
  - Enrollment and provisioning success rates
  - Access and session establishment health
  - Policy and profile delivery latency/failures
  - Application delivery availability
- Ensure telemetry enables:
  - Fast incident detection
  - Root cause analysis
  - Proactive trend identification
SLOs, SLIs & Error Budgets
- Define SLOs and SLIs for key endpoint services (e.g., sign-in success, provisioning time, policy convergence)
- Implement error budget frameworks to guide change, security control rollout, and platform evolution
- Use reliability signals to influence platform design and operational priorities
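As an illustration of how an SLI, an SLO target, and an error budget relate, here is a minimal Python sketch for a sign-in success objective. The `SloWindow` structure, the 99.5% target, the event counts, and the 25% freeze threshold are all assumptions made for the example; none of them come from the firm's actual tooling or policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloWindow:
    """Counts for one SLO evaluation window (hypothetical structure)."""
    total_events: int   # e.g. sign-in attempts observed in the window
    good_events: int    # e.g. sign-ins that completed successfully
    slo_target: float   # e.g. 0.995 means 99.5% of attempts must succeed


def sli(window: SloWindow) -> float:
    """Fraction of good events; treated as 1.0 when there was no traffic."""
    if window.total_events == 0:
        return 1.0
    return window.good_events / window.total_events


def error_budget_remaining(window: SloWindow) -> float:
    """Share of the window's error budget still unspent (negative if overspent)."""
    allowed_bad = (1.0 - window.slo_target) * window.total_events
    actual_bad = window.total_events - window.good_events
    if allowed_bad == 0:
        # A 100% target (or no traffic) leaves no budget to spend.
        return 1.0 if actual_bad == 0 else float("-inf")
    return 1.0 - actual_bad / allowed_bad


def change_allowed(window: SloWindow, freeze_threshold: float = 0.25) -> bool:
    """Assumed policy: pause risky rollouts once less than 25% of budget remains."""
    return error_budget_remaining(window) > freeze_threshold


# Illustrative numbers only: 200,000 sign-in attempts, 600 failures.
window = SloWindow(total_events=200_000, good_events=199_400, slo_target=0.995)
print(f"SLI: {sli(window):.4f}")
print(f"Error budget remaining: {error_budget_remaining(window):.0%}")
print(f"Change allowed: {change_allowed(window)}")
```

With these assumed numbers, the target permits about 1,000 failed sign-ins in the window, so 600 failures leave roughly 40% of the budget. A policy like `change_allowed` is one way such a signal could gate security control rollouts or platform changes.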
Incident, Problem & Resilience Management
- Lead reliability aspects of incident response involving endpoint compute or services
- Drive post-incident reviews focused on systemic corrections
- Identify recurring failure patterns in:
  - Lifecycle flows
  - Access paths
  - Policy or profile delivery
- Sponsor and track permanent fixes, not workarounds
Operational Excellence & Automation
- Define and maintain runbooks, playbooks, and escalation models for endpoint services
- Drive automation to reduce:
  - Manual remediation
  - Repeat incidents
  - Operational toil
- Influence engineering designs to improve operability and debuggability
Risk & Governance Alignment
- Partner with Technology Risk and Security teams to:
  - Demonstrate reliability and recoverability controls
  - Support operational risk and resilience assessments
  - Provide audit-ready evidence for availability and incident management
- Ensure reliability metrics support control effectiveness narratives
Leadership & Collaboration
- Act as the reliability authority for endpoint compute and services
- Partner closely with:
  - Endpoint platform engineers
  - Device management teams
  - Security engineering and identity teams
- Mentor engineers in applying SRE principles to workplace platforms
- Communicate reliability posture clearly to leadership
Basic Qualifications
- 8+ years in SRE, platform operations, reliability engineering, or workplace infrastructure roles
- Strong experience operating endpoint compute platforms and core supporting services at enterprise scale
- Proven ability to define and implement:
  - Observability frameworks
  - SLOs / SLIs
  - Incident and problem management models
- Strong systems thinking across lifecycle, access, and service dependencies
- Excellent documentation and communication skills
Preferred Qualifications
- Experience applying SRE concepts to end-user computing or digital workplace platforms
- Deep understanding of:
  - Device lifecycle and provisioning services
  - Identity and access dependencies (availability-focused)
  - Profile, policy, and session orchestration
- Experience in regulated or high-assurance environments
- Strong ability to influence architecture using data-driven reliability insights
What Success Looks Like
- Endpoint compute and services have clear reliability targets
- Lifecycle, access, and delivery failures are predictable, observable, and fast to remediate
- Incidents are less frequent, shorter, and less impactful
- Platforms are designed with operability and resilience built in
- Leadership has confidence in desktop stability as a service
Job Info
- Job Identification: 163435
- Job Category: Vice President
- Locations: Dallas, TX, United States

