Director, Site Reliability Engineering
$121.5k - $306.4k
Oracle
Job Description
Provides leadership to one or more teams designing and architecting infrastructure and services, and provides input on best practices for reliability and functionality. Establishes direction to ensure accurate forecasting and that systems have adequate resources. Builds collaborative relationships with the software development team to create reliable, scalable infrastructures. Ensures alignment regarding data collection and contributes to standards for optimizing operations and infrastructure reliability. Defines approaches for incident response activities to ensure service reliability. Ensures teams provide in-depth health and performance reports. Plays a key role in developing standards for identifying and recommending automation opportunities. Anticipates and explains the impact of changes, mentoring other managers on what to communicate. Defines approaches for escalating incidents and refines methods for incident documentation. Encourages experimenting with new technology, executing improvements, building site reliability knowledge, and providing clear data.
Responsibilities
Key Responsibilities
Capacity Ingestion and Management:
-Provides leadership for one or more teams designing and architecting infrastructure and/or service, providing input on the development of best practices for adhering to terms for reliability and functionality.
-Establishes direction for other managers and senior-level individuals to drive the forecasting of demands for infrastructure and respond to capacity needs, ensuring that systems have sufficient resources to meet current and future workloads and identifying and addressing resource gaps.
-Builds collaborative relationships with senior software development team members to design and develop infrastructures that are highly reliable and scalable, meeting stringent deployment requirements.
-Ensures teams align on expectations for identifying opportunities for prototyping and oversees prototyping initiatives (e.g., testing new applications or infrastructures, assisting in onboarding), experimenting with cutting-edge approaches.
Incident and Service Lifecycle Management:
-Ensures alignment across teams regarding performing data collection, triage, technical analysis, and redirection, contributing to the development of standards to maintain and optimize operations and infrastructure reliability.
-Shares techniques across teams for monitoring of services, maintaining up-to-date knowledge of their performance, and thoroughly documenting their condition.
-Defines approaches for performing incident response, root cause analysis, and/or maintenance on assigned services (e.g., software installs, version upgrades, security updates, backup and recovery) and drives execution.
-Ensures teams provide in-depth health and performance reporting and coordinates managerial actions based on trends in data.
-Refines procedures for performing provisioning to support infrastructure, applications, and services, mentoring team members.
-Provides input on standards for decommissioning (e.g., shutting down servers, removing data from databases) to remove objects that are no longer needed.
Automation:
-Plays a key role in developing standards for identifying and recommending opportunities for automation and reviewing potential benefits in terms of metrics across teams to ensure expectations are met.
-Ensures alignment on expectations for developing designs, automation tools, or scripts, and drives their implementation.
-Refines strategies for conducting testing on highly complex automations to ensure they perform tasks correctly and produce expected results.
-Provides guidance and expertise to others testing automations.
Technical Communication and Guidance:
-Shares expectations for release notes and communication of in-depth information about the scale, capacity, security, performance attributes, and requirements of services and technology with customers, cross-functional teams and leadership.
-Anticipates and explains the potential impact of infrastructure, feature, and tool changes, considering the strategic implications and goals.
-Takes a leadership role in mentoring other managers on what information to communicate and how to communicate.
Troubleshooting and Resolution:
-Defines approaches for escalating incidents and other highly complex issues arising within Oracle services within and across teams.
-Coordinates with other team leaders to review service performance and ensure the resolution of technical issues spanning multiple services and customers, encouraging collaboration across teams and leveraging advanced investigation and debugging techniques to ensure the achievement of SLOs (service level objectives).
-Refines standard reporting methods for incident documentation and performing root cause analyses, aiming to capture insights and lessons learned for continuous improvement and knowledge sharing.
-Plays a key role in creating guidelines for post-mortem procedures to prevent incident recurrence.
-Communicates with other team leaders to ensure adherence to service level agreements (SLAs) made with customers.
Innovation and Improvement:
-Encourages creativity and innovation and coordinates with other leaders to drive the exploration and adoption of innovative tools and technologies to transform infrastructure performance and reliability, investigating implications of adherence to security standards on other integrations.
-Provides input on initiatives to improve performance bottlenecks and optimize deployments, aligning other leaders on expectations for efficient resource usage, speed, and scalability and driving roadmap development.
-Refines standards for developing and maintaining knowledge of site reliability trends and sharing valuable insights and information cross-functionally to drive innovation in building, testing, deploying, and running services.
-Plays a key role in the review of analyses and data, driving and influencing business development decisions (e.g., design changes).
Core Responsibilities
Planning & Execution:
-Oversees and guides multiple teams on managing complex projects or initiatives, monitoring timelines, deliverables, and budgets when applicable to ensure strategic objectives are met. Serves as a role model for appropriately delegating work, setting priorities, and ensuring alignment with business needs. Coaches others on adjusting resources or project timelines in anticipation of business changes.
Collaboration & Partnership:
-Serves as a role model for leading cross-functional collaborative efforts to ensure alignment of expectations and strategic objectives. Empowers the team to build and maintain partnerships with business leaders, stakeholders, and/or customers to address barriers and contribute to organizational success. Drives transparency and inclusivity by actively seeking, listening to, and leveraging diverse perspectives.
Problem Solving:
-Shares problem-solving strategies across teams, providing oversight on complex operational and/or technical issues, as needed. Coaches teams on analyzing highly complex data and/or information to identify solutions to ambiguous issues and provides direction on identifying root causes to prevent recurrence of issues.
Continuous Learning:
-Pursues strategic learning opportunities to maintain expertise and apply best practices at the organizational level. Creates opportunities for team members and leaders to build their expertise in new areas, coaching them to build innovative skills. Identifies skill gap trends across the organization, and upholds a culture that places significant emphasis on sharing knowledge and pursuing learning opportunities that advance the organization. Evaluates efficiency of learning strategies and recommends adjustments as needed.
Continuous Improvement:
-Empowers team to own the development and implementation of ideas that increase the efficiency and effectiveness of processes, protocols, and workflows across the department. Coaches teams to gain buy-in for ideas and to seek feedback on approaches and methods for continued improvement. Prioritizes and reviews the roadmap of improvement initiatives to ensure alignment with strategic direction and maximize return on investments.
Performance and Development:
-Serves as a role model for driving performance across teams through tailored feedback and coaching in alignment with performance management processes, guidelines, and expectations. Drives consistency in the application of talent development procedures and socializes performance expectations across the organization. Ensures that individual development goals are aligned with organizational strategic initiatives. Collaborates with HR to implement talent strategy through hiring and promotion processes.
Disclaimer:
Certain U.S. based or U.S. customer or client-facing roles may be required to comply with applicable requirements, such as immunization/occupational health mandates, and/or drug testing requirements.
Range and benefit information provided in this posting is specific to the stated locations only.
US: Hiring Range in USD from: $121,500 to $306,400 per annum. May be eligible for bonus, equity, and compensation deferral.
Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle's differing products, industries and lines of business.
Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.
Oracle US offers a comprehensive benefits package which includes the following:
Medical, dental, and vision insurance, including expert medical opinion
Short term disability and long term disability
Life insurance and AD&D
Supplemental life insurance (Employee/Spouse/Child)
Health care and dependent care Flexible Spending Accounts
Pre-tax commuter and parking benefits
401(k) Savings and Investment Plan with company match
Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
11 paid holidays
Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
Paid parental leave
Adoption assistance
Employee Stock Purchase Plan
Financial planning and group legal
Voluntary benefits including auto, homeowner and pet insurance
The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.
Career Level - M4
About Us
Only Oracle brings together the data, infrastructure, applications, and expertise to power everything from industry innovations to life-saving care. And with AI embedded across our products and services, we help customers turn that promise into a better future for all. Discover your potential at a company leading the way in AI and cloud solutions that impact billions of lives.
True innovation starts when everyone is empowered to contribute. That’s why we’re committed to growing a workforce that promotes opportunities for all with competitive benefits that support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We’re committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by email or by phone in the United States.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans’ status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.


