Principal Site Reliability Engineer
$84.9k - $209.5k
Oracle
Job Description
Designs and architects infrastructure and service to ensure reliability and functionality. Forecasts demands and responds to capacity needs. Collaborates with software development teams to develop reliable and scalable infrastructures. Exercises judgment when performing data collection to maintain and optimize operations and reliability. Leverages advanced knowledge to perform incident response and/or maintenance tasks. Provides comprehensive health and performance reporting. Identifies and recommends opportunities for automation. Communicates comprehensive information about services and proactively anticipates and articulates the potential impact of changes. Provides comprehensive support for technology and documents incidents. Conducts advanced experiments with new tools and develops and maintains advanced knowledge of site reliability trends.
Responsibilities
Key Responsibilities
Capacity Ingestion and Management:
Designs and architects infrastructure and/or service according to terms for reliability and functionality.
Forecasts demands for infrastructure and responds to capacity needs, ensuring systems have sufficient resources to handle current and future workloads and identifying resource gaps.
Collaborates with the software development team to develop infrastructures, ensuring features are reliable and scalable according to deployment requirements.
Proactively identifies opportunities for prototyping and drives prototyping initiatives (e.g., testing new applications or infrastructures, assisting in onboarding) to explore novel approaches.
Incident and Service Lifecycle Management:
Exercises judgment when performing data collection, triage, technical analysis, and redirection to maintain and optimize operations and infrastructure reliability.
Takes proactive steps to monitor services, maintain up-to-date knowledge of their performance, and document their condition.
Leverages advanced knowledge to perform incident response, root cause analyses, and/or maintenance on assigned services (e.g., software installs, version upgrades, security updates, backup and recovery).
Provides comprehensive health and performance reporting and takes appropriate actions based on trends in data.
May perform provisioning to support infrastructure, applications, and services.
May experiment with new approaches to decommissioning and perform decommissioning (e.g., shutting down servers, removing data from databases) to remove objects that are no longer needed.
Automation:
Identifies and recommends opportunities for automation and assesses potential benefits to enhance operational efficiency.
Develops and implements design, automation tools, or scripts to provide solutions, gather metrics, monitor, analyze, mitigate, or remediate issues/defects within infrastructures.
Conducts testing on moderately complex automations to ensure they perform tasks correctly and produce expected results.
Technical Communication and Guidance:
Writes release notes and/or communicates comprehensive information about the scale, capacity, security, performance attributes, and requirements of services and technology with customers and immediate and related teams.
Proactively anticipates and articulates the potential impact of infrastructure, feature, and tool changes, considering their impact across team operations.
Serves as a resource to team members on what information to communicate and how to communicate.
Troubleshooting and Resolution:
Provides comprehensive operational support for technology, serving as a key escalation point for incidents and moderately complex issues arising within Oracle services.
Drives and actively participates in on-call shifts to address issues.
Executes the resolution of technical issues spanning multiple services, applying advanced investigation and debugging techniques to achieve SLOs (service level objectives).
Documents incidents according to reporting methods and performs root cause analyses, capturing essential information for analysis and future reference.
Performs post-mortem procedures to prevent incident reoccurrence.
Innovation and Improvement:
Conducts advanced experiments and evaluations of cutting-edge tools and technologies to optimize infrastructure performance and reliability, taking proactive steps to adhere to security standards.
Identifies and seeks opportunities to execute improvements for performance bottlenecks and deployments, ensuring efficient resource usage, speed, and scalability.
Develops and maintains advanced knowledge of site reliability trends, sharing valuable insights and information with senior team members, management, and beyond to promote innovative building, testing, deploying, and running services.
Performs moderately complex analyses and provides clear data on production to drive business development decisions (e.g., design changes).
Core Responsibilities
Planning & Execution:
Manages and coordinates moderately complex tasks, monitoring timelines and deliverables to ensure timely completion and adherence to requirements for a moderately sized project or initiative. Efficiently delegates, monitors, and prioritizes work across multiple projects, providing technical oversight and adjusting plans to address shifts in resources or timelines.
Collaboration & Partnership:
Collaborates across the organization to align on expectations and achieve shared objectives. Leverages understanding of business leaders, stakeholders, and/or customers to ensure proposed solutions meet their needs. Supports inclusivity by actively seeking and listening to diverse perspectives, ensuring others feel heard and respected.
Problem Solving:
Identifies and addresses moderately complex issues by analyzing a wide range of data and/or information to identify solutions in accordance with standard practices. Proactively escalates unresolved or critical issues with a thorough assessment and suggests potential solutions. Reviews, contributes to, and documents problem solving strategies.
Continuous Learning:
Pursues learning opportunities to expand knowledge and skills and/or tools in new areas and stays abreast of the latest industry trends and best practices. Proactively seeks and leverages ongoing feedback and training to improve skills. Coaches and mentors junior team members, fostering continuous learning and knowledge sharing within and across teams.
Continuous Improvement:
Develops ideas, recommends updates, and/or collaborates on the implementation of process improvements to increase the efficiency and effectiveness of processes, protocols, and workflows across teams, and evaluates the impact on key stakeholders. Solicits feedback from others on ideas for alternative approaches and methods for continued improvement.
Performance and Development:
Contributes to the talent development pipeline by participating in candidate interviews, assessing candidates, and providing hiring recommendations.
Disclaimer:
Certain U.S. based or U.S. customer or client-facing roles may be required to comply with applicable requirements, such as immunization/occupational health mandates, and/or drug testing requirements.
Range and benefit information provided in this posting is specific to the stated locations only.
US: Hiring Range in USD from: $84,900 to $209,500 per annum. May be eligible for bonus and equity.
Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions and locations, as well as reflect Oracle's differing products, industries and lines of business.
Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.
Oracle US offers a comprehensive benefits package which includes the following:
Medical, dental, and vision insurance, including expert medical opinion
Short term disability and long term disability
Life insurance and AD&D
Supplemental life insurance (Employee/Spouse/Child)
Health care and dependent care Flexible Spending Accounts
Pre-tax commuter and parking benefits
401(k) Savings and Investment Plan with company match
Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
11 paid holidays
Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
Paid parental leave
Adoption assistance
Employee Stock Purchase Plan
Financial planning and group legal
Voluntary benefits including auto, homeowner and pet insurance
The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.
Career Level - IC4
About Us
Only Oracle brings together the data, infrastructure, applications, and expertise to power everything from industry innovations to life-saving care. And with AI embedded across our products and services, we help customers turn that promise into a better future for all. Discover your potential at a company leading the way in AI and cloud solutions that impact billions of lives.
True innovation starts when everyone is empowered to contribute. That's why we're committed to growing a workforce that promotes opportunities for all with competitive benefits that support our people with flexible medical, life insurance, and retirement options. We also encourage employees to give back to their communities through our volunteer programs.
We're committed to including people with disabilities at all stages of the employment process. If you require accessibility assistance or accommodation for a disability at any point, let us know by emailing [email address not shown] or by calling [phone number not shown] in the United States.
Oracle is an Equal Employment Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability and protected veterans' status, or any other characteristic protected by law. Oracle will consider for employment qualified applicants with arrest and conviction records pursuant to applicable law.

