Senior Site Reliability Engineer, Observability [Remote]

Full-time

Chainlink Labs

Remote job

About Chainlink
Chainlink is the industry-standard oracle platform bringing the capital markets onchain and powering the majority of decentralized finance (DeFi). The Chainlink stack provides the essential data, interoperability, compliance, and privacy standards needed to power advanced blockchain use cases for institutional tokenized assets, lending, payments, stablecoins, and more. Since inventing decentralized oracle networks, Chainlink has enabled tens of trillions in transaction value and now secures the vast majority of DeFi.
Many of the world’s largest financial services institutions have also adopted Chainlink’s standards and infrastructure, including Swift, Euroclear, Mastercard, Fidelity International, UBS, S&P Dow Jones Indices, FTSE Russell, WisdomTree, ANZ, and top protocols such as Aave, Lido, GMX and many others. Chainlink leverages a novel fee model where offchain and onchain revenue from enterprise adoption is converted to LINK tokens and stored in a strategic Chainlink Reserve. Learn more at chain.link.
The Observability Team enables Chainlink development and empowers engineers to continue building and supporting crucial products and services that have a profound impact in the blockchain industry. Reliability is vital to the success of our company. As a Senior SRE, you will help us accelerate and enable other engineering teams by increasing self-service and decreasing cognitive load.
This job would be perfect for someone who has a strong DevOps mentality, is passionate about building and maintaining a mature GitOps environment, and has experience focusing on observability. The entire engineering team is expanding, and you would have plenty of opportunities to build, learn, and grow.
We all have different backgrounds and are determined to help you succeed no matter where you are or who you are. If you think you would do a great job at Chainlink, we are looking forward to speaking with you, even if you don't match 100% of the job requirements: those describe people we've usually had a great time working with, but they're not a tick-box exercise.
Your Impact
Build and orchestrate a modern OpenTelemetry-based observability platform
Support multiple telemetry types, like metrics, logs and traces.
Define and support modern observability governance and tackle observability problems at scale.
Ensure reliability, security, and performance exceed our defined SLAs
Work with engineers from across the company to help troubleshoot issues, deploy new products and services, and increase velocity while decreasing cognitive load
Lead the design and deployment of monitoring/observability services to detect and alert the team of needed action.
Ingest, aggregate, transform, and utilize data from a multitude of sources in our real time data pipeline.
Oversee the availability, performance, and supportability of our observability infrastructure.
Create processes around alert response operations and support the team to ensure the reliable delivery of oracle data.
Make recommendations to ensure sufficient metrics are collected to create alerts with every new feature release.
Champion reliability and security by taking the time to do your work right the first time
Requirements
7+ years of relevant professional experience. You have probably worked on a DevOps, infrastructure, SRE, and/or platform team before.
Ability to develop software outside of the scope of typical infrastructure requirements and configurations
Experience programming in C, C++, Java, Python, Go, Perl, or Ruby
Expert knowledge in all aspects of designing, developing, and managing large real-time systems
Experience with monitoring and logging. You know how to export metrics using Prometheus, have built a Grafana dashboard or two, and have experience with a centralized logging solution like an ELK Stack, Splunk or Grafana Stack.
Experience with distributed systems and container orchestration. You have maintained or even built Kubernetes clusters before and feel comfortable deploying completely new services on them
Strong communication skills. You can give and receive constructive feedback, and you do not shy away from planning meetings and code reviews
Desired Qualifications
Excitement for blockchain, Web 3.0, and similar decentralized technologies.
Experience running any infrastructure in the blockchain/web3 space
Ability to scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
Experience working remotely in a distributed team
A strong desire to grow and challenge yourself. We would expect you to constantly find ways to improve and automate services to reduce toil
Some of the tools and services we use daily or almost daily are:
AWS; Terraform/Terragrunt; Kubernetes, Calico and ArgoCD; Prometheus and Grafana; GitHub Actions; Packer
We expect you to be comfortable with most of those tools and very proficient in several of them.
All roles with Chainlink Labs are global and remote-based. Unless otherwise stated, we ask that you try to overlap some working hours with Eastern Standard Time (EST). We carefully review all applications and aim to provide a response to every candidate within two weeks after the job posting closes. The closing date is listed on the job advert, so we encourage you to take the time to thoughtfully prepare your application. We want to fully consider your experience and skills, and you will hear from us regarding the status of your application shortly after the closing date.
Commitment to Equal Opportunity
Chainlink Labs is an equal opportunity employer. All qualified applicants will receive equal consideration for employment in compliance with applicable laws, regulations, or ordinances. If you need assistance or accommodation due to a disability or special need when applying for a role or in our recruitment process, please contact us via this form.
Global Data Privacy Notice for Job Candidates and Applicants
Information collected and processed as part of your Chainlink Labs Careers profile, and any job applications you choose to submit, is subject to our Recruiting Privacy Policy. By submitting your application, you are agreeing to our use and processing of your data as required.

Vacancy posted a month ago
