Senior Site Reliability Engineer, Observability [Remote]

Full-time

Chainlink Labs

United States
  • Remote job
About Chainlink

Chainlink is the industry-standard oracle platform bringing the capital markets onchain and powering the majority of decentralized finance (DeFi). The Chainlink stack provides the essential data, interoperability, compliance, and privacy standards needed to power advanced blockchain use cases for institutional tokenized assets, lending, payments, stablecoins, and more. Since inventing decentralized oracle networks, Chainlink has enabled tens of trillions in transaction value and now secures the vast majority of DeFi.

Many of the world’s largest financial services institutions have also adopted Chainlink’s standards and infrastructure, including Swift, Euroclear, Mastercard, Fidelity International, UBS, S&P Dow Jones Indices, FTSE Russell, WisdomTree, ANZ, and top protocols such as Aave, Lido, GMX and many others. Chainlink leverages a novel fee model where offchain and onchain revenue from enterprise adoption is converted to LINK tokens and stored in a strategic Chainlink Reserve. Learn more at chain.link.

The Observability Team enables Chainlink development and empowers engineers to continue building and supporting crucial products and services that have a profound impact in the blockchain industry. Reliability is vital to the success of our company. As a Senior SRE, you will help us accelerate and enable other engineering teams by increasing self-service and decreasing cognitive load.

This job would be perfect for someone who has a strong DevOps mentality, is passionate about building and maintaining a mature GitOps environment, and has experience focusing on observability. The entire engineering team is expanding, and you would have plenty of opportunities to build, learn, and grow.

We all have different backgrounds and are determined to help you succeed no matter where you are or who you are. If you think you would do a great job at Chainlink, we are looking forward to speaking with you, even if you don't match 100% of the job requirements: those describe people we've usually had a great time working with, but they're not a tick-box exercise.

Your Impact
  • Build and orchestrate a modern OTEL-based observability platform
  • Support multiple telemetry types, like metrics, logs and traces.
  • Define and support modern observability governance, and solve problems at scale.
  • Ensure reliability, security, and performance exceed our defined SLAs
  • Work with engineers from across the company to help troubleshoot issues, deploy new products and services, and increase velocity while decreasing cognitive load
  • Lead the design and deployment of monitoring/observability services to detect and alert the team of needed action.
  • Ingest, aggregate, transform, and utilize data from a multitude of sources in our real time data pipeline.
  • Oversee the availability, performance, and supportability of our observability infrastructure.
  • Create processes around alert response operations and support the team to ensure the reliable delivery of oracle data.
  • Make recommendations to ensure sufficient metrics are collected to create alerts with every new feature release.
  • Champion reliability and security by taking the time to do your work right the first time
Requirements
  • 7+ years of relevant professional experience. You have probably worked on a DevOps, infrastructure, SRE, and/or platform team before
  • Ability to develop software outside of the scope of typical infrastructure requirements and configurations
  • Experience programming in C, C++, Java, Python, Go, Perl, or Ruby
  • Expert knowledge in all aspects of designing, developing, and managing large real-time systems
  • Experience with monitoring and logging. You know how to export metrics using Prometheus, have built a Grafana dashboard or two, and have experience with a centralized logging solution like an ELK Stack, Splunk or Grafana Stack.
  • Experience with distributed systems and container orchestration. You have maintained or even built Kubernetes clusters before and feel comfortable deploying completely new services on them
  • Strong communication skills. You can give and receive constructive feedback, and you do not shy away from planning meetings and code reviews
Desired Qualifications
  • Excitement for blockchain, Web 3.0, and similar decentralized technologies.
  • Experience running any infrastructure in the blockchain/web3 space
  • Ability to scale systems sustainably through mechanisms like automation, and evolve systems by pushing for changes that improve reliability and velocity
  • Experience working remotely in a distributed team
  • A strong desire to grow and challenge yourself. We would expect you to constantly find ways to improve and automate services to reduce toil
Some of the tools and services we use daily or almost daily are:
  • AWS; Terraform/Terragrunt; Kubernetes, Calico and ArgoCD; Prometheus and Grafana; GitHub Actions; Packer
  • We expect you to be comfortable with most of those tools and very proficient in several of them.
All roles with Chainlink Labs are global and remote-based. Unless otherwise stated, we ask that you try to overlap some working hours with Eastern Standard Time (EST). We carefully review all applications and aim to provide a response to every candidate within two weeks after the job posting closes. The closing date is listed on the job advert, so we encourage you to take the time to thoughtfully prepare your application. We want to fully consider your experience and skills, and you will hear from us regarding the status of your application shortly after the closing date.

Commitment to Equal Opportunity

Chainlink Labs is an equal opportunity employer. All qualified applicants will receive equal consideration for employment in compliance with applicable laws, regulations, or ordinances. If you need assistance or accommodation due to a disability or special need when applying for a role or in our recruitment process, please contact us via this form.

Global Data Privacy Notice for Job Candidates and Applicants

Information collected and processed as part of your Chainlink Labs Careers profile, and any job applications you choose to submit, is subject to our Recruiting Privacy Policy. By submitting your application, you are agreeing to our use and processing of your data as required.

Vacancy posted a month ago