Site Reliability Engineer [Remote]

$160k - $230k

Together AI

Remote

As a Site Reliability Engineer (SRE) at Together, you are responsible for keeping all user-facing services and production systems running smoothly. You are a blend of a pragmatic operator and a software engineer who applies sound engineering principles, operational discipline, and mature automation to our operating environments and codebase.

You specialize in systems (operating systems, storage subsystems, networking) and implement best practices for availability, reliability, and scalability, with broad interests in algorithms and distributed systems.

Requirements

  • 5+ years of professional SRE or related experience
  • Bachelor's degree in Computer Science or a related field or equivalent work experience
  • Expert knowledge of Ansible (roles, playbooks), Terraform, and Kubernetes
  • Proficiency in programming/scripting languages
  • Direct experience in monitoring and observability practices
  • Advanced knowledge of cloud services
  • Ability to thrive in a collaborative environment involving different stakeholders and subject matter experts

Responsibilities

  • Be on an on-call (PagerDuty) rotation to respond to incidents that impact availability
  • Build and run our infrastructure with Ansible, Terraform, and Kubernetes to enable scaling to a massive number of concurrent users
  • Build monitoring systems to ensure the highest quality service for our customers
  • Design and implement operational processes (such as deployments and upgrades)
  • Debug production issues across all services and levels of the stack
  • Identify improvements for the product architecture from the reliability, performance and availability perspectives 
  • Plan the growth of Together AI’s infrastructure

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at

Vacancy posted more than 2 months ago