Senior Site Reliability Engineer - Data Infrastructure (San Jose)

$212.8k - $387.6k

ByteDance

Location: San Jose

Team: Infrastructure

Employment Type: Regular

Job Code: A59871

Responsibilities

The Data Infrastructure SRE team is responsible for the reliability, scalability, and efficiency of the core data services that power our products. We manage a massive, distributed environment built on technologies like Kubernetes, Redis, MySQL, and Message Queue. Our work is not about building features, but about engineering the resilience and performance of the underlying platform that all product teams depend on. We are the guardians of production, ensuring our data systems run smoothly, nonstop.

This role includes participation in a rotational on-call schedule to ensure nonstop coverage for our critical data infrastructure. You will be expected to respond to, troubleshoot, and resolve production incidents. Our team collaborates across multiple time zones, and you will engage in rigorous change management and post-incident review processes to maintain system stability.

Role Summary

As a Site Reliability Engineer, you will be on the front lines of keeping our large-scale data systems running reliably and efficiently. You will focus on hands-on operational work, from responding to alerts and managing production changes to automating routine tasks. This role is an excellent opportunity to develop deep expertise in modern infrastructure technologies and SRE practices while working alongside senior engineers to solve challenging problems.

Responsibilities:

  • Incident response and postmortems: Act as an incident commander for critical production issues, guiding the team through triage and resolution. Drive deep, blameless post-incident reviews and ensure that follow-up actions are implemented to prevent recurrence.
  • SLO/SLA and error budgets: Define, negotiate, and maintain Service Level Objectives (SLOs) for critical data services. Champion the use of error budgets to balance reliability work with feature development.
  • Capacity and cost optimization: Lead initiatives in capacity planning, performance tuning, and resource management. Develop strategies and automation to ensure our infrastructure scales efficiently and stays within budget.
  • Pragmatic automation and AI orchestration: Design and build automation and leverage AI Agents to eliminate operational toil, improve deployment safety, and enhance overall operational efficiency. Focus on creating maintainable, robust tools and intelligent workflows that make the entire team more effective.
  • Operational excellence and change management: Uphold and improve our standards for production operations, including runbooks, monitoring, and alerting. Vet complex changes and deployments to ensure they meet our bar for production readiness.
  • Data Center and AI Infrastructure: Lead the construction, maintenance, and optimization of data centers and specialized AI infrastructure, ensuring high availability and peak performance for complex AI-driven workloads.
  • Cross-team influence and mentorship: Act as a subject matter expert on reliability, consulting with application development and other infrastructure teams. Mentor junior SREs, helping them develop their technical and operational skills.
Qualifications

Minimum Qualifications:

  • Bachelor's degree in Computer Science, a related technical field, or equivalent practical experience.
  • 5+ years of experience in a Site Reliability Engineering, Production Engineering, or similar role.
  • Strong proficiency in a programming or scripting language (e.g., Go, Python, Bash) for automation and tool development.
  • Deep understanding of Linux/Unix operating systems, networking fundamentals (TCP/IP, DNS), and distributed systems.

Preferred Qualifications:

  • Extensive hands-on experience managing large-scale data infrastructure (e.g., MySQL, Redis, Kafka, Flink).
  • Proven experience with container orchestration technologies, particularly Kubernetes, in a production environment.
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems.
  • A systematic problem-solving approach, coupled with strong communication skills and a sense of ownership.
  • Experience leading incident response for complex, high-impact events.
  • Experience in the operation and construction of Data Centers is a big plus.
Job Information

For Pay Transparency

Compensation Description (Annually)

The base salary range for this position in the selected city is $212,800 - $387,600 annually. Compensation may vary outside of this range depending on a number of factors, including a candidate's qualifications, skills, competencies and experience, and location. Base pay is one part of the Total Package that is provided to compensate and recognize employees for their work, and this role may be eligible for additional discretionary bonuses/incentives, and restricted stock units. Benefits may vary depending on the nature of employment and the country work location. Employees have day one access to medical, dental, and vision insurance, a 401(k) savings plan with company match, paid parental leave, short-term and long-term disability coverage, life insurance, wellbeing benefits, among others. Employees also receive 10 paid holidays per year, 10 paid sick days per year and 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure). The Company reserves the right to modify or change these benefits programs at any time, with or without notice.

For Los Angeles County (unincorporated) Candidates: Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state, and local laws including the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. Our company believes that criminal history may have a direct, adverse and negative relationship on the following job duties, potentially resulting in the withdrawal of the conditional offer of employment: 1. Interacting and occasionally having unsupervised contact with internal/external clients and/or colleagues; 2. Appropriately handling and managing confidential information including proprietary and trade secret information and access to information technology systems; and 3. Exercising sound judgment.

About Us

Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Lemon8, CapCut and Pico as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.

Why Join ByteDance

Inspiring creativity is at the core of ByteDance's mission. Our innovative products are built to help people authentically express themselves, discover and connect – and our global, diverse teams make that possible. Together, we create value for our communities, inspire creativity and enrich life - a mission we work towards every day. As ByteDancers, we strive to do great things with great people. We lead with curiosity, humility, and a desire to make impact in a rapidly growing tech company. By constantly iterating and fostering an "Always Day 1" mindset, we achieve meaningful breakthroughs for ourselves, our Company, and our users. When we create and grow together, the possibilities are limitless. Join us.

Diversity & Inclusion

ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

Reasonable Accommodation

ByteDance is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at

Vacancy posted 4 hours ago