Tech Lead - Data Infrastructure Site Reliability
$232.56k - $427.5k
ByteDance
Seattle, WA, US
Full-Time
IT
Consumer
ByteDance is a global incubator of platforms at the cutting edge of commerce, content, entertainment and enterprise services - over 2.5bn people interact with ByteDance products including TikTok.
Job Description
Our Site Reliability Engineering (SRE) team blends software and systems engineering to build and operate large-scale data infrastructure with high reliability and efficiency. We provide a dependable cloud environment that powers our global business. In this role, you will leverage your expertise in data center architecture, data infrastructure services, and systems and tools development to solve complex scaling and reliability challenges.

We're looking for a Technical Lead (SRE) who can provide deep technical leadership, drive architectural improvements, and collaborate effectively across multiple organizations. You'll partner with engineering, product, data, and infrastructure teams to deliver resilient, scalable platforms. This is a highly technical, hands-on role that requires strong problem-solving ability, clear communication, and the ability to influence without formal authority.

Responsibilities
- Strong hands-on skills in the design, development, and operation of large-scale cloud infrastructure and distributed systems.
- Collaborate with cross-functional teams (e.g., Advertising, Machine Learning, E-commerce, and Core Infra) to drive system reliability, performance, and scalability.
- Lead initiatives to automate operations, eliminate toil, and improve overall system efficiency.
- Troubleshoot complex production issues, perform root-cause analysis, and drive long-term reliability improvements.
- Promote best practices in system design, observability, performance optimization, and cost efficiency.
- Communicate complex technical concepts effectively to both technical and non-technical stakeholders.
Qualifications
Minimum Qualifications
- 5+ years of experience in Site Reliability Engineering, Software Development, or related fields, with a strong focus on designing, building, scaling, and operating cloud-based systems.
- Deep hands-on expertise in at least one of the following areas:
  - Databases (SQL/NoSQL)
  - Kubernetes or container orchestration
  - Big Data processing and storage systems (streaming and batch)
- Strong knowledge of system architecture, distributed systems, and performance bottlenecks.
- Excellent communication and collaboration skills, with experience working across engineering, product, and data science teams.

Preferred Qualifications
- Proven track record of driving automation, tooling, and process improvements that enhance reliability and efficiency.
- Experience in cost optimization and performance tuning at scale, backed by data-driven decision making.
- Thought leadership in adopting new technologies, improving operational practices, and influencing system design.
Compensation Description (Annually)
The base salary range for this position in the selected city is $232,560 - $427,500 annually. Compensation may vary outside of this range depending on a number of factors, including a candidate's qualifications, skills, competencies and experience, and location. Base pay is one part of the Total Package that is provided to compensate and recognize employees for their work, and this role may be eligible for additional discretionary bonuses/incentives, and restricted stock units.
Benefits may vary depending on the nature of employment and the country work location. Employees have day one access to medical, dental, and vision insurance, a 401(k) savings plan with company match, paid parental leave, short-term and long-term disability coverage, life insurance, and wellbeing benefits, among others. Employees also receive 10 paid holidays per year, 10 paid sick days per year, and 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure).
The Company reserves the right to modify or change these benefits programs at any time, with or without notice.