LLM AIOps Development Engineer - Data Center Networking
$202.16k - $368.22k
ByteDance
Responsibilities
About the team
Networking brings together innovative ideas and technologies from network architecture, software-defined networking (SDN), network virtualization, switch software and hardware co-design, and high-speed networking to create hyperscale data center networking solutions that power some of the world's most popular apps, such as Douyin and TikTok, which serve hundreds of millions of users around the globe.

The Network Observation team is committed to building a world-leading hyperscale data center network infrastructure that supports real-time access for hundreds of millions of users and the explosive growth of massive data volumes. We believe that the next generation of network operations will be fundamentally powered by artificial intelligence technologies, particularly Large Language Models (LLMs). We are seeking a passionate development engineer who combines deep networking expertise with innovative AIOps capabilities to join us in defining and building "autonomous" data center networks. Together, we will transform network operations from a reactive "firefighting" mode into a proactive, data-driven intelligent ecosystem with predictive and self-healing capabilities.

Responsibilities:
As a core member of our team, you will collaborate closely with our NetOps, SRE, and platform engineering teams to tackle the complexities of one of the world's largest data center networks. You will design and implement a closed-loop AIOps for Network platform, covering:
- Build a Panoramic Network Observability Platform: Develop a streaming telemetry data pipeline for both physical and virtual networks, integrating multi-source data from gNMI, Netconf, IPFIX/NetFlow, and SNMP to provide a high-quality, real-time data foundation for AIOps.
- Develop an Intelligent Diagnostics and Root Cause Analysis System: Apply machine learning and deep learning algorithms to perform anomaly detection, correlation analysis, and intelligent noise reduction on massive volumes of network metrics, logs, and events. Swiftly pinpoint root causes of failures across the entire stack, from optical transceivers and switch hardware to protocol adjacencies and application traffic.
- Explore Innovative Applications of LLMs and Agents:
  - Intelligent Operations Assistant: Build a conversational chatbot powered by Retrieval-Augmented Generation (RAG) that understands natural language queries, automatically queries knowledge bases and monitoring data, and provides precise troubleshooting guidance and network status reports.
  - Automated Remediation and Smart Runbooks: Train operational Agents to safely and controllably invoke network change tools and APIs. Empower them to autonomously generate, recommend, or even execute remediation plans and emergency runbooks based on their understanding of failure scenarios.
- Establish Capacity and Risk Prediction Capabilities: Forecast network capacity bottlenecks, high-risk links, and "sub-healthy" devices based on historical data and business growth models, enabling proactive scaling and preventative maintenance.
- Forge a Rock-Solid Engineering System: Adhere to engineering best practices to design and develop a highly available and scalable AIOps platform. Guarantee the stability and performance of the entire pipeline, from data collection and model training to online inference and automated closed-loop actions.

Qualifications
Minimum Qualifications:
- Solid Fundamentals in Computer Science and Networking: A deep understanding of data center network architectures (e.g., Spine-Leaf Fabric), and proficiency in key protocols such as EVPN/VXLAN and BGP/OSPF. In-depth knowledge of the Linux network stack is essential.
- Excellent Software Engineering Skills: Mastery of Golang or Python with outstanding coding and system design abilities. Familiarity with modern software development workflows, including microservices, containerization (Docker/Kubernetes), and CI/CD.
- Rich Platform Development Experience: Practical experience in one or more of the following areas is highly desirable:
  - Big Data Processing: Familiarity with Kafka, Flink, ClickHouse/TSDB, and experience building real-time data pipelines and analytics systems.
  - Observability Technologies: Experience with Prometheus/OpenTelemetry, graph databases (e.g., Neo4j), and developing alert and event platforms.
- A Passion for AIOps/ML/LLM Practices: A keen interest in the latest advancements in Large Models and Agent technologies, with thoughtful insights or hands-on experience in their application to operations (e.g., RAG, tool use, safety evaluation).

Preferred Qualifications:
- Experience in operating or developing for hyperscale (100,000+ servers) data center networks.
- Proven experience leading or making significant contributions to an LLM/Agent-based intelligent operations project with measurable business impact.
- Active contributions to open-source communities such as SONiC, P4/PINS, eBPF, Prometheus, or OpenTelemetry.
- In-depth research or practical experience in high-performance networking (RDMA/RoCE), SmartNICs (NIC Offload), or DPDK/eBPF.
- Experience building network configuration and control systems (e.g., based on SONiC, gNMI, Netconf).

Job Information
[For Pay Transparency] Compensation Description (Annually)
The base salary range for this position in the selected city is $202,160 - $368,220 annually. Compensation may vary outside of this range depending on a number of factors, including a candidate's qualifications, skills, competencies and experience, and location.
Base pay is one part of the Total Package that is provided to compensate and recognize employees for their work, and this role may be eligible for additional discretionary bonuses/incentives, and restricted stock units.
Benefits may vary depending on the nature of employment and the country work location. Employees have day one access to medical, dental, and vision insurance, a 401(k) savings plan with company match, paid parental leave, short-term and long-term disability coverage, life insurance, wellbeing benefits, among others. Employees also receive 10 paid holidays per year, 10 paid sick days per year and 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure).
The Company reserves the right to modify or change these benefits programs at any time, with or without notice.
For Los Angeles County (unincorporated) Candidates:
Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state, and local laws including the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. Our company believes that criminal history may have a direct, adverse and negative relationship on the following job duties, potentially resulting in the withdrawal of the conditional offer of employment:
1. Interacting and occasionally having unsupervised contact with internal/external clients and/or colleagues;
2. Appropriately handling and managing confidential information including proprietary and trade secret information and access to information technology systems; and
3. Exercising sound judgment.

About Us
Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Lemon8, CapCut and Pico as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.

Why Join ByteDance
Inspiring creativity is at the core of ByteDance's mission. Our innovative products are built to help people authentically express themselves, discover and connect - and our global, diverse teams make that possible. Together, we create value for our communities, inspire creativity and enrich life - a mission we work towards every day.
As ByteDancers, we strive to do great things with great people. We lead with curiosity, humility, and a desire to make impact in a rapidly growing tech company. By constantly iterating and fostering an "Always Day 1" mindset, we achieve meaningful breakthroughs for ourselves, our Company, and our users. When we create and grow together, the possibilities are limitless. Join us.
Diversity & Inclusion
ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

Reasonable Accommodation
ByteDance is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at
Vacancy posted 4 days ago


