Principal AI/ML Engineer, Reliability
$295.25k - $345.04k
Roblox
Every day, tens of millions of people come to Roblox to explore, create, play, learn, and connect with friends in 3D immersive digital experiences, all created by our global community of developers and creators.
At Roblox, we're building the tools and platform that empower our community to bring any experience they can imagine to life. Our vision is to reimagine the way people come together, from anywhere in the world, and on any device. We're on a mission to connect a billion people with optimism and civility, and we're looking for amazing talent to help us get there.
A career at Roblox means you'll be working to shape the future of human interaction, solving unique technical challenges at scale, and helping to create safer, more civil shared experiences for everyone.
As a Principal Machine Learning Engineer within Reliability, you will set the 3-5 year technical strategy and architectural blueprint for how machine learning systems and practices can improve the reliability of the overall Roblox platform. You will own the architectural and execution roadmap for leveraging massive data across logs, traces, metrics, and production changes to proactively detect issues before they become real problems (MTTD) and/or reduce the time to resolve incidents (MTTR). You will have the opportunity to collaborate cross-functionally with similar teams at Roblox to define best practices and software. You will:
- Define the strategy for leveraging machine learning engineering to improve production systems reliability at Roblox.
- Improve real-time anomaly detection capabilities by applying state-of-the-art ML techniques, directly improving Mean Time to Detect (MTTD) for production issues.
- Develop methods to build pipelines that consume various streams of data (metrics, logs, traces, change management systems, etc.).
- Build a reasoning layer that interacts with the streams of data to find possible root causes of problems happening in production.
- Build time-series models that predict capacity exhaustion and seasonal traffic spikes to drive automated scaling.
You have:
- Beyond off-the-shelf: Expert knowledge of various modeling techniques, and the ability to go deep and fine-tune models to fit our use cases.
- Ability to propose and architect the infrastructure that allows us to implement systems that learn from user and/or automated feedback.
- Good distributed systems fundamentals and an understanding of large-scale, high-throughput systems.
You are:
- Comfortable with Ambiguity: You thrive in undefined or open-ended problem spaces, providing structure, clarity, and decisive direction to your teams.
- A Pragmatic Builder: You are scrappy and impact-oriented. You view undefined data and messy systems as opportunities to build structure rather than blockers to progress.
- An Inspiring Leader: Passionate about developing the next generation of technical leaders, managers, and engineers.
- An Executive Communicator: Highly effective at communicating complex technical concepts to both engineering teams and non-technical executive leadership.
- Data & System Oriented: You understand that robust data and systems are the foundation of any production application, and you design infrastructure for scale, correctness, and reliability.
- Curious & Creative: You enjoy tackling hard problems, exploring new technologies, and driving continuous improvements in both systems and workflows.

