Staff Software Engineer, ML Infrastructure
$146.6k - $215.1k
SimpliSafe
Job Description
About SimpliSafe
We're a high-tech home security company that's passionate about protecting the life you've built and our mission of keeping Every Home Secure. And we've created a culture here that cares just as deeply about the career you're building. Ours is a no ego culture of collaboration and innovation where those seeking their next challenge can find big opportunities and make a huge impact on the lives of all those who we protect. We don't just want you to work here. We want you to grow and thrive here.
We're embracing a hybrid work model that enables our teams to split their time between office and home. Hybrid for us means we expect our teams to come together in our state-of-the-art office on two core days, typically Tuesday, Wednesday, or Thursday, working together in person and choosing where they work for the remainder of the week. We all benefit from flexibility and get to use the best of both worlds to get our work done.
Why are we hiring?
Well, we're growing and thriving. So, we need smart, talented, and humble people who share our values to join us as we disrupt the home security space and relentlessly pursue our mission of keeping Every Home Secure.
About the Role
We're looking for a Staff Software Engineer to join our Cloud ML team, the team that owns both the cloud-side ML infrastructure and the applied ML research that powers SimpliSafe's intelligent home security products. This is a senior individual contributor role for a distributed systems expert who wants to apply that craft to one of the most demanding problem domains in the company.
You'll partner closely with other Staff and Principal engineers to drive architecture, mentor across the team, and set the technical direction for our ML platform. The work spans two of our most demanding workloads: real-time computer vision inference that processes video from cameras and doorbells across our customer base, and LLM/GenAI infrastructure that will power our future generation of intelligent applications. Both are, fundamentally, distributed systems problems — high-throughput, low-latency, multi-tenant, GPU-aware, and unforgiving of regressions.
This role is for someone who has built and operated large-scale distributed services in production — high-QPS APIs, real-time platforms, low-latency serving systems — and is excited to bring that depth to ML infrastructure. Prior ML experience is a plus, not a prerequisite. If you've shipped systems that serve a lot of traffic, scale gracefully, and stay up at 3am, we want to talk to you.
What You'll Do
Set technical direction for ML infrastructure
- Drive architecture decisions for our Kubernetes-based ML platform — anchored on Ray for inference, alongside KServe, Triton, and vLLM — across real-time and batch workloads.
- Lead deep technical reviews on system design, capacity planning, and reliability for the highest-stakes ML systems at SimpliSafe.
- Identify and remove the systemic bottlenecks in our ML deployment infrastructure — whether that's serving reliability, deployment friction, observability gaps, scaling, or cost.
Build and operate real-time CV inference at scale
- Own the design and evolution of cloud-side inference systems that process live video and events from SimpliSafe devices in real time.
- Drive throughput, latency, and cost improvements (batching strategies, GPU utilization, autoscaling, multi-model serving) for production CV models.
- Build the feedback loops between cloud inference, edge devices, and the data flywheel that improves model quality over time.
Stand up LLM/GenAI serving infrastructure
- Help shape how SimpliSafe serves LLMs in production — model serving patterns, KV-cache and batching strategies, evaluation pipelines, guardrails, and cost controls.
- Partner with applied ML engineers to take new GenAI-powered product features from prototype to scaled deployment.
Raise the engineering bar across Cloud ML
- Mentor engineers across the team through design reviews, code reviews, pairing, and written guidance, providing a meaningful uplift to everyone you work with.
- Establish and evangelize best practices for model lifecycle management (registry, deployment, monitoring, rollback, drift) and on-call.
- Write the documentation, runbooks, and architectural decision records that make the platform legible and durable.
Own reliability and operational excellence
- Lead incident response and postmortems for critical ML systems; turn lessons learned into platform-level improvements.
- Define SLOs, observability standards, and on-call practices for ML services in production.
- 8+ years of software engineering experience, with a clear track record of building and operating large-scale distributed systems in production.
- Deep expertise in high-throughput, low-latency services — ad serving, recommendations, real-time APIs, online platforms, or similar — including the operational reality of running them at scale.
- Strong production experience on Kubernetes and AWS (EKS, S3, IAM, networking) and with Kafka, containerized deployments, CI/CD, and infrastructure-as-code.
- Demonstrated experience with the building blocks of high-scale systems: load balancing, autoscaling, batching, caching, multi-tenancy, queuing, and capacity planning.
- Proficiency in Python is required; experience with a systems language (Go, C++, Rust) for performance-sensitive components is a plus.
- Staff-level technical leadership: ability to drive ambiguous, cross-cutting initiatives, align senior stakeholders, and elevate the engineers around you without formal authority.
- Strong written and verbal communication — you can make complex technical tradeoffs legible to ML scientists, product, and other infra teams.
- ML exposure is preferred — having deployed or operated production ML systems, worked closely with ML teams, or built ML-adjacent infrastructure. Exceptional distributed systems engineers without direct ML experience are encouraged to apply; we'll help you ramp.
- Hands-on experience with Ray, KServe, Triton, vLLM, or other ML serving stacks.
- Hands-on experience with LLM serving in production (vLLM, TGI, TensorRT-LLM, SGLang) — KV cache management, continuous batching, speculative decoding, quantization for serving.
- Experience building real-time video or streaming pipelines (Kafka, Kinesis, Flink, or similar) at scale.
- Experience operating GPU-based inference systems — GPU-aware scheduling, multi-model serving, accelerator utilization optimization.
- Familiarity with ML fundamentals — how models are trained, evaluated, versioned, deployed, monitored, and rolled back in production.
- Experience with model lifecycle tooling (MLflow, Weights & Biases, model registries, drift detection, shadow deployments).
- Open source contributions to distributed systems or ML infrastructure projects.
- Experience operating in environments with strong security and compliance requirements.
The Cloud ML team owns the full surface area — infrastructure and applied research — which means your work as a Staff infra engineer directly shapes what's possible for the science. You'll have unusual leverage: the platform you build determines how fast SimpliSafe can ship intelligent features, and the features we ship directly impact whether someone's home is safer tonight than it was yesterday.
What Values You'll Share
- Customer Obsessed - Building deep empathy for our customers, putting them at the core of our work, and developing strong, long-term relationships with them.
- Aim High - Always challenging ourselves and others to raise the bar.
- No Ego - Maintaining a "no job too small" attitude, and an open, inclusive and humble style.
- One Team - Taking a highly collaborative approach to achieving success.
- Lift As We Climb - Investing in developing others and helping others around us succeed.
- Lean & Nimble - Working with agility and efficiency to experiment in an often ambiguous environment.
- A mission- and values-driven culture and a safe, inclusive environment where you can build, grow and thrive
- A comprehensive total rewards package that supports your wellness and provides security for SimpliSafers and their families (for more information on our total rewards, please click here)
- Free SimpliSafe system and professional monitoring for your home.
- Employee Resource Groups (ERGs) that bring people together, give opportunities to network, mentor and develop, and advocate for change.
The target annual base pay range for this role is $146,600 to $215,100.
This target annual base pay range represents our good-faith estimate of what we expect to pay for this role. We use a market-based compensation approach to set our target annual base pay ranges and make adjustments annually. We carefully tailor individual compensation packages, including base pay, taking into consideration employees' job-related skills, experience, qualifications, work location, and other relevant business factors.
Beyond base pay, we offer a Total Rewards package that may include participation in our annual bonus program, equity, and other forms of compensation, in addition to a full range of medical, retirement, and lifestyle benefits. More details can be found here.
We're committed to fair and equitable pay practices, as well as pay transparency. We regularly review our programs to ensure they remain competitive and aligned with our values.
We wholeheartedly embrace and actively seek applications from all individuals, no matter how they identify. We are committed to cultivating a diverse and inclusive workplace, and we believe our work is enriched when we incorporate a multitude of perspectives, backgrounds, and experiences. We want everyone who works here to thrive and contribute to not only our mission of keeping every home secure, but also to making our workplace safe and supportive for others. If a reasonable accommodation may be needed to fully participate in the job application or interview process, to perform the essential functions of a position, or to receive other benefits and privileges of employment, please contact us (email address available on ziprecruiter.com).



