
Staff Site Reliability Engineer

$165k - $230k

SimSpace Corporation

SimSpace serves as an AI Proving Ground where organizations can confidently train, test, and outmaneuver adversaries in any environment. Trusted by allied governments, militaries, enterprises, and research institutions worldwide, SimSpace enables adaptive, AI-ready defenses that stay ahead of evolving threats. Founded in 2015 by experts from U.S. Cyber Command and MIT Lincoln Laboratory, the platform unifies training, testing, and validation in a realistic, live-fire simulation, helping teams evaluate security investments, optimize performance, and compress cyber readiness cycles from months to days.

Why join SimSpace?

We are an organization that is focused on building our culture and mindfully enhancing our atmosphere every day, which is why we have collaborated on an integral value system. Our governing philosophy of being Human Centered is deeply embedded within our value system. We apply this philosophy to every one of our internal team members, external clients, and their customers.

How Do We Work?

We believe that people are at the center of everything we do. SimSpace fosters a culture of continuous learning, curiosity, and professional growth. That belief shows up in action: in-house training, internal and external learning platforms, cyber conferences, industry events, and dedicated time for skill development. Our people are empowered to shape their careers, and it shows. Year over year, SimSpace consistently outperforms industry benchmarks in internal mobility, promotions, and total rewards growth.

Who Thrives Here?

We are a team of innovators, protectors, and problem-solvers. We believe diversity of thought and experience fuels better solutions, and we’re committed to building teams that reflect the communities we serve. Whether you’re remote or office-based, you’ll collaborate with talented colleagues across departments and time zones, united by the mission to create a safer digital world. We invite you to apply today!

About the Role

We are looking for a Staff Site Reliability Engineer to define the technical vision, lead the architecture, and secure the infrastructure that powers the SimSpace cyber range platform. The ideal candidate is a deeply experienced SRE and exceptional software engineer who thinks strategically about distributed systems, reliability, and operability at a global scale. At the Staff level, you will act as a force multiplier: architecting resilient systems, driving engineering standards, and solving our most complex infrastructure challenges rather than relying on manual processes or localized fixes.

In this position, you'll provide overarching technical leadership across our SRE practice, bridging traditional site reliability, DevOps, and DevSecOps. You'll architect the systems and strategies that allow SimSpace to deliver software seamlessly across our own data centers, to customers who bring their own hardware, and as pre-packaged appliances with bundled hardware and software. As our on-premises product matures and scales, you will design the long-term automation frameworks that make these varied deployments robust, secure, and repeatable.

What will you be doing as a Staff SRE at SimSpace?

  • Technical Strategy & Architecture: Design and architect the overarching infrastructure strategy that enables consistent, repeatable, and secure deployments across SimSpace-hosted data centers, customer-provided hardware, and highly restricted air-gapped environments.
  • Platform Evolution & Configuration Management: Lead the evolution of our CI/CD and Kubernetes platforms. Drive advanced application packaging, templating, and configuration management strategies using Jsonnet and Grafana Tanka (alongside Kustomize). Move beyond maintaining pipelines to architecting multi-cluster, multi-environment deployment frameworks that drastically improve developer velocity.
  • Reliability Leadership: Define, measure, and govern Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Error Budgets across the engineering organization. Partner with product and engineering leadership to balance feature delivery with platform stability.
  • Advanced Observability: Architect our enterprise observability strategy using the Grafana stack. Design frameworks for proactive monitoring, complex anomaly detection, and distributed tracing that give teams unparalleled visibility into system health, pod scaling, and latency bottlenecks.
  • Security & Compliance Architecture: Drive the infrastructure security posture at an architectural level. Embed advanced container security, zero-trust network segmentation, and automated compliance policies directly into our deployment pipelines and runtime environments.
  • Cross-Functional Enablement: Serve as a strategic partner and consultant to development teams. Advocate for an "SRE culture" by designing self-service tooling, establishing "paved roads" for developers, and reducing operational toil across the entire engineering org.
  • Incident Command: Act as an Incident Commander during complex, high-severity outages. Drive blameless post-mortems and engineer long-term, systemic, and architectural fixes to ensure classes of failures never repeat.
  • Mentorship & Multiplier: Act as a technical mentor to senior and mid-level engineers. Raise the baseline of engineering excellence across the company by coaching, documenting best practices, and leading by example.

Who you are:

  • Experience: 8+ years of experience in Site Reliability, Platform, or DevOps engineering, with a proven track record of operating at a Staff, Principal, or Lead level to drive organization-wide infrastructure initiatives.
  • Expert Software Engineering: You possess deep software engineering skills (beyond scripting) and can architect complex, production-quality systems. You design clean interfaces, build maintainable tooling, and can dictate the technical direction of our internal toolchain. Language agnostic, but highly proficient in at least one modern language (e.g., Go, Python).
  • Advanced Kubernetes & Configuration Mastery: Deep, architectural understanding of Kubernetes in multi-tenant and multi-cluster production environments. You possess expert-level knowledge of Jsonnet and Grafana Tanka for managing complex, scalable Kubernetes configurations and application packaging.
  • GitOps & IaC Expertise: Extensive experience architecting sophisticated CI/CD pipelines and GitOps workflows using GitHub Actions, ArgoCD, and infrastructure-as-code principles at an enterprise scale.
  • Complex Deployments: Systems-level thinking with the ability to design architectures that span self-hosted, on-premises, VMware-based, and air-gapped deployment models.
  • Observability Expert: Deep expertise with observability platforms (Grafana stack preferred) and a proven ability to design alerting and monitoring strategies for complex distributed systems.
  • Security Mindset: Strong background in infrastructure security architecture, including container hardening, network security, vulnerability management, and delivering software to heavily regulated or customer-managed environments.
  • Influential Communicator: Exceptional communication and stakeholder management skills. You have a service-oriented mindset, but you also have the ability to influence cross-functional leadership, negotiate reliability tradeoffs, and align engineering teams behind a unified technical vision.

We’re proud to offer a competitive and comprehensive package designed to support your well-being, growth, and success:

  • Compensation. Base salary range: $165,000 - $230,000, reflecting our confidence in your expertise and impact, with the opportunity for bonuses tied to company performance and individual contributions.
  • Health & Wellness. Comprehensive medical, dental, and vision benefits, plus savings plans. Coverage starts on day one!
  • Mental Health Support. Access to company-paid counseling, coaching, and resources for you and your family through Spring Health.
  • Financial Well-Being. Plan for your future with a 401(k) retirement savings plan featuring a company match.
  • Flexible Time Off. Take the time you need with unlimited vacation and dedicated health & wellness days. SimSpace provides flexible solutions to meet the diverse work-life needs of team members.
  • Parental Leave. Paid leave plans to support you and your loved ones during life’s most important moments.
  • Ownership Opportunities. Equity stock options at hire, with annual performance-based grants. Become an invested stakeholder in our shared success.
  • Referral Rewards. Earn $1,500–$3,500 for every qualified hire through our employee referral program.
  • Peloton Interactive Wellness Program. Fully and partially subsidized membership plans and equipment discounts to help you reach your personalized fitness goals.
  • Continuous Learning. Access a LinkedIn Learning membership to prioritize your personal and professional development.
  • Social Connections. Monthly reimbursements for meaningful connections with teammates through our SocialSpace Community.
  • Extra Perks. Legal plan coverage, pet insurance, wellness reimbursements, and more to simplify life’s details.

Join SimSpace and enjoy benefits that enhance your career, health, and happiness!

SimSpace is an Equal Opportunity Employer

In compliance with federal law, all persons hired will be required to verify identity and eligibility to work in the United States and to complete the required employment eligibility verification form upon hire.

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, pregnancy, genetic information, disability, status as a protected veteran, or any other protected category under applicable federal, state, and local laws. We are committed to providing an inclusive and welcoming environment for all members of our staff, clients, volunteers, subcontractors, and vendors.

Research shows that women and people from underrepresented groups apply to jobs only if they meet all of the qualifications. However, no one ever meets 100% of the qualifications. SimSpace encourages you to break that statistic and to apply. We look forward to your application!

We also consider qualified applicants regardless of criminal histories, in accordance with applicable law. We are committed to providing reasonable accommodations for qualified individuals with disabilities in our job application procedures. If you need assistance or accommodation due to a disability, please contact . SimSpace does not accept unsolicited resumes from employment agencies.

Actual compensation for the position is based on a variety of factors, including, but not limited to, affordability, skills, qualifications and experience, and may vary from the range.

Vacancy posted 3 days ago
