Site Reliability / Infrastructure Engineer
Medal
The Company
Medal is the world’s largest and fastest-growing platform for gaming clips, where millions of gamers capture, share, and relive their best moments. Every year, our players record billions of clips, each representing a unique, action-packed highlight. We’re building the next generation of gaming communities: social, monetized, and creator-powered. Our mission is to design products that make sharing, discovering, and connecting around gaming moments seamless and fun. We raised a seed round of $133M from General Catalyst and Khosla to discover the next generation of intelligence.

The Role
Medal's infrastructure handles billions of clips, video ingestion pipelines, and social features at a massive scale most engineers never get to touch. We're looking for an SRE who cares deeply about reliability and scalability. The work centers on reliability, incident response, scaling, and making sure our infrastructure keeps up with our growth. You'll own the on-call rotation, drive postmortems, and work directly with engineering teams to meet their infra needs.

The right person probably came through startups and scale-ups. You've been in the room when things broke at 2am, you've scaled databases under pressure, and you know the difference between a durable fix and a patch that buys you a week.

Key Responsibilities
- Own reliability across our GCP infrastructure: Kubernetes clusters, managed services, and data pipelines, driving measurable improvements to availability and latency
- Lead incident response end-to-end: on-call rotations, runbooks, postmortems, and the follow-through that makes sure the same thing doesn't happen twice
- Architect and execute database scaling strategies (sharding, replication, query optimization, and capacity planning) across MySQL and Postgres at meaningful scale
- Partner with product engineering to translate feature requirements into infrastructure designs that hold up as we grow
- Manage and evolve our Terraform-managed GCP environment and Kubernetes cluster configurations
- Own our Elasticsearch cluster end-to-end: capacity planning, sharding strategy, index lifecycle management, version upgrades, and performance tuning at production scale
- Build and maintain observability across the stack: metrics, dashboards, alerting, and tracing
- Constantly improve CI/CD reliability and delivery pipelines across GitHub Actions
- Harden IAM, secrets management, and network segmentation as part of normal infra hygiene

About You
- You’ve worked at startups and are comfortable in an environment of rapid growth where scaling up is a priority
- You have great judgment - you know the difference between a durable, sustainable fix vs. a patch that buys you a week
- You have deep, hands-on experience scaling and sharding relational databases in production environments
- You know GCP maybe a little too well: Kubernetes, VPC, IAM, Cloud Logging, and the managed services ecosystem
- You are fluent in Terraform and have owned real infrastructure-as-code at scale
- You've operated Elasticsearch in production and know how to keep a cluster healthy
- You have strong incident response instincts: you can work a P0 calmly, communicate clearly under pressure, and run a postmortem that prevents recurrence
- You’ve worked with GitHub Actions in a production CI/CD environment
- You have excellent communication skills (this is crucial!) and can both flag issues clearly and rapidly during incidents, and lead / write actionable postmortems

Our Stack
- Google Cloud Platform
- Terraform, Salt, GitHub Actions
- Java, Redis, RabbitMQ, Elasticsearch, BigQuery, Kubernetes for backend
- Electron+React
- C# and C++ for native Windows recording & more
- Swift for iOS, Kotlin for Android

Benefits
- Competitive salary and meaningful equity
- Comprehensive medical, dental, and vision coverage
- 401(k)
- Wellness and fitness perks including a Wellhub membership and mental health resources
- Paid parental leave, fertility and maternal health benefits
- Generous PTO policy
- Daily meals and commuter benefits at our NYC HQ in Flatiron
- Learning and development stipend

Benefits vary by country and employment type.
$160k - $215k


