Site Reliability / Infrastructure Engineer

Medal

The Company

Medal is the world's largest and fastest-growing platform for gaming clips, where millions of gamers capture, share, and relive their best moments. Every year, our players record billions of clips, each representing a unique, action-packed highlight. We're building the next generation of gaming communities: social, monetized, and creator-powered. Our mission is to design products that make sharing, discovering, and connecting around gaming moments seamless and fun. We raised a seed round of $133M from General Catalyst and Khosla to discover the next generation of intelligence.

The Role

Medal's infrastructure handles billions of clips, video ingestion pipelines, and social features at a scale most engineers never get to touch. We're looking for an SRE who cares deeply about reliability and scalability. The work centers on reliability, incident response, scaling, and making sure our infrastructure keeps up with our growth. You'll own the on-call rotation, drive postmortems, and work directly with engineering teams to meet their infra needs.

The right person probably came through startups and scale-ups. You've been in the room when things broke at 2am, you've scaled databases under pressure, and you know the difference between a durable fix and a patch that buys you a week.

Key Responsibilities

Own reliability across our GCP infrastructure (Kubernetes clusters, managed services, and data pipelines), driving measurable improvements to availability and latency
Lead incident response end-to-end: on-call rotations, runbooks, postmortems, and the follow-through that makes sure the same thing doesn't happen twice
Architect and execute database scaling strategies (sharding, replication, query optimization, and capacity planning) across MySQL and Postgres at meaningful scale
Partner with product engineering to translate feature requirements into infrastructure designs that hold up as we grow
Manage and evolve our Terraform-managed GCP environment and Kubernetes cluster configurations
Own our Elasticsearch cluster end-to-end: capacity planning, sharding strategy, index lifecycle management, version upgrades, and performance tuning at production scale
Build and maintain observability across the stack: metrics, dashboards, alerting, and tracing
Continuously improve CI/CD reliability and delivery pipelines across GitHub Actions
Harden IAM, secrets management, and network segmentation as part of normal infra hygiene

About You

You've worked at startups and are comfortable in an environment of rapid growth where scaling up is a priority
You have great judgment: you know the difference between a durable, sustainable fix and a patch that buys you a week
You have deep, hands-on experience scaling and sharding relational databases in production environments
You know GCP maybe a little too well: Kubernetes, VPC, IAM, Cloud Logging, and the managed services ecosystem
You are fluent in Terraform and have owned real infrastructure-as-code at scale
You've operated Elasticsearch in production and know how to keep a cluster healthy
You have strong incident response instincts: you can work a P0 calmly, communicate clearly under pressure, and run a postmortem that prevents recurrence
You've worked with GitHub Actions in a production CI/CD environment
You have excellent communication skills (this is crucial!): you can flag issues clearly and rapidly during incidents, and you can lead and write actionable postmortems

Our Stack

Google Cloud Platform
Terraform, Salt, GitHub Actions
Java, Redis, RabbitMQ, Elasticsearch, BigQuery, Kubernetes for backend
Electron + React
C# and C++ for native Windows recording and more
Swift for iOS, Kotlin for Android

Benefits

Competitive salary and meaningful equity
Comprehensive medical, dental, and vision coverage
401(k)
Wellness and fitness perks, including a Wellhub membership and mental health resources
Paid parental leave, fertility and maternal health benefits
Generous PTO policy
Daily meals and commuter benefits at our NYC HQ in Flatiron
Learning and development stipend

Benefits vary by country and employment type.

Vacancy posted 4 days ago
