Senior Site Reliability Engineer

Megaport

About Megaport

We’re not your typical tech company – and we don’t want to be. Megaport is the global leader in Network as a Service (NaaS), and has transformed the way businesses connect to the cloud, data centers, and each other. We’re publicly listed on the Australian Securities Exchange and partnered with the biggest names in tech like Amazon, Microsoft, Google, Oracle, IBM, and more. Headquartered in Brisbane with a crew of over 600 people spread across Asia-Pacific, Europe, and the Americas, our employees enjoy an environment that is collaborative, supportive, and (actually) fun.

Our Team Culture

We’re a team of problem solvers, pixel pushers, code slingers, and cloud fanatics. Culture is more than a poster on the wall here – collaboration beats hierarchy, curiosity fuels our growth, and everyone’s voice matters. We take our work seriously, but not ourselves. We work across time zones to execute on our global vision, trust each other to get things done, and never compromise our values for commercial gain. Most importantly, we place our customers at the center of everything we do.

We’re committed to increasing representation in the tech industry and welcome applicants from all backgrounds. Don’t meet every requirement? That’s okay. If you’re excited about this role, we encourage you to apply.

The Role

As a Senior Platform Engineer, you are a champion for DevOps and SRE culture and industry best practice within Megaport. You will work alongside talented team members in multiple time zones, ensuring that systems are secure, maintainable, and available. Outside the team, you will engage with stakeholders in requirements analysis and demonstrations. Technically, you will be very hands-on, continually evolving your skills through a mix of peer reviews and research. Ultimately, your obsession is customer success and ensuring company goals are met.

What You Will Be Doing

- Improving production reliability and system resilience within an SRE-scoped team
- Championing high standards of work and industry best practices
- Communicating with teams and stakeholders at all stages
- Bringing fresh ideas to the table and encouraging others
- Diving into complex technical problems with a can-do attitude
- Working across numerous technologies in a fast-changing industry
- Participating in on-call rotation, incident response, and blameless post-incident reviews
- Writing code, handling alerts, improving solutions, and supporting others
- Playing a crucial role in the success of your company and team

What We Are Looking For

- 5+ years administering Linux systems and related infrastructure in production environments
- A collaborative SRE mindset, with familiarity around SLIs/SLOs/SLAs, error budgets, blast radius, and blameless postmortems
- A focus on automation, reducing toil, and preventing problem recurrence
- A track record of writing runbooks that work for the broader team, not just yourself
- Strong fundamentals in Kubernetes and its broader ecosystem
- Cloud infrastructure experience; AWS strongly preferred, and bare-metal is a bonus
- Strong tool development skills: Bash, plus Python or Go preferred (or similar)
- Infrastructure-as-code tooling experience, Terraform preferred
- CI/CD and version control experience, GitHub preferred
- Database experience, with one of Postgres, Cassandra, or ClickHouse preferred
- Experience operating a production observability stack (metrics, logs, traces), with an eye for signal over noise
- Comfort working on live production infrastructure, with strong troubleshooting instincts and ownership of incident response
- A history of continual professional development
- A self-directed style suited to an async, globally distributed team, and comfort picking up adjacent work when the situation calls for it

What We Offer

- Flexible working environments
- Birthday Leave
- Generous study and training allowance + 5 days paid study leave
- Creative, fun, and contemporary workspaces
- Motivated team of industry experts and new talent
- Celebrated success with ‘Legend’ and ‘Kudos’ Awards
- Health and wellness program

Vacancy posted 2 days ago
