Sr. SRE Platform Software Engineer

Full-time

Bitdeer Technologies Group

About Bitdeer Technologies Group

Bitdeer is a world-leading technology company for AI and Bitcoin mining infrastructure.

Bitdeer is committed to providing comprehensive Bitcoin mining solutions for its customers and building AI computational infrastructure to support the AI revolution. Bitdeer handles complex processes involved in computing such as equipment procurement, transport logistics, data center design and construction, equipment management, and daily operations. Bitdeer also offers advanced cloud capabilities to customers with high demand for artificial intelligence.

Headquartered in Singapore, Bitdeer has deployed data centers across multiple countries, including the United States, Norway, Bhutan, and Ethiopia.

Position Overview

Build and operate one or more bounded contexts of the NeoCloud SRE platform, the multi-region substrate that observes, protects, and operates a GPU rental fleet across self-built and OEM-rented data centers. You take an architect-approved design and turn it into production code that ships through GitOps and the CI/CD release pipeline, follows the Plugin Framework conventions, meets declared SLOs, and stays drift-free.

This is a build-and-run role. You don't only ship code; you ship a service that other squads, cloud-service teams, and tenants depend on. You carry the on-call pager for what you build.

Key Responsibilities
You will own one or two of the following areas:

  • Collection & Storage: collection-agent, customer-sdk-gateway, metrics-store, logs-store, traces-store, profiles-store, analytics-lake, enrichment-service, collection-monitor.
  • Alert, Correlation & SLO: alert-engine-framework, alert-correlation, slo-framework, default M-series alert rules.
  • Topology, Cluster-Health & Cluster Platform Services: topology-service, cluster-health-rollup, OSS-SRE-tool collection plugins for K8s, Slurm, Ray, Volcano, Kueue, and KubeRay.
  • Fault-Prediction: prediction-engine-framework and built-in predictors (GPU, Link, Disk, XPA, Straggler, SDC, Stranded GPU).
  • Remediation, Workflow, Inspection & Jobs: remediation-actuator, orchestration-substrate (workflow engine), inspection-orchestrator, job-scheduler, NCCL-baseline inspection probe.
  • Hardware Lifecycle & DC Ops: hardware-lifecycle, dc-operations, boot-provisioning, rolling-upgrade, bare-metal-bmc-service, auto-discovery, ZTP D0–D5 pipeline, IPMI bare-metal management.
  • Identity, Secrets, Tenant-Config & CMDB: iam-service, secrets-service, tenant-sre-config, cmdb-cache, schema registry.
  • Customer-Bridge, Ticketing & SRE Platform Portal: customer-bridge, customer-ticketing, sre-operation-system, Customer Console BFF, SRE Console BFF.
  • Backup, DR & Meta-Monitor: backup-orchestrator, meta-monitor, external-watcher integration (Datadog or equivalent).
  • CI/CD, GitOps, Plugin Framework & SRE Image Registry: cicd-pipeline, gitops-sync, plugin-registry, sre-image-registry.
  • Self-Improving Agent: agent-control-plane, agent-discovery, agent-codegen, agent-sandbox, per-Region LLM gateway.
  • Global SRE Management: maintenance-window-orchestrator, change-management, capacity-planner, cost-optimizer, gpu-efficiency-dashboard, network-stability-dashboard, patching-orchestrator, artifact-management, compat-matrix-service, security-platform.

Qualifications

  • Software Engineering Experience: 7+ years of production software engineering experience, including 2 or more years operating what you built (real on-call experience, not just shipping code).
  • Programming Languages: Production-depth mastery of at least one systems-grade language: Go (preferred), Rust, or Java. Proficiency in Python for tooling and SDK work.
  • Distributed Systems Fundamentals: Strong grasp of at-least-once vs. exactly-once trade-offs, idempotency, back-pressure, leader election, consistent hashing, gossip, and fan-out. Ability to evaluate CRDT vs. Raft vs. Paxos and select the right tool for the job.
  • Multi-Region Observability Stack: Experience at production scale with Prometheus, VictoriaMetrics, Mimir, Thanos, Loki, Elasticsearch, Tempo, Jaeger, or OpenTelemetry. Must have built or substantively contributed to the ingest, query, or storage paths of these systems.
  • GitOps & CI/CD: Hands-on experience with Argo, Flux, Helm, Kustomize, Cosign signing, signed-bundle promotion, and blast-radius-aware rollouts.
  • Kubernetes Operator Pattern: Proven experience writing a controller or CRD handling real production traffic, with a deep understanding of watch-cache mechanics, leader election, and reconcile loops.
  • mTLS & Secrets Management: Experience executing end-to-end mTLS bootstrap with certificate rotation. Hands-on experience with HashiCorp Vault or cloud KMS (AWS KMS / GCP KMS).
  • SQL & Time-Series Data: Ability to read a Prometheus query plan, build a recording-rule strategy, and write SQL that joins per-tenant telemetry against analytics-lake tables.
  • Testing Discipline: Rigorous approach to unit, integration, contract, chaos, and soak testing. Experience writing and maintaining your own comprehensive tests.
  • Technical Writing Fluency: Ability to author clear design docs that align with existing platform architecture, create runbooks optimized for 3 AM on-call responses, and write intent-driven PR descriptions.

Preferred Qualifications (GPU / AI-Infra Context)
Experience in at least one of the following areas is a strong plus:

  • NVIDIA Internals: Deep understanding of DCGM and NVIDIA driver internals, including XID semantics and MIG / vGPU partitioning.
  • Networking & Fabrics: Experience with InfiniBand or RoCE fabrics, including subnet managers, partitioning, optical health, and NCCL collective tracing.
  • HPC Storage: Experience managing Lustre, NetApp, Pure, DDN, VAST, or NVMe-oF under multi-tenant loads.
  • Hardware Management: Hands-on experience with BMC, IPMI, and Redfish at OEM scale (Supermicro, Dell, HPE, Lenovo).
  • Cluster Platform Internals: Familiarity with Kubernetes GPU Operator, Slurm controller, or Ray GCS.
  • BS/MS in Computer Science or similar
  • Hyperscale or NeoCloud experience

--------------------------------------------------------------------

Bitdeer is committed to providing equal employment opportunities in accordance with country, state, and local laws. Bitdeer does not discriminate against employees or applicants based on conditions such as race, color, gender identity and/or expression, sexual orientation, marital and/or parental status, religion, political opinion, nationality, ethnic background or social origin, social status, disability, age, indigenous status, and union membership.

Vacancy posted 23 days ago