Future Openings - SRE Support Engineer - Observability
Virtasant
SRE Support Engineer - Observability
While this position is not currently open, we are interviewing strong candidates for upcoming opportunities on this team.
Location: Remote | Time Zone: US, Canada, Brazil, Chile, Colombia, Mexico (8 AM–5 PM Pacific)
Freedom to grow. Power to deliver.
Virtasant is a global technology services company delivering large-scale cloud, data, and engineering solutions across 130+ countries. We partner with some of the world’s largest organizations to help them build, operate, and scale internal platforms used by tens of thousands of engineers.
For this role, you will be supporting one of the most advanced internal developer platforms in the world, powering products used by hundreds of millions of people. The problems you will solve are deep, complex, and essential to keeping a global-scale organization moving.
Role Overview
The Observability & Tools Support Engineer provides high-impact technical support for customers of a large technology company’s internal IaaS platform, with a focus on monitoring, alerting, telemetry, and operational tooling.
This role spans a wide range of support, from white-glove onboarding and end-to-end customer enablement to deep technical troubleshooting across Linux, networking, and observability systems (especially Prometheus and AlertManager). You will also contribute to improving the support function itself: strengthening tooling, documentation, workflows, and feedback loops so the service scales.
Success depends on excellent troubleshooting, strong written communication, comfort working with highly technical customers, and the maturity to identify patterns and drive operational improvements beyond individual ticket resolution.
Business Outcome
Become a trusted frontline expert for the customer’s observability ecosystem and operational tooling - delivering fast, accurate support across Slack and tickets, improving monitoring reliability, and reducing incident impact through better triage, troubleshooting, onboarding, and knowledge capture.
Success Measures
Healthy volume of threads and tickets handled with high-quality outcomes
Consistent achievement of time-based SLAs
High customer satisfaction through surveys
Accurate classification of issue type, severity, and recurring patterns
Reduced repeat issues through better docs, tooling, and scalable onboarding
What Will Be True When You Succeed
Customers can onboard smoothly to monitoring/alerting with minimal friction
Monitoring and alerting issues are resolved quickly, with fewer escalations
Linux and networking-related incidents reach resolution faster due to strong troubleshooting and clean handoffs
Engineering and SRE teams receive clear, actionable feedback based on real customer trends
Knowledge base content prevents tickets and accelerates self-service
Core Work Units
1) Frontline Support for Observability & Tooling
Manage Slack threads and tickets (roughly 50/50)
Handle a broad range of customer support: simple issue resolution through end-to-end onboarding
Provide clear, structured guidance to highly technical customers
Maintain strong attention to detail while managing multiple interactions in parallel
2) Deep-Dive Troubleshooting & Incident Support
Troubleshoot, isolate, and resolve monitoring and alerting issues (especially Prometheus + AlertManager)
Troubleshoot complex Linux and networking issues (TCP/IP fundamentals required)
Support OpenTelemetry, tracing, and telemetry pipelines, including investigation of gaps in signals and instrumentation
Drive incidents to resolution in partnership with Engineering/SRE teams
3) Documentation & Knowledge Development
Build and maintain customer-facing and internal knowledge base articles
Create informational posts for the community support platform
Turn repeated issues into reusable guides, checklists, and onboarding playbooks
4) Trend Analysis & Feedback to Engineering
Analyze and categorize customer interaction trends
Provide accurate, meaningful feedback to Engineering and SRE orgs to improve product/tooling
Identify “top offenders” and propose practical fixes (tooling, docs, process, product)
5) Operational Excellence & Continuous Improvement
Participate in post-mortem reviews and drive follow-through on improvements
Contribute meaningfully to team objectives and goals (process, tooling, and service scaling)
Bring creativity and discretion to resolving highly complex issues that call for “outside the box” thinking
High-Quality Work - what top performance looks like
Frontline Support
Moves smoothly from triage to deeper analysis without losing the customer
Communicates clearly and confidently with technical users
Maintains clean follow-ups and thread hygiene even with high context switching
Troubleshooting
Rapidly isolates issues across monitoring/alerting configs, Linux runtime behavior, and network connectivity
Uses structured approaches to incident handling: hypothesis → test → evidence → resolution
Produces high-signal writeups that accelerate downstream resolution
Documentation & Enablement
Documentation is clear enough that customers avoid opening tickets
Onboarding flows reduce time-to-value and prevent common misconfigurations
Captures “tribal knowledge” quickly and makes it reusable
Operational Excellence
Obsesses over details: correct severity, accurate tagging, clean timelines, strong handoffs
Spots patterns early and proactively proposes improvements that scale support
Typical Day / Work Patterns
~50% Slack support, ~50% ticket handling
Deep-dive investigations during lower ticket volume periods
Documentation writing and lightweight tooling/process improvements when patterns emerge
Weekly team review of escalations, themes, and operational improvements
High rate of context switching and parallel issue management
Required Skills & Experience (Non-Negotiable)
Several years supporting highly scalable applications and web services
Hands-on experience with open-source observability and cloud-native tooling, including:
Kubernetes (and container fundamentals)
Prometheus and AlertManager troubleshooting
OpenTelemetry and distributed tracing concepts
Strong understanding of the Linux operating system (command line, process/network debugging, logs)
Good understanding of infrastructure observability principles (signals, alerting strategy, SLO thinking, noise reduction)
Good understanding of the TCP/IP suite and practical networking troubleshooting
Strong experience troubleshooting ambiguous, multi-layer issues
Excellent analytical capability and strong attention to detail
Strong written and verbal communication (clear, structured, customer-friendly)
Comfortable working with a very technical customer base
Passion for Technical Support and a service mindset
Nice-to-Haves
Experience improving or supporting internal support tooling or workflows (automation, templates, runbooks)
Experience operating at scale in a services environment (pattern detection, KPI/SLA awareness, operational process maturity)
Familiarity with Grafana, log aggregation, incident tooling, and production support practices
Prior SRE or platform support experience
Minimum Qualifications
3–7+ years in Technical Support Engineering, SRE support, DevOps, Platform Support, or similar
Demonstrated experience supporting distributed systems, IaaS, or cloud platforms
Strong Linux, troubleshooting, and customer-facing communication background
Evidence of documentation, knowledge-base contributions, and process improvement mindset
Disqualifiers: weak Linux fundamentals, inability to troubleshoot systematically, poor written communication, or discomfort supporting highly technical users.
What You’ll Love
Real technical problem solving with tangible customer impact
A role that blends deep troubleshooting with scaling support via docs, tooling, and process
High autonomy in a remote-first environment
What May Be Challenging
High context switching and managing multiple threads in parallel
Repeated patterns that require discipline to convert pain into scalable improvements
Supporting high-visibility systems where speed and accuracy matter
Differentiation
Industry: Remote-first, trust-based culture; global team; autonomy; modern systems; meaningful technical challenges
Internal: High-impact, customer-facing observability support; direct influence on tooling and process maturity; opportunity to shape scalable support practices

