SRE

TriOptus LLC

Site Reliability Engineer (SRE)

As part of the FRDC Site Reliability Engineer (SRE) team, you will help identify resilience challenges and build reusable, foundational software and infrastructure components that improve, influence, and validate the resilience and reliability of technologies that move trillions of dollars per day. Responsibilities include, but are not limited to:

Participate in designing, building, and refactoring major software components that improve the availability, resilience, and performance of our system
Design, code, test, and deliver software to automate manual operational work
Support incident response and blameless postmortems, and design and implement product improvements that prevent incidents from recurring
Implement application patterns in support of better service level objectives
Implement self-healing and resiliency patterns
Exercise failure cases regularly to validate resilience assumptions
Engage with development teams throughout the incident life cycle, and ensure lessons learned are translated into automated responses or process adjustments that help develop software for reliability and scale with minimal refactoring or changes
Code, test and deliver software to automate manual operational work
Troubleshoot incidents, participate in blameless post-incident evaluations and ensure permanent closure of incidents
Identify application patterns and analytics in support of better service level objectives
Analyze self-healing and resiliency patterns and contribute to software that can use these outcomes
Implement best-in-class monitoring frameworks to achieve end-to-end flow monitoring and noiseless alerting

Requirements & Qualifications:

Bachelor’s degree or equivalent experience in a software engineering discipline
2+ years of hands-on software engineering experience
Curiosity about solving resilience problems at runtime and at scale
Expertise in designing, coding, testing, and delivering software in at least one technology stack
Knowledge of several infrastructure components (e.g., routers, load balancers, cloud products, container systems, compute, storage, and networks)
Experience in cloud native, distributed application design and implementation
Demonstrated communication and ownership skills
Debugging and troubleshooting skills
Ability to collaborate with a diverse, high-performing, multi-location team
Excellent analytical, interpersonal and communication skills
Understanding of SRE methodologies/practices

Required Skills: 4+ years of technical expertise in the areas below and 6+ years of overall IT experience:

Proficiency in Java / JVM based system design & implementation
Infrastructure knowledge, including Unix, Windows, networking, and scripting (e.g., Perl / Python)
Experience with CI/CD orchestration tools such as Jenkins or Jules
Experience following source control best practices (Git / Bitbucket)
Experience with database development (MySQL / Oracle)
Understanding of architecture and design across distributed systems

Preferred Skills:

Knowledge of Spring Boot / microservices architecture
Experience using Pivotal Cloud Foundry
Experience with Public Cloud: AWS
Enterprise platforms using Big Data tools and technologies (e.g. Hadoop, Spark, Hive, Impala, Dremio, Nifi, Ignite)
Experience setting up & building solutions for Containers e.

Vacancy posted 3 days ago
