SRE

TriOptus LLC

Site Reliability Engineer (SRE)

As part of the FRDC Site Reliability Engineer (SRE) team, you will help identify resilience challenges and build reusable, foundational software and infrastructure components that improve, influence, and validate the resilience and reliability of technologies that move trillions of dollars per day. Responsibilities include, but are not limited to:

  • Participate in designing, building, and refactoring major software components that improve the availability, resilience, and performance of our systems
  • Design, code, test, and deliver software to automate manual operational work
  • Support incident response and blameless postmortems; design and implement product improvements to prevent incidents from recurring
  • Implement application patterns in support of better service level objectives
  • Implement self-healing, resiliency patterns
  • Exercise failure cases regularly to validate resilience assumptions
  • Engage with development teams throughout the incident life cycle, ensuring that lessons learned are translated into automated responses or process adjustments, and help develop software for reliability and scale with minimal refactoring or changes
  • Troubleshoot incidents, participate in blameless post-incident evaluations and ensure permanent closure of incidents
  • Identify application patterns and analytics in support of better service level objectives
  • Analyze self-healing and resiliency patterns and contribute to software which can use these outcomes
  • Implement best-in-class monitoring frameworks to achieve end-to-end flow monitoring and noiseless alerting

Requirements & Qualifications:

  • Bachelor’s degree or equivalent experience in a software engineering discipline
  • 2+ years of hands-on software engineering experience
  • Curiosity about solving resilience problems at run time and at scale
  • Expertise in designing, coding, testing, and delivering software in at least one technology stack
  • Knowledge of several infrastructure components (e.g. routers, load balancers, cloud products, container systems, compute, storage, and networks)
  • Experience in cloud native, distributed application design and implementation
  • Demonstrated communication and ownership skills
  • Debugging and troubleshooting skills
  • Ability to collaborate with a diverse, high-performing, multi-location team
  • Excellent analytical, interpersonal and communication skills
  • Understanding of SRE methodologies/practices

Required Skills: 4+ years of technical expertise in the areas below, with 6+ years of overall IT experience:

  • Proficiency in Java / JVM based system design & implementation
  • Infrastructure knowledge, including Unix, Windows, networking, and scripting (e.g. Perl / Python)
  • Experience with orchestration tools like Jenkins CI/CD or Jules
  • Experience following source control best practices: Git/Bitbucket
  • Experience with database development (MySQL / Oracle)
  • Understanding of architecture and design across distributed systems

Preferred Skills:

  • Knowledge of Spring Boot / Microservices architecture
  • Experience using Pivotal Cloud Foundry
  • Experience with Public Cloud: AWS
  • Experience with enterprise platforms using Big Data tools and technologies (e.g. Hadoop, Spark, Hive, Impala, Dremio, NiFi, Ignite)
  • Experience setting up & building solutions for Containers e.
Vacancy posted 3 days ago