Site Reliability Engineer
3B Staffing LLC
Job Description:
Qualifications:
This position is 60% SRE and 40% SDE. Also open for candidates to join MTH along with ATL GO team. Candidates must agree to work onsite at either of these two locations based on the team's hybrid schedule.
Required Skillset:
• Manage and optimize data streaming and API components in on-premises OpenShift and AWS.
• Proactively review the application's APIs and processes to identify opportunities to optimize the response times for various application components.
• Automate various types of testing, including data quality checks; automate delivery to system integration and production; and automate production deployment.
• Develop integrations between the application (on-premises and in AWS) and our third-party tools (ServiceNow, VersionOne, Sumo).
• Work with teams to create SLIs and SLOs.
• Actively monitor and lead troubleshooting of degraded performance and hard-to-define issues for the platform applications, develop the solution, and document artifacts from root cause analysis in the backlog.
• Evolve the cloud infrastructure ecosystem for our application suite by experimenting with emerging technologies and completing prototypes to understand their benefit.
• Design and develop CI/CD pipelines in AWS and OpenShift to deploy various application artifacts, including APIs and Data Process Jobs.
• Analyze, design, and develop the artifacts that configure monitoring and alerting metrics so that support engineers can validate, troubleshoot, and resolve issues proactively and in a timely manner.
• Maintain data integrity and access control by using AWS security tools and services such as HSM, IAM, etc.
• Understand and develop tools to monitor AWS billing for the services, generate cost-related reports, and help develop and implement cost optimization strategies.
• Work with enterprise security architects to design and implement data security tools and measures, including data encryption and key management; design and develop solutions to address security vulnerabilities discovered by the internal security audit team, vendors, the security community, etc.; and design and develop solutions that let the support team regularly scan for, review, and fix security issues.
• Regularly and proactively monitor and analyze the capacity and performance of the platform, and work with the architecture team to design and implement elastic infrastructure that accommodates irregular bursts of user traffic/requests.
• Work with the architecture team to develop a backup strategy and implement the backup solution for critical data and application components for service restoration and disaster recovery purposes.
• Work with architecture, infrastructure, and application teams to provide input on continuous improvement of design, performance, and security.
Desired Skillset:
• Deep understanding of the operations of AWS cloud platforms.
• Must be well versed in automation, scripting, and monitoring, including the use of tools from the major cloud platforms, including but not limited to OpenShift, CloudFormation, Terraform, Ansible, Shell, and Python.
• Candidates with significant technical knowledge of infrastructure layers are preferred, including but not limited to: Linux OS, major virtualization platforms, traditional and software-defined networks, load balancers, firewalls, API tools, element/performance/intelligent monitoring tools, storage, backup strategy, etc.
• Significant knowledge and experience in end-to-end operations for enterprise systems and applications, including driving issue resolution for mission-critical systems.
• Must have experience automating, operationalizing, and improving Development/QA using CI/CD tools (GitLab, GitHub, Jenkins, Maven, Gradle, Nexus).
• Working experience with Software Release Management.
Desired Qualification:
• BS degree in Computer Science or a related technical field, or equivalent practical experience.
Minimum Experience:
• 3+ years of related DevOps or SysOps engineering experience with a focus on major cloud platforms (AWS preferred).
• 2+ years of application development experience, including data streaming and deploying/monitoring high-availability, critical application components.
• 1+ years in a Site Reliability Engineering organization preferred.
• Overall 4-6 years of experience.
Responsibilities:
As an engineer with the Retail Site Reliability Engineering team, you will be at the forefront of Cloud and Big Data technology. In this role you will establish yourself as a technical leader by gaining exposure to a broad range of industry-leading technologies that will help to drive acceleration. The ideal candidate will have expert design and development capabilities and be positioned to contribute to a growing set of services and features for the ecosystem. This role supports highly available, business-critical applications and serves as the escalation point for complex and hard-to-define issues in both on-premises and AWS environments. We are seeking talented engineers who are well versed in DevOps technologies, automation, infrastructure orchestration, configuration management, continuous integration, and troubleshooting of complex issues, and who are not constrained by how "things are usually done".
Vacancy posted 2 days ago