Senior Software Engineer - Application Reliability, Hybrid
$199.7k - $254.6k
Webex Events (formerly Socio)
Senior Software Engineer in Application Reliability
This position is based in San Jose, CA or North Carolina and operates under a hybrid work model.
Join Cisco's Enterprise AI team, the core group enabling Generative AI-powered experiences across Cisco. Our mission is to build secure, scalable AI platforms that empower teams to safely develop, deploy, and operationalize AI-powered solutions. We operate at the intersection of applied AI, cloud infrastructure, and security, partnering across engineering, security, compliance, and product teams to bring trusted AI to life at enterprise scale.
We are a fast-growing, highly collaborative team of platform engineers, AI engineers, and data scientists who value technical depth, ownership, and pragmatic execution. What makes this team exciting is the opportunity to define how secure Generative AI is built and governed inside a global technology leader.
As a Senior Software Engineer in Application Reliability, you will own the reliability of our AI-powered applications and features from the user's perspective. While our infrastructure SRE team ensures the platform is healthy, your focus will be on feature uptime, usage trends, automated issue identification, and self-healing remediation at the application layer. You will build LangGraph-based agents for automated diagnostics, Looker dashboards for observability, and evaluation harnesses for agent quality - all powered by BigQuery, BigTable, and Python. You will partner closely with application developers, data engineers, and infrastructure SREs to ensure our APIs, RAG systems, agents, and user-facing features are reliable, observable, and continuously improving.
Your Impact
- Define, implement, and enforce feature-level SLIs, SLOs, and error budgets for APIs, RAG systems, AI agents, and user-facing applications.
- Build and maintain application observability systems using Looker dashboards on BigQuery and BigTable — providing real-time visibility into feature health, error patterns, and usage trends for developers, PMs, and leadership.
- Design and build LangGraph-based agents for automated issue identification and remediation: anomaly detection on BQ logs, root cause diagnosis, auto-rollback, feature flag kill switches, and self-healing workflows.
- Develop agent evaluation harnesses to benchmark agent performance, test multi-step workflows, handle non-deterministic outputs, and run regression testing as agents evolve.
- Write complex SQL (BigQuery) for usage trend analysis, anomaly detection, and operational analytics; design BQ table schemas optimized for observability and debugging.
- Analyze application usage trends and adoption metrics to proactively identify reliability risks, capacity needs, and degraded user experiences before they become incidents.
- Partner with application development teams to embed reliability practices into the development lifecycle: deployment safety (canary, progressive rollout), structured logging standards, and distributed tracing.
- Lead application-level incident response, root cause analysis, and blameless postmortems focused on feature impact rather than infrastructure symptoms.
- Build Python-based tooling and automation to reduce mean time to detect (MTTD) and mean time to resolve (MTTR) for application-layer issues.
- Stay current with the rapidly evolving AI landscape (new frameworks, tools, and paradigms) and apply emerging techniques to improve platform reliability and developer productivity.
Minimum Qualifications
- 10+ years of experience in software engineering with significant focus on reliability, observability, or production operations; Bachelor's or Master's Degree in Computer Science, Engineering, or a related technical discipline.
- Strong Python development skills, with experience building production tooling, automation, and agent-based systems.
- Production GCP experience — deploying and managing applications on GKE (Kubernetes), deep SQL expertise with BigQuery (complex queries, window functions, schema design, cost optimization), and hands-on experience with BigTable (or equivalent) for high-throughput operational data.
- Proven experience designing and operating application-level SLI/SLO frameworks, burn-rate alerting, and error budget policies.
- Strong debugging skills at the application layer — distributed tracing, profiling, structured log analysis, and dependency mapping.
Preferred Qualifications
- Experience building agent evaluation harnesses (benchmarking, regression testing, guardrail validation for AI agents).
- Familiarity with A2A protocols, streaming architectures, and event-driven systems.
- Experience with deployment safety patterns: feature flags, canary deployments, progressive rollouts, and automated rollback.
- Experience with GCP observability services (Cloud Logging, Cloud Trace, Cloud Monitoring).
- Exposure to AIOps concepts: ML-driven anomaly detection, automated root cause analysis, intelligent alerting.
- Experience driving reliability culture across engineering teams — SLO adoption, postmortem processes, and reliability reviews.
- Active engagement with the evolving AI ecosystem; awareness of emerging tools and frameworks.
- Hands-on experience with GenAI application development: LangGraph, agent engineering, prompt design, and agentic workflows.
- Experience building Looker dashboards and LookML models for operational observability.
Why Cisco?
At Cisco, we're revolutionizing how data and infrastructure connect and protect organizations in the AI era – and beyond. We've been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint.
Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you'll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere.
We are Cisco, and our power starts with you.
Message to applicants applying to work in the U.S. and/or Canada:
The starting salary range posted for this position is $199,700.00 to $254,600.00 and reflects the projected salary range for new hires in this position in U.S. and/or Canada locations, not including incentive compensation, equity, or benefits.
Individual pay is determined by the candidate's hiring location, market conditions, job-related skillset, experience, qualifications, education, certifications, and/or training. The full salary range for certain locations is listed below. For locations not listed below, the recruiter can share more details about compensation for the role in your location during the hiring process.
U.S. employees are offered benefits, subject to Cisco's plan eligibility rules, which include medical, dental and vision insurance, a 401(k) plan with a Cisco matching contribution, paid parental leave, short and long-term disability coverage, and basic life insurance. Please see the Cisco careers site to discover more benefits and perks. Employees may be eligible to receive grants of Cisco restricted stock units, which vest following continued employment with Cisco for defined periods of time.
U.S. employees are eligible for paid time away as described below, subject to Cisco's policies:
- 10 paid holidays per full calendar year, plus 1 floating holiday for non-exempt employees
- 1 paid day off for employee's birthday, paid year-end holiday shutdown, and 4 paid days off for personal wellness determined by Cisco
- Non-exempt employees receive 16 days of paid vacation time per full calendar year, accrued at a rate of 4.92 hours per pay period for full-time employees
- Exempt employees participate in Cisco's flexible vacation time off program, which has no defined limit on how much vacation time eligible employees may use (subject to availability and some business limitations)
- 80 hours of sick time off provided on hire date and each January 1st thereafter, and up to 80 hours of unused sick time carried forward from one calendar year to the next
- Additional paid time away may be requested to deal with critical or emergency issues for family members
- Optional 10 paid days per full calendar year to volunteer
For non-sales roles, employees are also eligible to earn annual bonuses subject to Cisco's policies.
Employees on sales plans earn performance-based incentive pay on top of their base salary, which is split between quota and non-quota components, subject to the applicable Cisco plan.
- .75% of incentive target for each 1% of



