Remote Senior Scalability Engineer - Observability
$110.4k - $213k
About Judi Health
Judi Health is an enterprise health technology company providing a comprehensive suite of solutions for employers and health plans, including:
Capital Rx, a public benefit corporation delivering full-service pharmacy benefit management (PBM) solutions to self-insured employers,
Judi Health™, which offers full-service health benefit management solutions to employers, TPAs, and health plans, and
Judi®, the industry’s leading proprietary Enterprise Health Platform (EHP), which consolidates all claim administration-related workflows in one scalable, secure platform.
Together with our clients, we’re rebuilding trust in healthcare in the U.S. and deploying the infrastructure we need for the care we deserve. To learn more, visit .
Location: Remote
Position Summary:
Join our Scalability team as a Senior Scalability Engineer focused on observability platform development and engineering productivity. In this role, you will define, own, and build Judi Health's organization-wide observability strategy, tooling, and platform products. Beyond maintaining infrastructure, you'll architect and develop a custom observability platform that gives engineering teams powerful, fast, and cost-effective visibility into every layer of our infrastructure, from application logs and metrics to distributed traces. You'll build production-grade internal products using React/TypeScript frontends with Python and Rust backends, creating tools that fundamentally improve how engineers at Judi Health debug, monitor, and optimize their systems. You will work closely with leadership and cross-functional teams, and your work will be foundational to platform stability, performance optimization, and developer productivity across our rapidly growing healthcare platform.
Position Responsibilities:
In this role, you'll own the observability infrastructure that powers our engineering organization. You will:
Architect observability platform: Design, implement, and maintain the LGTM stack (Loki, Grafana, Tempo, Mimir/Prometheus) as the primary observability platform across all engineering teams, making architectural decisions that balance cost, performance, and developer experience.
Build internal observability products: Design and develop production-grade internal platform products with React/TypeScript frontends and Python/Rust backends that provide engineers with powerful log search, metrics visualization, and trace analysis capabilities.
Develop custom log indexing systems: Architect and build high-performance log indexing solutions using Rust that process logs and provide sub-second search across billions of log lines at a fraction of the cost.
Integrate SQL analytics for logs: Design and implement solutions leveraging AWS Athena or similar SQL query engines (DuckDB, ClickHouse) for ad-hoc log analysis and historical queries, enabling engineers to run complex SQL queries over S3-based log data for deep investigations and trend analysis.
Create advanced query interfaces: Build sophisticated web interfaces that allow engineers to query logs, metrics, and traces with features like saved queries, query templates, correlation analysis, and pattern detection, supporting both full-text search and SQL-based analytics.
Balance cloud-native and open-source: Architect solutions that thoughtfully leverage both AWS-managed services (CloudWatch, Athena, Kinesis) and open-source tooling (LGTM stack, Quickwit) to optimize for cost, performance, and operational flexibility based on use case requirements.
Integrate AWS observability: Design seamless integration between AWS CloudWatch Logs/Metrics and our custom observability platform, providing unified visibility across managed and self-hosted infrastructure.
Build intelligent alerting: Develop smart dashboards, monitors, and alerting systems that reduce noise, detect anomalies, and help teams respond to incidents quickly.
Partner with engineering teams: Work directly with product teams to integrate observability into their services, establish logging and metrics standards, and instrument code effectively, serving as the observability subject matter expert.
Enable performance optimization: Provide the observability foundation that allows the Scalability team to identify performance bottlenecks, track optimization impact, and measure platform stability with data-driven insights.
Establish observability standards: Define and document comprehensive observability standards including structured logging patterns, metric naming conventions, trace instrumentation, dashboard design principles, and query best practices.
Drive platform adoption: Lead workshops, create documentation, and build self-service tooling that democratizes observability across engineering, making it easy for teams to adopt best practices.
Demonstrate technical leadership: Mentor engineers on observability practices, lead architecture reviews for instrumentation approaches, and represent the Scalability team in cross-functional planning.
Work in an Agile/Scrum environment to continually deliver value to stakeholders and clients.
Code of Conduct: Responsible for adherence to the Capital Rx Code of Conduct including reporting of noncompliance.
Required Qualifications:
10+ years of software engineering or infrastructure engineering experience with demonstrated progression into technical leadership roles.
Several years of experience leading technical initiatives, building platform products, or serving as a subject matter expert on observability infrastructure.
Strong experience with React/TypeScript for frontend development and Python (Flask/SQLAlchemy) for backend services.
LGTM stack expertise: Deep production experience with Loki, Grafana, Tempo, and Prometheus/Mimir for logs, metrics, and distributed tracing at scale.
AWS observability: Extensive experience with AWS CloudWatch Logs and Metrics, including custom metrics, log insights, dashboard creation, and integration patterns.
SQL analytics for logs: Production experience with SQL-based log analytics using AWS Athena, DuckDB, or similar query engines for analyzing structured and semi-structured data at scale.
Cloud-native and open-source balance: Demonstrated ability to architect solutions leveraging both managed cloud services and open-source tooling, understanding trade-offs between operational overhead, cost, flexibility, and vendor lock-in.
Search and indexing experience: Hands-on experience building or operating search systems using OpenSearch, Elasticsearch, Lucene, Tantivy, or similar search and analytics engines.
Performance-critical systems: Experience building high-performance systems that process large volumes of data efficiently (millions of log lines, high-cardinality metrics).
Systems thinking: Deep understanding of distributed systems, microservices architectures, and the complex observability challenges they present.
Data at scale: Proven track record handling high-volume structured and unstructured logging data, identifying patterns, and building efficient search/query solutions that perform well under load.
Product mindset: Ability to build internal platform products that engineers love to use, with attention to UX, performance, and reliability.
Preferred Qualifications:
Rust development experience: Production experience with Rust for building high-performance data processing, indexing, or search systems. Strong interest in learning Rust is acceptable if combined with systems programming experience in C/C++/Go.
Infrastructure as code: Experience with Terraform for managing observability infrastructure and AWS resources.
Additional observability platforms: Experience architecting or operating Datadog, New Relic, Splunk, or other enterprise observability platforms.
Advanced query languages: Deep expertise with PromQL, LogQL, SQL optimization, and query optimization for high-cardinality data.
Columnar storage formats: Experience with Parquet, ORC, or other columnar storage formats for efficient log storage and analytics on S3.
Incident management: Experience designing incident response workflows, postmortem processes, and SLO/SLI frameworks that drive reliability improvements.
Cost optimization: Track record of reducing observability costs while maintaining or improving capabilities (e.g., CloudWatch → S3/custom indexing migration).
Data pipelines: Experience with streaming data pipelines, ETL processes, or real-time data processing.
Distributed tracing: Deep knowledge of OpenTelemetry, Jaeger, Zipkin, or distributed tracing architectures.
Git expertise and experience working in a mono repository.
Previous Pharmacy Benefits Manager (PBM) or healthcare technology experience.
Experience building developer tools or internal platforms that improve engineering productivity.
This range represents the low and high end of the anticipated base salary range for the NY-based position. The actual base salary will depend on several factors, such as experience, knowledge, and skills, and on whether the location of the job changes.
Nothing in this position description restricts management’s right to assign or reassign duties and responsibilities to this job at any time.
This range represents the low and high end of the anticipated base salary range. The actual base salary will depend on several factors, such as experience, knowledge, skills, and the location of the job.
Remote, US Salary Range
$110,400 - $213,000 USD
All employees are responsible for adherence to the Capital Rx Code of Conduct including the reporting of non-compliance. This position description is designed to be flexible, allowing management the opportunity to assign or reassign duties and responsibilities as needed to best meet organizational goals.
Judi Health values a diverse workplace and celebrates the diversity that each employee brings to the table. We are proud to provide equal employment opportunities to all employees and applicants for employment and prohibit discrimination and harassment of any type without regard to race, color, religion, age, sex, national origin, disability status, medical condition, genetic information, protected veteran status, sexual orientation, gender identity or expression, or any other characteristic protected by federal, state or local laws.
By submitting an application, you agree to the retention of your personal data for consideration for a future position at Judi Health. More details about Judi Health's privacy practices can be found at .

