AI / ML Engineer (with Observability)
Mindlance
Hybrid onsite at Dallas, TX, 75019 / Tampa, FL, 33647
CTH
2 rounds of interviews
Overview
We are seeking a passionate and hands-on AI/ML Engineer to accelerate our Enterprise Observability strategy. This role will design, build, and operationalize AI/ML capabilities that enhance end-to-end telemetry pipelines, anomaly detection, intelligent alerting, and proactive system resiliency.
You will work at the intersection of AI/ML engineering, Observability platforms, and automation, developing solutions that improve detection, diagnosis, and prevention of operational issues across distributed systems.
Key Responsibilities
• Design and deploy AI/ML models supporting anomaly detection, baselining, event correlation, and predictive operational analytics.
• Build and integrate AI-enabled capabilities into enterprise Observability platforms, including Grafana, APM/RUM tools, network telemetry systems, and data observability tools.
• Develop AI Agents that can autonomously triage issues, recommend corrective actions, and initiate automated remediation workflows to reduce recovery time and improve system resilience.
• Implement self-healing automation using AI-driven decisioning, integrating with orchestration frameworks, service APIs, and infrastructure automation pipelines.
• Engineer and maintain real-time and batch data pipelines using Snowflake ML Jobs, Snowflake Cortex, streams, tasks, and UDFs.
• Implement and manage OpenTelemetry-based telemetry ingestion for logs, metrics, traces, and spans across distributed systems.
• Build asynchronous Python APIs and services for model inferencing and operational integration.
• Enhance observability intelligence with AI-powered capabilities such as root-cause acceleration, chatbot/search enablement, and automated insights.
• Contribute to SLO/SLI modeling, Golden Signals instrumentation, and Observability NFR adoption.
• Collaborate across engineering, SRE, platform, and business teams to embed proactive intelligence and Observability standards throughout the ecosystem.
Required Skills & Qualifications
Core Technical Skills
• Strong proficiency in Python and data science/ML libraries:
NumPy, Pandas, scikit-learn, TensorFlow, PyTorch, Matplotlib, Seaborn.
• Experience with Generative AI, LLM fine-tuning, prompt engineering, RAG pipelines, and LLM evaluation frameworks.
• Expertise in developing and deploying ML models in production (batch & streaming).
• Strong understanding of statistics, time-series modeling, and anomaly detection.
Observability & Telemetry
• Experience with OpenTelemetry for logs, metrics, traces, spans.
• Familiarity with Observability concepts:
Golden Signals, SLO/SLI design, APM, RUM, Synthetics, event correlation, baselining.
• Experience with Observability tools such as:
Grafana (Alloy agents, dashboards, ML capabilities), Dynatrace, Monte Carlo (Data Observability), Netscout, ThousandEyes, SolarWinds, NetBrain.
Cloud, Data & Platform
• Hands-on experience with AWS (SageMaker, Bedrock), Snowflake ML, Snowflake/Openflow, Snowflake AI Observability tooling.
• Experience building Snowflake data pipelines (streams, tasks, UDFs); experience with Cortex features is a plus.
• Strong understanding of distributed systems and microservices telemetry requirements.
Automation & Engineering Quality
• Experience with automation pipelines, CI/CD, and infrastructure-as-code patterns supporting Observability adoption.
• Ability to build asynchronous Python APIs or services for model inference and operational integration.
Preferred Qualifications
• Experience developing agentic AI systems that analyze telemetry, generate action recommendations, or execute automated operational responses.
• Experience building self-healing patterns, including automated rollback, service restarts, configuration corrections, and predictive maintenance.
• Experience in Snowflake ML workflows, Snowflake Cortex Agents, and data pipeline automation.
• Exposure to AI-enabled alerting, RCA automation, and operational self-healing concepts.
• Experience with large-scale operational telemetry and multi-cloud ecosystems.
Soft Skills
• Strong analytical thinking and problem-solving.
• Excellent communication skills for cross-functional collaboration with infrastructure, SRE, engineering, business, and leadership teams.
• Curiosity, continuous learning mindset, and passion for applied AI and Observability.
EEO:
"Mindlance is an Equal Opportunity Employer and does not discriminate in employment on the basis of - Minority/Gender/Disability/Religion/LGBTQI/Age/Veterans."
Vacancy posted 3 days ago