AI Infrastructure / Platform Engineer
$140k - $200k
Anthelion Capital Holdings
About Anthelion
Anthelion is a next-generation investment firm building a proprietary AI and data platform that powers our investment lifecycle from underwriting to portfolio management. The platform integrates structured and unstructured data, advanced analytics, and automated workflows to drive superior, risk-adjusted returns in private credit and structured finance. We are engineers and investors working together to redefine how institutional investment decisions are made - faster, smarter, and more transparent.
The Role
We are looking for an AI Infrastructure / Platform Engineer to work on the foundational systems that power our data science and AI platform. You will work across the infrastructure layer beneath our ML and AI workflows: data pipelines, orchestration, compute provisioning, model serving, and observability. You will also play a key role in operationalizing our agentic AI platform, ensuring agents are hosted, monitored, and integrated into production-grade systems.
What You'll Do
Data Pipelines & Orchestration
• Design, build, and maintain production data pipelines that ingest, transform, and deliver structured and unstructured data to downstream ML workflows.
• Own and extend our Prefect-based orchestration layer, including flow scheduling, error handling, retry logic, and human-in-the-loop (HITL) suspend/resume patterns.
• Build and maintain feature stores, data contracts, and promotion workflows that ensure data quality and traceability from raw ingestion through model consumption.
• Collaborate with data scientists to operationalize experimental workflows into reliable, repeatable pipelines.
ML/AI Infrastructure & Deployment
• Build and maintain scalable infrastructure for model training, retraining, and inference (batch and real-time), including GPU compute provisioning and container orchestration.
• Implement and manage model serving infrastructure - including containerized endpoints, API gateways, and self-serve deployment frameworks for the data science team.
• Deploy and manage monitoring systems that track model health, data drift, prediction consumption, and pipeline reliability.
• Ensure all deployed systems are highly available, resilient, and well-documented with clear data lineage and runbooks.
Agentic AI Platform & Tooling
• Support the buildout and operationalization of agentic AI workflows, including agent hosting, lifecycle management, and integration with Model Context Protocol (MCP) servers.
• Build shared tooling and infrastructure that enables data scientists to develop, test, and deploy agents with minimal friction.
• Design and implement evaluation frameworks and quality standards for AI agents, including automated benchmarking, regression testing, and production-readiness criteria.
• Ensure observability and reliability across agent execution environments, including logging, tracing, and performance monitoring.
DevOps & Platform Engineering
• Deploy, configure, and maintain shared AI platform services (e.g., observability tools, memory layers, evaluation platforms) as containerized workloads on Azure - including end-to-end ownership of networking, access, and connectivity between services.
• Manage cloud infrastructure (Azure) including container registries, managed identities, Key Vault secrets, storage backends, and virtual network configurations.
• Maintain CI/CD pipelines, branch protection policies, and release management workflows across data science repositories.
• Continuously evaluate and adopt tools and technologies that improve platform reliability, developer experience, and team velocity.
What We're Looking For
Required
• 3+ years of experience in data engineering, MLOps, or ML infrastructure roles - with a clear track record of building and maintaining production data and ML pipelines.
• Strong proficiency in Python and SQL, with hands-on experience building ETL/ELT pipelines and data transformation workflows.
• Experience with workflow orchestration tools (Prefect, Airflow, Dagster, or similar) in production environments.
• Solid understanding of containerization and cloud infrastructure - Docker, Kubernetes, and at least one major cloud provider (Azure preferred).
• Hands-on experience deploying and operating containerized services in cloud environments, including configuring networking, load balancing, and service-to-service connectivity.
• Experience with model serving and deployment patterns (batch inference, real-time APIs, feature stores).
• Familiarity with monitoring and observability tooling for pipelines and deployed models (data drift detection, health metrics, alerting).
• Strong documentation habits and the ability to communicate technical architecture clearly to diverse stakeholders.
Preferred
• Experience with Azure services: Container Apps, ACI, ACR, Blob Storage, Key Vault, Managed Identities, VNets.
• Familiarity with Prefect (especially cloud-managed work pools, result backends, and HITL patterns).
• Experience with dbt, Snowflake, or similar data transformation and warehousing tools.
• Exposure to LLM serving infrastructure and agentic workflow frameworks (e.g., MCP, LangChain, or similar).
• Experience standing up and maintaining third-party AI/ML platform tools (e.g., Langfuse, MLflow, or similar observability and evaluation platforms).
• Experience managing internal Python package distribution (private PyPI, Artifactory, or similar).
• Familiarity with Git-based release management, branch protection, and CI/CD for data science repos.
Why Join Anthelion
• Build at the frontier of AI, data, and finance - where infrastructure directly shapes institutional investment decisions.
• Work on greenfield architecture with high autonomy and technical depth.
• Collaborate with a multidisciplinary team of data scientists, engineers, and investors.
• Culture grounded in technical excellence, transparency, and measurable impact.
Benefits
• Comprehensive health, dental, and vision insurance.
• Retirement savings plan with company match.
• Hybrid/flexible work arrangements and a supportive work environment.
Culture
• Demonstrates a strong bias for action and executes quickly with limited guidance.
• Takes full ownership of outcomes and drives problems to resolution.
• Approaches challenges with a solutions-first mindset and delivers measurable results.
• Maintains composure under pressure while keeping momentum and focus.
• Simplifies complex issues into clear, actionable steps that move the work forward.
Base Salary Range: $140,000 to $200,000 per year
Vacancy posted 4 days ago

