Head of AI Inference & MLOps
SupportFinity
Location: Austin, Texas area / On-site preferred
Project: 7MW Phase I AI Datacenter → 50MW Campus Expansion
Reports to: Founders / Executive Team

About The Project

We are building a high-density AI datacenter campus outside Austin, Texas, beginning with approximately 7MW of NVIDIA GB300 NVL72 infrastructure and scaling to 50MW+. The initial deployment is designed around real-time inference, reasoning, and high-value AI serving workloads, with a focus on monetizing capacity in live markets rather than simply leasing powered space.

This is not a traditional datacenter operations role. We are hiring the person who will make the racks make money.

This leader will own the strategy and execution required to turn rack-scale GPU infrastructure into a profitable inference business: selecting the right models, runtimes, orchestration stack, routing layer, pricing strategy, customer segments, and marketplace relationships to maximize revenue, uptime, and utilization.

The right candidate understands that raw compute is not the business. Monetized tokens, latency-adjusted utilization, and gross margin are the business.

The Role

We need a senior operator-builder who can sit at the intersection of:

- AI infrastructure
- Inference performance engineering
- Model serving and routing
- Marketplace monetization
- Customer / partner integration
- Revenue optimization

You will design and run the inference platform that determines how our GB300 NVL72 racks are monetized in the real-time market. That may include direct enterprise workloads, marketplace distribution, API-based reselling, model hosting, fine-tuned/private deployments, and emerging inference channels.

You should know what makes money on modern inference hardware, what does not, and why. You should be able to answer questions like:

- Which open-weight and commercial-compatible models should run on this hardware first?
- How should workloads be split between premium low-latency serving, bulk throughput, reserved capacity, and experimental capacity?
- Should we route through third-party marketplaces, sell directly, or do both?
- What software stack gives us the best performance per watt, per GPU, and per dollar of capex?
- How do we maximize realized revenue rather than theoretical benchmark performance?
- How do we scale from a 7MW launch to a repeatable 50MW AI factory operating model?

What You’ll Own

- Build and lead the inference monetization strategy for our first 7MW deployment and expansion to 50MW
- Define the technical and commercial operating model for turning GB300 NVL72 racks into revenue-producing assets
- Evaluate and implement the model serving stack, scheduling layer, inference engine, observability stack, and API platform
- Select and optimize the mix of workloads across:
  - Real-time inference
  - Reasoning workloads
  - Premium low-latency API traffic
  - Batch / overflow workloads
  - Dedicated enterprise deployments
  - Private/fine-tuned model hosting
- Identify the best go-to-market channels for capacity monetization, including direct sales and marketplace/API distribution partners
- Develop strategy for integration with platforms such as OpenRouter-style aggregation, OpenAI-compatible endpoints, and other inference distribution channels where appropriate. OpenRouter provides a unified API and provider aggregation layer, while Inference.net offers an OpenAI-compatible API experience around model access and deployment, making both relevant examples of the ecosystem this role would evaluate.
- Own benchmarking methodology based on actual profit and production metrics, not vanity metrics
- Drive workload placement decisions based on revenue per rack, revenue per GPU-hour, revenue per MW, latency targets, and customer value
- Partner with datacenter engineering, networking, and facilities teams to ensure the physical plant supports the intended software monetization strategy
- Build pricing, SLAs, utilization strategy, and customer segmentation framework
- Create dashboards and control systems for:
  - Utilization
  - Queue health
  - Latency
  - Token throughput
  - Margin by workload
  - Failure rate
  - Realized revenue by cluster / rack / model / customer
- Lead decisions around multi-tenant vs single-tenant deployments, reserved vs on-demand capacity, and when to prioritize direct contracts over marketplace traffic
- Build and manage the team required to scale this function over time

What Success Looks Like

In the first 3–6 months, you will:

- Stand up a production inference platform for our initial GB300 NVL72 deployment
- Recommend the highest-value initial workloads and monetization channels
- Launch a repeatable commercialization strategy for rack capacity
- Establish a clear performance and revenue measurement framework
- Identify where we should sell capacity: direct, through marketplaces, via strategic partners, or through a hybrid approach
- Turn the first cluster into a measurable cash-generating operation

In the first 12 months, you will:

- Build the operating playbook for scaling from 7MW to 50MW
- Increase utilization without destroying margins or SLA quality
- Improve realized revenue per rack through model, routing, pricing, and customer mix optimization
- Establish the company as a serious real-time inference operator, not just a GPU owner

Required Experience

- Significant experience in production AI/LLM inference, MLOps, model serving, or AI infrastructure monetization
- Proven experience running or scaling GPU-backed inference systems in production
- Strong understanding of modern inference runtimes, serving frameworks, and optimization techniques
- Experience with one or more of:
  - vLLM
  - TensorRT-LLM
  - SGLang
  - Ray Serve
  - Triton Inference Server
  - Kubernetes-based GPU orchestration
  - Custom routing / scheduler layers
- Experience optimizing for real-world production metrics such as throughput, latency, GPU utilization, availability, and cost efficiency
- Strong understanding of LLM inference economics, including tradeoffs among model size, quantization, latency, throughput, memory footprint, and customer willingness to pay
- Experience building or managing API-based AI platforms or inference products
- Ability to translate infrastructure capability into a pricing and product strategy
- Experience working with enterprise customers, developer platforms, or AI marketplaces
- Strong technical judgment on model selection, infrastructure topology, and commercialization strategy

Preferred Experience

- Experience monetizing large-scale NVIDIA GPU infrastructure
- Experience with rack-scale or cluster-scale inference environments
- Background in both technical operations and business strategy
- Familiarity with AI inference aggregators, routing platforms, and model marketplaces
- Experience designing multi-tenant GPU systems with strong isolation and predictable performance
- Experience with advanced observability, token-level metering, cost accounting, and SLA enforcement
- Familiarity with reasoning-model workloads, agentic inference, multimodal inference, and future high-density AI factory architectures
- Experience supporting OpenAI-compatible APIs and enterprise private deployments

What Makes Someone Great In This Role

- You know the difference between "high benchmark performance" and "high realized revenue"
- You understand that some workloads are great for utilization but terrible for margin
- You can spot when a shiny model is commercially useless
- You know how to tune systems for the workloads customers will actually pay for
- You are opinionated about the stack, but flexible about the business model
- You can go deep technically and still think like an owner

Compensation

Competitive salary, bonus, and equity participation tied to the scale, importance, and revenue generated from the role.

About the company

Deeter Analytics

