GenAI Inference Architect — Onsite GPU (H200, OpenShift)
United Software Group Inc
United Software Group Inc is searching for an LLM Inference & GPU Systems Consultant based in Charlotte, NC. The role involves building and maintaining large-scale on-prem LLM infrastructure, specifically focusing on NVIDIA H200 GPU clusters and OpenShift AI deployment. The ideal candidate will possess over 8 years of experience in optimizing GPU runtime, deploying inference engines like vLLM and TensorRT-LLM, and must be onsite at least 3 days a week.
$65 - $70 per hour
...years of AI experience and 3 years in product management for AI/ML platforms. The job offers a pay range of $65/hr to $70/hr, employee benefits including health insurance and a 401(k) plan, and requires onsite presence. Creative Solutions Services, LLC
Suggested, Contract work
$180k - $200k
GenAI Architect Location: East Coast, USA (Onsite) Level: Director About the Role Cognizant is seeking a Director-level GenAI Architect to lead strategy, delivery, and client engagement for high‑impact AI initiatives. This is a highly visible role that blends GenAI solutioning...
Suggested, Temporary work
- ...Local candidates only. Must be onsite at client in Charlotte, NC at... ...We are seeking a Senior Cloud GenAI Governance Engineer to manage... ..., through Model Armor, to the inference endpoint, and finally into... ...gateway, load balancing, and GPU saturation. AI Observability...
  Suggested, Local area, 3 days per week
- ...NC, is looking for a Senior Generative AI Architect to lead the design and implementation of... ...candidate will define a strategic roadmap for GenAI, evaluate architectures, and ensure model... ...in GenAI best practices. The job is onsite from day one, aligning with client goals...
  Suggested
- NTT DATA North America is seeking a Principal GenAI Architect in Charlotte, North Carolina. This role involves serving as the ultimate technical... ...experience in programming, LLM optimization, Kubernetes, and GPU orchestration. This position offers an exciting opportunity to...
  Suggested
- ...AI/ML Inference Engineer Major Financial Services... ...| Hybrid - 3 Days Onsite | Charlotte, NC... ...model serving, GPU infrastructure, and... ...performance on NVIDIA H200 GPU clusters in a... ...8, AWQ, GPTQ) Architect and manage GPU... ...Kubernetes using KServe, OpenShift AI, Helm, and Run:...
  Immediate start
- ...your earliest convenience, and I'll be happy to provide more details about the role. Position: Blockchain Architect Location: Charlotte, NC (Hybrid 3 Days Onsite) Duration: Long Term Contract Job Description: We are seeking a Senior Architect to join our Architecture group...
  Long term contract, Contract work, Local area
$120k - $125k
...AI Architect Compensation: $120K - $125K per annum with no benefits... ...Local to Charlotte Only Onsite Interview and Onsite from Day... ...Learning (ML), Generative AI (GenAI), and Large Language Model (LLM... ...engineering, model training, inference pipelines, and monitoring...
Local area
- ...Job Title: GenAI Enterprise Architect Duration: 06 Months (potential to extend) Location: Charlotte, NC - Hybrid Role (3 Days Onsite/2 Days WFH) Description: We are seeking a visionary and technically adept GenAI Enterprise Architect to lead the design...
  Work from home
- ...Job Title: Senior Cloud GenAI Governance Engineer Location: Charlotte, NC (Hybrid - 3 Days Onsite Required) Job Description: We are seeking a Senior... ...cloud-native infrastructure, and distributed inference operations. Key Responsibilities:...
- ...We are currently seeking an AI Architect to join our team in Charlotte,... ...Overview: We are seeking a Principal GenAI Architect to serve as a hands-... ...reliability. Infrastructure, Inference & Edge Computing: Design,... ...strategy, including rigorous GPU management, utilization, and...
  Local area
- Job Overview Principal GenAI Architect - Charlotte, North Carolina, United States. Key Responsibilities... ...and AI/ML platforms. Optimize LLM inference, implementing advanced batching, caching... ...hardware strategy, including rigorous GPU management, utilization, and thermal/...
  Local area
- ...Engineer (Hands-On) - Azure Platform, AI/ML & GenAI Work Authorization: GC & USC only... ...) Work Model: Hybrid - 3 days onsite mandatory Role Summary... ...support training, deployment, and scalable inference. Build reusable Terraform modules...
  Remote work
$70.23 per hour
...Title: Senior Generative AI Architect/Developer Location: Charlotte... ...Kennesaw, GA | Richmond, VA - Onsite Required Duration: Contract -... ...Responsibilities: Architect and implement GenAI solutions using open-source... ...patterns and scalable inference services. Build and integrate...
Contract work
$180k - $200k
Cognizant is seeking a Director-level GenAI Architect based in Charlotte, NC, to lead strategy, delivery, and client engagement for GenAI initiatives. This critical role involves managing GenAI solutions and portfolio growth while engaging with clients to develop AI roadmaps...
- ...Salesforce Solutions Architect Location: Charlotte, NC (100% Onsite) Duration: 6-12 Months Rate: DOE US Citizens and Green cards are Preferred. No 3rd Party C2C Acceptable. Position Overview: The Salesforce Architect will be responsible for optimizing and tuning...
$185k - $235k
...be doing? The AI Solutions Architect is the technical pre-sales partner... ...architectures spanning GPU/compute, data platforms, AI/ML... ...stack, MLOps pipeline, and inference deployment. Working knowledge... ..., Dental, and Vision Care, Onsite Health Centers, Employee Assistance...
Full time, Shift work
$75 - $85 per hour
...pharmaceutical industries, is seeking a CLM Architect to join their growing team working on... ...chances of interviewing at Optomi by 2x Inferred from the description for this job... ...Architect Principal Security Architect - GenAI and Emerging Technologies - Remote Charlotte...
Contract work, Remote work
- Capgemini is seeking a Senior Generative AI Full Stack Engineer based in Charlotte, NC. This role focuses on developing GenAI-enabled tools and applications, with an emphasis on creating user-facing interfaces that interact with AI systems. The candidate should have strong...
- ...Mandatory Keywords) LLM Inference & Optimization... ...AWQ, GPTQ Distributed & GPU Systems Tensor parallelism... ...serving platforms KServe, OpenShift AI Helm charts, Operators... ...Experience with LLMOps / GenAI pipelines Exposure to hybrid...
- Role Gen AI Architect Location Charlotte, NC (Hybrid - 3 days office in a week) Position Type... ...machine learning and Generative AI (GenAI) use cases. Design comprehensive end-to-... ...ingestion, feature engineering, model training, inference pipelines, and monitoring frameworks....
  Contract work, Work at office, 3 days per week
$135k - $155k
...countries within key global markets. The Gen AI Architect will leverage advanced Generative AI... ...machine learning and Generative AI (GenAI) use cases. Design comprehensive end-to-... ...ingestion, feature engineering, model training, inference pipelines, and monitoring frameworks....
Temporary work, Flexible hours
- A leading consulting firm is seeking an Advanced AI Architect in Charlotte, NC to design and deliver full stack AI architectures for various platforms. The role involves leading workshops, designing multi-agent architectures, and implementing AI solutions. The ideal candidate...
- A global consulting firm based in Charlotte, NC is seeking a Gen AI Architect. This role involves leveraging advanced Generative AI models for investment banking processes, requiring 10+ years of AI/ML architecture experience. Responsibilities include defining the AI/ML...
- Job Title: Generative AI Architect Location: Charlotte, NC (Onsite from Day 1) Job Type: Contract... ...engineering frameworks Participate in cross-functional GenAI initiatives and PoCs across entire software lifecycle...
  Contract work
- Job Title: AI QE Architect Location: Charlotte, North Carolina (onsite) Employment Type: contract Job Description We are seeking an AI QE Architect to serve... ...of leading enterprise‑scale transformations. AI & GenAI Expertise: Proven experience with Agentic AI and GenAI...
  Contract work
- Software Engineer 4 - GenAI / Python Full Stack Contract: 18-Month... ...TX Schedule: Hybrid - 3 Days Onsite Required Interview Process: 1... ...cloud-native applications in OpenShift (OCP) and GCP environments... ...stakeholders, and enterprise architects Mentor junior engineers and contribute...
  Contract work
$60.24 - $68.24 per hour
...Application Architect Genesis10 is currently seeking an Application Architect for an Onsite position with a Global Financial Institution located in Charlotte, NC. This is a 12+ month contract opportunity. In this role, you will independently develop, enhance, debug...
Hourly pay, Contract work
- A leading AI solutions provider is seeking an experienced AI Engineer to design and deploy large language models and generative AI systems. The role requires expertise in Python, FastAPI, and MLOps, alongside the ability to build multi-agent workflows. Candidates should...
  Remote work
$69 - $74 per hour
...Financial Services Team: THBA Job Title: Software Engineer 4 / Data Platform Engineer (Kubernetes / OpenShift) Location: Charlotte, NC - Hybrid (3 days onsite) Contract Length: 12 months (strong potential to extend to 24 months; possible conversion) Pay...
Contract work

