LLM Inference & GPU Systems Architect

NTT Data Americas, Inc.

Company Overview

NTT DATA is a $30 billion trusted global innovator of business and technology services. We serve 75% of the Fortune Global 100 and are committed to helping clients innovate, optimize and transform for long-term success. As a Global Top Employer, we have diverse experts in more than 50 countries and a robust partner ecosystem of established and start-up companies. Our services include business and technology consulting, data and artificial intelligence, industry solutions, as well as the development, implementation and management of applications, infrastructure and connectivity. We are one of the leading providers of digital and AI infrastructure in the world. NTT DATA is a part of NTT Group, which invests over $3.6 billion each year in R&D to help organizations and society move confidently and sustainably into the digital future. Visit us at us.nttdata.com.

Role Overview

We are seeking an AI Infrastructure Runtime Engineer to build and maintain large-scale on-prem LLM infrastructure. This is an enterprise private GenAI environment running on NVIDIA H200 GPU clusters and an OpenShift AI deployment ecosystem. You will manage production inference internally, including self-hosting open-source LLMs like Llama. We are focused exclusively on inference; this role involves no model training infrastructure or fine-tuning pipelines.

Key Responsibilities

  • NVIDIA GPU Runtime Optimization: Drive extreme runtime efficiency across the token generation pipeline, specifically prefill/decode optimization and KV cache management.
  • Inference Serving: Deploy and manage inference engines including vLLM and TensorRT-LLM.
  • Hardware Utilization: Tune GPU throughput, batching strategies, and latency. Manage workload orchestration using RunAI and Kubernetes.
  • Model Lifecycle Management: Oversee the complete Hugging Face model lifecycle, including model onboarding, deployment, and retirement.
  • Platform Operations: Operate and maintain the OpenShift AI ecosystem as the primary container platform for GenAI workloads.

Required Qualifications

  • 8 years of experience working as an LLM Systems Engineer or AI Infrastructure Runtime Engineer.
  • 8 years of hands-on experience with NVIDIA H200 clusters and runtime optimization techniques (KV cache, prefill/decode).
  • Proficiency in OpenShift AI and GPU orchestration tools like RunAI.
  • Strong experience with modern inference frameworks, specifically vLLM and TensorRT-LLM.
  • Proven track record managing the Hugging Face deployment lifecycle.
  • Must be onsite at client in Charlotte, NC at least 3 days/week.

NTT DATA endeavors to make accessible to any and all users. If you would like to contact us regarding the accessibility of our website or need assistance completing the application process, please contact us at . This contact information is for accommodation requests only and cannot be used to inquire about the status of applications.

NTT DATA is an equal opportunity employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or protected veteran status. For our EEO Policy Statement, please click here. If you’d like more information on your EEO rights under the law, please click here. For Pay Transparency information, please click here.

Vacancy posted 2 days ago