Solutions Architect, Inference Deployments

$152k - $241.5k

NVIDIA

We’re forming a team of innovators to roll out and enhance AI inference solutions at scale, demonstrating NVIDIA’s GPU technology and Kubernetes. As a Solutions Architect focused on inference, you’ll collaborate closely with our engineering and DevOps teams and with customers to develop enterprise AI solutions. Together, we'll deliver generative AI to production!

What you'll be doing:

  • Build inference pipelines with tools like NVIDIA Dynamo, distributing tasks among GPU workers to improve efficiency.

  • Collaborate with DevOps teams to orchestrate disaggregated inference using Kubernetes for complex workloads.

  • Accelerate inference pipelines using TensorRT-LLM, vLLM, SGLang, and other backends to ensure seamless integration with disaggregated inference.

  • Provide mentorship and technical leadership to customers and internal teams, guiding them through the deployment of disaggregated inference systems and resolving complex issues.

What we need to see:

  • 5+ years in Solutions Architecture with a proven track record of deploying distributed systems and AI inference workloads on Kubernetes.

  • Experience with one of NVIDIA Dynamo, Triton Inference Server, or TensorRT-LLM for model optimization and serving.

  • GPU orchestration using NVIDIA GPU Operator, NIM Operator, and Multi-Instance GPU (MIG) partitioning.

  • Experience solving sophisticated problems in GPU allocation, memory hierarchies (HBM, DRAM, SSD), and low-latency networking (RDMA, UCX).

  • Demonstrated success in tuning large language models for low-latency inference in enterprise environments.

  • BS in CS/Engineering or equivalent experience.

Ways to stand out from the crowd:

  • Prior experience deploying NVIDIA inference technologies such as Dynamo, NIM, NIXL and Grove.

  • Deep understanding of transformer neural networks and inference acceleration techniques such as quantization, speculative decoding, and WideEP.

  • NVIDIA Certified AI Engineer or similar credentials.

  • Contributions to open-source projects including NVIDIA Dynamo, vLLM, KServe, or SGLang.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 152,000 USD - 241,500 USD for Level 3, and 184,000 USD - 287,500 USD for Level 4.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until April 19, 2026.

This posting is for an existing vacancy.

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

Vacancy posted 14 hours ago