
LLM Inference Engineer - Distributed Systems at Scale

Gravity Engineering Services Pvt Ltd.

Gravity Engineering Services Pvt Ltd. is looking for a Distributed LLM Inference Engineer to join their team. This critical role focuses on improving ML inference performance and ensuring scalability and efficiency in solutions used by both open-source and corporate clients. The ideal candidate will work closely with product teams, integrating Ray Data and LLM engines, while keeping abreast of the latest innovations in the field. Familiarity with deep learning frameworks like PyTorch and knowledge of distributed systems are crucial.

Vacancy posted 4 days ago
Similar jobs that could be interesting for you
Based on the LLM Inference Engineer - Distributed Systems at Scale in San Francisco, CA vacancy
  • Anyscale is seeking a Distributed LLM Inference Engineer in San Francisco, California. This pivotal role involves...  ...of performance for ML inference at scale. You'll work closely with product...  ...solid understanding of distributed systems and familiarity with deep learning... 
    Suggested

    Anyscale

    San Francisco, CA
    3 days ago
  •  ...re on a mission to democratize distributed computing and make it...  ...developer or data scientist can scale an ML application from their laptop...  ...without needing to be a distributed systems expert. About the Role As a Distributed LLM Inference Engineer, you will help systems and... 
    Suggested

    Gravity Engineering Services Pvt Ltd.

    San Francisco, CA
    4 days ago
  •  .... Our vision is AI systems that are flexible,...  ...intelligence - the inference services that serve LLMs at scale and the data pipelines...  ...Researchers and ML engineers will hand you...  ...Design and operate distributed inference systems for...  ...on experience with LLM inference engines (... 
    Suggested
    Flexible hours

    Adaption

    San Francisco, CA
    23 days ago
  • $350k

     ..., and steerable AI systems. We want AI to be safe...  ...researchers, engineers, policy experts, and...  ...Role Anthropic's inference fleet serves Claude...  ..., model servers, distributed routing, autoscaling...  ...infrastructure or general LLM serving stacks. Direct large-scale inference... 
    Suggested
    Work at office
    Visa sponsorship
    Flexible hours

    Anthropic

    San Francisco, CA
    1 day ago
  • $160k - $230k

     ...LLM Inference Frameworks and Optimization Engineer San Francisco, Singapore, Amsterdam About...  ..., develop, and optimize distributed inference engines that support...  ...and language models at scale. This role will focus on...  ...frameworks, distributed systems, or high-performance... 
    Suggested
    Full time

    Together AI

    San Francisco, CA
    25 days ago
  • Staff Software Engineer, ML Infra & Distributed Systems About the Role: As a Staff Software...  ...world-class machine learning inference platforms. These platforms...  ...support Deep Learning, LLM, and Search models. This involves...  ...of our infra. Lead large scale cross functional... 

    Tubi Tv

    San Francisco, CA
    4 days ago
  • $146.5k

     ...the team: The ML Data Engineering team powers metadata...  ...of users worldwide. Our systems operate at massive scale, supporting diverse datasets...  ...learning, data engineering, and distributed systems, collaborating...  ...to deploy scalable ML and LLM-powered solutions in production... 
    For contractors
    Local area
    Worldwide
    Home office
    Flexible hours

    Scribd

    San Francisco, CA
    3 days ago
  •  ...team at Redis, shipped 100+ LLM applications, and is a contributor...  ..., integrations, distributed systems, and AI experts from Okta, Redis...  ...ship. ~7+ years of software engineering experience comprising of:...  ...contributions Experience with high-scale distributed systems... 
    Work at office
    Shift work

    Arcade AI, Inc

    San Francisco, CA
    4 days ago
  • $180k - $310k

     ...the role You'll build and scale the application and data...  ...core data model and storage systems powering Gamma's business. You...  ...shipping velocity. As Software Engineer on the Platform team, you'll...  ...and implement scalable APIs, distributed systems, and data... 
    Full time
    Work at office
    Work from home

    Gamma

    San Francisco, CA
    6 days ago
  • $170k - $260k

    Software Engineer, Distributed Systems (Core) | Location: San Francisco...  ...data pipelines and integrations that operate at scale. Demonstrated ability to manage the full software development... 
    Work at office
    Remote work
    Visa sponsorship

    Recruiting from Scratch

    San Francisco, CA
    10 days ago
  • $150k - $215k

    Artie Software Engineer (Distributed Systems) $150K - $215K | San Francisco, CA, US Job type: Full-time Role: Engineering, Backend Experience:...  ...ease-of-use and extensibility Past experience working on scaling async systems and exposure to topics like gRPC, Kafka, Kubernetes... 
    Full time
    Visa sponsorship

    Voiceflow

    San Francisco, CA
    2 days ago
  •  ...of Technical Staff to design and build distributed systems for AI workloads. The role involves...  ...candidates should have strong software engineering skills and experience with distributed...  ...infrastructure work and can operate systems at scale.

    Gimlet Labs

    San Francisco, CA
    1 day ago
  •  ...cybersecurity company is seeking an experienced Infrastructure Engineer to optimize and maintain their platform components. This remote position involves solving complex distributed systems problems and scaling infrastructure using Go, Kubernetes, GCP, and AWS. Ideal... 
    Remote job

    Palo Alto Networks

    San Francisco, CA
    2 days ago
  •  ...A lightweight, model-agnostic system that enforces policy, prevents...  .... By bridging the gap between LLM capabilities and domain-specific...  ...'s Senior Machine Learning Engineer will operate deep within the model...  ...policy enforcement at inference time. Who You Are Strong... 

    CTGT

    San Francisco, CA
    1 day ago
  •  ...community-owned frontier models with self-sustaining economics. We’re looking for Senior/Staff engineers with 5+ years of experience in distributed systems and ML large‑scale training. You’ll be implementing a novel substrate for training distributed ML models that work... 
    Remote work
    Visa sponsorship

    Pluralis Research

    San Francisco, CA
    2 days ago
  • $229.9k - $262.4k

    Senior Lead Software Engineer, Distributed Systems (Golang + Python on Kubernetes) Do you love building and pioneering in the technology space? Do...  ...responsible development and deployment of AI/ML solutions at scale. MLX Tech harnesses the power of Generative AI to assist... 
    Full time
    Part time
    Internship
    Local area

    Capital One National Association

    San Francisco, CA
    2 days ago
  • $229.9k - $262.4k

    Senior Lead Software Engineer, Distributed Systems (Golang + Python on Kubernetes) Do you love building and pioneering in the technology space? Do...  ...responsible development and deployment of AI/ML solutions at scale. MLX Tech harnesses the power of Generative AI to assist... 
    Full time
    Part time
    Internship
    Local area

    Information Technology Senior Management Forum

    San Francisco, CA
    10 hours ago
  •  ...deployment of AI/ML solutions at scale. **MLX Tech** harnesses the power...  ...developers with deep experience in distributed microservices, and full stack systems to create solutions that help...  ..., mentoring other members of the engineering community, and from time to time,... 
    Full time
    Part time
    Internship

    Capital One

    San Francisco, CA
    3 days ago
  •  ...AI/ML Engineer (RL & Physical Systems) FLUIX is building the AI Operating...  ...systems to power distribution, where milliseconds...  ..., and real megawatt-scale infrastructure. Who...  ...integration of LLM-based tools and workflows...  ...distillation, inference orchestration, etc.)... 
    Weekend work

    Fluix AI

    San Francisco, CA
    5 days ago
  • A leading AI research organization in San Francisco seeks an Infrastructure Engineer to design and maintain large distributed ML training and inference clusters. The ideal candidate will have a strong grasp of optimizing training workloads and experience with distributed... 

    Causal Labs

    San Francisco, CA
    2 days ago
  • Apple Inc. is seeking a backend engineer in San Francisco, CA to design and scale services for Apple Podcasts. The role focuses on building APIs, managing databases, and ensuring the reliability of distributed systems. Candidates should have over 5 years of experience,... 

    Apple Inc.

    San Francisco, CA
    1 day ago
  • $189.6k - $237k

     ...Scale's ML platform (RLXF) team builds our internal distributed framework for large language model training and inference. The platform has been powering...  ...evaluation of LLM's, as well as evaluation...  ...our ML system Ideally you'...  ...software engineering skills, proficient... 
    Full time

    Scale AI

    San Francisco, CA
    5 days ago
  • $230k - $385k

     ...seamlessly blend high-level AI capabilities with the constraints of physical systems to improve peoples' lives. About the Role As a Software Engineer, Distributed Data Systems, you will design and scale the infrastructure that powers large-scale multimodal training and... 
    Work at office
    Relocation package

    OpenAI

    San Francisco, CA
    5 days ago
  •  ...San Francisco is seeking an experienced engineer to support AI applications focused on safety...  ..., building modular agents, and scaling LLM infrastructure. Ideal candidates have significant...  ...experience, a thoughtful approach to system design, and familiarity with AI safety... 

    TRM Labs

    San Francisco, CA
    3 days ago
  •  ...Distributed Systems Engineer As a distributed systems engineer, you'll work across the stack to solve problems as they come up and help build...  ...distributed systems: you get how consensus works, you know how to scale systems, and you know what pitfalls in API design to avoid... 
    Flexible hours

    Archil

    San Francisco, CA
    3 days ago
  • Gravity Engineering Services Pvt Ltd. is looking for an Inference Frameworks and Optimization Engineer to enhance the performance of AI infrastructure. This role involves designing distributed inference engines that support multimodal models, optimizing frameworks for... 

    Gravity Engineering Services Pvt Ltd.

    San Francisco, CA
    10 hours ago
  • $180k - $300k

     ...value is discovered, priced, and distributed in real time. The mission...  ...transparency, and efficiency to systems where value is currently...  ...services while establishing engineering best practices and code quality...  ...constraints regarding latency, scale, and fault tolerance.... 
    Full time
    Remote work
    Flexible hours

    MLabs Ltd

    San Francisco, CA
    1 day ago
  • A technology firm specializing in distributed computing is seeking engineers to contribute to the Ray backend. Candidates should have experience in building scalable, fault-tolerant systems and a solid understanding of algorithms and data structures. Responsibilities include... 

    Anyscale

    San Francisco, CA
    3 days ago
  •  ...As a Research Engineer, Distributed Data Systems, you will design and scale the infrastructure that powers large-scale multimodal training and evaluation at OpenAI. You’ll manage distributed data pipelines, collaborate closely with researchers to translate requirements... 

    OpenAI

    San Francisco, CA
    3 days ago
  • $192k - $260k

     ...experience. Optional: MS or PhD in databases, distributed systems. Comfortable working towards a multi-...  ...management system that combines the scale and cost-efficiency of data lakes, the...  ...the complexity of real-world data engineering architecture. Delta Pipelines : It's... 
    Worldwide

    Cacheflow

    San Francisco, CA
    3 days ago
