
Inference Optimization Engineer United States - Remote · Remote

$198k - $286k

Modular Mailing Systems, Inc.

Los Altos, CA
  • Remote job

About Modular

At Modular, we’re on a mission to revolutionize AI infrastructure by systematically rebuilding the AI software stack from the ground up. Our team, made up of industry leaders and experts, is building cutting‑edge, modular infrastructure that simplifies AI development and deployment. By rethinking the complexities of AI systems, we’re empowering everyone to unlock AI’s full potential and tackle some of the world’s most pressing challenges. If you’re passionate about shaping the future of AI and creating tools that make a real difference in people’s lives, we want you on our team. You can read about our culture and careers to understand how we work and what we value.

About the role

At Modular, we optimize inference from kernel to cloud on one unified stack. We are building a differentiated cloud platform that delivers state‑of‑the‑art inference performance from day one, then keeps getting better. As we learn the shape and patterns of each customer’s workload, the platform adapts and improves performance automatically over time.

The Performance Labs team builds the infrastructure that makes this possible at scale. We continuously apply the latest optimizations across kernels, the inference engine, and distributed systems so that customer workloads stay on the Pareto frontier of cost and performance. We get there through deep workload insights, a scalable platform, and close collaboration with engineering and product teams.

In this role you will dig into real customer inference workloads, profile them end to end, and apply the optimizations across kernels, engine, and distributed systems that push each workload toward the Pareto frontier. You will build the tooling and platform that turns one‑off performance wins into a repeatable, automated optimization loop, and you will work directly with engineering, product, and GTM to bring those gains to customers in production.

LOCATION

Candidates based in the US or Canada are welcome to apply. You can work in our office in Los Altos, CA or remotely from home. Onboarding for new hires is conducted in‑person in our Los Altos, CA office.

What you will do

  • Build the optimization platform that drives inference performance of LLMs served on Modular Cloud to state‑of‑the‑art levels across the latest GPU and ASIC architectures.
  • Shape the technical direction of Modular Cloud, delivering LLM performance on the Pareto frontier for agentic use cases and keeping it there as the landscape evolves.
  • Partner closely with the GTM team to deliver highly customized LLM inference tuned to specific customer use cases, and collaborate across engineering to drive optimizations spanning the full stack, from GPU kernels to cloud infrastructure.
  • Translate insights from customer engagements into technical direction for engineering teams.
  • Publish blog posts on innovative approaches to LLM inference optimization that shape industry‑wide best practices.

What you bring to the table

  • 5+ years of experience in distributed systems or performance engineering.
  • A track record of building durable, reusable software tools and libraries that are adopted across teams and functions.
  • Sound judgment in evaluating technical tradeoffs and setting priorities, paired with strong communication and technical leadership skills.
  • Creativity and curiosity in solving complex problems, a collaborative and team‑oriented mindset, and alignment with our culture.

Helpful, but not required

  • Experience with GPU kernel programming, inference engine internals, or distributed inference architectures.
  • Experience with Kubernetes and cloud‑native ecosystems.
  • Familiarity with modern LLM architectures and the latest inference optimization techniques.

What Modular brings to the table

  • Amazing Team. We are a progressive and agile team with some of the industry’s best engineering and product leaders.
  • World‑class Benefits. In order to attract the best, we need to offer the best. Premier insurance plans, up to 5% 401k matching, flexible paid time off, and more are available to you! Please note that specific benefit packages may vary based on your location.
  • Competitive Compensation. We offer very strong compensation packages, including stock options. We want people to be focused on their best work and believe in tailoring compensation plans to meet the needs of our workforce.
  • Team Building Events. We organize regular team onsite and local meetups in Los Altos, CA as well as different cities. Traveling 2‑4 times a year is expected for all roles.

Working at Modular will enable you to grow quickly as you work alongside incredibly motivated and talented people who have high standards, possess a growth mindset, and have a purpose to truly change the world.

The estimated base salary range for this role to be performed in the US, regardless of the state, is $198,000.00 - $286,000.00 USD. The estimated base salary range for this role to be performed in Canada, regardless of the province, is $194,000.00 - $280,000.00 CAD. The salary for the successful applicant will depend on a variety of permissible, non‑discriminatory job‑related factors, which include but are not limited to education, training, work experience, business needs, or market demands. This range may be modified in the future. The total compensation for a candidate will also include annual target bonus, equity, and benefits, with equity making up a significant portion of your total compensation.

For candidates who fall outside of the listed requirements, we nevertheless encourage you to apply as we may have openings that are lower/higher level than the ones advertised.

Equal Employment Opportunity

Modular is proud to emphasize an equal opportunity, safe environment for people to do their best work. Modular is an affirmative action employer. We are committed to providing equal employment opportunities regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or Veteran status. We also consider qualified applicants regardless of criminal histories, consistent with legal requirements.

Accommodations

If you require reasonable accommodations to participate in the interview process, please let your recruiter know, and we will work with you to meet your needs in compliance with the ADA.

E‑Verify

This employer participates in E‑Verify and will provide the federal government with your Form I‑9 information to confirm that you are authorized to work in the United States. If E‑Verify cannot confirm that you are authorized to work, this employer is required to give you written instructions and an opportunity to contact the Department of Homeland Security (DHS) or Social Security Administration (SSA) so you can begin to resolve the issue before the employer can take any action against you, including terminating your employment. Employers can only use E‑Verify once you have accepted a job offer and completed the Form I‑9.

Vacancy posted 3 days ago
