
Research Scientist- Vision-Language-Action (VLA) Models

$165k - $185k

Bosch Group

Job Description

Company Description

The Bosch Research and Technology Center North America, with offices in Sunnyvale, California; Pittsburgh, Pennsylvania; and Cambridge, Massachusetts, is part of the global Bosch Group, a company with over 70 billion euro in revenue, 400,000 employees worldwide, a very diverse product portfolio, and a history spanning over 125 years. The Research and Technology Center North America (RTC-NA) is dedicated to providing technologies and system solutions for various Bosch business fields, primarily in the fields of artificial intelligence, energy technologies, internet technologies, circuit design, semiconductors and wireless, as well as advanced MEMS design.

As part of Bosch's global research, our AI research in Silicon Valley focuses on Foundation Models, Big Data Visual Analytics, Explainable AI (XAI), Natural Language Processing, Computer Vision & Mixed Reality, Cloud Robotics, Data Science, AI System Engineering, and Time-series Analysis. We develop scalable, intelligent, and trustworthy AIoT solutions for Bosch products and services in application areas such as automated driving, advanced driver assistance systems (ADAS), robotics, smart manufacturing, enterprise AI, health care, and smart home and building solutions.

Originating from the AI research in Silicon Valley, our Intelligent Autonomous Systems group is responsible for enabling future autonomous Bosch products by pushing the boundaries of automated driving, advanced driver assistance systems (ADAS), robotics, and automation through key innovations that encompass system architecture and AI components. These include methods for motion planning, high-level task planning, and decision making, as well as systems that make these technologies work on real products by building frameworks that take advantage of reliable distributed computing. We work with internal partners in different Bosch business units to transfer our solutions into future products. We also actively collaborate with leading groups in academia and industry to promote research ideas and publish research findings in internationally renowned conferences and journals such as CVPR, ICRA, IROS, RSS, NeurIPS, and CoRL.

Job Description

As a Research Scientist- Vision-Language-Action (VLA) Models, you contribute to research projects at the forefront of the ADAS/AD industry. Key responsibilities include:

  • Conduct research and engineering in core AI and machine learning fields to enable Embodied AI (including computer vision, autonomous planning, and open-world learning) for related business domains such as ADAS/AD, industrial automation, and robotics.
  • Push the boundaries in (modular) end-to-end perception and planning for ADAS/AD, incorporating advancements in large vision-language-(action) models to aid reasoning capabilities and explainability.
  • Collaborate cross-functionally with global research and engineering teams to ensure seamless technology transfer and system integration.
  • Implement research results to solve real-world challenges, ensuring high-quality system integration within Bosch's existing platforms.
  • Stay at the forefront of innovation by actively engaging with academic and industry communities through conferences, workshops, and technical events.
  • Document and disseminate research findings through high-caliber publications and/or patent submissions.

Qualifications

Basic Qualifications

  • Ph.D. in Computer Science, Robotics, or a related discipline, or a Master's degree with at least 2 years of industry experience after graduation.
  • A minimum of 3 years of R&D experience, or an equivalent graduate research background, primarily in AI technologies including Computer Vision and Robotic or Automotive Motion and Behavioral Planning.
  • Proficiency in one or more programming languages commonly used in machine learning (e.g., Python, C++, Rust).
  • Strong interpersonal, communication, and teamwork capabilities.
  • Knowledge of major machine learning frameworks like TensorFlow or PyTorch.
  • Hands-on experience in reinforcement learning for behavior or motion planning, or other applicable contexts, and familiarity with common RL techniques (e.g., PPO, DQN, DDPG).
  • A strong portfolio of publications in premier machine learning, deep learning, robotics and computer vision journals and conferences.

Preferred Qualifications

  • Experience with real-world product development and deployment of autonomous systems.
  • Hands-on experience building and applying multimodal transformer-based sequence-to-sequence models, especially multimodal vision-language-action models.
  • Hands-on experience in computer vision and deep learning, with work in any of the following areas: multimodal transformers, multimodal language models, diffusion models, NeRF, Gaussian splatting, object detection / segmentation, 3D scene understanding, sensor calibration, SfM, voxel/BEV grid-based feature representation.

Additional Information

We offer a competitive base salary for this position, with a range in US-California of $165,000 - $185,000, along with an annual corporate bonus and a long-term incentive bonus designed to reward sustained impact and contribution over time. Within the salary range, individual pay is determined based on several factors, including, but not limited to, work experience and job knowledge, complexity of the role, job location, etc.

Your well-being matters at Bosch! We offer a benefits package designed to empower you in every area of your life. This includes premium health coverage, a 401(k) with generous matching, resources for financial planning and goal setting, ample paid time off, parental leave, and comprehensive life and disability protection. Your recruiter can share more details for this position during the interview process.

Learn more about our full benefits offerings by visiting: 

Equal Opportunity Employer, including disability / veterans.

*Bosch adheres to Federal, State, and Local laws regarding drug-testing. Employment is contingent upon the successful completion of a drug screen and background check. Candidates who have been offered the position must pass both screenings before their start date.

Vacancy posted 5 days ago