Multimodal LLM Researcher (MLLM)
Pika
About the Role

At Pika, we are pioneering next-generation creative infrastructure built around real-time, multimodal generation and intelligent, agentic platforms. We are seeking accomplished Multimodal LLM Researchers (LLM, VLM, and Audio LM) to advance our mission of making agentic, real-time generative technology accessible, dynamic, and transformative for millions of creators.

As a core member of our research team, you will design and build foundational technologies, develop novel approaches for large multimodal language models (LLMs/VLMs/Audio LMs), and orchestrate intelligent agentic systems that power scalable, interactive multimedia experiences. You will collaborate closely with engineering and product teams, shaping the future of real-time creative platforms.

What You’ll Do

- Lead and contribute to research on real-time, multimodal generation (text, image, video, and audio synthesis) and on the orchestration of agentic platform infrastructure
- Design and prototype novel algorithms and architectures for high-fidelity, real-time multimodal synthesis and interactive experiences
- Focus on real-time aspects of model inference and synthesis across modalities
- Work on diffusion model distillation and/or develop diffusion-based world models for multimodal applications
- Train and finetune autoregressive and diffusion models in LLM, VLM, or Audio LM contexts with a focus on real-time performance
- Curate specific datasets, especially for video, audio, cross-modal, and sensory-rich data
- Collaborate with cross-functional teams to bring research advancements into production-ready technologies
- Publish work in top-tier conferences and journals; communicate research results internally and externally
- Stay at the cutting edge of real-time multimodal generative AI and agentic orchestration

What We’re Looking For

- 5+ years of relevant experience, including research during graduate studies, in large language models, vision-language models, audio language models, deep learning, or related fields
- Demonstrated impact as first author on major publications in top conferences or journals (e.g., NeurIPS, CVPR, ICML, ICCV, SIGGRAPH, Interspeech)
- Deep expertise in at least one area: language modeling (LLM), vision-language modeling (VLM), or audio language modeling (Audio LM)
- Strong experience with generative models, including autoregressive and diffusion models, and their real-time deployment
- Hands-on experience curating, constructing, or augmenting large, high-quality multimodal datasets
- Experience developing and deploying real-time systems and/or agentic orchestration infrastructure
- Strong programming and prototyping skills (Python, PyTorch, TensorFlow, etc.)
- Passion for building creative tools and platforms that empower users
- Excellent communication and collaboration skills

What We Offer

- Competitive salary and substantial equity in a high-growth startup
- Full health benefits + 401k matching and more
- Collaborative, mission-driven team environment with major growth opportunities
- Flexible on-site/remote hybrid (HQ in Palo Alto, CA)

About Pika

Pika empowers creators by building state-of-the-art agentic and multimedia platforms. Our vision is to break down technical barriers to creativity, making real-time generative and intelligent orchestration accessible to all. Join us and shape the next evolution of creative technology!

If you are a leading researcher excited by real-time multimodal AI and agentic platforms, we want to hear from you.


