Senior AI Inference Engineer llama.cpp specialist 100% Remote

Framework Ventures

About the job

You'll work on the C++ layer that powers local AI, porting and enhancing inference engines such as llama.cpp and ONNX to run efficiently on edge devices. Your focus is on the runtime: making models load faster, run leaner, and perform well across different hardware. You'll ensure that the inference layer is stable, optimized, and ready for integration with the rest of the stack. This role is for engineers who want to work close to the metal, enabling private and fast on-device AI without relying on cloud infrastructure.

Responsibilities

Deploy machine learning models to edge devices using llama.cpp, ggml, and ONNX.
Collaborate closely with researchers to assist in coding, training, and transitioning models from research to production environments.
Integrate AI features into existing products, enriching them with the latest advancements in machine learning.

Qualifications

Excellent programming skills in C++; experience in JavaScript is a bonus.
Strong experience with the llama.cpp and ggml inference engines, which facilitate the deployment of models to specific GPU architectures.
Good understanding of deep learning concepts and model architectures.
Experience with transformers and LLMs.
Demonstrated ability to rapidly assimilate new technologies and techniques.
A degree in Computer Science, AI, Machine Learning, or a related field, complemented by a solid track record in AI and R&D.


Vacancy posted 2 days ago

