Inference Optimization Engineer (local / edge runtime)
Intel
$170.5k - $315.49k
Overview
At Intel, our mission is to transform AI into something safer, more trustworthy, and respectful of human privacy by design. We build agentic AI that combines the best of local and cloud intelligence: private, affordable, and sustainable. Small, efficient models run directly on the user's machine, keeping data private and token costs low, while powerful cloud models handle the hardest work: planning, reasoning, and complex problem-solving. Together they give people real capability without compromise: data stays private, spend stays predictable, and energy use stays in check.

Role Summary
Make models fast on the hardware people actually own. You optimize inference engines (llama.cpp, vLLM) for constrained local and edge environments (GPUs and iGPUs, Vulkan backends), mostly PCs and edge devices rather than datacenter H100 environments. KV cache, batching, quantization, scheduling, and CPU-overhead reduction are your daily tools. This is the rare skill that makes a hybrid, low-cost agent product viable.

Responsibilities
- Profile and optimize local inference (llama.cpp-vulkan and vLLM) for latency, throughput, and memory on edge hardware
- Tune KV cache, continuous batching, and scheduling for interactive agent workloads
- Drive quantization strategy (GGUF / AWQ / GPTQ) and validate quality impact with the Post-Training team
- Cut CPU overhead and improve engine startup, model load, and lifecycle (start/stop/health)
- Benchmark across hardware tiers and publish honest performance comparisons
- Upstream fixes and patches to open-source engines where it helps us

What you'll learn / grow into
- Understanding the internals of modern inference engines and where the milliseconds actually go
- Hardware-aware optimization across iGPU / CPU paths (Vulkan, SYCL, oneAPI, CUDA where relevant)
- The quality-vs-speed-vs-memory trade-space for small models
- Interest in local / edge AI and squeezing hardware

Qualifications
- BS/MS in CS, EE, Math, or a related STEM field
- 5+ years of software development experience
- Strong in C++ and/or Python; comfortable reading systems-level code
- Understands how LLM inference works (attention, KV cache, decoding)
- Has profiled and optimized real performance problems (CPU or GPU) and can prove the speedup
- Linux, build systems, and low-level debugging expertise

Preferred Qualifications
- Hands-on with llama.cpp, vLLM, ggml, or similar engines
- Experience with GPU / accelerator programming (Vulkan, CUDA, SYCL, Metal) or SIMD / CPU kernels
- Familiarity with quantization formats and their quality trade-offs
- Open-source contributions to inference engines

Benefits
Our total rewards package includes competitive pay, stock bonuses, health benefits, retirement plans, and vacation. Find out more about the benefits of working at Intel in the dedicated benefits section.

EEO Statement
All qualified applicants will receive consideration for employment without regard to race, color, religion, religious creed, sex, national origin, ancestry, age, physical or mental disability, medical condition, genetic information, military and veteran status, marital status, pregnancy, gender, gender expression, gender identity, sexual orientation, or any other characteristic protected by local law, regulation, or ordinance.

Annual Salary Range
$170,500.00 – $315,490.00 USD (US locations)

Work Model
This role will be eligible for a hybrid work model, which allows employees to split their time between working on-site at their assigned Intel site and off-site.


