Model Serving Engineer
$100k - $150k
Bright Vision Technologies
Job Title: Model Serving Engineer
Location: 100% Remote (Continental United States)
Position Type: In-house Bright Vision Technologies SOW engagement (no third-party client or vendor)
Salary: $100K - $150K / Annum
Experience: 6+ years
Sponsorship: No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.
Employment Type: Full-time, direct W2 with Bright Vision Technologies (no C2C, no 1099, no third-party)
Engagement: Long-term, multi-year, aligned to the Bright Vision SOW delivery roadmap
Compensation: Competitive base salary commensurate with experience, plus benefits.
Employment Terms & Visa Policy
This is a 100% remote, full-time, direct W2 position with Bright Vision Technologies.
This role is part of Bright Vision Technologies’ in-house Statement of Work (SOW) engagement. Bright Vision Technologies is the client, end customer, and employer for this position; there is no third-party client, vendor, or implementation partner involved.
We do not engage in C2C, 1099, or third-party arrangements for this role. All of our roles are W2 only, and we do not accept third-party brokering.
Candidates must be willing to work directly as a full-time W2 employee of Bright Vision Technologies and contribute to our in-house SOW deliverables.
No new H1B sponsorship is available for this role.
However, candidates who are currently on a valid H1B visa and require a transfer are welcome to apply. We will support H1B transfers for qualified candidates.
For every role, a technical coding assessment is mandatory. Please apply only if you are confident in your technical abilities and hands-on experience.
Job Summary
We are seeking a Model Serving Engineer to design, build, and operate high-performance, highly reliable inference platforms for serving large machine learning models in production. The role focuses on the systems engineering side of AI deployment, including request routing, batching, caching, autoscaling, GPU utilization, and end-to-end observability across diverse model workloads. The ideal candidate brings strong distributed systems and performance engineering expertise, has shipped serving systems at scale, and understands the trade-offs between latency, throughput, cost, and quality in ML serving.
Key Responsibilities
- Design and operate model serving platforms supporting diverse workloads including LLMs, vision models, and recommendation systems.
- Optimize inference performance using continuous batching, paged attention, speculative decoding, and request multiplexing.
- Implement multi-tenant routing, rate limiting, and quality-of-service policies across model endpoints.
- Build autoscaling and capacity management systems that balance latency, throughput, and cost.
- Tune GPU utilization, memory management, and KV cache strategies for LLM serving workloads.
- Integrate model serving with API gateways, identity systems, and observability platforms.
- Implement caching, prompt deduplication, and response reuse strategies where appropriate.
- Drive end-to-end observability including latency histograms, queue dynamics, GPU utilization, and error tracking.
- Develop deployment workflows including canary releases, shadow testing, and automated rollback.
- Operate incident response for high-availability AI services and drive durable reliability improvements.
- Collaborate with ML and product teams to support new model releases and capability rollouts.
- Implement security controls including request signing, content filtering, and abuse detection at the serving layer.
- Document operational procedures, performance characteristics, and tuning guidance for internal teams.
- Stay current with AI serving research and translate advances into production capabilities.
Required Qualifications
- Bachelor’s or Master’s degree in Computer Science or a related field.
- Six or more years of experience in distributed systems, infrastructure, or ML platform engineering.
- Strong proficiency in Python and a systems language such as Go, Rust, or C++.
- Deep experience operating high-throughput, low-latency services in production.
- Hands-on experience with LLM or large model inference frameworks such as vLLM or TensorRT-LLM.
- Strong understanding of GPU architecture, memory hierarchies, and accelerator utilization.
- Familiarity with Kubernetes, autoscaling, and modern cloud platforms.
- Experience with observability stacks including metrics, tracing, and structured logging.
- Solid grounding in performance engineering and capacity planning.
- Strong communication and incident response skills.
Preferred Qualifications
- Open-source contributions to model serving infrastructure.
- Experience with multi-region or globally distributed AI serving.
- Familiarity with model quantization, distillation, and compression techniques.
- Exposure to FinOps for AI workloads and cost-efficient serving design.
- Experience supporting external-facing AI APIs at scale.
Would you like to know more about this opportunity?
For immediate consideration, please send your resume or contact us using the email address and phone number provided on click.appcast.io.
We recognize that our people are our strength, and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company.
We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as mental health or physical disability needs.
Bright Vision Technologies is an Equal Opportunity Employer, including Disability/Veterans.
Position offered by “No Fee Agency.”
Equal Employment Opportunity (EEO) Statement
Bright Vision Technologies (BV Teck) is committed to equal employment opportunity (EEO) for all employees and applicants without regard to race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, veteran status, or any other protected status as defined by applicable federal, state, or local laws. This commitment extends to all aspects of employment, including recruitment, hiring, training, compensation, promotion, transfer, leaves of absence, termination, layoffs, and recall.
BV Teck expressly prohibits any form of workplace harassment or discrimination. Any improper interference with employees' ability to perform their job duties may result in disciplinary action up to and including termination of employment.


