Model Serving Engineer
$100k - $150k
Bright Vision Technologies
Bright Vision Technologies is a forward-thinking software development company dedicated to building innovative solutions that help businesses automate and optimize their operations. We leverage cutting-edge technologies to create scalable, secure, and user-friendly applications. As we continue to grow, we’re looking for a skilled Model Serving Engineer to join our dynamic team and contribute to our mission of transforming business processes through technology. This is a fantastic opportunity to join an established and well-respected organization offering tremendous career growth potential.

Job Title: Model Serving Engineer
Location: 100% Remote (Continental United States)
Position Type: In-house Bright Vision Technologies SOW engagement (no third-party client or vendor)
Salary: $100K - $150K
Experience: 6+ years
Sponsorship: No new H1B sponsorship available. H1B transfers welcomed for qualified candidates.
Employment Type: Full-time, direct W2 with Bright Vision Technologies (no C2C, no 1099, no third-party)
Engagement: Long-term, multi-year, aligned to the Bright Vision SOW delivery roadmap
Compensation: Competitive base salary commensurate with experience, plus benefits.

Employment Terms & Visa Policy

This is a 100% remote, full-time, direct W2 position with Bright Vision Technologies. This role is part of Bright Vision Technologies’ in-house Statement of Work (SOW) engagement. The client, end customer, and employer for this position is Bright Vision Technologies; there is no third-party client, vendor, or implementation partner involved. We do not engage in C2C, 1099, or third-party arrangements for this role: all our roles are W2, with no third-party brokering. Candidates must be willing to work directly as a full-time W2 employee of Bright Vision Technologies and contribute to our in-house SOW deliverables.

No new H1B sponsorship is available for this role.
However, candidates who are currently on a valid H1B visa and require a transfer are welcome to apply. We will support H1B transfers for qualified candidates.

For every role, a technical coding assessment is mandatory. Please apply only if you are confident in your technical abilities and hands-on experience.

Job Summary

We are seeking a Model Serving Engineer to design, build, and operate high-performance, highly reliable inference platforms for serving large machine learning models in production. The role focuses on the systems engineering side of AI deployment, including request routing, batching, caching, autoscaling, GPU utilization, and end-to-end observability across diverse model workloads. The ideal candidate brings strong distributed systems and performance engineering expertise, has shipped serving systems at scale, and understands the trade-offs between latency, throughput, cost, and quality in ML serving.

Key Responsibilities

- Design and operate model serving platforms supporting diverse workloads including LLMs, vision models, and recommendation systems.
- Optimize inference performance using continuous batching, paged attention, speculative decoding, and request multiplexing.
- Implement multi-tenant routing, rate limiting, and quality-of-service policies across model endpoints.
- Build autoscaling and capacity management systems that balance latency, throughput, and cost.
- Tune GPU utilization, memory management, and KV cache strategies for LLM serving workloads.
- Integrate model serving with API gateways, identity systems, and observability platforms.
- Implement caching, prompt deduplication, and response reuse strategies where appropriate.
- Drive end-to-end observability including latency histograms, queue dynamics, GPU utilization, and error tracking.
- Develop deployment workflows including canary releases, shadow testing, and automated rollback.
- Operate incident response for high-availability AI services and drive durable reliability improvements.
- Collaborate with ML and product teams to support new model releases and capability rollouts.
- Implement security controls including request signing, content filtering, and abuse detection at the serving layer.
- Document operational procedures, performance characteristics, and tuning guidance for internal teams.
- Stay current with AI serving research and translate advances into production capabilities.

Required Qualifications

- Bachelor’s or Master’s degree in Computer Science or a related field.
- Six or more years of experience in distributed systems, infrastructure, or ML platform engineering.
- Strong proficiency in Python and a systems language such as Go, Rust, or C++.
- Deep experience operating high-throughput, low-latency services in production.
- Hands-on experience with LLM or large model inference frameworks such as vLLM or TensorRT-LLM.
- Strong understanding of GPU architecture, memory hierarchies, and accelerator utilization.
- Familiarity with Kubernetes, autoscaling, and modern cloud platforms.
- Experience with observability stacks including metrics, tracing, and structured logging.
- Solid grounding in performance engineering and capacity planning.
- Strong communication and incident response skills.

Preferred Qualifications

- Open-source contributions to model serving infrastructure.
- Experience with multi-region or globally distributed AI serving.
- Familiarity with model quantization, distillation, and compression techniques.
- Exposure to FinOps for AI workloads and cost-efficient serving design.
- Experience supporting external-facing AI APIs at scale.

How to Apply

Would you like to know more about this opportunity? For immediate consideration, please send your resume to the email address available via click.appcast.io.

We recognize that our people are our strength, and the diverse talents they bring to our global workforce are directly linked to our success.
We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants’ and employees’ religious practices and beliefs, as well as mental health or physical disability needs.

Bright Vision Technologies is an Equal Opportunity Employer, including Disability/Veterans.

Position offered by “No Fee Agency.”

Equal Employment Opportunity (EEO) Statement

Bright Vision Technologies (BV Teck) is committed to equal employment opportunity (EEO) for all employees and applicants without regard to race, color, religion, sex, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, veteran status, or any other protected status as defined by applicable federal, state, or local laws. This commitment extends to all aspects of employment, including recruitment, hiring, training, compensation, promotion, transfer, leaves of absence, termination, layoffs, and recall.

BV Teck expressly prohibits any form of workplace harassment or discrimination. Any improper interference with employees' ability to perform their job duties may result in disciplinary action up to and including termination of employment.



