Synthetic Data Engineer (AI Data/Training)

Hyphen Connect Limited

We are seeking a talented and innovative Synthetic Data Engineer. In this role, you will design and implement domain-specific synthetic data generation pipelines and maintain the quality of the data that feeds training loops. Your expertise will directly support data processing and model training across the organization.

Responsibilities:
  • Design domain-specific synthetic data generation (SDG) pipelines via self-instruct and constitutional prompting.
  • Implement automated quality scoring and de-duplication systems.
  • Manage data pipelines that feed directly into SFT and DPO training loops.
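
For illustration only, here is a minimal sketch of what an automated quality scoring and de-duplication step might look like. It is not the company's actual system: the sample schema (`prompt`, `response`), the function names, the scoring heuristic, and the `min_score` threshold are all assumptions made for this example. De-duplication here is exact matching after normalization; a production pipeline would more likely use near-duplicate detection and a learned or model-based scorer.

```python
import hashlib
import re


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", text.strip().lower())


def quality_score(sample: dict) -> float:
    """Toy heuristic in [0, 1] (an assumption, not a standard metric):
    rewards longer responses and penalizes responses that echo the prompt."""
    prompt = normalize(sample["prompt"])
    response = normalize(sample["response"])
    if not response:
        return 0.0
    # Word count scaled so that 20 or more words earns the full score.
    length_score = min(len(response.split()) / 20.0, 1.0)
    echo_penalty = 0.5 if response == prompt else 0.0
    return max(length_score - echo_penalty, 0.0)


def filter_samples(samples, min_score=0.25):
    """Drop exact duplicates (after normalization), then low-scoring samples."""
    seen = set()
    kept = []
    for sample in samples:
        # Hash the normalized prompt/response pair to get a compact dedup key.
        pair = normalize(sample["prompt"]) + "\x00" + normalize(sample["response"])
        key = hashlib.sha256(pair.encode("utf-8")).hexdigest()
        if key in seen:
            continue
        seen.add(key)
        if quality_score(sample) >= min_score:
            kept.append(sample)
    return kept
```

The surviving samples could then be written out in whatever format the downstream SFT or DPO training loop expects.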

Qualifications:
  • Proven experience building large-scale data pipelines (Airflow, Spark, Ray).
  • Deep knowledge of prompt engineering for data generation.
  • Familiarity with dataset distillation and bias mitigation.

Vacancy posted 4 days ago