C the Signs

AI Data Engineer

Remote Boston, MA
Python Scala Java AWS GCP Azure Spark Apache Spark FHIR HL7 HIPAA Airflow LLM Machine Learning
Description

AI Data Engineer

Location: Boston, Massachusetts, United States, New York, New York, United States, New York, United States, New Jersey, United States, New Hampshire, United States, Rhode Island, United States

Workplace: remote

Description

Position Summary

The Data Engineer will play a crucial role in developing and fine-tuning data specifically for our LLMs and machine learning models. This individual will be responsible for the entire data lifecycle, including gathering, cleaning, structuring, and optimizing large, diverse healthcare datasets. The ideal candidate will have a strong background in data engineering principles, experience with big data technologies, and a keen understanding of the unique challenges and requirements of healthcare data.

You will design, build, and maintain scalable data pipelines that source, preprocess, and deliver high-quality, high-volume datasets to our machine learning engineers. This role requires a deep understanding of data engineering best practices coupled with specific knowledge of the data requirements for LLM training and refinement

Key Responsibilities

  • Collaborate with data scientists and machine learning engineers to understand data requirements for LLM and machine learning model fine-tuning.
  • Design, build, and maintain scalable data pipelines to ingest, process, and store massive and diverse healthcare datasets.
  • Implement robust data validation and monitoring to ensure the integrity, accuracy, and consistency of all training datasets.
  • Implement robust data cleaning, validation, and transformation processes to ensure data quality and integrity.
  • Develop and optimize data structures and schemas for efficient access and utilization by LLMs and machine learning models.
  • Work with the team to identify and acquire new data sources, ensuring compliance with relevant healthcare regulations (e.g., HIPAA).
  • Monitor data pipeline performance, troubleshoot issues, and implement optimizations to improve efficiency and reliability.
  • Document data engineering processes, data models, and data dictionaries.
  • Stay up-to-date with the latest advancements in data engineering, big data technologies, and machine learning.

Requirements

Required

  • Bachelor's degree in Computer Science, Engineering, or a related field.
  • Proven experience as a Data Engineer, with a focus on big data technologies.
  • Strong proficiency in programming languages such as Python, Scala, or Java.
  • Extensive experience with data warehousing, ETL processes, and data modeling.
  • Experience with major cloud providers (e.g., AWS, GCP, Azure) and their data storage and processing services.
  • Hands-on experience with big data frameworks like Apache Spark for distributed processing.
  • Excellent problem-solving skills and the ability to work independently and as part of a team.
  • Strong communication and interpersonal skills.

Preferred

  • Master's degree in a related field.
  • Experience with healthcare data and a good understanding of healthcare data standards (e.g., FHIR, HL7).
  • Familiarity with machine learning concepts and LLM fine-tuning processes.
  • Experience with data orchestration tools (e.g., Apache Airflow).

Work Authorization:

    • Must be a US Citizen, Green Card holder, or currently in the US have valid H1B visa

Benefits

Why Join Us?

Joining C the Signs is not just about building AI; it’s about shaping the future of healthcare. If you are a technical leader with an unshakable belief in the power of AI to save lives and the ability to make it happen at scale, this is your opportunity to create a tangible, global impact.

Benefits:

  • Competitive salary and benefits package.
  • Flexible working arrangements (remote or hybrid options available).
  • The opportunity to work on life-changing AI technology that directly impacts patient outcomes.
  • Join a team that combines cutting-edge innovation with a mission to save lives and improve health equity.
  • Continuous learning opportunities with access to the latest tools and advancements in AI and healthcare.
C the Signs
C the Signs

0 applies

0 views

There are more than 50,000 engineering jobs:

Subscribe to membership and unlock all jobs

Engineering Jobs

60,000+ jobs from 4,500+ well-funded companies

Updated Daily

New jobs are added every day as companies post them

Refined Search

Use filters like skill, location, etc to narrow results

Become a member

🥳🥳🥳 452 happy customers and counting...

Overall, over 80% of customers chose to renew their subscriptions after the initial sign-up.

To try it out

For active job seekers

For those who are passive looking

Cancel anytime

Frequently Asked Questions

  • We prioritize job seekers as our customers, unlike bigger job sites, by charging a small fee to provide them with curated access to the best companies and up-to-date jobs. This focus allows us to deliver a more personalized and effective job search experience.
  • We've got over 200,000 jobs from 15,000+ vetted companies. No fake or sleazy jobs here!
  • We aggregate jobs from 15,000+ companies' career pages, so you can be sure that you're getting the most up-to-date and relevant jobs.
  • We're the only job board *for* software engineers, *by* software engineers… in case you needed a reminder! We add thousands of new jobs daily and offer powerful search filters just for you. 🛠️
  • Every single hour! We add 2,000-3,000 new jobs daily, so you'll always have fresh opportunities. 🚀
  • Typically, job searches take 3-6 months. EchoJobs helps you spend more time applying and less time hunting. 🎯
  • Check daily! We're always updating with new jobs. Set up job alerts for even quicker access. 📅

What Fellow Engineers Say