
Data Engineer / Data Scientist, Machine Learning

Athens, Greece
Python, Pandas, NumPy, SQL, NoSQL, Machine Learning, NLP
Data Engineer/Scientist for ML

Location: 24A, Kifissias Avenue, Athens, Greece

Job Description

Position Summary

We are seeking a specialized Data Engineer or Data Scientist to manage the complete lifecycle of the training data that powers our AI models. This role is pivotal in curating, sanitizing, and structuring high-quality speech and text datasets, which serve as the foundation for training state-of-the-art Automatic Speech Recognition (ASR), Text-to-Speech (TTS), and Machine Translation (MT) systems.

Role and Responsibilities

Data Pipeline Architecture
Design, build, and maintain robust pipelines for the ingestion, processing, and management of heterogeneous data sources, ensuring efficient flow from raw collection to model-ready inputs.

Unstructured Data Extraction
Extract and process high-fidelity speech data from complex, unstructured sources, including video feeds, multi-channel audio recordings, and raw text archives.

Corpus Curation & Management
Organize, structure, and analyze complex linguistic datasets, including speech-to-text alignments and parallel translation corpora, ensuring metadata accuracy and consistency.

Data Cleaning & Noise Reduction
Implement rigorous quality control protocols to identify and correct errors, remove artifacts, and apply noise reduction techniques to enhance audio clarity.

Dataset Enhancement Strategies
Develop and execute strategies to improve data quantity and diversity, including the application of data augmentation techniques and synthetic data generation.

Cross-Functional Collaboration
Partner closely with Machine Learning Engineers to align data preprocessing workflows and formatting with the specific requirements of various model architectures.

Skills and Qualifications

Programming Proficiency
Advanced proficiency in Python and core data manipulation libraries (e.g., Pandas, NumPy) with the ability to write clean, efficient, and scalable code.

Audio & Data Tooling
Hands-on experience with audio processing and analysis tools (e.g., librosa, torchaudio, Praat) and database management systems (SQL/NoSQL).

ML & NLP Fundamentals
Solid understanding of Machine Learning principles and the specific preprocessing and tokenization requirements for Natural Language Processing (NLP) and speech tasks.

Data Quality Expertise
Proven track record in handling large-scale, messy, or unstructured datasets, with a strong focus on data validation, cleaning, and sanitization techniques.

* Please visit Samsung membership to see the Privacy Policy, which defaults according to your location, at: https://account.samsung.com/membership/policy/privacy. You can change the Country/Language at the bottom of the page. If you are a resident of the European Economic Area, please see: https://europe-samsung.com/ghrp/PrivacyNoticeforEU.html
