It's fun to work in a company where people truly BELIEVE in what they're doing!
We're committed to bringing passion and customer focus to the business.
Corporate Overview
Proofpoint is a leading cybersecurity company protecting organizations’ greatest assets and biggest risks: vulnerabilities in people. With an integrated suite of cloud-based solutions, Proofpoint helps companies around the world stop targeted threats, safeguard their data, and make their users more resilient against cyber attacks. Leading organizations of all sizes, including more than half of the Fortune 1000, rely on Proofpoint for people-centric security and compliance solutions mitigating their most critical risks across email, the cloud, social media, and the web.
We are singularly devoted to helping our customers protect their greatest assets and biggest security risk: their people. That’s why we’re a leader in next-generation cybersecurity.
About the Team
The AI Forge is an internal machine learning group that consults across Proofpoint's entire product portfolio. We are a group of machine learning scientists and software engineers who love keeping up with the latest ML research, fostering a collaborative and creative workplace, and solving challenging problems. Over the past several years, we have developed data-driven product features, leveraging a range of model architectures from tree-based models to state-of-the-art transformers.
We are launching an initiative in Cork, Ireland to work on impactful projects and products applying state-of-the-art AI in support of Proofpoint’s Human Centric Security focus. We are looking for talented and motivated individuals to join this new team.
About the Role
As a Data Pipeline Engineer at Proofpoint, you will develop and maintain large-scale data ingestion, processing, and training pipelines within our Privacy Attested AI Platform. Your work will be critical in preparing large volumes of data for model training and then facilitating the training of those models.
We welcome applications from candidates at all experience levels (junior to senior).
Responsibilities
Design and implement scalable, high-performance data pipelines for ingesting and processing multi-modal cybersecurity data (emails, URLs, forensic logs).
Develop distributed training architectures, ensuring efficient multi-GPU model training across cloud and on-prem environments.
Ensure data integrity through de-duplication, transformation, and validation techniques.
Work within our privacy-compliant data handling environment, and collaborate with the privacy team while building pipelines.
Optimize data pipelines for high-throughput AI model training workflows.
Collaborate with AI infrastructure engineers, machine learning scientists, and cloud infrastructure teams to align data processing with AI objectives.
Work with distributed computing frameworks (Spark, Ray, Dask, etc.) to scale data processing across multiple cloud environments.
Implement monitoring and observability tools for data lineage tracking and pipeline performance.
Qualifications
Strong experience in Python, Go, or Java for data pipeline and distributed computing development.
Hands-on experience with data pipeline frameworks (Apache Kafka, Spark, Flink, Airflow, or similar).
Understanding of distributed computing architectures for AI training (Ray, Kubernetes, PyTorch Distributed, or similar).
Familiarity with cloud-based data storage solutions (AWS S3, GCP BigQuery, Azure Data Lake).
Understanding of data security, access controls, and encryption techniques.
Preferred Qualifications (Senior-Level Candidates)
Expertise in high-scale distributed data systems.
Experience with privacy-preserving techniques such as federated learning or secure enclaves.
Prior experience working on data pipelines and scalable compute architectures for AI model training.
If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!
