Description
- Design, build, and maintain reliable and scalable enterprise-level distributed transactional data processing systems for scaling the existing business and supporting new business initiatives
- Optimize jobs to utilize Kafka, Hadoop, Presto, Spark, and Kubernetes resources in the most efficient way
- Monitor and provide transparency into data quality across systems (accuracy, consistency, completeness, etc.)
- Increase accessibility and effectiveness of data (work with analysts, data scientists, and developers to build/deploy tools and datasets that fit their use cases)
- Collaborate within a small team with diverse technology backgrounds
- Provide mentorship and guidance to junior team members
- Ingest, validate and process internal & third-party data
- Create, maintain and monitor data flows in Python, Spark, Hive, SQL and Presto for consistency, accuracy and lag time
- Maintain and enhance the framework for jobs (primarily aggregate jobs in Spark and Hive)
- Create different consumers for data in Kafka using Spark Streaming for near-real-time aggregation
- Tools evaluation
- Backups/Retention/High Availability/Capacity Planning
- Review/Approval - database DDL, Hive framework jobs, and Spark Streaming jobs, to make sure they meet our standards
- Python - primary repo language
- Airflow/Luigi - for job scheduling
- Docker - Packaged container image with all dependencies
- Graphite - for monitoring data flows
- Hive - SQL data warehouse layer for data in HDFS
- Kafka - distributed commit log storage
- Kubernetes - Distributed cluster resource manager
- Presto/Trino - fast parallel data warehouse and data federation layer
- Spark Streaming - near-real-time aggregation
- SQL Server - Reliable OLTP RDBMS
- Apache Iceberg
- GCP - BigQuery for performance, Looker for dashboards
Note that in the U.S. we can only hire full-time employees, not contractors (we can hire contractors in most other countries).
Requirements
- 6+ years of data engineering experience
- Fluency in Python and SQL
- Strong recent Spark experience
- Experience working in on-prem environments
- Hadoop and Hive experience
- Experience in Scala/Java is a plus (Polyglot programmer preferred!)
- Proficiency in Linux
- Strong understanding of RDBMS and query optimization
- Passion for engineering and computer science around data
- Availability during East Coast U.S. hours (9am-6pm EST)
- Knowledge of and exposure to distributed production systems, e.g. Hadoop
- Knowledge of and exposure to cloud migration (AWS/GCP/Azure) is a plus
Benefits
- Comprehensive healthcare with medical, dental, and vision options, and 100%-paid life & disability insurance
- 401(k) Match
- Generous paid vacation and sick time
- Paid parental leave & adoption assistance
- Annual tuition assistance
- Better Yourself Wellness program
- Group volunteer opportunities and fun events
- A referral bonus program -- we love hiring referrals here at PulsePoint
