Data Engineer
Location: Hyderabad, TS, IN, Remote
Company Description
Blend360 is a data and AI services company specializing in data engineering, data science, MLOps, and governance to build scalable analytics solutions. It partners with enterprise and Fortune 1000 clients across industries including financial services, healthcare, retail, technology, and hospitality to drive data-driven decision making. Headquartered in Columbia, Maryland, the company is recognized for rapid growth and global delivery of AI solutions through the integration of people, data, and technology.
We are seeking a hands-on Data Engineer with deep expertise in distributed systems, ETL/ELT development, and enterprise-grade database management. The engineer will design, implement, and optimize ingestion, transformation, and storage workflows to support the MMO platform. The role requires technical fluency across big data frameworks (HDFS, Hive, PySpark), orchestration platforms (NiFi), and relational systems (Postgres), combined with strong coding skills in Python and SQL for automation, custom transformations, and operational reliability.
Job Description
We are implementing a Media Mix Optimization (MMO) platform designed to analyze and optimize marketing investments across multiple channels. This initiative requires a robust on-premises data infrastructure to support distributed computing, large-scale data ingestion, and advanced analytics. The Data Engineer will be responsible for building and maintaining resilient pipelines and data systems that feed into MMO models, ensuring data quality, governance, and availability for Data Science and BI teams. The environment integrates HDFS for distributed storage, Apache NiFi for orchestration, Hive and PySpark for distributed processing, and Postgres for structured data management.
This role is central to enabling seamless integration of massive datasets from disparate sources (media, campaign, transaction, customer interaction, etc.), standardizing data, and providing reliable foundations for advanced econometric modeling and insights.
Responsibilities:
Data Pipeline Development & Orchestration
o Design, build, and optimize scalable data pipelines in Apache NiFi to automate ingestion, cleansing, and enrichment from structured, semi-structured, and unstructured sources.
o Ensure pipelines meet low-latency and high-throughput requirements for distributed processing.
Data Storage & Processing
o Architect and manage datasets on HDFS to support high-volume, fault-tolerant storage.
o Develop distributed processing workflows in PySpark and Hive to handle large-scale transformations, aggregations, and joins across petabyte-level datasets.
o Implement partitioning, bucketing, and indexing strategies to optimize query performance.
Database Engineering & Management
o Maintain and tune Postgres databases for high availability, integrity, and performance.
o Write advanced SQL queries for ETL, analysis, and integration with downstream BI/analytics systems.
Collaboration & Integration
o Partner with Data Scientists to deliver clean, reliable datasets for model training and MMO analysis.
o Work with BI engineers to ensure data pipelines align with reporting and visualization requirements.
Monitoring & Reliability Engineering
o Implement monitoring, logging, and alerting frameworks to track data pipeline health.
o Troubleshoot and resolve issues in ingestion, transformations, and distributed jobs.
Data Governance & Compliance
o Enforce standards for data quality, lineage, and security across systems.
o Ensure compliance with internal governance and external regulations.
Documentation & Knowledge Transfer
o Develop and maintain comprehensive technical documentation for pipelines, data models, and workflows.
o Provide knowledge sharing and onboarding support for cross-functional teams.
Qualifications
Bachelor’s degree in Computer Science, Information Technology, or related field (Master’s preferred).
Proven experience as a Data Engineer with expertise in HDFS, Apache NiFi, Hive, PySpark, Postgres, Python, and SQL.
Strong background in ETL/ELT design, distributed processing, and relational database management.
Experience with on-premises big data ecosystems supporting distributed computing.
Solid debugging, optimization, and performance tuning skills.
Ability to work in agile environments, collaborating with multi-disciplinary teams.
Strong communication skills for cross-functional technical discussions.
Preferred Qualifications:
Familiarity with data governance frameworks, lineage tracking, and data cataloging tools.
Knowledge of security standards, encryption, and access control in on-premises environments.
Prior experience with Media Mix Modeling (MMM/MMO) or marketing analytics projects.
Exposure to workflow schedulers (Airflow, Oozie, or similar).
Proficiency in developing automation scripts and frameworks in Python for CI/CD of data pipelines.