Hello Visionary!
We empower our people to stay resilient and relevant in a constantly changing world. We’re looking for people who are always searching for creative ways to grow and learn. People who want to make a real impact, now and in the future.
We are looking for a highly skilled and experienced Senior Data Engineer to join our dynamic data engineering team.
The ideal candidate will be responsible for building and maintaining scalable, high-performance data pipelines and cloud infrastructure, with a focus on managing vast amounts of data efficiently in real-time and batch processing environments. The role requires expertise in advanced ETL processes, AWS services such as Glue, Lambda, S3, Redshift, and EMR, and hands-on experience with big data technologies like Apache Spark, Kafka, Kinesis, and Apache Airflow.
You will work closely with data scientists, software engineers, and analysts to ensure that data is accessible, clean, and reliable for business-critical operations and advanced analytics.
Key Responsibilities:
Design & Architect Scalable Data Pipelines: Architect, build, and optimize high-throughput ETL pipelines using AWS Glue, Lambda, and EMR to handle large datasets and complex data workflows. Ensure pipelines scale efficiently and handle both real-time and batch processing.
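To make this concrete, below is a minimal sketch of the kind of Glue ETL job this responsibility describes. The catalog database (raw_db), table (events), output bucket, and field names are hypothetical placeholders, not details from this posting.

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

# Standard Glue job boilerplate: resolve the job name and set up contexts.
args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read from the Glue Data Catalog, drop malformed rows, and write
# partitioned Parquet back to S3. All names here are assumptions.
raw = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="events"
)
cleaned = raw.filter(lambda row: row["event_id"] is not None)
glue_context.write_dynamic_frame.from_options(
    frame=cleaned,
    connection_type="s3",
    connection_options={
        "path": "s3://my-bucket/curated/events/",
        "partitionKeys": ["dt"],
    },
    format="parquet",
)
job.commit()
```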
Cloud Data Infrastructure Management: Implement, monitor, and maintain a cloud-native data infrastructure using AWS services like S3 for data storage, Redshift for data warehousing, and EMR for big data processing. Build robust, cost-effective solutions for storing, processing, and querying large datasets efficiently.
Data Transformation & Processing: Develop highly performant data transformation processes using Apache Spark on EMR for distributed data processing and parallel computation. Write optimized Spark jobs in Python (PySpark) for efficient data transformation.
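As an illustration, an optimized PySpark job of this kind might look like the sketch below; the input path, column names, and aggregation logic are assumptions for the example, not requirements from this posting.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily-order-rollup").getOrCreate()

# Read raw order events from a hypothetical S3 location.
orders = spark.read.parquet("s3://my-bucket/raw/orders/")

# Aggregate in parallel across the cluster; repartitioning by the write key
# keeps output files aligned with the partition layout.
daily = (
    orders
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("order_date", "region")
    .agg(
        F.sum("amount").alias("revenue"),
        F.countDistinct("customer_id").alias("customers"),
    )
    .repartition("order_date")
)

daily.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://my-bucket/curated/daily_orders/"
)
```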
Real-time Data Streaming Solutions: Design and implement real-time data ingestion and streaming systems using AWS Kinesis or Apache Kafka to handle event-driven architectures, process continuous data streams, and support real-time analytics.
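A minimal producer for such a stream, sketched with boto3; the stream name, region, and payload shape are illustrative assumptions.

```python
import json
import boto3

kinesis = boto3.client("kinesis", region_name="ap-south-1")

def publish_event(event: dict) -> None:
    # The partition key controls shard routing; keying by user keeps a
    # given user's events ordered within one shard.
    kinesis.put_record(
        StreamName="clickstream-events",  # hypothetical stream
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=str(event["user_id"]),
    )

publish_event({"user_id": 42, "action": "page_view", "ts": "2024-01-01T00:00:00Z"})
```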
Orchestration & Automation: Use Apache Airflow to schedule and orchestrate complex ETL workflows. Automate data pipeline processes, ensuring reliability, data integrity, and ease of monitoring. Implement self-healing workflows to recover from failures automatically.
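A sketch of such an orchestrated workflow in Airflow, where task-level retries provide basic self-healing on transient failures; the DAG id, schedule, and task bodies are placeholders.

```python
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.python import PythonOperator

# Placeholder task bodies; real tasks would call extract/transform/load code.
def extract(): ...
def transform(): ...
def load(): ...

with DAG(
    dag_id="nightly_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 2 * * *",  # run daily at 02:00
    catchup=False,
    default_args={
        # Automatic retries give the workflow basic self-healing against
        # transient failures such as network blips or API throttling.
        "retries": 3,
        "retry_delay": timedelta(minutes=5),
    },
) as dag:
    t_extract = PythonOperator(task_id="extract", python_callable=extract)
    t_transform = PythonOperator(task_id="transform", python_callable=transform)
    t_load = PythonOperator(task_id="load", python_callable=load)
    t_extract >> t_transform >> t_load
```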
Data Warehouse Optimization & Management: Develop and optimize data models, schemas, and queries in Amazon Redshift to ensure low-latency querying and scalable analytics. Apply best practices for data partitioning, indexing, and query optimization to increase performance and minimize costs.
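Distribution and sort keys are the main Redshift levers here. The sketch below shows an assumed table design executed over a standard PostgreSQL driver; the cluster endpoint, schema, and columns are placeholders.

```python
import psycopg2

# DISTKEY co-locates rows joined on user_id on the same slice;
# SORTKEY lets Redshift prune blocks on time-range predicates.
ddl = """
CREATE TABLE IF NOT EXISTS analytics.events (
    event_id    BIGINT,
    user_id     BIGINT,
    event_time  TIMESTAMP,
    payload     SUPER
)
DISTKEY (user_id)
SORTKEY (event_time);
"""

with psycopg2.connect(
    host="my-cluster.example.ap-south-1.redshift.amazonaws.com",  # placeholder
    port=5439,
    dbname="analytics",
    user="etl_user",
    password="...",  # supply via a secret store in practice
) as conn:
    with conn.cursor() as cur:
        cur.execute(ddl)
```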
Containerization & Orchestration: Leverage Docker to containerize data engineering applications for better portability and consistent runtime environments. Use AWS Fargate for running containerized applications in a serverless environment, ensuring easy scaling and reduced operational overhead.
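As a sketch, kicking off a containerized pipeline step on Fargate from Python could look like this; the cluster, task definition, and subnet ID are hypothetical.

```python
import boto3

ecs = boto3.client("ecs", region_name="ap-south-1")

# Run one task from a registered task definition on Fargate; no EC2
# instances to manage, and capacity is provisioned per task.
ecs.run_task(
    cluster="data-pipelines",             # hypothetical cluster
    launchType="FARGATE",
    taskDefinition="spark-submit-job:3",  # hypothetical task definition
    networkConfiguration={
        "awsvpcConfiguration": {
            "subnets": ["subnet-0abc1234"],
            "assignPublicIp": "DISABLED",
        }
    },
)
```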
Monitoring & Debugging: Build automated monitoring and alerting systems to proactively detect and troubleshoot pipeline issues, ensuring data quality and operational efficiency. Use tools like CloudWatch, Prometheus, or other logging frameworks to ensure end-to-end visibility of data pipelines.
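One common shape for this in CloudWatch, sketched below: the pipeline emits a custom freshness metric and an alarm fires when data goes stale. The namespace, thresholds, and SNS topic ARN are assumptions.

```python
import boto3

cw = boto3.client("cloudwatch", region_name="ap-south-1")

# Emit a custom freshness metric after each successful load.
cw.put_metric_data(
    Namespace="DataPipelines",
    MetricData=[{
        "MetricName": "MinutesSinceLastLoad",
        "Value": 12.0,
        "Unit": "Count",
    }],
)

# Alert when data has been stale for over an hour
# (12 evaluation periods of 5 minutes each).
cw.put_metric_alarm(
    AlarmName="nightly-etl-stale-data",
    Namespace="DataPipelines",
    MetricName="MinutesSinceLastLoad",
    Statistic="Maximum",
    Period=300,
    EvaluationPeriods=12,
    Threshold=60.0,
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:ap-south-1:123456789012:data-alerts"],  # placeholder ARN
)
```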
Collaboration with Cross-functional Teams: Work closely with data scientists, analysts, and application developers to design data models and ensure proper data availability. Collaborate in the development of solutions that meet the business’s data needs, from experimentation to production.
Security & Compliance: Implement data governance policies, security protocols, and compliance measures for handling sensitive data, including encryption, auditing, and IAM role-based access control in AWS.
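For example, encryption at rest for sensitive objects can be enforced per write with a customer-managed KMS key, as in this sketch; the bucket name and key alias are placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Server-side encryption with a customer-managed KMS key; reading the
# object then also requires kms:Decrypt on that key via IAM.
s3.put_object(
    Bucket="my-secure-bucket",             # placeholder bucket
    Key="pii/customers/2024-01-01.parquet",
    Body=b"...",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="alias/data-lake-pii",     # placeholder key alias
)
```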
Required Qualifications:
Experience: 5+ years of hands-on experience building, maintaining, and optimizing data pipelines, ideally in a cloud-native environment.
ETL Expertise: Solid understanding of ETL/ELT processes and experience with tools like AWS Glue for building serverless ETL pipelines. Expertise in designing data transformation logic to move and process data efficiently across systems.
AWS Services: Deep experience working with AWS cloud services:
S3: Designing data lakes, ensuring scalability and performance.
AWS Glue: Writing custom jobs for transforming data.
Lambda: Writing event-driven functions to process and transform data on demand (see the handler sketch after this list).
Redshift: Optimizing data warehousing operations for efficient query performance.
EMR (Elastic MapReduce): Running distributed processing frameworks like Apache Spark or Hadoop to process large datasets.
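The event-driven Lambda pattern referenced in the list above might look like this minimal handler, triggered by S3 object-created events; the bucket layout and transformation are illustrative assumptions.

```python
import json
import urllib.parse
import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # S3 delivers object-created notifications as a list of Records.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        rows = [json.loads(line) for line in body.splitlines() if line.strip()]
        # ... transform rows here, then write the result onward ...
        s3.put_object(
            Bucket=bucket,
            Key=key.replace("raw/", "clean/"),  # assumed prefix convention
            Body=json.dumps(rows).encode("utf-8"),
        )
    return {"processed": len(event["Records"])}
```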
Big Data Technologies: Expertise in using Apache Spark for distributed data processing at scale. Experience with real-time data processing using Apache Kafka and AWS Kinesis for building streaming data pipelines.
Data Orchestration: Strong experience with Apache Airflow or similar workflow orchestration tools for scheduling, monitoring, and managing ETL jobs and data workflows.
Programming & Scripting: Proficiency in Python for building custom data pipelines and Spark jobs. Knowledge of coding best practices for high performance, maintainability, and reliability.
SQL & Query Optimization: Advanced knowledge of SQL and experience in query optimization, partitioning, and indexing for working with large datasets in Redshift and other data platforms.
CI/CD & DevOps Tools: Experience with version control systems like Git, and with CI/CD pipelines and infrastructure-as-code tools such as Terraform or AWS CloudFormation to automate deployment and infrastructure management.
Preferred Qualifications:
Data Streaming: Experience in designing and building real-time data streaming solutions using Kafka or Kinesis for real-time analytics and event processing.
Data Governance & Security: Familiarity with data governance practices, data cataloging, and data lineage tools to ensure the quality and security of data.
Advanced Data Analytics Support: Knowledge of supporting machine learning pipelines and building data systems that can scale to meet the requirements of AI/ML workloads.
Certifications: AWS certifications such as AWS Certified Big Data – Specialty or AWS Certified Solutions Architect are highly desirable.
Make your mark in our exciting world at Siemens.
This role, based in Bangalore, is an individual contributor position. You may be required to visit other locations within India and internationally. In return, you'll have the opportunity to work with teams shaping the future.
At Siemens, we are a collection of over 312,000 minds building the future, one day at a time, worldwide. We are dedicated to equality and welcome applications that reflect the diversity of the communities we serve. All employment decisions at Siemens are based on qualifications, merit, and business need.
Bring your curiosity and imagination, and help us shape tomorrow.
We’ll support you with:
Hybrid working opportunities.
Diverse and inclusive culture.
Variety of learning & development opportunities.
Attractive compensation package.
Find out more about Siemens careers at: www.siemens.com/careers