PySpark Data Engineer
Location: Chennai, Tamil Nadu, India
Employment Type: Regular
Responsibilities:
- Develop, test, and deploy high-quality Python code for data migration, data profiling, and data processing.
- Design and implement scalable solutions for working with large and complex datasets, ensuring data integrity and performance.
- Utilize PySpark for distributed data processing and analytics on large-scale data platforms.
- Develop and optimize SQL queries for various database systems, including Oracle, to extract, transform, and load data efficiently.
- Integrate Python applications with JDBC-compliant databases (e.g., Oracle) for seamless data interaction.
- Implement data streaming solutions to process real-time or near real-time data efficiently.
- Perform in-depth data analysis using Python libraries, especially Pandas, to understand data characteristics, identify anomalies, and support profiling efforts.
- Collaborate with data architects, data engineers, and business stakeholders to understand requirements and translate them into technical specifications.
- Contribute to the design and architecture of data solutions, ensuring best practices in data management and engineering.
- Troubleshoot and resolve technical issues related to data pipelines, performance, and data quality.
Qualifications:
- 4-7 years of relevant experience in the financial services industry.
- Strong Proficiency in Python:
- Excellent command of Python programming, including object-oriented principles, data structures, and algorithms.
- PySpark Experience:
- Demonstrated experience with PySpark for big data processing and analysis.
- Database Expertise:
- Proven experience working with relational databases, specifically Oracle, and connecting applications using JDBC.
- SQL Mastery:
- Advanced SQL querying skills for complex data extraction, manipulation, and optimization.
- Big Data Handling:
- Experience in working with and processing large datasets efficiently.
- Data Streaming:
- Familiarity with data streaming concepts and technologies (e.g., Kafka, Spark Streaming) for processing continuous data flows.
- Data Analysis Libraries:
- Proficient in using data analysis libraries such as Pandas for data manipulation and exploration.
- Software Engineering Principles:
- Solid understanding of software engineering best practices, including version control (Git), testing, and code review.
- Problem-Solving:
- Intuitive problem-solver with a self-starter mindset and the ability to work independently and as part of a team.
Education:
- Bachelor’s degree/University degree or equivalent experience
This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
Preferred Skills & Qualifications (Good to Have):
- Experience in developing and maintaining reusable Python packages or libraries for data engineering tasks.
- Familiarity with cloud platforms (e.g., AWS, Azure, GCP) and their data services.
- Knowledge of data warehousing concepts and ETL/ELT processes.
- Experience with CI/CD pipelines for automated deployment.
Job Family Group: Technology
Job Family: Applications Development
Time Type: Full time
Most Relevant Skills: Please see the requirements listed above.
Other Relevant Skills: For complementary skills, please see above and/or contact the recruiter.
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.
