Citi

PySpark Data Engineer (Scala Spark, Java Spark, Python) - C11 - Pune

Pune, India
Oracle SQL Spark API Hadoop Java Python
Description

Team/Function Overview

The Markets Data team is building the next-generation data fabric to meet business, analytics, and growing regulatory needs. Vast amounts of data assets have been accumulated through the years. A data fabric built on emerging technologies will allow that data to be inspected, cleansed, and transformed to support decision-making.

This job involves being part of a dynamic team for Markets data Risk Reporting on Cancel & Corrects and Open / Unconfirmed trades, and contributing to the software development of core components using ETL technologies and a cloud database platform. The ideal candidate will have an eye for building and optimizing data systems and will work closely with our Systems Architects, Data Scientists, and Analysts to help direct the flow of data within the pipeline and ensure consistency of data delivery and utilization across multiple projects.

Role / Position Overview

Olympus (the re-platforming of Ocean), the regulatory reporting infrastructure, is being rebuilt strategically, starting with Equities. As part of the Olympus build-out, the developer will work on rebuilding the Markets data Risk Enterprise Program on Cancel & Corrects, along with Open and Unconfirmed trades data.

We need a strong database developer with a thorough understanding of advanced database concepts to understand the existing application and then migrate it to Olympus. The specific skill sets required are exposure to any RDBMS, Python, PySpark or Java Spark. Experience with any ETL tool is good to have.

Key Responsibilities:

The role will include but not be limited to the following:

Design and implement data objects using data warehousing methodologies, including Oracle or similar relational database tools, SQL and PL/SQL. Implement DWH solutions using Spark SQL and Python on Big Data.

Identify reusable database components and develop recommendations for the ALPS target architecture.

Align to database programming standards and best practices.

Develop ETL jobs using Talend 8x for processing data in a data warehouse.

Collaborate with project teams to refine functional requirements and translate them into technical architecture and design.

Continuously monitor and tune database performance, identifying potential issues and opportunities for improvement, and outline recommendations to improve performance.

Work with development teams to ensure adherence to database standards.

Oversee the change management process for database objects across multiple projects.

Be accountable for delivery of the database objects through SIT, UAT and Production.

Liaise with clients to determine requirements and translate them into solutions.

Mentor and train junior team members.

Development Value:

The candidate has the opportunity to be a major contributor to the Citi Markets Data Strategy and to contribute towards the goal of increasing revenue using key metrics for decision making. The candidate will work with bright and innovative individuals on both the business and technology sides, and the successful candidate can make a significant difference to business performance.

Knowledge/Experience:

5+ years of experience within the technology or banking industry

Strong experience in developing ETL solutions using PySpark and a thorough understanding of advanced DWH concepts

Strong hands-on experience in developing API modules using Python

Strong experience/advanced knowledge of designing conceptual, logical & physical data models and generating initial Data Definition Language

Very strong database design/development experience using Oracle 12C/19C

Working experience in Hadoop, Hive and Impala

Expert in SQL & PL/SQL modules such as packages, procedures, functions and other database objects

Expert in database performance tuning

Strong knowledge of DBA skills using Oracle 12C/19C

Experience with Java will be an added advantage

Expert in Big Data querying tools, e.g. Hive and Impala

Experience writing Python modules and APIs for various data abstraction layers

Experience in working with any ETL tool like Talend 7x or higher will be an added advantage


------------------------------------------------------

Job Family Group:

Technology

------------------------------------------------------

Job Family:

Applications Development

------------------------------------------------------

Time Type:

Full time

------------------------------------------------------

Citi is an equal opportunity and affirmative action employer.

Qualified applicants will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

Citigroup Inc. and its subsidiaries ("Citi") invite all qualified interested applicants to apply for career opportunities. If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View the "EEO is the Law" poster. View the EEO is the Law Supplement.

View the EEO Policy Statement.

View the Pay Transparency Posting
