FactSet Research Systems is an American financial data and software company that provides a wide universe of financial data and services, supported by innovative financial and statistical data collection.
“We will no longer need to read documents except for fun.” Today, research analysts in the financial domain have to read long documents to extract data from them. This is a long and tedious process. With GenAI, data extraction can be made much easier. Can we build a tool that makes document understanding and data extraction easy?
Assistant, can you extract the value of this “concept” from the document for me? Where did you find such information?
This is possible today. But how can we optimize an AI system to perform such a task well? How can we make it cheap? How can we guarantee quality? This is the context where we need you.
Project Overview:
The internship project involves assisting a team of AI/ML engineers in building a document intelligence tool. You will be involved in a few research topics to prove that the approach can meet performance and cost-efficiency targets.
The project will combine prompt engineering, LLM selection, and RAG. We want to be able to verify at each stage that we are not losing performance and that we are saving costs.
The basic task of the tool is: "I want to extract “this concept” from a document."
The solution involves retrieving the right chunks from the document, building a dynamic prompt, optimizing cost, running different research studies to test hypotheses, etc.
The challenge is: how can we build a solution that can scale? How can we be very competitive cost-wise? How can we guarantee extraction quality?
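As an illustration of the "retrieve the right chunks, then build a dynamic prompt" idea described above, here is a minimal sketch. It is not the team's actual implementation: the function names, the fixed-size word chunking, and the keyword-overlap scoring are all assumptions chosen to keep the example self-contained. A real system would more likely use embeddings for retrieval.

```python
import re


def split_into_chunks(text, max_words=50):
    """Split a document into fixed-size word windows (hypothetical chunking rule)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_chunks(concept, chunks, top_k=2):
    """Rank chunks by how many concept terms they share; drop chunks with no overlap."""
    concept_terms = tokenize(concept)
    scored = [(len(concept_terms & tokenize(chunk)), index)
              for index, chunk in enumerate(chunks)]
    # Highest overlap first; ties keep the original document order.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [chunks[index] for _, index in ranked[:top_k]]


def build_prompt(concept, retrieved):
    """Assemble a prompt that contains only the retrieved chunks, not the whole document."""
    context = "\n\n".join(f"[chunk {i}] {chunk}"
                          for i, chunk in enumerate(retrieved, start=1))
    return (
        f'Extract the value of "{concept}" from the context below '
        "and cite the chunk number where you found it.\n\n"
        f"Context:\n{context}"
    )


if __name__ == "__main__":
    chunks = [
        "Total revenue for 2023 was 2.1 billion dollars.",
        "The board met in June.",
        "Revenue growth was driven by subscriptions.",
    ]
    print(build_prompt("total revenue", retrieve_chunks("total revenue", chunks)))
```

Sending only the top-ranked chunks keeps the prompt short, which is one way the cost question could be approached. Asking the model to cite a chunk number addresses the "where did you find such information?" question.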
Document intelligence tool description:
We have built a first version of the document intelligence tool. The next steps are to optimize it, perform different research studies and keep adding functionalities to it.
At the current stage, we will be developing new versions of the tool. Each version will include a research phase, and we want to prove that each new version is better than the previous one. We need someone to help us with this research and with ways to automatically evaluate that new versions are better than previous ones.
Responsibilities:
Standardize ML/AI datasets:
To evaluate an AI system, we need to produce validation datasets
Validation datasets should have a standard format
Validation datasets should be stored in a predefined location
Standardize dataset I/O
Aggregate evaluation metrics:
Analyze different evaluation metrics for text generation, such as “exact match”, “Levenshtein score”, and “BERTScore”
Define the role of an “LLM as a judge”
Run a few experiments to test different hypotheses:
For instance, show that using RAG enhances system performance
Automate non-regression tests
Build a script that automatically checks that system performance has not fallen below given thresholds
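To make the evaluation and non-regression responsibilities above more concrete, here is a minimal sketch of two of the metrics mentioned (exact match and a Levenshtein-based score) together with a threshold check. The record format (`prediction` / `reference` keys), the normalization rule, the function names, and the threshold values are all hypothetical, not FactSet's actual conventions.

```python
def exact_match(prediction, reference):
    """1.0 if the strings are identical after trimming and lowercasing, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def levenshtein_distance(a, b):
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def levenshtein_score(prediction, reference):
    """Similarity in [0, 1]; 1.0 means the two strings are identical."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(prediction, reference) / longest


def evaluate(records):
    """Average each metric over records shaped like
    {"prediction": str, "reference": str} (an assumed dataset format)."""
    if not records:
        raise ValueError("validation dataset is empty")
    n = len(records)
    return {
        "exact_match": sum(exact_match(r["prediction"], r["reference"])
                           for r in records) / n,
        "levenshtein_score": sum(levenshtein_score(r["prediction"], r["reference"])
                                 for r in records) / n,
    }


def check_non_regression(metrics, thresholds):
    """Return the names of metrics that fell below their minimum threshold."""
    return [name for name, minimum in thresholds.items()
            if metrics.get(name, 0.0) < minimum]


if __name__ == "__main__":
    records = [
        {"prediction": "12.5", "reference": "12.5"},
        {"prediction": "kitten", "reference": "sitting"},
    ]
    metrics = evaluate(records)
    failures = check_non_regression(
        metrics, {"exact_match": 0.6, "levenshtein_score": 0.7})
    print(metrics, failures)
```

A script like this could run against each new version of the tool and fail the build whenever `check_non_regression` returns a non-empty list. Model-based metrics such as BERTScore or an LLM judge would plug into `evaluate` in the same way, but they need external models and are left out of this sketch.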
Qualifications:
Current student or recent graduate in Computer Science, Information Technology, or a related field.
Proficiency in Python.
Ability to work with Jupyter notebooks.
Knowledge of AI/ML.
Good problem-solving skills and an eye for detail.
Ability to work collaboratively in a team environment.
What We Offer:
Hands-on experience with an innovative GenAI use case.
Mentorship and guidance from experienced developers.
Exposure to real-world projects.
Opportunity to develop a comprehensive understanding of AI projects.
Involvement in various AI/ML community events.
Why Life is Better as a FactSetter:
FactSet looks to foster a globally inclusive culture. From leadership commitment to employee-led resource groups, FactSet treats diversity, equity, and inclusion as a priority. Read more about our priorities here: https://www.factset.com/company/diversity-equity-and-inclusion
FactSet believes that giving back to our communities is part of our culture. From volunteer opportunities to working with non-profit partners, you can read more about our commitments here: https://www.factset.com/company/corporate-responsibility
Company profits participation
No or low-cost medical, dental and vision care
Full and free access to LinkedIn Learning catalog
Reimbursement for eligible expenses related to AWS certification and financial certifications (CFA, CIPM, CAIA, FRM)
Employee referral bonuses
Flexible office work / teleworking
And more!
At FactSet, we celebrate diversity of thought, experience, and perspective. We are committed to disrupting bias and to a transparent hiring process. All qualified applicants will be considered for employment regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity or veteran status. FactSet participates in E-Verify.