Causaly

Senior Data Engineer

London, UK
SQL Python Pandas Hadoop Docker Machine Learning Spark Elasticsearch Terraform Kubernetes
Description

About us 

Founded in 2018, Causaly accelerates how humans acquire knowledge and develop insights in Biomedicine. Our production-grade generative AI platform for research insights and knowledge automation enables thousands of scientists to discover evidence from millions of academic publications, clinical trials, regulatory documents, patents and other data sources… in minutes. 

We work with some of the world's largest biopharma companies and institutions on use cases spanning Drug Discovery, Safety and Competitive Intelligence. You can read more about how we accelerate knowledge acquisition and improve decision making on the Causaly blog.

We are backed by top VCs including ICONIQ, Index Ventures, Pentech and Marathon. 

About the role: 

We are looking for a Senior Data Engineer with experience in data pipelines, backend architectures, ETL, cloud and related fields. You will join and help grow our established Data & Semantic Technologies team. This team is responsible for designing and building the highly scalable and flexible data backend that Causaly needs to make its vision a reality. You will work on incremental data pipelines that support both batch and targeted updates, grow and maintain massive knowledge graphs and ontologies, and feed our constantly growing data warehouse, among other tasks. You will enable and empower the Applied AI and Application teams, and be responsible for linking their outcomes to create true business value.

We are looking for innovative engineers who are capable, talented, engaged and passionate about creating industry-strength architectures and solutions that unleash the value of data. We are a multi-disciplinary team working in a fast-paced and collaborative environment, and we value honest opinion and open debate. Do you have a true problem-solving mindset and a hands-on attitude? Are you keen to design and build innovative solutions that leverage the value of data, passionate and creative in your work, happy to share ideas with your team and able to pick the right tool for the job? Then you should become part of our journey!

What you can expect to work on:

  • Gather and understand data based on business requirements.
  • Import big data (millions of records) from various formats (e.g. CSV, XML, SQL, JSON) to BigQuery. Process further on BigQuery and combine with external data sources.
  • Implement and maintain highly performant data pipelines, using industry best practices and technologies for scalability, fault tolerance and reliability.
  • Build the necessary tools for monitoring, auditing, exporting and gleaning insights from our data pipelines.
  • Work directly with a multitude of technical, product and business stakeholders.
  • Manage and maintain backend data processes related to data delivery, curation and machine learning operations.
  • Help to build a strong data-engineering function, mentor and guide other engineers, shape our technology strategy and innovate on our data backbone.

Minimum Requirements

Successful candidates will have:

  • Master’s degree in Computer Science, Mathematics or a related technical field
  • 5+ years of experience in backend data processing and data pipelines
  • Excellent knowledge of Python and related libraries for working with data (e.g. pandas, Airflow)
  • Excellent SQL and database skills
  • Solid understanding of modern software development practices (testing, version control, documentation, etc.)
  • A product and user-centric mindset
  • Excellent problem-solving and organizational skills, a sense of ownership, and high attention to detail and quality

Preferred Qualifications

Any experience of the following will be considered a plus:

  • NoSQL and big data technologies (e.g. Spark, Hadoop)
  • Full-text search databases (e.g. Elasticsearch)
  • Knowledge graphs and graph databases (e.g. Neo4j)
  • MLOps / DataOps in production
  • Terraform, Kubernetes and/or Docker containers