Ripjar

Data Engineer (Remote)

Remote Cheltenham, England
Python Node.js HDFS HBase Spark Kubernetes MongoDB OpenSearch Apache Airflow Dagster NiFi Git GitHub Actions Jira Confluence Ansible Terraform Argo CD Helm PySpark Linux
Description

Data Engineer

Location: Cheltenham, England, United Kingdom; London, England, United Kingdom; Bristol, England, United Kingdom; Manchester, England, United Kingdom

Department: Engineering

Workplace: remote

Employment Type: Full-time

About Ripjar

Ripjar specialises in the development of software and data products that help governments and organisations combat serious financial crime. Our technology is used to identify criminal activity such as money laundering and terrorist financing, enabling organisations to enforce sanctions at scale to help combat rogue entities and state actors.

Data infuses everything Ripjar does. We work with a wide variety of datasets of all scales, including an ever-growing archive of billions of news articles covering most languages going back over 30 years, sanctions and watchlist data provided by governments, and vast organisation and ownership datasets.

About the Role

We see a Data Engineer as a software engineer who specialises in distributed data systems. You’ll join the Data Engineering team, whose prime responsibility is the development and operation of the Data Collection Hub, a platform that ingests data from many sources, processes/enriches it, and distributes it to multiple downstream systems.

We’re looking for someone with 2+ years of industry experience building and operating production software who enjoys working across data pipelines, distributed systems, and operational reliability.

What you’ll do

  • Engineer distributed ingestion services that reliably pull data from diverse sources, handle messy real-world edge cases, and deliver clean, well-structured outputs to multiple downstream products.
  • Build high-throughput processing components (batch and/or near-real-time) with a focus on performance, scalability, and predictable cost, using strong profiling and measurement practices.
  • Design and evolve data contracts (schemas, validation rules, versioning, backward compatibility) so downstream teams can build with confidence.
  • Own production quality: write maintainable code and strong unit/integration tests, and add the observability you need (metrics/logs/tracing) to diagnose issues quickly.
  • Improve platform reliability by hardening pipelines against partial failures, retries, rate limits, data drift, and infrastructure issues—then codify those learnings into better tooling and guardrails.
  • Contribute to CI/CD and developer experience: faster builds, better test signal, safer releases, and automated operational checks.
  • Participate in design reviews, code reviews, incident retrospectives, and iterative delivery—making pragmatic trade-offs and documenting them clearly.

Technology Stack

  • Languages: Predominantly Python and Node.js
  • Distributed/data platforms: HDFS, HBase, Spark, plus increasing use of Kubernetes and cloud services
  • Storage/search: MongoDB, OpenSearch
  • Orchestration: Airflow, Dagster, NiFi
  • Tooling: GitHub, GitHub Actions, Rundeck, Jira, Confluence
  • Deployment/config: Ansible (physical), Terraform / Argo CD / Helm (Kubernetes)
  • Development environment: MacBook (typical)

Requirements

Essential:

  • 2+ years building and operating production software systems
  • Fluency in at least one programming language (Python/Node.js a plus)
  • Experience debugging moderately complex systems and improving reliability/performance
  • Strong fundamentals: data structures, testing, version control, Linux basics

Nice to have:

  • Spark/PySpark experience
  • Hadoop ecosystem exposure (HDFS/HBase)
  • Workflow orchestration (Airflow/Dagster/NiFi)
  • Search/indexing (OpenSearch, MongoDB)
  • Kubernetes and infrastructure-as-code
  • Degree in Computer Science or another numerate subject

Benefits

  • Competitive salary, depending on experience
  • 25 days annual leave, rising to 30 days after 5 years of service, plus your birthday off and bank holidays
  • Remote working
  • Private family healthcare
  • 35-hour working week
  • Employee Assistance Programme
  • Company contributions to your pension
  • Pension salary sacrifice
  • Enhanced maternity/paternity pay
  • The latest tech, including a top-of-the-range MacBook Pro