Senior Data Engineer
Location: Remote (United States)
Department: Engineering
About the Role
As a Data Engineer at Rohirrim, you’ll design, build, and optimize the data pipelines and infrastructure that fuel our AI products. You’ll work closely with our AI/ML teams, product teams, customer success managers, and security/compliance partners to transform complex enterprise datasets into clean, reliable, structured foundations for Rohan deployments, especially in controlled, secure, or GovTech environments.
You’ll help us scale:
- ingestion pipelines
- vector stores
- embedding workflows
- metadata & document-processing frameworks
- Azure-native data services
…in a way that is fast, compliant, and deeply reliable.
What You’ll Do
- Blend capabilities in software engineering, data engineering, and DevOps to build and maintain scalable data ingestion pipelines for structured and unstructured data (documents, PDFs, knowledge bases, enterprise systems, APIs, etc.).
- Develop and operate ETL/ELT workflows that ensure data integrity, security, and lineage.
- Implement and optimize vector database systems and embeddings pipelines supporting RAG and AI features.
- Collaborate with ML engineers to support model training, evaluation, and feature engineering pipelines.
- Architect and manage Azure-based data infrastructure (e.g., Azure Functions, Azure Storage, Azure SQL, Azure Kubernetes Service, Azure OpenAI integrations).
- Build internal tools for metadata extraction, OCR/document parsing, text normalization, and validation.
- Ensure pipelines meet compliance, auditability, and security requirements (SOC 2, FedRAMP, etc.).
- Support customer-specific data onboarding workflows for government and enterprise deployments.
- Monitor and improve pipeline performance, reliability, and scalability.
What Makes You a Great Fit
- 10+ years in Data Engineering, Software Engineering, or ML/Data Infrastructure roles.
- Strong experience with Python, SQL, and modern data engineering tools (Airflow, Dagster, dbt, Prefect, etc.).
- Experience building large-scale document extraction ETL pipelines (OCR, PDF parsing, metadata extraction, NLP preprocessing).
- Proficiency with Kubernetes, Docker, and containerized data pipelines deployed on Azure, AWS, and/or Google Cloud.
- Hands-on experience with relational databases (Postgres, SQL Server, MySQL) and non-relational systems such as Elasticsearch, Redis, and graph databases.
- Experience with document-heavy or text-heavy data processing (OCR, parsing, NLP preprocessing).
- Strong data quality, governance, lineage, and validation mindset.
- Excellent communicator who can align with ML, engineering, and product teams.
Bonus Skills
- Experience building or supporting GenAI / LLM / RAG pipelines.
- Experience with Azure OpenAI Service.
- Experience with MinIO.
- Background with knowledge graphs, semantic search, or indexing at scale.
- Familiarity with CI/CD pipelines in Azure DevOps, GitHub Actions, or similar.
About the Company
Are you passionate about pushing the boundaries of technology in the Gen AI space? Rohirrim is seeking a Senior Data Engineer to mentor engineers, provide technical direction, and drive the development of cutting-edge applications. If you thrive in a fast-paced environment and enjoy leading by example while staying hands-on with coding, we want to hear from you!
Why Join Rohirrim?
At Rohirrim, we're at the forefront of innovation in the Gen AI space. Joining our team means being part of a dynamic environment where your leadership and expertise make a tangible impact on our products and team growth.
