TransPerfect

Python Backend Developer

Madrid, Spain
Python, OpenCV, PyMuPDF, python-docx, Tesseract, PaddleOCR, LayoutLM, Donut, Nougat, GPT, Claude, PyTorch, TensorFlow, Pandoc, OCR, AI, Machine Learning, Deep Learning, Natural Language Processing
Description

Location: Spain-Madrid

Time Type: Full time

Job Description

TransPerfect Is More Than Just a Job…
Our greatest asset is our people, and nothing is more important to us than ensuring that everyone knows that. Each of our 100+ offices has its own individual identity, and each also has its own unique rewards.

About us:

TransPerfect, a recognized leader in translation software with a vibrant start-up spirit, is seeking a creative and passionate Backend Developer to join our innovative Artificial Intelligence (AI) team. As part of this division, you will have the opportunity to shape the future of AI in a global organization. Since its beginnings over 10 years ago, when it created its first machine translation models, the AI team has become a core driver of the company's innovation in machine translation, generative AI, natural language processing, and automation.

We are looking for an experienced backend developer who is excited about pushing the boundaries of technology and making a lasting impact within the AI space. You will be part of a diverse, global team of professionals across the USA, Spain, Portugal and India. If you are passionate about robust and scalable solutions that bring AI to users, this is the role for you.

About the Role:

As a Backend Developer, you will help us solve the "last mile" of document processing: converting complex, unstructured PDFs into perfectly formatted, editable .docx files. The goal is not just to extract text, but to recreate the visual and structural intent of the original document—including nested tables, multi-column layouts, font hierarchies, and styling.

You will lead the research and implementation of our document conversion pipeline. This is a hybrid role requiring you to be both a strategic decision-maker (staying on top of the existing tools) and a hands-on developer (combining engineering and AI skills).

You will be in charge of:

· Comparative Analysis: Perform a deep-dive evaluation of commercial (ABBYY, Adobe, AWS Textract) vs. open-source/AI-native (Mistral OCR, Docling, Nougat, LlamaParse) solutions.

· Benchmarking: Establish metrics for "format fidelity" to objectively measure how well a tool recreates headers, footers, tables, and styles.

· Pipeline Development: Build a Python-based workflow that integrates OCR engines with document generation libraries (like python-docx or Pandoc).

· AI Implementation: Explore and fine-tune Vision-Language Models (VLMs) or LayoutLM-style architectures to improve structural recognition.

· Optimization: Solve specific edge cases such as rotated text, low-resolution scans, and complex mathematical notation.
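
To give a sense of what a "format fidelity" metric from the Benchmarking item might look like, here is a minimal sketch. It assumes each document has already been reduced to a list of (kind, text) pairs by some extraction step. That representation, the function names, and the chosen element kinds are hypothetical illustrations, not a description of TransPerfect's actual method.

```python
from collections import Counter


def element_f1(reference, candidate):
    """F1 overlap between two multisets of (kind, text) elements."""
    ref, cand = Counter(reference), Counter(candidate)
    overlap = sum((ref & cand).values())  # elements present on both sides
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def format_fidelity(reference, candidate,
                    kinds=("header", "footer", "table", "style")):
    """Per-kind F1 scores; kinds missing from both documents are skipped."""
    scores = {}
    for kind in kinds:
        ref_k = [e for e in reference if e[0] == kind]
        cand_k = [e for e in candidate if e[0] == kind]
        if not ref_k and not cand_k:
            continue  # nothing to compare for this kind
        scores[kind] = element_f1(ref_k, cand_k)
    return scores
```

Exact-match scoring like this is deliberately strict. A real benchmark would likely need fuzzy matching for text and a structural comparison for tables.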

Job requirements

Technical Requirements

· Python Mastery: Expert-level Python skills with experience in OpenCV, PyMuPDF, and python-docx.

· OCR/Document AI: Deep familiarity with Tesseract, PaddleOCR, and modern Transformer-based document models (LayoutLMv3, Donut, or Nougat).

· Format Expertise: A "pixel-perfect" mindset, with an understanding of the nuances of XML-based document formats (OOXML).

· LLM Integration: Experience using GPT or Claude models for layout correction and semantic cleanup.

· Architectural Vision: Ability to decide when to use an "off-the-shelf" API versus when to build a custom PyTorch/TensorFlow pipeline.



Nice to Have

· Experience with Pandoc AST (Abstract Syntax Trees) for format conversion.

· Background in DTP, Typography or Graphic Design.

· Contributions to open-source OCR or PDF manipulation projects.
