Tiger Analytics

Generative AI Data Engineer

United States
Python SQL PySpark Snowflake Neo4j Apache Airflow AWS GCP Linux GitHub VS Code DevOps Hadoop Spark Streamlit API Machine Learning Generative AI Elasticsearch Terraform CloudFormation Docker Kubernetes Jenkins GitHub Actions RAG LLM
Description

Gen AI Data Engineer

Location: United States

Department: MLE

Workplace: remote

Employment Type: full

Description

Tiger Analytics is looking for experienced Machine Learning Engineers with Gen AI experience to join our fast-growing advanced analytics consulting firm. Our employees bring deep expertise in Machine Learning, Data Science, and AI. We are the trusted analytics partner for multiple Fortune 500 companies, enabling them to generate business value from data. Our business value and leadership has been recognized by various market research firms, including Forrester and Gartner.

We are looking for top-notch talent as we continue to build the best global analytics consulting team in the world. You will be responsible for:

Technical Skills Required:

Programming Languages: Proficiency in Python, SQL, and PySpark.

Data Warehousing: Experience with Snowflake, NOSQL and Neo4j.

Data Pipelines: Proficiency with Apache Airflow.

Cloud Platforms: Familiarity with AWS (S3, RDS, Lambda, AWS batch, SageMaker processing Job, CloudFormation, etc.) or GCP (Vertex AI RAG, Data pipeline, Bigquery, GKE)

Operating Systems: Experience with Linux.

Batch/Realtime Pipelines: Experience in building and deploying various pipelines.

Version Control: Experience with GitHub.

Development Tools: Proficiency with VS Code.

Engineering Practices: Skills in testing, deployment automation, DevOps/SysOps.

Communication: Strong presentation and communication skills.

Collaboration: Experience working with onshore/offshore teams.

Requirements

Desired Skills:

·        Big Data Technologies: Experience with Hadoop and Spark.

Data Visualization: Proficiency with Streamlit and dashboards.

·        APIs: Experience in building and maintaining internal APIs.

·        Machine Learning: Basic understanding of ML concepts.

·        Generative AI: Familiarity with generative AI tools and techniques.

Additional Expertise:

·        Knowledge Graphs: Experience with creation and retrieval.

·        Vector Databases: Proficiency in managing vector databases.

·        Data Persistence: Ability to develop and maintain multiple forms of data persistence and retrieval methods (RDMBS, Vector Databases, buckets, graph databases, knowledge graphs, etc.).

·        Cloud Technologies: Experience with AWS, especially SageMaker, Lambda, OpenSearch.

·        Automation Tools: Experience with Airflow DAGs, AutoSys, and CronJobs.

·        Unstructured Data Management: Experience in managing data in unstructured forms (audio, video, image, text, etc.).

·        CI/CD: Expertise in continuous integration and deployment using Jenkins and GitHub Actions.

·        Infrastructure as Code: Advanced skills in Terraform and CloudFormation.

·        Containerization: Knowledge of Docker and Kubernetes.

·        Monitoring and Optimization: Proven ability to monitor system performance, reliability, and security, and optimize them as needed.

·        Security Best Practices: In-depth understanding of security best practices in cloud environments.

·        Scalability: Experience in designing and managing scalable infrastructure.

·        Disaster Recovery: Knowledge of disaster recovery and business continuity planning.

·        Problem-Solving: Excellent analytical and problem-solving abilities.

·        Adaptability: Ability to stay up-to-date with the latest industry trends and adapt to new technologies and methodologies.

·        Team Collaboration: Proven ability to work well in a team environment and contribute to a positive, collaborative culture.

GenAI Engineer Specific Skills:

·        Industry Experience: 8+ years of experience in data engineering, platform engineering, or related fields, with deep expertise in designing and building distributed data systems and large-scale data warehouses.

·        Data Platforms: Proven track record of architecting data platforms capable of processing petabytes of data and supporting real-time and batch ingestion processes.

·        Data Pipelines: Strong experience in building robust data pipelines for document ingestion, indexing, and retrieval to support scalable RAG solutions. Proficiency in information retrieval systems and vector search technologies (e.g., FAISS, Pinecone, Elasticsearch, Milvus).

·        Graph Algorithms: Experience with graphs/graph algorithms, LLMs, optimization algorithms, relational databases, and diverse data formats.

·        Data Infrastructure: Proficient in infrastructure and architecture for optimal extraction, transformation, and loading of data from various data sources.

·        Data Curation: Hands-on experience in curating and collecting data from a variety of traditional and non-traditional sources.

·        Ontologies: Experience in building ontologies in the knowledge retrieval space, schema-level constructs (including higher-level classes, punning, property inheritance), and Open Cypher.

·        Integration: Experience in integrating external databases, APIs, and knowledge graphs into RAG systems to improve contextualization and response generation.

·        Experimentation: Conduct experiments to evaluate the effectiveness of RAG workflows, analyze results, and iterate to achieve optimal performance.

Benefits

This position offers an excellent opportunity for significant career development in a fast-growing and challenging entrepreneurial environment with a high degree of individual responsibility.

Tiger Analytics
Tiger Analytics

0 applies

0 views

There are more than 50,000 engineering jobs:

Subscribe to membership and unlock all jobs

Engineering Jobs

60,000+ jobs from 4,500+ well-funded companies

Updated Daily

New jobs are added every day as companies post them

Refined Search

Use filters like skill, location, etc to narrow results

Become a member

🥳🥳🥳 452 happy customers and counting...

Overall, over 80% of customers chose to renew their subscriptions after the initial sign-up.

To try it out

For active job seekers

For those who are passive looking

Cancel anytime

Frequently Asked Questions

  • We prioritize job seekers as our customers, unlike bigger job sites, by charging a small fee to provide them with curated access to the best companies and up-to-date jobs. This focus allows us to deliver a more personalized and effective job search experience.
  • We've got over 200,000 jobs from 15,000+ vetted companies. No fake or sleazy jobs here!
  • We aggregate jobs from 15,000+ companies' career pages, so you can be sure that you're getting the most up-to-date and relevant jobs.
  • We're the only job board *for* software engineers, *by* software engineers… in case you needed a reminder! We add thousands of new jobs daily and offer powerful search filters just for you. 🛠️
  • Every single hour! We add 2,000-3,000 new jobs daily, so you'll always have fresh opportunities. 🚀
  • Typically, job searches take 3-6 months. EchoJobs helps you spend more time applying and less time hunting. 🎯
  • Check daily! We're always updating with new jobs. Set up job alerts for even quicker access. 📅

What Fellow Engineers Say