Senior AI Data Engineer (RAG / Retrieval / Python)
Description
We are hiring a Senior AI Data Engineer for a Vancouver-based client building an AI/data platform with heavy external data ingestion, non-standard data engineering, and retrieval-driven workflows.
This role is best suited for someone who has already worked on production AI/LLM-related data systems, especially where data ingestion, parsing, indexing, retrieval, and backend services come together.
This is not a classic BI / reporting / data warehouse role.
This is not a pure backend API role.
We need someone strong at the intersection of data engineering, AI retrieval / RAG workflows, Python backend services, cloud infrastructure, and messy real-world data pipelines.
What You’ll Do:
Build and maintain data ingestion pipelines for structured and unstructured external data.
Design and support retrieval pipelines for AI/LLM workflows.
Develop Python services and APIs for data processing and retrieval, primarily with FastAPI.
Work with vector-based retrieval, metadata enrichment, chunking, indexing, and synchronization.
Support data flows across Postgres, object storage, vector search, and related stores.
Improve reliability, observability, performance, and maintainability of the existing platform.
Collaborate with software engineers and AI-focused teammates to stabilize and evolve the system.
Contribute to technical design decisions in a fast-changing startup environment.
Requirements:
5+ years of commercial software/data engineering experience.
Strong commercial experience with Python.
Hands-on commercial experience building data pipelines / ingestion workflows.
Hands-on commercial experience with AI/LLM-related retrieval systems, such as RAG pipelines, vector search / embedding-based retrieval, or document ingestion / parsing / chunking / indexing workflows.
Experience building or maintaining FastAPI or similar Python backend services.
Experience with AWS data / cloud infrastructure.
Experience with unstructured or semi-structured data.
Strong SQL and practical data modeling skills.
Ability to work independently in ambiguous product environments.
Strong written and spoken English — all technical documentation and client reviews are in English.
Strongly Preferred:
Production experience with one or more vector databases / vector search technologies, such as Pinecone, pgvector, Qdrant, Weaviate, OpenSearch / Elasticsearch vector search, or FAISS.
Experience with graph databases or connected-data modeling, such as Neo4j or Amazon Neptune.
Experience with scraping-heavy or connector-heavy ingestion systems.
Experience with LangChain, LangGraph, Haystack, LlamaIndex, or similar orchestration frameworks.
Experience with Terraform.
Experience supporting retrieval quality, latency, and production reliability.
Nice-to-Have:
Experience with reranking, hybrid retrieval, or evaluation of retrieval quality.
Experience with AI agent workflows or tool-calling systems.
Experience with data governance, permissions, or enterprise knowledge access.
Experience in startup or product companies where engineers own end-to-end outcomes.
Client and Domain:
Client: a software company
Country: Canada
Domain: AI/Data platform
Apply for a job
Write to us by email at [email protected], on Telegram at @insoftex_company, or via the form below.