Job VC
Staff AI Engineer, Conversation Intelligence Systems
Description
Who we are
Xenoss is an AI engineering and integration services company, helping medium to large enterprises run AI transformation end-to-end, from situation analysis and goals framing to data discovery and preparation, pipeline building, model development, retraining pipeline design, solution deployment, and support.
We build a broad spectrum of AI solutions, including user behaviour prediction, content generation, NLP, audience segmentation, pathfinding solutions, AI assistants, edge computer vision, and fraud detection.
We work with prominent companies such as Microsoft, Toshiba, AstraZeneca, Activision Blizzard, Verve Group, Voodoo Games, and Telefonica, among others.
We’re included in the top 100 software companies on the Inc. 5000 list.
What is the project
We’re hiring a Staff Applied AI Engineer to lead a long-term enterprise AI initiative for a world-leading bank holding company.
The project focuses on building conversational intelligence and predictive decision systems on top of large-scale customer interaction data. The goal is to transform raw customer conversations and related enterprise data into structured business signals, real-time guidance, and outcome prediction capabilities.
The work sits at the intersection of conversational AI, enterprise data, applied machine learning, real-time decision support, and regulated financial services.
The first phase will validate the core conversation intelligence foundation. The broader roadmap includes post-call event sequence mining, real-time in-call guidance, golden dataset creation, model evaluation frameworks, and prediction layers that combine conversation signals with CRM, customer profile, transaction, product usage, and outcome data.
You will help define how these systems are designed, evaluated, and scaled.
What will you do
You’ll lead the applied AI direction across the full lifecycle, from data and taxonomy design to model evaluation, system architecture, and production readiness.
Core work includes:
Designing AI approaches for conversation event sequence mining
Turning unstructured transcripts into structured events, intents, customer reactions, and opportunity signals
Defining golden dataset strategy, annotation schemas, and SME validation workflows
Evaluating LLM-based extraction, classification, RAG, fine-tuning, and hybrid approaches
Designing evaluation frameworks, quality benchmarks, and acceptance criteria
Leading error analysis and model improvement cycles
Shaping real-time guidance capabilities for live customer conversations
Designing prediction approaches that combine conversation signals with structured customer and outcome data
Partnering with data engineering, MLOps, client SMEs, and delivery leadership
Making trade-offs between model quality, latency, cost, explainability, and governance
You’re expected to be deeply hands-on in AI system design, evaluation, and experimentation.
Technology landscape
You’ll operate across the modern applied AI and ML ecosystem, including, but not limited to:
LLM-based extraction and classification
Open-weight and proprietary LLM evaluation
RAG and knowledge-grounded response generation
Fine-tuning and domain adaptation
LoRA / QLoRA and PEFT methods
PyTorch and Hugging Face ecosystem
Classical ML and sequence modelling for outcome prediction
Probability calibration and model evaluation
Dataset annotation workflows and golden dataset design
MLOps, monitoring, and model governance concepts
We optimise for production viability, measurable business value, and enterprise constraints.
Scope of ownership and delivery context
At Staff level, you’ll own the applied AI architecture and evaluation strategy for a complex enterprise AI program.
Core ownership
Define the AI approach for the conversation intelligence PoC
Establish the event / intent / insight taxonomy
Define the golden dataset strategy and annotation workflow
Establish evaluation frameworks and acceptance criteria
Drive trade-offs between accuracy, explainability, latency, cost, and governance
Decide which modeling approaches are appropriate for each use case
Act as an escalation point for AI architecture, evaluation, and data strategy decisions
Team and delivery context
Work within a cross-functional team spanning AI engineering, data engineering, MLOps, solution architecture, and client stakeholders
Partner with domain SMEs on taxonomy, labeling, and validation
Mentor engineers working on extraction, evaluation, and data pipelines
Translate ambiguous business use cases into testable AI hypotheses and validation plans
What should you bring
Must have
Strong hands-on experience with applied AI / ML systems in production-oriented environments
Experience with NLP, conversational AI, or transcript-based intelligence systems
Ability to design evaluation frameworks, not just run experiments
Experience building or validating structured datasets from unstructured text
Strong understanding of LLM-based extraction, classification, RAG, and fine-tuning trade-offs
Practical knowledge of classical ML or predictive modeling
Understanding of probability-based prediction, calibration, and outcome evaluation
Comfort working with messy enterprise data and incomplete labels
Ability to communicate with both technical teams and business stakeholders
Strong ownership of ambiguity, scope control, and PoC validation strategy
Nice to have
Financial services domain exposure
Experience with sales, call center, or customer conversation analytics
Speech / ASR pipeline familiarity
Model governance and auditability experience
Experience with real-time AI systems or low-latency inference
Experience combining unstructured conversation signals with structured CRM, transaction, or customer profile data
Experience designing golden datasets and SME review workflows
Operating model
Engagement structure: FTE-equivalent via long-term B2B contract
Work location: On-site or closely aligned with the client team in New York
Infrastructure: Client environment only, no external training or data processing environments
Data residency: All work executed within the client perimeter
Delivery mode: PoC-first, with a path toward production-grade conversation intelligence and prediction systems