GenAI · 2024 · Completed

AI Customer Support System

End-to-end AI support platform that handles 80% of tier-1 tickets autonomously, dramatically cutting response times and support costs.

80%
Tickets Auto-Resolved
4 min
First Response Time
3.2 → 4.7
CSAT Improvement (on a 5-point scale)

Project Overview

A B2B SaaS company with 120,000 active users was drowning in tier-1 support tickets — password resets, billing queries, onboarding confusion. Response times averaged 18 hours. Customer satisfaction was falling.

They needed an intelligent triage and resolution engine that could handle the majority of requests without human intervention, while routing genuinely complex issues to senior agents with full context.

The Challenge

The core engineering challenge was threefold:

  • Intent accuracy at scale — A generic LLM would hallucinate or give incorrect account-specific answers. We needed tight retrieval-augmented generation over private documentation, knowledge bases, and live account data.
  • Zero trust on PII — Customer data could not leave the client's VPC. All model calls had to be proxied through an on-prem inference layer.
  • Graceful escalation — The system had to know precisely when it couldn't answer, and hand off to a human agent with a complete, structured context packet.

Architecture

We designed a three-layer resolution pipeline:

Layer 1 — Intent Classification: Fine-tuned a distilled BERT model on 50k labeled historical tickets to classify intent into 38 categories with 96% accuracy.
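The classifier itself is a fine-tuned model, but the post-processing step — turning raw model logits into a labelled intent plus a confidence score — can be sketched in plain Python. The label names below are hypothetical stand-ins for a few of the 38 real categories:

```python
import math

# Hypothetical subset of the 38 intent categories (illustrative only).
INTENT_LABELS = ["password_reset", "billing_query", "onboarding_help"]

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits):
    """Map model logits to (intent_label, confidence)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return INTENT_LABELS[best], probs[best]
```

The confidence value returned here is what the downstream escalation router thresholds on.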

Layer 2 — RAG Resolution Engine: Built a vector retrieval system over the client's documentation (Confluence, Zendesk macros, API docs) using pgvector and a GPT-4 API proxy. Responses were grounded strictly in retrieved context with citation links.
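In production, retrieval runs as a pgvector similarity query; the same nearest-neighbour idea can be illustrated in-memory with cosine similarity. The documents and embeddings below are toy stand-ins:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy corpus: (doc_id, embedding) pairs standing in for pgvector rows.
CORPUS = [
    ("reset-password-guide", [0.9, 0.1, 0.0]),
    ("billing-faq",          [0.1, 0.9, 0.1]),
    ("api-quickstart",       [0.0, 0.2, 0.9]),
]

def retrieve(query_embedding, k=2):
    """Return the k most similar documents, best first.

    The pgvector equivalent is roughly:
      SELECT id FROM docs ORDER BY embedding <=> %(query)s LIMIT %(k)s
    (<=> is pgvector's cosine-distance operator).
    """
    ranked = sorted(CORPUS, key=lambda d: cosine(query_embedding, d[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```

The retrieved documents are then injected into the prompt, and the model is instructed to answer only from that context, with citation links back to the source pages.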

Layer 3 — Escalation Router: A confidence-threshold + rule-based system that detects when the AI is uncertain and constructs a structured handoff packet (user history, ticket category, retrieved context, AI's attempted answer) for human agents.
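The routing decision reduces to a threshold check plus a small set of hard rules. A minimal sketch — the threshold value and the always-escalate categories here are hypothetical:

```python
CONFIDENCE_THRESHOLD = 0.85  # hypothetical cutoff; tuned per category in practice
ALWAYS_ESCALATE = {"refund_request", "legal_complaint"}  # hypothetical rule list

def route(ticket, intent, confidence, retrieved_docs, draft_answer):
    """Auto-resolve when confident; otherwise build a structured handoff packet."""
    if confidence >= CONFIDENCE_THRESHOLD and intent not in ALWAYS_ESCALATE:
        return {"action": "auto_resolve", "answer": draft_answer}
    return {
        "action": "escalate",
        "packet": {
            "user_history": ticket.get("history", []),
            "ticket_category": intent,
            "retrieved_context": retrieved_docs,
            "attempted_answer": draft_answer,
        },
    }
```

A human agent receiving the packet sees everything the AI saw, plus its attempted answer, so no context is lost in the handoff.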

Implementation

  • Inference: Azure OpenAI on a private endpoint, proxied through a FastAPI gateway
  • Vector Store: PostgreSQL + pgvector, refreshed nightly from Confluence and Zendesk
  • Ticket Ingestion: Webhooks from Zendesk, normalised into a canonical schema
  • Agent Interface: Custom React dashboard showing AI decisions, confidence scores, and one-click approvals
  • Monitoring: Prometheus + Grafana for latency, accuracy, and escalation-rate dashboards
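The normalisation step in the ingestion path can be sketched as a mapping from a Zendesk-style webhook payload into the canonical schema. Field names on both sides are illustrative, not the exact Zendesk wire format:

```python
def normalise_ticket(payload):
    """Map a Zendesk-style webhook payload to the canonical internal schema.

    Key names are illustrative; consult the actual webhook payload for real keys.
    """
    ticket = payload["ticket"]
    return {
        "ticket_id": str(ticket["id"]),
        "subject": ticket.get("subject", ""),
        "body": ticket.get("description", ""),
        "requester_email": ticket["requester"]["email"],
        "source": "zendesk",
        "created_at": ticket.get("created_at"),
    }
```

Normalising at the edge means every downstream layer — classifier, RAG engine, escalation router — works against one schema regardless of the ticketing system upstream.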

Results

Within 90 days of deployment:

  • 80% of tier-1 tickets resolved autonomously, with no human touchpoint
  • Average first-response time dropped from 18 hours to 4 minutes
  • Support team headcount held flat despite 40% user growth
  • Customer CSAT improved from 3.2 to 4.7 / 5

Technologies Used

Python · FastAPI · GPT-4 · pgvector · React · Azure