Enterprise Data Platform

Ingest. Enrich.
Embed. Search.

A pure ingestion and enrichment platform for any data source. Documents, APIs, databases, structured and unstructured — Vector normalises, enriches, embeds, and indexes it all.

Feed in PDFs, DOCX, HTML, Markdown, CSV, JSON, API responses, database exports, or raw text. Vector's pipeline chunks each source, enriches it with metadata, generates GPU-accelerated embeddings, and serves semantic search with intelligent reranking at under 50 ms p95 latency.


<50ms p95 Search Latency
Any Data Source
1M+ Vectors / Node
100% On-Premise

From Raw Data to Intelligence

A fully automated ingestion-to-query pipeline that handles any data source

📥

Ingest

Documents, APIs, databases, CSV, JSON, HTML — any structured or unstructured source

✨

Enrich

Metadata extraction, entity tagging, classification, and format normalisation

⚡

Embed

GPU-accelerated dense vector embeddings via transformer models

🗃

Index

HNSW vector index in Qdrant for fast approximate nearest-neighbour search

🔍

Search + Rerank

Semantic retrieval with cross-encoder reranking for precision
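To make the Index and Search steps concrete, here is a minimal sketch of exact nearest-neighbour search by cosine similarity, which is the result an HNSW index approximates at far lower cost on large collections. The toy vectors, document ids, and the `top_k_cosine` helper are illustrative assumptions, not part of Vector's API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k_cosine(query, vectors, k):
    """Return the k (id, score) pairs most similar to query, best first."""
    scored = [(doc_id, cosine(query, vec)) for doc_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy 3-dimensional "embeddings"; real models produce hundreds of dimensions.
vectors = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
hits = top_k_cosine([1.0, 0.0, 0.0], vectors, k=2)
```

Exact search compares the query against every stored vector, so its cost grows linearly with the collection; an approximate index trades a small amount of recall for much faster lookups.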

Built for Production

Everything you need to ingest, enrich, and search data at enterprise scale

📥

Multi-Format Ingestion

Ingest from any source — PDFs, DOCX, HTML, CSV, JSON, API responses, database exports, plain text, and more. One pipeline handles structured and unstructured data alike.

✨

Data Enrichment Pipeline

Automatic metadata extraction, entity recognition, classification tagging, and format normalisation. Raw data goes in; enriched, searchable knowledge comes out.

⚡

GPU-Accelerated Embeddings

CUDA-powered transformer models generate dense vector embeddings at substantially higher throughput than CPU-only inference. Toggle GPU on or off from the admin dashboard.

🎯

Cross-Encoder Reranking

Two-stage retrieval: fast vector recall followed by neural cross-encoder reranking. Precision jumps 15-30% over embedding-only search, with selectable reranker models.
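The second stage can be sketched as a re-scoring pass over the candidates returned by vector recall. In this sketch, `overlap_score` is a deliberately simple word-overlap stand-in for a neural cross-encoder, and `rerank` and the sample passages are hypothetical names chosen for illustration.

```python
def overlap_score(query, passage):
    """Toy relevance score: fraction of query words found in the passage.

    A stand-in for a neural cross-encoder, which would score the
    (query, passage) pair jointly instead of comparing word sets.
    """
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & passage_words) / len(query_words)

def rerank(query, candidates, score_fn, top_k):
    """Re-score first-stage candidates and keep the best top_k."""
    rescored = [(passage, score_fn(query, passage)) for passage in candidates]
    rescored.sort(key=lambda pair: pair[1], reverse=True)
    return rescored[:top_k]

# Candidates as they might come back from the fast vector recall stage.
candidates = [
    "office opening hours and visitor policy",
    "data retention policy for EU clients",
    "retention of staff in EU offices",
]
best = rerank("data retention policy", candidates, overlap_score, top_k=2)
```

Because the scorer is passed in as `score_fn`, swapping in a different reranker model only changes that one argument, which mirrors the idea of selectable rerankers.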

📦

Workpack System

Drop-in ZIP bundles that pre-load domain knowledge, prompt templates, and advisory workflows. Install via drag-and-drop — no code changes required.

🔐

Multi-Tenant Isolation

Full tenant isolation with per-tenant API keys, user management, and scoped data access. Superadmin console for fleet oversight across all tenants.

📈

Real-Time Dashboard

Live metrics with 30-minute time-series charts. Track ingestion rate, search volume, p95 latencies, vector counts, and per-API-key usage analytics.

🔑

Scoped API Keys

Issue API keys with granular scopes — ingest, search, admin, or any combination. Per-key usage stats, rotation support, and instant revocation from the UI.
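A scope check of this kind might look like the sketch below. The `ApiKey` class, the `authorize` function, and the field names are assumptions made for illustration; only the scope names ingest, search, and admin come from the description above.

```python
from dataclasses import dataclass

VALID_SCOPES = {"ingest", "search", "admin"}

@dataclass
class ApiKey:
    key_id: str
    scopes: frozenset
    revoked: bool = False

def authorize(key, required_scope):
    """Return True only if the key is active and carries the scope."""
    if required_scope not in VALID_SCOPES:
        raise ValueError(f"unknown scope: {required_scope}")
    return not key.revoked and required_scope in key.scopes

# A key limited to search can query but cannot ingest.
search_only = ApiKey("key-1", frozenset({"search"}))
can_search = authorize(search_only, "search")
can_ingest = authorize(search_only, "ingest")

# Revocation takes effect on the very next check.
search_only.revoked = True
can_search_after_revoke = authorize(search_only, "search")
```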

📊

Prometheus Metrics

Native /metrics endpoint exposing request counts, latency histograms, vector store size, and GPU utilisation. Plug straight into Grafana or your existing stack.
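For a quick look at what such an endpoint returns, the Prometheus text format can be read with a few lines of Python. This is a simplified sketch: the metric names in `SAMPLE` are hypothetical, and the parser assumes samples carry no trailing timestamps.

```python
def parse_metrics(text):
    """Parse Prometheus text-format samples into {series: value}.

    Skips comments and blank lines; assumes no trailing timestamps.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The value is the last whitespace-separated token on the line.
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples

# Hypothetical payload; the real metric names may differ.
SAMPLE = """\
# HELP vector_requests_total Total HTTP requests.
# TYPE vector_requests_total counter
vector_requests_total{endpoint="/v1/search"} 1027
vector_requests_total{endpoint="/v1/ingest"} 42
# TYPE vector_store_size gauge
vector_store_size 1500000
"""
metrics = parse_metrics(SAMPLE)
```

In practice a Prometheus server scrapes the endpoint directly; a parser like this is mainly useful for smoke tests and ad hoc checks.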

📑

Smart Chunking

Format-aware splitting with configurable overlap, metadata injection, and structure preservation. Handles tables, headers, nested JSON, and multi-level hierarchies.
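Configurable overlap is the easiest part of this to illustrate. The sketch below splits on words only and is not format-aware; `chunk_words` and its parameters are hypothetical names, not Vector's configuration options.

```python
def chunk_words(text, chunk_size, overlap):
    """Split text into chunks of chunk_size words, each sharing
    `overlap` words with the previous chunk."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks

chunks = chunk_words("one two three four five six seven", chunk_size=4, overlap=2)
# chunks == ["one two three four", "three four five six", "five six seven"]
```

Overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.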

🛡

PII-Safe Logging

Toggle PII-safe mode to strip sensitive data from all logs and audit trails. Built for regulated industries — healthcare, finance, legal — without sacrificing observability.
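One common way to build this kind of safeguard in Python is a `logging.Filter` that scrubs messages before they reach any handler. The sketch below redacts email addresses only; the `RedactEmails` class, the regular expression, and the logger name are assumptions for illustration, and a production PII-safe mode would need to cover many more data types.

```python
import logging
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

class RedactEmails(logging.Filter):
    """Replace email addresses in log messages with a placeholder."""

    def filter(self, record):
        # Render the final message first so formatted arguments are redacted too.
        record.msg = EMAIL_RE.sub("[REDACTED]", record.getMessage())
        record.args = ()
        return True  # keep the record, now scrubbed

def redact(message):
    """Show what the filter does to a single message."""
    record = logging.LogRecord(
        "demo", logging.INFO, "demo.py", 0, message, (), None
    )
    RedactEmails().filter(record)
    return record.getMessage()

# Attach the filter so every record passing through this logger is scrubbed.
logger = logging.getLogger("vector.demo")
logger.addFilter(RedactEmails())

scrubbed = redact("login by jane.doe@example.com ok")
```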

🔄

Source Connectors

Plug into REST APIs, webhooks, file shares, S3 buckets, and database connections. Schedule recurring ingestion or trigger on-demand — Vector pulls and processes automatically.

Workpacks

Pre-built knowledge bundles that turn Vector into a domain expert instantly

Advisory

IT Director Workpack

Strategic IT advisory covering digital transformation, cloud migration, cybersecurity governance, vendor selection, and technology roadmaps.

📄 50+ documents 🎯 Tuned prompts

Advisory

IT Manager Workpack

Operational IT guidance for infrastructure management, service desk optimisation, patch management, and team workflows.

📄 40+ documents 🎯 Tuned prompts

Advisory

Front Desk Workpack

Customer-facing knowledge base for reception, visitor management, booking systems, and front-of-house procedures.

📄 30+ documents 🎯 Tuned prompts

Compliance

GDPR Compliance Pack

Data protection policies, breach response procedures, DPIA templates, and regulatory guidance for GDPR compliance workflows.

📄 60+ documents ✅ Audit-ready

Industry

Healthcare Knowledge Pack

Clinical protocols, patient pathways, NHS Digital standards, and healthcare information governance frameworks.

📄 80+ documents 🛡 PII-safe

Industry

Legal Advisory Pack

Contract analysis templates, case law summaries, regulatory filing guides, and legal research workflows for in-house counsel.

📄 70+ documents 🔑 Scoped access

RESTful API.
Zero Friction.

Every capability in Candengo Vector is exposed through a clean, versioned REST API. Ingest documents, run semantic searches, and generate answers, all through one consistent set of endpoints. Authenticate with scoped API keys and get structured JSON responses.

POST    /v1/ingest            Upload and embed documents
POST    /v1/search            Semantic vector search
POST    /v1/query             RAG-powered Q&A
GET     /v1/documents         List ingested documents
DELETE  /v1/documents/{id}    Remove a document
GET     /health               Health and readiness check

curl

# Ingest a document
curl -X POST https://your-instance/v1/ingest \
  -H "Authorization: Bearer cv_key_..." \
  -F "file=@report.pdf" \
  -F 'metadata={"dept":"legal"}'

# Semantic search
curl -X POST https://your-instance/v1/search \
  -H "Authorization: Bearer cv_key_..." \
  -H "Content-Type: application/json" \
  -d '{
    "q": "data retention policy for EU clients",
    "top_k": 5,
    "rerank": true
  }'

# RAG query with citations
curl -X POST https://your-instance/v1/query \
  -H "Authorization: Bearer cv_key_..." \
  -H "Content-Type: application/json" \
  -d '{
    "q": "What is our GDPR breach notification process?",
    "cite_sources": true
  }'
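The same search call can be made from Python with only the standard library. This sketch builds the request without sending it; the endpoint path and the `q`, `top_k`, and `rerank` fields come from the curl example above, while `build_search_request` and the `YOUR_API_KEY` placeholder are assumptions.

```python
import json
import urllib.request

def build_search_request(base_url, api_key, query, top_k=5, rerank=True):
    """Build (but do not send) a POST /v1/search request."""
    body = json.dumps({"q": query, "top_k": top_k, "rerank": rerank})
    return urllib.request.Request(
        url=f"{base_url}/v1/search",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

request = build_search_request(
    "https://your-instance",
    "YOUR_API_KEY",  # placeholder, substitute a real scoped key
    "data retention policy for EU clients",
)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```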
                            

Architecture

Production-ready stack, deployable in minutes

💫

FastAPI

Async Python backend with automatic OpenAPI schema generation and Pydantic validation

🖸

Qdrant

High-performance vector database with HNSW indexing, filtering, and payload storage

🌀

Transformer Models

Sentence-transformers for embeddings with optional CUDA acceleration on NVIDIA GPUs

💫

Docker Ready

Single-container deployment with docker-compose. Mount a volume and go, with zero configuration

Ready to Transform Your Knowledge Workflows?

Candengo Vector is enterprise software, licensed and deployed to your infrastructure. Self-hosted and air-gap capable, with dedicated onboarding and support. Contact our sales team for pricing and a personalised demo.

Contact Sales →

sales@candengo.com

Ingest. Enrich. Embed. Search.

Your AI Remembers.
