Conrad van Coller

Solution Architect · Forward Deployed Engineer

As a Solution Architect at Kaha Management, I lead enterprise CLM/AML technical implementations and integrations. I co-founded the Reporting Accelerator product, a real-time data synchronisation platform built from concept to multi-client production.

8 yrs engineering
3 yrs solution architecture
Ericeira, Portugal

01 / Selected work

Selected projects worth describing in more depth than the CV allows.

KahaPilot: Private documentation layer powering AI assistants and coding agents.

Designed & Built · 2025 – Present

A documentation-grounded AI coding assistant for Fenergo (Fen-X) integration work, built end-to-end as an internal tool for Kaha Management. LLMs hallucinate on specialised vendor APIs because their training data rarely covers them. KahaPilot addresses this by hosting a private semantic search index locally on each developer's machine and exposing it to Claude Code and Claude Desktop over the Model Context Protocol. The index covers developer documentation, curated knowledge from the internal Stack Overflow, integration patterns and templates, and other reference material.

The system serves two audiences: functional consultants looking up knowledge, and agentic coding systems that need grounded context.

Figure: KahaPilot high-level design. Claude clients call the MCP wrapper, which fronts grounded-docs, a local SQLite vector index of Fenergo documentation backed by Ollama embeddings. Phoenix OSS sits off the hot path for telemetry.

What it does

When a developer asks Claude to write Fenergo code, Claude searches the local index first and grounds every response in authoritative documentation. The corpus covers ~2,900 Developer Hub pages, more than 200 Swagger / OpenAPI specifications, and ~5,000 quality-curated answers from the internal Stack Overflow for Teams instance, which hold some of the most valuable institutional knowledge a new joiner could get. Retrieval runs entirely on the developer's laptop, with no external API calls and no data leaving the organisation.
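As a rough illustration of what "searching the local index" means, the sketch below ranks a handful of pre-embedded chunks by cosine similarity to a query vector. It is a minimal sketch under stated assumptions, not KahaPilot's retrieval code: the three-dimensional vectors, the source labels, and the function names are all hypothetical, and a real index would use the embedding model's full-width vectors.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], index: list[dict], k: int = 2) -> list[dict]:
    """Return the k chunks whose vectors are closest to the query vector."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item["vector"]), reverse=True)
    return ranked[:k]


# Toy index: hypothetical sources with made-up 3-dimensional embeddings.
index = [
    {"source": "developer-hub/entity-api", "vector": [0.9, 0.1, 0.0]},
    {"source": "stack-overflow/webhook-retries", "vector": [0.1, 0.9, 0.1]},
    {"source": "swagger/example-spec", "vector": [0.0, 0.2, 0.9]},
]

results = top_k([0.8, 0.3, 0.0], index, k=2)
sources = [hit["source"] for hit in results]
```

Because each hit carries its source label, the caller can show where an answer came from alongside the answer itself.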

Functional consultants use it the same way through Claude Desktop. When they need a specific piece of information (internal company knowledge or a Fen-X documentation lookup), Claude answers from the grounded corpus and surfaces direct links to the relevant sources alongside the response, for verification and further reading.

Architecture

A small Node.js MCP wrapper fronts grounded-docs, an extended build of the open-source arabold/docs-mcp-server. The wrapper re-exposes its tools with Fenergo-scoped descriptions, enriches results with canonical source URLs, and emits OpenInference TOOL spans asynchronously so telemetry never blocks retrieval. The whole stack ships as a single Docker Compose project that a developer brings up with one command. Claude Code hooks are also used to improve how MCP tools are called from within Claude Code.
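The wrapper pattern described here (enrich results, record telemetry without waiting for it) can be sketched in a few lines. The real wrapper is written in Node.js; this Python version is only an illustration of the idea, and the base URL, the result shape, and every function name in it are assumptions rather than the project's actual code.

```python
import queue
import threading
import time

# Hypothetical base URL; the real canonical URL scheme is not given in the text.
DOCS_BASE_URL = "https://docs.example.invalid/"

span_queue: queue.Queue = queue.Queue()
exported_spans: list[dict] = []


def export_worker() -> None:
    """Drain spans in the background; a real exporter would send them to a collector."""
    while True:
        span = span_queue.get()
        if span is None:  # sentinel: stop the worker
            break
        exported_spans.append(span)


def search_index(query: str) -> list[dict]:
    """Stand-in for the underlying docs search tool (hypothetical result shape)."""
    return [{"path": "api/entity/create.md", "text": f"Result for {query!r}"}]


def search_docs(query: str) -> list[dict]:
    """Wrapped tool: add source URLs, then enqueue a span without waiting on export."""
    started = time.perf_counter()
    results = [
        {**hit, "source_url": DOCS_BASE_URL + hit["path"].removesuffix(".md")}
        for hit in search_index(query)
    ]
    span_queue.put_nowait({
        "kind": "TOOL",
        "name": "search_docs",
        "input": query,
        "latency_ms": (time.perf_counter() - started) * 1000,
    })
    return results


worker = threading.Thread(target=export_worker, daemon=True)
worker.start()
hits = search_docs("create entity")
span_queue.put(None)  # shut the worker down once the demo call is done
worker.join()
```

The retrieval path only pays for a queue insert; if the exporter is slow or unavailable, the tool call still returns at the same speed.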

Indexing & distribution

Documentation is chunked along Markdown structure (headings, code fences, table boundaries) so each chunk holds one self-contained idea: a single endpoint, a concept, a Q&A pair. Chunks are embedded locally with snowflake-arctic-embed2, a model that performs strongly on technical and code retrieval, and persisted into a SQLite vector store. The pre-indexed store ships as a compressed archive in Git LFS, and a scheduled job rebuilds it weekly from the latest Fenergo documentation, Stack Overflow exports, and Swagger specs. Developers simply run git pull to refresh their embeddings: the first grounded query takes under five minutes from a clean clone, and no local re-indexing is required.
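A structure-aware splitter of the kind described above might look like the following. This is a simplified sketch, not the indexer's actual logic: it handles only headings and code fences (table boundaries are left out), and the sample document, including its endpoint path, is invented for illustration.

```python
FENCE = "`" * 3  # the Markdown code-fence marker, built here to keep this example readable


def chunk_markdown(text: str) -> list[str]:
    """Split Markdown at headings, but never inside a fenced code block."""
    chunks: list[str] = []
    current: list[str] = []
    in_fence = False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        starts_section = line.startswith("#") and not in_fence
        if starts_section and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [chunk for chunk in chunks if chunk]


# Hypothetical documentation page with a shell comment inside a code fence.
sample = "\n".join([
    "# Create entity",
    "POST /entity creates a draft.",
    FENCE + "bash",
    "# this comment is not a heading",
    "curl -X POST /entity",
    FENCE,
    "## Errors",
    "409 means the entity already exists.",
])

chunks = chunk_markdown(sample)
```

Tracking the fence state is what keeps the shell comment from being mistaken for a heading, so the endpoint description and its example stay in one chunk.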

Telemetry & observability

A companion Phoenix OSS deployment captures every MCP tool call, giving engineering leads visibility into which docs are retrieved, retrieval latency, token consumption, and per-user activity. Phoenix is hosted by a small Aspire project in C# and deployed to Azure Container Apps for centralised, team-wide telemetry. Phoenix is also surfaced back into Claude Desktop as an MCP server, so leads can query traces in natural language.

2,900+ Developer Hub pages indexed

5,000+ Curated Stack Overflow answers

200+ Swagger / OpenAPI specs

Reporting Accelerator: Real-time data synchronisation and reporting for CLM/AML platforms.

Co-Founder & Technical Lead · Nov 2024 – Present

Real-time data ingestion, synchronisation, and reporting platform built for Fen-X (Fenergo's enterprise CLM/AML offering). I identified a structural reporting gap in the Fenergo ecosystem across multiple clients and drove the initiative from concept to multi-client production alongside full Solution Architect responsibilities. The product is now live across three financial institutions and maintained by a four-engineer team I lead.

What it does

Fen-X holds client lifecycle and compliance data in a NoSQL backend that is not designed for high-volume analytical queries. The accelerator captures every domain event the platform emits (changes to Entities, Workflows, Products, and so on), enriches each event against the source APIs, and lands a structured, queryable data mirror in PostgreSQL. Financial institutions can then run their own reporting and analytics on top of a real-time, audited mirror without putting load on the platform's transactional path. Clients often use Snowflake's PostgreSQL connector to import the data into in-house data lakes.

Both webhook (push) and polling (pull) ingestion are supported, so the system works against tenants that do not expose real-time event streams. Historical backfills and per-tenant reseeds are first-class operations rather than afterthoughts.

Architecture

The platform comprises seven specialised .NET 8 microservices, orchestrated with Aspire and deployed to Azure Container Apps. The pipeline is event-driven end to end: an Event Listener fronts the source platform's webhook and polling integrations and publishes onto RabbitMQ with dead-letter queues and idempotent retry handling, while an Event Handler consumes, enriches, and persists each message. Separate services own seeding & migration (Data Seeder API), workflow-driven ingestion (Flows API), native report syncing (Advanced Reporting Handler), and scheduled maintenance (Background Services).
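The combination of idempotent handling, retries, and dead-lettering can be illustrated with an in-memory model. The sketch below does not use a RabbitMQ client and is not the Event Handler's code (which is .NET); the retry budget, message shape, and function names are assumptions chosen to show the control flow only.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed retry budget, not taken from the text


def consume(messages, handle):
    """Handle each message id once; park repeatedly failing messages on a dead-letter list."""
    pending = deque((message, 1) for message in messages)
    processed_ids = set()
    dead_letters = []
    while pending:
        message, attempt = pending.popleft()
        if message["id"] in processed_ids:
            continue  # duplicate delivery: already handled, safe to skip
        try:
            handle(message)
            processed_ids.add(message["id"])
        except Exception:
            if attempt >= MAX_ATTEMPTS:
                dead_letters.append(message)  # give up and keep it for inspection
            else:
                pending.append((message, attempt + 1))  # requeue for another try
    return processed_ids, dead_letters


stored = []


def handle(message):
    """Hypothetical handler: one message type always fails enrichment."""
    if message["type"] == "Broken":
        raise ValueError("cannot enrich")
    stored.append(message["id"])


processed, dead = consume(
    [
        {"id": "e1", "type": "Entity"},
        {"id": "e1", "type": "Entity"},  # redelivered duplicate
        {"id": "e2", "type": "Broken"},
        {"id": "e3", "type": "Workflow"},
    ],
    handle,
)
```

In this run the duplicate is written once, the failing message is retried up to the budget and then dead-lettered, and the remaining messages are unaffected by the failure.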

The data store follows a Medallion architecture in Azure PostgreSQL Flexible Server. Raw event tables retain the original JSON payload and a complete audit trail, structured tables hold the relational projection, and materialised views provide query-optimised reporting surfaces that BI tools (Power BI, Tableau, custom SQL/API consumers) connect to directly. No proprietary semantic layer, no vendor lock-in on the read side.
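The three layers can be pictured with a small schema. The sketch below uses Python's built-in sqlite3 module purely so that it runs anywhere; the production store is PostgreSQL, where the reporting layer would be a materialised view rather than the plain view used here. All table names, column names, and sample values are hypothetical.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE raw_events (          -- raw layer: original payload, append-only
        event_id    TEXT PRIMARY KEY,
        received_at TEXT NOT NULL,
        payload     TEXT NOT NULL
    );
    CREATE TABLE entities (            -- structured layer: relational projection
        entity_id TEXT PRIMARY KEY,
        name      TEXT NOT NULL,
        risk      TEXT NOT NULL
    );
    CREATE VIEW entities_by_risk AS    -- reporting layer: query-optimised surface
        SELECT risk, COUNT(*) AS entity_count FROM entities GROUP BY risk;
""")


def ingest(event_id: str, received_at: str, payload: dict) -> None:
    """Keep the raw JSON for audit, then overwrite the current-state row."""
    db.execute(
        "INSERT INTO raw_events VALUES (?, ?, ?)",
        (event_id, received_at, json.dumps(payload)),
    )
    db.execute("INSERT OR REPLACE INTO entities VALUES (:id, :name, :risk)", payload)


ingest("evt-1", "2025-01-01T00:00:00Z", {"id": "le-1", "name": "Acme", "risk": "High"})
ingest("evt-2", "2025-01-01T00:01:00Z", {"id": "le-2", "name": "Globex", "risk": "Low"})
ingest("evt-3", "2025-01-01T00:02:00Z", {"id": "le-1", "name": "Acme", "risk": "Low"})

raw_count = db.execute("SELECT COUNT(*) FROM raw_events").fetchone()[0]
report = db.execute(
    "SELECT risk, entity_count FROM entities_by_risk ORDER BY risk"
).fetchall()
```

The raw table keeps all three events as an audit trail, while the projection and the reporting view reflect only the latest state of each entity.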

Ingestion & enrichment

Webhook events are accepted, validated, and immediately published to queues so the source platform's delivery is acknowledged within milliseconds. The handler consumes asynchronously, enriches events through each data domain's Query API, and writes a normalised row alongside the original payload. Polling is symmetrical: the listener tracks watermarks per event type and replays gaps without producing duplicates downstream. The Flows API gives the source platform's workflow engine a direct ingress path at specific points in a given workflow, used when highly granular data is not a requirement.
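The watermark idea behind polling can be shown with a toy event log. This is a minimal sketch under assumptions, not the listener's implementation: the sequence-number field, the in-memory source, and the function names are invented, and a real listener would persist its watermarks and seen ids.

```python
SOURCE = [  # hypothetical event log on the source platform
    {"id": "a", "type": "Entity", "seq": 1},
    {"id": "b", "type": "Entity", "seq": 2},
    {"id": "c", "type": "Workflow", "seq": 3},
    {"id": "d", "type": "Entity", "seq": 4},
]


def fetch_since(event_type: str, since: int) -> list[dict]:
    """Stand-in for the source platform's polling API."""
    return [e for e in SOURCE if e["type"] == event_type and e["seq"] > since]


def poll_once(event_type: str, watermarks: dict, seen_ids: set) -> list[dict]:
    """Fetch events past the watermark, drop ones already seen, then advance the watermark."""
    since = watermarks.get(event_type, 0)
    fetched = sorted(fetch_since(event_type, since), key=lambda e: e["seq"])
    fresh = [e for e in fetched if e["id"] not in seen_ids]
    seen_ids.update(e["id"] for e in fresh)
    if fetched:
        watermarks[event_type] = fetched[-1]["seq"]
    return fresh


watermarks = {"Entity": 1}  # event "a" was polled earlier
seen_ids = {"a", "b"}       # event "b" already arrived by webhook

first = poll_once("Entity", watermarks, seen_ids)
second = poll_once("Entity", watermarks, seen_ids)
```

The first poll fills the gap without re-emitting the event that the webhook already delivered, and the second poll returns nothing because the watermark has moved past everything available.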

Reporting layer

The Advanced Reporting Handler mirrors the source platform's native reporting engine: it syncs the catalogue of report queries, triggers scheduled executions, fetches outputs, and lands them in dedicated tables alongside the event-derived data. Background workers refresh materialised views on configurable cadences and manage retention. The result is a single PostgreSQL schema that combines event-sourced history, current-state projections, and pre-aggregated report outputs, all queryable with standard SQL.

Operations & deployment

Each tenant is a self-contained deployment. Clients fork the repository, populate tenant configuration, and run a fully automated GitHub Actions pipeline that provisions Azure infrastructure, builds and publishes container images, and brings the stack up in roughly fifteen minutes. Health endpoints, dependency probes, and Application Insights telemetry ship by default. The same artefacts run on-premises for clients that cannot deploy to Azure.

260K+ Legal entity AML datasets

3 Financial Institution Clients in production

4 Developers led

More projects coming soon.

02 / Contact

I'm always interested to hear about challenging technical problems and opportunities to work with great teams.

Email: vancollerconrad@gmail.com · LinkedIn: linkedin.com/in/conrad-van-coller