Mar 2026
<aside> 🤖
Note: This article was written by an LLM deeply analyzing git logs and implementation diaries, working strictly under the technical direction and editing of Vu Nguyen.
</aside>
For the past 12 hours, I've been heads-down building layman.vuishere.com, an AI-powered technical research and synthesis platform designed to distill complex engineering domains into digestible, actionable insights. Whether you are a Staff Engineer evaluating SOTA models or a C-Suite executive trying to understand what replacing a cloud service means for your bottom line, the tool dynamically adjusts to your technical altitude.
In this post, I want to walk through the system architecture, how I went from a blank canvas to an orchestrated multi-agent system, the brutal technical hurdles I overcame along the way, and what's next for the project. Grab a coffee: this is going to be a deep dive.
The AI space is noisy. When you ask a generic LLM to give you trade-offs between Claude 4.5 Opus and GPT-5.2 for a specific workload, you get a wall of text. It lacks context about who you are, it lacks factual grounding from live searches, and it certainly doesn't plot the trade-offs on a Pareto frontier.
I wanted to build a "technical co-pilot" that does exactly that:

- **Knows who you are.** Answers adjust to the reader's technical altitude, from Staff Engineer to C-Suite.
- **Grounds itself in live searches** instead of relying on stale training data.
- **Visualizes trade-offs**, e.g. plotting competing models on a Pareto frontier rather than emitting a wall of text.
To build a robust system, I knew I needed a solid boundary between my stateful backend and my presentation layer. I went with FastAPI on the backend and Next.js (App Router) on the frontend. I stuck with standard Tailwind CSS and Shadcn UI components for a premium, dense-data aesthetic (glassmorphism overlays and fluid Framer Motion animations).
The first significant architectural decision was dynamic profiling. Generating good answers requires good priors. I built a flow that generates a sequence of user-specific onboarding questions to pin down their persona. To keep the database light and the context window rich, I opted to persist these profiles using Markdown with YAML Frontmatter. The metadata (e.g., user_type: enterprise) handles our application logic routing, while the nuanced text body gets seamlessly fed into system prompts as context. This small abstraction reduced a lot of complex relational database modeling down to a clean, versionable file system approach.
```mermaid
graph TD
    User([User]) -->|Onboarding & Queries| UI[Next.js App Router]
    UI -->|API Requests| API[FastAPI Backend]

    subgraph Data Persistence
        ProfileDB[(Markdown Profiles)]
        FTS[(SQLite FTS5 Knowledgebase)]
        Cache[(Diskcache)]
    end

    API -->|Reads/Writes| ProfileDB
    API -->|Spawns| Planner[Planner Agent]
    Planner -->|Dispatches| Worker[Worker Agents]
    Worker -->|Searches & Scrapes| External[External Web / Reddit]
    Worker -->|Stores Findings| FTS
    Worker -->|Checks| Cache
    API -.->|SSE Live Logs| UI
    API -->|Aggregates| Synth[Synthesizer Engine]
    Synth -->|Reads| FTS
    Synth -->|Generates JSON| UI
```
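The worker-to-synthesizer handoff in the diagram runs through the SQLite FTS5 knowledgebase: workers write scraped findings in, and the synthesizer pulls them back out by full-text query. A minimal sketch of that layer, again stdlib-only; the table and column names are illustrative, and this assumes a Python build whose bundled SQLite was compiled with FTS5 (true for typical CPython distributions).

```python
# Sketch: the FTS5 knowledgebase shared by workers and the synthesizer.
import sqlite3


def open_kb(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the full-text findings table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS findings "
        "USING fts5(source, content)"
    )
    return conn


def store_finding(conn: sqlite3.Connection, source: str, content: str) -> None:
    """Worker side: persist one scraped finding."""
    conn.execute("INSERT INTO findings VALUES (?, ?)", (source, content))
    conn.commit()


def search(conn: sqlite3.Connection, query: str, limit: int = 5):
    """Synthesizer side: retrieve the most relevant findings.

    bm25() scores are negative, with lower meaning more relevant,
    so ascending order puts the best matches first.
    """
    return conn.execute(
        "SELECT source, content FROM findings "
        "WHERE findings MATCH ? ORDER BY bm25(findings) LIMIT ?",
        (query, limit),
    ).fetchall()
```

This keeps the retrieval path entirely local and dependency-free, which matters when dozens of worker agents are writing findings concurrently during a single research run.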