Vu Nguyen
May 2025
Try the bot here.
What’s Agak-Agak?
It’s a Singlish phrase meaning to estimate or do guesswork. To me, showing a resume is like that: just estimating and guessing. So I decided to make it a bit more fun and interactive with this bot, and essentially named it “guesswork”.
How is the bot built?
Figure: Generalized architecture of Agak-Agak (source)

Core Components & Workflow
1. Document Processing and Chunking
- My resume and supplementary context documents are split into smaller text chunks using a smart splitter that respects natural language boundaries (paragraphs, sentences).
- Each chunk overlaps with the previous one to preserve context across chunk boundaries, improving semantic coherence.
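To make the chunking step concrete, here is a minimal sketch of sentence-aware splitting with overlap. It is not the bot's actual splitter: the regex-based sentence detection, the function names, and the `max_chars` and `overlap_sentences` parameters are all assumptions chosen for illustration.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split on blank lines (paragraphs), then on sentence-ending punctuation.

    This is a naive regex, assumed here for illustration only.
    """
    sentences = []
    for paragraph in re.split(r"\n\s*\n", text):
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            if sentence:
                sentences.append(sentence)
    return sentences


def chunk_text(text: str, max_chars: int = 200, overlap_sentences: int = 1) -> list[str]:
    """Pack whole sentences into chunks of roughly max_chars.

    The last `overlap_sentences` sentences of each chunk are repeated at the
    start of the next one, so context carries across chunk boundaries.
    max_chars is a soft limit: a single long sentence is never cut in half.
    """
    chunks: list[str] = []
    current: list[str] = []
    for sentence in split_sentences(text):
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Keep the tail of the finished chunk as the overlap.
            current = current[-overlap_sentences:] if overlap_sentences > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

With a small `max_chars`, each chunk starts with the last sentence of the previous one, which is the overlap described above.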
2. Embedding and Vector Storage
- Each chunk is transformed into a dense vector embedding using a SentenceTransformer model (all-MiniLM-L6-v2) to capture its semantic meaning.
- These embeddings are stored in ChromaDB, an efficient in-memory vector database optimized for similarity search.
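The embedding and storage step could look roughly like the sketch below. It assumes the `sentence-transformers` and `chromadb` packages are installed; the collection name, the `chunk-N` ID scheme, and the helper function names are hypothetical and not taken from the bot's code.

```python
def make_chunk_ids(count: int) -> list[str]:
    """Generate one stable string ID per chunk (hypothetical naming scheme)."""
    return [f"chunk-{i}" for i in range(count)]


def build_index(chunks: list[str], collection_name: str = "resume_chunks"):
    """Embed the chunks and store them in an in-memory ChromaDB collection."""
    # Imported here so the helpers above can be used without the heavy dependencies.
    import chromadb
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("all-MiniLM-L6-v2")
    embeddings = model.encode(chunks).tolist()  # one dense vector per chunk

    client = chromadb.Client()  # ephemeral client: data lives in memory only
    collection = client.create_collection(name=collection_name)
    collection.add(
        ids=make_chunk_ids(len(chunks)),
        documents=list(chunks),
        embeddings=embeddings,
    )
    return model, collection


def search(model, collection, question: str, n_results: int = 3) -> list[str]:
    """Embed the question with the same model and return the nearest chunks."""
    query_embedding = model.encode([question]).tolist()
    n_results = min(n_results, collection.count())  # avoid asking for more than stored
    result = collection.query(query_embeddings=query_embedding, n_results=n_results)
    return result["documents"][0]
```

Using the same model for chunks and questions is what keeps both in one vector space, so distances between them are meaningful.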
3. Query Processing and Retrieval
- When a user submits a question, the system embeds the query into the same vector space and performs a fast approximate nearest neighbor search in ChromaDB to retrieve the top candidate chunks.
- To improve precision, the system retrieves twice the requested number of results and reranks them with a cross-encoder model (ms-marco-MiniLM-L-6-v2), which considers the query and each candidate chunk jointly to produce a more accurate relevance score.
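The retrieve-then-rerank pattern can be sketched independently of any particular model. In the sketch below, `retrieve` and `score` are injected callables: in a setup like the one described, `retrieve` would wrap the ChromaDB query and `score` would wrap the cross-encoder. The function names, the `oversample` parameter, and the toy word-overlap scorer are assumptions for illustration only.

```python
from typing import Callable


def retrieve_and_rerank(
    query: str,
    retrieve: Callable[[str, int], list[str]],
    score: Callable[[str, str], float],
    top_k: int = 3,
    oversample: int = 2,
) -> list[str]:
    """Fetch top_k * oversample candidates, rescore each one, keep the best top_k."""
    candidates = retrieve(query, top_k * oversample)
    scored = [(score(query, chunk), chunk) for chunk in candidates]
    # Highest score first; Python's sort is stable, so ties keep retrieval order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


def word_overlap_score(query: str, chunk: str) -> float:
    """Toy stand-in for a cross-encoder: counts words shared by query and chunk."""
    return float(len(set(query.lower().split()) & set(chunk.lower().split())))
```

Oversampling gives the slower but more precise scorer a wider pool to pick from, so a relevant chunk that the vector search ranked just outside the requested cutoff can still make the final list.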
4. Answer Generation with LLM