
From scattered onboarding emails to a single chatbot

March 15th, 2024
GEN AI

When new colleagues join Arinti, onboarding information used to be scattered across emails and documents. With Large Language Models now available, we built a chatbot that answers employee questions instantly — powered by our own Notion knowledge base.

How it works — three steps

1. Document ingestion: all Notion onboarding content is converted into numerical vectors using OpenAI embedding models. The content is first split into chunks with LangChain's markdown splitter, and the resulting vectors are stored in a vector database.

2. Query: the user's question is converted into a vector with the same embedding model, then matched against the vector database via similarity search. The most relevant content chunks are passed, together with the question, to OpenAI's GPT-3.5 Turbo, which formulates an answer based on a structured prompt.

3. Memory: the chatbot tracks conversation history, combining previous messages with each new question into a standalone query.
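The three steps above can be sketched in a few lines. This is a toy illustration, not the actual implementation: a bag-of-words counter stands in for the OpenAI embedding model, a plain Python list stands in for the vector database, and the memory step simply concatenates history instead of asking the LLM to rewrite the question. All names here are illustrative.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy "embedding": a bag-of-words vector. The real pipeline calls an
    # OpenAI embedding model here.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    dot = sum(a[word] * b[word] for word in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Step 1, ingestion: split content into chunks and store (vector, chunk)
# pairs. A real system would persist these in a vector database.
chunks = [
    "Request your laptop from IT during your first week.",
    "Holiday requests go through the HR portal.",
    "The office wifi password is posted in the kitchen.",
]
index = [(embed(chunk), chunk) for chunk in chunks]

# Step 2, query: embed the question and rank chunks by similarity. The
# top chunks plus the question would then go into the GPT-3.5 prompt.
def retrieve(question, k=1):
    q = embed(question)
    ranked = sorted(index, key=lambda entry: cosine(entry[0], q), reverse=True)
    return [chunk for _, chunk in ranked[:k]]

# Step 3, memory: fold prior turns into one standalone query. The real
# chatbot asks the LLM to rewrite the question; this just concatenates.
def standalone_query(history, question):
    return " ".join(history + [question])
```

Calling `retrieve("Where do I find the wifi password?")` returns the wifi chunk, which is the context the LLM would answer from.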

Breaking it down

Automation and deployment

Azure Functions automatically fetch new Notion content on a daily basis, keeping the chatbot's knowledge up to date. The front-end is built with Streamlit and embedded directly into our Notion workspace — so employees access it without switching tools. The same architecture works for any knowledge base: FAQ pages, project documentation, or policy documents. As long as the content is stored somewhere, it can serve as the foundation for a chatbot.
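The daily refresh boils down to a "what changed since the last run?" check. The sketch below shows that logic in isolation, assuming each Notion page exposes a last-edited timestamp (the Notion API does report one, but the field names, the incremental strategy, and the omitted fetch/re-embedding calls are illustrative, not the actual Azure Function):

```python
from datetime import datetime, timezone

# Sketch of the check a timer-triggered Azure Function could run daily.
# "pages" stands in for the Notion API response, reduced here to
# (page_id, last_edited_time) pairs. Pages edited since the previous run
# are the ones that need re-chunking and re-embedding.
def pages_to_reindex(pages, last_run):
    return [page_id for page_id, edited in pages if edited > last_run]

pages = [
    ("onboarding-checklist", datetime(2024, 3, 14, 9, 0, tzinfo=timezone.utc)),
    ("expense-policy", datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)),
]
last_run = datetime(2024, 3, 10, tzinfo=timezone.utc)
changed = pages_to_reindex(pages, last_run)  # only the recently edited page
```

Re-embedding only the changed pages keeps the daily job cheap, since embedding calls are billed per token.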

Interested in building something similar?

LET'S TALK