Hello everyone! Today we are going to talk about a little open source tool that will save you a ton of time when it comes to chatting with your documents.
Its name? Kotaemon. No, this is not a new Pokémon, but rather your future best friend when it comes to RAG (short for Retrieval-Augmented Generation).
So what is RAG? Imagine a virtual assistant capable of digging through a mountain of documents to get you the information you need, all while chatting with you as if nothing had happened, through a very clean interface. It is ideal for chatting with your documents, whether you are a simple user who wants to ask questions about their files, or a developer who wants to get their hands dirty and build their own RAG pipeline.
Let's start at the beginning: installation. Nothing could be simpler, just run this command in your terminal:
docker run -e GRADIO_SERVER_NAME=0.0.0.0 -e GRADIO_SERVER_PORT=7860 -p 7860:7860 -it --rm taprosoft/kotaemon:v1.0
And presto, go to http://localhost:7860/ to access the web interface. The default account is admin/admin, but you can create other users directly from the interface.
Now, let's talk a little about its awesome features:
1. Multi-user: Kotaemon supports multiple user logins. Practical for working in a team or sharing your favorite document collections with your colleagues.
2. Wide choice of LLMs: Whether you are on team OpenAI or team Azure, or prefer open source models like Llama, Kotaemon adapts to your wishes. It even supports local models via Ollama or llama-cpp-python.
3. Hybrid RAG pipeline: Kotaemon uses a mix of full-text and vector search to find the most relevant information in your documents.
4. Multi-modal support: Text, images, tables… Kotaemon handles it all like a boss. It's perfect for your reports full of incomprehensible graphs.
5. Advanced citations: No more answers coming out of nowhere. Kotaemon tells you exactly where its information comes from, with a little highlight in the original document. Useful for checking that your assistant isn't bullshitting you.
6. Complex reasoning: For tricky questions that require combining several pieces of information, Kotaemon can break the problem down into sub-questions. Thanks, Sherlock!
7. Configurable interface: You can tweak a lot of parameters directly from the interface, without having to dive into the code. In short, ideal for those allergic to the terminal.
8. Extensibility: For devs who like to tinker, Kotaemon is based on Gradio. This means that you can add your own interface elements or customize the processing pipeline as you see fit.
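To make the hybrid search idea from point 3 a bit more concrete, here is a tiny Python sketch. Careful: this is not Kotaemon's actual code. It assumes a naive keyword match for the full-text side, a bag of character trigrams as a toy stand-in for real embeddings, and reciprocal rank fusion to merge the two rankings. All function names are made up for the illustration.

```python
import math
from collections import Counter


def keyword_score(query, doc):
    """Full-text side: count how many distinct query words appear in the document."""
    doc_terms = set(doc.lower().split())
    return sum(1 for term in set(query.lower().split()) if term in doc_terms)


def char_trigrams(text):
    """Toy stand-in for an embedding model: a bag of character trigrams."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(scores):
    """Return document indices sorted from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def hybrid_search(query, docs, k=60):
    """Merge the keyword ranking and the vector ranking with reciprocal rank fusion."""
    keyword_ranking = rank([keyword_score(query, d) for d in docs])
    query_vector = char_trigrams(query)
    vector_ranking = rank([cosine(query_vector, char_trigrams(d)) for d in docs])

    fused = Counter()
    for ranking in (keyword_ranking, vector_ranking):
        for position, doc_index in enumerate(ranking):
            # A document ranked high in either list gets a bigger share.
            fused[doc_index] += 1.0 / (k + position + 1)
    return [docs[i] for i, _ in fused.most_common()]


docs = [
    "The invoice is due in March",
    "Kotaemon supports hybrid retrieval",
    "Cats sleep most of the day",
]
print(hybrid_search("hybrid retrieval", docs)[0])
```

The point of the fusion step is that a document only needs to do well in one of the two searches to surface, which is why mixing exact-word matching with semantic similarity tends to be more robust than either one alone.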
Now, if you really want to push the envelope, here are some tips to get the most out of Kotaemon:
1. Optimize your documents: The better structured your docs are, the more effective Kotaemon will be. Remember to use clear headings and bulleted lists, and to format your tables neatly.
2. Play with the settings: Don't hesitate to tinker with the retrieval and generation settings. Sometimes a small adjustment can make a big difference in the quality of the answers.
3. Combine models: Try different combinations of embedding models and LLMs to find the perfect duo for your needs.
4. Use agents: For complex tasks, agents like ReAct or ReWOO can really make a difference.
5. Customize the prompts: The default prompts are fine, but by tailoring them to your specific domain you can get even more relevant answers.
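To give you an idea of what tailoring a prompt to your domain (tip 5) can look like, here is a small Python sketch. The template text and the build_prompt helper are purely hypothetical: they are not Kotaemon's default prompt nor its API, just an example of the kind of wording you could try.

```python
from string import Template

# Hypothetical domain-specific template, NOT Kotaemon's built-in prompt.
DOMAIN_PROMPT = Template(
    "You are an assistant specialised in $domain.\n"
    "Answer using only the numbered excerpts below and cite them like [1].\n"
    "If the excerpts do not contain the answer, say so.\n\n"
    "Excerpts:\n$context\n\n"
    "Question: $question\n"
)


def build_prompt(domain, question, excerpts):
    """Fill the template with the domain, the retrieved excerpts and the question."""
    context = "\n".join(
        f"[{number}] {text}" for number, text in enumerate(excerpts, start=1)
    )
    return DOMAIN_PROMPT.substitute(domain=domain, context=context, question=question)


print(
    build_prompt(
        "contract law",
        "When is payment due?",
        ["Payment is due within 30 days.", "Late fees apply after 45 days."],
    )
)
```

Naming the domain and asking for numbered citations are the two levers here: the first nudges the vocabulary of the answer, the second makes it easier to check each claim against its source.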
And for devs who would like to push it even further, know that you can easily add your own reasoning or indexing pipelines. The project even provides an example GraphRAG pipeline to give you ideas.
Well, I can already hear you: “But Korben, isn’t it a bit overkill just to ask my docs questions?” Well, think again! Imagine: you are working on a huge project, with hundreds of pages of specs, reports, and various notes. Instead of spending hours sifting through everything to find one specific piece of information, you throw a question at Kotaemon and boom, you have your answer in a few seconds, with the exact sources. That, my friends, is called a productivity boost!
And the best part is that it's open source.
Go take a look at the project's GitHub repo and start playing around with it.