Knowledge base advice
# help
s
Hi all, wondering if anyone has advice on how to structure their document knowledge bases? I'm finding that my chatbot cannot answer questions when there is a slight variation in the wording used. It's very frustrating. How do I build the best possible KB that can also answer questions when they are asked slightly differently to normal?
g
@salmon-chef-40227 I'm not an expert at this, but I think there are many videos on YouTube explaining how to make a great knowledge base.
f
Compress information through summarization or key point extraction.
Remove text and documents that are irrelevant to your specific task. Reformat indexed data to match the formats your end users expect.
10 Ways to Improve the Performance of Retrieval Augmented Generation
1. Clean your data.
RAG connects the capabilities of an LLM to your data. If your data is confusing, in substance or layout, then your system will suffer. If you’re using data with conflicting or redundant information, your retrieval will struggle to find the right context. And when it does, the generation step performed by the LLM may be suboptimal.
Say you’re building a chatbot for your startup’s help docs and you find it is not working well. The first thing you should take a look at is the data you are feeding into the system. Are topics broken out logically? Are topics covered in one place or many separate places? If you, as a human, can’t easily tell which document you would need to look at to answer common queries, your retrieval system won’t be able to either.
This process can be as simple as manually combining documents on the same topic, but you can take it further. One of the more creative approaches I’ve seen is to use the LLM to create summaries of all the documents provided as context. The retrieval step can then first run a search over these summaries, and dive into the details only when necessary. Some frameworks even have this as a built-in abstraction.
There are a lot of great articles about this subject. There are also YouTube videos about this, as @glamorous-guitar-39983 said.
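To make the summary-first idea from the quoted article a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `DOCS` contents are made up, the summaries are hand-written rather than LLM-generated, and plain word-count cosine similarity stands in for whatever embedding search a real RAG setup would use.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical knowledge base: one short summary plus detailed chunks per topic.
DOCS = {
    "gym": {
        "summary": "gym fitness centre location opening hours equipment",
        "chunks": [
            "The gym is located on the second floor next to the pool.",
            "The gym is open from 6am to 10pm every day.",
        ],
    },
    "parking": {
        "summary": "parking car park fees location spaces",
        "chunks": [
            "Parking costs 10 dollars per day.",
            "The car park is located behind the main building.",
        ],
    },
}


def retrieve(query):
    """Pick the best topic by summary first, then the best chunk inside it."""
    q = Counter(tokenize(query))
    # Stage 1: search only over the short summaries.
    best_doc = max(
        DOCS, key=lambda name: cosine(q, Counter(tokenize(DOCS[name]["summary"])))
    )
    # Stage 2: dive into the details of the chosen topic only.
    best_chunk = max(
        DOCS[best_doc]["chunks"], key=lambda c: cosine(q, Counter(tokenize(c)))
    )
    return best_doc, best_chunk


if __name__ == "__main__":
    print(retrieve("Where is the gym?"))
```

Word counting like this is easily thrown off by common words such as "the", so treat it only as a way to see the two-stage shape, not as a retrieval method to rely on.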
s
Thank you both. I've been searching YouTube for quite a long time but I haven't been able to find a single video that specifically demonstrates how to structure a Word doc knowledge base. For example, should I include the question, and then the answer? Should I include 10 variations of the question and then the answer? Should I only include the answers themselves? I can't find anywhere that clearly states the best practices here. Any advice on this specifically? For example, my chatbot accurately answers "Do you have a gym?" and it answers it by also saying where it is. However, if I ask it "Where is the gym?" it just does not answer the question. I've been searching the internet for a long time and I just can't seem to figure out why. Any ideas?
f
I would start out by just testing different ideas. See what works best for you.
And I think just watching videos about RAG and learning how it works is enough.
s
Yep, been watching a lot of videos on RAG, thanks for what you sent through. So by testing different ideas, does this mean there are no best-practice ways to structure the knowledge base?
f
There is no one solution that is best for all cases. It depends on how your data is structured. I would recommend creating an FAQ with all of the questions it can't answer.
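One way to try the FAQ idea together with the "several variations of the question" idea from earlier in the thread is to store multiple phrasings per answer. The sketch below is only an illustration under stated assumptions: the `FAQ` entries and answers are invented, `answer` and its `threshold` parameter are hypothetical names, and Jaccard word overlap is a simple stand-in for the matching a real chatbot platform does.

```python
import re

# Hypothetical FAQ: each entry groups several phrasings under one answer.
FAQ = [
    {
        "questions": [
            "Do you have a gym?",
            "Where is the gym?",
            "Is there a fitness centre?",
        ],
        "answer": "Yes, we have a gym. It is on the second floor.",
    },
    {
        "questions": [
            "Do you have parking?",
            "Where can I park my car?",
        ],
        "answer": "Yes, parking is available behind the main building.",
    },
]


def tokens(text):
    """Return the set of lowercase alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Share of words the two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def answer(query, threshold=0.3):
    """Return the answer of the closest stored phrasing, or None if too far."""
    q = tokens(query)
    best_score, best_answer = 0.0, None
    for entry in FAQ:
        for phrasing in entry["questions"]:
            score = jaccard(q, tokens(phrasing))
            if score > best_score:
                best_score, best_answer = score, entry["answer"]
    return best_answer if best_score >= threshold else None


if __name__ == "__main__":
    print(answer("Where's the gym located"))
    print(answer("What time is breakfast?"))
```

Because both gym phrasings point at the same answer, a query worded like either one lands on it, and anything below the threshold returns `None` so it can be logged as a question the FAQ does not cover yet.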
g
If your knowledge base is a mess, you can also ask ChatGPT to organize it.