https://discord.gg/botpress
#🤝help
Knowledge base structure

cool-gold-3015

07/22/2023, 2:30 PM
When Botpress reads from a PDF or Word document in the knowledge base, does it only read the text and ignore font sizes, paragraph spacing, bold/underlining, and things like that? Knowing this would help me with building the knowledge base. Thank you!

delightful-wall-23933

07/22/2023, 3:43 PM
Good question! I would like to know this too.

cool-gold-3015

07/22/2023, 5:51 PM
@early-train-33247 can you help me with this one, please?

early-train-33247

07/22/2023, 5:53 PM
I don't think formatting matters, just the proximity of information.
Let's ask @acceptable-kangaroo-64719, he might know better

cool-gold-3015

07/22/2023, 6:27 PM
Do you know specifically how the bot actually fishes out information, based on the user's question?

early-train-33247

07/22/2023, 7:12 PM
Unfortunately not, let's please wait for Gordy to respond!

cool-gold-3015

07/22/2023, 7:43 PM
alright thank you!
@acceptable-kangaroo-64719
would really like to know this one

acceptable-kangaroo-64719

07/24/2023, 10:21 AM
Formatting matters a little bit. Things that help (for documents) are:
* Ensuring your PDF is text-based and not an image of text
* Using consistent fonts and text color
* Not relying heavily on images or LaTeX formatting

For the actual information, Guilhermy is right that proximity is the only thing that matters. I've seen successful knowledge bases that use any of these formats:

Labeled Question-Answer
Question: Where do pandas live?
Answer: They originate from the mountains of China near Chengdu

Unlabeled Question-Answer
What color are pandas?
They are black and white

List of facts
Pandas eat bamboo
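If it helps, here is a rough sketch of how the labeled Question-Answer format could be generated from a list of pairs before pasting it into a document. This is plain Python and not a Botpress API; the function name `format_labeled_qa` and the variable names are made up for illustration. The only idea it encodes is the one above: keep each answer right next to its question.

```python
from typing import Iterable, Tuple


def format_labeled_qa(pairs: Iterable[Tuple[str, str]]) -> str:
    """Render (question, answer) pairs as labeled Q&A blocks.

    Hypothetical helper for preparing knowledge base text;
    it does not call Botpress in any way.
    """
    blocks = []
    for question, answer in pairs:
        # Keep each answer directly under its question so related
        # information stays close together in the document.
        blocks.append(
            f"Question: {question.strip()}\nAnswer: {answer.strip()}"
        )
    # A blank line separates one pair from the next.
    return "\n\n".join(blocks)


pairs = [
    ("Where do pandas live?",
     "They originate from the mountains of China near Chengdu"),
    ("What color are pandas?", "They are black and white"),
]
kb_text = format_labeled_qa(pairs)
print(kb_text)
```

The resulting text can be saved as a plain text file or pasted into a text-based PDF or Word document. Whether labeled pairs work better than the unlabeled or fact-list formats for a given bot is something to test rather than assume.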

cool-gold-3015

07/24/2023, 10:27 AM
Great, thanks for helping