Website is not getting indexed correctly
# 🤝help
p
Hey, I am working on a chatbot and I wanted to add a website to the KB, but somehow it is not taking the content from the website itself but from some popups you can reach from the header. After investigating the website a bit more, I found out that some genius put everything, including the header, into the body. But now I am asking myself: is it possible to index only the relevant content on the page? Like selecting a single div on the page to index or something similar? Thanks 🙂
b
currently you can't get that granular on the indexing
an alternative workaround would be to just save those pages to a document, like a PDF or plain text, and upload that
since indexing a site only takes a snapshot of your website at a moment in time, this should result in the same KB
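For anyone who wants to script that workaround, here is a minimal sketch using only the Python standard library. It pulls the text out of a single element and leaves the rest of the page behind. It assumes the relevant container has an `id` attribute; the names `ElementTextExtractor` and `extract_text` are made up for illustration, and none of this is a Botpress feature.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ElementTextExtractor(HTMLParser):
    """Collect the text inside the first element whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # nesting depth inside the target (0 = outside)
        self.done = False   # True once the target element has been closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        # Keep only non-empty text found inside the target element.
        if self.depth > 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html, target_id):
    """Return the text of the element with the given id, one chunk per line."""
    parser = ElementTextExtractor(target_id)
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

The returned string can then be saved with something like `Path("page.txt").write_text(text, encoding="utf-8")` and uploaded to the KB as a plain-text document. Note that this sketch does not skip `<script>` or `<style>` contents inside the chosen element, and it relies on the page's tags being properly nested.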
p
also thought about this solution, kind of unsatisfying but as you mention, results are the same.. Thanks 🙂
b
it's also worth noting that if we're having trouble indexing your content, so will search engines
so if you run a business, it's probably in your best interest to fix this... 😛
p
it's not my business, I am building for a client😂 but yeah, probably I should mention this...
b
hahaha
yeah gotcha
p
@bumpy-butcher-41910 could I contact you via DM? I have a quick question about this and don't want to share the client's website here..
b
I can't offer support via DMs, sorry!
p
okay... can you explain why Botpress is only indexing a part of the website? I found out that it is not even looking at the other parts of the site but only at a specific subpart, even though the link shows every part of the website
Only looking at this part which is somewhere inside the body tag
b
do you know if the part of the website not being indexed by botpress is being indexed by a search engine, for example?
p
Yes it is
b
hmm
I'm not sure I have a reliable answer to your question, then
p
okay...
I will open a new post, maybe someone knows a fix
b
yup! you might find it helpful to include some screenshots of what's happening when you try to index the site too
p
@bumpy-butcher-41910 do you know which crawler Botpress is using? What I found is that the website I tried to index has a robots.txt that looks like this: https://www.contiss.de/robots.txt Is it possible that this website is blocking the Botpress crawler? And that it is only indexing the popup because the popup is under a special URL: https://www.contiss.de/#preisrechner
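Both guesses above can be checked locally. Here is a rough sketch with Python's standard library: `urllib.robotparser` evaluates robots.txt rules for a given user agent, and `urllib.parse.urldefrag` splits off the `#preisrechner` part. The rules in `SAMPLE_ROBOTS` and the user agent `ExampleBot` are placeholders, not the real file or the real crawler name (the crawler Botpress uses is not stated in this thread), so paste in the actual robots.txt text to test it.

```python
from urllib.parse import urldefrag
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder rules for illustration only, NOT the real contents of
# https://www.contiss.de/robots.txt
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

# The part after '#' is a fragment. It is not sent to the server in an
# HTTP request, so a plain fetch of this URL requests the same page as
# https://www.contiss.de/
page_url, fragment = urldefrag("https://www.contiss.de/#preisrechner")
# page_url == "https://www.contiss.de/", fragment == "preisrechner"

homepage_allowed = is_allowed(SAMPLE_ROBOTS, "ExampleBot", page_url)
```

If `is_allowed` returns `True` for the real rules and the crawler's real user agent, robots.txt is probably not the cause. As for the fragment: a crawler that only downloads HTML gets the same document with or without `#preisrechner`, while a crawler that renders JavaScript might open the popup because of it. Which of the two applies here is not confirmed anywhere in this thread.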