website
# 🤝help
s
Knowledge Base and linking to a very large news source. Can someone please explain what my Knowledge Base indexes when I link to a news site which has 44,000 stories? It seems to have indexed some of the 2023 news stories (69 vectors). Is it indexing just the headlines or clicking through to the entire story? Any guidance appreciated, including whether it is worth linking to these large news sources. Should the KB be used for more targeted and precise information (reports etc.) rather than broad news websites? Website KB source: https://www.healthclubmanagement.co.uk/health-club-management-news

https://cdn.discordapp.com/attachments/1140955966259597372/1140955966658064465/Screenshot_2023-08-15_at_11.24.49.png

a
Hey @some-greece-38740, happy to shed some light for you:
* Unless you're using the sitemap crawl, only that specific page will be crawled and vectorized.
* If you want to search all 44,000 stories in your KB, consider using the web search source instead. It runs a search scoped to that site on Google or Bing to find the relevant article, instead of ingesting the whole site into your knowledge base.
* The best KB source depends on your data. Big websites like this are great for web search because search engines have already indexed them. Custom information like the bot's name is better suited to a plain text source.
s
Thanks @acceptable-kangaroo-64719, this is really helpful and gives me a better sense of how to optimise the KB.
@gordy Can I ask you why website sub pages are not a valid web search? I thought a more specific page would be more effective than the home page as you are directing it to the most relevant part of the website.

https://cdn.discordapp.com/attachments/1140955966259597372/1141022847171579935/Screenshot_2023-08-15_at_15.52.24.png

a
Yes, because search engines can only be scoped to a domain, not a page. You can use the web page source if you want a specific page.
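
To make the domain-versus-page point concrete, here is a minimal sketch of what domain scoping could look like. It assumes the web search source behaves roughly like a `site:` query on Google or Bing, which this thread does not confirm; `domain_scope` and `scoped_query` are hypothetical helper names, not part of any product API.

```python
from urllib.parse import urlparse


def domain_scope(url: str) -> str:
    """Return only the host part of a URL, dropping any path."""
    host = urlparse(url).hostname
    if host is None:
        raise ValueError(f"not an absolute URL: {url!r}")
    return host


def scoped_query(url: str, question: str) -> str:
    # Hypothetical: assumes the search is scoped with the `site:` operator.
    return f"site:{domain_scope(url)} {question}"


# The sub-page path is discarded, so only the domain ends up in the scope.
page = "https://www.healthclubmanagement.co.uk/health-club-management-news"
print(scoped_query(page, "new gym openings 2023"))
```

Under that assumption, pointing the source at a sub-page adds nothing over the home page, since both reduce to the same domain.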