Bot is suddenly using way more tokens
# 🤝help
m
My bot suddenly started using way more tokens (about 4 times more). I feel like it started when I checked the gpt-3.5 option in the KB agent configuration. I even have a backup of my bot (in the form of another bot): I asked it the exact same question, then activated gpt-3.5, and it started to cost 4 times more. The problem is you can't uncheck the option.
c
Hey there!
Could you please send a screenshot of what you've toggled on?
m
Hi, I'll send it ASAP, but it's in the knowledge agent configuration window. I had 3 options (gpt-3.5, hybrid, gpt-4); nothing was checked at first and the token usage was still normal, but once I checked gpt-3.5 it quadrupled, and I can't uncheck it (I even tried to restore the default knowledge agent settings). I also have a screenshot of the logs for the same question asked right before and right after the change, where you can see the increase in tokens used.
b
there's no option for "nothing to be checked" - by default your bots will be set to hybrid
either way, your bot will only ever use 3.5 or 4 for your AI actions
it would be helpful to know what you define as "normal" vs. "quadrupled", and if you can provide any information from the logs about which requests are using more tokens than expected
m
The same question with the same answer used ~4k tokens when nothing was checked, then 16k+ once gpt-3.5 was checked. I'll send you a screenshot ASAP
b
it's not just the question that's relevant - part of what counts toward token usage is the summary of the conversation, i.e. if you ask the same question at the end of a very long conversation it will use more tokens, because your bot uses the rest of the conversation as context to provide a better answer
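To illustrate the point above: the cost of a request grows with the conversation history sent along as context. This is a rough sketch only (not the platform's actual accounting), using the common ~4-characters-per-token heuristic; real tokenizers count differently.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token."""
    return max(1, len(text) // 4)

def request_tokens(question: str, history: list[str]) -> int:
    """Tokens for the question plus all prior turns sent as context."""
    return estimate_tokens(question) + sum(estimate_tokens(t) for t in history)

short = request_tokens("What does it weigh?", [])
long = request_tokens("What does it weigh?", ["an earlier turn in the chat"] * 50)
assert long > short  # same question, more context, more tokens
```

So the same question can legitimately cost several times more at the end of a long conversation than at the start of a fresh one.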
m
on an old bot, no option is checked by default - the hybrid option is indeed checked on newly created bots, but not on bots created before this feature.
the same small question on this old bot cost 3.7k tokens
on my main bot
12.5k tokens
only kb agent enabled
no history of conversation
here is an example of the price before
same question after :
i can't get back to previous token usage
b
hm, interesting, thanks for this detailed information
let me look into this
but just to confirm - there isn't really a bot setting that will use "less" tokens than what you've shown here - what I mean is, even though an old bot would've had that option unchecked, they're still using 3.5-turbo behind the scenes
so something else is causing this increased token usage
m
yep, i just wanted to give you the scenario/timeline, it's probably something else but i can't figure it out.
b
yeah this information is really helpful!
m
Is the number/size of my KBs impacting the number of tokens used? To reproduce on my old bot, I only imported 1 text file, compared to my main bot. I don't think so, because I had normal token usage last week and at the beginning of the afternoon.
and i'm only at 2.8% KB usage on the main bot
b
that was going to be my next question - we do also pass the contents of a KB through a query to make sure the information is retrieved properly
however it sounds like your KB is small
m
i'm going to export my bot, reimport it into another bot, and remove all KBs except the one i tried on the new bot
when doing that i have the same token usage as before
i don't understand, because i had normal token usage with the same number of KBs
b
the same # of kbs or the same size?
m
same size
b
hmm
m
i don't know what to think 😄 because if token usage increases that much at just 2.8% KB usage, and costs between 1 and 2 cents for a 5-word question and 5-word answer (with only the KB agent enabled), i'm not sure how much i'll pay at 50% usage, for example 😮 on the other hand i'm quite skeptical about the problem coming from this, since it's quite sudden.
hey, looking at the logs it seems like even a simple question, such as asking the weight of a product (answered in only one KB), is pulling in a lot of KBs.
Hello @bumpy-butcher-41910, I just saw these 2 posts that seem close to my problem, both from this weekend: https://discord.com/channels/1108396290624213082/1203735997616226404 https://discord.com/channels/1108396290624213082/1203388139435986974 do you have any more information on that matter?
g
+
b
hi all, this was a visual bug in the dashboard, but as you can see your per-bot usage is accurate. we should have fixed the bug that was causing the usage to be incorrectly displayed in the aggregate dashboard
m
hi, the total usage indeed decreased on the dashboard, but the token usage is still problematic when i query KBs - it looks like it's including random KB content when creating the context before sending it to the LLM
like here for example, i ask if it's possible to pay in instalments - the answer is in one KB, as a website page, but it includes random stuff from KBs talking about products
here it is in english. The KB content on the screen translates to (DeepL): Depending on your goals, you can add it to your cream every day, or once or twice a week. Do not use the booster alone. For best results, focus on the areas most affected, and reapply if necessary. Why destock this Cellublue product? We're giving you the chance to get Cellublue products at an incredible price. The packaging is from an old collection, which is why we can no longer sell them, but we're taking advantage of this clearance offer to make sure you don't waste this stock of products. So go for it! That chunk has a 0.72 score, and i don't know why it's used to answer my question about paying in instalments - it has literally nothing to do with it.
Hi @bumpy-butcher-41910, can you enlighten me on that?
Hey, can i have an answer on my last example?
b
can you clarify how the answer provided in this screenshot isn't satisfactory?
m
i'm not saying it's not satisfactory, i'm saying it's using and sending chunks of text completely unrelated to my question to the LLM - i put an example in my last screenshot
the answer to my question is literally in one single place, but it uses chunks of information about how to use a product, for example.
b
it's not completely unrelated - what I suspect is happening here is that the concept of installments is semantically close to the concept of regularly applying something
which is how LLMs search through a vector database to find information
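To make the explanation above concrete, here is a toy sketch of similarity-based retrieval. The vectors are made-up stand-ins for real embeddings; the point is only that a tangentially related chunk ("apply once or twice a week") can still score fairly high against a query about instalments, because both involve recurring/spread-out actions.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

query = [0.9, 0.3, 0.1]  # toy embedding for "can I pay in instalments?"
chunks = {
    "payment page":       [0.8, 0.4, 0.2],  # directly relevant
    "apply once a week":  [0.6, 0.5, 0.5],  # tangential, but still scores
}
scores = {name: cosine(query, vec) for name, vec in chunks.items()}
# The relevant chunk scores highest, but the tangential one is not far
# behind - without a score cutoff, both get sent to the LLM as context.
```

This is why a chunk about applying a cream regularly can end up in the context for a question about paying in instalments.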
m
would you say it deserves a 0.72 score?
b
that's not really for me to decide
m
i know ofc
b
but to clarify - this is working as expected because these kinds of queries will work with a semantic search on your KB and then cross-reference all of those chunks with what it thinks the answer to the question should be
m
CRYO GELS selling tips: Home Cryotherapy Experience: With the activating mist, enjoy the refreshing, slimming benefits of cryotherapy right in your own home. Targeting Stubborn Areas: With 1800 rotations per minute, the device works deep down to reduce stubborn fat, whether on thighs, buttocks, stomach or arms. Sculpting & Firming: Combined with the activating mist, the device stimulates the skin, making it more toned and sculpted. Refreshing sensation: The combined use of the activating mist and the device provides a pleasant sensation of freshness, stimulating fat elimination. Nature's advanced technology: Although equipped with cutting-edge technology, this device harmonizes perfectly with natural solutions such as the activating mist.
another chunk included for the same question
yes i get that
is there a way to set a minimum score or something?
because i see that the chunks used are ordered by score descending, and the first ones are indeed super relevant
but the majority after that really isn't
b
yeah, there isn't currently, but this would prevent those less-relevant chunks from being sent to the LLM
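The thread notes no such setting exists in the product; the following is a hypothetical client-side sketch of the idea being discussed: drop retrieved chunks below a minimum similarity score before they are added to the LLM context, trading a little recall for fewer tokens. The `MIN_SCORE` value is an assumed cutoff, not a recommendation.

```python
MIN_SCORE = 0.75  # assumed cutoff; would need tuning against real retrieval logs

# (text, similarity score) pairs as a retriever might return them, sorted descending
chunks = [
    ("pay in 3 instalments via the checkout page", 0.91),
    ("shipping weight is 250 g", 0.88),
    ("apply the booster once or twice a week", 0.72),  # tangential hit
]

kept = [(text, score) for text, score in chunks if score >= MIN_SCORE]
dropped = len(chunks) - len(kept)
# Only the chunks at or above the cutoff would be sent to the LLM,
# which is what shrinks the context (and token count) per request.
```

Whether this meaningfully reduces cost depends on how much of each request the chunk text actually accounts for, which is the caveat raised below.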
m
that would be a great option
b
however I think the assumption here is that if you send fewer chunks the request will use fewer tokens - in theory this is true, but we'd have to determine whether that would translate into a meaningful reduction in token usage
and it's worth noting here that the majority of token usage on LLM operations depends on the complexity of the operation itself and not necessarily the size of the database
but I agree it might be useful to set a confidence interval for KB queries
equally - it's also worth mentioning here that AI Tasks support temperature intervals, which is adjacent to this
m
can you clarify your last sentence ?
i know what temperature is but how would that help me ?
i mean the workflow
b
it wouldn't help for this specific task, but I thought you might find it useful!
m
oh okay
should i open a feature-request ?
b
please do!
and include all the context you've added here, that's really helpful
m
thanks a lot
Hello @bumpy-butcher-41910, just so you know, i just logged back into my bot and noticed the cost of the exact same questions/answers, with the same token usage as before, is back to what it was (divided by 2). do you know if something happened? thanks
(might not be recent, i hadn't tried my bot in a month)
c
Hey @many-evening-95878, we did adjust our pricing according to OpenAI's new pricing.
m
Oh okay thanks