Cost not proportional to token use in some cases?
# 🤝help
w
I’m currently testing a bot that incorporates a Knowledge Base. Aside from the single KB query, the bot has only one AI task (GPT-3.5), which typically uses fewer than 1000 tokens per run. On runs where an answer is found in the first KB source (which is a web source), token consumption is around ten thousand tokens, with a cost of about half a cent ($0.005). However, in some cases, including when no answer is found in the KB, token use is roughly double at about twenty thousand, but the cost increases by a factor of ten or more, typically around ten cents ($0.10). I don’t see anything explicit in the logs or docs that explains this. I’d like to understand what is causing cost to be so far out of proportion with token use. All guidance much appreciated!
b
in the event debugger, you're able to see exactly which actions consume how many tokens
as well as their associated cost
ideally this should break it down
if it doesn't, can you provide some examples of these logs where the cost is increased disproportionately to token usage?
w
@bumpy-butcher-41910 thanks for the reply! Taking the case above where ~20k tokens are reported in the log but the ~$0.10 cost looks disproportionate: in the Event Debugger pane, the only entry that shows a token count is the AI task, which reports
Input Tokens: 820
The KB query does not show a token count at all. In the logs, the knowledgeAgent action has two entries,
Generating answer based on a 41 results
and
no helpful answer generated by KB
but it does not record how many tokens were used in its operation. The only other entry in the run that gives a token count is the final entry,
Billed tokens: 20,806 | Total tokens: 20,806 | Cost: $0.1054 | Cache savings: 0%
b
gotcha!
and what part of these numbers is disproportionate to what you were expecting?
w
Comparing that previous example with another run where the KB did find an answer: roughly half the previous example's token count was used (10,872 total), but the cost was roughly a nineteenth of it, at $0.0055. Here's the log summary of that run:
Billed Tokens: 10,872 | Total Tokens: 10,872 | Cost: $0.0055 | Cache savings: 0%
So comparing the two cases, the run in which the KB did not find an answer was roughly ten times more expensive per token used than this run. N.b. I've tried many runs of each type, and the results are always roughly what I've described here, so the phenomenon isn't a fluke.
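A quick back-of-the-envelope on those two log summaries (nothing assumed here beyond the numbers quoted above):

```python
# Effective cost per 1k tokens for the two runs, taken straight from the log summaries.
runs = {
    "no KB answer (expensive run)": {"tokens": 20_806, "cost": 0.1054},
    "KB answer found (cheap run)":  {"tokens": 10_872, "cost": 0.0055},
}

per_1k = {}
for name, run in runs.items():
    per_1k[name] = run["cost"] / run["tokens"] * 1_000
    print(f"{name}: ${per_1k[name]:.5f} per 1k tokens")

ratio = per_1k["no KB answer (expensive run)"] / per_1k["KB answer found (cheap run)"]
print(f"expensive run costs ~{ratio:.1f}x more per token")   # -> roughly 10x
```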
b
the only way I can explain this behaviour is if one run is using GPT-4 and the other is using GPT-3.5
without having done the test myself I can't confirm this
because input and output tokens are also billed at different rates, that will also cause some variation in cost
but not at the scale you've described here
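to put a rough bound on that source of variation: even in the extreme all-input vs all-output cases, the spread from the input/output split alone is only around 3x at typical rates (the rates below are illustrative list prices, not numbers from your logs):

```python
# How much can the input/output split alone move the bill for a fixed token count?
# Illustrative per-1k-token rates where output costs ~3x input (typical for these models).
input_rate, output_rate = 0.0005, 0.0015   # USD per 1k tokens, assumed

tokens = 20_806
all_input_cost = tokens / 1_000 * input_rate
all_output_cost = tokens / 1_000 * output_rate
print(f"all-input: ${all_input_cost:.4f}, all-output: ${all_output_cost:.4f}")
print(f"max spread: {all_output_cost / all_input_cost:.1f}x")  # -> 3.0x, far below the ~10x observed
```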
b
same thing happens when a run hits an error, the total cost is much higher
w
ah! I have "hybrid" selected as the "Model Strategy" in the Knowledge Agent config, do you think that might have something to do with it? Like it's switching to "Best" when an answer isn't found, or something along those lines?
b
ah there you go
that's exactly how the hybrid selection works
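in rough pseudocode, hybrid behaves something like this (an illustrative sketch only, not the actual implementation; the function, field, and model names are made up for the example):

```python
from dataclasses import dataclass

@dataclass
class Draft:
    text: str
    found_answer: bool
    model: str

def generate_answer(model: str, question: str, sources: list) -> Draft:
    # Stand-in for the real LLM call; only here so the sketch runs on its own.
    answered = len(sources) > 0
    return Draft(text="..." if answered else "", found_answer=answered, model=model)

def hybrid_answer(question: str, sources: list) -> Draft:
    """Try the cheap model first; escalate to the stronger, pricier one
    only when no usable answer comes back."""
    draft = generate_answer("gpt-3.5-turbo", question, sources)
    if draft.found_answer:
        return draft  # cheap path, billed at GPT-3.5 rates
    # No usable answer: re-run over the same sources with the stronger model,
    # which is why the "no answer found" runs end up costing far more per token.
    return generate_answer("gpt-4-turbo", question, sources)
```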
w
aha! thanks for confirming that -- and does the cost difference sound in the ballpark to you, i.e. GPT-4 being roughly 10x as costly per token as GPT-3.5?
b
you don't need to take my word for it:
you're charged at cost, we don't mark up the prices you see here
GPT-4 Turbo is roughly 20x more expensive than GPT-3.5 Turbo
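for a sense of scale, here's that ~20k-token run priced at each model's list rates (the per-1k rates below are OpenAI's published prices at the time, and the 90/10 input/output split is just an assumption for illustration):

```python
# Price the same token count at two models' list rates (USD per 1k tokens, assumed).
RATES = {
    "gpt-3.5-turbo": {"input": 0.0005, "output": 0.0015},
    "gpt-4-turbo":   {"input": 0.01,   "output": 0.03},
}

total_tokens = 20_806
input_tokens = int(total_tokens * 0.9)    # KB context dominates, so mostly input (assumed split)
output_tokens = total_tokens - input_tokens

for model, rate in RATES.items():
    cost = input_tokens / 1_000 * rate["input"] + output_tokens / 1_000 * rate["output"]
    print(f"{model}: ${cost:.4f}")
# -> gpt-3.5-turbo ~$0.012, gpt-4-turbo ~$0.25 for the same token count
```

a hybrid run that splits its tokens between the two models will land somewhere in between those two figures, which is roughly where your $0.1054 sits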
w
yep, that looks like it'd do it all right, hahaha! Thanks a bunch for helping me work through that -- I'm gonna try changing the config to "Fastest / 3.5" and see how the results look
b
\o/
w
Closing the loop on this: changing the Knowledge Agent config to "Fastest / 3.5" did indeed make the cost-per-token consistent across various queries, as expected. It also improved cache savings on repeated similar queries. (There was also an expected, notable performance degradation when switching to 3.5-only.) Thanks again for the help!
b
thanks for following up! glad to hear it 🙂