I'm using OpenAI's API with the GPT-4-turbo-preview model in Botpress for an embedded chatbot. I'm running into a latency issue, where the bot takes much longer to respond than expected. I read in OpenAI's API documentation that lowering the max_tokens parameter can help improve response times. However, I'm not a programmer and can't find where to adjust this in Botpress's UI or settings. Could anyone guide me on how to configure max_tokens for OpenAI's API calls within Botpress?
Here's the OpenAI documentation that discusses the issue: https://platform.openai.com/docs/guides/production-best-practices/improving-latencies
Thank you @fresh-fireman-491 -- I'll try it out! Would you say that setting it to 500 has helped?
fresh-fireman-491
02/14/2024, 9:07 PM
I just used 500 because the response from OpenAI would get cut off when the image URL was too long if max_tokens wasn't set.
500 has worked really well for me. I did use it a bit earlier and the response time was long, but I think that's because OpenAI had some issues earlier that might still be affecting response times. Is the slow response time a problem you've had for a while, or just today?
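For reference, here is a minimal sketch of what capping the output with max_tokens looks like in an OpenAI chat completions request. The model name, prompt, and helper function are illustrative; where this code lives in Botpress (e.g. an Execute Code card or custom action) depends on your bot's setup and isn't verified here.

```python
# Sketch: building an OpenAI chat completions request body with max_tokens
# capped. Lower max_tokens means the model generates fewer tokens, which
# directly reduces generation time and thus response latency.

def build_completion_request(prompt, max_tokens=500):
    """Build the JSON body for POST https://api.openai.com/v1/chat/completions.

    The model name and the 500-token cap follow the values discussed in
    this thread; adjust both for your own bot.
    """
    return {
        "model": "gpt-4-turbo-preview",
        "messages": [{"role": "user", "content": prompt}],
        # Hard cap on tokens generated in the reply; responses longer
        # than this are truncated (finish_reason == "length").
        "max_tokens": max_tokens,
    }

payload = build_completion_request("Summarize our return policy.")
```

Note that max_tokens only limits the completion, not the prompt: if your replies include long URLs (like image links), set it high enough that they aren't truncated mid-link.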
mammoth-solstice-82392
02/14/2024, 9:10 PM
Slow response has been an ongoing issue. My first version of the bot used GPT Builder (their GPT store tool) and it was lightning fast. I then created an embeddable version of the same bot using Chatbase, and it had a number of limitations, including slow response. So I switched to working directly with the API and using Botpress for embedding, and overall it's worked well except for the slowness.