Adjusting max_tokens to improve GPT chatbot latenc...
# 🤝help
m
I'm using OpenAI's API with the GPT-4-turbo-preview model in Botpress for an embedded chatbot. I'm running into a latency issue: the bot takes much longer to respond than expected. I read in OpenAI's API documentation that lowering the max_tokens parameter can help improve response times. However, I'm not a programmer and can't find where to adjust this in Botpress's UI or settings. Could anyone guide me on how to configure max_tokens for OpenAI's API calls within Botpress? Here's the OpenAI documentation that discusses the issue: https://platform.openai.com/docs/guides/production-best-practices/improving-latencies
f
Hey there @mammoth-solstice-82392, take a look at this template: https://botpress.com/templates/openai-api-template. I added max tokens in the Vision part. I believe I set it to 500.
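For anyone who wants to see where the setting lives outside the template, here is a minimal sketch of a Chat Completions request body with max_tokens set. It is not copied from the template: `buildChatRequest` is a hypothetical helper name, the 500 default just mirrors the value mentioned above, and the example question is made up. The body would be sent as JSON in a POST to https://api.openai.com/v1/chat/completions with your API key.

```typescript
// Only the fields this sketch uses; the Chat Completions API accepts more.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequestBody {
  model: string;
  messages: ChatMessage[];
  max_tokens: number;
}

// Hypothetical helper (not part of Botpress or the OpenAI SDK).
// max_tokens caps how many tokens the model may generate for the reply.
function buildChatRequest(userText: string, maxTokens: number = 500): ChatRequestBody {
  return {
    model: "gpt-4-turbo-preview",
    messages: [{ role: "user", content: userText }],
    max_tokens: maxTokens,
  };
}

// Build a request with the default cap and print it so the shape is visible.
const body = buildChatRequest("What are your opening hours?");
console.log(JSON.stringify(body, null, 2));
```

Keep in mind that max_tokens is a hard cap on the reply length, so a value that is too low will cut answers off mid-sentence. It only shortens response time when replies would otherwise run longer than the cap.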
m
Thank you @fresh-fireman-491 -- I'll try it out! Would you say that setting it to 500 has helped?
f
I only used 500 because, without max_tokens set, the response from OpenAI would get cut off when the URL to the image was too large. 500 has worked really well for me. When I used it a bit earlier the response time was long, but I think that's because OpenAI had some issues earlier that might still be affecting response times. Is the slow response time something you've had for a while, or just today?
m
Slow response has been an ongoing issue. My first version of the bot was built with GPT Builder (their GPT store tool) and it was lightning fast. I then created an embeddable version of the same bot using Chatbase, and it had a number of limitations, including slow responses. So I switched to working directly with the API and using Botpress for embedding, and overall it's worked well except for the slowness.
f
Ah okay I see. Let me know how it works out