Live chatbot conversation suddenly ends. No errors...
# 🤝help
c
Not sure what's going on here. I can see on the conversations tab that the user entered her email address at 15:38:08, but I can't see this email or time in the logs. So the customer entered her email and then nothing else happened. It just suddenly stopped. It's not the first incident either. This is on webchat, but the same thing has also happened on our Slack integrations. Need some help please. Our client is not very happy, and now we need to check every single conversation (a lot of them over the weekend) to see if this has happened to others. Reported this problem: report_01J0K1DAR5DAP2XRP94TXDP7BX https://cdn.discordapp.com/attachments/1252204483954479136/1252204484738809946/1.jpg?ex=66715dcd&is=66700c4d&hm=b392fc1c14c0b78a73df9378bf46af38c40de0e0f3150f955735250af6f085d6& https://cdn.discordapp.com/attachments/1252204483954479136/1252204485049061386/Untitled-1.jpg?ex=66715dce&is=66700c4e&hm=1464ef5672e158f7201b535360f485182e4b00ad6ef8e1ea475c3c57c52d2797&
Found another one. This one is preceded by an error: "error saving state session for conversation", but I'm not sure if it's relevant. The conversation seems to have continued after the error and then suddenly stopped again. On the conversations tab, you can see the user selected from a choice at 16:14:58.952Z. But again, this does not exist in the logs. So the conversation ended abruptly again and didn't manage to follow the flow to record the transcript, add the user to the CRM or notify us. Basically, we're missing an important message from the user. https://cdn.discordapp.com/attachments/1252204483954479136/1252217570983936080/6.jpg?ex=667169fe&is=6670187e&hm=a4d2b7f2517e9fbfaf20b572d8ead84f99c800f605a36fad4a1dde984d00c659& https://cdn.discordapp.com/attachments/1252204483954479136/1252217571261026335/4.jpg?ex=667169fe&is=6670187e&hm=a2ab294b06ce744eb8732494a994397c24fb809a7aab04e628d40e097d317d70& https://cdn.discordapp.com/attachments/1252204483954479136/1252217571495772264/5.jpg?ex=667169fe&is=6670187e&hm=0ad28bac862442c82f34f4832abba47c85b34182aec1cf3d88a054d07be17053&
b
Hi Jojo, will you get any error if you give it another try?
l
Hi @big-market-58184 , I'm Jojo's partner. I have just used a transcript from one of the failed conversations to recreate a chat and it failed again albeit even earlier this time. The event log shows another internal server error.
I followed up this test with another using my own personal details & it worked fine. Quite bizarre.
b
using your own email 🤔 ?
l
Yes
The failed test I did didn't even get as far as email. I entered the user's name & then it froze.
We found another one that came in between 2 & 3 o'clock this afternoon (Madrid time). It seems to just happen randomly. We can't find any rhyme or reason why it keeps happening. None of the failures have anything in common. Sometimes it goes early in the flow when the user is being asked for their name, sometimes at the end of the chat when things are wrapping up. If it were happening every time at the same point we might be able to work it out ourselves, but when it's an internal server error like these, we're pretty lost. Thankfully we're able to go into the logs and pull out the user details for the client, but it's starting to take up much of our day firefighting this issue.
@big-market-58184 it looked like you were typing a response yesterday but it never appeared. Is there anything we can do to try to fix this ourselves? I think the answer is probably no, in which case what can we do here? Looking at the event log I can see the server errors happen often, even from before we noticed our bots were freezing mid-conversation. Only now the errors are losing our client potential customers & we are spending a lot of time daily trawling the convo logs looking for a way to fix this. Any ideas?
Another chat blipped out mid-way through. The user answered a question & the bot never spoke again.
@big-market-58184 @bumpy-butcher-41910 is there anyone out there who can help? Our client's live bots have been malfunctioning for days and it's costing them & us money. This issue was reported via the BP Studio (report_01J0K1DAR5DAP2XRP94TXDP7BX) on Monday but nobody is getting back to us. I totally understand you get inundated with people asking for help in here, but this issue does not appear to be something we can rectify ourselves. It's being caused by an internal server error, as per the screen grabs above. I wouldn't reach out for help if we could deal with it ourselves!
b
hey there - thanks for flagging this. I took a look but, like you, I can't identify any reason this might be happening, and this isn't an issue we've seen with other users
can you try exporting the bot to another workspace or simply a fresh bot file and re-deploying it?
l
Hi @bumpy-butcher-41910 thanks for the reply. I'm trying to export the bot to my own company's workspace now. What do you mean by "simply a fresh bot file and re-deploying it"? Like start over from scratch?
b
instead of a new workspace, just importing it into a brand new bot in the same workspace
l
@bumpy-butcher-41910 I've tried to import the bot half a dozen times now into another workspace & it keeps throwing an error & freezing. Upon page refresh the flows do appear, but given that it causes an error I don't have much confidence that everything is imported correctly.
Ah ok, I'll try that now & see if it is any better.
The import error message disappears so quickly that it's difficult to get a screen grab of it
Ok, so it didn't have an import error on the new instance in our client's workspace. I'll do some basic setup & just keep testing it until it breaks.
b
interesting. let me know~
s
We've been having the same issues and we import fresh versions of the bot every day or two. report_01J1B42DZNGVTNKTGYS31KDFY0
l
@bumpy-butcher-41910 & @big-market-58184 , ok I finally managed to get around to setting up an alternate version of our bot. On the first attempt at testing it out the internal server error happened again (report_01J1SPWDZ64RDRY3TMVDETV4HW). It's not good to hear that someone else is having the same issue (see prev comment), though it does suggest it's not an issue specific to our account. Here's some background info on our situation so that someone can hopefully make sense of what is going wrong:
Our bot was functioning perfectly for months, and we had moved on to building a Slack bot for our client (although this newer bot is actually suffering from the same issue now too...). We know it was working perfectly because we check through the conversation logs every day & cross-check them against any leads that come in. We then needed to update the bot with a small prompt update on an AI task. Upon publishing this update we were quickly made aware of the bot being broken by reports from our client's website visitors (embarrassing 🤦🏼). We checked it and learned that all of the Generative Text cards were broken & needed to be fixed. It took us a few days to find & fix them all. This is when we began noticing the internal server errors. During our daily conversation checks we saw that chats were cutting off mid-way through. We cross-checked these 'blips' against the server errors & sure enough they all matched up. The errors don't happen at the same point during the chat. Sometimes they happen early, sometimes towards the end. There seems to be no rhyme or reason to it. On your advice we set up the bot on a fresh template in the BP Studio, but as prev mentioned the server error happened on the first attempt.
We did wonder if this was caused by Webchat, but the same happens in Slack so it must be caused by BP.
@fresh-fireman-491 I know you don't work for Botpress, but you do help a lot of people on here & are pretty epic - have you heard many other builders complaining about these internal server errors? Having had a look through the help section I've seen a few mentions of it.
f
Thanks for the fast response @bumpy-butcher-41910! I'll post here since it seems to be the same problem Ryan is facing. It happened twice today. This is what happened in the user events, and nothing on the logs for a while after the internal server error. https://cdn.discordapp.com/attachments/1252204483954479136/1257704618096463945/image.png?ex=66856033&is=66840eb3&hm=836a2540e47ff8f932476c9a96b967b4a59a8d216d7c69376f1de3fcc6050648& https://cdn.discordapp.com/attachments/1252204483954479136/1257704618327015475/image.png?ex=66856034&is=66840eb4&hm=8e837365e1ef533c8c9f694dea295810d26183e26e57235334ae88a98382b255&
b
thanks for sharing gang 🫡
r
Hey all – Here's what we know so far about this error. The error is caused by an execution timeout. Basically, each message has to be processed within 60s by the bot, otherwise an error is thrown (we will improve the error message so it says that). We are still investigating what can be the cause of these timeouts. We think that in the vast majority of the intermittent failures, the cause is OpenAI being slow or down, but we're looking into other possible causes. We're also working on having backup LLMs for when OpenAI is down or slow, which should ship within the next few weeks.
If you (and anyone else having these errors) could answer these questions, that would help us troubleshoot further:
- Are you making API calls to external systems that may be slow?
- Are you chaining multiple AI Tasks back-to-back in the same workflow, such that executing the whole chain may take more than 60 seconds?
- Are you storing large payloads in the variables?
- Do you have loops or manual promises (like using setTimeout) inside your workflows that can run for a while?
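For anyone checking the first and last questions against their own bot, here is a minimal sketch (not from the Botpress team) of putting a hard ceiling on a slow external call so it fails fast instead of eating into the 60s budget mentioned above. `withTimeout` and `TimeoutError` are hypothetical helper names, not part of any Botpress API, and the sketch assumes the code runs somewhere plain promises and `setTimeout` are available.

```typescript
// Hypothetical helper, not a Botpress API: marks a call that ran out of time.
class TimeoutError extends Error {
  constructor(label: string, ms: number) {
    super(`${label} did not finish within ${ms} ms`);
    this.name = "TimeoutError";
  }
}

// Resolve with the result of `work`, or reject with TimeoutError after `ms`.
// Note: this only stops waiting; it does not cancel the underlying request.
async function withTimeout<T>(work: Promise<T>, ms: number, label = "operation"): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new TimeoutError(label, ms)), ms);
  });
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([work, deadline]);
  } finally {
    // Always clear the timer so it does not linger after the call settles.
    clearTimeout(timer);
  }
}
```

Usage would look something like `await withTimeout(saveLead(payload), 10000, "CRM save")`, where `saveLead` stands in for whatever external call your flow makes. Catching the `TimeoutError` lets the flow send a fallback message or retry later rather than stalling silently.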
b
@fast-printer-67716 @limited-orange-3544 ^ 🫡
f
Thanks for the quick response you guys! I'm using an external API to send/save the information the user tells me, but the problem never happened during the API call. It only happened when I was saving some variables (asking the user for them). We're not using AI directly; the user should only answer/choose some options from a dropdown button. There are some exceptions where they type some info like their name and I save it. No large payloads, the biggest ones are usually strings of less than 30 chars. No loops, only the regular bot timeout that I changed to 15 min.
r
@fast-printer-67716 thanks for all the answers, that really helps! Would you mind creating a problem report? This will give us access to a copy of your bot for further troubleshooting. @square-energy-41150
s
# Sending a Problem Report
Here's how to send the Botpress Team a problem report:
1. In the Botpress Cloud Studio, locate the "Help" icon in the bottom left corner of the screen.
2. Click "Report a problem" and follow the steps on the screen.
3. Copy the Report ID to your clipboard and provide it to a member of the Botpress Team on Discord.
f
Surely, thanks for the help. Here is the report ID: report_01J1W4XZ01AMREV4BE0953V4GY
l
Thanks @rich-battery-69172 . My wife @calm-cricket-17313 is going through your questions now & composing some answers.
c
Hi @rich-battery-69172. We do indeed have many back-to-back chains of AI Tasks in the same workflow. Many of the errors occur during those AI Tasks, but that could simply be because we have so many of them: the errors also occur during a simple yes-or-no question that doesn't involve AI Tasks. I've been looking at the logs, events and conversations just now and I found a conversation that actually had a 500 bad gateway error in the logs. In this one, the user just needed to select yes or no. He selected no and nothing else happened. That message from the user wasn't even reflected in the logs. It showed as failed in the events, with a 500 gateway error in the logs too. Most of the time, though, there's no error in the logs at all. Just nothing happens. https://cdn.discordapp.com/attachments/1252204483954479136/1258061983308320778/badgateway-error.jpg?ex=6686ad06&is=66855b86&hm=92b4b5e248c69f33f50636182a0312b463e35b58e1e8cd0b7c5105619d3db68f& https://cdn.discordapp.com/attachments/1252204483954479136/1258061983639539822/conversation-stopped.jpg?ex=6686ad06&is=66855b86&hm=ed8bf134a1c269fd66ed4a53551dec7eb8a2df2620ef15749df983b595e94367&
Also, we have the whole conversation history in one variable but there have also been times when the error happens and the conversation stops right at the beginning when there's not much stored in it yet. On one of our bots that has this issue, we only use the zapier integration. The longest setTimeout we have is only 5 seconds.
l
@rich-battery-69172 @bumpy-butcher-41910 @big-market-58184 It's been a week since I last checked in here. We're still getting the internal server errors causing the conversations to abruptly end. Have we any response to @calm-cricket-17313 's comments above?
We thought things had improved, when in fact our client had just turned off his Google Ads so the bot was receiving less attention.
Now they're back on and I'm having to manually check every convo to make sure leads don't fall through the cracks.
f
Same here. However, if the user sends another message after the error, the conversation resumes as if nothing happened. I instructed them to do so for now.
l
We get that occasionally, but more often the chat freezes and never recovers. Our bot is public facing so we can't really tell people to just keep typing messages if it freezes. If we did it'd look quite unprofessional!
f
I get it, my boss wasn't too happy with my "solution" either, but he accepted it for now. My bot records occurrences/failures in the public lighting service in 15 small towns in rural São Paulo. So we thought that it would be better for them to keep trying instead of just getting no answers.
c
We have more work to do now on this bot as well. Makes me wonder if it's worth developing it without knowing the status of this issue. Such a shame because everything was going quite well until we published it again for a very small update. Then all the generative texts broke, and then came this internal server error. It has to be one of their updates that broke it.
l
@bumpy-butcher-41910 would our client be able to get this resolved if he were on the Team plan instead of PAYG? I see you get Live Chat as part of the package.
The Team plan is way overkill for his current needs - but he's AI crazy & has plans in the pipeline for many future bots.
b
we're looking into this issue but - as you can imagine - there isn't an easy diagnosis + fix (if there was, it would have been resolved by now) - and since this only affects a small percentage of conversations it's difficult to keep track of
to that end, there's not much that our Live Chat support team can do on this front, beyond provide more information to our current diagnostic efforts
howdy howdy @limited-orange-3544 @calm-cricket-17313 @fast-printer-67716 we just pushed a hotfix to the bots associated with the report IDs listed here that should resolve this issue - can you monitor these bots for the next 24 hours and let me know how things get on? I'll check in again tomorrow at 4pm EST to see 🙂
f
Sure thing! Thanks!
l
Thanks @bumpy-butcher-41910 . We monitor our bots constantly anyway. I'll have to ask my client to switch his ads back on so that we can give it a thorough testing. We'll let you know what happens.
b
great - of course, the fix will remain in place even after 4pm EST tomorrow. we applied a hotfix to a small selection of accounts and want to determine whether to apply that hotfix elsewhere
l
Hi @bumpy-butcher-41910. 51 mins ago we had another server error that caused a user's conversation to abruptly end. Do we need to re-deploy/publish our bot to see the benefit of the hotfix you rolled out? If not, then it would appear the fix hasn't worked for us 😣
f
Things are going smoothly over here. We've had just 9 users since the hotfix, and none have hit the problem so far.
r
This helps Ryan, we saw that you seem to be facing a different issue than @fast-printer-67716 & @calm-cricket-17313. In your case the problem is that the state is too big to be persisted to our State API. There's a lack of a clear error, which we will fix to make this problem easier to troubleshoot in the future. It's unclear from looking at your bot where and why this happens, so we're going to add more detailed logs to help find the root cause. Will keep you posted on the updates here.
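If you keep the whole conversation history in one variable, a rough way to see whether state size is a plausible culprit is to measure how big that variable gets when serialized, and to drop the oldest turns once it passes a ceiling. The sketch below is only an illustration: the `Turn` shape and the `MAX_STATE_CHARS` value are assumptions, and the thread does not say what the State API's real limit is.

```typescript
// Hypothetical shape: the thread only says the history lives in one variable,
// not how it is structured.
type Turn = { role: "user" | "bot"; text: string };

// Placeholder ceiling for illustration, NOT a documented Botpress limit.
const MAX_STATE_CHARS = 100000;

// Approximate the serialized size as the length of the JSON string
// (characters, not bytes, so multi-byte text is undercounted).
function serializedSize(value: unknown): number {
  return JSON.stringify(value).length;
}

// Drop the oldest turns until the remaining history serializes under maxChars.
// Re-serializing on every step is quadratic, which is fine for a sketch.
function trimHistory(history: Turn[], maxChars: number = MAX_STATE_CHARS): Turn[] {
  let start = 0;
  while (start < history.length && serializedSize(history.slice(start)) > maxChars) {
    start += 1;
  }
  return history.slice(start);
}
```

Logging `serializedSize(history)` just before each save would also show whether the size at the moment of failure is anywhere near suspicious, which matters here because some failures happen right at the start of a chat when little is stored.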
b
progress is progress is progress, thanks for the updates everyone!
l
@calm-cricket-17313 is my wife - our issue is the same issue. To be fair this is the only error we've had today. Jojo had to roll back an update earlier & I wonder if the last screen grabs I posted were related to that. We'll continue to monitor & stay hopeful your hotfix worked as it has for junkrs.
b
happy monday bot builders - do we have any updates for me? 😎 🫡
l
Hi @bumpy-butcher-41910 Happy Monday indeed! Our client hasn't yet switched on his Ads so the bot hasn't had a proper test. Of the few users who found the website/bot organically today one of them had 2 internal server errors in the space of 3 minutes - however the chat wasn't interrupted & the goal of the interaction was met (they were sent an invoice). So it's looking positive. Once we have more data to look at we'll report back here. Thanks for checking in. Much appreciated.
@rich-battery-69172 @bumpy-butcher-41910 - So far it looks as though the hotfix may have been a success! Our client still hasn't turned on his ads so we've not managed to have a really good test yet. Since it does seem to have worked though, I wondered if you could please apply the same hotfix to our other bot - report_01J2XP22WNVV0PXA0TJF84Q6YG. Do you think it will help? This other bot is installed in Slack for internal use by our client's staff & one of them has encountered an internal server error which has left them stuck in a loop they can't escape. We built in an escape function so a user can type 'cancel' & be returned to the start of the flow, but it's not working. We've had them try a number of different prompts to get the bot to snap out of it but nothing is working. We had them log out of Slack & back in again, force restart the app, do a hard refresh and clear their cache. As I was writing this, ~30 mins had passed since the issue was reported to us & the bot finally responded - perhaps due to a timeout? The user is able to use the bot again thankfully. It would be great if this didn't happen again. Do you think your hotfix will help in this situation?
r
@limited-orange-3544 we can apply the hotfix too, but just curious when the issue was first experienced by your client on slack?
l
Today is the first time this specific issue happened, but there have been internal server errors over the past few weeks that have caused the bot to freeze (similar to our website bot) - although these errors have been escapable. I was intent on asking you to apply the hotfix to the Slack bot eventually, once we'd managed to test the website one sufficiently. This error was much worse than those previously which prompted me to ask you to apply it sooner.
f
Sorry for the late response, I've been busy with some issues. I have not seen the bot freezing since then, things seem to be working fine now. I'll let you know if anything happens on this matter. Thanks for the help!
l
@rich-battery-69172 @bumpy-butcher-41910 sorry guys, now that my client has switched his ads back on and the bot is taking more enquiries, it's still freezing. It did appear that the issue was gone, but now it appears not. We've done a little work on the bot, unsure if that's managed to break your hotfix? Here's a new report ID: report_01J32P94E7PY298PZME6KBV82A. We also have another issue: we've started to see duplicate responses from the bot sometimes too, an issue we saw on another post (https://discord.com/channels/1108396290624213082/1263401378231881759). Having looked through the Help topics this morning I can see there are many more users experiencing similar issues!
c
@rich-battery-69172 @bumpy-butcher-41910 Just another thing I noticed that I want to add. Yesterday, I exported one of the bots that I was using as a staging bot and imported it into our live bot (where you applied the hotfix). After I published the live bot, the errors came in one after another, so I immediately rolled it back to the old version (I exported the live bot before importing the staging version, so I just had to import the previous version back). I made some changes to it and published it again. Since publishing, the errors still happen and the conversations still suddenly stop, but not as badly as after I imported and published that staging version from another bot (where the hotfix was not applied).
f
Good morning @bumpy-butcher-41910! My bot froze 2 times this weekend, and even after the user tried to message it again there was no response. All there was in the events was "internal server error" again. Should I create a problem report again?
r
@fast-printer-67716 please
f
Here it is: report_01J3DF007RDCP26BGT25MTW8FV Thanks!
r
@limited-orange-3544 @fast-printer-67716 thanks, the team is checking both your issues. we have a good idea of what the source might be
c
@rich-battery-69172 @bumpy-butcher-41910 unfortunately we have had another of the same incident. It's much less frequent but it happened again. Made another report: report_01J3MNXH93FDDD4CDW0KT6F70C.
Events are still showing status failed and internal server error. In the logs, there is a corresponding huge error that says Error saving state session for conversation - State: { "history": ... Some events with status failed and internal server error don't seem to cause the conversation to fail, but this one resulted in the same issue as has been reported previously, where the bot stops responding altogether, causing a very bad user experience. Also, this is a Slack integration.
The bot has also been republished a few times since the recent fix.
r
@calm-cricket-17313 thanks for providing this. if you look at the end of the error in the logs, can you screenshot this too? thanks
f
@bumpy-butcher-41910 @rich-battery-69172 We are experiencing the same issues—flow errors and conversation state problems—with several of our bots. I will create a report for a few of them, but please let me know if you need any additional information. I reviewed the entire thread, and the reasons appear to be quite similar. Additionally, is there a limit to the size of a conversation state?
report_01J3QJC5WXDGT89JD6V16259SE
r
Thanks all for your patience and reporting on this. We're rolling out a few fixes this week that should address those.
@here to just follow up on this, we rolled out some fixes – please let us know if you ever face this issue again. Appreciate all your patience while we investigated and fixed this
f
Thanks for addressing the issue guys
l
Thank you Sly & your team. Your efforts are much appreciated. We'll let you know if we have a relapse!
Sorry @rich-battery-69172 , after a few days of no issues, about half an hour ago we had another ISE that stopped a user's chat from continuing. There was another earlier in the week but we figured maybe it was due to the bot not being published yet after you applied your patch. The bot has been published several times now this week so it can no longer be because of that. There's an error in the events tab but nothing in the logs. I was unable to generate a report ID. https://cdn.discordapp.com/attachments/1252204483954479136/1271456193822331014/Screenshot_2024-08-09_at_15.05.59.png?ex=66b7675b&is=66b615db&hm=d43cb1b51cf4fe0ede3d7f7327b2bce86046a068b6bc3c95fe37c802c1e40b4a& https://cdn.discordapp.com/attachments/1252204483954479136/1271456194153811988/Screenshot_2024-08-09_at_15.03.57.png?ex=66b7675b&is=66b615db&hm=f92d47e82abd0787b1600533f9c51c1ce88911b3d5d9d39c46ee433f0dfec65c&
r
Thanks Ryan, will have the team look into it
@limited-orange-3544 I got some news! It seems like the original issue is gone, however you did indeed still experience a conversation jamming. I don't want to share private information about your bot/workspace here, but if you go to the "Issues" page, you should see the occurrences of the issue. In that specific case, it seems like the call of the AI Task (yes/no question) using GPT 3.5 Turbo took more than 109 seconds to respond, which caused the message processing to be abandoned.
There are various mid-term features that we're going to bring to prevent those AI-provider latency issues from affecting customer bots, most importantly:
- Automatic model fallback when a model is either down or has degraded performance
- Long-lived workflows: workflows that can run for an indefinite period of time without impacting the conversation experience
Stay tuned for those updates coming in the next few months. In the meantime, I would recommend you install the OpenAI integration and change your AI Tasks to use GPT 4o-mini. This model is the latest, fastest and cheapest and should have less downtime than GPT 3.5 Turbo.
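Until automatic model fallback ships, the idea can be approximated by hand wherever you control the model call yourself. The following is a rough sketch of the general pattern only, not how Botpress implements or will implement it: each model gets a fixed time slice, and the next one is tried if the current one errors or is too slow. `callWithFallback` and `ModelCall` are hypothetical names, and the actual model calls are left as plain functions you would supply.

```typescript
// A model call is any function returning a promise of text. How it reaches the
// provider (AI Task, OpenAI integration, etc.) is outside this sketch.
type ModelCall = () => Promise<string>;

// Try each model in order, giving each at most `perAttemptMs` before moving on.
async function callWithFallback(
  calls: { name: string; run: ModelCall }[],
  perAttemptMs: number,
): Promise<{ model: string; text: string }> {
  const failures: string[] = [];
  for (const { name, run } of calls) {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const deadline = new Promise<never>((_resolve, reject) => {
      timer = setTimeout(
        () => reject(new Error(`timed out after ${perAttemptMs} ms`)),
        perAttemptMs,
      );
    });
    try {
      // The first of "model answered" or "deadline hit" decides this attempt.
      const text = await Promise.race([run(), deadline]);
      return { model: name, text };
    } catch (err) {
      // Record why this model was skipped, then fall through to the next one.
      failures.push(`${name}: ${err instanceof Error ? err.message : String(err)}`);
    } finally {
      clearTimeout(timer);
    }
  }
  throw new Error(`all models failed (${failures.join("; ")})`);
}
```

With two models and a 20-second slice each, the worst case stays around 40 seconds, which leaves headroom under the 60s per-message budget mentioned earlier in the thread. A call that took 109 seconds would have been abandoned after its slice and the second model tried instead.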
n
I am having an issue with integrating Slack. I guess it is related to this bug.