My problem:
I’m trying to add remote OTA updates that I can trigger over MQTT so that I can update my controller’s firmware from anywhere. I added an MQTT topic for updates; the handler extracts a URL from the MQTT message and calls a function that performs the update. The function is pretty standard for OTA updates on Arduino (I’ve done this before on other projects with no issue), but for some reason it keeps freezing after a while (neither the elapsed time nor the number of bytes written at the point of the freeze is consistent).
What I’ve tried:
I double-checked that the URL I’m using works: I can download the firmware from it on my computer.
I tried making the loop return immediately once an OTA update has started, so that nothing running there can interrupt the update, and that hasn’t solved it either. (Before doing this, I got an occasional crash that seemed to be related to memory allocation and UDP.)
I switched to chunked reading of the stream instead of reading it all at once (and tested out different buffer sizes), and that hasn’t helped.
I made it print the HTTP connection status, the Wi-Fi connection status, and the free heap after each chunked read, and they all seem stable (the lowest I’ve seen the free heap get is about 70000 bytes, which I imagine should be fine).
No matter what I do, the update just freezes and never finishes. There are no crashes or error messages or anything; the controller just seems to stop completely, and I cannot for the life of me figure out why. If anybody has any insight into why this might be happening and/or any solutions, I’d really appreciate any help I can get, thanks.
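One thing that may be worth ruling out with a symptom like this is a read loop that waits forever for bytes that never arrive once the connection silently stalls. Below is a minimal sketch of a chunked copy with a stall timeout. It is not the code from this project: `OtaIo`, `copyWithStallTimeout`, and the 512-byte buffer are hypothetical names and values chosen for illustration, and the three callbacks are assumed to wrap something like `WiFiClient::read`, `Update.write`, and `millis()` on the device.

```cpp
#include <cassert>
#include <functional>
#include <stddef.h>
#include <stdint.h>

// Outcome of one transfer attempt.
enum class OtaResult { Complete, Stalled, WriteFailed };

// Hypothetical adapter so the loop can be exercised off-device.
struct OtaIo {
    std::function<int(uint8_t*, size_t)> readSome;            // bytes read, 0 if none available yet
    std::function<size_t(const uint8_t*, size_t)> writeSome;  // bytes actually written
    std::function<uint32_t()> nowMs;                          // monotonic milliseconds
};

// Copy `total` bytes in chunks; give up if no byte arrives for `stallMs`.
inline OtaResult copyWithStallTimeout(const OtaIo& io, size_t total,
                                      uint32_t stallMs, size_t& written) {
    assert(io.readSome && io.writeSome && io.nowMs);
    uint8_t buf[512];  // illustrative chunk size
    written = 0;
    uint32_t lastProgress = io.nowMs();
    while (written < total) {
        size_t want = total - written;
        if (want > sizeof(buf)) want = sizeof(buf);
        int got = io.readSome(buf, want);
        if (got <= 0) {
            // Unsigned subtraction stays correct when the counter wraps.
            if (io.nowMs() - lastProgress >= stallMs) return OtaResult::Stalled;
            continue;  // on real hardware, yield here (for example delay(1))
        }
        size_t n = static_cast<size_t>(got);
        if (io.writeSome(buf, n) != n) return OtaResult::WriteFailed;
        written += n;
        lastProgress = io.nowMs();  // progress was made, restart the stall clock
    }
    return OtaResult::Complete;
}
```

If the function returns `OtaResult::Stalled`, the caller could abort the update, report the byte count over MQTT, and retry, rather than hanging with no error.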
I would try triggering the update function via a different path as a test.
Can you create a preset that triggers the function?
You could set that up as a Time-Macro.
Essentially pull the MQTT piece out of the process (for now).
If you can get another path to perform the auto update, you may be able to get MQTT to eventually trigger the alternate path.
The approach you took is wrong or at least inappropriate IMO.
WLED already has an /update HTTP endpoint where you can start the update using curl or similar.
If you want WLED to “request” an update (as a response to an API call, i.e. JSON) you may want to look into the ESP8266httpUpdate library (or HTTPUpdate, its ESP32 counterpart).
You may best pair this with a usermod implementation.
I’m quite confident that using mqtt is not the issue here since that’s just to pass the url and I haven’t had any issues with that, and I can’t get rid of mqtt altogether since that’s my main way of communicating with the controller. The issue seems to be with the update itself, maybe some background process that’s interrupting it or something like that. Thanks for the suggestion though!
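Since the MQTT message only carries the URL, that side can be kept small and checked in isolation. Here is a minimal sketch of validating the payload before handing it to the update function. `firmwareUrlFromPayload` and the 256-character limit are hypothetical, chosen for illustration, and the check is deliberately loose (scheme, host present, no whitespace), not a full URL parser.

```cpp
#include <cassert>
#include <stddef.h>
#include <string>

constexpr size_t kMaxUrlLen = 256;  // assumed limit, adjust to taste

// Returns the trimmed URL if the payload looks like a usable firmware URL,
// or an empty string otherwise.
inline std::string firmwareUrlFromPayload(const char* payload, size_t len) {
    assert(payload != nullptr || len == 0);
    // MQTT payloads are raw bytes and are not NUL-terminated, so use the length.
    std::string s(payload ? payload : "", len);

    // Trim surrounding whitespace such as a trailing newline.
    const char* ws = " \t\r\n";
    size_t b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    size_t e = s.find_last_not_of(ws);
    s = s.substr(b, e - b + 1);
    if (s.size() > kMaxUrlLen) return "";

    // Accept only http:// or https://.
    size_t schemeLen = 0;
    if (s.compare(0, 7, "http://") == 0) schemeLen = 7;
    else if (s.compare(0, 8, "https://") == 0) schemeLen = 8;
    else return "";

    // Require at least one host character after the scheme.
    if (s.size() == schemeLen || s[schemeLen] == '/') return "";

    // Reject embedded whitespace and control characters.
    for (unsigned char c : s) {
        if (c <= 0x20 || c == 0x7F) return "";
    }
    return s;
}
```

An empty return value means the message should be ignored, so a malformed payload never reaches the download code.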
Thanks for the reply! I know about the /update endpoint, but /update can only be called when connected to the controller’s access point or when connected to the same network as the controller is, right? I need to be able to update from anywhere in the world. The issue seems to be with it trying to download the update over http from a public url (which I need it to do), something in the connection messes up and it just freezes.
Hi, sorry for the delay getting back to you on this, life has been very busy lately. I finally made the draft PR, let me know if you’re able to see it.
It sounds like you’re encountering a frustrating issue with your OTA update process over MQTT. Here are several potential avenues to explore that might help you troubleshoot and resolve the freezing problem:
1. Check Network Stability
Wi-Fi Stability: Ensure that the Wi-Fi connection is stable and has low latency during the OTA process. You might want to log connection quality metrics during the update.
Signal Strength: If possible, test the update in an environment with strong Wi-Fi to rule out connectivity issues.
2. Memory Management
Heap Fragmentation: Although you have a decent amount of free heap, fragmentation might still be an issue. The ESP32 heap has no compaction routine, so compare the largest free block with the total free heap, and prefer a single reusable buffer over repeated allocations during the update.
Reduce Memory Usage: Check other parts of your code for memory leaks or excessive memory usage that might be consuming heap space during the update.
3. Error Handling
Timeouts: Implement timeout mechanisms during the OTA update process. If the download takes too long, you could reset the ESP32 or report an error through MQTT.
Verbose Logging: Add more logging around the update process to capture any anomalies before the freeze occurs. This can help identify any specific operation that leads to the problem.
4. Chunk Size and Reading Method
Buffer Sizes: While you mentioned trying different buffer sizes, experiment with both smaller and larger chunk sizes. Sometimes, optimal sizes can vary based on network conditions.
Stream Reading: Consider using WiFiClient::read() in a loop until the end of the stream instead of reading a fixed number of bytes. This can help manage variable response sizes more gracefully.
5. OTA Process Isolation
Task Management: If you’re using FreeRTOS, consider running the OTA process in its own task to isolate it from other operations. This can help prevent other tasks from interfering.
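Pause Other Work: Avoid disabling interrupts globally during the OTA update, since the Wi-Fi stack and the watchdog depend on them. Instead, pause non-essential features while the download runs to avoid race conditions or unexpected behavior.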
6. Use Standard Libraries
ESP32 OTA Library: Ensure you’re using the ESP32’s built-in OTA libraries for the update process, as they are optimized for handling OTA updates efficiently.
Update via HTTP: Double-check that your code is correctly set up to handle OTA via HTTP and not inadvertently using other protocols or libraries that could introduce complexity.
7. Test with Different Firmware
Simplify Firmware: As a diagnostic step, try using a minimal firmware that only handles the OTA process. This can help determine if the problem is in your existing code or with the OTA mechanism itself.
8. Network Load
Reduce Load: If your device is performing other network-intensive tasks simultaneously, try reducing the load to see if that helps with the OTA update.
By systematically checking these areas, you should be able to pinpoint the cause of the freezing and implement a more reliable OTA update process. If you still encounter issues, sharing specific code snippets or logs may help further diagnose the problem.
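The heap fragmentation point above can be checked with a number rather than guessed at. A minimal sketch, assuming the two inputs come from calls such as `ESP.getFreeHeap()` and `ESP.getMaxAllocHeap()` on the ESP32 Arduino core; `fragmentationPercent` is a hypothetical helper name, not part of any library.

```cpp
#include <cassert>
#include <stddef.h>

// Percentage of the free heap that is NOT available as one contiguous block.
// 0 means the free heap is a single block; values near 100 mean the free
// space is split into many small pieces.
inline unsigned fragmentationPercent(size_t freeBytes, size_t largestFreeBlock) {
    assert(largestFreeBlock <= freeBytes);
    if (freeBytes == 0) return 0;  // nothing free, nothing to fragment
    return static_cast<unsigned>(100 - (largestFreeBlock * 100) / freeBytes);
}
```

Logging this value next to the free heap after each chunk would show whether the largest block shrinks during the update even while the total free heap looks stable at around 70000 bytes.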