Overload handling: low heap and high number of open connections #1116

Open
wants to merge 4 commits into
base: master

@amkrk amkrk commented Jan 30, 2022

Stability improvements to the server that allow it to survive momentary overload, e.g. when serving more than 4 files or when handling a few clients at once.

Overload control thresholds:
Level1: (max connections exceeded): send 429 with the "Retry-After" header set to 1 second
Level2: (low free heap): send a plain 500 (no extra info added to the response header)
Level3: (very low free heap, or the AsyncWebServerRequest object cannot be created because the request is too big): immediately drop the underlying TCP connection
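
To make the three levels concrete, here is a minimal sketch of the decision as a pure function. The names (OverloadAction, OverloadLimits, classifyLoad), the heap thresholds and the order in which the conditions are checked are assumptions for illustration only; they are not taken from the PR code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical names; the heap thresholds are illustrative, not from the PR.
enum class OverloadAction { Serve, Reply429, Reply500, DropConnection };

struct OverloadLimits {
  std::size_t maxRequests;         // e.g. MAX_NUM_OF_HTTP_REQUESTS
  std::uint32_t lowHeapBytes;      // Level2 threshold (assumed value)
  std::uint32_t veryLowHeapBytes;  // Level3 threshold (assumed value)
};

// Checks the most severe condition first (an assumed ordering), so that a
// nearly exhausted heap never leads to building a response that needs memory.
inline OverloadAction classifyLoad(std::uint32_t freeHeap,
                                   std::size_t openRequests,
                                   const OverloadLimits &limits) {
  if (freeHeap < limits.veryLowHeapBytes) return OverloadAction::DropConnection;
  if (freeHeap < limits.lowHeapBytes) return OverloadAction::Reply500;
  if (openRequests > limits.maxRequests) return OverloadAction::Reply429;
  return OverloadAction::Serve;
}
```

On a device the free-heap argument would typically come from ESP.getFreeHeap(), and the Reply429 case would add the "Retry-After" header before sending.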

It has been tested on ESP8266 (built with Arduino on Windows). All of the above strategies appear to work well. The client (web browser) retries the HTTP request after getting a 429 or 500. This makes it possible to serve HTML projects with more than ~4 files (HTML/CSS/JS/...) requested simultaneously on ESP8266 (10 files in the case of my HTML project). Closing the connection causes the web browser to finish loading the webpage instead of waiting forever for a request that was shed due to the overload.

Overall: the application is now stable :) No spurious resets, even after 1 day with 8 browsers connected and running the application (web API calls every 1 sec from each browser, in parallel with other clients loading the webpage's body).

ESP32: MAX_NUM_OF_HTTP_REQUESTS can likely be set higher there (more RAM is available for the server), but I have not tested this on ESP32.


Next steps (likely to be done in a separate commit): check for heap fragmentation (test allocation and free of one block of a few kB) before processing an HTTP request. Application-controlled reset if the test allocation fails. Call the user's callback function before the reset, which allows sending e.g. a "goodbye / I'm rebooting" message over a serial line to a co-working MCU.
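
A rough sketch of how such a probe could look is given below. Everything here is hypothetical (HeapProbe, checkHeapOrReset, the injectable allocator used for testing on a host); it only illustrates the "test allocation, user callback, then reset" sequence described above and is not part of this PR.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>

// Hypothetical helper: all names and the probe size are illustrative.
struct HeapProbe {
  std::size_t probeBytes;             // size of the test block, e.g. a few kB
  std::function<void()> beforeReset;  // user callback, may be empty
  std::function<void()> reset;        // on a device this could wrap ESP.restart()
};

// Returns true if one contiguous block of probeBytes could be allocated.
// On failure it runs the user callback first and then the reset hook.
// The allocator is injectable so the failure path can be exercised in a test.
inline bool checkHeapOrReset(const HeapProbe &probe,
                             void *(*allocFn)(std::size_t) = std::malloc,
                             void (*freeFn)(void *) = std::free) {
  void *block = allocFn(probe.probeBytes);
  if (block != nullptr) {
    freeFn(block);  // the probe block is released immediately
    return true;
  }
  if (probe.beforeReset) probe.beforeReset();
  if (probe.reset) probe.reset();
  return false;
}
```

The user callback is where the "I'm rebooting" message to the co-working MCU would be sent before the reset hook runs.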

@amkrk amkrk changed the title Overload handling: low heap and number of open connecions Overload handling: low heap and high number of open connecions Jan 30, 2022
@amkrk amkrk changed the title Overload handling: low heap and high number of open connecions Overload handling: low heap and high number of open connections Jan 30, 2022
@zekageri

I would love to see this implemented. :(


@dralois dralois left a comment

Looks good, definitely would be a useful feature.

@amkrk amkrk mentioned this pull request Jun 14, 2022
@amkrk
Author

amkrk commented Jun 14, 2022

"This pull request can be automatically merged by project collaborators
Only those with write access to this repository can merge pull requests."

I'm not in that group, so those who are: please go ahead with the merge. I'm not sure about regression testing across several platforms, i.e. how that works in this project ...

@blazoncek

I am voting +1 for this PR to be merged.

When the data to be sent has to be pulled from an external MCU, a pointer to the related AsyncWebServerRequest can be sent from the ESP to the MCU and then returned by the MCU together with the response data.

When the data arrives at the ESP, the user application code running on it can cast the pointer back to AsyncWebServerRequest and use it to build an AsyncWebServerResponse.

However, the TCP connection and the associated AsyncWebServerRequest can be released in the meantime (before the ESP gets the data from the MCU)...

To cover this case, the pointer needs to be validated before it is cast to AsyncWebServerRequest.

The application code therefore has to maintain a list of active AsyncWebServerRequest objects.

The change to ESPAsyncWebServer helps with this. When the connection is released, it passes the pointer to the user's callback as a parameter, so the user can remove the invalid pointer from the list.

The following example of user application code shows how this can be done:

---------------------------
std::deque<AsyncWebServerRequest*> activeHttpRequests;

// Returns true if the pointer still refers to a tracked (live) request.
bool isRequestValid(AsyncWebServerRequest *request)
{
  for (size_t i = 0; i < activeHttpRequests.size(); i++)
  {
    if (activeHttpRequests[i] == request) return true;
  }
  return false;
}

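void httpClientRegister(AsyncWebServerRequest *request)
{
  // The body callback can fire more than once per request, so avoid duplicates.
  if (!isRequestValid(request)) activeHttpRequests.push_back(request);
}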

// Disconnect callback. It needs 'this' (AsyncWebServerRequest*) as a parameter,
// which is what the change to the ESPAsyncWebServer code provides.
void httpClientDisconnectionHandler(AsyncWebServerRequest *request)
{
  for (auto it = activeHttpRequests.begin(); it != activeHttpRequests.end(); ++it) {
    if (*it == request) {
      // Erase only when found: calling erase(end()) is undefined behavior.
      activeHttpRequests.erase(it);
      return;
    }
  }
}

void ApiOnBody(AsyncWebServerRequest *request, uint8_t *data, size_t len, size_t index, size_t total) // called when a POST request body arrives at the ESP over WiFi
{
...
  httpClientRegister(request);
  request->onDisconnect(httpClientDisconnectionHandler);
...
}

void sendApiResponseToHttpClient(const char* data) // called when the ESP receives data from the MCU
{
  int32_t dataSeparatorOffset;
  AsyncWebServerRequest *request = nullptr;

...
// get the 'request' pointer value from the message incoming from the MCU
...

  if(!isRequestValid(request)) return; // too late, disconnected already

  AsyncWebServerResponse *response = request->beginResponse(200, "text/xml", data);
  if(NULL!=response) request->send(response);
}
---------------------------
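
The same bookkeeping can also be sketched as a small set-based registry, which makes registration idempotent and removal a no-op when the pointer is not tracked. This is only an illustration: ActiveRequests and FakeRequest are hypothetical names, and FakeRequest is a stand-in so the sketch compiles without ESPAsyncWebServer.

```cpp
#include <cstddef>
#include <unordered_set>

// Stand-in for the library's request type so the sketch compiles on its own;
// in a real project this would be AsyncWebServerRequest.
struct FakeRequest {};

// Hypothetical registry that tracks which request pointers are still alive.
template <typename Request>
class ActiveRequests {
 public:
  void add(Request *r) { live_.insert(r); }    // inserting twice has no effect
  void remove(Request *r) { live_.erase(r); }  // no-op if r is not tracked
  bool contains(Request *r) const { return live_.count(r) != 0; }
  std::size_t size() const { return live_.size(); }

 private:
  std::unordered_set<Request *> live_;
};
```

With this, add() would be called from the body handler, remove() from the disconnect callback, and contains() before the pointer received from the MCU is used.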
@GeorgeFlorian

@amkrk would this require changes to an existing implementation?
Does it change how we write the server code, or does it all happen under the hood?

I will test these changes on an ESP32, because I have a medium-sized project that is served by this library and it sometimes hangs.

@MetriciRO

MetriciRO commented Sep 14, 2023

@amkrk
With these changes my project no longer loads all of its CSS and JS files:
[image]

The error is an HTTP 429 Too Many Requests.

It also slows down the interface quite a lot.

server.serveStatic("/settings", SPIFFS, "/settings.html");
server.serveStatic("/dashboard", SPIFFS, "/index.html");
server.serveStatic("/user", SPIFFS, "/user.html");
server.serveStatic("/", SPIFFS, "/").setCacheControl("max-age=600");

@zekageri

I just wanted to try it. I am sad to hear that.

@amkrk
Author

amkrk commented Sep 14, 2023

I'm sorry to hear that. I guess (as stated in the description) the overload control feature needs calibration on ESP32.
I set those limits up for ESP8266 only (very little RAM, etc.), which may trigger the functionality prematurely on ESP32.

In particular:
#define MAX_NUM_OF_HTTP_REQUESTS 3

is used by the code:

if(svr->numOfRequests > MAX_NUM_OF_HTTP_REQUESTS){
  AsyncWebServerResponse *response = r->beginResponse(429);

Dear @GeorgeFlorian & @MetriciRO, would it be possible for you to determine a proper number for ESP32 experimentally?
I do not have the chip, so I cannot do it ... You would then need to #ifdef the number and set it separately for each of these platforms in src/WebServer.cpp.
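
A possible shape for that per-platform #ifdef is sketched below. The value 3 is the one quoted above; the ESP32 value is only a placeholder that still has to be found by experiment, and the outer #ifndef (which would allow overriding the limit from a build flag) as well as the helper tooManyRequests are assumptions, not existing code.

```cpp
// Allow a build flag such as -DMAX_NUM_OF_HTTP_REQUESTS=... to override
// the per-platform defaults (assumed convenience, not in the PR).
#ifndef MAX_NUM_OF_HTTP_REQUESTS
  #if defined(ESP8266)
    #define MAX_NUM_OF_HTTP_REQUESTS 3  // value quoted in the thread
  #elif defined(ESP32)
    #define MAX_NUM_OF_HTTP_REQUESTS 8  // hypothetical placeholder, untested
  #else
    #define MAX_NUM_OF_HTTP_REQUESTS 3  // conservative fallback
  #endif
#endif

// Hypothetical helper using the same comparison as the quoted check.
inline bool tooManyRequests(unsigned numOfRequests) {
  return numOfRequests > MAX_NUM_OF_HTTP_REQUESTS;
}
```

Raising the limit step by step while watching the free heap under load would be one way to calibrate the ESP32 value.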
