ESP_ERR_INVALID_RESPONSE after few days of work (IDFGH-13723) #74

Open
3 tasks done
Silvesterrr opened this issue Sep 17, 2024 · 3 comments
Comments

@Silvesterrr

Checklist

  • Checked the issue tracker for similar issues to ensure this is not a duplicate.
  • Provided a clear description of your suggestion.
  • Included any relevant context or examples.

Issue or Suggestion Description

Hello,
I switched from esp-modbus 1.0.7 to 1.0.15. Now, after a few days of operation, I get the error below:

E (872259047) MB_PORT_COMMON: 872259042655:Frame send error = 5
E (872259047) MB_CONTROLLER_MASTER: mbc_master_send_request(97): Master send request failure error=(0x108) (ESP_ERR_INVALID_RESPONSE).

The device polls the slaves in a loop, with one read every 600-1000 ms. Once the error happens, it is shown on every attempt to read any slave.

Note that after a reboot everything works normally.
Even if ESP_ERR_INVALID_RESPONSE occurs, shouldn't the master recover on its own?

And what does error = 5, which is MB_EIO, mean?
Is ESP_ERR_INVALID_RESPONSE the result of MB_EIO, or the other way around?
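For reading the two log lines side by side, here is a minimal sketch that turns the numeric codes into names. The FreeModbus `eMBErrorCode` order and the `esp_err_t` values are written from the public headers as commonly shipped and should be verified against your tree; `mb_err_name()` and `esp_err_name_sketch()` are hypothetical helpers, not part of esp-modbus. The order of the log lines suggests the port-level send failure (5) is reported first and the controller then returns 0x108, but the thread does not confirm the exact mapping.

```c
#include <stddef.h>

/* FreeModbus port-level error codes (eMBErrorCode), in enum order.
 * Index 5 is MB_EIO, matching "Frame send error = 5" in the log.
 * Assumption: the order matches the headers in your esp-modbus version. */
static const char *const k_mb_err_names[] = {
    "MB_ENOERR", "MB_ENOREG", "MB_EINVAL", "MB_EPORTERR",
    "MB_ENORES", "MB_EIO",    "MB_EILLSTATE", "MB_ETIMEDOUT",
};

/* Hypothetical helper: translate the number after "error =" into a name. */
const char *mb_err_name(int code)
{
    size_t n = sizeof(k_mb_err_names) / sizeof(k_mb_err_names[0]);
    if (code < 0 || (size_t)code >= n) {
        return "unknown";
    }
    return k_mb_err_names[code];
}

/* Hypothetical helper: name a few esp_err_t values close to the one
 * that appears in the log (0x108). */
const char *esp_err_name_sketch(int code)
{
    switch (code) {
    case 0x000: return "ESP_OK";
    case 0x103: return "ESP_ERR_INVALID_STATE";
    case 0x107: return "ESP_ERR_TIMEOUT";
    case 0x108: return "ESP_ERR_INVALID_RESPONSE";
    default:    return "other";
    }
}
```

On target, `esp_err_to_name()` from ESP-IDF already covers the second helper; the table is only meant for reading captured logs off-device.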

Maybe I should implement vMBMasterRxFlush() as discussed in the other issue?

I'm using a Modbus RTU master together with a Modbus TCP slave.

Here is my sdkconfig:
sdkconfig.txt

@github-actions github-actions bot changed the title from "ESP_ERR_INVALID_RESPONSE after few days of work" to "ESP_ERR_INVALID_RESPONSE after few days of work (IDFGH-13723)" Sep 17, 2024
@alisitsyn
Collaborator

alisitsyn commented Sep 17, 2024

Hello @Silvesterrr,

I need more information to identify the reason for the issue. Could you capture and send me a larger portion of the log with the debug severity set in the kconfig menu? It should include the last log messages before the error occurs. I need to check how the master and slave are used in your device application.

ESP_ERR_INVALID_RESPONSE means that the RTU master sent a request to the slave and either got an incorrect response from it, or the response was fragmented because the slave was still responding to the previous transaction while the new one was in progress. This may be due to an increased slave response time that is longer than the time between transactions, combined with an unsuitable value of the slave response time option in the master (500 ms). The log can clarify this.
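The timing relationship described above can be sketched as a small classifier. This is a simplified model, not library code: it assumes each request starts `poll_period_ms` after the previous one, that the poll period is at least as long as the response timeout, and that the slave's latency is measured from the start of the request. All names are hypothetical.

```c
#include <stdint.h>

typedef enum {
    RESP_IN_TIME,        /* arrives before the master's response timeout */
    RESP_LATE_BUS_IDLE,  /* arrives after the timeout, before the next request */
    RESP_LATE_OVERLAPS   /* arrives at or after the start of the next request */
} resp_timing_t;

/* Classify where a slave response lands relative to the master's schedule.
 * All times are in milliseconds, measured from the start of the request. */
resp_timing_t classify_response(uint32_t resp_timeout_ms,
                                uint32_t poll_period_ms,
                                uint32_t slave_latency_ms)
{
    if (slave_latency_ms <= resp_timeout_ms) {
        return RESP_IN_TIME;
    }
    if (slave_latency_ms < poll_period_ms) {
        return RESP_LATE_BUS_IDLE;
    }
    return RESP_LATE_OVERLAPS;
}
```

With the numbers from this thread (500 ms timeout, 600 ms shortest poll period), a slave that answers after 650 ms falls into the overlapping case, which is the situation that can fragment or corrupt the next transaction.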

@Silvesterrr
Author

Silvesterrr commented Sep 18, 2024

Sure,
I was waiting for the problem to occur again. It worked for ~23 h and then failed.
Here is a portion of the log with debug verbosity:
centrala_2_18_09_2024.txt
The device worked perfectly fine until line 6185, where it received some sort of data and failed?
Generally the slaves should not send any data of that length, so that is weird.

From that moment on, the device can't send any more frames.
I verified this with an external RS485 dongle and can confirm the device is not sending any frames after that point.

Just to clarify the behavior: there are 8 other devices on the RS485 network. My device tries to read IDs 1-16 in a loop. It reads 1-8 and fails to read 9-16. That is the expected behavior of my code.
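As an application-level workaround until the root cause is found, one option is to detect the stuck state and restart the master instead of rebooting the whole device. The sketch below only counts consecutive ESP_ERR_INVALID_RESPONSE results (0x108 in the log), so the expected failures on IDs 9-16 do not trigger it as long as they report a different error. The `link_monitor_*` names are hypothetical, and whether tearing down and re-creating the master controller actually clears this condition is an assumption that has not been confirmed in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ERR_INVALID_RESPONSE 0x108  /* value taken from the log */

typedef struct {
    uint32_t threshold;    /* consecutive errors that trigger a restart */
    uint32_t consecutive;  /* current run of INVALID_RESPONSE results */
} link_monitor_t;

void link_monitor_init(link_monitor_t *m, uint32_t threshold)
{
    m->threshold = (threshold == 0) ? 1 : threshold;  /* at least one error */
    m->consecutive = 0;
}

/* Feed the result of every request; returns true when the caller should
 * restart the Modbus master. Any other result (success, timeout, ...)
 * resets the run. */
bool link_monitor_update(link_monitor_t *m, int err)
{
    if (err == SKETCH_ERR_INVALID_RESPONSE) {
        if (m->consecutive < m->threshold) {
            m->consecutive++;
        }
    } else {
        m->consecutive = 0;
    }
    return m->consecutive >= m->threshold;
}
```

In the polling loop, the return value of each request would be passed to `link_monitor_update()`, and a `true` result would lead to the restart path of your choice.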

@alisitsyn
Collaborator

alisitsyn commented Sep 19, 2024

The device worked perfectly fine until line 6185, where it received some sort of data and failed?
Generally the slaves should not send any data of that length, so that is weird.

From what I can see in the log, the slave response time is 500 ms in your project.
Timestamps 81748191 - 81749091:
At some stage the slaves stop responding properly. The master sends a request, but the slaves respond only right after the slave response time has expired. The master then tries to send the next request but cannot, because the slaves are sending their delayed responses.
On line 6185 the situation is very similar, but after the timeout you get a bunch of unexpected data, and it arrives right after the start of a new master transaction; otherwise the UART buffer would have been cleared. It looks like all your slaves become active and try to respond at the same time, or another master on the RS485 bus is active. I can only guess what happened on the bus to cause the collisions, with the receiver getting 66 bytes right after the start of the transaction request. So I suppose something happened on your bus segment, and this may have caused the issue. This needs further inspection of the code. Could you check the v2 implementation of RTU under the same conditions?
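As a quick sanity check on the 66-byte figure, here is a sketch of the frame-length arithmetic for a Modbus RTU register read. It assumes the master uses function 0x03 or 0x04, which the thread does not state; the helper names are hypothetical. A normal response to such a read is 1 address byte, 1 function byte, 1 byte-count byte, 2 bytes per register, and 2 CRC bytes, so its length is always odd.

```c
#include <stddef.h>

/* Length of an RTU response to a read of n_regs registers (0x03/0x04):
 * address(1) + function(1) + byte count(1) + 2 * n_regs + CRC(2). */
size_t rtu_read_regs_response_len(unsigned n_regs)
{
    return 5u + 2u * (size_t)n_regs;
}

/* Inverse check: how many registers would a response of this length carry?
 * Returns -1 if no valid single read response has this length
 * (valid range is 1..125 registers, i.e. 7..255 bytes, odd lengths only). */
int rtu_regs_for_response_len(size_t len)
{
    if (len < 7 || len > 255 || ((len - 5) % 2) != 0) {
        return -1;
    }
    return (int)((len - 5) / 2);
}
```

Under that assumption, `rtu_regs_for_response_len(66)` returns -1, so 66 bytes cannot be one well-formed register-read response. That would be consistent with several frames overlapping on the bus, although it does not prove it.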

Please also try disabling the CONFIG_FMB_TIMER_PORT_ENABLED kconfig option (CONFIG_FMB_TIMER_PORT_ENABLED=n); this may help. I need to reproduce this issue to confirm and fix it, and I will do so once I have time. Please let me know the results of the checks described above.
Thanks.
