Usage usermod #4342

netmindz · 2024-12-03T18:49:03Z

Track anonymised usage data for WLED.

The aim is to be able to answer the following questions:

- How many devices are running WLED?
- Which ESP chips are in use?
- How many LEDs are people using per controller?
- What version of WLED are users running?
- How stable is that version? (Track the uptime to spot trends in lower uptime caused by a higher rate of crashes.)

This will need to be an opt-in feature. That is easier to do as part of the onboarding for fresh installs than for upgrades, so we will need to put out lots of messages asking users to enable it so that we can better support them and prioritise feature development.

The backend server is open-source so that we can provide complete transparency as to what data we capture and how we use it: https://github.com/netmindz/WLED_usage

netmindz · 2024-12-11T02:33:45Z

Note to self - we should also send flash size and perhaps partition info given the current challenges regarding bin size, especially with V4 and also the issues with users installing WLED via Tasmota or non-standard partitioning

willmmiles · 2024-12-11T20:34:54Z

> Note to self - we should also send flash size and perhaps partition info given the current challenges regarding bin size, especially with V4 and also the issues with users installing WLED via Tasmota or non-standard partitioning

Following on with this: for telemetry builds, we might want to think about explicit crash reporting. I've got some code for ESP8266 to write crash dumps to the flash, where it could be uploaded or posted later; and I'm given to believe that ESP32s will automatically do so if there's a crash dump partition left for them.

willmmiles · 2024-12-11T21:10:37Z

From a technical design review standpoint:

+1 for the hash over the MAC address as device ID. Unique but not reversible.
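A minimal sketch of the idea, not the PR's actual code: it derives a fixed-length ID from a 6-byte MAC using 64-bit FNV-1a. The function name `deviceIdFromMac` and the choice of FNV-1a are assumptions made for illustration. FNV-1a is not a cryptographic hash, so a real build would likely want a stronger (and possibly salted) hash to make the "not reversible" property hold up in practice.

```cpp
#include <array>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: derive an anonymous device ID from a 6-byte MAC.
// 64-bit FNV-1a is used here only as a stand-in for whatever hash the
// usermod actually uses.
std::string deviceIdFromMac(const std::array<uint8_t, 6>& mac) {
  uint64_t h = 0xcbf29ce484222325ULL;  // FNV-1a 64-bit offset basis
  for (uint8_t b : mac) {
    h ^= b;
    h *= 0x100000001b3ULL;             // FNV-1a 64-bit prime
  }
  char out[17];                        // 16 hex digits + terminating NUL
  std::snprintf(out, sizeof(out), "%016llx",
                static_cast<unsigned long long>(h));
  return std::string(out);
}
```

The same MAC always yields the same 16-character ID, so the server can count distinct devices without ever seeing the MAC itself.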

I'm not thrilled about a statically allocated packet 100s of bytes in length -- that's a lot of RAM waste on some of our more constrained platforms (8266, S2); especially since the contents themselves are already statically available elsewhere. I'd prefer to do packet assembly on the stack when needed rather than permanently waste RAM.

I also think it would be better to track posting success, rather than periodically spray packets on to the internet -- I think we should aim to minimize the static runtime cost. I'd be inclined to suggest using a TCP connection instead of UDP; construct and send only once, out of the connected() callback. The whole transaction can be rigged up as a self-destructing callback chain using AsyncTCP. If it goes, it goes; if it doesn't, it doesn't; but after that there's no ongoing cost beyond a used-up pointer and an empty loop() callback. (For a few extra bytes of code memory, it could even be made a proper HTTP POST, so we could use off-the-shelf CRUD db code; our client code need not care about the reply.)
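To illustrate how small the "proper HTTP POST" variant is, here is a sketch that assembles a minimal HTTP/1.1 request. The function name `buildUsagePost`, the host, the path and the content type are placeholders, since the thread does not specify an endpoint. `std::string` is used for readability on a host; on a constrained target the same bytes could be formatted into a stack buffer instead. In the one-shot design described above, the resulting bytes would be written once from the connect callback and the connection closed afterwards.

```cpp
#include <string>

// Assemble a minimal HTTP/1.1 POST. Host, path and content type are
// placeholders; the real endpoint is not specified in this thread.
std::string buildUsagePost(const std::string& host, const std::string& path,
                           const std::string& body) {
  std::string req;
  req += "POST " + path + " HTTP/1.1\r\n";
  req += "Host: " + host + "\r\n";
  req += "Content-Type: application/x-www-form-urlencoded\r\n";
  req += "Content-Length: " + std::to_string(body.size()) + "\r\n";
  req += "Connection: close\r\n";  // one request, then tear down
  req += "\r\n";                   // blank line ends the header section
  req += body;
  return req;
}
```

Because the client ignores the reply, nothing beyond building and writing this one buffer is needed on the device side.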

On the service side: where does the database live? Who has the keys (to the data, to the service management), and how are they passed along to the team/made available publicly? Do we need to think about DOS attacks or flood prevention? Particularly with a periodic send approach, scalability quickly becomes a concern - do we need to think about handling 100k live devices? (Should we be so lucky?)

netmindz · 2024-12-12T10:55:37Z

> From a technical design review standpoint:
>
> +1 for the hash over the MAC address as device ID. Unique but not reversible.

Thank you

> I'm not thrilled about a statically allocated packet 100s of bytes in length

That is just my inexperience with C; it sounds like an easy thing to fix.

> I also think it would be better to track posting success, rather than periodically spray packets on to the internet -- I think we should aim to minimize the static runtime cost. I'd be inclined to suggest using a TCP connection instead of UDP; construct and send only once, out of the connected() callback.

To be able to see build stability, we need more than a one-time call, because we need the uptime. The exact frequency is TBD.

> On the service side: where does the database live? Who has the keys (to the data, to the service management), and how are they passed along to the team/made available publicly?

From a cost perspective, the easiest is to point this at a VM on my own dedicated server that the team is all given access to. Happy to discuss other possible options

> Do we need to think about DOS attacks or flood prevention? Particularly with a periodic send approach, scalability quickly becomes a concern - do we need to think about handling 100k live devices? (Should we be so lucky?)

Even at 100k devices each sending, say, one message an hour, that is still very little bandwidth, and we can play around with different storage models for the data. We can lean heavily on the fact that we can accept failure: if we miss an update from a device, so what? Nobody is going to care if a specific update gets lost, and there is no expectation that this will give us 100% visibility. This is another good reason to use UDP rather than TCP: we avoid all the extra overhead of needing to establish a connection, the threading issues relating to handling those connections, nio etc. We just see a stream of packets.
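A back-of-the-envelope check of the bandwidth claim, as a sketch. The 200-byte payload is an assumption (the thread only says "100s of bytes"), as is the one-report-per-hour rate, which was given only as an example. The figures cover payload only and ignore IP/UDP header overhead.

```cpp
#include <cstdint>

// Assumptions for illustration only: 200-byte payload and one report
// per device per hour. Payload bytes only, no IP/UDP headers.
constexpr uint64_t kDevices = 100000;
constexpr uint64_t kPacketBytes = 200;
constexpr uint64_t kReportsPerDay = 24;

// Average arrival rate if reports are spread evenly over the hour.
constexpr double packetsPerSecond() { return kDevices / 3600.0; }

// Average inbound payload bandwidth in bits per second.
constexpr double bitsPerSecond() {
  return packetsPerSecond() * kPacketBytes * 8;
}

// Total payload received per day, in bytes.
constexpr uint64_t bytesPerDay() {
  return kDevices * kReportsPerDay * kPacketBytes;
}
```

Under these assumptions the server sees about 28 packets per second, roughly 44 kbit/s on average, and 480 MB of payload per day before any aggregation.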

willmmiles · 2024-12-12T16:22:50Z

> > I also think it would be better to track posting success, rather than periodically spray packets on to the internet -- I think we should aim to minimize the static runtime cost. I'd be inclined to suggest using a TCP connection instead of UDP; construct and send only once, out of the connected() callback.
>
> In order to be able to see build stability, we do need more than a one-time only call as we need the uptime, the exact frequency TBD

Do we care about uptime in general, or only about the uptime reached before a crash? If we only care about crashes, then we only need to report once at boot, with the uptime from before the last crash. We can store the uptime locally on the device, in memory that is only cleared on power-cycling, and read it back after the crash. ("RTC memory" is one such space, though I found on ESP8266 you could use pretty much any statically allocated variable if you ask the linker nicely to leave it alone; haven't tried ESP32 yet.)
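A host-testable sketch of the "survives a reset" record. The names `UptimeRecord`, `saveUptime`, `loadPreviousUptime` and `kMagic` are hypothetical. On real hardware the record would be a variable placed in memory that startup code does not clear (for example with ESP-IDF's `RTC_NOINIT_ATTR` on ESP32); that placement is omitted here so the validation logic can run anywhere. The check word exists because such memory holds arbitrary data after a power cycle.

```cpp
#include <cstdint>

// Record kept in memory that the startup code does not clear.
// Platform-specific placement attributes are intentionally left out.
struct UptimeRecord {
  uint32_t uptimeSec;
  uint32_t check;  // uptimeSec XOR kMagic, to detect power-on garbage
};

constexpr uint32_t kMagic = 0x574C4544u;  // arbitrary constant ("WLED")

// Call periodically while running to remember how long we have been up.
inline void saveUptime(UptimeRecord& rec, uint32_t uptimeSec) {
  rec.uptimeSec = uptimeSec;
  rec.check = uptimeSec ^ kMagic;
}

// Call once at boot, before the first saveUptime(). Returns true and the
// previous run's uptime only if the record passes the consistency check.
inline bool loadPreviousUptime(const UptimeRecord& rec, uint32_t& out) {
  if ((rec.uptimeSec ^ rec.check) != kMagic) return false;
  out = rec.uptimeSec;
  return true;
}
```

On its own this cannot tell a crash from a deliberate reboot, so a real implementation would probably combine it with the chip's reset reason before reporting.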

All the other concerns are contingent on single update vs continuous update implementation.

> From a cost perspective, the easiest is to point this at a VM on my own dedicated server that the team is all given access to. Happy to discuss other possible options

No objections to the physical arrangement, though IANAL and I can't speak for any potential legal ramifications. Mostly I wanted to pin down how the team is given access. Who has the authority to add another developer to the access list? How is that to be managed? ("Ask politely on Discord" is a reasonable answer, but I do think it should be documented somewhere.)

netmindz added 5 commits December 2, 2024 18:16

Add UsageUsermod

abebb54

Further work on UsageUsermod

1b4b56b

Usage ESP8266 fix

467f2e4

Usage - send data

cc6de12

Usage - add deviceId

11a010f

netmindz requested a review from softhack007 December 3, 2024 18:49

netmindz requested review from DedeHai, willmmiles and Aircoookie December 11, 2024 02:23

DedeHai marked this pull request as ready for review December 11, 2024 09:46

netmindz marked this pull request as draft December 12, 2024 10:25

netmindz changed the base branch from 0_15 to main December 16, 2024 13:24
