Replace PRNG with hardware RNG #4225

DedeHai · 2024-10-25T06:20:42Z

Both ESP8266 and ESP32 have a hardware random register. This update makes use of that. It is slightly faster than the fastled variants but mostly it is truly random, even when the timing limitations stated in the datasheet are disregarded. Also saves a bit on code size.

Replaced all random8() and random16() calls with new hw_random() versions
Not replaced in FX where PRNG is required

I started this for the particle system, where good randomness is required and made a more elaborate version with this PR.

As a test I filled an array of 128 with random numbers from both fastled random16() and hw_random() and found that the hardware variant gives a perfectly linear distribution while fastled PRNG does not: generate lots of numbers, sort them ascending in excel, plot and add a trend line to compare, see if the trendline goes (approximately) through (0,0). hw_random() results does, fastled random does not.

Please Review the changes to make sure I did not replace where not adequate. Also: if there is a better place than fcn_declare.h to add the #include and #define I can move it.

Both ESP8266 and ESP32 have a hardware random register. This update makes use of that. It is slightly faster than the fastled variants but mostly it is truly random, even when the timing limitations stated in the datasheet are disregarded. Also saves a bit on code size. - Replaced all random8() and random16() calls with new hw_random() versions - Not replaced in FX where PRNG is required

softhack007 · 2024-10-25T06:33:23Z

Hi @DedeHai,

I have two points we maybe need to discuss:

A) on esp32, is there a benefit compared to esp_random() ?

B) there was a design decision (time before you joined) to use fastled random functions in code related to effects. The reason behind was that some users wanted to have effects in sync on different devices (aka 'super-sync') which means all effects have the same timebase (strip.now) , plus they use a deterministic random function with the same seed. I see the benefit of a good random distribution , though. The previous decision for effects to use fastled random is not forever so let's discuss.

Besides much better random distribution, did you measure if performance of effects improves?

DedeHai · 2024-10-25T06:43:36Z

thanks for the feedback

esp_random() uses a delay loop to ensure entropy, that is what I first used and found it is not necessary for WLED but probably required for encryption purposes. also that function is not available on ESP8266.
I did read that PR discussion, these changes would not affect that if I am not mistaken: the random seed of the PRNG at bootup makes all calls to random8() and random16() different on different devices. The FX that require sync I left untouched (the ones that call get seed and set seed) but please correct me if I am mistaken.
I did not test performance and I would not expect it to be any different: in my test, calling both random16() and hw_random16() gave a maybe 5% speed boost (a few microseconds difference at 512 calls IIRC) so not noticeable if you do not call it thousands of times. But replacing all calls with the hw version makes code smaller by 300 bytes and gives true randomness.

as mentioned, I made this primarily for the particle system, where I found that random16() is just no random enough, it biases some of the effects (waterfall for example) and I have to use random() which is slower than this version. Still need to port this over and see if it makes any noticeable difference.

blazoncek

Changing hundreds of lines, where preprocessor can do it for you, is prone to errors.

wled00/fcn_declare.h

softhack007 · 2024-10-25T10:40:15Z

the fastled PRNG is just a few lines of code. could put that in util.cpp as prandom16() or something similar and then explicitly distinct between PRNG and RNG. I can easily do that if someone can point me towards the ideas behind super-sync.

@DedeHai the topic was actually split over different "threads" of discussion. I think these are most important parts

Seed FastLED's RNG #3552
- --> Introduced the idea to always use strip.now (instead of millis()) and replaced all remaining random() calls with fastled random functions
- This means AC WLED only implemented a "lightweight" version of super sync. If I remember correctly, the main objective was to have segments in sync as much as we can, plus allow effect to do "almost the same thing" when running on different devices that are "synced"
The full "super sync" idea originated in the MM fork. I have to say this activity was never 100% competed - we mainly tried to sync single effects to understand what's needed to make them "super sync".
- Replace usage of millis() in effects with strip.now - sync effects across wled instances MoonModules/WLED#72

Ignore requirement to use preprocessor.

DedeHai · 2024-10-25T16:30:50Z

So the supersync idea using strip.now is still at a concept stage, is that correct?
IMHO the idea is flawed. Sync would need to be frame based, not time based between different controllers. My reasoning is this: the random16() functions are called by frame, if one controller runs at 60FPS and another doing more segments for example runs at 50FPS, they will not be in sync. Or what is the idea?
To me, that is a seperate topic and does not really have anything to do with this PR. The changes to FX are easy to revert back to a PRNG, the changes I made should not affect the current state of things.

blazoncek · 2024-10-25T19:20:46Z

My reasoning is this: the random16() functions are called by frame, if one controller runs at 60FPS and another doing more segments for example runs at 50FPS, they will not be in sync.

Thank you for putting it this way. I could not figure out what I disliked about "super-sync" but it looks this is one of the things that bothered me.

DedeHai · 2024-12-20T06:49:46Z

where do we stand on this PR? The way I see it the issues raised about random vs. pseudo random and supersync are something that needs to be addressed seperately and for the time being this PR improves true randomness of the current implementation.

willmmiles · 2024-12-20T14:24:38Z

I'm happy to accept a faster random implementation. I think "Super-sync" might be better implemented by the "copy FX output" PR - it could be used to target network strip types.

softhack007 · 2024-12-20T21:35:46Z

where do we stand on this PR? The way I see it the issues raised about random vs. pseudo random and supersync are something that needs to be addressed seperately

Hi @DedeHai, I think the PR is ok. To come back to the "supersync" discussion, there are several possible use cases

same board, several segments (with different sizes) should show the same effect "in sync"
--> could not be done with the "copy segment" effect, however will not work as-is today anyway, because we would need to make sure that all effects start with the same random seed (effect already use a common "clock" from strip.now).
--> Possible, but not fully implemented yet.
as a variant of 1), but with several outputs or segments having the same dimensions. To keep outputs in sync, the best way is to mirror busses using "custom bus start index".
the "super-sync cluster" - running the same effect on several boards, each board controlling a small patch of the fixture. For example 9 esp32, each running a section of "octopus" on its own 24x24, giving you a 72x72 octopus effect.
--> this was the original usecase that we explored in the MoonModules fork. This use case requires synchronization of local clocks, framerates, and random seeds.
--> However this was at a time where a local 72x72 effect was so slow that it would crawl along at a handfull of FPS.
--> In fact I think with the speedups we have already achieved, and (maybe) using the new async udp output driver from @troyhacks, this use case can be implemented better with one esp32-S3 running the complete effect at 128x128 (30fps), and then pushing pixels to 9 sub-devices via DDP or art-net.

My summary is that only case #1 (and maybe #2) could still be relevant to users. Case 1 could still be implemented, if we implement a "switch" in hw_random() that falls back to fastled pseudo-random when the user explicitly wants this. Plus a bit of extra code to always set the same random seed before calling each effect function.

So yes, let's proceed to merge.

DedeHai · 2024-12-21T10:34:25Z

I thought about this and came up with a possible solution:

add an option to "enable segment sync"
add functions (or maybe a class) to generate random numbers:
- if segment sync is disabled, all effects use HW RNG
- if sync is enabled it uses a PRNG - the prng has a current seed and a "frame seed", in service() the PRNG is reset to frame seed before calling each FX, ensuring they use the same random numbers. before calling the first FX, it advances the "frame seed"
PRNG will be a bit slower, on ESP32 the hardware CRC unit can be used to "shuffle" random numbers for better randomness (I used this on a STM32 once and it gives good results) -> will be a bit slower but should still be pretty fast, CRC takes 16 clock cycles for a 32bit number. ESP8266 does not have a CRC unit though so if sync between controller is implemented, sync between ESP32 and ESP8266 would not be possible
doing it this way may even allow particle FX to be in sync, given that the segment size is identical
I am not sure which FX also require the segment to be the same size for this to work

edit:
gave this some more thought and the proposed solution could be done by having two functions to get a 32bit random number: one returns the hardware register, one returns a xorshift random number which should have very good randomness at the cost of a few extra instructions compared to fastled random.
using a function pointer and a struct or class allows to choose one or the other (globally) with a "set_rng_mode()" function that assigns the correct function pointer.
all non-FX random numbers can still use the hw_randomx() functions from this PR that always use the hardware RNG, FX use rng_random(). the disadvantage is that using a function pointer, inlining is of course no longer possible, resulting in function call overhead, nothing to be too worried about though.

Questions:

if a "per frame" PRNG as proposed is used would that solve the sync for all or at least most FX?
this approach would NOT allow to sync segments in groups, for example have 4 segments with the same FX, sync 2 and 2 -> the copy segment FX could be used in such a case
any other requirements?

I could add this PRNG approach easily if requirements are clear, so additional comments and inputs are welcome.

edit2: for the math enthusiasts: original paper about xorshift PRNG https://www.jstatsoft.org/article/download/v008i14/916

willmmiles · 2024-12-21T15:19:30Z

same board, several segments (with different sizes) should show the same effect "in sync"
--> could not be done with the "copy segment" effect, however will not work as-is today anyway, because we would need to make sure that all effects start with the same random seed (effect already use a common "clock" from strip.now).
--> Possible, but not fully implemented yet.

I'm curious - is there an example of an FX that uses randomness but will be "in sync" with different segment sizes? Even if using the same PRNG/seed for each frame, I'd expect that most FX will render distinctly differently if the segment sizes aren't identical, eg. particles will be placed at different locations because they modulo over the segment size, or the like.

I definitely agree that "super-sync" cases 2 and 3 are best solved with mapping (either bus tools, copy FX, DDP, or the like).

Re PRNG: My $0.02 is that -- if there is a use case for #1, above -- then we should probably always use the PRNG for FX (seeded by the hw RNG or sync'd value) rather than make dynamic function calls. Keep it simple.

softhack007 · 2024-12-21T16:48:44Z

is there an example of an FX that uses randomness but will be "in sync" with different segment sizes

Yes there are some - like multi comet (random "launch" of comets), or drip (random "drop-off"), or fireworks starburst (random "bursts"). Plus some trivial ones that are not random, but strictly time/frame dependant - like twoDots, Akemi, DNA, GEQ, ....

willmmiles · 2024-12-21T18:01:36Z

is there an example of an FX that uses randomness but will be "in sync" with different segment sizes

Yes there are some - like multi comet (random "launch" of comets), or drip (random "drop-off"), or fireworks starburst (random "bursts"). Plus some trivial ones that are not random, but strictly time/frame dependant - like twoDots, Akemi, DNA, GEQ, ....

All of those have size-dependent RNG usage:

Multi comet uses the strip length to decide when to launch - different lengths will diverge quickly.
Drip will diverge if the vertical size differs (drops will hit bottom and reset at different times). For the same vertical dimension, partial "copy fx" would be equivalent
Fireworks starburst count and locations depend on segment length, so launch times will diverge

Synchronized rendering (ie. keeping frame counts in sync) is another ball of wax entirely. We'd either have to decouple the model calculation from the frame output timing (ie. separate physics from rendering - would change a lot of current FX); arrange bidirectional communication so no segment or node moves on to the next frame until everyone's up to date; or loop in the FX function until it's "caught up" if we notice we've fallen behind. All of which are doable but nontrivial. :(

DedeHai requested a review from blazoncek October 25, 2024 06:20

softhack007 requested review from netmindz, willmmiles, softhack007 and Aircoookie October 25, 2024 06:40

softhack007 added enhancement optimization re-working an existing feature to be faster, or use less memory labels Oct 25, 2024

blazoncek previously requested changes Oct 25, 2024

View reviewed changes

wled00/fcn_declare.h Show resolved Hide resolved

softhack007 added this to the 0.15.1 candidate milestone Nov 2, 2024

softhack007 added the discussion label Nov 2, 2024

DedeHai mentioned this pull request Dec 6, 2024

replaced repeated progress() calculation calls with a variable #4256

Merged

netmindz changed the base branch from 0_15 to main December 16, 2024 13:30

Merge remote-tracking branch 'upstream/main' into 0_15_hardware_random

6f82aa0

DedeHai added 2 commits December 20, 2024 18:36

Merge remote-tracking branch 'upstream/main' into 0_15_hardware_random

3e280ac

replaced missed random() with hw_random()

6bd8187

DedeHai removed the discussion label Dec 20, 2024

DedeHai merged commit 5f77478 into Aircoookie:main Dec 20, 2024
19 checks passed

DedeHai added the discussion label Dec 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace PRNG with hardware RNG #4225

Replace PRNG with hardware RNG #4225

DedeHai commented Oct 25, 2024

softhack007 commented Oct 25, 2024 •

edited

Loading

DedeHai commented Oct 25, 2024 •

edited

Loading

blazoncek left a comment

softhack007 commented Oct 25, 2024 •

edited

Loading

DedeHai commented Oct 25, 2024

blazoncek commented Oct 25, 2024

DedeHai commented Dec 20, 2024

willmmiles commented Dec 20, 2024

softhack007 commented Dec 20, 2024 •

edited

Loading

DedeHai commented Dec 21, 2024 •

edited

Loading

willmmiles commented Dec 21, 2024

softhack007 commented Dec 21, 2024 •

edited

Loading

willmmiles commented Dec 21, 2024

Replace PRNG with hardware RNG #4225

Replace PRNG with hardware RNG #4225

Conversation

DedeHai commented Oct 25, 2024

softhack007 commented Oct 25, 2024 • edited Loading

DedeHai commented Oct 25, 2024 • edited Loading

blazoncek left a comment

Choose a reason for hiding this comment

softhack007 commented Oct 25, 2024 • edited Loading

DedeHai commented Oct 25, 2024

blazoncek commented Oct 25, 2024

DedeHai commented Dec 20, 2024

willmmiles commented Dec 20, 2024

softhack007 commented Dec 20, 2024 • edited Loading

DedeHai commented Dec 21, 2024 • edited Loading

willmmiles commented Dec 21, 2024

softhack007 commented Dec 21, 2024 • edited Loading

willmmiles commented Dec 21, 2024

softhack007 commented Oct 25, 2024 •

edited

Loading

DedeHai commented Oct 25, 2024 •

edited

Loading

softhack007 commented Oct 25, 2024 •

edited

Loading

softhack007 commented Dec 20, 2024 •

edited

Loading

DedeHai commented Dec 21, 2024 •

edited

Loading

softhack007 commented Dec 21, 2024 •

edited

Loading