Explore smaller float code with uint32_t #221

Open
charlesnicholson opened this issue Jul 29, 2022 · 22 comments

@charlesnicholson
Owner

Right now the code that splits floats up into [integer portion | number of leading 0's after decimal | decimal portion] stores the decomposed integer + decimal portions as uint64_t. This is very very expensive (in time and space) on 8-bit platforms like the eZ80.

Nanoprintf captures full floating-point precision, but truncates instead of rounds as a performance optimization.

Explore a compile-time flag that intentionally sacrifices precision for size/speed by storing the integer + decimal portions as uint32_t's instead.

cc @mateoconlechuga

@manuprendlair

Hello,

I would like to use nanoprintf with a C compiler that has no 64-bit integers (long long is restricted to 32 bits).
I did the job for all the other variables, but I don't know how to convert the float code to 32 bits as @charlesnicholson suggests.
I tried some hacks, but there is something I'm not understanding.
@mateoconlechuga, could you explain the way to do it?

Thanks

@charlesnicholson
Owner Author

I'll take a shot at it sometime in the next few days; the tricky part is that breaking the float up into 3x integers (integer portion, fractional portion as an integer, and number of leading zeroes after the decimal point) currently requires 64 bits of working space.

Simply changing all of the uint64_t to uint32_t is the first step (along with changing the 64 to a 32 for "max bits we can handle for a fractional part"), but then some of the tests fail for fractional numbers that have no mantissa.

@manuprendlair

Yes, I did all of this and it failed for me as well.

@charlesnicholson
Owner Author

One nice thing about the current approach is that it doesn't actually do any floating point operations. Unfortunately, the 64-bit variables are very hard to get rid of.

I did sketch out a quick-and-dirty replacement here https://github.com/charlesnicholson/nanoprintf/pull/243/files that uses floating point operations but only uint32_t. It's not quite as precise, but it did pass all but 4 of the conformance tests so it's almost certainly "good enough".

Could I ask anyone interested to try the PR branch and give feedback? If it's satisfactory, I wouldn't mind adding another compile-time configuration macro to pick the version; it's a reasonable tradeoff.
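
For anyone following along, the general shape of a float-ops-plus-uint32_t decomposition is roughly the following. This is an illustrative sketch only, with hypothetical names; it is not the code in the PR, and it truncates rather than rounds, like nanoprintf itself.

/* Illustrative sketch (not the PR's code): split a non-negative float into
   integer part, leading-zero count after the decimal point, and fractional
   digits, using float arithmetic and only uint32_t storage. */
#include <stdint.h>

typedef struct {
  uint32_t int_part;   /* digits left of the decimal point */
  uint32_t frac_part;  /* fractional digits packed into one integer */
  int leading_zeros;   /* zeros between the decimal point and frac_part */
} float_parts;

static float_parts split_float(float f, int frac_digits) {
  float_parts p = { 0u, 0u, 0 };
  p.int_part = (uint32_t)f;                 /* truncation, like nanoprintf */
  float frac = f - (float)p.int_part;
  int leading = 1;
  for (int i = 0; i < frac_digits; ++i) {
    frac *= 10.0f;
    uint32_t digit = (uint32_t)frac;
    frac -= (float)digit;
    if (digit == 0u && leading) {
      ++p.leading_zeros;                    /* still in the leading zeros */
    } else {
      leading = 0;
      p.frac_part = (p.frac_part * 10u) + digit;
    }
  }
  return p;
}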

@manuprendlair

That's great!
I will test it today or tomorrow.
Thanks, Charles!

@charlesnicholson
Owner Author

It's a garbage implementation, but if it works "well enough" I think it's straightforward to add rounding, and it can use the powers-of-ten table trick to get rid of loops.

I'm not excited by how much larger it is than the u64 version, but if it allows the project to support more architectures, that's probably best for the users :)
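
For reference, the powers-of-ten table trick mentioned above is roughly the following; the names are hypothetical and this isn't code from the repository.

/* Sketch of the powers-of-ten table idea: replaces a scaling loop with a
   single indexed multiply when converting a fraction to a fixed number of
   decimal digits (prec <= 9 so the result fits in a uint32_t). */
static uint32_t const pow10_table[10] = {
  1u, 10u, 100u, 1000u, 10000u, 100000u,
  1000000u, 10000000u, 100000000u, 1000000000u
};

/* e.g.: uint32_t scaled = (uint32_t)(frac * (float)pow10_table[prec]); */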

@manuprendlair

Hello,

I tested the 32-bit version. I had to change the in-body variable declarations because my compiler is strict C (an old HCS12 CodeWarrior) and fix some types the compiler doesn't know (like uintptr_t), and it works as expected.
Sorry, I didn't test every printf specifier, just floats with the width and precision aspects.

I could upload my strict-C "fixed" version if needed.

Regards

@charlesnicholson
Owner Author

I'd love to see it, if you want to open it up as a PR. (Unsure w/o looking at it first which specific branch would land into main, but if you're opening a PR anyway please add yourself to the contributors file!)

@Okarss
Contributor

Okarss commented Jan 22, 2024

Hello (again), guys!

After doing #255, I started looking at this issue. After some thinking and tinkering with a calculator, I came up with an algorithm that holds a 28-bit value but moves left/right through the bits/digits by doing multiplication/division iteratively and dynamically. For now I have implemented some experimental code just for the fractional part to see what it's practically capable of. Here are some output examples of the standard printf() (first) compared to the experimental code (second).

0.0625
0.0625

0.0039062504656612873077392578125
0.0039062504656612873077392578125

0.00195312523283064365386962890625
0.001953125230967998504638671875

0.66666666666666662965923251249478198587894439697265625
0.666666664183139801025390625

0.0000000000000000000006666666666666666363714841738585811436529307...
0.000000000000000000000666666634380817413330078125

Pros:

  • The algorithm uses only 32-bit integer arithmetic and provides 28-bit (8+ decimal digits) precision.
  • The algorithm has no range limitation.
  • The algorithm doesn't calculate the digits beyond the requested precision.
  • The code doesn't do the double=>float conversion and extracts the value directly from the double type.
  • The integer part can be implemented in a similar manner to get rid of 64-bit arithmetic completely.
  • The version with 64-bit arithmetic can be added back in the same "moving window" algorithm as a compile time option and can provide 60-bit (18+ decimal digits) precision with all of the previously mentioned benefits.

Cons:

  • The more it has to move to the right of the decimal separator, the more of an error it accumulates.
  • The error is data dependent.

The dynamic nature of the error can be seen in the last two examples, which print the value 2/3 and the same value divided by 10^21. The first prints eight '6' digits, while the second prints just seven. But in my opinion the precision is pretty good anyway. So, are we willing to take such a sacrifice?
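
A minimal sketch of the idea as described, for the fractional part only. This is my own reading of the description, not the actual experimental code, and the window setup here uses frexp() for brevity, whereas the described code extracts the bits directly from the double representation.

/* Keep a "window" of at most 28 mantissa bits as value = frac * 2^(-exp).
   Multiplying by 10 pulls out one decimal digit at a time; renormalizing by
   shifting right is where the data-dependent truncation error comes from. */
#include <math.h>
#include <stdint.h>
#include <stdio.h>

static void print_fraction(double v, int digits) { /* v in [0, 1) */
  int e;
  double m = frexp(v, &e);                          /* v = m * 2^e, m in [0.5, 1) */
  uint32_t frac = (uint32_t)(m * (double)(1u << 28)); /* top 28 mantissa bits */
  uint32_t exp = (uint32_t)(28 - e);                /* v = frac * 2^(-exp) */
  putchar('.');
  while (digits--) {
    frac *= 10u;                        /* still fits: 10 * 2^28 < 2^32 */
    uint32_t digit = 0u;
    if (exp < 32u) {                    /* the window has reached this digit */
      digit = frac >> exp;
      frac &= (1u << exp) - 1u;
    }
    putchar((int)('0' + digit));
    while (frac >= (1u << 28)) {        /* renormalize; truncates low bits */
      frac >>= 1u;
      --exp;
    }
  }
}

Feeding 0.0625 through this sketch reproduces the exact ".0625" above, and values far to the right of the decimal separator lose low bits each renormalization step, matching the error behaviour listed in the cons.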

@charlesnicholson
Owner Author

Hello again! It's great to hear from you; this sounds very interesting and promising! It aligns very much with the float philosophy I've had for nanoprintf, which is roughly "code size and compatibility are more important than perfect round-trip float/double support", especially if there's a compile-time flag that gives users the option to trade speed/space for more precision. To use nanoprintf in the first place users clearly need to know what they're getting into, so I don't mind burdening them with cognitive overhead like this; it's easy to experiment and see what works best for a given codebase.

Do you have any initial estimates about the impact (positive or negative) to the binary size? Now that Actions automatically run on every PR, it's hopefully easier to do experiments (check out the "Size Report" job at the very bottom of the action run). As long as it's in the same ballpark, I'd be happy to incorporate this, and please update the readme.md to credit yourself with the details!

@Okarss
Contributor

Okarss commented Jan 22, 2024

Then it seems that I have captured the philosophy as it was intended. :) At this point the code is just a very crude experiment in Visual Studio, completely separate from nanoprintf, so I cannot give exact numbers, but I can give an estimate based on the amount and logic of the code. The 64-bit version should be about the same, both size- and speed-wise. The 32-bit version should be both smaller and faster by a significant amount. For the latter, the only 64-bit operations left will be a few bitwise operations when extracting the bits from the original double value. But that code is similar to what the current double=>float conversion does anyway.

By the way, I did a funny experiment for the integer part by moving the decimal separator to the left and then printing the whole number as a fraction with a single loop. It is probably the smallest implementation possible, but, because the precision has an impact on the right side, the result is this:

32.0
31.99999995529651641845703125

32.0625
32.0624999701976776123046875

Not only does it "damage" the integer part, but also the nicely printable fractional part. It will do this when the integer part is larger than '9'. In many cases rounding can come in and save the day, but I guess this is not what people will accept, and I don't like it either.

As unintuitive as it seems, for the "moving window" approach the integer part is trickier than the fractional part. So, the plan now is to implement a symmetric algorithm which prints the integer part nicely up to about 2^32 and then starts stuffing zeros to the right for larger values. Then I have to integrate rounding, process special values, and then there is a question about the limits. As the algorithm itself has no range limits, and even the 32-bit version can provide 32+28 (integer+fraction) bits of precision for numbers around the decimal separator, perhaps we should reconsider the 32-character limit. Maybe it should also be made a compile-time parameter?

@charlesnicholson
Owner Author

Hello! Sorry for the delay; I'm traveling for work.

This all looks very interesting, and I'm not sure increasing the stack usage matters that much for this- if you have an FPU, you can probably afford another 29 bytes of stack usage :)

I'm eager to learn more!

@Okarss
Contributor

Okarss commented Feb 4, 2024

Here is some news on the progress.

Implemented the integer part, rounding and special values as planned. By the way, the current nanoprintf implementation doesn't handle subnormal numbers correctly. It turns out that the C standard doesn't define the sizes of float and double explicitly; for example, on AVR-GCC the size of double can be 32 or 64 bits, and it is 32 bits by default. I dealt with that by using definitions from <float.h>. Improved the internal processing of the dynamic scaling (a more appropriate name for the "moving window" algorithm), which slightly improved the precision. Refactored the code to add a 64-bit processing option and then realized that it now works with unsigned integers of any size... So, here are some output examples of the standard printf() (first) compared to the dynamic scaling with 64-, 32-, 16-, and 8-bit processing respectively:

0.66666666666666662965923251249478198587894439697265625
0.66666666666666662965923251249478198587894439697265625
0.66666666604578495025634765625
0.6666259765625
0.65625

0.00000000000000000000066666666666666663637148417385858...
0.00000000000000000000066666666666666663486340294042520...
0.0000000000000000000006666666567325592041015625
0.000000000000000000000666064453125
0.00000000000000000000054375

9999999999.0
9999999999.0
9999999980.0
9997000000.0
8800000000.0

9999999999999999583119736832.0
9999999999999999568000000000.0
9999999840000000000000000000.0
9991000000000000000000000000.0
8000000000000000000000000000.0

For now, the numbers which are not represented exactly always come out lower than the real values. That is because the dynamic scaling internally preserves at least 29.678 bits (for the 32-bit version), but truncates values instead of rounding while processing. I'm looking into whether this can also be improved with a small amount of code.

Also tried compiling it for Cortex-M7 with GCC and -Os optimization, and the size of the new npf_ftoa_rev() function for the 32-bit version is 472 B. The 64-bit version, of course, pulls in additional code and increases the total size by 856 B. I guess the sizes are promising, especially for the 32-bit version. And integration into nanoprintf is not far away... :)
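
As an aside on the <float.h> point above: a portable way to tell whether double is actually a 32-bit or 64-bit format (e.g. avr-gcc defaults to a 32-bit double) is the standard DBL_MANT_DIG macro. This is a sketch only; every name other than DBL_MANT_DIG is hypothetical.

#include <float.h>

#if DBL_MANT_DIG == 53
  #define NPF_DOUBLE_IS_BINARY64 1   /* IEEE-754 binary64: 53-bit mantissa */
#elif DBL_MANT_DIG == 24
  #define NPF_DOUBLE_IS_BINARY64 0   /* "double" is really binary32 here */
#else
  #error "unexpected double format"
#endif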

@charlesnicholson
Owner Author

Whoa, this is very cool! I'm eager to see it!

@Okarss
Contributor

Okarss commented Feb 7, 2024

Did the first integration into nanoprintf. I'm still working on it, but the published code is pretty complete and decent, so everybody feel free to try it:
https://github.com/Okarss/nanoprintf/tree/float
The rework introduces two optional configuration parameters, which can be defined like this:

#define NANOPRINTF_CONVERSION_BUFFER_SIZE    512
#define NANOPRINTF_CONVERSION_FLOAT_TYPE    uint64_t

Compared to the current v0.4.0 release, the total FLASH size of my test build on Cortex-M7 with GCC and -Os optimization is reduced by just 56 B for the 64-bit version, but by 948 B for the 32-bit version. Also, compared to the 32-bit version, the 16-bit and 8-bit versions increase the size by 48 B and 40 B respectively, which makes sense for a 32-bit CPU.

Note that the size of the npf_vpprintf() function is roughly the same, and the big reduction in total size for the 32-bit version comes from getting rid of the additional compiler-linked 64-bit arithmetic code.
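
For anyone trying the branch, a hedged usage sketch: the two macros above are the new optional parameters, the others are existing nanoprintf options, and the exact semantics here are assumed rather than taken from the branch's documentation.

/* Define the configuration before the single implementation include;
   other existing NANOPRINTF_USE_* options omitted for brevity. */
#define NANOPRINTF_IMPLEMENTATION
#define NANOPRINTF_USE_FLOAT_FORMAT_SPECIFIERS 1
#define NANOPRINTF_CONVERSION_BUFFER_SIZE 64       /* new, optional */
#define NANOPRINTF_CONVERSION_FLOAT_TYPE uint32_t  /* new, optional */
#include "nanoprintf.h"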

@charlesnicholson
Owner Author

Getting rid of the built-in 64-bit stuff is huge! I'll play with it when I'm not working; thanks again for figuring this out :)

@Okarss
Contributor

Okarss commented Mar 17, 2024

Added internal rounding to the dynamic scaling process, which improves precision, especially for very large/small values. Mathematically the code is now complete. There is just one thing I don't like: handling of the fractional part when the integer part exceeds the integer range. Here is an example (actual value : printed value) with 32-bit integers:

4294967295.5 : 4294967295.5
4294967296.0 : 4294967300.0
4294967296.5 : 4294967300.5
4294967297.0 : 4294967300.0

The fact that 4294967297.0 gets printed as a smaller number than 4294967296.5 seems unacceptable to me. This is a consequence of the fact that currently I'm printing the fractional part first. Initially that decision was driven by the fact that it saves the rounding code (for the printed characters) from going over the integer part. Now it seems that it could be more efficient to print the integer part first and readjust the rounding code so that it can carry over all digits, if necessary. And, in case the integer part exceeds the integer range, ignore the fractional part completely and just print zeros instead. At least it would print 4294967300.0 on all of the lines in the example, except for the first one, of course. Does that sound like the right approach?
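
For illustration only, a sketch of what carrying a round-up over all digits could look like, operating on an already-generated, least-significant-first digit buffer (as a reversed ftoa buffer would be); the name and interface are hypothetical.

/* Propagate a round-up carry across the digit buffer. Returns 1 if the
   carry ran past the most significant digit, in which case the caller has
   to emit an extra leading '1'. */
static int round_up_rev(char *digits, int len) {
  for (int i = 0; i < len; ++i) {
    if (digits[i] != '9') {
      ++digits[i];     /* bump this digit and stop */
      return 0;
    }
    digits[i] = '0';   /* 9 rolls over to 0; the carry continues upward */
  }
  return 1;            /* carry out of the most significant digit */
}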

@charlesnicholson
Owner Author

Heya @Okarss - sorry I let this languish :( I think errata at the extrema like this are totally fine, as long as they're documented. Nanoprintf is pretty clearly comfortable trading float precision for implementation size efficiency, so I think clearly explaining the limitations is plenty sufficient. I agree with what you're saying about printing a number smaller than the input this close to the positive limit, and I think what you've done makes fine sense. :)

@charlesnicholson
Owner Author

Also, I'm excited to see how it does on the CI jobs! Feel free to open a PR whenever you'd like. When you do, could I ask you to rewrite the float portion of the docs as well, give yourself front-and-center credit for it, and explain (or link to) your algorithm, please?

@Okarss
Contributor

Okarss commented Mar 31, 2024

Swapped the order of processing the integer and the fraction parts. In the previous example, if the integer part exceeds the integer range, now it will print 4294967300.0 and the fraction part will not be processed at all. Though, if the actual value is 4294967295.96875, now it will still print 4294967296.0 because of the rounding code.

Another interesting thing is the default integer type/size for processing. Currently it is set to unsigned int, which is typically 32 bits on 32/64-bit CPUs and 16 bits on 8/16-bit CPUs. This saves space on the smaller CPUs, while values up to [65535].[9998779296875] can still be printed perfectly. Another sane option would be to default to uint_fast32_t, which prints values up to [4294967295].[99999999813735485076904296875] perfectly, but it increases the code size:

AVR    ( 8-bit): +106 B
MSP430 (16-bit):  +92 B
RL78   (16-bit): +242 B

As the library is named "nano..." and size is the highest priority, I'm inclined to leave the default as unsigned int, which is what's already in the current code. Any comments?
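
A sketch of what such a default could look like; the macro name is the one from the branch, and the fallback logic here is just illustrative, not the branch's actual code.

/* Default the conversion integer type to unsigned int, letting users
   override it with something wider like uint_fast32_t. */
#ifndef NANOPRINTF_CONVERSION_FLOAT_TYPE
  #define NANOPRINTF_CONVERSION_FLOAT_TYPE unsigned int
#endif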

Overall it seems that the code is now ready and I'm nearing a PR. Then I will also update the documentation and add a commit for that. By the way, the algorithm is basically the same as Wojciech Muła's idea, but evolved and extended further. It solves the range problem by implementing scaling, which keeps just the most significant bits of the mantissa during processing. The idea just came to me naturally; I've now actually searched online for this approach and I cannot find anything. Unbelievable...

@charlesnicholson
Owner Author

That is interesting. I agree with your suggestion about unsigned int, but I wonder if it's the kind of thing that might (eventually) make sense to parameterize? It's hard to discuss without seeing the code; I do worry about nanoprintf having a configuration explosion...

I'm really excited to see it! Novel work is very exciting in general, and the floating point stuff has been the weakest part of npf, I think. :)

@Okarss
Contributor

Okarss commented Apr 6, 2024

The code is in the same GitHub branch I linked in an earlier post - there are new commits. The integer type is also the same one I showed in that post; now I'm just reevaluating the choice of its default definition. So in total there are still the same two new optional parameters. One can also see the new npf_ftoa_rev() function there. :)
