invalid byte sequence in UTF-8 #49

Open
anarcat opened this issue Nov 24, 2020 · 8 comments
Comments

@anarcat

anarcat commented Nov 24, 2020

When switching to this module from my dumb sysctl one, I get this error when running my manifest:

Error: Failed to apply catalog: invalid byte sequence in UTF-8

To reproduce:

Add this to a repro.pp file:

  sysctl::value { 'kernel.unprivileged_userns_clone':
    value  => '1',
    target => '/etc/sysctl.d/userns.conf',
  }

Download the 0.0.12 module to test-modules/sysctl, and run this:

puppet apply --modulepath="$PWD/test-modules" repro.pp 

I get this error:

$ puppet apply --modulepath="$PWD/test-modules" repro.pp 
Notice: Compiled catalog for angela.anarc.at in environment production in 0.04 seconds
Error: Failed to apply catalog: invalid byte sequence in UTF-8

This can also be reproduced in the git head. I can't find any corrupt UTF-8 output in the module's source code, so that can't be it.

My guess is that the output of sysctl -a makes Puppet unhappy: it might be trying to decode it as Unicode at some point and failing. And indeed, here, the sysctl -a output isn't UTF-8 clean:

$ sudo sysctl -a | iconv -f utf8 -t latin1 > /dev/null
iconv: illegal input sequence at position 66297

It specifically stumbles upon sunrpc.transports, which less shows as:

sunrpc.transports = <C0><E3><A3><C1>m^?

And indeed, this is happy:

anarcat@angela:puppet(master)$ sudo sysctl -a | grep -v sunrpc.transports | iconv -f utf8 -t latin1 > /dev/null
anarcat@angela:puppet(master)$ 

I would recommend not making any assertions on the encoding of values in the kernel's sysctl output.

@anarcat
Author

anarcat commented Nov 24, 2020

I have naively tried this patch, but still get the error:

--- /var/lib/puppet/lib/puppet/provider/sysctl_runtime/sysctl_runtime.rb	2020-11-24 16:07:38.852719880 -0500
+++ /tmp/puppet-file20201124-5268-94we8q	2020-11-24 16:47:58.740918485 -0500
@@ -17,6 +17,7 @@
     output = executor.execute("#{Puppet::Util.which('sysctl')} -a", {
       :failonfail         => true,
       :combine            => false,
+      :override_locale    => false,
       :custom_environment => {}
     })
     output.split("\n").collect do |line|

I also tried to add LANG or LC_ALL=C.UTF-8 to the custom_environment there, but that doesn't help, apparently.

@duritong
Owner

These characters do not really make sense for that option.

Locally (Fedora) + on a debian/buster64 Vagrantbox I get:

sysctl -a | grep sunrpc.transports
sunrpc.transports = tcp 1048576
sunrpc.transports = udp 32768
sunrpc.transports = tcp-bc 1048576

It would be important to make that reproducible so that Ruby can be made to parse things correctly.

@duritong
Owner

This seems to be caused by a kernel bug in certain Debian versions: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=974193

@anarcat
Author

anarcat commented Nov 25, 2020

but why is ruby trying to decode that stuff in the first place? seems like this code would be more solid if it wasn't trying to make any guesses about the encoding of sysctl...

that said, i see the bug in an earlier version of linux (5.8) so I've fixed the affected versions in the debian BTS. thanks for that! :)

@duritong
Owner

Ruby reads strings and by default assumes the content is UTF-8; it fails while trying to parse it: https://tenderlovemaking.com/2020/01/13/guide-to-string-encoding-in-ruby.html

How else should it know how to represent it internally, so that you can actually read from it?

As this article shows, it highly depends on the context. I would say it is fairly safe to assume that sysctl -a returns proper UTF-8, and the provider should fail if it is unable to parse the output or hits any other issue, which is what it is configured to do.
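For illustration, the failure can be reproduced without Puppet or the affected kernel. This is a minimal sketch, not the module's code: it assumes the bytes that less displayed above for sunrpc.transports, an invented second line, and that the command output ends up tagged as UTF-8. It shows that Ruby can detect the problem without raising, and that the exception only appears once the string is split.

```ruby
# Bytes as less displayed them for sunrpc.transports (<C0><E3><A3><C1>m^?).
# The second line is invented for illustration.
raw = "sunrpc.transports = \xC0\xE3\xA3\xC1m\x7F\nkernel.ostype = Linux\n".b

# Assumption: the output is tagged as UTF-8 when it reaches the provider.
output = raw.dup.force_encoding(Encoding::UTF_8)

# Checking validity does not raise; it just reports the problem.
valid = output.valid_encoding?

# Splitting the broken string is the operation that raises.
split_error = begin
  output.split("\n")
  nil
rescue ArgumentError => e
  e
end

puts "valid_encoding? #{valid}"
puts "split raised: #{split_error.inspect}"
```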

Now, the error is quite generic and probably doesn't give you an immediate clue as to why things failed. So we could wrap it to give more context about where it failed. On the other hand, running puppet agent -t --trace should probably also show you where it fails.
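One way such a wrapper could look is sketched below. The names SysctlEncodingError and parse_sysctl_output are hypothetical and do not exist in the module; the sketch only assumes that the output is a string of "key = value" lines. It scans a binary copy of the output so that locating the offending line cannot itself raise.

```ruby
# Hypothetical error class and helper; neither exists in the sysctl module.
class SysctlEncodingError < StandardError; end

def parse_sysctl_output(output)
  # Split a binary copy: binary strings never count as invalid.
  lines = output.b.split("\n")

  lines.each_with_index do |line, index|
    next if line.dup.force_encoding(Encoding::UTF_8).valid_encoding?

    key = line.split('=', 2).first.to_s.strip
    raise SysctlEncodingError,
          "sysctl -a returned invalid UTF-8 for #{key.inspect} (line #{index + 1})"
  end

  # Every line is valid UTF-8 at this point, so tagging it is safe.
  lines.each_with_object({}) do |line, values|
    key, value = line.force_encoding(Encoding::UTF_8).split('=', 2)
    values[key.strip] = value.strip if value
  end
end

# Usage: the error names the key instead of only reporting the encoding.
message = begin
  parse_sysctl_output("sunrpc.transports = \xC0\xE3\xA3\xC1m\x7F\n")
  nil
rescue SysctlEncodingError => e
  e.message
end
puts message
```

Failing early like this keeps the strict behaviour while pointing at the sysctl key that caused it.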

@anarcat
Author

anarcat commented Nov 26, 2020

latin1, as an encoding, has the surprising property of never failing to decode, for example, even if it might mean garbage in some cases. i'm not sure the kernel actually says what the encoding of sysctl is anywhere, it would be interesting to figure that out.
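The latin1 property mentioned here can be sketched in Ruby as follows. This is an illustration under assumptions, not a proposed patch: the sample bytes are the ones less displayed earlier, the second line is invented, and treating the output as raw bytes is shown as a second option for comparison.

```ruby
# Sample bytes as shown by less for sunrpc.transports; second line invented.
raw = "sunrpc.transports = \xC0\xE3\xA3\xC1m\x7F\nkernel.ostype = Linux\n".b

# Option 1: make no claim about the encoding and keep raw bytes.
# Binary strings are never "invalid", so split cannot raise here.
binary_lines = raw.split("\n")

# Option 2: decode as ISO-8859-1 (latin1), where every byte maps to a
# character, then transcode to UTF-8. Always succeeds, possibly as garbage.
latin1 = raw.dup.force_encoding(Encoding::ISO_8859_1).encode(Encoding::UTF_8)
latin1_lines = latin1.split("\n")

puts binary_lines.length
puts latin1.valid_encoding?
puts latin1_lines.first
```

Either option avoids the exception; the trade-off is that a value mangled by the kernel is passed along silently instead of being reported.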

that said, maybe for the purpose of this module it would be overkill to make changes to this... i do agree an improved error message would have made my life much easier: i spent a few hours tracing that and it was fairly frustrating. :)

@duritong
Owner

Did you use puppet agent -t --trace? It should show you the stack trace of the exception, and thus would have pointed to the sysctl module parsing its output. Or did it not show that?

@anarcat
Author

anarcat commented Nov 28, 2020

Did you use puppet agent -t --trace which should show you the stacktrace of the exception, thus it would have pointed to the sysctl module and parsing its output. Or did it not show that?

I have not. I wasn't familiar with that option. It does show it:

$ puppet apply --modulepath="$PWD/test-modules" repro.pp --trace
Notice: Compiled catalog for angela.anarc.at in environment production in 0.05 seconds
Error: Failed to apply catalog: invalid byte sequence in UTF-8
/home/anarcat/src/puppet/test-modules/sysctl/lib/puppet/provider/sysctl_runtime/sysctl_runtime.rb:19:in `split'
/home/anarcat/src/puppet/test-modules/sysctl/lib/puppet/provider/sysctl_runtime/sysctl_runtime.rb:19:in `instances'
/home/anarcat/src/puppet/test-modules/sysctl/lib/puppet/provider/sysctl_runtime/sysctl_runtime.rb:33:in `prefetch'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:360:in `prefetch'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:252:in `prefetch_if_necessary'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:111:in `block in evaluate'
/usr/lib/ruby/vendor_ruby/puppet/graph/relationship_graph.rb:119:in `traverse'
/usr/lib/ruby/vendor_ruby/puppet/transaction.rb:173:in `evaluate'
/usr/lib/ruby/vendor_ruby/puppet/resource/catalog.rb:239:in `block (2 levels) in apply'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:519:in `block in thinmark'
/usr/lib/ruby/2.5.0/benchmark.rb:308:in `realtime'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:518:in `thinmark'
/usr/lib/ruby/vendor_ruby/puppet/resource/catalog.rb:238:in `block in apply'
/usr/lib/ruby/vendor_ruby/puppet/util/log.rb:156:in `with_destination'
/usr/lib/ruby/vendor_ruby/puppet/transaction/report.rb:146:in `as_logging_destination'
/usr/lib/ruby/vendor_ruby/puppet/resource/catalog.rb:237:in `apply'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:186:in `block (2 levels) in apply_catalog'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:519:in `block in thinmark'
/usr/lib/ruby/2.5.0/benchmark.rb:308:in `realtime'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:518:in `thinmark'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:185:in `block in apply_catalog'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:232:in `block in benchmark'
/usr/lib/ruby/2.5.0/benchmark.rb:308:in `realtime'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:231:in `benchmark'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:184:in `apply_catalog'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:369:in `run_internal'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:237:in `block in run'
/usr/lib/ruby/vendor_ruby/puppet/context.rb:65:in `override'
/usr/lib/ruby/vendor_ruby/puppet.rb:260:in `override'
/usr/lib/ruby/vendor_ruby/puppet/configurer.rb:211:in `run'
/usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:355:in `apply_catalog'
/usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:280:in `block (2 levels) in main'
/usr/lib/ruby/vendor_ruby/puppet/context.rb:65:in `override'
/usr/lib/ruby/vendor_ruby/puppet.rb:260:in `override'
/usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:280:in `block in main'
/usr/lib/ruby/vendor_ruby/puppet/context.rb:65:in `override'
/usr/lib/ruby/vendor_ruby/puppet.rb:260:in `override'
/usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:233:in `main'
/usr/lib/ruby/vendor_ruby/puppet/application/apply.rb:174:in `run_command'
/usr/lib/ruby/vendor_ruby/puppet/application.rb:375:in `block in run'
/usr/lib/ruby/vendor_ruby/puppet/util.rb:667:in `exit_on_fail'
/usr/lib/ruby/vendor_ruby/puppet/application.rb:375:in `run'
/usr/lib/ruby/vendor_ruby/puppet/util/command_line.rb:135:in `run'
/usr/lib/ruby/vendor_ruby/puppet/util/command_line.rb:73:in `execute'
/usr/bin/puppet:5:in `<main>'

I (rather naively i guess?) used --debug instead:

Info: Applying configuration version '1606603815'
Debug: /Stage[main]/Main/Sysctl::Value[kernel.unprivileged_userns_clone]/require: require to Class[Sysctl::Base]
Debug: /Stage[main]/Main/Sysctl::Value[kernel.unprivileged_userns_clone]/Sysctl[kernel.unprivileged_userns_clone]/before: before to Sysctl_runtime[kernel.unprivileged_userns_clone]
Debug: Prefetching parsed resources for sysctl
Debug: Prefetching sysctl_runtime resources for sysctl_runtime
Debug: Executing: '/sbin/sysctl -a'
Debug: Storing state
Debug: Pruned old state cache entries in 0.00 seconds
Debug: Stored state in 0.01 seconds
Error: Failed to apply catalog: invalid byte sequence in UTF-8
Debug: Applying settings catalog for sections reporting, metrics
Debug: Finishing transaction 47423916401140
Debug: Received report to process from angela.anarc.at
Debug: Evicting cache entry for environment 'production'
Debug: Deleted text domain :production: true
Debug: Caching environment 'production' (ttl = 0 sec)
Debug: Processing report from angela.anarc.at with processor Puppet::Reports::Store

which kind of does if you squint a little. It's how I decided to file the bug here and look at the sysctl output...
