Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Philip/random clean up fixes #1184

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

pamolloy
Copy link

@pamolloy pamolloy commented Jul 1, 2024

PR Description

A few random fixes related to porting libiio to a MCU.

PR Type

  • Bug fix (a change that fixes an issue)
  • New feature (a change that adds new functionality)
  • Breaking change (a change that affects other repos or cause CIs to fail)

PR Checklist

  • I have conducted a self-review of my own code changes
  • I have commented new code, particularly complex or unclear areas
  • I have checked that I did not introduce new warnings or errors (CI output)
  • I have checked that components that use libiio did not break
  • I have updated the documentation accordingly (GitHub Pages, READMEs, etc)

@pamolloy
Copy link
Author

pamolloy commented Jul 1, 2024

I'm debugging one more issue, which I'd like to resolve before removing the "Draft" label

Copy link
Author

@pamolloy pamolloy Jul 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @nunojsa, would you mind sanity checking this commit? There is a fair amount of generic code here and although I can test my MCU setup I am not totally confident how this will impact other use cases.

Copy link
Author

@pamolloy pamolloy Jul 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was also thinking of adding a --debug that will set the log level in params

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @nunojsa, would you mind sanity checking this commit? There is a fair amount of generic code here and although I can test my MCU setup I am not totally confident how this will impact other use cases.

I'm not super super familiar with libiio internals but looking at the code the change seems pretty harmless. The commit message is also very good in explaining the change (and it does make sense to me).

I was also thinking of adding a --debug that will set the log level in params

If the other option is to recompile libiio, then yes... that seems like an useful change. Maybe a follow up PR...

@pamolloy pamolloy force-pushed the philip/random-clean-up-fixes branch from 5a8480f to d70e3e0 Compare July 3, 2024 12:09
@pamolloy pamolloy marked this pull request as ready for review July 3, 2024 12:10
Copy link
Contributor

@nunojsa nunojsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

iiod/responder.c Show resolved Hide resolved
@@ -359,7 +359,7 @@ struct iio_context * handle_common_opts(char * name, int argc,
} else if (!arg && backend != IIO_LOCAL)
fprintf(stderr, "argument parsing error\n");
else if (backend == IIO_URI)
ctx = iio_create_context(NULL, arg);
ctx = iio_create_context(&params, arg);
else
ctx = iio_create_context(NULL, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likely not needed if there's no real need for a re-spin but adding the params in here should not arm (I guess this is local context and timeout is unlikely - if possible at all)...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that passing NULL is functionally the same in Libiio as passing a zero-initialized params.

@pamolloy
Copy link
Author

pamolloy commented Jul 9, 2024

Obviously if you want to keep testing CentOS 7 and whether those functions should be dropped is beyond my pay grade. I'm more than happy to drop them from the series, but figured I'd offer up the changes. 😄

Copy link
Contributor

@rgetz rgetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@rgetz rgetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still support centos 7, since some downstream users (MathWorks) still support Centos 7 - https://www.mathworks.com/support/requirements/matlab-linux.html

Copy link
Contributor

@rgetz rgetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the functionality for local - in that it ignores the timeout given on the command line, which will be a problem for slow devices.

Copy link
Contributor

@rgetz rgetz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the free man page:

If ptr is NULL, no operation is performed.
meaning it is perfectly safe and valid.

Rearranging things to avoid this is unnecessary.

@pamolloy
Copy link
Author

Hi @rgetz, thanks for taking a look! Unfortunately, it looks like the comments were made on the PR main thread rather than on a particular commit or diff, which can make it a bit harder to understand what you are referring to and I can't reply directly to the comments in a thread or mark them as resolved. If you have any follow-up review comments could you please comment on the code directly where possible?

this commit should not be necessary...

https://github.com/analogdevicesinc/libiio/blob/main/CMakeLists.txt#L195
https://github.com/analogdevicesinc/libiio/blob/main/CMakeLists.txt#L214
https://github.com/analogdevicesinc/libiio/blob/main/CMakeLists.txt#L220

Is this referring to adding -Wall to CI/azure/ci-ubuntu.sh and CI/build_win.ps1 or adding it to azure-pipelines.yml?

I'm not sure why the flags you link to do not get applied. This is already a bit of a rabbit hole since enabling warnings has created numerous warnings across platforms and architectures. Any help debugging it would be appreciated. Otherwise I'll try to look into it when I have time.

We still support centos 7, since some downstream users (MathWorks) still support Centos 7 - https://www.mathworks.com/support/requirements/matlab-linux.html

👍 Will drop it the next time I push

This changes the functionality for local - in that it ignores the timeout given on the command line, which will be a problem for slow devices.

This is also a bit hard to understand out of context of the code. The change I made is intended to do the opposite. The command-line timeout isn't currently applied until after the context is created, which created a problem for my setup with a MCU communicating at 9600 Baud.

From the free man page:

If ptr is NULL, no operation is performed.
meaning it is perfectly safe and valid.

Rearranging things to avoid this is unnecessary.

Yup, it is definitely valid C, but given that the bug was related to a bug calling free() incorrectly I wanted to make sure the logic was clear. Also, everything had to get rearranged anyway to fix the bug, this was a trivial change in comparison. For example, I believe the former is more clear than the later:

if (description) {
    if (ctx->description) {
        free(ctx->description);
    } else {
        ...
    }
}

As opposed to this alternative based on how it is currently implemented:

if (description) {
    if (ctx->description) {
        ...
    } else {
        ...
    }
    free(ctx->description);
}

Of course it also saves a function call, although that's clearly not significant here.

@rgetz
Copy link
Contributor

rgetz commented Jul 16, 2024

Is this referring to adding -Wall to CI/azure/ci-ubuntu.sh and CI/build_win.ps1 or adding it to azure-pipelines.yml?

both - We (Paul and I) decided we only apply these things to the library - not the examples or the utils - which are managed in their own cmake files.

@pamolloy
Copy link
Author

Is this referring to adding -Wall to CI/azure/ci-ubuntu.sh and CI/build_win.ps1 or adding it to azure-pipelines.yml?

both - We (Paul and I) decided we only apply these things to the library - not the examples or the utils - which are managed in their own cmake files.

Ok, but just so it is clear, doing so has meant that warnings have not been enabled in your builds for a significant amount of code. It is much harder to guarantee that -Wall is added to every target in the appropriate CMakeLists.txt than passing it explicitly at the command-line. Also those CI scripts already include -Werror just as a CMake option (i.e. -DCOMPILE_WARNING_AS_ERROR=ON), so adding -Wall would be consistent with that.

Reading through the top level CMakeLists.txt, -Wall is only applied to the iio library target. That does not include examples/, iiod/, tinyiiod/ and I believe also utils/. I'm not enough of a CMake expert to fix this CMake so I'll just drop my CI change, but keep all the fixes I found by enabling warnings.

@rgetz
Copy link
Contributor

rgetz commented Jul 16, 2024

Ok, but just so it is clear, doing so has meant that warnings have not been enabled in your builds for a significant amount of code.

If you mean extra warnings - sure. But only the code that doesn't really matter... if you want to fix everything, and then pass down the CFLAGS to the appropriate target_compile - and then fix everything - that's up to you - there are lots of failing things when you do that that the CI found.

As mentioned in the CMAKE - you can't turn on some flags on some architectures while Cmake is testing for specific functionality or capabilities. - it will cause false failures, and screw up the builds. see : https://github.com/analogdevicesinc/libiio/blob/main/CMakeLists.txt#L300

@pamolloy pamolloy force-pushed the philip/random-clean-up-fixes branch 2 times, most recently from cb9ab56 to aaa67e0 Compare July 17, 2024 15:46
Add check when attempting to concatenate description strings.

When description was NULL and ctx->description was a valid string then
ctx->description was freed and replaced by NULL. This caused iio_info to
not print the backend description string.

Additionally, do not free(NULL) when ctx->description is NULL.

Fixes analogdevicesinc#1121

Signed-off-by: Philip Molloy <[email protected]>
When a timeout is specified as a command-line argument then it is only
applied after the final context is created. If the creation of the
context requires transfering data then a default timeout is used. That
default timeout might be too short for slow transfers, preventing the
context from being created.

For example, a serial backend without any devices needs to send a
context XML string of roughly 1170 bytes, which when communicating at
9600 baud will take more than the default timeout of 1000 ms.

Signed-off-by: Philip Molloy <[email protected]>
Signed-off-by: Philip Molloy <[email protected]>
If iio_context_get_device() does not return a iio_device pointer then
the uninitialized ret variable was passed as a code to
iiod_io_send_response_code().

Signed-off-by: Philip Molloy <[email protected]>
- Remove unused variables. A warning was only generated in the macOS and
  ARM Linux builds and not in x86 Linux.
- The uninitialized variable should never be read since the example will
  exit() if an error occurs. A warning was only generated with the macOS
  builds, not the x86 or ARM Linux builds.

Signed-off-by: Philip Molloy <[email protected]>
This was only caught in the macOS and ARM builds and not in Linux

Signed-off-by: Philip Molloy <[email protected]>
- Remove unused variables
- The uninitialized variable should never be read since the example will
  exit() if an error occurs. A warning was only generated with the macOS
  builds, not the x86 or ARM Linux builds.

Signed-off-by: Philip Molloy <[email protected]>
@pamolloy pamolloy force-pushed the philip/random-clean-up-fixes branch from aaa67e0 to 14ef974 Compare July 18, 2024 10:18
@pamolloy
Copy link
Author

I dropped the CentOS7 change and adding -Wall to the CI scripts. I was able to verify that there were no warnings with -Wall enabled for x86 Linux, ARM Linux and macOS before I did.

I assume the current build errors are related to #1183.

@rgetz
Copy link
Contributor

rgetz commented Jul 18, 2024

and adding -Wall to the CI scripts.

I think this was removed from your patch set - but I would not do it in the CI scripts - do it in the CMAKE - so that all developers run with the same settings, and people don't notice problems only when submitting patches, they find it on their desktop first...

but this was also the reason that we didn't add warn as error outside the library - if a userspace util warns on a new compiler for some reason - you shouldn't stop people from using the utils.

-Robin

Copy link
Contributor

@pcercuei pcercuei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This commit doesn't have to be so complicated... even though the commit message helps to understand what it does.

The change could just be:

diff --git a/context.c b/context.c
index a202e80e..a836f5fd 100644
--- a/context.c
+++ b/context.c
@@ -641,8 +641,11 @@ iio_create_context_from_xml(const struct iio_context_params *params,
                }
        }
 
-       free(ctx->description);
-       ctx->description = new_description;
+       if (new_description) {
+               free(ctx->description);
+               ctx->description = new_description;
+       }
+
        ctx->params = *params;
 
        return ctx;

Note that free(NULL) is perfectly valid C so there's no reason to work around it.

Copy link
Contributor

@pcercuei pcercuei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from the two things I noted above, the rest of the changes look good! Thanks for the PR!

@@ -257,12 +257,12 @@ struct iio_context * handle_common_opts(char * name, int argc,
int *err_code)
{
struct iio_context *ctx = NULL;
struct iio_context_params params = { .timeout_ms = -1 };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 would mean infinite timeout (IIRC), keep it zero-initialized and Libiio will set it to the default timeout that corresponds to the backend used.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, -1 was used previously to prevent iio_context_set_timeout() from being called if a timeout wasn't specified at the command-line. Thanks for catching that copy/paste mistake!

@pamolloy
Copy link
Author

This commit doesn't have to be so complicated... even though the commit message helps to understand what it does.

The change could just be:

diff --git a/context.c b/context.c
index a202e80e..a836f5fd 100644
--- a/context.c
+++ b/context.c
@@ -641,8 +641,11 @@ iio_create_context_from_xml(const struct iio_context_params *params,
                }
        }
 
-       free(ctx->description);
-       ctx->description = new_description;
+       if (new_description) {
+               free(ctx->description);
+               ctx->description = new_description;
+       }
+
        ctx->params = *params;
 
        return ctx;

Note that free(NULL) is perfectly valid C so there's no reason to work around it.

Right, I replied to a similar point in this comment. I'm proposing:

if (description) {
    if (ctx->description) {
        new_description = malloc();
        iio_snprintf(new_description, ...);
        free(ctx->description);
    } else {
        ...
    }
}

As I understand it you're suggesting the following?

if (description) {
    if (ctx->description) {
        new_description = malloc();
        iio_snprintf(new_description, ...);
    } else {
        ...
    }
}

if (new_description) {
    free(ctx->description);
    ctx->description = new_description;
}

@pcercuei
Copy link
Contributor

Well, your comment did say "the bug was related to a bug calling free() incorrectly", but that's not the case, is it?

From what I can see the bug is that the code unconditionally replaces ctx->description with new_description even when the new pointer is NULL.

@pamolloy
Copy link
Author

pamolloy commented Jul 23, 2024

Incorrectly in the sense of the logic of the program not the semantics of C.

Maybe it would help if I rephrased Additionally, do not free(NULL) when ctx->description is NULL in my commit message to make it clear that that one line change is intended to make the code more concise and readable?

Or I could split up the commit somehow to make the diff easier to read?

@pcercuei
Copy link
Contributor

As I understand it you're suggesting the following?

No, I'm suggesting that you don't need anything else than the diff I pasted above to make it work.

To be clear, what bothers me with your commit isn't the change in itself (although avoiding the free(NULL) sounds like a bogus argument to me), more like the amount of lines updated for something that can be done in a much simpler manner.

Sure, the if (description && ctx->description) { ... } else if (description) { ... } looks a bit weird, but I'll always favor tiny commits over code cleanups. As it is right now, the actual fix is drowned into lines of unrelated changes, and it makes it harder to review as the change is not obvious, and the whole thing trashes the file's git history.

@pamolloy
Copy link
Author

For me the actual code is always more important than what the diff looks like. Indenting code isn't exactly a complicated maneuver, but it does create a larger diff. And in my opinion the original bug was the result of the code being oddly complicated in the first place. So making code more complicated by moving statements into additional blocks because of ugly diffs is inviting the same kind of logical errors that this bug fix is trying to fix.

I'm not trying to avoid free(NULL) in my one line change, but make the code readable. As I said above the following doesn't make sense to me:

if (description) {
    if (ctx->description) {
        ...
    } else {
        ...
    }
    free(ctx->description);
}

Why not move the free(ctx->description) into the conditional that checks that ctx->description is not NULL?

Anyway, I'm not sure how to get around such a fundamental disagreement. 😆 I'd be happy to break out the oneline change of free(ctx->description) into a new commit with a commit message that clarifies it is for readability.

@nunojsa
Copy link
Contributor

nunojsa commented Aug 7, 2024

@pamolloy I lost a bit track of this but I guess you could still make @pcercuei happy and do your cleanup...

Sure, the if (description && ctx->description) { ... } else if (description) { ... } looks a bit weird, but I'll always favor tiny commits over code cleanups. As it is right now, the actual fix is drowned into lines of unrelated changes, and it makes it harder to review as the change is not obvious, and the whole thing trashes the file's git history.

What I read of this is, make the fix as small as possible (diff speaking) so it's easy to understand what the fix is about. Then, there's nothing stopping you from adding a new patch only for code cleanup and improving readability. If the extra patch really does that, I don't see why Paul would disagree with it.

In the end, what Paul is trying to say (I think) is that git history matters, a lot. TBH, I completely agree with this... You can still refactor the code as you want but in an incremental way. And when fixes are involved, it's even more important. For instance think about a fix that you want to backport. It's way easier to backport a tiny little patch with oneline diff than a bigger patch where it will be way difficult to understand what the fix is about. And when backporting, you just care about the fix and not code refactoring :)

@pamolloy
Copy link
Author

pamolloy commented Aug 7, 2024

What I read of this is, make the fix as small as possible (diff speaking) so it's easy to understand what the fix is about. Then, there's nothing stopping you from adding a new patch only for code cleanup and improving readability. If the extra patch really does that, I don't see why Paul would disagree with it.

This seems totally reasonable to me!

@pamolloy
Copy link
Author

Was just watching So You Think You Know Git - FOSDEM 2024 and learned about --word-diff. For example, the ugly default diff on Github for context: Handle if description is NULL and ctx->description exists that is the result of changing the indention is much easier to understand with --word-diff.

Only part of the output of git show:
2024-10-26-174836_791x625_scrot

The entire output of git show --word-diff:
2024-10-26-174850_770x756_scrot

@pamolloy
Copy link
Author

And I just found out it is also possible to render something similar to --word-diff with Github by adding ?w=1 (e.g. 76f6cbd?w=1).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants