Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reworked paths logic, added tests, and fixed a CSV bug. #146

Merged
merged 2 commits into from
Sep 4, 2018

Conversation

data-pup
Copy link
Member

@data-pup data-pup commented Aug 28, 2018

This commit does a few different things, and largely finishes up the work started in #88, and #140.

First, analyze/analyses/paths.rs has been divided into separate files for the paths function itself, the emitting logic, and for the struct that represents an entry in the Paths struct. Further, the logic for traversing the ir::Items tree that was previously mixed into the emitting logic has been divided out into a separate function, create_entry.

To be sure that this is all working correctly, I added some various tests as well, especially focusing on the max_paths option. The expected output files were generated using the current master branch, to be sure that everything was behaving the same way before and after this change.

While doing this, I did notice an issue with the emit_csv function, being that the meta root is printed in the path field of a CSV record. Additionally, some items only display themselves in that field. In each of these cases, the field will instead be empty. Given that the meta root (iirc) is an internal abstraction, I think it is best left out of this command's output, especially if it is not shown in other formats.

Currently, summaries are not shown, but the code to do so is included in comments under a 'FIXME' note. Once this lands, following up by enabling these sections and updating the test cases should be very straightforward.

Minor sidenote: The output generated by emit_csv would probably be most helpful if we made a change so that duplicate rows are not displayed. For now, I've considered that out of scope for this PR, and can open up a separate issue for that.

@data-pup data-pup force-pushed the paths-refactoring-pt-ii-pr branch 2 times, most recently from 8f975ca to 6800a2d Compare August 28, 2018 21:50
@data-pup data-pup requested a review from fitzgen August 28, 2018 21:50
struct Paths {
items: Vec<ir::Id>,
opts: opt::Paths,
entries: Vec<PathsEntry>,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops, I need to clean up the previous items member here, I'll take care of that.

@data-pup
Copy link
Member Author

data-pup commented Aug 28, 2018

As mentioned in the comment above, I have some light cleanup work to do on this. This should be ready for review very shortly :)

Update: This is ready for review now 😸

@data-pup data-pup changed the title Reworked paths logic, added tests, and fixed a CSV bug. [WIP] Reworked paths logic, added tests, and fixed a CSV bug. Aug 28, 2018
@data-pup data-pup changed the title [WIP] Reworked paths logic, added tests, and fixed a CSV bug. Reworked paths logic, added tests, and fixed a CSV bug. Aug 28, 2018

/// This helper function is used to collect ir::Id values for the `items` member
/// of the `Paths` object, based on the given options.
pub fn get_items(items: &ir::Items, opts: &opt::Paths) -> Result<Vec<ir::Id>, traits::Error> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This function should be renamed get_starting_positions, and have its documentation comment updated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be imported anywhere, and can therefore be not-pub.

impl PathsEntry {
pub fn _count(&self) -> u32 {
1 + self.children.iter().map(|c| c._count()).sum::<u32>()
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is included for the sake of the summary logic that can be enabled once this lands :)

elem[0],59,0.13043862752033958,
std::panicking::LOCAL_STDERR::__getit::h7827294b3348067a,16,0.035373187124159884,func[37] -> std::panicking::LOCAL_STDERR::__getit::h7827294b3348067a
func[37],1,0.0022108241952599928,elem[0] -> func[37]
elem[0],59,0.13043862752033958,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This expectation file is a good example of duplicate rows in the CSV output. We can clean that up separately :)

@fitzgen
Copy link
Member

fitzgen commented Aug 29, 2018

Given that the meta root (iirc) is an internal abstraction, I think it is best left out of this command's output, especially if it is not shown in other formats.

Yep.

@fitzgen
Copy link
Member

fitzgen commented Aug 29, 2018

Minor sidenote: The output generated by emit_csv would probably be most helpful if we made a change so that duplicate rows are not displayed. For now, I've considered that out of scope for this PR, and can open up a separate issue for that.

SGTM

@fitzgen
Copy link
Member

fitzgen commented Aug 29, 2018

Update: This is ready for review now 😸

Isn't it totally unfair that there isn't a set of emojis of dog faces with different expressions?!?

@data-pup
Copy link
Member Author

Isn't it totally unfair that there isn't a set of emojis of dog faces with different expressions?!?

You have no idea how often I think about this

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r=me with nitpicks below addressed :)

Thanks @data-pup!


/// This helper function is used to collect ir::Id values for the `items` member
/// of the `Paths` object, based on the given options.
pub fn get_items(items: &ir::Items, opts: &opt::Paths) -> Result<Vec<ir::Id>, traits::Error> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be imported anywhere, and can therefore be not-pub.


/// Process a given path entry, and return an iterator of table rows,
/// representing its related call paths, according to the given options.
fn process_entry<'a>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For fns that aren't closures, I think the nesting-as-private-scope isn't carrying as much water as it does for the closures.

The fns will be not-pub anyways, so no worry about abstraction leakage.

But having them nested does make it harder to read the fn that they are nested within.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me! Especially after #105, I agree with you that the nested functions can be adjusted. I'll patch that now.

Box::new(row_iter.chain(children_iter))
} else if depth == opts.max_depth() {
Box::new(row_iter)
// FIXUP: Create a summary row, and chain it to the row iterator.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, it is easier to read and review PRs if these are left completely out instead of commented out. In general, my opinion is to either remove completely (and leave it for a follow up), or leave it in and actually implement+test it; no middle-of-the-road, halfway solutions that add baggage if we don't circle back and actually finish the implementation.

Fine for this PR though, just talking about the future.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure! Pardon, I had felt like it might make some of my rationale easier to follow, but I can leave these out in the future. Thanks for letting me know 💟

use std::iter;

use twiggy_ir::Items;
use twiggy_opt::Paths;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Items and Paths are the only things used within these helper modules, and because they are compact, I elected to import these directly rather than using the twiggy_ir as ir pattern that is used elsewhere. That pattern helps with readability across a larger file that might use multiple items from another module, but here I think it's fine to be explicit about what we're using :)

pub size: Option<u32>,
pub size_percent: Option<f64>,
pub name: String,
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe pub(super) is what I should be using here. If that looks wrong though, lmk!

@@ -128,6 +130,8 @@ fn create_entry(
.collect()
};

// Temporarily add the current item to the set of discovered nodes, and
// create an entry for each child. Collect these into a `children` vector.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some comments like this in the create_entry function, just to clarify what is happening here. Because this is where all the recursion/traversal is happening, figured it was worth highlighting with a comment. :)

use twiggy_traits as traits;

use super::paths_entry::PathsEntry;
use super::Paths;
use analyses::paths::Paths;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Decided to be explicit about the path for this, so that it looks similar to the use statements within the helper module below. Totally subjective, happy to change this back if the super:: syntax is preferable.

@data-pup
Copy link
Member Author

data-pup commented Aug 30, 2018

@fitzgen, I made some edits to fix the notes in your review. Most of the changes were straightforward, but the point about nested functions introduced some changes that I'd want to run by you before merging.

The gist is, I think it would be nice to keep all of the helper functions within the same file, rather than breaking them out into even more submodules. Conversely, it would be nice if the paths_emit file was not a impl traits::Emit for .. block followed by a host of various helper functions, given that there are currently 3 formats to consider.

To fix this, I wrapped the helper functions for the emit_text, emit_json, and emit_csv methods in individual mod { } blocks. Each exposes a process_entry function, and a helper struct if required.

I think this helps keep the emit_* methods readable, and clearly delineates each format's helper functions. I figured this is probably worth getting consensus on first however 😸

@data-pup
Copy link
Member Author

Regarding the CI failure, I think this is something upstream? I saw someone mention in the #rust-wasm IRC channel that they were seeing the same error occur while running wasm-pack, so we might want to restart the build once that is fixed :)

@fitzgen fitzgen merged commit a88495e into rustwasm:master Sep 4, 2018
@data-pup data-pup deleted the paths-refactoring-pt-ii-pr branch February 28, 2019 19:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants