Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/image benchmarking #880

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

qh681248
Copy link
Contributor

PR Type

  • Feature

Description

This PR extends the original benchmark framework and adds a visual benchmark. The benchmarking process follows these steps:

  1. Load an input image and downsample it using a specified factor.
  2. Convert to grayscale and extract non-zero pixel locations and values.
  3. Generate coresets with varying algorithms.
  4. Plot the original image alongside coresets generated by each algorithm.
  5. Save the resulting plot as an output file.

How Has This Been Tested?

Existing tests pass as expected.

New tests introduced with this change verify that...

Does this PR introduce a breaking change?

Checklist before requesting a review

  • I have made sure that my PR is not a duplicate.
  • My code follows the style guidelines of this project.
  • I have ensured my code is easy to understand, including docstrings and comments where necessary.
  • I have performed a self-review of my code.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • New and existing unit tests pass locally with my changes.
  • Any dependent changes have been merged and published in downstream modules.
  • I have updated CHANGELOG.md, if appropriate.

Copy link
Contributor

Performance review

Commit 22efbbd - Merge 89244b6 into 5039cff

Statistically significant changes

  • basic_rpc:
    • OLD: compilation 0.7774 units ± 0.08315 units; execution 0.1256 units ± 0.000803 units
    • NEW: compilation 0.8478 units ± 0.08107 units; execution 0.1472 units ± 0.009568 units
    • Significant increase in execution time (17.23%, p=5.124e-05)

Normalisation values for new data:
Compilation: 1 unit = 406.66 ms
Execution: 1 unit = 888.55 ms

Copy link
Contributor

Performance review

Commit 5302c89 - Merge 540fb7e into 87dd4a7

No significant changes to performance.

@bk958178 bk958178 self-requested a review December 3, 2024 15:54
Copy link
Contributor

@bk958178 bk958178 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @qh681248. Some minor changes requested

@@ -556,7 +556,7 @@ contextmanager-decorators=contextlib.contextmanager
# system, and so shouldn't trigger E1101 when accessed. Python regular
# expressions are accepted.
generated-members=

cv2.*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you tell me why this has been added? Are we accessing non-existent members of cv2?


The benchmarking process follows these steps:
1. Load an input image and downsample it using a specified factor.
2. Convert to grayscale and extract non-zero pixel locations and values.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is meant by 'non-zero pixel locations and values' ? Does this mean pixels that have value above zero? i.e., are not completely black?

The benchmarking process follows these steps:
1. Load an input image and downsample it using a specified factor.
2. Convert to grayscale and extract non-zero pixel locations and values.
3. Generate coresets with varying algorithms.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change 'varying' to 'various' or 'different'

Benchmark performance of different coreset algorithms on pixel data from an image.

The benchmarking process follows these steps:
1. Load an input image and downsample it using a specified factor.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you briefly describe why we downsample the image?

)

# Set up the original data object and coreset parameters
data = Data(pre_coreset_data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to jax this? data = Data(jnp.asarray(pre_coreset_data))


# Set up the original data object and coreset parameters
data = Data(pre_coreset_data)
coreset_size = 8_000 // (downsampling_factor**2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this the coreset size? Can you add detail in the docstring above to explain


# Initialize each coreset solver
key = random.PRNGKey(0)
solvers = initialise_solvers(Data(jnp.array(pre_coreset_data)), key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we doing Data(jnp.array(pre_coreset_data)) again here? Surely this can just be:

solvers = initialise_solvers(data, key).

NB: I'm not sure which is correct for data - either Data(jnp.array(pre_coreset_data)) or Data(jnp.asarray(pre_coreset_data))

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants