Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use async for filesystem hits #246

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

RealOrangeOne
Copy link
Contributor

Except Directory, as there's no async-compatible glob implementation

Description

Use tokio::fs when interacting with the filesystem.

Motivation and Context

This frees the event loop up for doing other things when waiting for the filesystem.

How Has This Been Tested?

The compiler doesn't complain, and very little else has changed.

I'd try the unit tests, but those fail on master, let alone this branch.

Changelog Entry

Types of Changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (no code change)
  • Refactor (refactoring production code)
  • Other

Checklist:

  • My code follows the code style of this project.
  • I have updated the documentation accordingly.
  • I have formatted the code with rustfmt.
  • I checked the lints with clippy.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

Except `Directory`, as there's no async-compatible `glob` implementation
Copy link
Owner

@orhun orhun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, thanks for the PR. I'm curious if there are any visible performance improvements after this. Did you run any benchmarks by any chance?

src/file.rs Show resolved Hide resolved
src/paste.rs Show resolved Hide resolved
src/server.rs Outdated Show resolved Hide resolved
src/util.rs Show resolved Hide resolved
@RealOrangeOne
Copy link
Contributor Author

I've not done any benchmarks from this, but I'd expect most of the benefit to come in throughput as opposed to raw latency. Adding more yields adds more overhead to a given request, but allows the server to handle other requests at the same time. I doubt rustypaste is used in any particularly high-throughput deployments, but can't hurt to be that little more efficient. It also means more concurrent requests can be handled by fewer workers.

@orhun
Copy link
Owner

orhun commented Feb 25, 2024

Sounds good to me. Clippy is failing, can you check?

I think this PR will be ready to merge when the CI is green!

src/paste.rs Show resolved Hide resolved
src/paste.rs Show resolved Hide resolved
@tessus
Copy link
Collaborator

tessus commented Feb 28, 2024

I have noticed that the tests fail - in-code and fixtures.

@tessus
Copy link
Collaborator

tessus commented Mar 5, 2024

what's up with this PR? The tests fail and it should be merged so that I can work on the fix for files with the same name.

Can we get this moving again? First the tests have to pass.

This required a little refactor, but the code is now much cleaner, and correctly handles updating the config
@RealOrangeOne
Copy link
Contributor Author

Even the master test suite fails locally for me, so I don't trust my local environment (or there's something else weird going on). I think I found the root cause of the main outstanding failures, which I've just fixed. Hopefully, the tests now pass properly

@tessus
Copy link
Collaborator

tessus commented Mar 6, 2024

Even the master test suite fails locally for me,

You are correct. I've opened an issue #253 to track this issue. However, there are only 2 tests failing in git master. But I recall there were quite a lot failing here.

@RealOrangeOne
Copy link
Contributor Author

There are only 3 failing here, all looking like they may be related. I'll rebase and see if that resolves anything.

@tessus
Copy link
Collaborator

tessus commented Mar 6, 2024

There is still one test case failing. Not sure why that is when it's ok in git master: https://github.com/orhun/rustypaste/actions/runs/8167143469/job/22327067458#step:5:517

P.S.: I can try running your branch locally and check out the tests. But I am way past my bed time. I'll have to do that tomorrow.

@RealOrangeOne
Copy link
Contributor Author

Possibly a race in the IO, or something else behaving weirdly. I'll try and look this evening.

@tessus
Copy link
Collaborator

tessus commented Mar 6, 2024

I ran the tests in your branch several times locally and there was never an error. All tests passed. @orhun any chance you could try the tests in this branch locally?

I also looked at the test and it is a bit confusing. It converts the body of the response (which is the link that is returned from rustypaste) into bytes. Why the conversion? Anyway, the same link is supposed to be returned for a file that has the same contents.
I ran a rustypaste test instance (master branch) and uploaded the same file twice (with duplicate_files = false) and I got 2 different links returned. That should not have happened. But when I run the tests all is fine. How is this possible? This makes no sense.

Maybe orhun has a better crasp on this, since he wrote the code. I'm currently majorly confused.

@RealOrangeOne
Copy link
Contributor Author

I can't reproduce the issues with cargo test (running that case specifically), but cargo tarpaulin does show an issue. I wonder if it's something to do with how tarpaulin runs, or something around concurrency. I'm not convinced it's related to this PR.

@tessus
Copy link
Collaborator

tessus commented Mar 6, 2024

I looked at the code and I am even more confused now than I was before.

It seems that duplicate_files only works iff you do not send an expiry date and there is no default_expiry set in the config. I am not sure whether this was intended or maybe I am msreading the code.
It also still doesn't explain why the tests fail on gh, but not locally.

@orhun
Copy link
Owner

orhun commented Mar 6, 2024

For me all the tests pass locally. I'm not sure what's wrong with the CI :/

I also looked at the test and it is a bit confusing. It converts the body of the response (which is the link that is returned from rustypaste) into bytes. Why the conversion?

I'm guessing I had hard time converting body to a string (due to Pin and stuff) so gave up and just used bytes. But yeah, we should probably not do that.

It seems that duplicate_files only works if you do not send an expiry date and there is no default_expiry set in the config.

I think this was intended - since the other file has an expiry date it is "not the same" as the other file. It should be nice to document this behavior though.

@tessus
Copy link
Collaborator

tessus commented Mar 6, 2024

I think this was intended - since the other file has an expiry date it is "not the same" as the other file. It should be nice to document this behavior though.

I don't really understand how to actually create a file that does not expire. Is it only possible, if no default is set? what about sending expiry as -1 or 0?
Maybe something to add at one point? I never tried to create a file w/o expiration. Maybe it already works that way.

Even though the expiry date is not the same, the content of the file is the same. This is what tripped me up in my understanding.

But yeah, we should probably not do that.

other tests use string comparison, so it should be an easy fix. Still, it probably won't change the fact that it weirdly fails on gh.

@tessus
Copy link
Collaborator

tessus commented Mar 6, 2024

other tests use string comparison, so it should be an easy fix. Still, it probably won't change the fact that it weirdly fails on gh.

ok, I see what the problem was and still is. I guess there's no way around it then. Sometimes I think that rust makes certain things overly complicated and not very intuitive.

@tessus
Copy link
Collaborator

tessus commented Mar 6, 2024

It almost seems that on gh either one of the following is ignored:

  • config.paste.duplicate_files = Some(false);
  • that the default for default_expiry is not set

P.S.: The fixture also passes, so this is really weird.

@tessus
Copy link
Collaborator

tessus commented Mar 8, 2024

I ran a few tests here #259 and it seems that tarpaulin has a side effect and makes that particular test fail.

I've opened an issue with tarpaulin: xd009642/tarpaulin#1488

@RealOrangeOne
Copy link
Contributor Author

So how can we progress this PR? It seems silly to hold it off because the coverage tool has a bug. Should we remove the coverage checker, merge this in, then create a separate PR to add it back?

@tessus
Copy link
Collaborator

tessus commented Mar 15, 2024

@RealOrangeOne can you merge the changes in master back into this PR, change the one call to use .await, so that we can run the tests.
Or you can give me push access to your branch and I do it.

P.S.: I think your force-push created a conflict.

@tessus
Copy link
Collaborator

tessus commented Mar 15, 2024

The tests in paste.rs are missing the .await for the store_file call. What is strange though is that this was not caught by the lints.

P.S.: Why on earth would cargo clippy --tests --verbose -- -D warnings not catch this? How could the tests have been even compiled?

@RealOrangeOne
Copy link
Contributor Author

Which ones? I've looked and all of them seem to call .await

@tessus
Copy link
Collaborator

tessus commented Mar 15, 2024

Wait, I might have checked the wrong branch. Give me a sec. One test fails though.

@tessus
Copy link
Collaborator

tessus commented Mar 15, 2024

ok, so I ran your branch locally and the tests pass. But on the ci the following happens:

thread 'paste::tests::test_paste_data' panicked at src/paste.rs:346:9:

@tessus
Copy link
Collaborator

tessus commented Mar 15, 2024

I will run a few tests later this evening.

@tessus
Copy link
Collaborator

tessus commented Mar 19, 2024

I ran a bunch of different modified tests and I am at a loss. I am stuck. No idea what is going on.

@tessus
Copy link
Collaborator

tessus commented Mar 21, 2024

@orhun have you been able to figure out what the issue could be? it seems this is blocking a new release for a while now and the other PRs haven't been merged yet either.

@orhun
Copy link
Owner

orhun commented Mar 21, 2024

I managed to reproduce the test failure locally!

Just use non-async calls in tests.

Details
diff --git a/src/paste.rs b/src/paste.rs
index 7ae5bb9..4a088a0 100644
--- a/src/paste.rs
+++ b/src/paste.rs
@@ -287,6 +287,7 @@ mod tests {
     use awc::ClientBuilder;
     use byte_unit::Byte;
     use std::env;
+    use std::fs;
     use std::str::FromStr;
     use std::time::Duration;
 
@@ -307,14 +308,14 @@ mod tests {
             type_: PasteType::File,
         };
         let file_name = paste.store_file("test.txt", None, None, &config).await?;
-        assert_eq!("ABC", fs::read_to_string(&file_name).await?);
+        assert_eq!("ABC", fs::read_to_string(&file_name)?);
         assert_eq!(
             Some("txt"),
             PathBuf::from(&file_name)
                 .extension()
                 .and_then(|v| v.to_str())
         );
-        fs::remove_file(file_name).await?;
+        fs::remove_file(file_name)?;
 
         Ok(())
     }
@@ -335,10 +336,10 @@ mod tests {
             type_: PasteType::File,
         };
         let file_name = paste.store_file("foo.tar.gz", None, None, &config).await?;
-        assert_eq!("tessus", fs::read_to_string(&file_name).await?);
+        assert_eq!("tessus", fs::read_to_string(&file_name)?);
         assert!(file_name.ends_with(".tar.gz"));
         assert!(file_name.starts_with("foo."));
-        fs::remove_file(file_name).await?;
+        fs::remove_file(file_name)?;
 
         config.paste.random_url = Some(RandomURLConfig {
             length: Some(4),
@@ -351,10 +352,10 @@ mod tests {
             type_: PasteType::File,
         };
         let file_name = paste.store_file(".foo.tar.gz", None, None, &config).await?;
-        assert_eq!("tessus", fs::read_to_string(&file_name).await?);
+        assert_eq!("tessus", fs::read_to_string(&file_name)?);
         assert!(file_name.ends_with(".tar.gz"));
         assert!(file_name.starts_with(".foo."));
-        fs::remove_file(file_name).await?;
+        fs::remove_file(file_name)?;
 
         config.paste.random_url = Some(RandomURLConfig {
             length: Some(4),
@@ -367,9 +368,9 @@ mod tests {
             type_: PasteType::File,
         };
         let file_name = paste.store_file("foo.tar.gz", None, None, &config).await?;
-        assert_eq!("tessus", fs::read_to_string(&file_name).await?);
+        assert_eq!("tessus", fs::read_to_string(&file_name)?);
         assert!(file_name.ends_with(".tar.gz"));
-        fs::remove_file(file_name).await?;
+        fs::remove_file(file_name)?;
 
         Ok(())
     }
@@ -386,9 +387,9 @@ mod tests {
             type_: PasteType::File,
         };
         let file_name = paste.store_file(".foo", None, None, &config).await?;
-        assert_eq!("xyz", fs::read_to_string(&file_name).await?);
+        assert_eq!("xyz", fs::read_to_string(&file_name)?);
         assert_eq!(".foo.txt", file_name);
-        fs::remove_file(file_name).await?;
+        fs::remove_file(file_name)?;
 
         config.paste.default_extension = String::from("bin");
         config.paste.random_url = Some(RandomURLConfig {
@@ -401,14 +402,14 @@ mod tests {
             type_: PasteType::File,
         };
         let file_name = paste.store_file("random", None, None, &config).await?;
-        assert_eq!("xyz", fs::read_to_string(&file_name).await?);
+        assert_eq!("xyz", fs::read_to_string(&file_name)?);
         assert_eq!(
             Some("bin"),
             PathBuf::from(&file_name)
                 .extension()
                 .and_then(|v| v.to_str())
         );
-        fs::remove_file(file_name).await?;
+        fs::remove_file(file_name)?;
 
         Ok(())
     }
@@ -436,9 +437,9 @@ mod tests {
                 &config,
             )
             .await?;
-        assert_eq!("tessus", fs::read_to_string(&file_name).await?);
+        assert_eq!("tessus", fs::read_to_string(&file_name)?);
         assert_eq!("fn_from_header.txt", file_name);
-        fs::remove_file(file_name).await?;
+        fs::remove_file(file_name)?;
 
         config.paste.random_url = Some(RandomURLConfig {
             length: Some(4),
@@ -458,9 +459,9 @@ mod tests {
                 &config,
             )
             .await?;
-        assert_eq!("tessus", fs::read_to_string(&file_name).await?);
+        assert_eq!("tessus", fs::read_to_string(&file_name)?);
         assert_eq!("fn_from_header", file_name);
-        fs::remove_file(file_name).await?;
+        fs::remove_file(file_name)?;
 
         Ok(())
     }
@@ -476,8 +477,7 @@ mod tests {
             PasteType::Oneshot
                 .get_path(&config.server.upload_path)
                 .expect("Bad upload path"),
-        )
-        .await?;
+        )?;
 
         let paste = Paste {
             data: vec![116, 101, 115, 116],
@@ -491,8 +491,8 @@ mod tests {
             .get_path(&config.server.upload_path)
             .expect("Bad upload path")
             .join(format!("{file_name}.{expiry_date}"));
-        assert_eq!("test", fs::read_to_string(&file_path).await?);
-        fs::remove_file(file_path).await?;
+        assert_eq!("test", fs::read_to_string(&file_path)?);
+        fs::remove_file(file_path)?;
 
         Ok(())
     }
@@ -511,8 +511,7 @@ mod tests {
             PasteType::Url
                 .get_path(&config.server.upload_path)
                 .expect("Bad upload path"),
-        )
-        .await?;
+        )?;
 
         let url = String::from("https://orhun.dev/");
         let paste = Paste {
@@ -524,8 +523,8 @@ mod tests {
             .get_path(&config.server.upload_path)
             .expect("Bad upload path")
             .join(&file_name);
-        assert_eq!(url, fs::read_to_string(&file_path).await?);
-        fs::remove_file(file_path).await?;
+        assert_eq!(url, fs::read_to_string(&file_path)?);
+        fs::remove_file(file_path)?;
 
         let url = String::from("testurl.com");
         let paste = Paste {
@@ -548,8 +547,7 @@ mod tests {
             PasteType::Url
                 .get_path(&config.server.upload_path)
                 .expect("Bad upload path"),
-        )
-        .await?;
+        )?;
 
         let url = String::from("https://upload.wikimedia.org/wikipedia/en/a/a9/Example.jpg");
         let mut paste = Paste {
@@ -570,15 +568,14 @@ mod tests {
             "70ff72a2f7651b5fae3aa9834e03d2a2233c52036610562f7fa04e089e8198ed",
             util::sha256_digest(&*paste.data)?
         );
-        fs::remove_file(file_path).await?;
+        fs::remove_file(file_path)?;
 
         for paste_type in &[PasteType::Url, PasteType::Oneshot] {
             fs::remove_dir(
                 paste_type
                     .get_path(&config.server.upload_path)
                     .expect("Bad upload path"),
-            )
-            .await?;
+            )?;
         }
 
         Ok(())

Results in a bunch of failures (same as CI) for me. I think it is a spurious async issue.

it seems this is blocking a new release for a while now and the other PRs haven't been merged yet either.

Do you mean the dependabot PRs? Is there any other PR planned for the release on your side?

@RealOrangeOne
Copy link
Contributor Author

I think it is a spurious async issue.

I suspect you're right. CI runs tests in 1 thread, but it's possible it still runs multiple cases in the same event loop?

The only way I can think for trying to work through these is to have each test case run in its own temporary directory instead, which would prevent them from interracting - which I suspect is what's happening here, but obviously can't prove it.

What should the next step be? It'd be great to get this in!

@orhun
Copy link
Owner

orhun commented Mar 21, 2024

I think we can try what you have suggested and try to have a more isolated test environment. I found this which is pretty interesting. Might not be needed though.

@tessus thoughts?

@tessus
Copy link
Collaborator

tessus commented Mar 21, 2024

Do you mean the dependabot PRs? Is there any other PR planned for the release on your side?

Yep, I meant those. And nope, nothing planned from my side. I also noticed that the CI is throwing warnings, but this has nothing to do with a release.

I found this which is pretty interesting. Might not be needed though.

This is actually pretty awesome. This also means, we could run the tests with multiple threads then, correct? Isolating tests is always a good idea. IMO this should be the default for tests anyway with a keyword to allow tests to not be isolated iff needed.
Not sure why that wasn't implemented in Rust core.

@tessus thoughts?

The only issue I have with it is that I don't understand why the tests fail. In #263 I've tried nextest but also changing the test, which turned out to result in another problem (stalling/timeout, which makes no sense either).
Maybe I am missing something, because I am not a Rust expert, but from what I can ascertain from a logical point of view is that they shouldn't fail.

@orhun
Copy link
Owner

orhun commented Mar 22, 2024

Yeah, I still think that making some things async broke some internal concurrency guarantees and now the tests are failing spuriously because of that. We can proceed with making things isolated and see if that works.

This prevents tests from conflicting with eachother if they don't clean up after themselves correctly.
@RealOrangeOne
Copy link
Contributor Author

sealed_test doesn't seem to mix well with actix_rt::test. I've implemented it manually, and the tests seem to pass more reliably locally. It's still not perfect, and still requires --test-threads 1 for some reason, but I think digging into that further is probably out of scope for this PR.

@orhun
Copy link
Owner

orhun commented Mar 23, 2024

Yeah, as long as the CI passes I think we are set. It is still failing for some freaking reason though 💀

@tessus
Copy link
Collaborator

tessus commented Mar 24, 2024

So what is the plan here? The issue is if we merge this all future tests will fail.

It's possible that there are some issues with test, actix, and async. I have no idea. But something is off. These tests should not fail. Maybe there's a bug in actix or cargo test. Either way, can we not hold off the release any longer?

auto-merge can be setup, the PRs merged (except this one), and a new release be cut.
Unfortunately I can't do any of this myself.

@orhun
Copy link
Owner

orhun commented Mar 24, 2024

Either way, can we not hold off the release any longer?
auto-merge can be setup, the PRs merged (except this one), and a new release be cut.

Yeah, I think I'm going to move forward with the release this week, without this PR :/

@orhun
Copy link
Owner

orhun commented Mar 24, 2024

Someone suggested assay for isolated testing FWIW

@tessus
Copy link
Collaborator

tessus commented Mar 25, 2024

This project seems dead. I understand that if there are no dependencies, or a project does not have anything to add because it is feature complete, or there are no bugs, it is possible that there are no commits. But a PR has been ready to be merged for 1.5 years and the maintainer just ignored it after the OP addressed all comments.

This is not something you want to use. Well, at least I wouldn't.

@tessus tessus mentioned this pull request Apr 3, 2024
@tessus
Copy link
Collaborator

tessus commented Apr 3, 2024

I rebased to master in this test: #272 and it seems that the tests pass now in CI. They also pass locally. Maybe @RealOrangeOne you can rebase this PR as well.

@tessus
Copy link
Collaborator

tessus commented Apr 3, 2024

@orhun something is seriously wrong here. the PR now has the exact same code as in #272

@tessus
Copy link
Collaborator

tessus commented Apr 4, 2024

@RealOrangeOne I think we have to try something else.

Can you create a new PR with the current changes?

  1. save the changed files
  2. create a new branch off master
  3. move the saved files back

Not using any git commands for step 1 and 3 is on purpose.

@orhun
Copy link
Owner

orhun commented Apr 4, 2024

Maybe there is a cache issue? We can try clearing the cache first.

@orhun
Copy link
Owner

orhun commented Apr 4, 2024

Ok cleared the cache, but that didn't really do anything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants