Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement parameterize as multi-shot coroutine #34

Open
BenWoodworth opened this issue Sep 1, 2024 · 6 comments · May be fixed by #37
Open

Implement parameterize as multi-shot coroutine #34

BenWoodworth opened this issue Sep 1, 2024 · 6 comments · May be fixed by #37
Labels
enhancement New feature or request

Comments

@BenWoodworth
Copy link
Owner

This is not a planned feature, at least not yet, but this issue tracks how parameterize could work if it were implemented as a suspend block that resumes from each parameter instead of re-running the whole block

It could be implemented as a suspend point when declaring a parameter:

suspend operator fun Parameter.provideDelegate()

A parameter will be constructed only once (instead of every iteration). After that, there will be a suspend point when provideDelegate is called on it, and having the listed/computed arguments locked in. From there, parameterize's suspend block will resume from there once for each of the arguments as a multi-shot coroutine.

Benefits

  • performance: no need to re-run start of block
    • changes how code before a parameter behaves, since now only
  • performance: no need to validate that the block was deterministic
  • no need to throw ParameterizeContinue up the call stack for internal control flow
  • no requirement for the block to be deterministic
  • parameterOf is no longer a performance concern
    • lazy parameterize is no longer needed

Problems/Blockers

  • Marking provideDelegate as suspend is not supported as of Kotlin 2.0
    • Though that was a restriction with the K1 compiler architecture <----------------------------- LINK
    • And has been acknowledged as possible to support in K2 <-------------------------------------- LINK
  • Use inside other suspend blocks may (or may not) be an issue
    • E.g. parameterizing Kotest suites with
      context("group") {
          parameterize {
              // ...
              test("test") { // suspends with `context`'s coroutine scope, through parameterize's
              }
          }
      }
      

Implementation

Multi-shot coroutines should be possible as long as a coroutine continuation can be cloned, that way it can be resumed more than once.

For each platform:

  • On jvm, Roman Elizarov has a Gist of how this can be done here
  • For js, creating a shallow copy is also a common enough concept
  • And native should be similarly straightforward
@BenWoodworth BenWoodworth added the enhancement New feature or request label Sep 1, 2024
@BenWoodworth
Copy link
Owner Author

There may be some fundamental issues with this as well, since it'd be breaking assumptions that Kotlin makes about code.

For example:

suspend fun multiResume() {}

suspend fun x() {
    val a: Int

    multiResume()

    a = 4
    a = 5 // ERROR: 'val' cannot be reassigned
}

Here Kotlin is fine with the first assignment since it assumes that the continuation at multiResume() will be resumed at most once. But, of course, with a multi-shot coroutine like what this issue's proposing, that's not the case.

Testing on JVM I'm finding that the code does run without any issue, replacing the value of 4 with 5. There are some subtle inconsistencies that can arise though:

fun test() {
    val a: Int

    a = 4
    val printCaptured = { println(a) }

    @Suppress("VAL_REASSIGNMENT")
    a = 5

    printCaptured()
    println(a)
}

In this example, a is captured in the lambda, and since it's a val and won't change, the constant value 4 is captured, instead of a reference that will change to 5 when a is updated. So in this code, "4 5" is printed. But, if a is changed to a var, a reference to a's value is captured, and the lambda will always print the actual current value of a, meaning that var a results in "5 5" being printed.

While this isn't necessarily a problem, and might even be desirable, it might hint towards a bigger issue when abusing coroutines like this.

@kyay10 kyay10 mentioned this issue Oct 1, 2024
@kyay10
Copy link

kyay10 commented Oct 1, 2024

I got nerd-sniped by this, and have now opened #37
I'll write a neater explanation for my ideas soon, but for now:

I have a project in the works that implements multishot coroutines in Kotlin.

What Kotlin provides is usually termed "single-shot delimited continuations" in academia. It's a well-known academia result that multishot delimited continuations can do "monadic reflection", which is a fancy term for being able to write monadic code in direct style (which is way better than e.g. for-comprehensions or do-notation).
Thinking about this in terms of monads will be unnecessarily confusing though, so the better way to think about it (and the way that works really nicely with coroutines) is the idea of effects.
Basically, your parameterize DSL can be modelled as a direct-style (i.e no nesting) effect that looks like this:

interface ParameterizeScope {
  suspend fun <A> declareParameter(sequence: Sequence<A>): A
}

No need for provideDelegate or anything (and thus we don't need to wait for language support for suspend delegated properties)
Now, why did I mention these theoretical CS concepts? Well, it's because it has a very easy answer for how to handle state.
This is usually termed "Delimited Dynamic Binding".
Taking your example and changing it a bit:

interface MultiResumeable {
  suspend fun multiResume(): Unit
}
suspend fun <R> dryRun(block: suspend MultiResumable.() -> R): R {
  // some magic here that dry-runs `block` first, then runs it again and returns that final result
}
suspend fun test(): Int {
    var a: Int = 0

    multiResume()

    return ++a
}

dryRun { test() } // should be 1

Obviously, if a was a global variable, we should expect that the function returns 2 instead.
We want to think of a as stored on the stack frame of the test function. When the multiResume function calls us 2 times, we want the value of a each time to be 0. Our dryRun function is clearly a "delimiter" for our block. So, the trick is we want a to be stored on the stack frames so that the correct value for a (0) is what we use each time.
Turns out, coroutines work like that by default, and so this code as given will just work.
However, issues turn up when one uses a capturing closure:

suspend fun test(): Int {
    var a: Int = 0
    val printer: () -> Unit = { println(a) }
    multiResume()

    return ++a
}

dryRun { test() } // returns 2 :O

This reveals the implementation detail that a now becomes heap-allocated.
Notice though that the type of printer doesn't mention suspend at all, so, as-is, it can't access our metaphorical stack
There's two ways around this:
A simpler one (that I haven't yet implemented) would be to store and restore the value of a into its IntRef upon resumption. This has the issue of obviously not being able to run in parallel because 2 different coroutines are accessing conflicting state.

The other way is to somehow still store a on the stack, but allow a way to access it from any function a calls:

// This is almost like an initial stack-frame
private data class TestData(var a: Int = 0)
suspend fun test(): Int = newReader(TestData(), TestData::copy) { // this: Reader<TestData>
    // val printer: () -> Unit = { println(a) } //Errors
    val printer: suspend () -> Unit = { println(ask().a) } 
    multiResume()
    return ++(ask().a)
}

dryRun { test() } // returns 1

now every access to a is through this ask function on Reader, which goes up the stack, finds the stack frame that has that apt Reader mark, and gets the TestData object from it.
Usage within other suspend functions should be easily possible. However, "crossing" a boundary like in this example:

context("group") {
    parameterize {
        // ...
        test("test") { // suspends with `context`'s coroutine scope, through parameterize's
            val x by parameter(...)
        }
    }
}

This can cause a bunch of unintended consequences depending how test is implemented. For instance, my current implementation can't handle crossing a launch or async boundary yet, and it's hard to know how that should behave (there's less academic info about this). Another thing is that how test is implemented might not be compatible with multishot behaviour.
The example doesn't really make much sense though since one would just include the parameterize call inside the test instead. I'm working on a compiler plugin that would offer errors in situations like these by annotating suspend scopes with a @Multishot annotation.

@kyay10 kyay10 linked a pull request Oct 1, 2024 that will close this issue
@CLOVIS-AI
Copy link

The intended usage is:

context("group") {
    parameterize {
        val x by parameter(...)
    
        // ...
        test("test") { // suspends with `context`'s coroutine scope, through parameterize's
            
        }
        
        val y by parameter(...)
        
        test("test 2 $y") { … }
    }
}

As long as this is supported, I think it isn't a problem if declaring parameters within non-EXACTLY_ONCE lambdas, such as a nested test() declaration, is forbidden. Indeed, I would prefer if it was forbidden with a nice error message. I can't figure out what it would do.

@kyay10
Copy link

kyay10 commented Oct 1, 2024

Yeah that intended usage is absolutely fine! It's not even about EXACTLY_ONCE-ness. The potential issue is if the code uses mutable state in a presumed-linear way. A simple manifestation of this issue is in for-loops, where their use of a mutable iterator causes unexpected issues (this is the same state problem described above). Another issue is that currently I haven't supported crossing a launch boundary (so if test runs its block in a launch, and the block tries to use parameter, it results in a runtime error because we can't "see-through" the launch).
A final issue is that test could theoretically take the lambda and store it somewhere else, and hence end up calling it not within the parameterize block.

Using parameter within test could have somewhat reasonable behaviour, but it would be akin to simply using the parameters outside the test anyway.
Forbidding this in plain Kotlin is likely difficult though. @RestrictsSuspension is too limiting for this. The new compiler frontend has support for Checkers though, and so it'd be somewhat easy to write a plugin that prohibits such usages. It amounts to annotating multishot-capable suspend functions, and then imposing code-coloring restrictions akin to the ones that the compiler imposes on normal suspend functions (or Composable functions).

@BenWoodworth
Copy link
Owner Author

Sorry for the late reply! I've been pretty busy and haven't had time to appreciate this yet.

I wanted to mention quick that I just opened an issue for allowing provideDelegate to suspend, that way there's some attention on the idea:
https://youtrack.jetbrains.com/issue/KT-72468/Support-suspend-operator-provideDelegate

@kyay10
Copy link

kyay10 commented Oct 30, 2024

Take all the time you need! Feel free to ask any questions or anything like that.
FWIW, I'm pretty sure your current implementation is basically a thermometer continuation but specialized to non-determinism. These "thermometer continuations" are a simulation of normal delimited continuations, and hence delimited continuations, if already available in a language, are way, way simpler.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants