Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP - name-based extractors for unapply to drop allocation of Some #47

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

lbialy
Copy link

@lbialy lbialy commented Mar 10, 2019

Initial attempt at implementation of #46 - I have checked and this seems to work as it should, but sadly usage of another value class to introduce allocation-less name-based extractor introduces an usage limitation for macro annotations - it's impossible to declare a value class inside of another class and sadly, test suite classes are one of potential use cases like that. Frankly, I have no idea how and if this can be circumvented but maybe some other soul can find a way to solve this riddle.

Tests fail, obviously, as macro annotations now won't compile with unapply = true when declared inside of a class.

@lbialy
Copy link
Author

lbialy commented Apr 25, 2019

Solution using a shared instance of name-based extractor fails for 2.10 only. Will look into this.

q"""def unapply[..$tparamsNoVar](x: ${clsDef.name}[..$tparamNames]): Some[${valDef.tpt}] =
Some(x.asInstanceOf[${valDef.tpt}])"""
q"""def unapply[..$tparamsNoVar](x: ${clsDef.name}[..$tparamNames]): io.estatico.newtype.NewType.NewTypeUnapplyOps[${valDef.tpt}] =
new io.estatico.newtype.NewType.NewTypeUnapplyOps[${valDef.tpt}](x.coerce[${valDef.tpt}])"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably avoid the use of the .coerce extension method as it's unreliable from this context (requires the user to have the Coercible ops imported). Instead, should be able to use asInstanceOf safely.

class NewTypeUnapplyOps[A](val a: A) extends AnyVal {
def isEmpty: Boolean = false
def get: A = a
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably be final class

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also wonder if you could get away with using val get in the constructor, e.g.

final class NewTypeUnapplyOps[A](val get: A) extends AnyVal {
  def isEmpty: Boolean = false
}

@@ -1,5 +1,6 @@
package io.estatico.newtype.macros

import io.estatico.newtype.Coercible
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this import is unnecessary?

@carymrobbins
Copy link
Member

First, thanks for the PR and I appreciate your patience waiting on me to finally get around to commenting on it! 😃

While I like the spirit of this change, I wonder if the Some ever really gets constructed here after the JVM is warmed up. I played around with this idea a while back (see #18 (comment)).

Here are the benchmarks via sbt clean 'jmh:run TestBenchmark' -

Benchmark                     Mode  Cnt          Score         Error  Unit
testCaseClass                thrpt  200  402211203.871 ±  861246.654  ops/s
testManualSimpleUnapply      thrpt  200  393148121.470 ± 4173139.812  ops/s
testManualUnapplyValueClass  thrpt  200  384478663.022 ±  870967.771  ops/s
testNewTypeSimpleUnapply     thrpt  200  110200664.917 ±  283007.861  ops/s

the testManual benchmarks are for handwritten newtypes, whereas the testNewType one is the macro generated one which contains ClassTag instances.

To expand on this, testManualSimpleUnapply uses the Some extractor whereas testManualUnapplyValueClass uses a value class (the same strategy as this PR) -

oleg-py/newtype-unapply@master...carymrobbins:unapply-class

object Manual {
  type Type <: Base with Tag
  trait Tag
  type Base <: Any
  def apply(x: String): Manual = x.asInstanceOf[Manual]
  def unapply(x: Manual): Some[String] = Some(x.asInstanceOf[String])
}

object ManualUnapplyValueClass {
  type Type <: Base with Tag
  trait Tag
  type Base <: Any
  def apply(x: String): ManualUnapplyValueClass = x.asInstanceOf[ManualUnapplyValueClass]
  def unapply(x: ManualUnapplyValueClass): Unapply = new Unapply(x.asInstanceOf[String])
  final class Unapply(val get: String) extends AnyVal {
    def isEmpty: Boolean = false
  }
}

Ideally, these handwritten versions should perform better than any generic solution (unless I've horribly messed something up) and their performance is very close (with the Some version actually being slightly faster, but that could just be due to solar flares).

If you do indeed want to pursue this avenue (although I know it's been a couple months now since this PR's inception), I'd recommend forking my benchmark and seeing how well it does. But if the performance of using Some is essentially identical to using AnyVal (which is probably mostly due to some JIT magic) it might make sense to leave it as Some. Definitely open to ideas here though.

@lbialy
Copy link
Author

lbialy commented Jun 5, 2019

To be perfectly honest I wouldn't expect allocation-less unapply to be significantly faster than Some-based version. One reason I can think of off the top of my head is that TLAB allocations are very fast, so any benchmark that would measure real impact of this would have to actually measure the impact of garbage generation pressure on the collector. With semi-concurrent and fully concurrent GCs available on the JVM this is a hard task. One way to measure this would probably be to use serial collector and prolonged benchmark which should show both the cost of allocation and elongated GC pause. There's another optimization that is available to some extent with C2 (at least that's what I believe, I'm not sure) which is escape analysis and greatly improved with GraalVM. If I understand this correctly JVM can choose to allocate Some in stack frame and then deallocate everything by just dropping that frame.

Now the question that you might be thinking of is: why would it be beneficial to have this in scala-newtype if JVM is capable of such feats? My answer is that if this project strives to provide sensible, zero-cost abstractions it should do so consistently and if it's possible to avoid an allocation using a clever compiler trick it would be the right thing to do.

I'm also a tad bit pedantic and like working on such intricacies 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants