filereaper 126 days ago [-]
The following bytecodes have been introduced: vaload vastore vbox vdefault vload vreturn vstore vunbox vwithnewfield

These will have direct impact on the JVM. Is there a test version of the JVM with these bytecodes already implemented? Not sure if Java follows similar rules as the IETF, show us working code with the specifications.

jimktrains2 126 days ago [-]
If they're doing breaking changes to the VM, I wonder if we can get rid of type erasure. Generics seem really nice until you actually use them to do anything mildly complicated, then type erasure rears its ugly head and you now have a type parameter and a Thing.class normal parameter and reflection.
alkonaut 126 days ago [-]
I only thought the erasure was bad because of how Java lacked value types (so my list of T will use heap boxes when T is int).

I haven't given it a ton of thought but I thought that apart from reflection it would actually be nicer if the runtime erased types. I.e. at runtime a type should just be either a stack/copy/move type or a reference type. So erase all reference type to objects.

It feels like it's the job of the compiler to handle these. Obviously at runtime you can get some extra safety guarantees from reified types, but at the cost of languages on the runtime being more difficult to adapt to other paradigms etc.

I imagine a runtime that doesn't have strong opinions about types to be easier to make type level language changes for (i.e. the language(s) can evolve without the need for runtime changes), and it should also be easier to support entirely different type systems.

tigershark 126 days ago [-]
Erased generics are awful compared to proper generics like C#. If you ever used both of them it should be clear as the sun. In C# you don't have to pass around the type in a method parameter and use reflection. I’m missing how you can claim that the languages on the runtime with proper generics are more difficult to adapt to other paradigms given the extremely fast evolution of C# and F# compared to Java that for example received closures something like 10 years after C#.
mike_hearn 126 days ago [-]
That's a relatively minor syntax issue, fixable with something like Kotlin.

All runtimes need to erase types at some level. Otherwise ArrayList<Foo> and ArrayList<Bar> would end up compiling identical versions if both Foo and Bar are reference-only types, which just wastes memory. At some level the compiler and runtime need to merge duplications - in C++ that feature is called COMDAT folding, or used to be.

.NET has had serious problems with code duplication in the past. Here's an excellent blog post by a Microsoft engineer on it:

http://joeduffyblog.com/2011/10/23/on-generics-and-some-of-t...

jdmichal 126 days ago [-]
Your comment is a bit misleading, because .NET has always only instantiated one version of a generic for all reference types. Even the article you posted backs this up, though I'm sure I've read the same on MSDN.

"... instantiations over reference types are shared among all reference type instantiations for that generic type/method, whereas instantiations over value types get their own full copy of the code."

Just about the only thing it could do better is to reuse the same instantiation for all value types of the same size.

JamesNK 126 days ago [-]
.NET always preserves generic type information for reflection but there is code reuse when a generic type is used multiple times with different reference types, e.g. there is little overhead having List<object> and List<string>

Duplication exists with generics and value types, e.g. List<long> and List<DateTime> are entirely separate code. It's just a thing to keep in mind when mixing them with value types.

keth 126 days ago [-]
Don't Haskell and Scala also erase the types? If I remember correctly Martin Odersky even said erasure is better for some things in one of his videos/keynotes (?), but I'm not sure if I remember that correctly.
tel 126 days ago [-]
Haskell totally erases types and has a mechanism to recover that information as values. Both are way safer and easier to use than Java's versions.
pkolaczk 125 days ago [-]
Scala follows a similar route. It erases types, but lets you reify on demand in form of TypeTags.
tel 124 days ago [-]
Except not:

    x match {
      case _: List[Int] => 1
      case _: List[Char] => 2
      case _: String => 3
    }
This uses Java's semi-erased class tags and doesn't report a TypeTag constraint in its type. It will also return 1 when x is of type List[Char], since partial erasure means that the first two patterns end up being identical. The compiler will warn you about this situation, but generally it shows up all the time for various reasons. Super bad news.
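
The same collapse is easy to see from plain Java. A minimal sketch (the class and method names are made up for illustration): both lists report the same runtime class, and an instanceof test can only name the wildcard form.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // True when both lists share one runtime class, i.e. the element
    // type is gone and only ArrayList remains.
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<>();
        List<Character> chars = new ArrayList<>();
        return ints.getClass() == chars.getClass();
    }

    // The check can only ask "is this some List?", not "a List of what?".
    static boolean isSomeList(Object x) {
        return x instanceof List<?>;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());                  // true
        System.out.println(isSomeList(new ArrayList<String>())); // true
    }
}
```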

That all said, the TypeTag system could be very nice someday. Especially if asInstanceOf were dropped eventually or, better, relied upon TypeTag.

draven 126 days ago [-]
Scala does. It's annoying in pattern matching clauses.
virtualwhys 125 days ago [-]
and yet it prevents you from shooting yourself in the foot at run time (if you have `-Xfatal-warnings` enabled), so the compiler has your back.

When Dotty and Java 10 land many annoyances will go away thankfully.

PaulHoule 125 days ago [-]
Java maintained compatibility between non-generic containers and generics.
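
A small sketch of what that compatibility buys (names are hypothetical): a method written in the pre-generics raw style still accepts a List<String> unchanged, at the price of an unchecked warning.

```java
import java.util.ArrayList;
import java.util.List;

class RawInterop {
    // Pre-generics style API: it knows nothing about element types.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void legacyAdd(List list) {
        list.add("from legacy code");
    }

    static String firstAfterLegacyCall() {
        List<String> names = new ArrayList<>();
        legacyAdd(names); // a List<String> is accepted as a raw List
        return names.get(0);
    }

    public static void main(String[] args) {
        System.out.println(firstAfterLegacyCall()); // from legacy code
    }
}
```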

Look at C# on the other hand and you see they had to add entirely new types which means that the .NET framework has an ugly split between APIs based on the old container classes and those based on the new container classes. That fits the trend that C# is a better language than Java, but Java has better class libraries to work with.

paavohtl 125 days ago [-]
But this split happened in 2005. The ecosystem fully migrated to generic containers almost instantly, and you won't find the old containers being used anywhere.

They took a risk, and it paid off. In my opinion, C# is both a better language and has better class libraries.

ygra 125 days ago [-]
To be fair, C# was a lot less entrenched at that time than Java was, even though I think Java could have made the same step at the time. But the Java maintainers probably had good reasons for that decision as well.

As for better class libraries, I found the .NET BCL to be excellently designed and thought through. It's also very consistent throughout. Now, parts of the FCL, like System.Windows.Forms are another matter ...

There are still some warts, such as there not being an ISet<T> before .NET 4, but Java's standard library has its share of quirks and historical weirdnesses as well. And as libraries age there are always old ways of doing things you can never really remove, and newer ways that are better. Neither of the two is as bad as C++, but depending on what you do you can stumble around in a swamp of old APIs for a while before finding what you're actually supposed to use.

alkonaut 126 days ago [-]
> In C# you don't have to pass around the type in a method parameter and use reflection.

I know - but I'm not familiar with the situation where I'd have to do that in Java. Do you mean e.g.

    void Foo(object x)
    {
      if (x is List<int>)
         ...
      else
         ...
    }
bskap 125 days ago [-]
The canonical example in the standard library would be this one from the entire collections framework:

    public Object[] toArray()
    public <T> T[] toArray(T[] a)
You cannot turn an ArrayList<T> into a T[], for example, without passing a T[] into the function so that you can grab the type from the passed parameter at runtime. C#, which retains the generic information at runtime, doesn't need this, so you can just do

    public T[] toArray()
The other place I've seen this come up is with Exceptions- you cannot catch a generic Exception. It's pretty annoying now that Java 8 has streams because you cannot have a checked exception abort out of processing a Stream unless you either 1) catch Exception (rather than the specific subclass) and deal with everything or 2) catch the checked exception in the lambda, wrap it in a runtime exception, rethrow it, and then catch the wrapped exception.
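
Roughly, both points look like this in practice. This is only a sketch: load() is a made-up operation that throws a checked exception, and the wrap-and-unwrap is option 2 from above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ToArrayDemo {
    static String[] asStrings(List<String> list) {
        // The zero-length array only supplies the runtime component type.
        return list.toArray(new String[0]);
    }

    // Hypothetical operation that throws a checked exception.
    static String load(String name) throws IOException {
        if (name.isEmpty()) throw new IOException("empty name");
        return name.toUpperCase();
    }

    static List<String> loadAll(List<String> names) throws IOException {
        try {
            return names.stream().map(n -> {
                try {
                    return load(n);
                } catch (IOException e) {
                    // Lambdas passed to map() cannot throw checked exceptions.
                    throw new UncheckedIOException(e);
                }
            }).collect(Collectors.toList());
        } catch (UncheckedIOException e) {
            throw e.getCause(); // unwrap back to the original IOException
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Arrays.toString(asStrings(Arrays.asList("a", "b"))));
        System.out.println(loadAll(Arrays.asList("x", "y")));
    }
}
```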
Sharlin 126 days ago [-]
Basically every time you need to instantiate a type passed as a type parameter. You either need to pass the appropriate class literal or a factory function. And if you need to create an instance of a generic type parameterized by a type parameter (e.g. Collection<T>) you need a factory or else resort to raw types and/or ugly casting.
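
A sketch of the factory-function route, assuming Java 8's Supplier (the helper names addNew and collectionOf are invented here):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.function.Supplier;

class FactoryDemo {
    // Stand-in for "new T()": the caller hands in a constructor reference.
    static <T> void addNew(List<T> list, Supplier<T> factory) {
        list.add(factory.get());
    }

    // Stand-in for "new C()" where C is itself a generic collection type.
    static <T, C extends Collection<T>> C collectionOf(Supplier<C> factory, T item) {
        C coll = factory.get();
        coll.add(item);
        return coll;
    }

    static int buildersAfterAdd() {
        List<StringBuilder> builders = new ArrayList<>();
        addNew(builders, StringBuilder::new);
        return builders.size();
    }

    static List<String> helloList() {
        // Explicit type arguments, so nothing hinges on inference.
        return FactoryDemo.<String, List<String>>collectionOf(ArrayList::new, "hello");
    }

    public static void main(String[] args) {
        System.out.println(buildersAfterAdd()); // 1
        System.out.println(helloList());        // [hello]
    }
}
```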
alkonaut 126 days ago [-]
Do you mean things like this (C#)?

    void AddAnItem<T>(List<T> aList) where T : new()
    {
       aList.Add(new T());
    }
Or for instantiating generic types:

    public TColl CreateACollectionAndAddAnItem<TColl, T>(T item) 
       where TColl : ICollection<T>, new()       
    {
       TColl aList = new TColl(); 
       aList.Add(item);
       return aList;
    }

    // Usage 
    List<string> myList = CreateACollectionAndAddAnItem<List<string>, string>("hello");
jdmichal 126 days ago [-]
Both those examples are exactly the type of thing you cannot do in Java because of type erasure. For instance, your first example would need to be:

    // The Class<T> token carries the runtime type that erasure removed.
    static <T> void addItem(List<T> list, Class<T> type)
            throws ReflectiveOperationException {
        T item = type.newInstance();
        list.add(item);
    }
alkonaut 125 days ago [-]
Can't that be solved by passing these types around under the hood, but allowing the language to sugar them out when they are known? I thought this kind of half-erasure was how some languages already operated on runtimes that erase generics.
jdmichal 125 days ago [-]
That's basically what Java does. Java class definitions contain the full, reified type information, but when the runtime loads the class, those types are lost. Only the compiler uses them.

There are some ways to introspect generic types in Java, but you need a concrete binding. For instance, if a method returns List<Integer>, you can in fact see that it returns List<Integer> and not just List. Method.getGenericReturnType() would return a ParameterizedType with a List raw type and Integer type arguments. But that requirement for it to be a concrete binding means it's not really helpful from the context of writing a generic class or method in the first place.
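
For the curious, a minimal sketch of that introspection (numbers() is a made-up method with a concrete List<Integer> return type):

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.Collections;
import java.util.List;

class SignatureDemo {
    // Hypothetical method whose declared return type has a concrete binding.
    static List<Integer> numbers() {
        return Collections.emptyList();
    }

    static String describeReturnType() {
        try {
            Method m = SignatureDemo.class.getDeclaredMethod("numbers");
            // The class file keeps the generic signature of the declaration.
            ParameterizedType pt = (ParameterizedType) m.getGenericReturnType();
            Type raw = pt.getRawType();
            Type arg = pt.getActualTypeArguments()[0];
            return raw.getTypeName() + " of " + arg.getTypeName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describeReturnType()); // java.util.List of java.lang.Integer
    }
}
```

Note that the list object returned by numbers() carries none of this; the information lives on the method declaration only.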

Using the above example, I'm not sure how you could desugar that without having the type known to the runtime. The generic method is going to be the same code no matter the type provided, but the type provided is necessary in order to know the type to construct. So either the runtime must provide the type, or it must be given as a parameter.

Additionally, glossing over the issue like that creates a huge trade-off. Now programmers must build a mental model of when the compiler can and can't do the type binding. The difficulty of building such a mental model accurately is one of the central complaints against Rust's borrow and lifetime checker(s).

kuschku 126 days ago [-]
You don’t have to in Kotlin either, and that’s on the JVM.
sixbrx 126 days ago [-]
IIRC Kotlin can only do it on inline functions on type Params explicitly marked reified
shellac 126 days ago [-]
> I only thought the erasure was bad because of how Java lacked value types (so my list of T will use heap boxes when T is int).

That's covered by a sibling valhalla feature called 'generic specialisation'. See http://openjdk.java.net/jeps/218.

alkonaut 126 days ago [-]
Yeah that looks like an excellent step (and one that should have been taken around the same time as .NET 2 was released). Getting better performance from a MyIntList than from an ArrayList<int> is such a horrible design smell.
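
For anyone who hasn't had to write one, a rough sketch of such a hand-rolled list (IntList and its methods are my own names, not from any library). It stores ints in an int[] directly, whereas ArrayList<Integer> holds references to Integer objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IntList {
    private int[] data = new int[8];
    private int size;

    void add(int value) {
        // Grow the backing array by doubling when it is full.
        if (size == data.length) data = Arrays.copyOf(data, size * 2);
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException();
        return data[index];
    }

    long sum() {
        long total = 0;
        for (int i = 0; i < size; i++) total += data[i]; // no unboxing
        return total;
    }

    // Same computation over the boxed collection, for comparison.
    static long boxedSum(int n) {
        List<Integer> boxed = new ArrayList<>();
        for (int i = 0; i < n; i++) boxed.add(i);  // autoboxes each int
        long total = 0;
        for (Integer v : boxed) total += v;        // unboxes each element
        return total;
    }

    public static void main(String[] args) {
        IntList list = new IntList();
        for (int i = 0; i < 100; i++) list.add(i);
        System.out.println(list.sum());    // 4950
        System.out.println(boxedSum(100)); // 4950
    }
}
```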
nradov 126 days ago [-]
Those are not breaking changes. Existing byte code will continue to work.

My understanding of generic type erasure is that it is more of a language issue than a VM issue. Nothing in the VM prevents reified generics, but when generics were introduced in Java 5 they decided to use type erasure in order to avoid having to produce a whole new collections standard library or break compatibility with existing code.

jimktrains2 126 days ago [-]
I meant breaking in that the Java 10 compiler would be targeting a newer version of the JVM and could dispense with having to be backwards compatible, which was one of the reasons I remember reading for generics.

They could have simply left the old library, deprecated it, and added in their genericed one as a new package. It had to be modified anyway.

evincarofautumn 126 days ago [-]
Right, it breaks forward compatibility: bytecode for a newer VM won’t work on an older VM. IMO backward compatibility is much more important for a programming language, though.
jdmichal 126 days ago [-]
Or they could have done erasure in the runtime when they detected non-generic usage of a generic class. Instead of forcing it on everyone.
moomin 126 days ago [-]
Yup, and Microsoft did exactly that, meaning C# has been able to express List<int> for ten years now.
valarauca1 126 days ago [-]
Nah this is a VM issue.

Type assertions have no ability to recurse. Right now you assert that both classes are a UTF8 value in the constant pool. Or really one class is a UTF8 value, which the compiler knows the first one is.

But with Generics, you may not know what it holds at runtime so this doesn't work. You'd have to constantly rebuild the constant pool (Slow) each time a new _kind_ of generic is used.

So either a new type+byte code would have to be introduced, or the type assertion instruction would have to become recursive. The latter is a breaking change.

Also, interfaces break this because those are class-specific. So Lorg/my/company/myClass$5; and Lorg/your/company/yourClass$0; might both implement the same interface. That interface information isn't in the collection class, but in those two classes' class files. So really it just needs new bytecode.

dtech 126 days ago [-]
If you look at the .NET (IL) implementation of generics [1] it relies quite heavily on the VM/bytecode. Implementing a new collections library (like C#/.NET did) is probably not the cost; afaik the Java team chose type-erased generics to prevent breaking binary compatibility between Java 1.4 and 5 at the VM/bytecode level.

[1]: https://stackoverflow.com/a/5342424/572635

jdmichal 126 days ago [-]
Java could have done the same by implementing erasure as a VM-level construct that's only invoked when a generic class is used in a non-generic manner. That is, using List is equivalent to using List<Object>, which I think even retains semantic equivalence. Bonus points for gating the feature behind pre-5 class versions.
PDoyle 126 days ago [-]
These aren't breaking changes. This isn't even the first time they've added bytecodes. (Java 7 added invokedynamic.)
bitmapbrother 126 days ago [-]
Isn't type erasure one of the reasons there are so many dynamic languages on the JVM?

Also, why do you consider the introduction of these new bytecodes as breaking changes for the JVM? Backward compatibility has always been one of the highest priorities for the Java architects.

paulddraper 126 days ago [-]
> Isn't type erasure one of the reasons there are so many dynamic languages on the JVM?

No and yes.

No, it doesn't really matter for dynamically typed languages (just use Object for generic types).

Yes, it matters for statically typed languages. One of the factors for Scala .NET abandonment was the difficulty in creating an interoperable advanced type system in a fully reified environment.

IMO, full type erasure (like C/C++) is the right way to go. Java erasure is weird only because it didn't do it completely, not because they did it at all.

Matthias247 126 days ago [-]
What do you mean with "full type erasure (like C/C++)"? Afaik if you have RTTI enabled you have full type information available at runtime, even for templates - which is the thing that people usually want when talking about reified generics. Is it about C++ without RTTI? Or about the fact that vector<int> is compiled completely separate from a vector<string> (monomorphization)? I think the latter one is an orthogonal issue to whether type information is available at runtime.
paulddraper 125 days ago [-]
You're right, since C++98 it has had RTTI. Subpar example on my part :/
josefx 125 days ago [-]
> No, it doesn't really matter for dynamically typed languages (just use Object for generic types).

At which point any call to the underlying environment fails with a type error. Sorry, your List<Object> is not and will never be a List<int>; trying to pass it as one won't do. Either you provide a way to construct a generic type with the right type parameters in the dynamic language or you accept that you can only inter-op with a limited subset of the underlying environment.

Of course that also gets ugly. You now have to deal with the fact that you not only have a List but several incompatible List<T> floating around, adding the contents of two lists suddenly includes the question what sort of list to return, List<Int>,List<Double>,List<Number> or an error? In a type erased context the answer is always the same, you return a List<Object>.

bad_user 126 days ago [-]
> No, it doesn't really matter for dynamically typed languages (just use Object for generic types).

This line is disproven by all the dynamic languages that happened and failed on the CLR.

kelnos 126 days ago [-]
Did they fail because they weren't possible to build on top of the CLR, or because people didn't want to use them?
jimktrains2 126 days ago [-]
I meant breaking in that the Java 10 compiler would be targeting a newer version of the JVM and could dispense with having to be backwards compatible, which was one of the reasons I remember reading for generics.
kelnos 126 days ago [-]
Where are you reading that? The Java N compiler always targets version N of the JVM (well, by default, anyway), and you can't run bytecode on older versions of the JVM.

I would sincerely doubt they're considering breaking back-compat in Java 10 (that is, making it so older bytecode targeting something <10 cannot run on 10+).

devdoomari 126 days ago [-]
Scala handles the generic edge cases using lower-bound and upper-bound types, and Scala runs on the JVM. But there could be other edge cases that cannot be solved easily through Scala's lower-/upper-bound types... (don't know enough)
bad_user 126 days ago [-]
It's in fact not type erasure that "rears its ugly head", but actually that Java's type system is not expressive enough, which is why developers end up relying on reflective capabilities.

There's something seriously wrong with the type system when you want "instanceOf T" checks or "new T".

barrkel 126 days ago [-]
There's nothing wrong with doing 'new T' in a generic method or class body. It requires a constraint on the type variable is all: that some constructor with a specific parameter list is available.
kelnos 126 days ago [-]
If you don't have/want "new T", how do you ever create an instance of anything?
chii 126 days ago [-]
The in this context is a variable, not a concrete type.
barrkel 126 days ago [-]
Did you mean: "The T in this context..."?

Instantiating a generic type is like a function call at compile time, where the type arguments are substituted for type parameters in the body of the generic type. It's a variable at compile time, but it's a concrete type at runtime for any version of the code that can execute.

_old_dude_ 126 days ago [-]
It has the same kind of rules, i believe, here is the code of the VM that works/may work with value types

http://hg.openjdk.java.net/valhalla/valhalla/hotspot/

karianna 126 days ago [-]
Hi all - interested in early access builds and / or helping out? Head to OpenJDK.java.net and join the Valhalla project and the Adoption Group.

Disclaimer - I help run the adoption group and maintain the Valhalla wiki.

nullnilvoid 126 days ago [-]
This title is hugely misleading. The value types specification is for the JVM, the platform, not Java 10. You can compile a large number of source languages to run on the JVM, such as Java, Scala, Clojure, Groovy, Kotlin and JavaScript, just to name a few.
desdiv 126 days ago [-]
>Version 54.1 class files are intended to be supported only by implementations of Java SE 10

Java SE 10, the platform, will support value types. One can interpret the ambiguous term "Java 10" as either "Java 10 the language" or "Java 10 the platform". Applying the principle of charity[0] will yield the right interpretation in this case.

[0] https://en.wikipedia.org/wiki/Principle_of_charity

WatchDog 126 days ago [-]
Is it really misleading though?

Does anyone with the slightest familiarity with JVM languages not understand that each major Java language version is also associated with a major JVM version?

aardvark179 126 days ago [-]
The important thing here is that this minimal value types specification is not aimed at making Java language level changes, or many of the Java runtime changes that would be expected of other VM features such as invokeDynamic that were not exposed in the Java language. Those still require a lot of work round the Java type system and the standard library.

What this does provide is a mechanism by which the use of value types can be experimented with by library and language authors. I believe these value type features are intended to be optional and are not guaranteed to remain unchanged at future JVM releases.

nullnilvoid 126 days ago [-]
Given that we are not even close to the official release of Java 9, and that it is still open whether the proposed change to the JVM specification will be accepted and, if so, in which version it will be implemented, it is largely inaccurate to title the article "Java 10 - Specification for Value Types" for a JVM change proposal. Besides, referring to the JVM and Java interchangeably reflects ignorance of the whole ecosystem of JVM languages.
notamy 126 days ago [-]
Java 9 is scheduled for release in September[1]. How is that "not even close"?

[1] http://openjdk.java.net/projects/jdk9/

WatchDog 126 days ago [-]
Well the specification explicitly talks about version 54 class files, which it then goes on to state "Version 54.1 class files are intended to be supported only by implementations of Java SE 10.". It's not going to land with Java SE 9.
nullnilvoid 126 days ago [-]
With all the prolonged delays of features in Java, whether it is Java 10, 11, or 12 is still in question. As an instance, Jigsaw was proposed to be a driving feature of Java 7, and then delayed to Java 8 and even Java 9. And recently it was voted "No" by IBM, Red Hat etc.
WatchDog 126 days ago [-]
Sure, features slip, but value types are coming in java 10 until they aren't.
_old_dude_ 126 days ago [-]
As far as i've understood, i'm a lurker on the valhalla mailing list, the current plan is to release a MVP containing only the implementation of value types in the VM at the same time as the 9.1 release so by the end of Q1 2018.
xxs 126 days ago [-]
How misleading? Pretty much everything available in Java is available directly in the bytecode. Java compiles to bytecode, and any JVM-based language can utilize the same bytecode(s).

There is even more: invokeDynamic, Java (the language) doesn't use it in its compiled bytecode and the JVM instruction is targeted at JVM based languages. (InvokeDynamic is still available from Java via java.lang.invoke)
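
A minimal sketch of the java.lang.invoke side (class and method names are mine). Note that this calls a method handle directly; handles are the building blocks that invokedynamic bootstrap methods link against, and the snippet does not itself emit the instruction.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

class HandleDemo {
    static int lengthViaHandle(String s) {
        try {
            // Look up String.length() as a handle of type (String)int.
            MethodHandle length = MethodHandles.lookup()
                    .findVirtual(String.class, "length", MethodType.methodType(int.class));
            // invokeExact requires the call site to match (String)int exactly.
            return (int) length.invokeExact(s);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(lengthViaHandle("valhalla")); // 8
    }
}
```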

jnordwick 126 days ago [-]
Java emits invokeDynamic instructions as part of lambda execution.
_old_dude_ 126 days ago [-]
Technically, the lambda creation, not its execution. It's also used for string concatenation in Java 9.

While invokedynamic was introduced to enhance the support of dynamic languages on the JVM, it is now also used as a kind of macro instruction, to avoid adding new instructions to the bytecode set when the operation can be decomposed into a set of already existing instructions.

jnordwick 125 days ago [-]
In Java 9 string concatenation it isn't for macro use but to allow the run time to make the final determination as to how it is going to actually do the joining since there are now multiple strategies available to process the strings and possibly user overrideable behavior to incorporate.
moomin 126 days ago [-]
In its own way this is a sad day. Yes, Java's getting better, but it marks another nail in the coffin for the dream of a smart compiler. In this case, escape analysis never lived up to its promise.
PDoyle 126 days ago [-]
I'm not sure what benefit you're picturing from escape analysis, but EA is largely pointless because GCs have gotten so good that stack allocation of objects is really no better than heap allocation plus collection of short-lived objects.

I worked for years on escape analysis in IBM's Java JIT compiler. We struggled to find any actual programs that showed any benefit at all. The real benefits of escape analysis were second-order effects like eliminating monitor operations on non-escaping objects, or breaking apart objects and putting their fields into registers (especially autoboxed primitives). The actual stack allocation wasn't really any faster than heap allocation, and a GC operation in the nursery doesn't even look at dead objects.

EA is basically a microbenchmark-killer. For real software, it's not often worth the trouble.

moomin 125 days ago [-]
I think we're agreeing. The party line for a long time was that Java didn't have value types because they thought EA would render it unnecessary. But as you say, it never really approached being able to replace the benefits of true value types.
PDoyle 125 days ago [-]
Ah, got it. Yes, I think that claim about EA would be considered naive in retrospect.
lolive 126 days ago [-]
Could you elaborate on what a smart compiler is?
mike_hearn 126 days ago [-]
A http://wiki.c2.com/?SufficientlySmartCompiler in this case would be a compiler capable of automatically converting classes into value types such that the developer did not have to think about this detail of memory management themselves.

I felt the same slight sadness when I saw the complexity of the planning involved in Java value types, and the many-year path to get there. Intuitively, a sufficiently smart compiler should have been able to take care of this, and in some cases it does. So it's worth reflecting on why adding new opcodes and such is necessary.

HotSpot and some other JVMs can do an optimisation called "scalar replacement", which converts objects into collections of local variables which are then subject to further optimisation. So for example a Triple<A, B, C> type class could be converted into three variables, then the optimiser notices that the second element of the triple was never actually used anywhere, and eliminates it entirely.
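
As a concrete candidate, here is a hedged sketch (the names are invented, and whether a given JIT actually performs the replacement depends on inlining and compiler heuristics):

```java
class ScalarDemo {
    static final class Triple {
        final int a, b, c;

        Triple(int a, int b, int c) {
            this.a = a;
            this.b = b;
            this.c = c;
        }
    }

    // The Triple never leaves this method, so a JIT may turn it into three
    // locals and then drop the one for b, which is never read.
    static int sumOuter(int x) {
        Triple t = new Triple(x, x * 2, x * 3);
        return t.a + t.c;
    }

    public static void main(String[] args) {
        System.out.println(sumOuter(2)); // 8
    }
}
```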

Scalar replacement is one of several optimisations that relies on the output of an escape analysis. Escape analysis is a better known term so people often use the name of the analysis to mean the name of the optimisations it unlocks, although that's not quite precise. Sometimes people talk about stack allocation, but that isn't quite right either. Only objects that don't "escape" [the thread] can be scalar replaced.

There are several reasons this isn't enough and why Java needs explicit support for value types.

Firstly, the JVM implements two different EA algorithms. The production algorithm that is used with the out of the box JIT compiler (C2) is somewhat limited. It can identify an object as escaping just because it might escape in some situations, even if it often doesn't. There is a better algorithm called "partial escape analysis" implemented in the experimental Graal JIT compiler, but Graal isn't used by default. In Java 8 it requires a special VM download. It'll be usable in Java 9 via command line switches. PEA can unlock optimisation of objects along specific code paths within a method even if in others it would escape, because it enables the un-optimisation of escaped objects back onto the heap.
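
The kind of shape where the difference shows up, as a sketch with made-up names: the object escapes on one branch only, so a whole-method analysis has to treat it as escaping, while a partial analysis can in principle keep the allocation off the common path.

```java
class PartialEscapeDemo {
    static final class Pair {
        final int a, b;

        Pair(int a, int b) {
            this.a = a;
            this.b = b;
        }
    }

    static Pair lastPublished; // storing here makes the object escape

    static int compute(int x, boolean publish) {
        Pair p = new Pair(x, x + 1);
        if (publish) {
            lastPublished = p; // escapes only on this path
        }
        return p.a + p.b;      // the other path never needs a heap object
    }

    public static void main(String[] args) {
        System.out.println(compute(3, false)); // 7
        System.out.println(compute(3, true));  // 7
    }
}
```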

Graal unfortunately can't be just dropped in. For one, you don't just replace the entire JIT compiler for a production grade mission critical VM like HotSpot. It may take years to shake out all the odd edge case bugs revealed by usage on the full range of software. For another, Graal is itself written in Java and thus suffers warmup periods where it compiles itself. The ahead-of-time compilation work done in Java 9 is - I believe - partly intended to pave a path to production for Graal.

Implementing PEA in C2 is theoretically possible, but I get the sense that there isn't much major work being done on C2 at the moment beyond some neat auto-vectorisation work being contributed by other parties. All the effort is going into Graal. I've heard that this is partly because it's a very complex bit of C++ and they're afraid of destabilising the compiler if they make major changes to it.

Unfortunately even with PEA, that still isn't enough to replace value types.

(P)EA can only work on code that is in-memory when the compiler's optimisation passes are operating, and moreover, only in-memory in the form of compiler intermediate representation. In a per-method compilation environment like most Java JITCs that means it can only work if the compiler inlines enough code into the method it's compiling. Graal inlines more aggressively than C2 does but it still has the fundamental limitation that it can't do inter-procedural escape analysis.

It can be asked, why not? What's so hard about inter-procedural EA?

That's a really good question that I wish John Rose, Brian Goetz and the others had written a solid full design doc or presentation on somewhere. They've said it would be incredibly painful, and obviously view the Valhalla (also incredibly painful) path as superior, so it must be hard. In the absence of such a talk I'll try and summarise what I learned by reading and watching what's out there.

Firstly, auto-valueization - which is what we're talking about here - would necessitate much, much larger changes to the JVM. Methods and classes are the atoms of a JVM and changing anything about their definition has a ripple effect over millions of lines of C++. HotSpot isn't just any C++ though - it's one of the most massively complex pieces of systems software out there, with big chunks written in assembly (or C++ that generates assembly). The Java architects have sometimes made references to the huge cost of changing the VM. They clearly perceive changing the Java language as expensive too, but compared to the cost of changing HotSpot, it can still be cheaper to complicate the language. Frankly it sounds like Java is groaning under the weight of HotSpot - it's fantastically stable and highly optimised, but that came at large complexity and testing costs that have to be considered for every feature. Part of the justification for doing Graal is that the JITC is the biggest part of the JVM and by rewriting it in Java, they win back some agility.

As an example of why this gets hard fast, consider that the compiled form of a "valueized" version of a parameter is different to the reference form. Same for return value (a value can be packed into multiple registers). So you either have to pick a form and then box/unbox if the compiled version doesn't 'fit' into the call site, or compile multiple forms and then keep track of them inside the VM to ensure they're linked correctly. And a class that embeds an array may wish to be specialised to a value-type array, but then you need to detect that and auto-specialise, detecting any codepaths that might assume reference semantics like nullability or synchronisation, and then you have to be able to undo all that work if you load a new class that violates that constraint.
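
To make "reference semantics" concrete, a small sketch (Point is a hypothetical value-like class). Each of these operations is legal on any object today, so an automatic conversion would have to prove none of them can ever happen to the class.

```java
class IdentityDemo {
    static final class Point {
        final int x, y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
        }

        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    // Identity: two equal points are still distinct objects.
    static boolean sameIdentity() {
        return new Point(1, 2) == new Point(1, 2);
    }

    static boolean sameValue() {
        return new Point(1, 2).equals(new Point(1, 2));
    }

    // Synchronisation: every object carries a monitor.
    static int lockAndRead(Point p) {
        synchronized (p) {
            return p.x;
        }
    }

    // Nullability: a reference may point at nothing.
    static Point absent() {
        return null;
    }

    public static void main(String[] args) {
        System.out.println(sameIdentity());                // false
        System.out.println(sameValue());                   // true
        System.out.println(lockAndRead(new Point(5, 6)));  // 5
        System.out.println(absent());                      // null
    }
}
```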

Secondly, detecting if something can be converted into a value type (probably) requires a whole program analysis. Java is very badly suited to this kind of analysis because it's designed on the assumption of very dynamic, very lazy loading of code. Not just because of applets and other things that download code on the fly, but it's quite common in Java-land for libraries to write and load bytecode at runtime too. Also Java users and developers expect to be able to download some random JAR and run it, or a random pile of JARs and run it, with essentially no startup delay. HotSpot can run hello world in 80 milliseconds on my laptop and the very fast edit-run cycle is one of the things that makes Java development productive relative to C++.

All that said, applets are less important than they once were, and Java 9 does introduce a "jlink" tool that does some kinds of link-time optimisations:

https://docs.oracle.com/javase/9/tools/jlink.htm#JSWOR-GUID-...

https://gist.github.com/mikehearn/930e3001e76415ae1e2d85dcdf...

PDoyle 125 days ago [-]
IBM's Java JIT compiler already has this "partial escape analysis". Escape analysis (EA) has still basically never made much difference in real apps. It also fundamentally can't solve the problem you want it to solve, because what happens if your value objects do escape? I really think the compiler is the wrong tool for the job here.

Java has two fundamentally different kinds of types: objects and primitives. Value types are user-defined primitives. Seems pretty straightforward, from a language design perspective. I'm not sure why we're trying so hard to avoid this.
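
To make the "what if they do escape?" point concrete, here is a minimal Java sketch. The class and method names are hypothetical, and whether a given JIT actually removes the allocation in the first method is an assumption, not a guarantee; the code only shows the two shapes that escape analysis has to tell apart.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names are illustrative only.
final class EscapeDemo {
    // A small immutable, value-like class.
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // The Point never leaves this method. A JIT with escape analysis is
    // allowed (not required) to keep x and y in registers and skip the
    // heap allocation entirely.
    static long sumNoEscape(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += p.x + p.y;
        }
        return total;
    }

    // Each Point is stored in a list that outlives the loop iteration,
    // so it escapes and has to exist as a real heap object.
    static long sumEscaping(int n, List<Point> sink) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            sink.add(p);
            total += p.x + p.y;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Point> sink = new ArrayList<>();
        System.out.println(sumNoEscape(1000) == sumEscaping(1000, sink));
        System.out.println(sink.size());
    }
}
```

Both methods compute the same result; only the second forces every Point onto the heap. A user-defined value type would let the programmer ask for the compact representation in both cases instead of hoping the optimiser finds it.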

naasking 125 days ago [-]
> It also fundamentally can't solve the problem you want it to solve, because what happens if your value objects do escape? I really think the compiler is the wrong tool for the job here.

You'd need to generalize stacks to full-fledged region analysis to see any real benefits, but only MLKit has taken it this far IIRC.

PDoyle 125 days ago [-]
Ok, so you analyze "regions". Then what? Modify the data structures to use some kind of efficient headerless representation? Then stop the world and rewrite the whole heap if a new class gets loaded that violates an assumption? Or if existing code tries to use reflection to look at these things?

Nothing's impossible I guess, but this kind of global data-structure reshaping is definitely far beyond the state of the art in Java JIT compilers.

naasking 125 days ago [-]
Combining Region Inference and Garbage Collection, http://www.elsman.com/pdf/pldi2002.pdf
PDoyle 125 days ago [-]
Cool, thanks for the link.

My mistake--I thought "regions" meant some sort of interprocedural "code regions".

I suspect the state of the art in GC has moved on in the 15 years since this paper was written, and they might have trouble beating modern techniques, but that's speculation.

naasking 125 days ago [-]
It's hard to beat regions in time overhead, since every operation is constant time; space overhead is where they suffer, and that's where GC enters the picture.

Escape analysis is a degenerate region analysis, and many of the cases where it fails because it's not general enough would be handled quite well by regions.

moomin 126 days ago [-]
Can I just say that this was a superb answer and much more detailed than anything I was planning to write myself.
frogboglog 126 days ago [-]
That's a long post and I admit I didn't read all of it. But is anything you've said there a consideration for most "business" software? I can understand it mattering for, say, a cache-hitting trading app.
moomin 125 days ago [-]
It's important if you care about GC pauses (which you will eventually). Take a look at this report from the .NET-based Stack Overflow guys: http://blog.marcgravell.com/2011/10/assault-by-gc.html?m=1

Note that they solved their problems using exactly the feature being proposed for Java here: value types.

brianwawok 126 days ago [-]
Why use Java if not for the performance? If you don't care about speed I would never touch it.
chii 126 days ago [-]
A smart compiler is one that can bridge the shortcomings or incompetence of the programmer, so that you can be much more stupid and still output decent code.
louthy 126 days ago [-]
Which is what all compilers do (by optimizing and rewriting syntax trees to be more efficient)
netheril96 126 days ago [-]
What is the use of value types if they cannot be used as generic type arguments?
knappa 126 days ago [-]
That's another proposed change in Java 10, both part of Project Valhalla:

http://www.jesperdj.com/2015/10/12/project-valhalla-generic-...

valarauca1 126 days ago [-]
bluejekyll 126 days ago [-]
Tuples! Finally.

> Finalization. Finalization makes no sense for values and should be disallowed (though values may hold reference-typed components that are themselves finalizable.)

This is a little disappointing. I was hoping that the lifetime of what are essentially stack-based variables could be tracked, and finalization could be bound to the stack. That would be really powerful, though mind-bending after drilling into so many devs not to rely on finalization.

Imagine wrapping a Closeable object in a Closing tuple, for instance, whose cleanup is guaranteed to run when the tuple goes out of scope.
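
For comparison, the closest thing Java offers today is try-with-resources, where cleanup is tied to a lexical block rather than to a value's lifetime. A minimal sketch follows; the Resource class and its log are hypothetical, and only AutoCloseable and the try statement itself are real Java.

```java
// Hypothetical demo class; all names are illustrative only.
final class ScopedCloseDemo {
    // A stand-in resource that records what happens to it.
    static final class Resource implements AutoCloseable {
        private final StringBuilder log;

        Resource(StringBuilder log) {
            this.log = log;
            log.append("open;");
        }

        void use() { log.append("use;"); }

        @Override
        public void close() { log.append("close;"); }
    }

    static String run() {
        StringBuilder log = new StringBuilder();
        try (Resource r = new Resource(log)) {
            r.use();
        } // close() is called here, when the block exits
        log.append("after;");
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "open;use;close;after;"
    }
}
```

The difference from the idea above is that the scope is spelled out at the use site; nothing stops a reference to the resource from being stored somewhere and outliving the block.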

clhodapp 126 days ago [-]
At first reading, that sounds nigh-impossible to get right in a world where values on the stack can be closed over.
dkersten 126 days ago [-]
Closing over value types could capture by value and make a copy, like you would typically do in C++ when capturing values in lambdas.

Of course, since value types are immutable, that would be a bit memory wasteful I guess.

barrkel 126 days ago [-]
The copy would be living on a heap object only reachable via the interface reference the lambda implements. Deterministic finalization of that copy would be a Hard Problem.
clhodapp 126 days ago [-]
Yep. But how would you correctly deal with finalizers in the face of those copies? See my comment a bit deeper in the thread.
dkersten 126 days ago [-]
Yeah, you're right.

I guess it doesn't typically make sense to capture a RAII object and have the copy outlive the original, but it also may not make sense to delay the lifecycle of the original since it breaks the expectation of "i put this on the stack, so expect it to finalize when it goes out of scope". Probably best not to mix the two. That is, a stack-allocated RAII object whose finalizer is called when it goes out of scope, which references some data that can be closed over, with a lifetime separate from the RAII object.

At the very least, you would call finalizers on both (the stack one when it goes out of scope and the copy when it gets garbage collected), and then it's up to the programmer to make sure that shared resources are finalized correctly (e.g. by keeping a shared reference count...). (I'm thinking of a use case where the object represents some external resource that should be freed only once.)

Sounds pretty error prone though.

I suppose that Java's solution is KISS: simply don't support finalizers on value objects.

haimez 126 days ago [-]
And voila, rust.
_old_dude_ 126 days ago [-]
The Value Types proposed in TFA are immutable so no need to have a borrow checker for them :)
clhodapp 126 days ago [-]
You wouldn't need such a sophisticated tracker as Rust has because the implementation here rules out mutable borrows but you'd still need something that looked like ownership tracking or something that looked like garbage collection to correctly implement finalizers on value classes in the face of closure, right? When you close over a stack-allocated immutable value, the low-level effect is that you make a copy of it onto the heap, I'd think. However, at higher-level, you create another reference whose lifespan staves off finalization, no?

Edit: I'll add that it's possible that the proper implementation for even a bog-standard, assign-to-a-variable copying would need to work this way (make another finalization-delaying handle) but I think that's less-certainly true.

clhodapp 126 days ago [-]
Exactly! Now let's just amend the value types proposal to include ownership tracking! Should be simple, right?
pseudoramble 126 days ago [-]
This was handy to glance through. Thanks!

I did notice one point that struck me as odd (under "Details, details"):

> Can a value class implement interfaces? Yes. Boxing may occur when we treat a value as an interface instance.

I don't normally think of values in other languages as having methods directly or an interface like this. I wonder what the primary reasons for this would be. Backwards compatibility might be one reason.
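
Java already does something similar with its built-in primitives, which may help make the quoted sentence less odd. In this small sketch (class and method names are hypothetical), an int is treated as a Comparable, and the compiler inserts a boxing conversion to Integer to make that possible.

```java
// Hypothetical demo class; all names are illustrative only.
final class BoxingDemo {
    // Viewing an int through an interface forces a boxing conversion.
    static int compareViaInterface(int a, int b) {
        Comparable<Integer> boxed = a; // int -> Integer (boxing), then widened to the interface
        return boxed.compareTo(b);     // b is boxed to Integer for the call as well
    }

    // Once boxed, the value is an ordinary object reference.
    static boolean isBoxedInteger(int a) {
        Object o = a;
        return o instanceof Integer;
    }

    public static void main(String[] args) {
        System.out.println(compareViaInterface(42, 41) > 0);
        System.out.println(isBoxedInteger(42));
    }
}
```

Presumably user-defined value types would follow the same pattern: direct calls on the value need no box, and a box appears only when the value is handled through an interface or Object reference.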

taspeotis 126 days ago [-]
C# permits it but it's not completely intuitive [1]. As to why, it has some benefits, like letting all types (value or otherwise) implement IComparable, IEquatable, IFormattable etc.

[1] https://blogs.msdn.microsoft.com/abhinaba/2005/10/05/c-struc...

runeks 125 days ago [-]
> I don't normally think of values in other languages as having methods directly or an interface like this.

If I understand you correctly, Haskell has this, and it's basically all you have to implement generic interfaces in Haskell.

A simple example is the Bounded class, whose interface exposes two functions: minBound and maxBound. So, for example, all integers (int8, uint8, int16, uint16, [...], uint64) have upper and lower bounds, and can implement this interface, like so (Word8 is uint8):

    instance Bounded Word8 where
       minBound = 0
       maxBound = 255
This forces all instances of a class to be a type that contains all information needed to implement the interface (as opposed to some construct that needs to accumulate state at runtime in order for it to work properly), which means the compiler can check almost everything at compile-time.

This is where monads come in: a monadic value is simply a value that describes a computation, e.g. fetching a value of a specific type from a web server at a given URL. So, in Haskell, all that exists are values, and they're also used to represent an operation which, at runtime, performs a certain action.

So, for example, the following function:

    concatStr :: String -> String -> String
    concatStr a b = a ++ b  -- one possible definition, added for illustration
is a pure function whose result depends only on its two arguments (String and String) and which produces a String, whereas

    printReadStdIn :: String -> IO String
    printReadStdIn str = do
       putStrLn str
       getLine
is a function which, when given a String, will -- at runtime -- print out a String and read a String from standard input, and this String will be available inside the IO monad (which can implement interfaces, because it's just a value).
pjmlp 126 days ago [-]
A few examples are D, C#, VB, F#, Eiffel, Delphi, C++, Rust.
pseudoramble 125 days ago [-]
Thanks for the examples. Seems clear that I don't fully understand it then, but it's good to know that it's common so I can look around some more.
mike_hearn 126 days ago [-]
Project Valhalla has been running for years. The first few years were spent almost entirely on experimenting with reified generics in Java. They produced a full version of the collections and streams libraries where you can write ArrayList<int> and it gets specialised on the fly, a la C++.
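
As a rough picture of what "specialised" means, here is a hand-written int-only list in today's Java. This is only a sketch of the general idea; it is not what the Valhalla prototype generated, and IntList and its methods are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical demo class; all names are illustrative only.
final class IntListDemo {
    // What an ArrayList specialised to int might look like if written by
    // hand: the backing store is int[], so no Integer boxes are created.
    static final class IntList {
        private int[] data = new int[8];
        private int size = 0;

        void add(int value) {
            if (size == data.length) {
                data = Arrays.copyOf(data, size * 2); // grow the backing array
            }
            data[size++] = value;
        }

        int get(int index) {
            if (index < 0 || index >= size) {
                throw new IndexOutOfBoundsException("index " + index);
            }
            return data[index];
        }

        int size() { return size; }
    }

    static long sum(IntList list) {
        long total = 0;
        for (int i = 0; i < list.size(); i++) {
            total += list.get(i);
        }
        return total;
    }

    public static void main(String[] args) {
        IntList list = new IntList();
        for (int i = 0; i < 20; i++) {
            list.add(i);
        }
        System.out.println(sum(list)); // prints 190
    }
}
```

Today an ArrayList<Integer> stores references to boxed Integer objects instead; specialisation would let the generic source code produce something like the class above automatically.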
Cieplak 126 days ago [-]
C++ has had these things for years.

Seriously though, the JVM is killing it. Can't wait until I can compile my .jar to Verilog and run it on my FPGA, or send the Verilog to TSMC and get some ASICs printed out.

repsilat 126 days ago [-]
> C++ has had these things for years.

C++ is built on these things. The "value type" thing informs everything about the language. I love it, and it's my go-to language when something just has to be fast, but it comes along with some hefty baggage. Java gets a lot of mileage out of its reference-first philosophy that C++ would be wise to try out -- the "everything must be possible with no overhead" philosophy of the language bleeds into the programming community in unhealthy ways, and the amount of language machinery to make the value stuff work "as well as possible" (https://stackoverflow.com/questions/3601602/what-are-rvalues...) is absurd.

> Seriously though, the JVM is killing it

As for the JVM and Java, this seems like a guilty admission that the CLR and C# ate their lunch a decade ago (perhaps not wrt performance, but for language features....) I've never worked in the Microsoft stack, but the stories have been the same since the dawn of the age -- "C# is Java with real generics, value types and lambda functions." First they laughed...

bitmapbrother 126 days ago [-]
>As for the JVM and Java, this seems like a guilty admission that the CLR and C# ate their lunch a decade ago

I disagree. The JVM continues to eat the lunch of the CLR in performance, developer mindshare, pervasiveness and ecosystem. And with each new Java version those language feature differences continue to decline. And if you do need that syntactic sugar then there's always Kotlin.

Matthias247 126 days ago [-]
As great as structs in C# are for performance, they also provide several new ways to shoot yourself in the foot if you don't understand their behavior 100%.

One example that I ran into many years ago was having a struct Foo { int x; } which was stored inside a List<Foo> myList. Trying to mutate one list element with syntax like myList[0].x = 27 silently failed, because the indexer of the list returns a copy of Foo and you mutate that copy instead of writing it back. Nowadays it's a compiler error, but back then it wasn't. The workaround is something like "Foo f = myList[0]; f.x = 27; myList[0] = f;", which is quite weird if you are coming from C++. From a C# point of view it's actually obvious, since you can't pass around references and pointers to structs inside of other things - and if you added this functionality too, you would get back a lot of the complexity of C++ (references to structs, functions/properties that return references, questions around reference lifetimes, etc.). Without reference semantics for structs, structs also can't achieve the kind of performance which "value types" in C++ have, since they sometimes need to be copied in full.
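
Java has no mutable structs, so the exact C# bug can't be reproduced, but the same read-modify-write shape appears with immutable value-like classes. The sketch below is a Java analogy under that assumption; Foo and withX are hypothetical names that mirror the C# example above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names are illustrative only.
final class WriteBackDemo {
    // Immutable stand-in for the C# struct: "changing" x yields a new Foo.
    static final class Foo {
        final int x;
        Foo(int x) { this.x = x; }
        Foo withX(int newX) { return new Foo(newX); }
    }

    // Analogue of myList[0].x = 27: the modified copy is thrown away.
    static int forgetToWriteBack() {
        List<Foo> list = new ArrayList<>();
        list.add(new Foo(1));
        list.get(0).withX(27);   // result discarded, list is unchanged
        return list.get(0).x;    // still 1
    }

    // Analogue of the workaround: read, modify, write back.
    static int writeBack() {
        List<Foo> list = new ArrayList<>();
        list.add(new Foo(1));
        Foo f = list.get(0);
        list.set(0, f.withX(27));
        return list.get(0).x;    // now 27
    }

    public static void main(String[] args) {
        System.out.println(forgetToWriteBack()); // prints 1
        System.out.println(writeBack());         // prints 27
    }
}
```

Because the value types proposed for Java are immutable, the silent-mutation version of the bug has no direct equivalent; the price is that every update has to be written in the second style.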

Another nice gotcha is the surprising behavior of readonly structs, which e.g. is documented here: https://bytes.com/topic/c-sharp/answers/261922-readonly-beha...

For those reasons I think it's quite fair that Java and other languages did not immediately jump onto the value types train. I think introducing them carefully, and trading off the performance gains against the amount of new complexity in the language is fine.

pjmlp 126 days ago [-]
When Java was introduced we already had plenty of GC enabled languages with value types.

Eiffel, Oberon, Modula-2+, Modula-3, VB, are just a few examples.

Hence I always saw it as a lost opportunity not to provide them in the first place.

rsj_hn 125 days ago [-]
> As for the JVM and Java, this seems like a guilty admission that the CLR and C# ate their lunch a decade ago

The reason why this never happened was that Java was good enough. A quick look at TIOBE shows C# stuck at around 3% while Java continues to reign supreme.

I do think C# is a better language in many ways, but not, it seems, in any ways that really matter for purposes of language adoption. Which is a shame, since I enjoy writing code in C# more than in Java, but I don't have much opportunity to do it, and I've never been in a situation in which C# was the language and it wasn't an MS-only solution.

Cieplak 125 days ago [-]
I was hoping the "Seriously though" conveyed the sarcasm of the previous sentence, given that Java is written in C++.

I agree with you that C# is a much nicer programming language than Java. The HotSpot JVM, otoh, is a masterpiece of engineering in terms of its JIT and GC optimizations.

repsilat 125 days ago [-]
> I was hoping the Seriously though

Ah, my fault. You expected more than the audience deserved.

_old_dude_ 126 days ago [-]
Value types don't seem to be like structs in C++ or C#.

They are immutable so you have no way to know if they are stack allocated, split over several registers or on heap.

This is important because it means that JITs will not have to be changed, they will not have to manage stack aliases like in C#/C++.

eeperson 126 days ago [-]
Cieplak 126 days ago [-]
Chisel is very sweet, definitely the tool I would use to implement the JVM instruction set in silicon.
pedrow 125 days ago [-]
Does anyone know how this would work if there were two value types with the same members/definition but from different libraries - would they be interoperable? For example, a mathematical function library and a chart plotting library might both define complex numbers. It would be good if you didn't have to write (trivial) conversion functions every time you wanted to use them together, like you do now. (I think this is called 'structural typing'?)
kodablah 125 days ago [-]
It appears not and that's a good thing here. Java is based on explicit typing and I do not expect things to work w/ implicit conversions between value types. Granted, that doesn't mean that a JVM implementation may not have a higher performing way to perform a cast between two of the same struct. There is of course boxing/unboxing that comes w/ converting between value and ref classes.
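
A small sketch of what that looks like in practice with ordinary classes today. MathComplex and PlotComplex stand in for types from two hypothetical libraries; the assumption, following the comment above, is that value classes would be nominally typed in the same way.

```java
// Hypothetical demo class; all names are illustrative only.
final class NominalDemo {
    // Complex number as defined by a hypothetical maths library.
    static final class MathComplex {
        final double re, im;
        MathComplex(double re, double im) { this.re = re; this.im = im; }
    }

    // Structurally identical type from a hypothetical plotting library.
    static final class PlotComplex {
        final double re, im;
        PlotComplex(double re, double im) { this.re = re; this.im = im; }
    }

    // The plotting side only accepts its own type.
    static double magnitude(PlotComplex c) {
        return Math.sqrt(c.re * c.re + c.im * c.im);
    }

    // The trivial conversion that nominal typing makes necessary.
    static PlotComplex toPlot(MathComplex c) {
        return new PlotComplex(c.re, c.im);
    }

    public static void main(String[] args) {
        MathComplex z = new MathComplex(3.0, 4.0);
        // magnitude(z) would not compile: MathComplex is not a PlotComplex,
        // even though the two classes have identical fields.
        System.out.println(magnitude(toPlot(z))); // prints 5.0
    }
}
```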
saagarjha 126 days ago [-]
Looks cool, but I didn't see Java 10 anywhere. What's the timeframe for this? Is this truly a part of Java 10, considering that Java 9 is still in development?
sverhagen 126 days ago [-]
The new release date is September 21: https://jaxenter.com/java-9-schedule-change-sept-134484.html

It's been moved a lot and AGAIN, but I believe "the target is starting to move slower" :)

WatchDog 126 days ago [-]
The specification refers to version 54.1 class files. Version 54 corresponds to Java 10, with `.1` being the initial minor version.
geodel 126 days ago [-]
I'd think Java 10 will be around mid 2020 or later.
pulse7 126 days ago [-]
Why not just "value types" and "value class types" instead of long "direct value class types"?
Retra 126 days ago [-]
Because references are values.
cosmosgenius 126 days ago [-]
Is there a TL;DR version of this?
zyx321 126 days ago [-]
Java 10 may support defining your own primitives.

e.g. You could define a tuple-like type that will get passed and stored by value. Now you can make a billion member Array of 'em and not incur the speed and memory overhead you'd get from using Objects that need to be tracked, dereferenced, garbage-collected, etc...
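
To illustrate the difference in layout, here is a sketch of the workaround people use today: flattening pairs into a primitive array by hand. The names are hypothetical and no performance numbers are claimed; the point is only that the second layout has no per-element object to track or dereference.

```java
// Hypothetical demo class; all names are illustrative only.
final class FlatArrayDemo {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Array of references: every element is a separate heap object that
    // the GC has to track and that each access has to dereference.
    static Point[] makeObjects(int n) {
        Point[] points = new Point[n];
        for (int i = 0; i < n; i++) {
            points[i] = new Point(i, 2 * i);
        }
        return points;
    }

    // Hand-flattened layout: x and y interleaved in one int[].
    static int[] makeFlat(int n) {
        int[] xy = new int[2 * n];
        for (int i = 0; i < n; i++) {
            xy[2 * i] = i;
            xy[2 * i + 1] = 2 * i;
        }
        return xy;
    }

    static long sumObjects(Point[] points) {
        long total = 0;
        for (Point p : points) {
            total += p.x + p.y;
        }
        return total;
    }

    static long sumFlat(int[] xy) {
        long total = 0;
        for (int i = 0; i < xy.length; i += 2) {
            total += xy[i] + xy[i + 1];
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println(sumObjects(makeObjects(n)) == sumFlat(makeFlat(n)));
    }
}
```

With value types, the hope is that an array of a user-defined pair type could get the flat layout while the code keeps the readability of the Point version.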

_old_dude_ 126 days ago [-]
It's a patch to Java Virtual Machine spec in order to introduce value types, think like an immutable struct with methods attached.

I suppose that at some point in the future, the Java language spec will be updated to let us play with that.

exabrial 126 days ago [-]
<3 Java and Java EE! Glad to see the innovation wheels turning