vonnik 341 days ago [-]
Gluon is an attempt by Microsoft and Amazon to regain some influence in AI tools. Keras looked like it was going to become the standard high-level API, but now Theano is dead, CNTK and MXNet are controlled by Google's rivals, and they're ganging up against Google's tools. Francois Chollet is committed to keeping Keras neutral, but he's still a Google engineer, and that probably makes Microsoft and Amazon nervous. This is the equivalent of Microsoft creating C# in response to Java. The company that controls the API has enormous influence on the ecosystem built atop the API, just like Google has had with Android, or MSFT with Windows. MSFT and AMZN are carving out their own user base, or trying to, at the price of fragmenting the Python community.
Narew 341 days ago [-]
Unlike Keras and TensorFlow, Gluon is a define-by-run deep learning framework, like PyTorch and Chainer. Network definition, debugging, and flexibility are much better with dynamic (define-by-run) networks. That's why Facebook seems to use PyTorch for research and Caffe2 for deployment. Gluon/MXNet can do both: define-by-run with the Gluon API and "standard" define-and-run with its Module API.
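For anyone unfamiliar with the two styles, here is a toy sketch in plain Python. It is not the Gluon, MXNet, PyTorch, or TensorFlow API; `Node`, `placeholder`, `run`, and `dynamic_forward` are hypothetical names made up for illustration. The first half builds a graph and executes it later (define-and-run); the second half just runs ordinary Python, so the "graph" is whatever the code happens to execute (define-by-run).

```python
# Toy illustration only: none of these names belong to a real framework.

# --- define-and-run: describe the computation first, execute it later ---
class Node:
    """A symbolic node in a static graph (hypothetical helper)."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op
        self.inputs = inputs
        self.name = name

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def placeholder(name):
    """Create a named input whose value is supplied at run time."""
    return Node("input", name=name)


def run(node, feed):
    """Evaluate a previously built graph with concrete values."""
    if node.op == "input":
        return feed[node.name]
    left, right = (run(n, feed) for n in node.inputs)
    return left + right if node.op == "add" else left * right


x = placeholder("x")
w = placeholder("w")
graph = x * w + x  # nothing is computed yet; this only builds the graph
static_result = run(graph, {"x": 3.0, "w": 2.0})


# --- define-by-run: the graph is whatever the Python code executes ---
def dynamic_forward(x, w, steps):
    out = x
    for _ in range(steps):  # ordinary Python control flow
        out = out * w
        if out > 100:  # data-dependent early exit
            break
    return out


dynamic_result = dynamic_forward(3.0, 2.0, steps=10)
```

The data-dependent `break` is the kind of thing that is awkward to express in a static graph and trivial in the dynamic style, which is also why a normal debugger works on the second half.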
agibsonccc 341 days ago [-]
I think you're both right here. Competition will force the other to innovate. I don't think end users lose by there being multiple interfaces even if fragmentation is ultimately what happens here.

Standard formats and interop will help fix that.

deanCommie 341 days ago [-]
I think you had a justifiable devil's advocate position until your claim about "fragmenting".

What exactly is so bad about competition?

blueyes 341 days ago [-]
What is so bad about standardization?
mulmen 341 days ago [-]
congerous 341 days ago [-]
Data scientists arguably have too much choice. 10 data scientists will have 50 different tools, can't share work or build on one another's experiments, or even remember what the results of an experiment were. Those are some of the reasons why most data science projects fail. That and integrations. Standardization has real benefits.
mulmen 341 days ago [-]
Of course standardization has benefits but how do you choose? Standardization only works if choice is eliminated so choice is a barrier to achieving standardization.
agibsonccc 341 days ago [-]
It often just comes down to project requirements. E.g., what kind of model is required? How hard would it be to build with tool x?

For example, a big reason why a lot of computer vision research was built on Caffe (and sorta still is, because of momentum) was the pre-existing model zoos.

A big reason why people choose TF (despite it lacking dynamic graphs) is just the existing community.

Requirements for both papers and industry will continue to evolve. Each framework will have its own trade-offs.

agibsonccc 341 days ago [-]
There are trade-offs to choice. In the case of another commenter, "too much choice" means a ton of churn and a lot of friction when it comes to building models.

I think there's always a trade-off of innovation vs. stability that people should be thinking about here.

Granted, things like the model formats should help long term, but for now we're going to be dealing with a ton of churn on APIs.

I'm sure another thing like dynamic graphs will come along and we'll need to update the APIs.

I suspect Keras will respond to this at some point by adding primitives for eager mode and the like.

I know both data scientists who need more advanced models and others who prefer the Keras API and just build off-the-shelf models.

staticelf 341 days ago [-]
Can someone please shed light on why so many ML tools and frameworks are being implemented in Python? What makes Python so special for doing ML?

Personally, I would love for MS to release or support a .NET-based ML toolkit. There is open source stuff like http://accord-framework.net but I would assume that it isn't as big or as complete as a framework supported by a major corporation.

saurik 341 days ago [-]

> Python's Buffer Protocol: The #1 Reason Python Is The Fastest Growing Programming Language Today

> The buffer protocol was (and still is) an extremely low-level API for direct manipulation of memory buffers by other libraries. These are buffers created and used by the interpreter to store certain types of data (initially, primarily "array-like" structures where the type and size of data was known ahead of time) in contiguous memory.

> The primary motivation for providing such an API is to eliminate the need to copy data when only reading, clarify ownership semantics of the buffer, and to store the data in contiguous memory (even in the case of multi-dimensional data structures), where read access is extremely fast. Those "other libraries" that would make use of the API would almost certainly be written in C and highly performance sensitive. The new protocol meant that if I create a NumPy array of ints, other libraries can directly access the underlying memory buffer rather than requiring indirection or, worse, copying of that data before it can be used.

(The italic emphasis was copied from the original article.)
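A minimal stdlib sketch of what the quoted passage describes. It uses `array` and `memoryview` rather than NumPy, purely as an assumption to keep it dependency-free; the variable names are made up. The point is that a view onto the buffer shares memory with the original object, while an ordinary slice copies.

```python
from array import array

numbers = array("i", [10, 20, 30, 40])  # contiguous block of C ints
view = memoryview(numbers)              # exposes that block via the buffer protocol

window = view[1:3]   # slicing the view does not copy the data
window[0] = 99       # writes straight into the array's memory

copied = numbers[1:3]  # slicing the array itself DOES copy
copied[1] = -1         # so this leaves `numbers` untouched

print(numbers.tolist())
print(view.format, view.itemsize, view.contiguous)
```

A C extension gets the same kind of access through the C-level side of the protocol, which is how libraries can hand large arrays to each other without copying.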

stingraycharles 341 days ago [-]
That doesn’t sound like a very satisfying reason to me. Isn’t a ByteArrayBuffer in Java pretty much the same (its underlying implementation is a char[] which can be used directly from C)?

Is there perhaps another factor, such as an existing ecosystem or that it’s widely used in the academic field?

agibsonccc 341 days ago [-]
If you use direct allocation, yes. Java has great facilities for direct memory management. (Look at the number of libraries like Netty and co. that exist.) The problem with Java is more the friction of getting at some of the lower-level APIs.

Python just has momentum and a fairly easy to use FFI.

pathseeker 341 days ago [-]
>Is there perhaps another factor, such as an existing ecosystem or that it’s widely used in the academic field?

Yes, it's widely used in science in general. Don't underestimate the learning curves of other languages when your audience is scientists and mathematicians. Python is incredibly easy to use, even when using numpy and other scientific tools.

genericpseudo 341 days ago [-]
Python has been the language of choice for many physicists (when not doing Fortran) since the early 2000s.
aabajian 341 days ago [-]
As a long-time Java developer, Python was a beauty. It brings back the joy of programming and makes data manipulation a breeze. C# is better than Java, but it's still not as elegant/simple/clean as Python for data science.
kuschku 341 days ago [-]
Python is only fun for tiny projects. Once you reach 120k LOC in a project, refactoring in Python is an insanity even with PyCharm, and debugging becomes impossible, too.

Have you tried Kotlin?

chirau 341 days ago [-]
Data Engineer here... How do you get to 120K LOC without splitting up your infrastructure? If anything, it is poor design on your part. Python is beautiful. I've used it at 3 different companies now; at 2 of them I encouraged people to try it out, and they have nothing but love for it.
kuschku 341 days ago [-]
Even if you split your infrastructure, have you ever tried refactoring larger projects, with many contributors, while ensuring API contracts are kept?

Without a strict and static type system it becomes quite problematic to ensure new code keeps the API contract, unless you have unit tests for every possible value.

A good type system accelerates your coding speed, compared to writing equivalent unit tests, and it improves your quality, compared to no testing.

tanilama 341 days ago [-]
> A good type system accelerates your coding speed, compared to writing equivalent unit tests, and it improves your quality, compared to no testing.

Coding speed is the least of my concerns for an ML project, to be honest. And unit tests aren't that useful either, since ML is by and large not deterministic. A lot of what you said is true for web applications, but it doesn't really apply to an ML project.

nielsbot 341 days ago [-]
“accelerates coding speed”

this is debatable

kuschku 341 days ago [-]
Well, the comparison was to writing unit tests that provide the same safety as an equivalent type system.

And compared to that, the type system is certainly faster.

lstmemery 341 days ago [-]
Numpy enforces type consistency within arrays. Type errors are still possible but generally rarer and are noticed sooner than base Python.
nielsbot 340 days ago [-]
valid point
HelloNurse 341 days ago [-]
Once you reach 120k LOC in a machine learning project, you should have split it up into disparate projects (input adapters, interactive applications, transforming results...) many orders of magnitude ago, even in verbose and refactoring-friendly languages like Java.
real-hacker 330 days ago [-]
Most deep learning programs are in the range of hundreds of lines of code, even for quite complicated models.
bertomartin 341 days ago [-]
Can you be specific on what makes this a headache? Your "refactoring" tells me that the code probably wasn't well structured in the first place and if so this would make refactoring difficult for any language, particularly dynamically typed ones.
grtrans 341 days ago [-]
> Your "refactoring" tells me that the code probably wasn't well structured in the first place

Well considering this is a realistic scenario for fallible humans, it’s still decent advice to keep your exploratory projects in python small to avoid ridiculous tech debt. It’s not quite as bad as with ruby, but it’s close.

kuschku 341 days ago [-]
I've got exactly that experience.

Many languages make it hard to keep code bug-free and maintainable. Python and especially Ruby are problematic in that respect, while Java, Kotlin, and even C++ (with a strict style guide) are a lot nicer to work with at scale.

If you want to keep consistent APIs between modules, strict types and checked exceptions are very helpful, while with python one typo can lead to accesses being lost — which is why so many use slots nowadays, and TypedPython, and annotations. But if I do that, I might as well use Java or Kotlin, and get a better IDE.

Compared to unit tests, strict and static types are faster; compared to no testing, static types are safer.
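To make the "one typo" point concrete, here is a small sketch of the slots workaround mentioned above. `Config`, `SlottedConfig`, and `learning_rate` are hypothetical names chosen for illustration. Note that `__slots__` only catches the mistake at runtime, when the line actually executes; it is not a substitute for a static checker.

```python
class Config:
    """Plain class: a misspelled attribute silently creates a new attribute."""

    def __init__(self):
        self.learning_rate = 0.1


class SlottedConfig:
    """Same class with __slots__: only the listed attribute names are allowed."""

    __slots__ = ("learning_rate",)

    def __init__(self) -> None:
        self.learning_rate: float = 0.1


loose = Config()
loose.learning_rte = 0.01  # typo: no error, the real field keeps its old value

strict = SlottedConfig()
try:
    strict.learning_rte = 0.01  # same typo: rejected when this line runs
    typo_caught = False
except AttributeError:
    typo_caught = True
```

In the plain class the write is simply lost, which is the failure mode described above; in the slotted class it fails loudly.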

bertomartin 341 days ago [-]
I probably could have put that better, so here goes: shitty code can be a challenge to refactor regardless of language
tanilama 341 days ago [-]
Python in data/ML rarely gets to that scale. It is used for training: several thousand lines per project at most, which is tractable, and I don't think machine learning models can really be refactored or debugged like a web application.
timdorr 341 days ago [-]
Python has always had a good selection of scientific/mathematical libraries (numpy, scipy).

In addition, notebook apps like Jupyter fit well with the experimental nature of scientific code. I have a colleague who was attempting to do some stuff in Ruby (to fit with our application stack) who would leave IRB sessions open for weeks at a time. He's recently switched to Zeppelin for notebook stuff, and it has been a huge productivity boost for him.

pm90 341 days ago [-]
Mostly because Python is a great "glue" language. It isn't performant enough to implement the actual low-level computation, but it is better at running other applications, getting data from them, and feeding it into other applications (a.k.a. pipelines).
staticelf 341 days ago [-]
Sure but if I have invested a lot in MS technologies I am hesitant to learn and implement things in a completely new language if I can find something that is more fitting to the stack I already use.
HelloNurse 341 days ago [-]
What relevant stack of Microsoft technology are you hoping to leverage for high performance numerical computation? Surely nothing involving .NET. Python for Windows and Python extensions can be compiled with Visual Studio; I don't see other ways to be more Microsoft-friendly.
nl 341 days ago [-]
> Sure but if I have invested a lot in MS technologies I am hesitant to learn and implement things in a completely new language if I can find something that is more fitting to the stack I already use.

The rest of the ML world is in that exact situation, but on Python. They aren't going to throw away their familiar tools unless everyone else does too.

losteric 341 days ago [-]
If you already have a good stack that you're very effective with, the switching cost may not be worth it.

However, if you're more algorithmically focused, Python is a great DSL.

hackinthebochs 341 days ago [-]
Python is easy to pick up (coming from someone who loves the MS stack). Don't let it being a new language deter you.
grtrans 341 days ago [-]
It’s not clear, e.g., what use you would get from sharing a language between your model training and request processing.
aswanson 341 days ago [-]
Momentum, community uptake. Python with numpy and matplotlib had the framework in place as a free alternative to the exorbitantly priced Matlab. Community ran with it.
zitterbewegung 341 days ago [-]
Python had a scientific community long before even this whole new fad of Data Science. Also, financial institutions helped make software such as Pandas. The last thing is that the syntax of the language is really friendly and easy to use, with batteries included. Others have mentioned it's a great glue language since it was originally designed as a systems programming language and a bunch of *nix distributions use it for that purpose.
alexcnwy 341 days ago [-]
Most DL researchers are more into math/stats/theory than programming and Python is faaar easier to pick up and grok than java/.net/etc.
othersideofcoin 341 days ago [-]
It's a common denominator, few downsides, speed handled in lower level code/libs. Really good "get shit done" language, best GSD lang I've used. Scientists + programmers, data engineers and PhDs, all are cool w/ the syntax. It's open source, has a shitton of supporting libs.

Outside of speed, I've read very few valid criticisms. What other languages are there that are cross-platform, have great lib support, and delegate easily to lower-level libs for perf?

(FWIW, any MS-based language is probably excluded from consideration depending on its cross-platform ability. Many data people -- like me! -- won't use an MS-based OS.)

hacking_again 341 days ago [-]
Python has its share of cruft and idiosyncrasies. I find some of the syntax irritating, e.g. boolean logic, argument handling, hidden / private / magic symbols, and those pesky half-open intervals that routinely lead to off-by-one errors.
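On the half-open intervals specifically, a short sketch of the rule in case it helps; the names `items`, `start`, `stop`, and `k` are arbitrary. The stop index is always excluded, which is where the off-by-one usually comes from, but it is also what makes lengths and splits line up without a `+ 1`.

```python
items = ["a", "b", "c", "d", "e"]
start, stop = 1, 4

# The length of a half-open range is just stop - start, with no "+ 1" correction.
assert len(range(start, stop)) == stop - start

# Splitting at any index k gives two pieces that neither overlap nor leave a gap.
k = 2
assert items[:k] + items[k:] == items

# Adjacent ranges chain: [0, k) followed by [k, n) covers [0, n) exactly once.
assert list(range(0, k)) + list(range(k, len(items))) == list(range(len(items)))

# The usual trap: the element at the stop index itself is NOT included.
middle = items[start:stop]
print(middle)
```

Whether that trade is worth it is a matter of taste, but at least the rule is the same for `range`, slices, and most of the standard library.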
make3 341 days ago [-]
According to the NIPS blog, I think, 98% of deep learning papers are published with Python source, 1% with R, and 1% with other stuff.
justnikos 340 days ago [-]
CNTK has C# bindings (for training) since last month (and for evaluation since forever). Also more stuff will be coming into the core C# language.
oh-kumudo 341 days ago [-]
Data/ML people love Python; they've built a great ecosystem around it over 20 years. So yes, Python is pretty special.
romanovcode 341 days ago [-]
Same here, kind of defeats the purpose of many libraries in my opinion if all of them are released for Python and look exactly the same. Think: front-end frameworks in javascript.
indescions_2017 341 days ago [-]
AWS / MSR response to dynamic graph computation paradigms such as Chainer and TensorFlow Fold. There is certainly a distribution advantage to having this be the default backend engine, as so many store their data on S3 and Redshift.

The DyNet paper is still the best source for background on the relative advantages of using "Define-By-Run" networks:

DyNet: The Dynamic Neural Network Toolkit


Now I just need to get to where scaling to 1000 GPUs is a problem I actually have ;)

binarymax 341 days ago [-]
I feel like DL/ML frameworks are getting about as frequent as the JavaScript framework craze of the past decade. I started out on Torch and recently I've been using Keras, and this looks quite similar to the latter. Hard to see why one would switch to this if they are already comfortable in another.
wonder_bread 341 days ago [-]
Is this as much of a collusion attempt against TensorFlow as it seems? Or has this been a long time coming?
logicchains 341 days ago [-]
>Is this as much of a collusion attempt against TensorFlow as it seems?

What do you mean by collusion here? Seems more like an attempt at competition than collusion.

ninju 341 days ago [-]
I guess the collusion comes from the cooperation of Amazon and Microsoft on this effort
wonder_bread 341 days ago [-]
Do you think they would still be doing this if TF hadn't become the near de-facto ML software? Seems like more of a response to its rise than an actual research project
Artemis2 341 days ago [-]
Curious to see what HN thinks of this. Are Amazon and Microsoft going against Google and Tensorflow? Will we see Gluon Processing Units on AWS and Azure in the near future?
mza 341 days ago [-]
We love TensorFlow (and have a ton of developers using it on AWS).

Just like databases, we’ll support a wide range of engines on AWS; some of our own like Gluon, alongside others from the community like PyTorch and TensorFlow. They’re all first class citizens.

We even fund separate (competing!) teams internally to focus on making sure AWS is the best place to run each of these popular engines.

gfredtech 341 days ago [-]
TensorFlow was the default Keras backend on AWS, but it got replaced by MXNet.
krallistic 341 days ago [-]
Note that they used a pretty old, forked version of Keras for the MXNet backend.
eanzenberg 341 days ago [-]
Sorry that's super confusing, since currently Tensorflow is 1 of 2 possible backends to Keras. Is this the same Keras?
genericpseudo 341 days ago [-]
One of four (MXNet and CNTK alongside TF and Theano), and the Amazon deep learning API forked Keras to default to MXNet support before it was really ready - which irked the Keras authors quite a bit.
stablemap 341 days ago [-]
It was marked as a dupe, but someone posted this earlier and I like that it shows a little code:


eanzenberg 341 days ago [-]
This is interesting, but given the growing number of ML frameworks and languages, when new ones pop up it would be great for them to release methods to transfer existing models to their language. I would love some extra compatibility with AWS for deploying deep learning models in prod, but since I already have existing models running in production, it's a hard sell for me to re-train and re-implement existing work from scratch.
Narew 341 days ago [-]
Recently there have been more and more initiatives to establish standard formats in the deep learning ecosystem: dlpack for tensor format (https://github.com/dmlc/dlpack), onnx for saved NNs (https://github.com/onnx/onnx), and tvm for execution (http://tvmlang.org/2017/10/06/nnvm-compiler-announcement.htm...).
pjmlp 341 days ago [-]
Couldn't they have chosen a better name?


xelxebar 341 days ago [-]
Not to mention that "gluon" is the name of a particle (strong force carrier). There seems to be a trend recently to ride science hype by choosing a sciency-sounding name. :/
floopidydoopidy 341 days ago [-]
That's unfortunate. Sometimes I wonder if Microsoft does stuff like this on purpose. A simple Google search would show who is using what name.
moduspwnens14 341 days ago [-]
A little odd to see AWS and Microsoft partnering together on something like this, but it's good to see regardless.
amrrs 341 days ago [-]
Recently MSFT partnered with FB to release the Open Neural Network Exchange format for deep learning models, and now Microsoft is partnering with AWS. It seems Microsoft has learnt that Google is far ahead in this game (just like they lost the Internet game to Google), so the idea is to make something happen so that Google doesn't become another unbeatable winner in AI.
nozzlegear 341 days ago [-]
There seems to be a little bit of partnership between Amazon and Microsoft. For example, they recently announced that users will soon be able to use Alexa from Cortana, and Cortana from Alexa. I'm curious to see if this trend continues.
origami777 341 days ago [-]
AWS is probably the biggest windows customer by now. Probably a lot of ideas passed between the two.
chatmasta 341 days ago [-]
The most important executives at both companies live and work around Seattle, so maybe that has something to do with it. There are some nice golf courses in Bellevue!
intern4tional 341 days ago [-]
This is not the case. Probably more relevant is that there are dedicated AI / Cloud Computing meetups and groups in the Bellevue and Seattle area that employees of all the companies attend. This means that the people that do this research and build these products frequently chat with each other.

I know my counterparts at AWS as a result and as we are friends, I push for collaboration whenever opportunities arise. At least on the MS side of the house these sorts of outreach and collaborative projects are a ground up push.

jeffbarr 341 days ago [-]
As far as I know, none of my colleagues can golf. But I hear that we do have some nice courses.
bbgm 341 days ago [-]
That did make me chuckle.
forgotAgain 341 days ago [-]
Partnerships between tech companies are not that unusual. Now if it happens to amount to anything significant, that would be odd.
mhh__ 341 days ago [-]
I can't help but feel that "Gluon" is only going to conflict with gluons (the bosonic particle).
Analog24 341 days ago [-]
In the same way Caffe conflicts with your local coffee shop. This is far from the first naming conflict of the internet age.
mikebenfield 341 days ago [-]
There's also the Gluon language [1] and, as someone below mentioned, this [2] enterprise mobile thing.

[1]: https://github.com/gluon-lang/gluon

[2]: http://gluonhq.com/

c0brac0bra 341 days ago [-]
Page is 404ing now?
dstaheli 341 days ago [-]
machineman44 341 days ago [-]
Why AWS and Microsoft? Why not Amazon and Microsoft?
0xbear 341 days ago [-]
The goal is to erode TF user base. Too little, too late: PyTorch is already very effectively eroding the TF user base.