Vega-Lite: A Grammar of Interactive Graphics (vega.github.io)
domoritz 1266 days ago [-]
Hi Hacker News! One of the Vega-Lite authors here. I'm excited to see you all here checking out our declarative visualization system.

If you want to learn more about the academic origins, check out the paper: https://idl.cs.washington.edu/papers/vega-lite. Vega-Lite is also available as the default plotting library in JupyterLab.

If you like, also check out Altair (https://github.com/altair-viz/altair), a Python API for generating Vega-Lite, and Vega-Lite-API (https://observablehq.com/@uwdata/introduction-to-vega-lite), a JavaScript API for generating Vega-Lite.

pbowyer 1266 days ago [-]
I'd like to second the thanks; Vega-Lite is awesome.

I've two pieces of feedback.

The first is I started by using Vega-Lite-API thinking it would be easier to understand. As I tried to do what I needed I found most examples in the manual are JSON only, and the JS API docs weren't helpful.

It took a while to work out how to make a log-scale axis, or how to customise the tooltip content (I could see in the JSON how to do it easily) and I gave up trying to work out how to provide a custom colour scale via JS - or to change to another pre-defined one.

The second is around toggling on/off the data in the chart. I had ~20 series on one scatter plot, and being able to highlight other members of a series, or turn series on/off would make it easier to explore the data.

I found a JSON example of doing this, but not one with the JS API. It was also more complicated than I had hoped: I wanted to use Vega-Lite as a better Excel to explore this data, not to write my own chart app, if that makes sense (different purpose).
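
For reference, here is a minimal sketch of the JSON-level settings mentioned above (a log-scale axis, custom tooltip content, and a different predefined colour scheme), built as a plain Python dict and serialised. The field names `mass`, `price` and `series` and the inline rows are hypothetical, and this shows the JSON route only, not the Vega-Lite-API calls.

```python
import json


def scatter_spec(rows):
    """Build a Vega-Lite scatter-plot spec as a plain dict."""
    return {
        "data": {"values": rows},
        "mark": "point",
        "encoding": {
            # Log-scale axis: set the scale type on the channel.
            "x": {"field": "mass", "type": "quantitative",
                  "scale": {"type": "log"}},
            "y": {"field": "price", "type": "quantitative"},
            # Switch to another predefined colour scheme.
            "color": {"field": "series", "type": "nominal",
                      "scale": {"scheme": "category20"}},
            # Tooltip content is a list of field definitions.
            "tooltip": [
                {"field": "series", "type": "nominal", "title": "Series"},
                {"field": "price", "type": "quantitative", "format": ".2f"},
            ],
        },
    }


# Hypothetical data, one dict per point.
rows = [
    {"mass": 1.0, "price": 3.5, "series": "a"},
    {"mass": 100.0, "price": 7.25, "series": "b"},
]
spec = scatter_spec(rows)
spec_json = json.dumps(spec, indent=2)  # paste into the online editor
```

A fully custom colour mapping would replace `"scheme"` with explicit `"domain"` and `"range"` lists in the same `scale` object.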

stevesycombacct 1265 days ago [-]
As an Altair fan, I'm interested in your plans for future versions, if you're free to discuss those.

While composing this question I was about to ask if you support VS Code, but I see[0] that you do now.

I was also about to ask if you support disabling the 5,000-row limit, but I see[1] that you do that now, too.

So, clearly, you're doing great things with Altair.

[0]https://altair-viz.github.io/user_guide/display_frontends.ht...

[1]https://altair-viz.github.io/user_guide/faq.html

pletnes 1266 days ago [-]
I’d just like to say thanks for working on Vega-Lite. I haven’t used it much, but it will make the world a slightly better place.

Dear sw devs who are doing visualization: please do have a look at vega-lite, even if you can’t use it for $PROJECT for $REASON, you can learn so much about visualization by reading the docs and playing with the online editor. The separation of concerns was an eye opener for me.

wodenokoto 1265 days ago [-]
> Vega-Lite is also available as the default plotting library in JupyterLab.

What does that mean? I thought plotting was done by whatever program or script was running on the kernel.

domoritz 1265 days ago [-]
JupyterLab outputs have a mime type. Depending on the mimetype, a different renderer is used. For example, text is shown as plain text and an image is binary with the image type (e.g. png). There is also a mimetype for Vega and Vega-Lite JSON.

What this means for you as a user is that you don't need to install the renderer as it comes with the JL frontend.

Vega-Lite is client side so it's not running in the kernel.
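
A rough sketch of what that looks like from the kernel side: the kernel only emits a MIME bundle containing the spec, and the front end picks the renderer. The exact MIME string below (including the `v4` version suffix) is an assumption and depends on the front-end version; `show` is a hypothetical helper that only works inside Jupyter.

```python
# Assumption: the version suffix must match what the front end supports.
VEGALITE_MIME = "application/vnd.vegalite.v4+json"


def vegalite_bundle(spec):
    """Wrap a Vega-Lite spec (a dict) in a Jupyter MIME bundle."""
    return {
        VEGALITE_MIME: spec,
        # Plain-text fallback for front ends without a Vega-Lite renderer.
        "text/plain": "<Vega-Lite chart>",
    }


def show(spec):
    """Send the bundle to the front end; the kernel draws nothing itself."""
    from IPython.display import display  # available inside Jupyter only
    display(vegalite_bundle(spec), raw=True)


bundle = vegalite_bundle({"mark": "bar"})
```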

acomjean 1266 days ago [-]
We use it for our web-based visualizations of genetic data. We love its ability to save these as SVG/PNG out of the box.

Vega-Lite has a lot of sane defaults, which is great, but sometimes it takes a little bit to get what you want. The examples are good but sometimes simple. The docs are comprehensive though.

The examples:

https://vega.github.io/vega-lite/examples/

I’ve used a bunch (D3, C3, Highcharts, amCharts), but this seems to be the best combination of easy and powerful.

The tutorials are a good start: https://vega.github.io/vega-lite/tutorials/getting_started.h...

justtemporary 1266 days ago [-]
I have just completed a number of vegalite graphics for a University assignment focused on creating editorial data visualisations (due in a few days! https://crcorbett.github.io/FIT3179/)

As someone quite fresh to data analysis, I found Vega-Lite fun to work with. I have some experience with ‘Grammar of Graphics’ approaches, having taken a few classes focussed on R and Tidyverse.

The docs are comprehensive, but I found the example charts often too basic for the ideas I was trying to implement. This might be more reflective of my greenness in the area than any deficiency in the library.

I understand it’s relatively new to the scene so it’s not expected to have full functionality just yet. This assignment made me appreciate how simple ggplot() is, in particular when it comes to faceting graphics.

matyunya 1266 days ago [-]
There's also Voyager, which generates suggested Vega-Lite specs based on the provided data:

https://github.com/vega/voyager

I made a small clone just to understand the inner workings of it: https://ellx.io/matyunya/simple-voyager-clone/index.md

Here's my very basic attempt at ggplot-like API on top of Vega-Lite: https://ellx.io/matyunya/plot.

There's also Vega-Lite API for producing spec JSON with more typical chaining calls: https://vega.github.io/vega-lite-api/

domoritz 1266 days ago [-]
Nice. Feel free to send us a pull request to add your Voyager clone to https://vega.github.io/vega-lite/ecosystem.html.
matyunya 1266 days ago [-]
I'd be honored!
mdifrgechd 1266 days ago [-]
I personally don't like packages that combine aggregation or other data processing with display. There is clearly a community that likes this paradigm (plotly is similar) but I favor doing the analysis and getting the view I want as a data structure, and then plotting as is.

I would love to hear the advantages of combining aggregation and plotting the way Vega does.

domoritz 1266 days ago [-]
Vega-Lite author here. The reason why we integrated the two is that we wanted to provide support for interactions with a fully reactive runtime system (https://idl.cs.washington.edu/papers/reactive-vega-architect...). It's difficult to do that while being agnostic to different implementations of these transformations.
breck 1266 days ago [-]
I think the Vega team would agree with you since it seems to me like they are doing exactly that with Arquero (https://github.com/uwdata/arquero).

From my experience it's the right architecture.

You cannot have a great data visualization library without a great data transformation library, but the data transformation functions should be at lower level; not provided by the visualization layer.

kanitw 1266 days ago [-]
Vega author here. We actually disagree with the last part.

For visual analysis tools (not just UI charting libraries), having the ability to quickly summarize data is very useful for analyses (esp. exploratory ones).

We are not alone in this design choice. ggplot2 in R and GUIs like Tableau also include data aggregation as a first-class citizen in their tools.

breck 1266 days ago [-]
I think I misread the OP that they were talking about separating the implementation code at the vis and data layer (rereading I think they were talking about the user's experience). I disagree and think for a user it's great to have your visualizations and transformations work seamlessly.

My memory of Vega was that it mixed data transform code with vis code. So I didn't have like a dplyr + ggplot2 combo. I thought design wise that was a mistake, because nailing the transforms from an edge case and performance perspective is hard and doing it 2x doesn't make sense to me. So decoupling the vis code from the data engine I think is better.

But I took a fresh look at the Vega repo and it is indeed nicely decoupled, and the transforms look usable standalone. So maybe it sort of already has the Dplyr+GGPlot2 style decoupled architecture that I thought Arquero would bring.

I had thought of Vega as "a monolith for datavis", but it now looks like there are lots of smaller usable packages in there.

jules 1266 days ago [-]
I've only used ggplot2 and found it fantastic compared to standard plotting tools. What has been improved about the design of these types of visualization tools since then?
jryb 1266 days ago [-]
I would actually turn this question around: why would you want to implement your own aggregation functionality?

It's super convenient to be able to just make a histogram. And I'm sure my hexbin aggregation function would have errors in it the first time I wrote it.

But these are all opt-in - if I need to make a histogram of 10 trillion datapoints and performance is critical, sure, I'll do the aggregation myself and just call the barplot function instead of the histogram function. What did bundling a histogram function take away?
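
To make the opt-in point concrete, here is a small sketch of both routes as Vega-Lite specs built from Python dicts: one lets the spec bin and count, the other aggregates beforehand and only draws bars. The field names `value`, `bin_start` and `count`, the helper names, and the fixed bin width are illustrative assumptions.

```python
import math
from collections import Counter


def histogram_spec(rows):
    """Let Vega-Lite do the binning and counting."""
    return {
        "data": {"values": rows},
        "mark": "bar",
        "encoding": {
            "x": {"field": "value", "bin": True, "type": "quantitative"},
            "y": {"aggregate": "count", "type": "quantitative"},
        },
    }


def prebinned(values, step):
    """Aggregate outside the chart: count values per fixed-width bin."""
    counts = Counter(math.floor(v / step) * step for v in values)
    return [{"bin_start": start, "count": n}
            for start, n in sorted(counts.items())]


def barplot_spec(binned_rows):
    """Plot already-aggregated rows as is, with no transform in the spec."""
    return {
        "data": {"values": binned_rows},
        "mark": "bar",
        "encoding": {
            "x": {"field": "bin_start", "type": "ordinal"},
            "y": {"field": "count", "type": "quantitative"},
        },
    }


values = [1, 2, 11, 12, 13, 25]
auto = histogram_spec([{"value": v} for v in values])
manual = barplot_spec(prebinned(values, step=10))
```

With the second route the chart receives three rows instead of six; the gap grows with the size of the input.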

tibbetts 1266 days ago [-]
Generally being able to interactively adjust the granularity of the data, drill down, and also to interactively filter the data and see it properly re-aggregated.
klysm 1266 days ago [-]
I think the problem is aggregations frequently become part of the display of the data, especially if they are even a little bit interactive.
viraptor 1266 days ago [-]
That's 99% of my use cases. If I'm plotting data, I'm doing exploration and want to change aggregation / display quickly.
nicolaskruchten 1259 days ago [-]
I agree that some libraries do this but in general Plotly does not: we mostly visualize the data given, and lots of users wish we would onboard more transformation/aggregation/processing :)
TeMPOraL 1266 days ago [-]
I like Vega-Lite and I used it in various small projects over the past few years. It's really easy to work with.

That said, I hit performance issues very quickly, when trying to do interactive data visualization - and by interactive, I mean changing the data, not chart styling. That may be because every time data changes, I have to repackage it into Vega-Lite JSON description and rerender the chart. I wonder if there's a better way of doing it? A partial update? I couldn't find anything in the docs last time I tried.

Related: what would be the best alternative in JS, short of writing my own d3/canvas code, for cases where I have a structurally fixed (but possibly complex) chart, but I need to hit it with 100k or 1M data points and have it redraw under 100ms?

domoritz 1265 days ago [-]
Vega-Lite is built on Vega, which is fully reactive and can do partial updates. In order to use it, you need to update the data via the Vega view api. Check out https://vega.github.io/vega/docs/api/view/#view_data.

One other piece of advice I have is to reduce the number of marks you need to draw with aggregation or sampling. I used this approach in https://github.com/uwdata/falcon to visualize billions of points and interact with them in real time.
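
A minimal sketch of the mark-reduction idea, assuming the points are available as `(x, y)` pairs before the chart is built: count points per grid cell and draw one `rect` per occupied cell. This is a generic illustration, not how Falcon is implemented; `grid_counts`, `heatmap_spec` and the cell size are hypothetical.

```python
import math
from collections import Counter


def grid_counts(points, cell):
    """Collapse raw (x, y) points into one row per occupied grid cell."""
    counts = Counter(
        (math.floor(x / cell), math.floor(y / cell)) for x, y in points
    )
    return [
        {"gx": gx * cell, "gy": gy * cell, "n": n}
        for (gx, gy), n in sorted(counts.items())
    ]


def heatmap_spec(cells):
    """One rect mark per cell; colour encodes the point count."""
    return {
        "data": {"values": cells},
        "mark": "rect",
        "encoding": {
            "x": {"field": "gx", "type": "ordinal"},
            "y": {"field": "gy", "type": "ordinal"},
            "color": {"field": "n", "type": "quantitative"},
        },
    }


points = [(0.1, 0.2), (0.4, 0.9), (1.5, 0.5), (1.2, 1.7)]
cells = grid_counts(points, cell=1.0)
spec = heatmap_spec(cells)
```

The number of marks is then bounded by the number of grid cells rather than by the number of data points.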

TeMPOraL 1265 days ago [-]
Thank you! And thanks for linking to Falcon, I'll check it out. The demos are similar to the thing I was trying to achieve, and show the performance I dreamed of.
bloaf 1266 days ago [-]
See also: Altair (Python library for generating Vega-Lite)

https://github.com/altair-viz/altair

Previous discussion: https://news.ycombinator.com/item?id=23411859

newswasboring 1266 days ago [-]
There is also the Julia implementation [1]

[1] https://github.com/queryverse/VegaLite.jl

YeGoblynQueenne 1266 days ago [-]
I like the idea of the whole plot being described as json. I was looking for a way to automatically generate plots from a project written in Prolog (https://github.com/stassa/louise). Until now I was composing some R plotting scripts with Prolog, which can get a bit clunky. Swi-Prolog has a solid library to convert from Prolog terms to json so it should be straightforward to translate between program output in Prolog and Vega-Lite json. I will definitely give this a try.

Just a question to @domoritz - I noticed that in this example plot, that shows relative numbers of different types of animal produced in the US and UK rendered as emoji:

https://vega.github.io/vega-lite/examples/isotype_bar_chart_...

- the data is encoded as hard-coded values that tell the engine where to place each icon of a sheep, pig or cow. Isn't it possible to derive these positions from numerical data, automatically?

For instance, instead of enumerating each instance of a "pig":

    "values": [
      {"country": "Great Britain", "animal": "pigs"},
      {"country": "Great Britain", "animal": "pigs"},
     ... etc. 
Shouldn't it be possible to input the actual number of pigs and have the engine put it into bins as necessary?
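
One workaround, sketched here as preprocessing outside Vega-Lite (it does not answer whether the spec itself can do the expansion): start from counts and generate the one-row-per-icon values programmatically. The `count` field, the helper name and the numbers are hypothetical.

```python
def expand_counts(summary):
    """Turn {"country", "animal", "count"} rows into one row per icon."""
    rows = []
    for item in summary:
        for _ in range(item["count"]):
            rows.append({"country": item["country"],
                         "animal": item["animal"]})
    return rows


# Hypothetical counts, one unit per icon to draw.
summary = [
    {"country": "Great Britain", "animal": "pigs", "count": 2},
    {"country": "Great Britain", "animal": "sheep", "count": 3},
]
values = expand_counts(summary)  # goes under "data": {"values": ...}
```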
nbevans 1266 days ago [-]
We just finished integrating Vega-Lite into our Xamarin mobile app - with support for UWP, Android and iOS. It's great.

The charts are self-contained in a blob of JSON, so you can ship them around easily.

It supports both Canvas and SVG rendering, which is great if you need to export to a .png since Canvas is better suited to this. SVG is good for crystal clear vector graphics though. So we use SVG mode on the mobile app, and Canvas mode on the back-end.

We tried numerous other JS charting libraries (c3, chartist, chart.js, frappe, and a couple others I forget) but none of these were as professionally and comprehensively put together as Vega-Lite in our experience.

I am starting to view Vega-Lite as "the SQLite" of the charting world - I hope this view holds for the long-term.

kumarvvr 1266 days ago [-]
Looks like this library generates static graphs.

Is there any specific reason to use this, as opposed to say, Apache Echarts (https://echarts.apache.org/en/index.html) ?

jamessb 1266 days ago [-]
> Looks like this library generates static graphs.

It can create interactive charts [0].

> Is there any specific reason to use this, as opposed to say, Apache Echarts

I've never used it, so my initial impressions may be mistaken, but ECharts looks much less declarative.

It's interesting to compare the specifications for a bubble chart in both systems [1, 2].

The ECharts example [1] first specifies a chart-type ("scatter"), which seems to be hard-coded to use the first two elements of each entry in the data array as the x and y positions. This is then customised by writing JavaScript functions to set the symbol size and color.

In contrast, the Vega-Lite example [2] defines the chart fully declaratively - you first set the mark type to circle, then specify the data encoding which defines how each variable maps to each attribute. This mapping is properly declarative - it isn't just a manually-defined function.

If you had a multidimensional dataset and wanted to change which variables you want to plot, it looks like you'd need to reshape the data array if you were using ECharts, whereas you could just change the "field" attributes in the encoding part of a Vega-Lite specification. This makes Vega-Lite more convenient for exploratory data analysis.
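
A small sketch of that point, with the spec built as a Python dict: the data rows stay untouched and only the `field` names in the encoding change. The column names `a`, `b` and `c` and the helper `circle_spec` are hypothetical.

```python
def circle_spec(rows, x, y, size):
    """Vega-Lite circle chart; x, y and size name columns in rows."""
    return {
        "data": {"values": rows},
        "mark": "circle",
        "encoding": {
            "x": {"field": x, "type": "quantitative"},
            "y": {"field": y, "type": "quantitative"},
            "size": {"field": size, "type": "quantitative"},
        },
    }


# Hypothetical multidimensional rows.
rows = [
    {"a": 1, "b": 10, "c": 0.5},
    {"a": 2, "b": 20, "c": 0.1},
]

first = circle_spec(rows, x="a", y="b", size="c")
# Plot different variables: same rows, different field names.
second = circle_spec(rows, x="c", y="a", size="b")
```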

The way Vega-Lite represents these encodings is convenient - a recently created library by Krist Wongsuphasawat tries to expose a similar interface to other visualisation components [3].

[0]: https://vega.github.io/vega-lite/examples/#interactive

[1]: https://echarts.apache.org/examples/en/editor.html?c=bubble-...

[2]: https://vega.github.io/vega-lite/examples/circle_natural_dis...

[3]: https://encodable.vercel.app/

kumarvvr 1265 days ago [-]
Thank you for the detailed answer.
dmichulke 1266 days ago [-]
It's declarative, so you can just send all the details via an API.

And yes, one could work around something non-declarative with enough effort.

kumarvvr 1266 days ago [-]
I didn't get you. What do you mean "all the details" via an API? Do you mean the final output can be sent over an API?
dmichulke 1266 days ago [-]
Yes, have a JSON dictionary interpreted by vega-lite and be done.
kumarvvr 1266 days ago [-]
Even Apache Echarts works in the same way. The whole graph is declared in a JSON object.
IshKebab 1266 days ago [-]
I like Vega, but I hate that it is not very efficient. It uses array-of-structs which is not efficient in JavaScript. For example to create a heat map you have an object for every single point. Also there does not seem to be a way to do proper heat maps at all (i.e. an interpolated image), only a grid of squares.
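
To illustrate the layout being discussed: inline Vega-Lite data is an array of row objects, so columnar data has to be expanded into one object per point. The sketch below only compares serialised size in Python; it says nothing about in-browser performance, and the column names are hypothetical.

```python
import json


def to_records(columns):
    """Convert struct-of-arrays (columnar) data to array-of-structs."""
    names = list(columns)
    return [dict(zip(names, row)) for row in zip(*columns.values())]


# Hypothetical 3-point heat map stored column-wise.
columns = {"x": [0, 1, 2], "y": [0, 0, 1], "v": [0.5, 0.7, 0.1]}
records = to_records(columns)

# Every record repeats the keys, so the serialised form is larger.
columnar_size = len(json.dumps(columns))
records_size = len(json.dumps(records))
```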

Finally I'm not really sure of the point of being declarative if it only supports JavaScript anyway. Maybe they plan a WASM implementation?

simongray 1266 days ago [-]
> Finally I'm not really sure of the point of being declarative if it only supports JavaScript anyway. Maybe they plan a WASM implementation?

Vega schemas are JSON and JSON is quite portable, despite its JS origins.

Vega has actually become fairly popular in the Clojure ecosystem (see e.g. https://github.com/metasoarous/oz) due to how well the data-orientation fits the Clojure philosophy. I also think Altair for Python is quite popular (https://github.com/altair-viz/altair) and that is a Vega-lite library.

geokon 1266 days ago [-]
that still needs to call JS though... It's not like they rewrote the renderer in Clojure

A simpler and easy to grok alternative with minimal dependencies is thing-geom

https://github.com/thi-ng/geom/blob/master/geom-viz/src/core...

You just declare what you want drawn and you get back an svg tree which you can either modify further or transform to xml to get an actual .svg format string

It's very unopinionated and can be easily inserted in both Clojure and Clojurescript applications. You can render with the browser, webview, batik, svgsalamander or I even quickly wrote a converter to Javafx bc I have the svg tree I can directly traverse

simongray 1266 days ago [-]
> that still needs to call JS though... It's not like they rewrote the renderer in Clojure

But why exactly is that an issue...? Some people also think the JVM is icky so they won't touch Clojure. I don't really care much myself what underlying technology is used, I just like to get things done.

> A simpler and easy to grok alternative with minimal dependencies is thing-geom

Thanks for reminding me of thi.ng. I am both in awe at how prolific he is, while at the same time frustrated with the non-standard org-mode-driven development style.

> You just declare what you want drawn and you get back an svg tree which you can either modify further or transform to xml to get an actual .svg format string

Probably worth mentioning that Vega can also give you an SVG.

> I even quickly wrote a converter to Javafx bc I have the svg tree I can directly traverse

Can you link your code? I would like to explore this some more.

IshKebab 1265 days ago [-]
> But why exactly is that an issue...?

It's not an issue as such, it's just that you are doing extra work to avoid Javascript by having a fully declarative system (there's no Javascript in the Vega specs - it even has a custom language for expressions).

That's great if you intend to rewrite the renderer in Clojure. But if you don't do that - if you just call into Javascript anyway then what's the point?

It's very future-looking, but feels quite YAGNI at the moment.

simongray 1265 days ago [-]
What's the point? The point is being able to use it from Clojure or whatever language you fancy. I'm not gonna switch to JS for something like this.
IshKebab 1265 days ago [-]
But the Clojure Vega implementation still uses Javascript!
simongray 1263 days ago [-]
And every language is equivalent to machine code which is represented as electricity in a circuit. Who cares what's further down the layers? It's about what abstractions you work with, not what libraries your code calls into.
harperlee 1266 days ago [-]
Apparently org-mode no longer drives it, from what I read in the link above!

    > Originally developed in a Literate Programming style using Emacs & Org-mode, it has recently (May 1) decided to revert to a traditional Clojure project setup to encourage more contributions from other interested parties. The original ORG source files are kept for reference in the ./org/ directory until further notice.
geokon 1265 days ago [-]
Well, the core "issue" is that Vega, as far as I understand it, isn't a format in the same sense that SVG is.

So you're creating a hard dependency on having to embed a JS run time. There is no other rendering backend. If you have your problem space all speced out then maybe that's alright, but that generally leaves me a bit uncomfortable. The dependency graph is huge vs geom.

With geom I started my project using batik, then moved to svgsalamander when I needed to draw updates a bit faster, and then when I had a bit more time to write my own renderer I changed to cljfx/javafx.

If I'm not happy with the vega renderer then I'm kinda stuck - while geom is all a digestible size

The "renderer" as such is here

https://github.com/geokon-gh/corascope/blob/master/src/coras...

I basically massage geom's svg hiccup into cljfx maps. The formats are almost one to one

Arkdy 1265 days ago [-]
If you think that the JVM is icky, GraalVM may satisfy some of your concerns.

Just getting access to Clojure on Windows through a single graal executable saved me some heartache this week, and I can't wait to start packaging apps with it.

data_ders 1266 days ago [-]
Huge shout out to the team on this. Altair is far and away the best viz tool in Python, and all my nitpicks have been addressed in the last 6 months.
billfruit 1265 days ago [-]
Can it do 3d plots and views?
domoritz 1265 days ago [-]
Not yet, although SandDance is built on top of Vega and implements a custom renderer. https://sanddance.js.org
julius_set 1266 days ago [-]
Why is it so hard to create a simple data table with 2 columns in Vega? Literally 0 examples in your documentation for this.
simongray 1266 days ago [-]
Why would you use something like Vega for that?
sterlinm 1262 days ago [-]
The main use case where I've wanted something like that is that you want to pair an interactive chart with a data table. For example you create a crossfilter and want the table to list the observations that pass the filter.

I actually contributed an example to the Altair documentation that links a table to a scatter plot.

I agree though that the tables are hard to make and not very nice looking. I think it's just not really an intended use of Vega.

https://altair-viz.github.io/gallery/scatter_linked_table.ht...
