HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
A Ray of Hope: Array Programming for the 21st Century

ACM SIGPLAN · Youtube · 130 HN points · 1 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention ACM SIGPLAN's video "A Ray of Hope: Array Programming for the 21st Century".
Youtube Summary
The ideas of APL and its successors, the array programming languages, were two generations ahead of their time. These languages are based on the notion that everything is a tensor, and all operations are rank-polymorphic: they extend automatically to tensors of any rank. These ideas are perfectly suited to an era of machine learning, large scale data, GPUs and other accelerators. Building on recent academic research, we are building ShapeRank, a new statically typed, purely functional language for industrial use, that extends rank-polymorphism to streams. We’ll introduce the key ideas and show how they are realized in ShapeRank.

Presented by Gilad Bracha

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Feb 05, 2021 · rscho on The APL Orchard
Yes, that's the one: https://youtu.be/x1FoTYnJxeY
Nov 19, 2020 · 130 points, 40 comments · submitted by akkartik
anonu
As a long time user of q/kdb, I can confirm that working with array-based languages definitely changes how you view data and code. It makes you a better coder in other languages as well.

Random thought: cool triptych behind the main speaker. Anyone know what it is?

EDIT: A search for "famous triptychs" revealed: https://en.wikipedia.org/wiki/The_Garden_of_Earthly_Delights

dri_ft
Can I ask how you got into using q/kdb? I've been using it in my current work, partly because it's well-suited to the table-oriented work I'm doing, and partly just because I wanted to learn it. But I have no idea where to find opportunities using it.
anonu
I was working at a large bank back in 2010 - as a quant trader. The firm acquired an enterprise license so that came with training and general openness to using this.

More often than not, kdb is chosen for basic time-series storage and retrieval. Which is a bit sad because that's really a small part of what it does well. Other firms use KDB in a more holistic manner, for building a distributed services system for example. (look up Torq from AquaQ)

Where to find opportunities? Probably going to a job board and searching for KDB is a good place to start - most opportunities will be in trading or trading-tech related roles.

dri_ft
Thanks!
woutgaze
Quick summary: Gilad Bracha introduces a new programming language called ShapeRank. It seems to be based on APL but introduces the concept of streams which are [vectors | tensors | arrays] of unbounded length.
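
To make "arrays of unbounded length" concrete, here is a minimal Python sketch that models a stream as a lazy generator and applies an operation to it elementwise. This is not ShapeRank code, and the talk's actual semantics may differ; `naturals` and `lift2` are hypothetical names made up for this illustration.

```python
from itertools import count, islice
import operator

def naturals():
    """An unbounded stream 0, 1, 2, ... (a 'vector' with no fixed length)."""
    return count(0)

def lift2(op, xs, ys):
    """Apply a two-argument scalar function elementwise to two streams, lazily."""
    return map(op, xs, ys)

# Elementwise sum of two unbounded streams; nothing is computed until consumed.
sums = lift2(operator.add, naturals(), naturals())

# Only a finite prefix can ever be materialized.
first_five = list(islice(sums, 5))
print(first_five)
```

The point of the sketch is only that the same elementwise operation applies whether or not the operand has a known length.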
teleforce
Personally I think array programming languages are the future. One of the most popular programming languages in science and engineering is Matlab, which is to some extent an array-based programming language [1].

I am surprised that the author (and the reviewers of the paper) did not perform a proper literature review; for example, it misses other recent and promising work on functional array programming languages, namely Single Assignment C (SAC) and Futhark [2],[3].

ShapeRank also seems to take the vector-algebra "tensor" concept to the extreme, and to be honest it would be better to base it on the "versor", since geometric algebra is probably the future of computer algebra [4].

Last but not least, and probably most controversial: why create another standalone array language from scratch? It would be better to make a seamless DSL based on a general-purpose language like D, so you do not have to reinvent most of the libraries (and C library support in D is second to none). Arguably the most successful recent effort on array based scientific programming language is Julia and it is still very much dependent on some Fortran based libraries for speed. With D, by contrast, you can go "turtles all the way down" and still meet the speed requirements of scientific computing [5].

[1]https://en.m.wikipedia.org/wiki/Array_programming

[2]http://www.sac-home.org/doku.php

[3]https://futhark-lang.org/

[4]https://en.m.wikipedia.org/wiki/Comparison_of_vector_algebra...

[5]http://blog.mir.dlang.io/glas/benchmark/openblas/2016/09/23/...

snicker7
> Arguably the most successful recent effort on array based scientific programming language is Julia and it is still very much dependent on some Fortran based libraries for speed.

That is because some libraries (e.g. BLAS implementations) have many developer-years of careful optimizations. These are not trivial algorithms to implement. Implementing them in Julia, nonetheless, has been a long-standing goal of the Julia community.

beagle3
Array programming is the future and always will be (since 1958). Same as LISP.

APL and LISP both had the problem that they went straight to the (extreme) logical conclusions of a paradigm. Which means you have to put in effort to actually derive the benefits; you can't just "wing it till you understand it" like you can in most other languages. Notably, C++ has left that club with C++14 (maybe even 11), but popular languages at the time of their popularity have always been there, and likely always will be.

unishark
> ShapeRank also seems to take the vector-algebra "tensor" concept to the extreme, and to be honest it would be better to base it on the "versor", since geometric algebra is probably the future of computer algebra [4].

I'm curious why you say that? Quantum computing or something? I was very interested in geometric algebra several years ago, but have never found need for it. Seems like it helps simplify problems that are already simple.

teleforce
There is a recent HN entry on GA; the posted article and some of the comments show the usefulness and huge potential of GA [1].

To be specific, if you have noticed, there's a recent popular trend in "network observability". Don't assume that the "network" is only a computer network; it can be any network (social, pandemic, system, etc.), and the original term actually refers to the power system. For a start, this network observability requires fusion of data from a multitude of sensors, parameters, entities, components, etc. to provide an accurate model of the physical and/or virtual world. For example, if you are trying to develop a level 5 autonomous driving system, comprehensive data fusion, integration and analytics is the very first step towards successful automation.

If the term network observability sounds familiar, it is because the main reason eBPF was created was to perform comprehensive Linux OS observability. There is a wonderful website where you can see and observe (pun intended) how to perform network observability, and why it is really useful [2].

By supporting and representing data as "versor" natively in programming languages you can easily model and perform insightful animation similar to here [3]. There is also a recent post on HN about ObservableHQ website itself [4].

[1]https://news.ycombinator.com/item?id=25142528

[2]https://observablehq.com/

[3]https://observablehq.com/@enkimute/animated-orbits

[4]https://news.ycombinator.com/item?id=25161409

7thaccount
I'd love to drop Python and go all in with a true array language (i.e. not Matlab), but you only really have 4 options:

Dyalog APL, J, Kdb+, Shakti.

All of those are closed source and expensive (Dyalog is fairly affordable, but still a paid product) with the exception of J. J is a cool language, but isn't quite my cup of tea.

So if one of the new projects ever picked up steam and got a decent sized community with hooks into all the same numeric libraries as Numpy, and some decent charting libraries...then we would have something nice.

pauldirac137
J is not closed source and hasn't been for a while. https://github.com/jsoftware/jsource
7thaccount
I said "with the exception of J", which was supposed to mean that J is not closed source and does not require a commercial license (well... outside of Jd).
beagle3
You have Kona, ngn/k and others are shaping up. Not yet production quality, but Shakti isn’t there yet either.
7thaccount
I thought Kona was intentionally not going to ever get close to the commercial K languages due to fear of lawsuit. It is supposed to stay a toy.
jb_s
In my undergrad at UNSW I did a "baby's first interpreter" project on a language with a similar concept - function evaluation is defined in terms of unbounded-length arrays, being able to cache function evaluation for performance etc.

Looking back it was pretty cool and helped form a more abstracted view of programming beyond the low-level

edit: found some papers https://cartesianprogramming.files.wordpress.com/2020/07/sem... http://www.cse.unsw.edu.au/~plaice/archive/JAP/U-CSE-201306....

Snoddas
Little longer summary:

The ideas of APL and its successors, the array programming languages, were two generations ahead of their time. These languages are based on the notion that everything is a tensor, and all operations are rank-polymorphic: they extend automatically to tensors of any rank. These ideas are perfectly suited to an era of machine learning, large scale data, GPUs and other accelerators. Building on recent academic research, we are building ShapeRank, a new statically typed, purely functional language for industrial use, that extends rank-polymorphism to streams. We’ll introduce the key ideas and show how they are realized in ShapeRank.

https://2020.splashcon.org/details/splash-2020-rebase/26/A-R...

justincormack
Have you got links to the research papers?
Gravityloss
Can it be done in some easier syntax than APL?

Easier means - less effort in learning coming from someone who knows mainstream languages like Java.

The idea is not limited to APL. I don't like crafting for loops or maintaining indexes. Fortran has something similar. With Matlab many operators operate in an intuitive way on vectors and matrices. It breaks down quite quickly if you try to do something more complex though. This somewhat extends to Julia. In Ruby you can also have .map or .each.

Julia:

    x=10
    v=[1 2 3 4]
    x.*v
    #1×4 Array{Int64,2}:
    # 10  20  30  40
PeCaN
If you watch the video it looks like their proposed syntax is not APL-like but closer to mainstream languages.

I'm honestly not sure if this is a good thing or not. You said "easier" syntax than APL but APL is honestly a very easy syntax for working with arrays. That's a significant part of the advantage of APL, it makes it very easy to come up with, talk about, and maintain array algorithms.

Matlab and Julia and other languages aimed at scientific computing have some array language-like traits but lack a lot of the functions that make APL more generally applicable. And .map is all wrong; it's extra noise and it doesn't generalize down to scalars or up to matrices—the defining feature of array languages is that operations are implicitly polymorphic over the rank of the input.
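
As a rough illustration of what "implicitly polymorphic over the rank of the input" means, here is a minimal Python sketch that extends a scalar function to scalars, vectors, and matrices alike. It models arrays as nested lists purely for illustration; `lift` is a hypothetical helper, not something from APL or from the talk.

```python
def lift(f):
    """Extend a scalar function to nested lists of any depth (any rank)."""
    def lifted(x):
        if isinstance(x, list):
            # Recurse one axis down; the scalar case is reached at the leaves.
            return [lifted(item) for item in x]
        return f(x)
    return lifted

double = lift(lambda v: v * 2)

scalar_result = double(3)                  # rank 0
vector_result = double([1, 2, 3])          # rank 1
matrix_result = double([[1, 2], [3, 4]])   # rank 2
```

The caller writes `double(x)` in all three cases; in an array language this lifting is built into every primitive rather than requested explicitly.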

Gravityloss
I understand that. I still don't want to spend the effort to learn APL.

It's like when digital cameras came around. Many users knew how to use film cameras, so you made the digital cameras mostly like film cameras, even if the digital medium would have enabled a very different, much better camera straight out of the box. But the market had invested so much time in learning how to work with film that you had to do it like that. Path dependency is not just about rigid thinking, it's about using what you have because that saves a lot of resources.

Regarding, .map being all wrong, in Ruby it's not a property of an array, it's a method for enumerables. Array is one type of enumerable, but it works with hashmaps etc. https://ruby-doc.org/core-2.6.5/Enumerable.html So it's not that non-general. It is noisy (and weird with the pipes) because it's general.

PeCaN
To be honest I don't really see people who don't want to learn APL being that interested in putting in the effort to completely upend how they think about programming and algorithms in order to use other array languages, regardless of syntax. (After all this is by far the hardest part of learning APL, the symbols are easy enough and easy to look up anyway.)

map is general in kind of the wrong way. You could after all add a #map method to Object for scalars and make a Matrix class that also implements it and then just call map everywhere. However you still run into the problem, mentioned in the video, that it doesn't easily generalize to x + y where both x and y are arrays; you have to use zip or map2 or something (and now you still have to figure out how to do vector + matrix). And yes, you can kind of do explicit "array programming" in Ruby if for some reason you're really compelled to do that, but it will look awful. And that's just what array languages do for you implicitly. As a paradigm there's a bit more to it than "just call map everywhere": there's still all the functions for expressing algorithms as computations on arrays.
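
The x + y case can be sketched in Python as follows, again modelling arrays as nested lists. This is only an illustration under assumptions: `lift2` is a hypothetical helper, and pairing along the leading axis is just one possible convention (NumPy, for instance, aligns trailing axes instead).

```python
import operator

def lift2(f):
    """Extend a two-argument scalar function to nested lists,
    pairing elements where both sides are lists and broadcasting otherwise."""
    def lifted(x, y):
        x_is_list, y_is_list = isinstance(x, list), isinstance(y, list)
        if x_is_list and y_is_list:
            if len(x) != len(y):
                raise ValueError("length mismatch along an axis")
            return [lifted(a, b) for a, b in zip(x, y)]
        if x_is_list:
            return [lifted(a, y) for a in x]   # broadcast y over x
        if y_is_list:
            return [lifted(x, b) for b in y]   # broadcast x over y
        return f(x, y)
    return lifted

add = lift2(operator.add)

vec_plus_vec = add([1, 2, 3], [10, 20, 30])
scalar_plus_vec = add(1, [1, 2])
vec_plus_matrix = add([10, 20], [[1, 2], [3, 4]])  # each vector item meets one row
```

None of the call sites mention zip or map; the pairing rule lives in one place, which is roughly what an array language provides for every operator.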

chillee
https://github.com/google-research/dex-lang

Another interesting language in this vein is Dex. The authors are creators of Jax and Pytorch, and they have a lot of interesting ideas.

Bostonian
Fortran has had operations on arrays and array sections since the 1990 standard. I wonder why it is hardly mentioned in this discussion of array programming languages.
beagle3
APL which is discussed had them since 1958 or so.
blisse
The video spent a bunch of time basically describing ReactiveX / Reactive-Streams, specifically around the [zip](http://reactivex.io/documentation/operators/zip.html) and [combineLatest](http://reactivex.io/documentation/operators/combinelatest.ht...) operators.

I don't know if we've solved parallelizable computation in Rx, but it doesn't seem like it should be too much of an abstraction on top. I didn't get the sense that the speaker was aware of Reactive-Streams, but hopefully they're aware of the existing effort!
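
For readers who have not used those two operators, here is a simplified Python sketch of how they differ. It models each stream as a finite list of (timestamp, value) events, which is an assumption made for illustration; this is not the ReactiveX API, and `zip_streams` and `combine_latest` are hypothetical names.

```python
def zip_streams(a, b):
    """Pair the i-th value of a with the i-th value of b, ignoring timing."""
    return [(x, y) for (_, x), (_, y) in zip(a, b)]

def combine_latest(a, b):
    """Emit a pair whenever either source emits, once both have emitted at least once."""
    events = [(t, 0, v) for t, v in a] + [(t, 1, v) for t, v in b]
    events.sort(key=lambda e: (e[0], e[1]))  # order by time, then by source
    latest = [None, None]
    seen = [False, False]
    out = []
    for _, source, value in events:
        latest[source] = value
        seen[source] = True
        if all(seen):
            out.append((latest[0], latest[1]))
    return out

# Each stream is a list of (timestamp, value) events.
letters = [(1, "a"), (4, "b")]
numbers = [(2, 1), (3, 2), (5, 3)]

zipped = zip_streams(letters, numbers)
combined = combine_latest(letters, numbers)
```

In this model `zip_streams` waits for matching positions, while `combine_latest` reacts to every new event using the most recent value from the other side.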

RyanHamilton
I think array languages could be a great alternative to java/c++ for a lot of operations. For anyone wanting to try another array based language, I've been working on jq an open source q language implementation: https://github.com/timestored/jq

It's implemented in java, which will allow loading in a wide range of libraries. An older version can be tried online here (1 min load time): http://www.timestored.com/jq/online

kitd
I don't know if you're aware, but `jq` is also a popular JSON processing tool. Possibility of confusion?

https://stedolan.github.io/jq/

eps
jq is also a jQuery abbreviation that is recognized as such by Google and other search engines, e.g. "jq on" will have jQuery.on() spec as the first hit.
RyanHamilton
I was aware of the json parser and it did put me off the naming. Unfortunately jq is the most obvious name. For example there is jruby/jython. Longer term when a stable version is reached I would like to change to its own name as eventually the goal is to extend the current q language. jq++, jq# :)

Any suggestions?

pavlov
Another potential source of confusion is that J is also a well-known APL-derived array language. So it's not obvious that J here refers to Java.

I don't have any good suggestions. "Jaq"?

BMSmnqXAE4yfe1
What is a tensor? I remember from a physics course that a tensor is some multidimensional matrix that maps vectors/matrices to other vectors/matrices, and varies from point to point in space. Looks like all arrays are called tensors now? Is it for coolness reasons?
xboxnolifes
It is my (probably incorrect) understanding that tensors are the generalized form of scalars, vectors, matrices, etc. A scalar being a 0 dimensional tensor, a vector being a 1 dimensional tensor, a matrix being a 2 dimensional tensor, and so on.
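
In the array-programming sense used in this thread, that notion of "dimension" is usually called rank. A minimal Python sketch, assuming arrays are modelled as rectangular nested lists (`shape` and `rank` here are hypothetical helpers written for this illustration):

```python
def shape(x):
    """Return the shape of a rectangular nested list; a bare number has shape ()."""
    if not isinstance(x, list):
        return ()
    if not x:
        return (0,)
    # Assumes every item along an axis has the same shape as the first one.
    return (len(x),) + shape(x[0])

def rank(x):
    """Rank is the number of axes: 0 for a scalar, 1 for a vector, 2 for a matrix."""
    return len(shape(x))

scalar = 5
vector = [1, 2, 3]
matrix = [[1, 2, 3], [4, 5, 6]]
```

This says nothing about the physicist's definition raised in the parent comment; it only shows the scalar/vector/matrix progression described here.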
BMSmnqXAE4yfe1
Yeah, I guess "multidimensional array" is longer and not as "deep"
jayjader
Tensors have meaningful operations already rigorously defined in mathematical texts; consider multiplication between a vector and a scalar, dot product of two vectors, matrix multiplication, etc.

A multidimensional array is just data ordered over several dimensions, there are no intrinsic operations. So if you're talking about multidimensional arrays that have such operations defined, it's useful to communicate that distinction by using the name "tensor". In the same way that it's useful to talk about coordinates rather than "1-dimensional array of length <base size> that respects certain invariants".

brundolf
Needs a [video] tag
PeCaN
I've been working on something like this on and off for the past 4 years or so, although with something more like generators than streams.

I think it's a very, very promising idea (I admit to being heavily biased towards anything APL-influenced) although surprisingly difficult to get right. Gilad Bracha is obviously way smarter than me so I'm definitely curious where he goes with this.

One additional idea that I keep trying to make work is integrating variations of (constraint) logic programming and treating solutions to a predicate as a generator or stream that operations can be lifted to rank-polymorphically. As a simple example a range function could be defined and used like (imaginary illustrative syntax)

    range(n,m,x) :- n <= x, x <= m
    
    primesUpto(n) = range(2,n,r),               # create a generator containing all solutions wrt r
      mask(not(contains(outerProduct(*, r, r), r)), r)  # as in the video
    
I've never really gotten this to work nicely and it always feels like there's a sort of separation between the logic world and the array world. However this feels incredibly powerful, especially as part of some sort of database, so I keep returning to it even though I'm not really sure it goes anywhere super useful in the end.
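
For comparison, the array half of the imaginary example above (outer product, membership test, mask) can be written in plain Python. This is only a sketch of that one technique, not of the logic-programming integration, and `primes_upto` is a name chosen for this illustration.

```python
def primes_upto(n):
    """Primes in 2..n: the numbers that never appear in the multiplication table of 2..n."""
    r = list(range(2, n + 1))
    products = {a * b for a in r for b in r}   # outer product of r with itself, as a set
    keep = [x not in products for x in r]      # boolean mask over r
    return [x for x, k in zip(r, keep) if k]   # apply the mask

small_primes = primes_upto(20)
print(small_primes)
```

It does quadratic work in n, so it is a statement of the idea rather than a practical sieve.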
hobby-coder-guy
What does “:-“ mean?
asimpletune
Probably “such that”
Groxx
Have you seen Halide? https://halide-lang.org/

It's a bit special-cased (afaict), but it sounds like it makes at least a solid step in this sort of direction.

HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.