
Hacker News Comments on
2017 ACM PPoPP Keynote: It's Time for a New Old Language

Guy Steele · Youtube · 3 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention Guy Steele's video "2017 ACM PPoPP Keynote: It's Time for a New Old Language".
Youtube Summary
Keynote address (sound plus slides) of a keynote talk given by Guy Steele on Monday, February 6, 2017, at the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. It surveys the history and current status of computer science metanotation.

A shorter version of this talk was given at Harvard University on April 19, 2017, as part of the Celebration of Computer Science at Harvard in honor of Harry Lewis. https://youtu.be/8fCfkGFF7X8?t=37m46s

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Or ideas in general: should we ever gatekeep the dissemination of ideas and knowledge? Innovation is more likely to occur when people cross disciplines, and they can only do that if the material is approachable, gradual and discoverable. Simple is hard, simplistic is easy. Someone is going to come back with essential vs accidental complexity [1], but we can always strive for simpler, more cohesive ways to exchange knowledge. How much of our current notation is essential? And when do notations diverge for non-essential reasons? Computer systems form primarily around notation (language and syntax), so notation is definitely important. [z]

In the post the author mentions an excellent talk by Guy Steele that goes over the issue of notation and balkanization, titled "It's Time for a New Old Language", where he outlines "the history and current status of computer science metanotation". [2]

Notation, like syntax, is important, but as with the plethora of ill-defined and poorly specified Domain Specific Languages (DSLs), notation also needs to be defined, regular and learnable. Notation takes words and turns them into pictures that compose into thought structures that cannot be expressed in written human language with the same specificity. Think Feynman [3] and railroad [4] diagrams, or musical notation [5].

The abstract explains it beautifully,

> The most popular programming language in computer science has no compiler or interpreter. Its definition is not written down in any one place. It has changed a lot over the decades, and those changes have introduced ambiguities and inconsistencies. Today, dozens of variations are in use, and its complexity has reached the point where it needs to be re-explained, at least in part, every time it is used. Much effort has been spent in hand-translating between this language and other languages that do have compilers. The language is quite amenable to parallel computation, but this fact has gone unexploited.

[1] https://simplicable.com/new/accidental-complexity-vs-essenti...

[2] https://www.youtube.com/watch?v=7HKbjYqqPPQ

slides, https://groups.csail.mit.edu/mac/users/gjs/6.945/readings/St...

paper, https://sci-hub.st/10.1145/3155284.3018773

[3] 6 minute video from Fermilab on Feynman Diagrams https://www.youtube.com/watch?v=hk1cOffTgdk

[4] https://sqlite.org/syntaxdiagrams.html

[5] https://www.mfiles.co.uk/music-notation-history.htm

[z] Wadler's Law

(Nice to see your work shared here :) )

I went through the corresponding videos last year:

https://www.youtube.com/watch?v=vU3caZPtT2I

https://www.youtube.com/watch?v=MhuK_aepu1Y

It was a great refresher as someone who once liked math but hasn't done much of it in ~20 years :) I had seen the blog posts, but there was some "color" in the videos that helped. For example I didn't realize that the fonts sometimes matter! Honestly, I still don't really read the notation, as I haven't had a strong reason to, but I feel it would be useful at some point.

----

For others, I also recommend this 2017 talk by Guy Steele, "It's Time for a New Old Language":

https://www.youtube.com/watch?v=7HKbjYqqPPQ

Because even people in the field seem to have problems with the notation. He also was asked about this work a few days ago here and said he was still working on it in the background (being a "completionist"):

https://www.youtube.com/watch?v=c_ZJECVlpog

-----

FWIW as you know, Oil is more static than shell, and that was largely motivated by tools and static analysis (and negatively motivated by false positives in ShellCheck https://news.ycombinator.com/item?id=22213155)

I would like to go further in that direction, but getting the basic functionality and performance up to par has taken up essentially 100% of the time so far :-(

My use of Zephyr ASDL was also partly motivated by some vague desire to get the AST into OCaml. However I haven't used OCaml in quite a while and I get hung up on small things like writing a serializer and deserializer. I don't want to do it for every type/schema, so it requires some kind of reflection. And my understanding is that there are a bunch of packages that do this, like sexplib, but I never got further than that.

Formulog sounds very nice, so I wonder if there is some recommended way of bridging the gap? For example, imagine you want to load enormous Clang or TypeScript ASTs into Formulog. The parsers alone are 10K-30K lines of code, i.e. it's essentially infeasible to reproduce those parsers in another language in a reasonable time. And even just duplicating the schema is a pretty big engineering issue, since there are so many node types! I could generate them from Zephyr ASDL, but other projects can't. I wonder if you have any thoughts on that? i.e. to make the work more accessible on codebases "in the wild"

-----

Also FWIW I mentioned this "microgrammars" work a few days ago because I'm always looking for ways to make things less work in practice :)

https://news.ycombinator.com/item?id=23978432

Doing anything with languages seems to be very "long winded" so I'm glad to see work in that direction!

mgreenbe
Thanks! :) We should be very clear that the bulk of the work is Aaron Bembenek's.

I think Formulog would work great for analyzing the shell---as would any other Datalog, though SMT-based string reasoning will certainly come in handy. I don't think it will help you with parsing issues, though. The general approach to static analysis with Datalog avoids parsing in Datalog itself, relying on an EDB ("extensional database"---think of it as 'ground facts' about the world, which your program generalizes) to tell you things about the program. See, e.g., https://github.com/plast-lab/cclyzer/tree/master/tools/fact-... for an example of a program for generating EDB facts from LLVM. Just like real-world parsers, these are complicated artifacts.
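To make the "reuse an existing parser to generate facts" idea concrete, here is a minimal sketch in Python. It is not how cclyzer or Formulog work; it only assumes that a parser already exists (here the standard library `ast` module) and walks its output to emit ground facts. The relation names `function` and `calls`, and the choice to quote string arguments, are made up for illustration.

```python
import ast


def extract_facts(source: str) -> list[tuple]:
    """Walk a Python AST and emit ground facts as (relation, *args) tuples.

    The relations are hypothetical:
      function(name, line)  -- a function is defined at this line
      calls(line, callee)   -- a direct call to a named function at this line
    """
    tree = ast.parse(source)  # reuse the existing parser instead of writing one
    facts = []
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            facts.append(("function", node.name, node.lineno))
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            facts.append(("calls", node.lineno, node.func.id))
    return sorted(facts)


def to_datalog(facts: list[tuple]) -> str:
    """Render the facts in a Datalog-like surface syntax (strings are quoted)."""
    lines = []
    for rel, *args in facts:
        rendered = ", ".join(str(a) if isinstance(a, int) else f'"{a}"' for a in args)
        lines.append(f"{rel}({rendered}).")
    return "\n".join(lines)


SAMPLE = "def f(x):\n    return g(x)\n\ndef g(y):\n    return y\n"

if __name__ == "__main__":
    print(to_datalog(extract_facts(SAMPLE)))
```

The point of the sketch is the division of labour: the parser stays in its host language, and only flat facts cross the boundary into the Datalog engine.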

chubot
Ah OK thanks for the link. Since it depends on commercial software, I don't see a path to trying it (which is fine, because I probably don't have time anyway :-/ )

So are you saying that it's more conventional to serialize relations from C++ or Python, rather than serialize an AST as I was suggesting?

Your blog post mentions ASTs too, so I'm not quite clear on that point. I don't have much experience writing such analyzers, and I'd be interested if there is any wisdom / examples on serializing ASTs vs. relations, and if the relations are at the "same level" as the AST, or a higher level of abstraction, etc.

-----

FWIW I read a bunch of the papers by Yannis because I'm interested in experiences of using high level languages in production:

https://news.ycombinator.com/item?id=21666658

After all the reason I like shell and R is that they're both higher level than Python or JS :)

I like short code, and Datalog seems interesting in that regard. I also hacked on this Python type inferencer in Prolog a while ago:

https://github.com/andychu/hatlog

I did get hung up on writing simple pure functions in Prolog. There seems to be a debate over whether unification "deserves" its own first-class language, or whether it should be a library in a bigger language, and after that experience, I would lean toward the latter. I didn't really see the light in Prolog. Error messages were a problem -- for the user of the program, and for the developer of the program (me).

So while I haven't looked at Formulog yet, it definitely seems like a good idea to marry some "normal" programming conveniences with Datalog!

mgreenbe
I'd say it's conventional to reuse an existing parser to generate facts.

The AST point is a subtle one. Classic Datalog (the thing that characterizes PTIME computation) doesn't have "constructors" like the ADTs (algebraic data types) we use in Formulog to define ASTs. Classic Datalog doesn't even have records, as Soufflé does. So instead you'll get facts like:

```
next_insn(i0, i1).
insn_type(i0, alloc).
alloc_size(i0, 8).
insn_type(i1, move).
insn_dest(i1, i0).
insn_value(i1, 10).
```

to characterize instructions like:

```
%0 = alloc(8)
%0 = 10
```

I'm not sure if that's what you mean by serializing relations. But having ASTs in your language is a boon: rather than having dozens of EDB relations to store information about your program, you can just say what it is:

```
next_insn(i0, i1).
insn_is(i0, alloc(8)).
insn_is(i1, move(i0, 10)).
```
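The two encodings above can be compared side by side in a small Python sketch. This is only an illustration under assumptions: the `Alloc` and `Move` classes are hypothetical stand-ins for ADT constructors, and `flat_facts` shows how one structured value per instruction has to be spread over several relations when the language has no constructors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alloc:
    """Hypothetical constructor: allocate `size` bytes."""
    size: int


@dataclass(frozen=True)
class Move:
    """Hypothetical constructor: store `value` into the result of `dest`."""
    dest: str
    value: int


# Structured encoding: one entry per instruction, like insn_is(i0, alloc(8)).
PROGRAM = {"i0": Alloc(8), "i1": Move("i0", 10)}


def flatten(insn_id: str, insn) -> list[tuple]:
    """Spread one structured instruction over several flat relations."""
    if isinstance(insn, Alloc):
        return [("insn_type", insn_id, "alloc"), ("alloc_size", insn_id, insn.size)]
    if isinstance(insn, Move):
        return [
            ("insn_type", insn_id, "move"),
            ("insn_dest", insn_id, insn.dest),
            ("insn_value", insn_id, insn.value),
        ]
    raise TypeError(f"unknown instruction: {insn!r}")


def flat_facts(program: dict) -> list[tuple]:
    """Produce the constructor-free encoding, including next_insn edges."""
    ids = list(program)  # dicts preserve insertion order
    facts = [("next_insn", a, b) for a, b in zip(ids, ids[1:])]
    for insn_id, insn in program.items():
        facts.extend(flatten(insn_id, insn))
    return facts


if __name__ == "__main__":
    for fact in flat_facts(PROGRAM):
        print(fact)
```

Every new instruction kind adds another branch to `flatten` and more relations to keep in sync, which is the bookkeeping that constructors in the fact language avoid.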

As for your point about Prolog, it's a tricky thing: the interface between tools like compilers and the analyses they run is interesting, but not necessarily interesting enough to publish about. So folks just... don't work on that part, as far as I can tell. But I'm very curious about how to have an efficient EDB, what it looks like to send queries to an engine, and other modes of computation that might relax monotonicity (e.g., making multiple queries to a Datalog solver, where facts might start out true in one "round" of computation and then become false in a later "round"). Query-based compilers (e.g., https://ollef.github.io/blog/posts/query-based-compilers.htm...) could be a good place to connect the dots here, as could language servers.

This is great! I wish I had seen something like this years ago.

Related: "It's Time for a New Old Language" by Guy Steele.

https://www.youtube.com/watch?v=7HKbjYqqPPQ

He gives a history of metalanguage notations (for both language semantics and syntax, including regexes and CFGs). And then he criticizes the inconsistency that has cropped up in these notations over the years.

The irony is that people who are formalizing precise semantics of programming languages do not agree on the syntax/semantics of the metalanguage.

I guess this is what you notice when you read hundreds of programming language papers, as I'm sure Guy Steele has.

-----

Also, does anyone know if what he refers to as "progress and preservation" are the same / related to "liveness and safety" in distributed algorithms? It sounds like the same kind of idea.

mannykannot
The inconsistency may be more of a problem for someone new to the field, as it can give such persons the impression that they are not following what the author is saying. With more experience, it is easier to skip over the inconsistencies without really noticing.
neel_k
> Also, does anyone know if what he refers to as "progress and preservation" are the same / related to "liveness and safety" in distributed algorithms? It sounds like the same kind of idea.

Progress and preservation are used to prove type safety, which (like the name says) is a safety property. Informally, safety means "a bad thing never happens", and liveness means "a good thing will eventually happen".

Informally, progress means "a typed program will never immediately exhibit undefined behaviour", and preservation means "a typed program always stays well-typed". Together, these two properties mean "a typed program never exhibits undefined behaviour".

This has the shape of a safety property. All this can be made much more formal, but that's the intuition.
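
For readers who want the shape of the formal statements, here is one common way to write them. This is a sketch under assumptions not stated in the comment: a small-step reduction relation $e \to e'$, closed terms typed by $\vdash e : \tau$, and "undefined behaviour" modelled as a stuck term (not a value, yet unable to step).

```latex
% Progress: a well-typed term is never stuck right now.
\[
\text{If } \vdash e : \tau \text{, then either } e \text{ is a value or } \exists e'.\; e \to e'.
\]

% Preservation: taking a step keeps the term well-typed.
\[
\text{If } \vdash e : \tau \text{ and } e \to e' \text{, then } \vdash e' : \tau.
\]

% Type safety: by induction on the number of steps, using both lemmas.
\[
\text{If } \vdash e : \tau \text{ and } e \to^{*} e' \text{, then either } e' \text{ is a value or } \exists e''.\; e' \to e''.
\]
```

The last statement quantifies over every reachable state and says none of them is bad, which is why it reads as a safety property rather than a liveness one.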
