HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Anders Hejlsberg on Modern Compiler Construction

channel9.msdn.com · 378 HN points · 2 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention channel9.msdn.com's video "Anders Hejlsberg on Modern Compiler Construction".

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Basically the IDE one needs to take into account that your program is broken all the time, yet you want code completion for everything else that is actually correct.

Also, it needs to respond immediately when asked for completion, as anything beyond 2s is a frustrating development experience.

You also want to get real-time errors and warnings, just for the parts that are actually broken, not a wall of text like many batch compilers that fail to understand the remainder of the file.

Also you want to be able to do code refactorings, regardless of the compilation state.

So basically you want a Smalltalk/Lisp Machines-like experience.

Anders has a nice interview about this:

https://channel9.msdn.com/Blogs/Seth-Juarez/Anders-Hejlsberg...

_bxg1
So would you typically maintain a separate compiler for each use-case, or try and make one that serves both?
steveklabnik
IMHO, one that serves both. The IDE case is a superset of the batch case.
Feb 02, 2020 · 187 points, 40 comments · submitted by ehaskins
W-Stool
Turbo Pascal (which he authored) was quite the accomplishment back in the day - editor and compiler in 64K. Think about that for a minute.
yesenadam
That was my first language after BASIC - I loved it!

I may be remembering this wrong, but it seemed virtually free - it was about $100 when its competitors and most software was $500+. That was in about 1986. The only way to get a manual was to buy the software. Another world..

tomcam
You do remember wrong. It was $49.95. It changed my life.
yesenadam
Ah yes, I think it was $70 here in Australia.
noir_lord
Mine too.

Started with ZX BASIC, then when we got a second-hand PC I switched to TP - Pascal was and is an excellent first language, in that sense it met its design goals well.

dave84
I got a copy of Delphi 3 on the cover CD of PC Plus magazine in ‘99. Changed my 15 year old life.
barrkel
Huw Collingbourne shifted the needle on a lot of young UK & Ireland programmers, IMO.
dave84
Definitely. I owe him a lot. The late Wilf Hey also for his language columns, I remember following along with his Eiffel, Prolog and Smalltalk series as a youngster.
pjmlp
I paid 100 euros in today's money for Turbo Pascal for Windows 1.5, the last TP version before Delphi got released.

Likewise I paid 150 euros in today's money for Turbo C++ for Windows a couple of years later.

Borland products were king for small business, but then they decided to switch focus to corporate users, and the rest is history.

As a side note, Delphi and C++ Builder are still quite loved in Germany, several companies keep using them, and there is at least a yearly conference.

badsectoracula
$99 was for Delphi up until version 5 for the cheapest version, which most likely explains why it was very popular among solo shareware developers.

Sadly that market was abandoned by Borland when they decided to go after enterprises and renamed themselves to Inprise and their prices skyrocketed after that.

gameswithgo
turbo pascal went straight from source to machine code. no ast!
mixmastamyk
^and it was awesome.
barrkel
The DOS TP compiler was still at the heart of Delphi 1; it supported an inline assembler, a reference-based class system, a code generator, etc. in less than 43k lines of assembler (it was entirely written in assembler).

(32-bit Delphi 2 switched to a C-based compiler originally written by Peter Sollich. I maintained the front end on that compiler for 6 years; PS followed AH to MS.)

slowmovintarget
I got a lot of use out of that in my early career. Thank you.
barrkel
To be clear, I was only involved in the 2006-12 time period. I got a lot of use out of TP 6 and Delphi 2 when I was learning, which is a significant part of the reason I ended up at Borland.
mika9090
Cool. I love Delphi to this day. Nothing beats that for Windows native apps IMHO.

Any interesting stories to share? I read somewhere that now the compiler is a million lines of undocumented code.

tybit
This is a great talk, I was struck by how similar some of the high level design goals are to LLVM https://www.aosabook.org/en/llvm.html

I kind of hope Anders moves on from TypeScript soon, he’s done a fantastic job there considering the constraints of JavaScript, but I’d love to see him tackle something new.

jakear
He says he doesn’t think he’ll make another language. Which is a shame... I’d particularly like to see an actually good language for Azure(/cloud in general) resource management. There have been a bunch of attempts, but typically they’re just config files at their core.

Anyone know a declarative interface for describing services and their interactions that “compiles” to a complete cloud resource manager?

Rochus
Very basic stuff. Actually only the fact that for a syntax aware editor you need other data structures than for a compiler and that edits only affect parts of that structure.
zackbrown
Between Delphi, C#, and TypeScript the man has already had a storied and impactful career.
noir_lord
My three favourite languages ever (with Turbo Pascal as my first proper programming language) - The man has quite simply defined my career.
tejinderss
How does C# compare to TypeScript? Which one do you prefer between the two?
pjmlp
You forgot about the previous 10 years of the man's career with Turbo Pascal.
acqq
Also his first work in Microsoft: Windows Foundation Classes (WFC) and Visual J++ development system.
tybit
Yes, he’s fantastically accomplished and capable, and I think he could do amazing things in a new area.

I’d love to see him take lessons learned from C# for a language designed for WASM from the get go.

pjmlp
WASM is just yet another bytecode format.

AssemblyScript is already quite ahead regarding WASM support.

AshleyGrant
Unfortunately, he is quite adamantly not a fan of WASM.
cm2187
Why is he not?
symlinkk
Only thing I could find of him mentioning WebAssembly was here at 19:45: https://youtu.be/MxB0ldQfvT4?t=1185

In that video he says that it’s not a suitable target for TypeScript to compile to because WebAssembly doesn’t have a garbage collector. They’d have to basically implement an entire JavaScript engine in WebAssembly which would be pointless.

He gives some examples of things it could be good for, which are all CPU intensive things like image processing and video games, but he doesn’t seem to think it’s really suitable for making normal web apps.

oaiey
But wasm is an architecture. Like x86. The more interesting question is: what language is optimal for the use cases wasm serves? Games come to mind.
dang
A thread from 2018: https://news.ycombinator.com/item?id=17280589

Discussed at the time: https://news.ycombinator.com/item?id=11685317

bordercases
It sounds like content-addressed code, as in the Unison language https://www.unisonweb.org/, could end up facilitating the design of a very effective IDE, since the language can naturally generate an immutable structure by default.

I don't know enough to know if this hunch is correct, I'm wondering if anyone better informed can chime in.

zubairq
Yes this is correct. We use content addressable code, inspired by Unison for yazz Pilot, and it makes so many things in the ide really simple, including dependency management and stuff like that
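A rough illustration of the content-addressing idea discussed above: each definition is stored under a hash of its content, so identical code always gets the same identifier and a stored definition never changes under its key. Everything here is an assumption made for illustration; `CodeStore`, `put` and `get` are hypothetical names, and hashing whitespace-normalized source text is a stand-in (a real system would more likely hash a normalized syntax tree).

```python
import hashlib


class CodeStore:
    """Hypothetical content-addressed store: definitions are keyed by their hash."""

    def __init__(self):
        self._defs = {}

    def put(self, source: str) -> str:
        # Normalize trailing whitespace so trivial edits do not change the address.
        normalized = "\n".join(line.rstrip() for line in source.strip().splitlines())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        # Storing is idempotent: the same content always lands on the same key.
        self._defs[digest] = normalized
        return digest

    def get(self, digest: str) -> str:
        return self._defs[digest]


store = CodeStore()
h1 = store.put("def inc(x):\n    return x + 1\n")
h2 = store.put("def inc(x):   \n    return x + 1")   # same code, stray whitespace
h3 = store.put("def inc(x):\n    return x + 2\n")   # different code
```

Because a changed definition gets a new address instead of overwriting the old one, anything that refers to `h1` keeps seeing exactly the code it was written against.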
zubairq
Very interesting, even though we don’t have a compiler in the traditional sense for yazz Pilot as a container image is our output format, many of the points are still valid.

Also very interesting how Roslyn had the idea of the compiler as an API; we went in the opposite direction and did not have any extensibility, instead we made the code editor an API.

davidjnelson
Something I’ve observed with languages he designs is that they end up with a UI form builder. I wonder if that will happen with TypeScript. It’s harder, because they’d either need to pick a framework, or design it for at least React, Vue, Angular.
oaiey
At that time I wanted to read about compiler building. Missed that in University. That video made me skip that. The standard books do not exist for this level of evolvement.
pbiggar
This is kinda how the compiler-stuff in Dark is written.

Everything - the editor, semantic analysis, version control, execution engine, everything - all use the same data structures (the same abstract syntax tree).

We use functional data structures everywhere and we do functional updates within the AST all the time; that's even how text entry in the editor updates the program.
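A minimal sketch of what a functional update on an immutable AST can look like. This is not Dark's actual code; `Node` and `update_at` are hypothetical names. The point is that an edit rebuilds only the nodes on the path from the root to the change, and every untouched subtree is shared between the old and new tree.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical immutable syntax node."""
    kind: str
    text: str = ""
    children: Tuple["Node", ...] = ()


def update_at(node: Node, path: Tuple[int, ...], new_child: Node) -> Node:
    """Return a new tree with the node at `path` replaced.

    Only the nodes along `path` are copied; sibling subtrees are reused as-is.
    """
    if not path:
        return new_child
    i = path[0]
    kids = node.children
    updated = update_at(kids[i], path[1:], new_child)
    return replace(node, children=kids[:i] + (updated,) + kids[i + 1:])


# add(1, 2) edited to add(1, 3)
old = Node("call", children=(Node("name", "add"), Node("num", "1"), Node("num", "2")))
new = update_at(old, (2,), Node("num", "3"))
```

The old tree is still intact after the edit, which is what makes things like undo or comparing two versions cheap.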

chrisseaton
> Everything - the editor, semantic analysis, version control, execution engine, everything - all use the same data structures (the same abstract syntax tree).

Is that data structure suitable for all those purposes though?

How do you do optimisations like GVN on an AST?

pbiggar
It's not suitable for all purposes, but it's suitable for all editor purposes (including an in-editor execution engine). We don't do GVN, or any optimizations really, right now - I'm sure when we have a compiler of sorts we'll have other structures, SSA, etc.
octref
One takeaway for me was: Language designers & compiler writers today need to consider editor support as one of their goals.

On the top page there's another thread about rust-analyzer vs RLS. What Aleksey said[0] about RLS that "[RLS's] current architecture is not a good foundation for a perfect IDE long-term," feels similar to my coworker's conclusion in her effort to provide better editor support for PHP[1].

Parsers for compiling code into machine executables and parsers that serve LSP responses have different tradeoffs. For example, Anders mentioned that the TS parser needs good error recovery and must respond with completions/errors when one file changes inside a thousand-file project. I vaguely remember TS had a goal to provide completions in < 50ms and errors in < 150ms. Such goals are hard to achieve as an afterthought. If your core compiler doesn't do error recovery, as with PHP, you need someone to write a new compiler from scratch for a language server implementation. If tools such as RLS rely on the compiler to dump all JSON metadata and figure out LSP responses from that[0], it's too slow for editors.

TS's good editor support doesn't come free. I think one of the most under-appreciated achievements of TS is how it took editor support seriously and designed its compiler infrastructure to do it well. That's why I don't believe in those hot-new-web-languages that try to take over TS by designing a better type system. TS brought the average developer's expectation of a language to having fast completion, fast error reporting, editor autofix, F2 to rename and renaming-a-file-in-editor-to-update-all-references. It's 2020 and people aren't going back to write code like in Notepad.

[0]: https://ferrous-systems.com/blog/rust-analyzer-2019/

[1]: https://github.com/microsoft/tolerant-php-parser

---

EDIT: grammar.

Kinrany
> One take away for me was: Language designers & compiler writers today need to consider editor support as one of their goals.

In a way this is very intuitive: a programming language is a kind of a UI for the language's runtime. The IDE is just another UI layer on top of that.

Thank you, this kind of design doc is exactly what I was looking for. I'm not really familiar with Microsoft's ecosystem, but I mentioned it in the blog post because I suspected that they had the most advanced technology in this domain.

From that doc, which I plan to read thoroughly:

> This enables the second attribute of syntax trees. A syntax tree obtained from the parser is completely round-trippable back to the text it was parsed from. From any syntax node, it is possible to get the text representation of the sub-tree rooted at that node.

This is true with my representation too, but I don't actually attach "trivia" to trees. Instead I just have every node store a bunch of span IDs. And then if you want to reconstruct the text, then you just take min(span_ids of node) and max(span_ids of node) and then concatenate those spans.
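A small sketch of the span-ID scheme described in the paragraph above. The `Node` class and `text_of` helper are hypothetical, and one detail is assumed: min and max are taken over the span IDs of the whole subtree, not only the node's own IDs. Whitespace and comments are ordinary entries in the span table that no node points at, yet they still come back because the slice is contiguous.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical syntax node that stores indices into a table of source spans."""
    kind: str
    span_ids: List[int] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def all_span_ids(node: Node) -> List[int]:
    # Collect the span IDs of the node and of everything below it.
    ids = list(node.span_ids)
    for child in node.children:
        ids.extend(all_span_ids(child))
    return ids


def text_of(node: Node, spans: List[str]) -> str:
    # Reconstruct the text by concatenating every span from min ID to max ID.
    ids = all_span_ids(node)
    return "".join(spans[min(ids): max(ids) + 1])


# Span table for the source line "x = 1 + 2  # sum\n"
spans = ["x", " ", "=", " ", "1", " ", "+", " ", "2", "  # sum\n"]
add = Node("add", [6], [Node("num", [4]), Node("num", [8])])
assign = Node("assign", [2], [Node("name", [0]), add])
```

Note that the trailing comment (span 9) belongs to no node here, so it would only be recovered by joining the full span table; how to attribute such spans is exactly the design choice that "trivia" handles in the quoted doc.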

I also think the name "lossless syntax tree" makes sense, because they are describing very specific properties that ASTs and CSTs / parse trees don't have.

They also have an immutable property which is cool. I recall that Hejlsberg had a video on this:

https://news.ycombinator.com/item?id=11685317

https://channel9.msdn.com/Blogs/Seth-Juarez/Anders-Hejlsberg...

May 12, 2016 · 191 points, 40 comments · submitted by ingve
wwwigham
As Anders said near the end of the video, if you want to know more, look at the source code[1]. Speaking as someone who's worked on it (so I'm biased), I feel it's pretty easy to jump in and edit (though I'd advise new people to avoid the typechecker unless you feel particularly adventurous, it's a multi-thousand line long monster file). There are piles of easy issues[2] that are looking for community members to work on them.

By the way - one of the most meaningful comments he made in this talk (to me) was when he said that your parser had to "parse more than the allowed grammar" so you can provide better error messages. Compilers are, ultimately, tools for developers - so developer experience is paramount. This, I've found, is so very very true in any smaller projects I've worked on, and is easily one of the first things neglected by some of my more algebraically inclined peers (who are very satisfied with a perfect ABNF and a parser which strictly adheres to it).

[1] https://github.com/Microsoft/TypeScript [2] https://github.com/Microsoft/TypeScript/labels/Effort%3A%20E...
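To make "parse more than the allowed grammar" concrete, here is a toy sketch, unrelated to the real TypeScript parser. The strict grammar is assumed to be `let NAME = NUMBER ;`, but the hypothetical `parse` function also accepts declarations with a missing initializer or a missing semicolon, records a specific diagnostic for each, and keeps going so later declarations are still understood.

```python
import re


def tokenize(src):
    # Toy tokenizer: identifiers, numbers, '=' and ';'. Other characters are dropped.
    return re.findall(r"[A-Za-z_]\w*|\d+|[=;]", src)


def parse(src):
    """Parse `let NAME = NUMBER ;` declarations, tolerating incomplete ones."""
    toks = tokenize(src)
    pos = 0
    decls, errors = [], []

    def peek():
        return toks[pos] if pos < len(toks) else None

    while pos < len(toks):
        if peek() != "let":
            errors.append(f"unexpected token {peek()!r}")
            pos += 1
            continue
        pos += 1  # consume 'let'
        name = None
        tok = peek()
        if tok is not None and tok != "let" and re.fullmatch(r"[A-Za-z_]\w*", tok):
            name = tok
            pos += 1
        else:
            errors.append("expected a name after 'let'")
        value = None
        if peek() == "=":
            pos += 1
            if peek() is not None and peek().isdigit():
                value = int(peek())
                pos += 1
            else:
                errors.append(f"expected a number after '=' in declaration of {name!r}")
        if peek() == ";":
            pos += 1
        else:
            # Not in the grammar, but accepted so parsing can continue.
            errors.append(f"missing ';' after declaration of {name!r}")
        decls.append((name, value))
    return decls, errors


# The declaration of y is unfinished, as it would be while someone is typing.
decls, errors = parse("let x = 1; let y = let z = 3;")
```

A parser that followed the grammar strictly would stop at the second `let`; this one reports two targeted errors about `y` and still produces `x` and `z`.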

seanmcdirmid
I've developed a couple of cool tricks to get very error tolerant parsers (I design/build live programming languages). We can go pure shunting yard (precedence parser), which will basically parse anything since it doesn't rely on grammar rules. Even if going with grammar based parsing (they are convenient), braces can be matched on tokens in a pre-pass before parsing occurs, eliminating an entire class of difficult to deal with errors, and allowing for brace stickiness in an incremental edit context; no need to rebalance because someone typed an opening paren without brace completion!

Though I can't help but think that someone will eventually develop an NN-based PL parser that will be much more error tolerant than straight grammar-based implementations could ever be.
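The brace pre-pass mentioned above might look something like the following. This is a guess at the general idea, not the commenter's implementation; `match_braces` is a hypothetical helper. It pairs openers and closers on the raw token list before any grammar-based parsing happens and reports the ones it cannot pair, so the parser proper never has to cope with unbalanced input.

```python
def match_braces(tokens):
    """Pair up braces on a token list; return (matches, unmatched indices)."""
    pairs = {")": "(", "]": "[", "}": "{"}
    openers = set(pairs.values())
    stack = []       # indices of openers that are still waiting for a closer
    matches = {}     # opener index -> closer index, and closer index -> opener index
    unmatched = []
    for i, tok in enumerate(tokens):
        if tok in openers:
            stack.append(i)
        elif tok in pairs:
            # Look down the stack for the nearest opener of the matching kind.
            j = len(stack) - 1
            while j >= 0 and tokens[stack[j]] != pairs[tok]:
                j -= 1
            if j < 0:
                unmatched.append(i)            # stray closer
            else:
                unmatched.extend(stack[j + 1:])  # openers skipped over stay unmatched
                opener = stack[j]
                del stack[j:]
                matches[opener] = i
                matches[i] = opener
    unmatched.extend(stack)                    # openers never closed
    return matches, sorted(unmatched)


# "f(a[b){}" : the '[' at index 3 is never closed
tokens = ["f", "(", "a", "[", "b", ")", "{", "}"]
matches, unmatched = match_braces(tokens)
```

With the pairs known up front, a later parsing stage can treat each matched pair as a unit and flag the unmatched `[` on its own, instead of letting one typo cascade through the rest of the file.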

ejanus
I would like to see a codebase where this idea is used. I am playing around shunting yard ...
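For anyone else playing around with it, here is a minimal shunting-yard sketch that converts infix tokens to postfix order. It is not taken from any of the languages mentioned above; the `PRECEDENCE` table and `to_rpn` are illustrative assumptions, and only left-associative binary operators are handled. It never raises on unbalanced parentheses, which hints at why a precedence parser is forgiving.

```python
# Assumed operator table: higher number binds tighter, all left-associative.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def to_rpn(tokens):
    """Convert infix tokens to reverse Polish notation with shunting yard."""
    output, ops = [], []
    for tok in tokens:
        if tok in PRECEDENCE:
            # Pop operators that bind at least as tightly as the incoming one.
            while ops and ops[-1] in PRECEDENCE and PRECEDENCE[ops[-1]] >= PRECEDENCE[tok]:
                output.append(ops.pop())
            ops.append(tok)
        elif tok == "(":
            ops.append(tok)
        elif tok == ")":
            while ops and ops[-1] != "(":
                output.append(ops.pop())
            if ops:
                ops.pop()  # discard the "("; a stray ")" is simply ignored
        else:
            output.append(tok)  # operand
    while ops:
        op = ops.pop()
        if op != "(":           # an unclosed "(" is dropped instead of failing
            output.append(op)
    return output


simple = to_rpn("1 + 2 * 3".split())
grouped = to_rpn("( 1 + 2 ) * 3".split())
unclosed = to_rpn("( 1 + 2".split())
```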
infinite8s
Do you have those tricks written up anywhere?
seanmcdirmid
Not really. I haven't taken the time to do any analysis given that they are always details in the languages I'm building (which are written up). I gave a talk about this at Parsing SLE (2013?) but I guess it didn't need a PDF.

They are really simple ideas (really, pre match your braces, I'm sure I'm not the first one to do that!), I'm not sure they are publishable.

lemming
This is a fantastic overview. If you've ever wondered why JetBrains build specific editor support for each language they support (including, effectively, a compiler), this is why.

As the developer of an IDE for Clojure, I'm also very happy that one of the secret sauces is persistent data structures.

sethjuarez
As a quick aside, Anders had committed code the same day to the TypeScript compiler. He also is in a team room with like 20 devs (not in his own plush window office). He told me he loves that kind of environment. The dude is a really cool dev.
constexpr
This is a very good talk but I wonder if this alternate compiler design has actually made the TypeScript compiler slower in normal compilation mode. If you really do "helicopter in" to every point in the syntax tree and run IDE queries to implement type checking then that could potentially be much slower if there's any overhead at all to doing that.

I've been experimenting with programming languages and compilers myself (https://github.com/evanw/skew) and my compiler appears to be ~2x faster than tsc when run with node even though it's also doing constant folding, tree shaking, inlining, and minification while tsc is just removing type annotations (my compiler appears to be ~15x faster when run natively). The slow compilation speed of tsc is my main complaint about the TypeScript ecosystem.

nv-vn
The compiler doesn't just remove the type annotation, it has to go through and check that the annotations are valid and that the type safety is kept throughout the program. Type checking is often not a quick process, since it requires every single value in the program to have its type verified. If I declare that a variable is an integer and set it to the result of the function, the compiler has to make sure that that function returns an integer and not some other type, or that the type that function returns can be implicitly converted to an integer. And for that to be safe, it has to first prove that the function being called can be given that type, etc.
constexpr
I know what type checking is. :) Both Skew and TypeScript are type-checked languages. Just because a compiler does type checking doesn't mean it has to be a lot slower. I was just pointing out that it would be interesting if this alternate compiler design was the reason why the TypeScript compiler is so slow relative to another compiler for a similar language (object-oriented, typed, garbage collected, etc.) that also uses the node runtime, especially since that other compiler is doing even more work than tsc.
wwwigham
If you run tsc with benchmarking on, you'll find that almost all the time is taken up by the typechecker. Increasingly, TypeScript is adding control-flow-analysis-based[1] and contextual type system features... Saying that it just 'removes' type annotations is a bit disingenuous, as processing the type constraints those annotations apply is one of the most time-consuming tasks TS can perform! TypeScript is structural, too, so for every comparison it may need to get down to the nitty-gritty of comparing individual object members... Recursively. TS does heavy caching of not just types, but the relationships between types - just to avoid redoing any work, where possible.

FYI, the 'helicoptering in' doesn't affect full-program compilation time because of the mentioned caching. Since the state of the world is unchanged during a build, the parse tree and type hierarchy are only built once. Additionally, you'd find that incremental builds (tsc --watch) are relatively snappy because of the heavy caching and the tree reuse, so they have benefitted from the architecture improvements made for service work.

[1]https://github.com/Microsoft/TypeScript/pull/8010
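A toy sketch of why caching relationships between types matters for a structural type system. This is not how tsc is implemented; `ObjType` and `Checker` are hypothetical, primitives are plain strings, and recursive types are deliberately not handled. A source type is assignable to a target if it has every member the target requires, checked recursively, and each (source, target) answer is remembered.

```python
class ObjType:
    """Hypothetical object type: member name -> member type (str or ObjType)."""

    def __init__(self, name, members):
        self.name = name
        self.members = members


class Checker:
    def __init__(self):
        self.cache = {}       # (id(source), id(target)) -> bool
        self.comparisons = 0  # number of member-by-member comparisons performed

    def assignable(self, source, target):
        # Primitives (strings in this sketch) are compared by name.
        if isinstance(source, str) or isinstance(target, str):
            return source == target
        key = (id(source), id(target))
        if key in self.cache:
            return self.cache[key]
        self.comparisons += 1
        # Structural rule: source must provide every member that target requires.
        result = all(
            name in source.members and self.assignable(source.members[name], t)
            for name, t in target.members.items()
        )
        self.cache[key] = result
        return result


point2d = ObjType("Point2D", {"x": "number", "y": "number"})
point3d = ObjType("Point3D", {"x": "number", "y": "number", "z": "number"})
line2d = ObjType("Line2D", {"start": point2d, "end": point2d})
line3d = ObjType("Line3D", {"start": point3d, "end": point3d})

checker = Checker()
first = checker.assignable(line3d, line2d)    # compares Line3D/Line2D and Point3D/Point2D
after_first = checker.comparisons
second = checker.assignable(line3d, line2d)   # answered from the cache
after_second = checker.comparisons
reverse = checker.assignable(line2d, line3d)  # Point2D lacks 'z'
```

Even in this tiny case the `end` member reuses the answer already computed for `start`, and repeating the whole question costs nothing.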

rtpg
Are there books or more developed material on these strategies? I understand the concepts mentioned here but reading some implementation strategies would be helpful.
mattchamb
The C# compiler he is talking about is open source, so you could have a play around with it to see how they implemented it. https://github.com/dotnet/roslyn
Scarbutt
Is it common to see people at Anders's age so enthusiastic about CS as he is?
stevetrewick
He's 56. What's your point? Do you imagine that people who devote themselves to a field have some kind of expiry date built in?
jonsen
Our expiry date is not hard coded. It is a nondeterministic finite state machine.
fsloth
CS is like mathematics or chess - if you find it interesting or beautiful there should be no expiry date as such.

What burns people out, IMO, is lack of autonomy or broken work places, not the subject matter.

marssaxman
Certainly common in my experience. A coworker of mine just celebrated his 57th birthday. He's definitely better at this stuff than I am; experience counts. The youth bias in this industry has always been largely based on illusion.
oblio
I think this applies to many people in our domain, provided they're careful not to burn out: https://popperfont.files.wordpress.com/2014/04/lifesatisfact...
jules
It's very common for professors and researchers, which I guess he kind of is.
ctvo
He clearly is not JavaScript fatigued.
chubot
Well he clearly has a curious mind... I think you only get burnt out once you can't keep up, or once nobody is interested in the things you spent your career on.

But he happened to spend his career on something that is still profoundly interesting to people. The open source world is very much in the old architecture he was talking about, so it appears that Microsoft compilers are simply more advanced. (And Borland and Jetbrains did this too, just not open source compilers)

There are several programmers at my company who are in their 60's, and I think even 70's, past retirement age. All of them are super sharp; they are not "code monkeys". I think those types tend to leave the industry sooner.

visarga
It looks like they are building syntax trees in a similar way to how React is building the DOM tree - using functional programming and caching/diffing.
amasad
So it's not the classic compiler architecture that had to change -- it's the addition of language services that broke the existing model.
chubot
The classic compiler architecture did change, unless you want to maintain two compilers. That's the main point of the talk.
amasad
Yes, I'm just saying that the requirement was external to the original function of compilers.
chubot
Sure... they mentioned the fact that the TypeScript compiler barely generates code, if that's what you mean by the original function of compilers.

The architecture is completely dominated by the front end for usability and incrementality now. Generating JavaScript after all that is trivial.

You can imagine that during a typical program's development lifetime, 99% of the time is spent with the program in a non-working state, and 1% is when it's working. The compiler has to help with the 99% case now, not just the 1% case.

amasad
Typically language services are decoupled from compilers. The 1% case still dominates how compilers function. Furthermore, compilers are typically not daemons, and build systems take care of caching etc. Do you have other examples of consolidated tooling like that? It seems like a good idea but I don't think it's the standard thing yet or that everyone thinks it's a good idea.
pjmlp
Yes, the model of programmer workstation as envisioned by Xerox PARC and followed up by ETHZ.

Smalltalk, Interlisp-D, Mesa/Cedar at Xerox PARC, followed by Niklaus Wirth's work at ETHZ with Oberon, Oberon-2, Active Oberon and Component Pascal environments.

Also the initial Ada Workstation developed as the first product from Rational Software and Eiffel development environments.

The Energize C++ environment created by Lucid after pivoting from Lisp Machines.

atombender
If you look at how compilers are written these days, one other thing has changed. Almost every book on compiler design, including the classics, will direct you towards separating the lexer, scanner and compiler stages using parser generators like Lex/Yacc or Flex/Bison. In practice, this isn't done anymore. If you look at modern compilers like Clang, Go or Rust, they all have hand-coded parsers that build syntax trees right out of tokens. It turns out that parser generators don't really give you that much, and only get in the way once you need to go beyond the basics.
musesum
Anders Hejlsberg wrote the first mass-market Compiler+IDE with Turbo Pascal. What was unique is that TP would automatically bring up the editor and position the cursor on the offending error. Seems trivial now, but it was a game changer for writing code on a PC. TP sold for around $49. The competitor, Pascal MT+, sold for $400.

I doubt if Anders would remember me, but I was lucky enough to be contracted by Borland to write their first test suite for TP 4.5. It was their first object oriented language. The spec was one of the most beautiful and concise pieces of technical documentation I've ever read.

FullyFunctional
There were many wonderful things about TP, but it started much earlier than that. BLS Pascal on Nascom 2, COMPAS Pascal on CP/M, Turbo Pascal on CP/M, and finally TP on DOS. I met Anders at Herning Messen where he demoed COMPAS Pascal (running a Maze generator). From my recollection he was a few years older than me which suggests he wrote BLS Pascal as a young teenager o_O!

This thread has a bit of this history: https://news.ycombinator.com/item?id=10202299

agumonkey
Last month I recovered an old backup tape with TP.EXE. Couldn't resist but to play with it in dosbox.

http://imgur.com/oT3u4fR

It really was a brilliant piece of software. 600KB.

ps: the text GUI shadows.

musesum
Ah yes, text mode graphics! I spent a few thousand hours writing a text mode version of Augment, with TP. Amazing what you could do with 80x25x16 colors.
Flow
I always fancied the EGA text mode of 80x43x16 colors. Much prettier than 80x50.
agumonkey
In terms of ergonomics, that amount of text was very nice. As a longtime Emacs user, I felt straitjacketed by their keyboard shortcuts.

ps: really, these pseudo alpha transparent text shadows ...

atombender
Turbo Pascal 5.x and onwards were implemented using Turbo Vision, a really good, object-oriented UI framework [1]. Every view could render a sub-rectangle of itself on command into a buffer, which is how the UI was able to render drop shadows. The C++ version was open-sourced in 1997 and ported to a bunch of platforms including Linux [2].

[1] http://www.sigala.it/sergio/tvision/html/index.html

[2] http://tvision.sourceforge.net/

fernly
To assist you all in reminiscing? A lot of TP manuals are at bitsavers:

http://www.mirrorservice.org/sites/www.bitsavers.org/pdf/bor...

lobster_johnson
TP was indeed amazing, though I think you're misremembering — TP 5.5 was the first version to add OO extensions.

It came with a separate little manual just for the object-oriented stuff (beautifully typeset with the same type of syntax flow diagrams that Apple did [1]; Borland's OO extensions came from Apple Pascal, after all), and I remember reading it from cover to cover while on vacation when I was in my mid-teens, and having my mind blown by the ideas of object-orientation, which seemed like complete science fiction at the time.

Anything from your recollections you can add to this? Was there any collaboration with Apple at all?

[1] http://pascal-central.com/images/pascalposter.jpg

musesum
Yep, my bad regarding version; was relying on faulty wetware instead of a web search.

Not that I was privy to inside information; but it was a strange coincidence that Borland dropped its Basic compiler at the same time that Microsoft dropped its Pascal compiler.

Apple had a pretty cool OO Pascal that somewhat resembled Xerox PARC's Mesa.

HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.