HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Introduction to LLVM · 271 HN points · 0 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention the video "Introduction to LLVM".

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Feb 06, 2018 · 271 points, 23 comments · submitted by matt_d
Slides and code examples:
Mike says in his talk that LLVM started out with the goal of having an intermediate representation to run optimizations on. Hasn't this always been the case with compilers in recent times? I seem to recall similar things being said about the GNU Compiler Collection at a presentation many years ago.

If this is true, what is (was) the appeal of the LLVM project at the time of the project inception?

As I understand it, the difference is more political than technical. GCC is designed to prevent people from plugging nonfree parts into the compiler. So you cannot plug in your own dynamic library or use part of GCC as a library, which is an artificial restriction. See this,
Not just political: gcc's IR was designed in the early 1980s and "just growed." It's not easy to understand.
If you really wanted to, I imagine you could write a proprietary compiler that generated some serialisation of one of GCC's IRs (GENERIC or GIMPLE), and you could write a deserialiser for GCC that reads in that IR. Release the deserialiser under GPL, then you can legally use GCC as the backend for your proprietary compiler.

This sort of IPC-like hackery often seems to happen when someone is looking to work around copyleft - they can then break the spirit of the GPL without breaking the legal obligations.

You would still need to use gcc code to operate on the deserialized IR, which is GPL. Unless you then transform the IR into a proprietary one, in which case that's not breaking the spirit of the GPL.
By your logic, every single bit of code that has passed through gcc is now GPL licensed.
> You would still need to use gcc code to operate on the deserialized IR, which is GPL

The GPL's terms would not extend to the program that generated the IR.

A GPL web server can't require a web browser to relicense under GPL, either. Same idea.

That's not too far from how the first LLVM frontend worked, IIUC. llvm-gcc was a GCC fork (pre-Clang) that produced LLVM bitcode or IR. That IR would be fed into an LLVM backend for code generation.
At its inception the appeal was the "multi-stage optimisation system" and the virtual instruction set design (from
At the time, I had a DSL I had written in C++, and I wanted to produce proper compiled binaries. I had started looking at the now defunct TenDRA - I was thinking it would be nice to compile to a "machine-independent binary" which could be converted to machine code on the actual runtime system. But soon after I found LLVM, and it seemed like a no-brainer comparatively.

I seem to remember that "machine-independent binaries" was Apple's first use of LLVM: distributing LLVM IR and having it be converted to machine code on the user's computer, back when they were supporting Motorola and Intel chips. And I think consequently that's how LLVM got a lot of its momentum.

> Mike says in his talk that LLVM started out with the goal of having an intermediate representation to run optimizations on. Hasn't this always been the case with compilers in recent times?

Not just recent times. The dragon book from 1986 covers IRs.

> Hasn't this always been the case with compilers in recent times?

Yes. LLVM used SSA form earlier than GCC did, and it makes its intermediate representation more accessible than GCC or other compilers. But the general idea of having a midend level IR for optimizations was not at all novel at the time that LLVM appeared. I think the point was mainly to emphasize the fact that LLVM IR allows more optimizations than the somewhat comparable IR of JVM bytecode.
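For readers unfamiliar with SSA (static single assignment) form, here is a minimal Python sketch of the core idea for straight-line code only: every assignment gets a fresh versioned name, so each name is defined exactly once. The tuple-based statement format and the `to_ssa` helper are hypothetical, made up for illustration; this is not how LLVM or GCC implement it, and real SSA construction also needs phi nodes at control-flow joins, which this sketch omits.

```python
from typing import Dict, List, Tuple

# A statement is (target, op, operands), e.g. ("x", "add", ["a", "b"]).
# This format is an assumption made for the sketch, not a real compiler IR.
Stmt = Tuple[str, str, List[str]]


def to_ssa(stmts: List[Stmt]) -> List[Stmt]:
    """Rename variables in straight-line code so each name is assigned once."""
    version: Dict[str, int] = {}  # latest version number per assigned variable
    out: List[Stmt] = []

    def use(name: str) -> str:
        # Literals and never-assigned inputs are left unchanged.
        if name in version:
            return f"{name}.{version[name]}"
        return name

    for target, op, operands in stmts:
        # Operands must be resolved before the target gets its new version.
        new_operands = [use(o) for o in operands]
        version[target] = version.get(target, -1) + 1
        out.append((f"{target}.{version[target]}", op, new_operands))
    return out


program = [
    ("x", "add", ["a", "b"]),
    ("x", "mul", ["x", "2"]),
    ("y", "sub", ["x", "a"]),
]
ssa = to_ssa(program)
for target, op, operands in ssa:
    print(f"{target} = {op} {', '.join(operands)}")
```

The second assignment to `x` becomes a new name that reads the first one, which is what lets an optimizer reason about each value without tracking reassignment.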

GCC’s passes, at least back then (don’t know about today), used numerous intermediate representations, many of which were simply dumps of their internal data structs that the following pass happened to be able to parse (usually by including the header of the previous pass).

LLVM’s IR is the same “thing” all the way through the pipeline, and that thing is standardized separately from the compiler, so you can write tooling that processes it and expect it not to break with new LLVM versions.

That same “thing” also round trips to a text representation that looks like assembly. That means people can more easily talk about it, and that programmers can use it to write test cases or even to write code in it.

I think that has helped llvm tremendously.
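To make the round-trip point concrete, here is a small Python sketch of a toy textual IR that parses into in-memory instructions and prints back to identical text. The `Instr` class, the `parse` and `dump` helpers, and the `%name = opcode operands` syntax are all assumptions invented for this example; the syntax only loosely resembles assembly-like IR text and is not actual LLVM IR.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Instr:
    result: str                 # name being defined, e.g. "%sum"
    opcode: str                 # operation name, e.g. "add"
    operands: Tuple[str, ...]   # operand names or literals


def parse(text: str) -> List[Instr]:
    """Parse lines of the form '%r = op a, b' into Instr objects."""
    instrs = []
    for line in text.strip().splitlines():
        line = line.strip()
        if not line:
            continue
        result, rhs = [part.strip() for part in line.split("=", 1)]
        opcode, _, rest = rhs.partition(" ")
        operands = tuple(o.strip() for o in rest.split(",") if o.strip())
        instrs.append(Instr(result, opcode, operands))
    return instrs


def dump(instrs: List[Instr]) -> str:
    """Print instructions back in the same textual form that parse() reads."""
    return "\n".join(
        f"{i.result} = {i.opcode} {', '.join(i.operands)}" for i in instrs
    )


source = """%sum = add %a, %b
%twice = mul %sum, 2"""
module = parse(source)
round_tripped = dump(module)
print(round_tripped)
```

Because the text and the in-memory form convert losslessly in both directions, a test case can be written by hand as text, and a tool's output can be read and compared by a person.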

I attended this talk, and it was one of the highlights of the conference for me. Mike is a great speaker and I found myself inspired to dive deeper into LLVM internals afterwards.
Shame the mic is so hot, it's really hard to listen to.
Youtube mirror, for those who'd like to add it to their "Watch later"-list:

Or for those who want to easily watch on 2x speed.
mpv does a nice job with that: '[' and ']' select any arbitrary speed, and '.' steps frame by frame (with sound).
You're my hero
Was just about to do the same, regrettable about the audio quality.
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.