HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Three Things I Wish I Knew When I Started Designing Languages

Peter Alvaro · InfoQ · 164 HN points · 1 HN comment
HN Theater has aggregated all Hacker News stories and comments that mention Peter Alvaro's video "Three Things I Wish I Knew When I Started Designing Languages".
Watch on InfoQ [↗]
InfoQ Summary
Peter Alvaro talks about the reasons one should engage in language design and why many of us would (or should) do something so perverse as to design a language that no one will ever use.

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
It's slow, but there. It's just a link to this though in case you can't see it: https://www.infoq.com/presentations/language-design-process/
Mar 19, 2019 · 158 points, 32 comments · submitted by matt_d
asplake
“He summarises by saying the only good reason to design a programming language (I assume he means a radically novel language) is to shape your understanding of the problem. No regrets of being the only user of his first language, Datalist, because the point is that it shaped all his later thought in his research.”

I don’t design languages but I speak and write (iterating between different forms) for much the same reason. Thoughts made concrete in some way are much easier to manipulate. Programming can be much the same.

Klathmon
This is also why I'll often not ask important questions in meetings and will instead ask them in a chat message or email.

Not only does the act of writing it out make me ask a better question (and there are times I answer the question myself before sending it!), but the answers are often much better when written down. (Not to mention the additional benefits like easier searchability)

lugg
Discussion is cheap. If it is on topic, saying something stupid quickly and then rolling back the conversation is a far more efficient use of your time.

You do need to pick your audience.

Some people really struggle with this style of conversation. The rapid deprecation/substitution of facts can be a tad tiring.

But for those that can take it, this is essentially the same as writing things down but it's faster and has the added benefit of at least one other person "checking your math" as you go.

Klathmon
Don't get me wrong, a realtime discussion is extremely important in many cases. When brainstorming or talking through bugs or building something collaboratively, it doesn't get any better than an in-person or video meeting.

But when it comes to more complicated details, one-off questions, important decisions, or bringing the "results" of an investigation to a group, text is just better a lot of the time.

I can't count the number of times that a decision was made verbally in a meeting only to find that half the meeting misunderstood the decision or can't remember details about it the next day. Or going to discuss a complex bug/problem and having to spend a significant amount of time talking about the "setup" to the problem rather than the problem itself.

It's in those cases that spending 5 minutes writing the question up and letting people read and understand it before jumping into a meeting, or even something as simple as following up a meeting with a written summary of the important points, can be very powerful.

ska
I think what you are saying has a lot to do with particular meeting culture(s) and dynamics, rather than anything specific about the medium.

Many people would be stunned by the idea that a written followup (e.g. minutes) would be a novelty, or, for that matter, that circulating context beforehand (e.g. an agenda), with time to ask clarifying questions offline, would be either.

joshmarlow
> I can't count the number of times that a decision was made verbally in a meeting only to find that half the meeting misunderstood the decision or can't remember details about it the next day.

I've had the same experience, but I tend to follow a hybrid approach: bring up everything you can think of in the meeting, then immediately follow up with an emailed recap of what was agreed upon. It helps make the verbal decisions explicit and iron out additional details.

stcredzero
> He summarises by saying the only good reason to design a programming language (I assume he means a radically novel language) is to shape your understanding of the problem

What of "meta-languages" like Lisp and Smalltalk? You can write your own control structures in Smalltalk, and they'll hew to the same 1st class meta-syntax as all of the "built in" control structures. Want a control structure for database transactions? Just write it as a function that takes code block arguments. Want control structures to handle optional values? Just write it!

eequah9L
For every new project idea, I open a file and start writing everything that comes to mind. A design document if you will. Writing stuff down like this makes things obvious and tangible, easy to juxtapose, as you say, easy to manipulate.

I can't count how often that document is the last thing that I do on a project, because I just can't make it work even on the paper, where ideas are cheap to sketch out.

sshine
> Jeff Ullman's claim that encapsulation and declarativity are in tension

Can someone provide an elaboration on this, or link to this claim?

mjburgess
SELECT name FROM users;

Users.getName();

The above is "encapsulated" meaning that you need to have done the extra work to write "getName". Encapsulation is a bad solution to a worse problem (changing state).

If you hide information you have to do extra work to get it out. This "extra work" is in tension with a declarative approach which uses general verbs across domains, requiring therefore, these domains expose their data.
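A toy Python illustration of the tension (classes and data are made up): the generic verb works only while the data is exposed in a common shape; once fields are hidden behind per-class methods, each class needs its own accessor and the generic verb no longer applies.

    # Declarative-ish: one generic verb over data exposed in a common shape.
    users  = [{"name": "alice", "age": 41}, {"name": "bob", "age": 29}]
    orders = [{"id": 1, "total": 9.99}]

    def select(rows, column):
        # Works on any "table" because every row exposes its fields uniformly.
        return [row[column] for row in rows]

    print(select(users, "name"))    # ['alice', 'bob']
    print(select(orders, "total"))  # [9.99]

    # Encapsulated: the class hides its fields, so it needs its own accessor
    # ("getName" above) and the generic `select` no longer applies to it.
    class User:
        def __init__(self, name, age):
            self._name, self._age = name, age

        def get_name(self):
            return self._name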

mlthoughts2018
Why wouldn’t encapsulation be linked more to normalization in a declarative setting? I’d argue that “implementation hiding” in a declarative setting is more about unfolding all data dependency, so that attributes are isolated from each other. Then the extra work to perform more joins to reproduce a cumulative data record seems very analogous to the extra work of getters and setters.
mjburgess
joining is connected to normalization, not to declarative style

declarative style is (essentially) the highly polymorphic use of verbs in programming. You might even call it "polymorphic programming" rather than declarative, since that's often what is being prioritized.

this level of polymorphism requires the data operated on by the verb to be relatively exposed

ie., "select" cannot be different for every table

OO essentially has "select" redefined every time you want to expose data.

mlthoughts2018
> “OO essentially has "select" redefined every time you want to expose data.”

I still don’t see how this is different in concept, since you “redefine” selection every time for normalized attributes as well (you must individually hand-specify where to select from).

I get what you’re saying about polymorphism, such that things like select or join “just work” on everything. Many OO / functional programming languages have that too, for example with __getitem__ or __getattr__ protocols in Python, so that the same “verb” (e.g. foo.x = 3) “just works” automatically without concern for specialized implementations.

But in the declarative case you’re just trading off the complexity of creating and registering specific getter logic for each type against flattening out all the attributes such that all the complexity is embedded into knowing the correct name to ask for the universal verbs (e.g. select) to apply to.
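For what it's worth, a tiny sketch of the Python protocol angle mentioned above (illustrative names only): __getattr__ lets a wrapper expose data so that one generic "verb" works across otherwise different types.

    class Record:
        # Expose a dict's keys as attributes via the __getattr__ protocol.
        def __init__(self, data):
            self._data = dict(data)

        def __getattr__(self, name):
            try:
                return self._data[name]
            except KeyError:
                raise AttributeError(name)

    def names(rows):
        # One generic "verb" that works on anything exposing a .name attribute.
        return [row.name for row in rows]

    print(names([Record({"name": "alice"}), Record({"name": "bob"})]))  # ['alice', 'bob']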

mjburgess
> against flattening out all the attributes

Hence the tension with encapsulation

mlthoughts2018
I don’t understand what you mean by “tension.” In an OO setting, you register specialized implementations with some type of interface / protocol / dispatcher. In functional programming you create functions that allow a data type’s specialized implementation to be type matched against a typeclass. In declarative programming you flatten out attributes so that a pre-existing set of polymorphic operators apply, and you translate the complexity of lookups (that match the implementation) into the complexity of specifying everything in a splayed-out form.

It’s all the same complexity, just different trade-offs for how it’s represented. I wouldn’t describe this as “tension” because that makes it sound as though declarative programming has some fundamental limitation or missing capability regarding encapsulation, but it doesn’t. It’s just making a different trade-off of the same thing.

mjburgess
I guess there's two senses of "encapsulation" in use here.

What I think is meant by the article is information-protection rather than information-hiding, or implementation hiding.

Encapsulation in the sense of "private string name;".

cy6erlion
High-level languages should be designed for humans to use; encapsulation gives humans a level of abstraction.
fnord123
>This "extra work" is in tension with a declarative approach

No it isn't. They complement each other. If User is an interface, then retrieving the name could be a call to a DB, a read from a file, a read from memory, or an HTTP GET request; it doesn't matter. The declarative component here does not care how the request is fulfilled as long as the interface is respected.

0815test
The problem is that encapsulation makes it hard to specify reasonable constraints. For example, a "getter" might come with a constraint that "x = myobj.GetX(); myobj.SetX(x);" should not change the observable behavior of myobj. Or that "myobj2.ConstructFromXY(myobj.GetX(), myobj.GetY());" should result in a "myobj2" that's in some ways indistinguishable from "myobj". With a "declarative" approach, these constraints are implied in the system itself.
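As a concrete (entirely made-up) illustration, here is roughly what that first "getter/setter round trip" constraint looks like as an executable check:

    class Point:
        def __init__(self, x, y):
            self._x, self._y = x, y

        def get_x(self):
            return self._x

        def set_x(self, x):
            self._x = x

        def observable(self):
            # Whatever counts as "observable behavior" for this object.
            return (self._x, self._y)

    def get_set_roundtrip_ok(obj):
        # The "x = obj.get_x(); obj.set_x(x)" law: a get/set round trip
        # should leave observable behavior unchanged.
        before = obj.observable()
        obj.set_x(obj.get_x())
        return obj.observable() == before

    assert get_set_roundtrip_ok(Point(1, 2))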
skybrian
It seems to be briefly discussed in "Principles of Database and Knowledge-base Systems" where it's pointed out that object-oriented databases are usually not declarative.

https://www.sti-innsbruck.at/sites/default/files/Knowledge-R...

fnord123
From the transcript:

> I knew what I wanted my language to talk about, but I had no idea what it looked like, or how you could use it, or what its semantics were. I had six or seven years to screw around with this stuff, so I did. A lot of time passed. I spent my time on Google docs thinking about what the right state primitive should be. How can we talk about data changing? Whatever language I come up with there should be a way to efficiently evaluate it, that's pretty important if you're a systems person. I screwed around more with syntax. I got really upset about this thing that Jeff Ullman said in one of his books, in which he argued that encapsulation, this thing that you really want, you hide an implementation so that you can abstract above it, and declarativity, allowing a programmer to give a precise description of what should happen in a computation but not necessarily how to do it, appear to be very much at odds with each other.

> So, declarativity tends to work really well for very small programs, but when we want to build big programs out of small programs, stuff gets fussy. I'm not going to talk much more about this in this talk, but, this is a problem. So anybody in the audience who is interested in declarative programming, I'd love to hear if you have any ideas about how we can do a better job of hiding encapsulation and reuse in query languages. And then, of course, the number one thing that occupied all my attention and kept me up at night, which was how do we, in a reasonable way, talk about uncertainty? How does a language talk about what it can't talk about? That's weird.

I don't know which book this refers to or how Ullman supports his claim. And yet, Postgres foreign data wrappers exist, providing encapsulation to the declarative SQL interface. shrug. Maybe Ullman is just wrong - or his comments were from a long time ago and the world has moved on since then.

naasking
> And yet, Postgres foreign data wrappers exist providing encapsulation to the declarative SQL interface.

I'm not sure what this means. Do you have a reference?

It seems pretty clear that building an efficient query plan depends on having access to the internals of any entities referenced by the query. Encapsulation hides the internals of entities, so the claim seems pretty solid to me.

numbsafari
Interesting. Can this be addressed by ensuring the API of the encapsulated subsystem is sufficiently rich to provide what is needed by the declarative query planner? That is, the API allows the encapsulated subsystem to describe its contents in a way that supports query planning (e.g. shape of the contained data, supported operators, and general statistics), without having to reveal its exact implementation (columnar v row store, distributed v centralized storage, exact file format used, etc)?
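A rough Python sketch of what such an API might look like (the interface and method names are entirely hypothetical):

    from abc import ABC, abstractmethod

    class PlannableSource(ABC):
        # Enough metadata for a query planner, nothing about how the data is stored.

        @abstractmethod
        def schema(self) -> dict:
            """Column names mapped to types (the shape of the contained data)."""

        @abstractmethod
        def supported_operators(self) -> set:
            """Operators the source can evaluate itself, e.g. {"=", "<", "LIKE"}."""

        @abstractmethod
        def statistics(self) -> dict:
            """General statistics for planning (row count, value distributions),
            without revealing columnar vs. row store, file format, etc."""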

Perhaps the evolving shape of that API as the query planner becomes more capable (and therefore requires more knowledge) is what was meant by “are in tension”.

Tension isn’t a bad thing. My favorite art is all about tension. Life is at its most immediate and palpable in the presence of tension.

skybrian
Abstraction allows us to substitute one thing for another without breaking compatibility. This is at the heart of both API-level compatibility and also most optimizations. If two implementations weren't considered equivalent, we wouldn't be able to do the substitution without breaking things. On the other hand, if the change didn't have some important, user-visible difference (such as performance), there would be no point in doing the substitution.

Now consider that most optimizations are enabled by inlining, which means that changing the internal details of a function could break optimizations in calling code and this could cause performance regression.

A stricter API could prevent regressions (some bad changes are disallowed), but it also prevents good changes (ones that would break compatibility, or optimizations that are no longer allowed).

Language designers get to decide what "equivalent code" means using various forms of abstraction, but there is not going to be a perfect answer.

naasking
> Can this be addressed by ensuring the API of the encapsulated subsystem is sufficiently rich to provide what is needed by the declarative query planner?

Perhaps, but this is still a clear weakening of encapsulation, and it's merely one example where it's most obvious. I think most uses of encapsulation are probably overkill anyway, necessitated only because the language is weak in other areas.

For instance, encapsulation is often used to control state changes due to aliasing concerns, but if you're working with an immutable language, state change and aliasing don't conflict. Or you can go the other way, like Rust, and forbid aliasing via ownership and then state change is also easy.
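A small Python sketch of the immutability point (types are made up): with an immutable value, handing out a reference can't let callers mutate your state behind your back, so there is simply less that needs to be encapsulated.

    from dataclasses import dataclass, replace

    @dataclass(frozen=True)
    class Config:
        host: str
        port: int

    class Service:
        def __init__(self, config: Config):
            self.config = config  # safe to expose: nobody can mutate it in place

    svc = Service(Config(host="localhost", port=8080))
    leaked = svc.config
    # leaked.port = 9090  # would raise dataclasses.FrozenInstanceError

    # "Changing" the config means building a new value; existing aliases are unaffected.
    new_config = replace(leaked, port=9090)
    assert svc.config.port == 8080 and new_config.port == 9090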

fnord123
> I'm not sure what this means. Do you have a reference?

https://www.postgresql.org/docs/9.3/postgres-fdw.html

https://www.percona.com/blog/2018/08/21/foreign-data-wrapper...

https://pgxn.org/tag/foreign%20data%20wrapper/

>It seems pretty clear that building an efficient query plan depends on having access to the internals of any entities referenced by the query.

Most query planners are based on table statistics, so what they need to know is the size of the table and the correlation of the actual data; they don't need to know whether the storage is a B-tree, LSM tree, epsilon tree, etc. These statistics can be gathered without knowing the underlying implementation.

Postgres query planning statistics: https://www.postgresql.org/docs/current/view-pg-stats.html

naasking
Firstly, try writing a query planner when some of the columns being joined are hidden or inaccessible (which is what I described in my post). For instance, suppose it's a dynamically computed property whose computation you don't have access to. That's specifically what the tension between encapsulation and declarative programming means.

Secondly, it seems clear that "table statistics" are a weakening of encapsulation properties. Perhaps they are the minimal weakening needed to provide efficient query planning, but it still justifies the claim that encapsulation and declarativity are in tension.

fnord123
>it seems clear that "table statistics" are a weakening of encapsulation properties.

I don't have the original source of the Ullman comments so I don't know the definitions he's using to make his assertion. So unfortunately I'm speaking a bit out of ignorance about precisely what he meant. I can't really say whether or not table statistics is a weakening of what he intended.

bunderbunder
I don't know that he's wrong, but there's a major premise in there that's easy to miss if you aren't watching for it: "Whatever language I come up with there should be a way to efficiently evaluate it, that's pretty important if you're a systems person." It makes me think that he mis-stated his case slightly.

In an ideal case, SQL's encapsulation supports its declarativity. In practice, it's riddled with holes. Some of them, like micro-managing the indexing scheme, aren't part of the basic idea behind relational databases (Codd's paper doesn't even mention them, IIRC), but have become so deeply engrained in our expectations of how a relational database should work that we don't even recognize them as a compromise on the idea behind the relational calculus anymore. But they're generally either deeply non-declarative, or lay bare some pretty deep implementation details of the system, or, in the case of index optimization features, both.

So, what I'm guessing is really going on here is an incomplete observation that there's a "pick two" situation: you can have only two of performance, declarativity, and encapsulation.

dang
Url changed from http://lambda-the-ultimate.org/node/5569, which points to this.
chalst
The virtue of the LtU link is that it gave a brief outline of the talk.
jfengel
The semantics of negation is like security: there's an urge to ignore it at the beginning because it's not the use case you have in mind, but it turns out that it's absolutely fundamental to everything. You have to build it in from the beginning; you can't layer it on afterwards.

Different negation semantics put you in different computability classes. And there's a tension: you want more powerful semantics, but they take longer to run. Full first-order logic is great, but it's also exponential time to evaluate. (Also, the semantics of negation for FOL are easy to explain but not actually very similar to what you expect negation to mean in real life, leading to unexpected semantics.)
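To make the "different negation semantics" point a bit more concrete, here is a toy Python sketch of Datalog-style stratified negation-as-failure (the facts and rules are made up): anything not derivable is treated as false, which behaves quite differently from classical negation.

    from itertools import product

    nodes = {"a", "b", "c"}
    edges = {("a", "b"), ("b", "c")}

    # Stratum 1: compute `reachable` to a fixpoint (no negation involved yet).
    reachable = set(edges)
    changed = True
    while changed:
        changed = False
        for (x, y), (y2, z) in product(list(reachable), list(reachable)):
            if y == y2 and (x, z) not in reachable:
                reachable.add((x, z))
                changed = True

    # Stratum 2: unreachable(X, Y) :- node(X), node(Y), not reachable(X, Y).
    # Negation-as-failure: "not reachable" means "not derivable above", which is
    # only well-defined here because `reachable` was fully computed first.
    unreachable = {(x, y) for x, y in product(nodes, nodes) if (x, y) not in reachable}

    print(("c", "a") in unreachable)  # True: nothing proves reachable(c, a)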

Mar 19, 2019 · 6 points, 0 comments · submitted by lrsjng
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.