Hacker News Comments on
Cory Benfield - Building Protocol Libraries The Right Way - PyCon 2016

PyCon 2016 · Youtube · 6 HN points · 2 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention PyCon 2016's video "Cory Benfield - Building Protocol Libraries The Right Way - PyCon 2016".
Youtube Summary
Speaker: Cory Benfield

One of the great strengths of the Python ecosystem is the enormous collection of powerful, flexible libraries. However, these libraries tend to suffer from one extremely common design flaw that means that the work done is not easily re-usable or transferable. In this talk, we talk about how to build libraries that can be used as widely as possible, through the lens of the Python Hyper HTTP project.

Slides can be found at: https://speakerdeck.com/pycon2016 and https://github.com/PyCon/2016-slides

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
In some cases it is very practical, even in other (non-functional) programming languages.

For example, https://sans-io.readthedocs.io/ (and its accompanying talk https://www.youtube.com/watch?v=7cC3_jGwl_U) give some really great brainfood about the way all kinds of network protocols are currently implemented, and the benefits of separating the protocol (logic) from the actual implementation (I/O).
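A minimal sketch of that separation, in Python. The `LineProtocol` class, its method names, and the CRLF-terminated line format are all hypothetical and chosen only for illustration; they are not taken from the sans-io documentation or the talk. The point is that the protocol object only turns bytes into events and events into bytes, and never touches a socket.

```python
from dataclasses import dataclass


@dataclass
class LineReceived:
    """Event emitted once a complete line has been parsed."""
    line: bytes


class LineProtocol:
    """Hypothetical sans-I/O protocol: bytes in, events out, no sockets."""

    def __init__(self) -> None:
        self._buffer = bytearray()

    def receive_data(self, data: bytes) -> list[LineReceived]:
        """Feed raw bytes from any source and return the lines completed so far."""
        self._buffer += data
        events = []
        while (end := self._buffer.find(b"\r\n")) != -1:
            events.append(LineReceived(bytes(self._buffer[:end])))
            del self._buffer[:end + 2]  # drop the line and its CRLF
        return events

    def send_line(self, line: bytes) -> bytes:
        """Return the bytes to transmit; the caller decides how to send them."""
        return line + b"\r\n"


# The bytes could come from a blocking socket, asyncio, Twisted, or a test.
proto = LineProtocol()
first = proto.receive_data(b"HELLO\r\nWOR")   # one full line, one partial
second = proto.receive_data(b"LD\r\n")        # completes the partial line
```

Because nothing here performs I/O, the same class could in principle be driven by any I/O framework, which is the reuse argument the talk makes.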

You might also want to query Google or DDG for "Free monads" if you want to implement a similar layered approach in a functional language such as Scala or Haskell.

truth_seeker
That's very concise. Thanks. Always good to see actual code :)
Aug 23, 2016 · 6 points, 0 comments · submitted by joeyespo
Jun 06, 2016 · geofft on Accidentally nonblocking
Cory Benfield's PyCon talk last week, "Building Protocol Libraries the Right Way" (https://www.youtube.com/watch?v=7cC3_jGwl_U), makes the argument that a large number of problems can be traced to not cleanly separating responsibilities of actually physically doing I/O and making semantic sense of the bytes. His primary worry was about reimplementing things like HTTP many times, once for each I/O framework (why do Twisted, Tornado, and asyncio all have their own HTTP implementation?). But it seems the same problem can be seen here: every single part of the code thinks it knows how to actually retrieve data from the network, so it interacts with the network on its own, causing nested polling and similar awkwardness. If every part of the event-processing code thinks it knows how to do network I/O, you have many more opportunities for getting network I/O wrong.

If xterm were designed so that e.g. xevents() had only the responsibility of fetching bytes from the X socket and do_xevents() and everything else had only the responsibility of handling bytes from a buffer, there would be no temptation to poll in two different functions. Only one function would even know that the byte source is a socket; the rest just know about the buffer.
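The split described above can be sketched in Python rather than xterm's C. The names `fetch_bytes` and `handle_events` and the newline-terminated "event" format are assumptions made for this illustration and do not correspond to xterm's code or to the X protocol.

```python
import socket


def fetch_bytes(sock: socket.socket, buffer: bytearray) -> int:
    """The only function that knows the byte source is a socket."""
    data = sock.recv(4096)
    buffer.extend(data)
    return len(data)


def handle_events(buffer: bytearray) -> list[bytes]:
    """Consume complete newline-terminated events from the buffer; no I/O here."""
    events = []
    while (end := buffer.find(b"\n")) != -1:
        events.append(bytes(buffer[:end]))
        del buffer[:end + 1]
    return events


def demo() -> list[bytes]:
    """Drive both halves over a local socket pair standing in for the X socket."""
    writer, reader = socket.socketpair()
    try:
        writer.sendall(b"expose\nkeypress\npart")
        buffer = bytearray()
        fetch_bytes(reader, buffer)
        # The trailing partial event stays in the buffer for the next read.
        return handle_events(buffer)
    finally:
        writer.close()
        reader.close()
```

Since `handle_events` cannot reach the socket, it has no way to poll on its own; only the caller of `fetch_bytes` decides when to read.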

yetihehe
The more I see such problems the more I like erlang. Most socket handling libraries split handling into protocol handling layer and application layer. Protocol layer ensures there is full message available and application layer handles only full messages. Most of the time it's the simplest and most natural way to do anything in erlang.
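For readers who don't know Erlang, the same layering can be sketched in Python. This is not Erlang's implementation; the `Framer` class and the 4-byte big-endian length prefix are assumptions picked for the example. The protocol layer buffers bytes and calls the application callback only with complete messages.

```python
import struct
from typing import Callable


class Framer:
    """Protocol layer: buffers bytes and hands over only complete messages."""

    HEADER = struct.Struct("!I")  # assumed 4-byte big-endian length prefix

    def __init__(self, on_message: Callable[[bytes], None]) -> None:
        self._buffer = bytearray()
        self._on_message = on_message  # application layer callback

    def feed(self, data: bytes) -> None:
        """Accept bytes in arbitrary chunks and deliver whole messages."""
        self._buffer += data
        while len(self._buffer) >= self.HEADER.size:
            (length,) = self.HEADER.unpack_from(self._buffer)
            end = self.HEADER.size + length
            if len(self._buffer) < end:
                break  # wait for the rest of this message
            self._on_message(bytes(self._buffer[self.HEADER.size:end]))
            del self._buffer[:end]


def frame(payload: bytes) -> bytes:
    """Encode one message with its length prefix."""
    return Framer.HEADER.pack(len(payload)) + payload


# Application layer: it only ever sees full messages.
received: list[bytes] = []
framer = Framer(received.append)
wire = frame(b"hello") + frame(b"world")
framer.feed(wire[:6])   # header plus a partial body: nothing delivered yet
framer.feed(wire[6:])   # remainder arrives: both messages delivered
```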
jacquesm
Erlang really gets this right. Abstract out all the generic server stuff and have it coded up by experts, then have the application programmers concentrate on the application. A bit like programming a plug-in for Apache but then extrapolated to just about anything you could do with a server. Erlang is a very interesting eco-system, the more I play around with it the more I like it and the way it is put together. If it had a shallower learning curve it would put a lot of other eco-systems out of business. But then again, the fact that it doesn't makes it something of a secret weapon for those shops and individuals that have managed to really master it.
raarts
> if it had a shallower learning curve

Elixir is meant to address that.

JoshTriplett
> If xterm were designed so that e.g. xevents() had only the responsibility of fetching bytes from the X socket and do_xevents() and everything else had only the responsibility of handling bytes from a buffer, there would be no temptation to poll in two different functions.

X is an interesting special case. The X protocol has some special cases where you have to make sure you read before you write, or vice versa; doing the wrong buffering or blocking operation can result in a deadlock between you and the server.

I certainly enjoyed that PyCon talk, and I agree with the conclusion; however, there are some special-case protocols like X where integrating them into your main loop requires some special protocol-specific care.

geofft
I think you can solve this by reporting an "I can't read unless you write some more" event, or allowing a "I can't write unless you process some events" return code from the write function. You need some protocol awareness (you can't completely abstract every protocol as bytes -> JSON and JSON -> bytes), but it doesn't rise to the level of letting application code directly have access to the underlying file descriptor.

I believe both SSL and SSH have similar issues, where the state of the protocol client requires that you order reads and writes in some way to avoid deadlock. I guess TCP also has a similar risk with window sizes going to 0, and in practice, hiding TCP behind a UNIX file descriptor and a relatively constrained socket API works fine; client apps don't need to care about the exact state of the TCP implementation.
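One way the "I can't read unless you write some more" idea might look, as a rough Python sketch. The `PingPongProtocol` class, the `WriteRequired` event, and the toy PING/PONG rule are all invented for this example; they do not model X, SSL, or SSH. The protocol object stops parsing while a reply is pending and reports that as an event, so the application never needs the file descriptor.

```python
from dataclasses import dataclass


@dataclass
class DataReceived:
    """Event carrying one complete application line."""
    line: bytes


@dataclass
class WriteRequired:
    """Event: pending outbound bytes must be flushed before more input is parsed."""


class PingPongProtocol:
    """Hypothetical protocol that must answer each PING before reading further."""

    def __init__(self) -> None:
        self._inbound = bytearray()
        self._outbound = bytearray()

    def receive_data(self, data: bytes) -> list:
        """Buffer input and parse lines until a reply becomes pending."""
        self._inbound += data
        events = []
        while not self._outbound and (end := self._inbound.find(b"\n")) != -1:
            line = bytes(self._inbound[:end])
            del self._inbound[:end + 1]
            if line == b"PING":
                self._outbound += b"PONG\n"  # reply owed to the peer
            else:
                events.append(DataReceived(line))
        if self._outbound:
            events.append(WriteRequired())
        return events

    def data_to_send(self) -> bytes:
        """Hand the pending bytes to the caller, who performs the actual write."""
        out = bytes(self._outbound)
        self._outbound.clear()
        return out
```

In this sketch the caller reacts to `WriteRequired` by writing `data_to_send()` to whatever transport it owns, then calls `receive_data(b"")` to resume parsing the input that was held back.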

nicolast
https://blog.incubaid.com/2011/12/13/hybrid-sync-async-pytho...
jerf
One of the nice things about Go is that the io.Reader and io.Writer interface being written into the base libraries means a lot of code gets this right, and only expects a stream rather than "a socket".

The takeaway here is not that Go is awesome; the takeaway is a lesson on the importance of getting a very early release of a language and its stdlib correct. The vast majority of modern languages today could trivially-to-easily do the same thing, but they don't in the standard lib, so the first couple of libraries end up string based, so the next libraries that build on those end up based on strings, and before you know it, in practice hardly anything in the ecosystem is implemented this way, even though in theory nothing stops it from happening. (Then around year 3 or 4, a big library gets built that does this correctly, but it's too late to retrofit the standard library and it only ever gets to about 10% penetration after a lot of reimplementation work.)
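The "stream rather than a socket" point can be illustrated with a rough Python analogue (Python, not Go, and not Go's io.Reader interface itself). The helper `count_lines` is hypothetical; it accepts any binary file-like object, so the same function works on an in-memory buffer and on a socket wrapped with `socket.makefile`.

```python
import io
import socket
from typing import BinaryIO


def count_lines(stream: BinaryIO) -> int:
    """Count lines from any binary stream; it never learns what the source is."""
    return sum(1 for _ in stream)


def demo() -> tuple[int, int]:
    """Run the same function against an in-memory stream and a socket."""
    in_memory = count_lines(io.BytesIO(b"a\nb\nc\n"))

    writer, reader = socket.socketpair()
    with writer, reader:
        writer.sendall(b"x\ny\n")
        writer.shutdown(socket.SHUT_WR)  # signal end-of-stream to the reader
        with reader.makefile("rb") as stream:
            from_socket = count_lines(stream)
    return in_memory, from_socket
```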

HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.