HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
David Beazley - Python Concurrency From the Ground Up: LIVE! - PyCon 2015

PyCon 2015 · Youtube · 131 HN points · 19 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention PyCon 2015's video "David Beazley - Python Concurrency From the Ground Up: LIVE! - PyCon 2015".
Youtube Summary
"Speaker: David Beazley

There are currently three popular approaches to Python concurrency: threads, event loops, and coroutines. Each is shrouded by various degrees of mystery and peril. In this talk, all three approaches will be deconstructed and explained in an epic ground-up live coding battle.

Slides can be found at: https://speakerdeck.com/pycon2015 and https://github.com/PyCon/2015-slides"

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
I highly recommend Victor Skvortsov's exhaustive, 12-part series, "Python Behind the Scenes," a deep dive into the implementation details of just about every Python language feature one could possibly take for granted. There isn't a single one I haven't learned something from, but some of my personal favorites include:

* Part 2, How the CPython compiler works: https://tenthousandmeters.com/blog/python-behind-the-scenes-...

* Part 6, How Python's object system works: https://tenthousandmeters.com/blog/python-behind-the-scenes-...

* Part 10, How Python dictionaries work: https://tenthousandmeters.com/blog/python-behind-the-scenes-...

* Part 13, The GIL and its effects on Python multithreading: https://tenthousandmeters.com/blog/python-behind-the-scenes-...

The whole series can be found here: https://tenthousandmeters.com/tag/python-behind-the-scenes/

For a slightly higher-level (i.e., no macabre, low-level C code), but still incredibly enlightening dive into the "dark side" of Python, in particular various abuses and hexes for manipulating control flow, have a look at David Beazley's smorgasbord of keynote presentations like "Generators: The Final Frontier" (https://www.dabeaz.com/finalgenerator/), "Python Concurrency From The Ground Up" (https://www.youtube.com/watch?v=MCs5OvhV9S4), and "Build Your Own Async" (https://www.youtube.com/watch?v=Y4Gt3Xjd7G8&t=2045s).

Rich Hickey - Hammock-driven development: https://m.youtube.com/watch?v=f84n5oFoZBc

Bret Victor - Inventing on principle: https://m.youtube.com/watch?v=8QiPFmIMxFc

Programming is terrible: https://m.youtube.com/watch?v=csyL9EC0S0c

Dave Beazley - Python concurrency from the ground up (applicable to languages in general, with generator and coroutine functionality): https://m.youtube.com/watch?v=MCs5OvhV9S4

Functional programming, with birds: https://m.youtube.com/watch?v=6BnVo7EHO_8&t=1009s

Short and classic: Wat https://www.destroyallsoftware.com/talks/wat

mikewarot
>Bret Victor - Inventing on principle: https://m.youtube.com/watch?v=8QiPFmIMxFc

This gave me a sharp moment of clarity, thank you so much for this!

I'll work on the wording over time, but here's a rough sketch of my principle:

My principle is that no person should ever be forced to blindly trust a computer to do the right thing. Computing shouldn't be a choice between blindly trusting the black box and getting nothing done.

Nobody should have to hand over their wallet to buy an ice cream cone; you can just take out the exact change and pay. Why should you have to give a program access to everything when you just want to edit one text file?

If this is all a demo is, then my view is that it's no good.

Here's a demo: https://www.youtube.com/watch?v=MCs5OvhV9S4

Notice in the Rust case the narrator is just "announcing" what's happening.

Notice in the Python case the narrator is explaining why they are doing something; what they are doing; critically evaluating it (why it's good or bad).

Notice also, that David uses humour and brings people on a "journey through a thought process".

w0m
You're comparing a 46-minute presentation with a single 5-minute slice of a 160-minute course?

(I have watched that concurrency pycon session previously - it is good.)

mjburgess
Yes, I am. I am rejecting the premise that videos should be 5 minutes, and that 10 minutes on traits plus a 5-minute "demo" in this style is actually a productive form of learning.

If you can only do 160 minutes, do a 160-minute journey. Consider any of the 3-hour tutorial videos done at PyCon.

10 minutes on traits, within that 160 minutes, I think would be very productive.

EDIT: from elsewhere in this thread: https://www.youtube.com/watch?v=WnWGO-tLtLA -- c. 3 hrs on Rust (I haven't seen it, but I suspect it is approached as a journey, which will improve its quality).

AlexCoventry
I think there's a place for both types of presentation. Strang is giving a conceptual overview, and a listener is necessarily passive. The MS videos are presenting technical details of how to write Rust programs, and bite-sized chunks can be good there, because they provide natural stopping points for listeners to experiment with the presented methods. Similarly, I could see bite-sized videos being natural for teaching the business of actually calculating solutions to a system of linear equations.
BHSPitMonkey
You can watch as many sequenced 5-minute videos as you wish in a single sitting; the fact that they're divided into separate videos in a playlist is hardly different from taking a longer video and marking the timestamps where different ideas are presented.
setr
That’s technically true, but it’s obviously not true in practice — a 5-min video system means that everything will be self-contained within those 5 minutes.

TV is the clearest counter-example (though it has additional constraints, like week-long gaps) — you need an ending at each 30 minutes, because you don’t expect the viewer to watch 4 episodes in one sitting. And this obviously isn’t the same as 2 hrs in a movie, where they expect the viewer to sit through the whole thing continuously.

Even on Netflix, where they do expect you to binge watch, they still have to uphold the ending-per-episode constraint, because they’ve told the viewer this is a natural stopping point, by virtue of the format

w0m
People are talking out of both sides here: complaining that the 5-minute demo isn't giving enough context, but also saying that you can't get deep enough in 5 minutes because the 5-minute video has to rehash to add context.

I get it; some people dislike short videos. But there's also a fair bit of straw-man building going on at the same time, starting from the dislike and trying to justify it.

Luckily we have lots of options :)

You add value by modelling your thought process for the learners, as you solve problems which are relevant to them.

Consider: https://www.youtube.com/watch?v=MCs5OvhV9S4 or anything of this kind.

And ask: what is David doing as the narrator (of his own thought process) ?

This.

The best references I know of are listed in that comment.

I just wanted to emphasize David Beazley's videos. I consider the video [0] about Python concurrency a masterpiece.

[0] https://www.youtube.com/watch?v=MCs5OvhV9S4

Nov 09, 2020 · 101 points, 39 comments · submitted by BerislavLopac
pixelmonkey
This is a presentation given by David Beazley (dabeaz), wherein he live-codes, in Emacs and from scratch, a concurrent socket server to illustrate the concepts of I/O-bound vs CPU-bound concurrency in Python. The presentation is not just illuminating on how socket programming works in Python; it is also a fun and relatively unique example of live coding as an effective presentation tool. I was at the conference live for this session and someone next to me in the crowd, when it was over, said, "I think we just witnessed the Jimi Hendrix of Python."
flobosg
My favorite talk of his is “Discovering Python”: https://www.youtube.com/watch?v=RZ4Sn-Y7AP8

> So, what happens when you lock a Python programmer in a secret vault containing 1.5 TBytes of C++ source code and no internet connection? Find out as I describe how I used Python as a secret weapon of "discovery" in an epic legal battle.

quietbritishjim
It's very disappointing to see someone who seems authoritative spreading the myth that the GIL prevents you from using threads to achieve concurrency.

It didn't work with his toy example, calling the Fibonacci function, because it's pure Python. Typically, if you have pure CPU-bound processing like this then you wouldn't want to use pure Python anyway, as it would be too slow. You'd either use a C extension library like numpy, scipy or pytorch, or (more rarely) write the code yourself in Cython. In either case the GIL is released whenever you make a call into them (you have to do it manually in Cython, but it's straightforward). If his example had been multiplying matrices together with np.dot, then threads wouldn't have been a problem.

The GIL is also released whenever you do I/O like reading data from a socket or reading from a file. This includes libraries that do this for you, such as those reading from a database (whether a remote DBMS like Postgres or a file-based one like SQLite).

Taken together, these cover 99% of cases where you want concurrency. In that 1% where the GIL is not released, fixing that pure Python computation code is often worth doing before adding concurrency anyway.
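A minimal sketch of the I/O side of that claim (the file names and sizes here are invented for the example): because each blocking read call drops the GIL while it waits, plain Python threads can genuinely overlap their file reads.

```python
import os
import tempfile
import threading

def read_file(path, results, i):
    # The GIL is released while the blocking read syscall runs,
    # so these threads can overlap their I/O.
    with open(path, "rb") as f:
        results[i] = f.read()

def read_all(paths):
    results = [None] * len(paths)
    threads = [threading.Thread(target=read_file, args=(p, results, i))
               for i, p in enumerate(paths)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        paths = []
        for i in range(4):
            p = os.path.join(d, f"chunk{i}.bin")
            with open(p, "wb") as f:
                f.write(bytes([i]) * 1024)
            paths.append(p)
        print([len(data) for data in read_all(paths)])  # [1024, 1024, 1024, 1024]
```
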

nurettin
> GIL prevents you from using threads to achieve concurrency

I didn't watch, but nothing in Python prevents concurrency. In Python, a lot of things prevent parallelism. Threads/async are ways of achieving concurrency in Python. Parallelism is achieved using multiprocessing. It is inefficient, but it works.
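That distinction can be sketched in a few lines (using time.sleep as a stand-in for a blocking I/O call, since sleep releases the GIL just as real I/O does): five overlapped waits complete in roughly the time of one, which is concurrency even without any compute parallelism.

```python
import threading
import time

def fake_io(delay):
    # time.sleep releases the GIL, standing in for a blocking I/O call
    time.sleep(delay)

def run_serial(n, delay):
    start = time.perf_counter()
    for _ in range(n):
        fake_io(delay)
    return time.perf_counter() - start

def run_threaded(n, delay):
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    serial = run_serial(5, 0.1)      # ~0.5 s: waits happen one after another
    threaded = run_threaded(5, 0.1)  # ~0.1 s: the waits overlap
    print(f"serial {serial:.2f}s, threaded {threaded:.2f}s")
```
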

quietbritishjim
Oops, you're right, I was talking about parallelism, not just concurrency. Thanks for the correction.

> Parallelism is achieved using multiprocessing.

As I said in my previous comment (but used the wrong term), threads are a perfectly reasonable way of getting parallelism in Python, so long as your application uses C-based libraries like numpy (or even some built-in modules like zipfile). In my experience, the vast majority of Python programs that might need parallelism fall into that category (or are IO bound) anyway.

Geminidog
Also from certain perspectives parallelism is achieved with threads during IO.

So if 20 threads are making blocking IO calls, all 20 threads can make progress on those IO calls in parallel while having an additional parallel thread doing compute operations yielding a total of 21 threads executing in parallel.

nurettin
We use "parallel" explicitly to mean "running on multiple cores scheduled by the kernel".
Geminidog
Yes, but "parallel" is also used in CS to specify multiple things happening at the same time including IO.

https://www.wikiwand.com/en/Parallel_computing

Quote: "Parallel computing is a type of computation where many calculations or the execution of processes are carried out simultaneously."

https://www.wikiwand.com/en/Parallel_communication

Quote: "In data transmission, parallel communication is a method of conveying multiple binary digits (bits) simultaneously. It contrasts with serial communication, which conveys only a single bit at a time; this distinction is one way of characterizing a communications link."

znpy
> It didn't work with his toy example, calling the Fibonacci function, because it's pure Python.

maybe I do want to write pure python AND have multi-threaded applications?

Geminidog
One use case is IO-bound applications. Similar to NodeJS, which can handle 10K concurrent IO calls on a single thread, python releases the GIL on IO. So you can write pure python, have multithreading, and get all the benefits of "parallelism", depending on whether or not python releases the GIL, which it does with certain libraries and on IO.

For all else use multiprocessing.
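For that "all else" case, here is a minimal multiprocessing sketch (the fib function is just toy CPU-bound work in the spirit of the talk, not code from it): each worker is a separate interpreter with its own GIL, so the calls really do run in parallel on multiple cores.

```python
from concurrent.futures import ProcessPoolExecutor

def fib(n):
    # Pure-Python CPU work: it holds the GIL, so threads would not
    # run these calls in parallel, but separate processes do.
    return n if n < 2 else fib(n - 1) + fib(n - 2)

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        # Each fib call is dispatched to its own worker process.
        print(list(pool.map(fib, [25, 26, 27, 28])))  # [75025, 121393, 196418, 317811]
```
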

huseyinkeles
For the curious, GIL = Global Interpreter Lock
jacobwilliamroy
I remember using the multiprocessing library to speed up some file I/O. Basically I found out that I could spawn a separate process for each physical storage interface, and I was able to hash like 7 TiB of data in a little over 7 hours. My understanding was that threads are subject to the GIL and thus cannot run on multiple cores, whereas processes don't have that same restriction, so I needed to use the multiprocessing library, NOT the threading library, to parallelize my I/O.

Also, the featured video is from 2015. Maybe it's just outdated information?

quietbritishjim
> My understanding was that threads are subject to GIL and thus cannot run on multiple cores

That's exactly the myth I'm trying to address. It's true that in some circumstances the GIL prevents threads from running on multiple cores, but not all. The GIL would definitely have been released during the file I/O calls in your program. The GIL could well have been released during hashing too, depending on what hashing library you were using, which would have enabled true thread-based parallelism. For example, the docs for Python's built-in hashlib say [1]:

> Note: For better multithreading performance, the Python GIL is released for data larger than 2047 bytes at object creation or on update.

On the other hand, I'm absolutely not saying that nobody should use multiple processes. For some applications, it's no more complex to use the multiprocessing module than to use threads, and there's not much overhead in passing the data between processes. In that case, it's nice because you just don't even need to worry about whether the GIL is going to play a role, which can be a pain since sometimes there's no documentation saying whether specific functions release the GIL or not. All I'm saying is that we should always make clear to everyone that threads are an option that the GIL doesn't (always) prevent.

> Also, the featured video is from 2015. Maybe it's just outdated information?

Another note on the hashlib page said that the above feature was added in Python 3.1 (released in 2009). That's just about that particular module. I found documentation from Python 1.5 (released in 1998) that describes how C extensions can release the GIL [2].

[1] https://docs.python.org/3/library/hashlib.html

[2] https://docs.python.org/release/1.5/api/node43.html
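The hashlib behavior quoted above can be sketched directly (the buffer sizes here are arbitrary, chosen only to clear the documented 2047-byte threshold): hashing large buffers from a thread pool lets each sha256 call run with the GIL dropped.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def digest(data):
    # For inputs above hashlib's documented 2047-byte threshold, the GIL
    # is released during the hash computation, so these plain Python
    # threads can use multiple cores.
    return hashlib.sha256(data).hexdigest()

if __name__ == "__main__":
    chunks = [bytes([i]) * (1 << 20) for i in range(8)]  # eight 1 MiB buffers
    with ThreadPoolExecutor() as pool:
        digests = list(pool.map(digest, chunks))
    # Threaded results match the serial ones.
    assert digests == [digest(c) for c in chunks]
    print(f"hashed {len(chunks)} buffers")
```
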

Geminidog
You are completely wrong. The GIL is released on I/O, so threads can parallelize during I/O calls.

This means that if your application is bottlenecked by IO, threading is better. If it's bottlenecked by both IO and compute, or only compute, then multiple processes are better.

For your specific case it sounds like the latter: your application is bound by both, therefore multiple processes are better. But what you are implying with your post, that using multiprocessing over multithreading is the more "correct" way to "speed up" IO, is categorically wrong.

Dealing with the IO bottleneck by allowing IO calls to make progress while switching to other threads is a very common pattern in web development, and the pattern has literally been baked into several frameworks and languages. NodeJS, GoLang, Erlang and Elixir can handle around 10k concurrent IO calls on a few threads without skipping a beat.

quietbritishjim
> If it's bottle necked by both IO and compute or only compute then multiple processes are better.

This is why I started this whole thread. The myth that you can't multithread compute in Python is so pervasive that even when we're in the middle of a thread specifically about the topic, and right next to my comment where I show hashlib specifically does release the GIL, there's still a comment saying that multiple processes are needed for compute parallelism.

Geminidog
My comment is more addressing the guy trying to use processing OVER threading to improve overall speed of completing multiple IO tasks.

Yeah I'm aware of your comment. And I can see how you could be pissed off over someone not mentioning what you're saying but please be aware that forum threads can go off into slight tangents when someone states something that is factually wrong.

Yes. I know... Programmers can turn to C or they have to use specific libraries to achieve parallelism with threading. I get it. Though you're describing special cases. Similar to the grammatical special cases in English: https://www.e-education.psu.edu/styleforstudents/c1_p6.html

Special cases make certain grammatical rules technically incorrect but the general notion behind those rules and why those rules are still pervasive and stated by people who are aware of the special cases is still valid. It's the same story with the python GIL. There could be multitudes of special cases but this does not negate the existence of a general grammatical rule: Compute in python threads is not parallel.

Do I go into a long winded technical discussion of all the special cases or do I just use the generality to point out his categorical mistake? I guess I should have taken the long winded route because if I don't I'll piss you off. Thank you for voting me down btw, I will be sure to address all stakeholders in HN threads in my future replies.

jacobwilliamroy
I think you're confusing async and parallelism. In multithreading, the computer will switch between multiple tasks whenever there is nothing to do, such as waiting for an I/O read or waiting for a network packet to arrive. Things don't happen at the same time; it's just a more efficient, dynamic scheduling of a sequence of tasks happening one after the other. It's not physically possible for multiple threads to run at the same time unless they are on separate cores.

In my use case, hashing 200000 files on 7 separate hard drives, the only way to speed up the computation was to read from all 7 drives AT THE SAME TIME. There is no way to schedule 10^6 disk reads and hash operations into a single thread of execution that will reduce the runtime. Those async tricks can work in a more complicated program, but you have to understand, I was ONLY reading from disk. The hashing step was faster than the disk read (thanks xxhash) and the data was written to a sqlite file in a RAMdisk instantly, so pretty much the entire runtime was spent reading from disk. multiprocessing was able to parallelize that unambiguously.

Personally I don't trust Python's multithreading because I have no control over whether it executes on one core or multiple cores, and in the world of parallel (not async) it is a very common convention that processes are parallel, while threads are async.

However, that being said, I'm open-minded. I will try to benchmark reading from multiple hdds using multithreading and compare bandwidth to the multiprocessing approach. How's that?

Geminidog
No you're confused. I'm perfectly aware of the differences between parallelism and concurrency. You are not.

Have you ever heard of a popular web server framework called NodeJS? You can create servers with NodeJS that handle 10k concurrent IO requests.

Let's say each request takes about 1 s to finish because of the internet. If you fire 10k of these requests at a NodeJS server, the server can echo ALL the requests back in probably 2 s.

And get this. NodeJS is single threaded.

If you fire 10k requests and NONE of the requests are parallel, you would typically expect all requests to finish in 10k seconds. This is not what happens with a single thread of NodeJS. People write entire chat servers in NodeJS that handle IO requests hitting around 10k concurrent messages in flight. NodeJS handles ALL of this on a SINGLE thread, and from the user's perspective all these requests finish in seconds.

If you don't understand why the above happens it means you don't understand concurrency vs. parallelism in the context of IO.

>that processes are parallel, while threads are async.

No. Both processes and threads are async. Only processes are parallel on compute (for python) and threads become parallel on only IO (for python). And technically certain python libraries can parallelize threads.

That is the key. A single thread can handle parallel IO. Why? Because the time spent on an IO operation is usually waiting for an inflight message to travel across the wire. So in a sense you can have 50 messages in flight across a wire handled by a single thread while the CPU just spends all this time waiting for the message to arrive. It's a form of "Parallelism" if you will but you won't find any literature using the word "parallelism" in conjunction with single threaded IO even though this is technically what is going on.

>However, that being said, I'm open-minded. I will try to benchmark reading from multiple hdds using multithreading and compare bandwidth to the multiprocessing approach. How's that?

This is a useless test. Your code involves both compute and IO, so processes will parallelize BOTH IO and compute. Threads will only parallelize IO, so if you have, say, 10 fixed threads and 10 fixed processes, OF COURSE processes will beat threads for your test case.

To make it a fair test, and to see the benefits of python threads over processes, you need to get rid of the hash operation. Maybe make your python program copy each file to another file. Then try to make everything concurrent. By everything I mean: for every single file, fire off a new thread, and do the same for processes.

You will find that the threaded approach will take up much less resources and go much further before crashing.

Even better than threading, though: high-volume IO can actually be handled by a single thread. You can parallelize 200000 IO calls on a single thread with python async/await or nodejs (I recommend node for this type of stuff).

Though the other bottleneck will be your HD in this case. When your HDs see 200000 IO calls, the HDs themselves will start serializing the requests. Also, with an HD the message flight time on the wire is super short, so your IO is likely spending a good chunk of its time writing messages to program memory as well.

The most fair test is 200000 IO calls to multiple external services that can handle such volume. Don't touch your HD. Just focus tests on services that will not block or serialize IO.
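The single-thread pattern being described can be sketched with asyncio (asyncio.sleep stands in for the network round trip, since no real service is involved here): a thousand "in-flight" waits complete in about the time of one.

```python
import asyncio
import time

async def fake_request(i):
    # await hands control back to the event loop while this "request"
    # is in flight, so one thread keeps all of them going at once
    await asyncio.sleep(0.1)
    return i

async def main(n):
    # gather schedules all n coroutines concurrently on one thread
    return await asyncio.gather(*(fake_request(i) for i in range(n)))

if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(main(1000))
    elapsed = time.perf_counter() - start
    print(f"{len(results)} requests in {elapsed:.2f}s")  # ~0.1 s, not ~100 s
```
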

>multiprocessing was able to parallelize that unambiguously.

No dude. Wrong again. Have you noticed that you're able to create more processes than you have cores? Have you ever thought about all the processes that are running on your OS? Likely more total processes than you have cores. How does this happen? Because not all processes are parallel.

For OS threads and for all processes, your OS decides whether they will be parallel or not. I know of no API that gives you direct control over which core to execute your thread/process on.

The only difference between threads and processes is that threads have shared memory with other threads and are less expensive to instantiate in terms of time and memory and processes have their own memory space and are therefore more "expensive". Python is unique in the sense that threads are not parallel on compute due to the GIL, but this is only true for python and python-like languages.

How's that for a wall of text? Hope you learned something junior.

jacobwilliamroy
I did the benchmark and multiprocessing achieved higher bandwidth than multithreading. 8 times higher.

This was just reading from disk, so no hashing step at all. Didn't even write to disk.

I think you might not be so knowledgable about concurrency.

Geminidog
Did you read my full post? I don't think you did. Let me quote it:

>To make it a fair test and to see the benefits of python threads over processes you need to get rid of the hash operation. Maybe make your python program write a copy the file to another file. Then try to make everything concurrent. By everything I mean for every single file, fire off a new thread and do the same for processes.

>Though the other bottleneck will be your HD in this case. When your HDs see 200000 IO calls the HD's themselves will start serializing the requests. Also with such short message flight time in the wire is super short. Likely your IO is spending a good chunk of time writing messages to program memory as well.

>The most fair test is 200000 IO calls to multiple external services that can handle such volume. Don't touch your HD. Just focus tests on services that will not block or serialize IO.

I'm very very knowledgeable about concurrency. The problem is you. You failed to read my post and the caveats of all the details I mentioned. You failed to adjust the test fully to make it fair. You just overall failed and you don't know much.

Let me explain why you failed. Like I quoted above, HD IO has a very very short time of flight, meaning that the actual compute operation (writing the data to program memory) takes much more time. Your threads aren't parallelizing that.

Like I said before, THE ONLY DIFFERENCE between processes and threads is that python threads DO NOT parallelize compute, and processes are MORE EXPENSIVE to spawn in terms of memory space. SO if you parallelize off of a fixed number of threads/processes, what do you think will occur? Of course processes will be faster.

I told you to adjust the test to spawn a new thread/process for every IO call to see the benefits. Because threads are cheaper in memory to spawn, you will be able to spawn MANY more threads than processes. So if your operation is IO bound, what you typically did with like 10 processes can likely be done with 1000 threads. <--- That is the difference.

You said you have 200000 files? That means 200000 threads/processes must be spawned for a fair test. And if you can, try to hit some server on the internet that can handle that many requests, to fully see parallelization of IO, because time of flight is longer over the internet.

Additionally, your HDDs themselves can't parallelize 200000 requests, so in the end all your threads/processes will be bottlenecked by the HD itself, NOT by IO. That's why I told you to hit something over the internet that can handle 200000 IO calls. I mentioned all of this in my original response and you failed to understand.

Now do you get it? I actually am trying to teach you something, and you have the balls to tell me that I don't know about concurrency. It's quite obvious you have no idea what you're talking about.

jacobwilliamroy
If I was doing network I/O then there's a chance that threads would be faster... but you do realize that would also be multiprocessing on separate physical CPUs, right? A separate computer, running a separate process, is processing your request on the network at the same time. That's how async works. It's really just an illusion of single-thread parallelism.

Anyways, the original scenario didn't require any network services and hashing each data chunk and writing to ramdisk really was instant. My script originally was supposed to enable integrity checks and deduplication of files on separate hard drives. There's no point turning it into a webapp just to shoehorn async in there.

Also this is a dumb conversation. I didn't actually run the benchmark. I just lied to you because I wanted to see you waste hours of your life typing out a wall of text for someone who doesn't care what you think. Also I still think you have no idea what you're talking about.

Geminidog
>Also this is a dumb conversation. I didn't actually run the benchmark. I just lied to you because I wanted to see you waste hours of your life typing out a wall of text for someone who doesn't care what you think. Also I still think you have no idea what you're talking about.

I was trying out of my heart to help out someone who didn't know anything. Looks like you took my altruism and threw it on the ground and stepped on it.

>If I was doing network I/O then there's a chance that threads would be faster... but you do realize that would also be multiprocessing on separate physical CPUs right?

Your HDD has its own processor inside it that handles reads and writes. Your CPU only sends it commands. So it's the same thing either way. It's just that servers are designed to be hit by 200000 simultaneous requests; HDDs are not.

>That's how async works. It's really just an illusion of single-thread paralellism.

Except IO messages travel down the wire in parallel. This is what you need to get through your head: for compute, time is divided among threads in a single core, but not for IO.

>Anyways, the original scenario didn't require any network services and hashing each data chunk and writing to ramdisk really was instant. My script originally was supposed to enable integrity checks and deduplication of files on separate hard drives. There's no point turning it into a webapp just to shoehorn async in there.

I didn't tell you to turn it into a webapp. I told you that your test didn't make sense and I told you to run a completely different test. I didn't tell you to shoehorn your app. If you're using sockets the API for interfacing with IO is exactly the same whether it's web OR an HDD. No shoehorning.

HDD still can benefit from async.

joshgev
No, it isn't outdated. The claims are still issues. In the scenario you describe, multiprocessing should work pretty well because each process can run independently of the others and report one big result at the end (all the hashes that process computed). There are plenty of scenarios, however, where the different processes do have to talk to each other frequently, which, in Python, means you introduce a ton of overhead in serializing and deserializing the data (Python transmits pickled data between processes). From personal experience I can tell you that this can be a major problem in terms of performance.

Other languages give you more flexibility in how to share data between processes.

The fundamental claim that threads running pure Python are limited by the GIL still stands. Others do point out that you can get around this in C and some of the standard libraries in Python (like hashlib) do this for you. That goes a long way to helping the issue, of course, but as yet others point out, it is weird to support Python in this context by saying "Yes Python can use threads effectively; all you have to do is use C."
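The serialization overhead mentioned above can be seen in isolation (the payload here is arbitrary): every argument and result that crosses a process boundary pays a pickle round trip like this one.

```python
import pickle
import time

def pickle_round_trip(obj):
    # multiprocessing ships arguments and results between processes as
    # pickled bytes; this measures just that serialize/deserialize leg
    start = time.perf_counter()
    restored = pickle.loads(pickle.dumps(obj))
    return restored, time.perf_counter() - start

if __name__ == "__main__":
    payload = list(range(1_000_000))
    restored, cost = pickle_round_trip(payload)
    assert restored == payload
    print(f"pickle round trip: {cost * 1000:.1f} ms")
```
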

theelous3
It's a python conference demonstrating achieving goals in python. Basically saying "eh use C for cpu bound work" isn't helpful.

I think you missed the point a bit, which was purely to show co-operative yielding. The fact that some libs use C and release the GIL really doesn't matter. This is another way. The real-world use case, as with basically all python, is IO.

quietbritishjim
I'm not saying write your whole program in C. I'm saying, at worst, write a tiny corner of it in C so that you don't have to restructure the rest of your application - overall it would be simpler, even for most mostly-Python devs. Or, much more likely, use a library like numpy, which you were probably going to do anyway. I regularly use numpy and don't find myself thinking "I wish I didn't have to write all this C code...". (I must admit, an unfortunate consequence of this is that I've interviewed a few junior candidates who were so used to writing vectorised code that they seemed to be terrified of using a for loop!)

I admit I didn't watch the whole of the talk. (Personally I much prefer learning from text rather than video, and I was put off by the GIL bit anyway.) From the parts I saw, it seemed like the GIL issue was critical for motivating everything he did afterwards, but perhaps there were other reasons that became clear later. In that case, he could have avoided mentioning the GIL altogether (given that he ended up being so misleading about it).

> The real world usecase, as with basically all python, is io.

That is just straight up not true. Yes, IO is a valid use case of Python but it definitely isn't "basically all Python". I'm sure the vast majority of data scientists use Python, and for most of them their workloads are almost entirely computational (using deep learning libraries with C/C++/CUDA backends).

toyg
I think we have to consider that this talk is from 2015, when data science was a much smaller part of the Python ecosystem. I have used Python since 2003 for various things and never once have I touched numpy, but I bet today 1 in 3 Python devs probably use it every day. I understand your point of view and I understand the speaker's own too - his is just more old-school.

We’ve been able to sidestep the GIL in many ways since forever (for example with alternative implementations like Jython, IronPython, PyPy etc), and I think everyone has known that for a long time, but here Dave just wanted to show what the actual problem is, beyond the buzzwords (it had become a bit of a myth in the early ‘10s, when Go suddenly exploded in the sysadmin/backend niche partially because “it can do parallelism better”).

quietbritishjim
About the fact this talk is from 2015: The GIL is released by many libraries that aren't about data science, so was and is relevant to many other types of application. In the Python standard library, for example, zipfile (which can operate on files or in-memory buffers) and hashlib come to mind. As I mentioned in another comment, hashlib calls have released the GIL at least as far back as 2009, well before this talk, and releasing the GIL was possible all the way back in 1998. Many (most?) Python libraries are implemented in C, and almost all of those that are will release the GIL.
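
As a concrete sketch of that point (hashlib's C code drops the GIL for large-enough buffers, so a thread pool genuinely parallelises the hashing):

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

blobs = [bytes([i]) * 10_000_000 for i in range(4)]  # four 10 MB buffers

# Sequential baseline.
seq = [hashlib.sha256(b).hexdigest() for b in blobs]

# Threaded: sha256's C implementation releases the GIL while hashing big
# buffers, so these four digests can be computed on separate cores at once.
with ThreadPoolExecutor(max_workers=4) as ex:
    par = list(ex.map(lambda b: hashlib.sha256(b).hexdigest(), blobs))

assert seq == par  # identical digests either way
```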

About appreciating both points of view: I'm sorry if I'm making it sound like I think Python developers should always use threads and never multiprocessing. I don't believe that. And I don't believe the speaker should have reformulated their whole talk to use threads instead of processes. Just that it's such a pervasive myth that the GIL prevents all thread-based parallelism that they should have been careful to avoid reinforcing that. The speaker showed an example where the GIL actually did prevent parallelism; all I wanted was for them to add verbally that, by the way, if you were calling almost any CPU-bound library function rather than one you wrote yourself then this wouldn't be a problem.

NiceWayToDoIT
What is mind-boggling to me is how many of you out there can talk and code in parallel at the same time, while the text you're typing is completely different from the things you're saying?

Maybe subject for another post ...

thomasjudge
David Beazley and Bryan Cantrill are two of my favorite tech speakers. Suggestions for others?
hexa00
I work in an ML ecosystem ATM and concurrency is a major problem in Python:

  - Threads can't be used efficiently because of the GIL

  - multiprocessing has to serialize everything through a single thread, often killing performance (unless you use shared-memory techniques, but that's less than ideal compared to threads)

  - You can't use multiprocessing while inside a multiprocessing executor. This makes building things on top of frameworks/libs that use multiprocessing a nightmare... e.g. try to run a web server over something like Keras...

Those are the top reasons I don't like Python, but if you've got an appetite for more:

  - The dependency ecosystem is a pita: between Python versions, package versions pinned or unpinned, requirements.txt, pipenv, poetry, conda... pick one and you're still sure to run into issues with other tools needing one system or another, or packages working a bit differently in conda etc... (I use poetry, with conda or pyenv)

  - The culture of let's-write-code-easily is good to start with, but it becomes a problem as people, especially maybe in DS, don't go further than that... and you end up with bad practices all over the place, un-testable code (the test systems are also a pain to navigate), copy & pasted blobs etc... Reading the code of some major libraries doesn't inspire confidence, especially compared to the likes of Java, C++, Go...
And one last note: I've seen way better Emacs setups for Python and better presentations; it's OK as it is, but I would not call it the Jimi Hendrix of Python like a comment said...

I wish ML/DS would switch to Julia

tempest_
In 2020 when core counts are going up and up I reach for Elixir where I might have used python in the past for these reasons.
shepardrtc
I've been building a program that heavily uses multiprocessing for the past few months. It works quite well, but it did take me a little bit to figure out the best way to work with it.

> - Threads can't be used efficiently because of the GIL

Python's "threads" are actually fibers. Once you shift your thought process toward that, then it's easy enough to work with them. Async is a better solution, though, because "threads" aren't smart when switching between themselves. Async makes concurrency smart.

But if you want to use real threads, multiprocessing's "processes" are actually system threads.

> - multiprocesses has to serialize everything in a single thread often killing performance. (Unless you use shared memory space techniques, but that's less than ideal compared to threads)

I'm not quite sure what you mean. Multiprocessing's processes have their own GIL and are "single-threaded", but you can still spawn fibers and more processes from them, as well as use async.

Or are you talking about using the Manager and namespaces to communicate between processes? That is a little slow, yes. High speed code should probably use something else. Most programs will be fine with it, but it is way slower than rolling your own solution. However, it does work easily, so that's something to be said about it. Shared memory space techniques do work, too, but they are a little obtuse. Personally, I rolled my own data structures using the multiprocessing primitives. You have to set them up ahead of time, but they're insanely fast. Or you can use redis pubsub for IPC. Or write to a memory-mapped file.

> - You can't use multiprocess while inside a multiprocess executor. This makes building things on top of frameworks/libs that use multiprocess a nightmare... e.g try to use a web server like over something like Keras...

I'm not sure what you mean. Multiprocessing simply spawns other Python processes. You can spawn processes from processes, so I don't know why you would have issues. Perhaps communication is an issue?

> - The dependency ecosystem is a pita

Yes, absolutely.

Geminidog
No man, Python threads are not fibers. This is factually wrong. Please read: https://wiki.python.org/moin/GlobalInterpreterLock
zb
> Python's "threads" are actually fibers.

They’re actually not. They are native threads with high lock contention.

Async is arguably fibers, as are greenthreads in libraries like gevent or eventlet.

> But if you want to use real threads, multiprocessing's "processes" are actually system threads.

They’re system threads running in separate memory spaces. Also known as… processes.

shepardrtc
You're right. To me they just feel like fibers because they can't run in parallel.
helgie
If you use numba (or cython, c extensions, etc) you can make them run without requiring that they hold the GIL, and they can run in parallel. Here's an example that should keep a CPU pegged at 100% utilization for a while:

  import numba as nb
  from concurrent.futures import ThreadPoolExecutor
  from multiprocessing import cpu_count

  @nb.jit(nogil=True)
  def slow_calculation(x):
      out = 0
      for i in range(x):
          out += i**0.01
      return out

  ex = ThreadPoolExecutor(max_workers=cpu_count())
  futures = [ex.submit(slow_calculation, 100_000_000_000+i) for i in range(cpu_count())]
shepardrtc
I had no idea that existed, thank you!
shepardrtc
> and they can run in parallel.

Even without requiring the GIL, these are still child threads of the main process, correct? And because of that, wouldn't the OS keep them all on the same core? And if that's the case, would ProcessPoolExecutor solve that problem?

Jugurtha
Could you give examples of where exactly in the ML process/lifecycle you're hitting these issues?

For example: "When training a [type] model with X characteristics, the GIL causes Y, which makes it impossible to do Z".

We're building our machine learning platform[0] to solve problems we have faced shipping ML products to enterprise, and are interested in your problems as well.

For example, we've faced the environment/dependencies/"runs on my machine" problems and have addressed these with Docker images. Our users can spin up a notebook server with near real-time collaboration to work with others, and no setup because the environment is there.

The same with training jobs: they can click on a button and schedule a long-running notebook that runs against a specific environment to avoid "just yesterday I had X accuracy on my machine". The runs are tracked, the models, parameters, and metrics are automatically tracked because if we rely on a notebook author to do it, they might forget or have to context switch and it's an added cognitive load.

Some problems we faced were during deployment, too, where a "data scientist" writes a notebook to train a model and then we had to deploy that model reading their notebook or looking into dependencies. Now they can click on a button and deploy whichever model they want. It really was hindering us because they were asking someone else's help, who may have been working on something else.

- [0]: https://iko.ai

Try curio or trio in python :)

Trio especially with its requirement of an owner for each task makes the thinking of it all entirely sequential and clear. In curio it's optional.

See the rationale here: https://vorpus.org/blog/notes-on-structured-concurrency-or-g...

On top of that, I must recommend dabeaz's concurrency from the ground up talk, where he explains it all in primitive terms and lives codes a wee co-operative yielding thing on the fly:

https://youtu.be/MCs5OvhV9S4

crubier
Curio is really good and underrated
theelous3
Yep, absolutely exceptional library.

Given its existence, I honestly don't understand why asyncio is in use by anyone. It should be removed from stdlib as a mistake, and efforts around the curio/trio ecosystem should be redoubled and brought to the mainstream. It's clear now, all this time after async/await was brought in, that the thinking which led to asyncio was erroneous.

I'm the author of a sort-of-popular curio/trio lib. I used to maintain a compatibility layer lib that allowed code to be written to use either one, just for my project. That torch has been taken up by anyio, allowing anyone who's writing something for either curio or trio to kill two birds with one stone and contribute to both sides at once. It's great.

Jun 20, 2020 · 6 points, 0 comments · submitted by guiambros
> > Python's multiprocessing library is needed to overcome the GIL

> No it's not, just use threads.

I just wanted to expand on this a little to describe some of the downsides to threads in Python.

Multi-threaded logic can be (and often is) slower than single-threaded logic because threading introduces overhead of lock contention and context switching. David Beazley did a talk illustrating this in 2010:

https://www.youtube.com/watch?v=Obt-vMVdM8s

He also did a great talk about coroutines in 2015 where he explores threading and coroutines a bit more:

https://www.youtube.com/watch?v=MCs5OvhV9S4&t=525s

In workloads that are often "blocked", like network calls or other I/O-bound workloads, threads can provide similar benefits to coroutines but with overhead. Coroutines seek to provide the same benefit without as much overhead (no lock contention, fewer context switches by the kernel).

It's probably not the right guidelines for everyone but I generally use these when thinking about concurrency (and pseudo-concurrency) in Python:

- Coroutines where I can.

- Multi-processing where I need real concurrency.

- Never threads.

quietbritishjim
Ah ha! Now we have finally reached the beginning of the conversation :-)

The point is, many people think (including you judging by your comment, and certainly including me up until now but now I'm just confused) that in Python asyncio is better than using multiple threads with blocking IO. The point of the article is to dispel that belief. There seems to be some debate about whether the article is really representative, and I'm very curious about that. But then the parent comment to mine took us on an unproductive detour based on the misconception that Python threads don't work at all. Now your comment has brought up that original belief again, but you haven't referenced the article at all.

gshulegaard
I didn't reference the article because I provided more detailed references which explore the difference between threads and coroutines in Python to a much greater depth.

The point of my comment is to say that neither threads nor coroutines will make Python _faster_ in and of themselves. Quite the opposite, in fact: threading adds overhead, so unless the benefit is greater than the overhead (e.g. lock contention and context switching) your code will actually be net slower.

I can't recommend the videos I shared enough, David Beazley is a great presenter. One of the few people who can do talks centered around live coding that keep me engaged throughout.

> The point is, many people think (including you judging by your comment, and certainly including me up until now but now I'm just confused) that in Python asyncio is better than using multiple threads with blocking IO. The point of the article is to dispel that belief.

The disconnect here is that this article isn't claiming that asyncio is not faster than threads. In fact the article only claims that asyncio is not a silver bullet guaranteed to increase the performance of any Python logic. The misconception it is trying to clear up, in its own words, is:

> Sadly async is not go-faster-stripes for the Python interpreter.

What I, and many others are questioning is:

A) Is this actually as widespread a belief as the article claims it to be? None of the results are surprising to me (or apparently some others).

B) Is the article accurate in its analysis and conclusion?

As an example, take this paragraph:

> Why is this? In async Python, the multi-threading is co-operative, which simply means that threads are not interrupted by a central governor (such as the kernel) but instead have to voluntarily yield their execution time to others. In asyncio, the execution is yielded upon three language keywords: await, async for and async with.

This is a really confusing paragraph because it seems to mix terminology. A short list of problems in this quote alone:

- Async Python != multi-threading.

- Multi-threading is not co-operatively scheduled; threads are indeed interrupted by the kernel (context switches between threads in Python do actually happen).

- Asyncio is co-operatively scheduled and pieces of logic have to yield to allow other logic to proceed. This is a key difference between Asyncio (coroutines) and multi-threading (threads).

- Asynchronous Python can be implemented using coroutines, multi-threading, or multi-processing; it's a common noun but the quote uses it as a proper noun leaving us guessing what the author intended to refer to.

Additionally, there are concepts and interactions which are missing from the article such as the GIL's scheduling behavior. In the second video I shared, David Beazley actually shows how the GIL gives compute intensive tasks higher priority which is the opposite of typical scheduling priorities (e.g. kernel scheduling) which leads to adverse latency behavior.
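
That latency effect is easy to reproduce (a rough sketch; the exact numbers vary by machine and interpreter version):

```python
import threading
import time

def spin(stop):
    # Pure-Python busy loop: holds the GIL except at forced switch points.
    while not stop.is_set():
        pass

def timed_wakeup():
    # Time a trivial "yield to the OS, then re-acquire the GIL" round trip.
    t0 = time.perf_counter()
    time.sleep(0)
    return time.perf_counter() - t0

baseline = timed_wakeup()        # no contention

stop = threading.Event()
cpu_hog = threading.Thread(target=spin, args=(stop,))
cpu_hog.start()
contended = timed_wakeup()       # must wrestle the GIL back from the hog
stop.set()
cpu_hog.join()

# `contended` is typically much larger than `baseline`: the waking thread
# has to sit out the CPU-bound thread's GIL slice (~5 ms by default).
```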

So looking at the article as a whole, I don't think the underlying intent of the article is wrong, but the reasoning and analysis presented is at best misguided. Asyncio is not a performance silver bullet, it's not even real concurrency. Multi-processing and use of C extensions is the bigger bang for the buck when it comes to performance. But none of this is surprising and is expected if you really think about the underlying interactions.

To rephrase what you think I thought:

> The point is, many people think (including you judging by your comment, and certainly including me up until now but now I'm just confused) that in Python asyncio is better than using multiple threads with blocking IO.

Is actually more like:

> Asyncio is more efficient than multi-threading in Python. It is also comparatively more variable than multi-processing, particularly when dealing with workloads that saturate a single event loop. Neither multi-threading nor asyncio is actually parallel in Python; for that you have to use multi-processing to escape the GIL (or some C extension which you trust to safely execute outside of GIL control).

---

Regarding your aside example, it's true some C extensions can escape the GIL, but oftentimes it's with caveats and careful consideration of where/when you can escape the GIL successfully. Take for example this scipy cookbook regarding parallelization:

https://scipy-cookbook.readthedocs.io/items/ParallelProgramm...

It's not often the case that using a C extension will give you truly concurrent multi-threading without significant and careful code refactoring.

camgunz
For single processes you’re right, but this article (and a lot of the activity around asyncio in Python) is about backend webdev, where you’re already running multiple app servers. In this context, asyncio is almost always slower.
May 28, 2020 · 2 points, 0 comments · submitted by elteto
I was at PyCon when Beazley programmed a set of raw-socket server/client scripts running Fibonacci sequences LIVE from scratch in front of 1000 people using nothing but a text editor. The man is truly a mad genius.

https://youtu.be/MCs5OvhV9S4

Apr 30, 2019 · gshulegaard on Python at Netflix
> I fail to understand the use of python in a distributed environment while the language has such poor concurrency support

Because it's a distributed environment probably is exactly why. Python has (arguably) great concurrency support apart from Multi-threading.

https://www.youtube.com/watch?v=MCs5OvhV9S4

So if you need concurrency in the context of a single thread, then Python's GIL is a non-starter. But a distributed environment is not likely one of those.

Edit: I should amend concurrency in a single thread to: concurrency in a single thread that is compute gated... since coroutines can give you pseudo-concurrency in a single thread provided your workload has blocking steps like IO or TCP calls.

pletnes
If you’re doing (data analysis|simulations|Image processing) you can offload computation to numpy, which releases the GIL. This allows nice multicore speedups with python and threading.

The same holds for various CPU intensive standard library functions implemented in C.

The GIL issue is real, but posts like this one confused me for years. Please, don’t exaggerate GIL issues.
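
The same pattern with a standard library function, since zlib's compressor also releases the GIL while it runs (a sketch):

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

chunks = [bytes([i]) * 5_000_000 for i in range(4)]

# zlib.compress is C code that releases the GIL while it runs, so the
# four compressions below can occupy four cores simultaneously.
with ThreadPoolExecutor(max_workers=4) as ex:
    compressed = list(ex.map(zlib.compress, chunks))

restored = [zlib.decompress(c) for c in compressed]
assert restored == chunks  # lossless round trip
```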

weberc2
Not everything is amenable to numpy and it's pretty easy to make performance worse by throwing numpy at every problem. For example, if your array contains Python objects that are part of an operation, you've likely just introduced a significant performance regression. Worse, there's no way to detect these regressions except to have performance tests. Please, don't understate GIL issues.
xapata
Why would you make a NumPy array of objects? Use a list until it makes sense to create an array.
weberc2
Because you’re using Numpy via an intermediate library like pandas and you have composite data in one column?
xapata
Ah. Pandas is the problem. Unfortunately, you need to understand how its features are implemented to use it well. Still, my main trouble with Pandas is unnecessary memory bloat, not compute inefficiency.

That caveat is somewhat true for all programming abstractions, but well-designed interfaces make the more efficient techniques more obvious and beautiful, while the inefficient or risky techniques are made esoteric and ugly.

weberc2
Pandas isn't the problem; the problem is assuming that "$LIBRARY releases the GIL so things will be fast!". It's a penny-wise, pound-foolish approach to performance. Someone will write a function assuming the user is only going to pass in a list of ints and someone else will extend that function to take a list of Tuple[str, int] or something, and all of a sudden your program has a difficult-to-debug performance regression.

In general, the "just rewrite the slow parts in C!" motto is terrible advice because it's unlikely that it will actually make your code appreciably faster, and if it does, the gain is very likely to be defeated unexpectedly as soon as requirements change. Using FFI to make things faster can work, but only if you've really considered your problem and you're quite sure you can safely predict relevant changes to requirements.
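
The object-dtype trap is easy to demonstrate (a sketch, assuming numpy is installed):

```python
import numpy as np

ints = np.arange(1_000_000, dtype=np.int64)  # unboxed: fast vectorised C loop
objs = ints.astype(object)                   # boxed Python ints, same values

# Both sums give the same answer, but objs.sum() calls Python-level
# arithmetic a million times while holding the GIL: typically orders of
# magnitude slower, and nothing in the types warns you about the switch.
assert ints.sum() == objs.sum()
```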

xapata
> Someone will write a function assuming the user is only going to pass in a list of ints and someone else will extend that function to take a list of Tuple[str, int]

There are plenty of pitfalls in leaky abstractions. Establishing that fast numeric calculations only work with specific numeric types seems to help.

One thing you seem to be encountering, that I've seen a few times, is that people don't realize NumPy and core Python are almost orthogonal. The best practices for each are nearly opposite. I try to make it clear when I'm switching from one to the other by explaining the performance optimization (broadly) in comments.

Regardless, any function that receives a ``list`` of ints will need to convert to an ndarray if it wants NumPy speed. If the function interface is modified later, I think it's fair to expect the editor to understand why.

weberc2
> There are plenty of pitfalls in leaky abstractions

Sure, but this is a _massive_ pitfall. It's an optimization that can trivially make your code slower than the naive Python implementation, all due to a leaky abstraction.

> Regardless, any function that receives a ``list`` of ints will need to convert to an ndarray if it wants NumPy speed. If the function interface is modified later, I think it's fair to expect the editor to understand why.

Yeah, that was a toy example. In practice, the scenario was similar except the function called to a third party library that used Numpy under the hood. We introduced a ton of complexity to use this third party library on the grounds that "it will make things fast" instead of the naive list implementation, and the very next sprint we needed to update it such that it became 10X slower than the naive Python implementation.

That's the starkest example, but there have been others and there would have been many more if we didn't have the stark example to point to.

The current slogan is "just use Python; you can always make things fast with Numpy/native code!", but it should be "use Python if you have a deep understanding of how Numpy (or whatever native library you're using) makes things fast such that you can count on that invariant to hold even as your requirements change" or some such.

xapata
I have mixed feelings about your conclusion. On one hand I don't want to discourage newbies from using Python. On the other, I enjoy that my expertise is valuable.

It seems reasonable that different parts of the code are appropriate for modification by engineers of differing skills.

weberc2
Even if your app is IO bound, Python's concurrency is painful. Because it's not statically typed, it's too easy to forget an `await` (causing your program to get a Promise[Foo] when you meant to get a Foo) or to overburden your event loop, and such things are difficult to debug (we've had several production outages because of this class of bugs). Never mind the papercuts that come about from dealing with the sync/async dichotomy.
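
The forgotten-await failure mode in miniature (a sketch; `fetch` is a stand-in for real awaitable work):

```python
import asyncio

async def fetch():
    return 42  # stands in for some real awaitable work

async def main():
    oops = fetch()        # forgot `await`: `oops` is a coroutine object
    good = await fetch()  # what was actually intended
    is_coro = asyncio.iscoroutine(oops)
    oops.close()          # silence the "was never awaited" RuntimeWarning
    return is_coro, good

is_coro, good = asyncio.run(main())
assert is_coro and good == 42  # the bug only shows up at runtime
```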
Too
The missing await is a very common fault indeed; they should have used another keyword like 'not_await' for that scenario to make the decision explicit. PyCharm at least will warn you if you call an awaitable without 'await' and without assigning it to a variable. If you assign it to a variable and pass it into another function that doesn't expect an awaitable, it's up to you to have added sufficient type annotations and to run your code through some static checker like mypy. Running Python at scale without mypy is kind of doomed to fail to begin with.
y3sh
And that even with async you're still bound by the GIL
jnwatson
Huh? With async, there's typically only ever 1 thread running, so there's no contention for the GIL.
jnwatson
Both problems have built-in debug solutions in recent versions of python. The event loop will literally print out all the un-awaited coroutines when it exits, and you can enable debug on the event loop and have it print out every time a coroutine takes longer than a configurable amount of time.
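
Roughly what enabling those debug facilities looks like (a sketch; the slow-callback warning goes to the asyncio logger):

```python
import asyncio

async def main():
    loop = asyncio.get_running_loop()
    loop.slow_callback_duration = 0.01  # flag anything hogging the loop > 10 ms
    # A blocking step like this gets reported by the asyncio logger in
    # debug mode as "Executing <Task ...> took ... seconds".
    sum(i * i for i in range(2_000_000))
    await asyncio.sleep(0)
    return loop.get_debug()

debug_on = asyncio.run(main(), debug=True)  # `debug` kwarg: Python 3.7+
```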
weberc2
> The event loop will literally print out all the un-awaited coroutines when it exits

IIRC, I've only ever seen "unawaited coroutine found" (or similar) errors; I've never seen anything that points to a specific unawaited coroutine. In either case, a bug in prod is still many times worse than compile time type error.

> you can enable debug on the event loop and have it print out every time a coroutine takes longer than a configurable amount of time

I don't run my production servers in debug mode, and even when I do manage to find the problem, I have limited options for solving it. Usually it amounts to refactoring out the offending code into a separate process or service.

An extreme counterpoint is a language like Go which

1) Is roughly 100X faster in single-threaded, CPU-bound execution anyway

2) Allows for additional optimizations that simply aren't possible in Python (mostly involving reduced allocations and improved cache coherence)

3) Has a runtime that balances CPU load across all available cores

This isn't a "shit on Python" post; only that concurrency really isn't Python's strong suit (yet).

gvd
"(yet)", but then it's still a dynamic language (no typing; not dynamic as in hip). The reasons for Python are mainly:

- it's easy to learn
- we've got numpy

If I were to pick a language to build significant infrastructure with, I wouldn't choose python.

mixmastamyk
These are not really an issue in vfx production and other things Python is used for.
weberc2
It’s a problem for lots of things Python is used for, but maybe not vfx (whatever that is).
mixmastamyk
They are using it for things it’s good at, for others they use java. So this subthread is largely a waste of time.
weberc2
Who is “they”? What is your point?
mixmastamyk
They is Netflix, and other post-production oriented users. You know, what this article and discussion is about?
weberc2
It's not at all obvious that "they" refers to "netflix and other post-production oriented users", and your argument is a tautology "Python is good at the things that Python is good at". Obviously. The rest of us are debating what those things are or are not.
mixmastamyk
The subject is well-trodden; there's not much to debate. Python is not good at threading, but works well in multiprocessing situations. Netflix is using it in the latter situation, and not the former. Async is unlikely to be a use case either.
weberc2
> The subject is well-trodden, there's not much to debate

And yet we see the same incorrect information trotted out over and over again.

wongarsu
In fact it's precisely Python's deficiency in multithreading that led to it having one of the best ecosystems for every other form of concurrency, like green threads and multiprocess applications.
David Beazley: Python concurrency from the ground up https://www.youtube.com/watch?v=MCs5OvhV9S4
type0
Oh, Dabeaz has so many great talks. He has one of those teaching styles that can contagiously convince even the most Python dismissive person to start learning it.
agumonkey
Casually live coding any idiom even if the language didn't really support it. Definitely worth your time.
You can see some of David Beazley's presentations on YouTube. He presents regularly at PyCon and other conferences. He's also the author of Python Essential Reference and the editor of Python Cookbook, 3rd Edition.

One of my favorite presentations of his that gives a flavor of his instruction style is "Python Concurrency from the Ground Up" at PyCon 2015:

https://www.youtube.com/watch?v=MCs5OvhV9S4

I saw this one live, but I think the recording is good quality. He live-codes a highly concurrent, generator-based Python 3 TCP server that does Fibonacci calculations, if I recall correctly.

One of the YouTube comments gets it right, it's "The Jimi Hendrix of Python".

The prior course I took with him was called "Writing a Compiler (in Python)". I think I took it the first time it was offered, in 2012. In that course, we implemented a programming language that was a (very small) subset of GoLang, by building the entire parser/compiler in Python. So, in 5 days, you learned all about compiler theory, and wrote your own compiler (in Python) for this GoLang-style language. The last day, we peeled away the layers of the onion for how PyPy is implemented under the hood, using our newfound compiler toolchain knowledge. It was really fun. He has a very engaging presentation style and a fascination with computing that is infectious. You leave the course energized and happy to be a programmer.

bogomipz
Interesting, thanks for sharing. I would love to hear feedback on how the SICP course is. It looks like his classes are in person only which is unfortunate for those not in the US I guess.
almata
Thanks for sharing this video. I had no idea who he was.
ska

   "So, in 5 days, you learned all about compiler theory, ..."
You did not. Not to detract from the course - I have no opinion on that and you seem to have got a lot out of it and sounds like fun. But in 5 days no instructor or course is going to give you anything more than a fairly superficial understanding of something like compiler theory.

It sounds a lot like a typical module from one of the more rigorous introductory undergraduate courses on compilers, that's good stuff!

May 21, 2018 · 1 points, 0 comments · submitted by amzans
Nov 28, 2016 · 7 points, 0 comments · submitted by m_mueller
All of them.

Very little infrastructure is a perfect fit, and you often find yourself fighting with it to do something you need to be able to do.

Sometimes a new approach, sometimes not.

Here's a few examples:

* Python's GIL : https://www.youtube.com/watch?v=MCs5OvhV9S4

* Node's processes : https://www.youtube.com/watch?v=9o8B3L0-d9c

* PHP's Super Globals : https://www.youtube.com/watch?v=QZQ_V_ZJUjA

For more anecdotal information: ARM is my bane.

Though the ARM processor is cool, and lets me use whatever toolchain I want... Instruction sets in the wild can be:

* Incomplete

* Inconsistent

* "Extended" with unexpected, and often undocumented, side-effects

appleflaxen
I've heard of the other issues, but the ARM example was great; thanks!
I'm a bit of a David Beazley fanboy and Python lover. I've watched all of his keynotes and lectures at this point and I have yet to find one that wasn't incredibly informative. You can watch the video for whatever the main topic is about, but finish the talk having picked up a wealth of other bits of useful information.

In addition, I have incredible amounts of respect for people that are willing (and capable) to live code what they're teaching. For one of the best examples of how to effectively live code, look no further than when he implemented a concurrent system from scratch at PyCon 2015: https://www.youtube.com/watch?v=MCs5OvhV9S4

Channel (with some of his videos): https://www.youtube.com/user/dabeazllc

Ologn
I like live coding as well. For example: https://www.twitch.tv/notch/v/38122203 . Notch of Minecraft fame coding a new game from scratch over a period of two days. You get to see what tools they use, what their thought process is and so forth. It's especially good if they know what they're doing, as you can learn when watching it.
Aside from a lot of the classics here, one that stands out is this AMAZING live demo at pycon by David Beazley:

https://www.youtube.com/watch?v=MCs5OvhV9S4

The simple and followable progression to more and more complex ideas blows my mind every time.

Why are you advising him to run before he can crawl? All of those paradigms are built on top of the 1970s-era threading model as used by the JVM (as well as every other language, behind the scenes).

How can you understand these tools thoroughly if you don't understand the foundation that they're built on?

Yes, they probably have been used in services with billions of users, but I guarantee those engineers understood the basics of threading just as well as, if not better than, the abstractions built on top of them.

OP: This video is pretty good at explaining some core concepts: https://www.youtube.com/watch?v=MCs5OvhV9S4
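For the OP, the "basics of threading" in question can be sketched in a few lines of Python (a toy example of my own, not code from the video): several threads increment a shared counter, and a lock makes each read-modify-write atomic.

```python
import threading

counter = 0
lock = threading.Lock()

def worker(n):
    global counter
    for _ in range(n):
        # counter += 1 is a read-modify-write; without the lock,
        # concurrent threads can interleave and lose updates
        with lock:
            counter += 1

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 (deterministic only because of the lock)
```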

Jul 09, 2016 · 2 points, 1 comments · submitted by bakery2k
bakery2k
Does anyone know the details behind what happens to the thread-based server at 11:30? The response time for small requests "drops off a cliff" due to a single large request, apparently due to "the implementation of the GIL"?

Compared to the simple thread-based server, the generator-based server presented in the second half of the talk avoids the above problem. Does the generator-based approach have any other advantages?
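For context, the thread-based server in the talk looks roughly like this (reconstructed from memory of the video, so details may differ). The "cliff" happens because fib is pure CPU-bound Python: a handler thread computing a big fib(n) holds the GIL for long stretches, and the threads serving tiny requests only get scheduled back in at GIL switch intervals, so their latency balloons.

```python
import socket
from threading import Thread

def fib(n):
    # deliberately slow, CPU-bound work
    return 1 if n <= 2 else fib(n - 1) + fib(n - 2)

def fib_server(address):
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, True)
    sock.bind(address)
    sock.listen(5)
    while True:
        client, addr = sock.accept()
        # one thread per connection: a single large request monopolizes
        # the GIL and starves the threads serving small requests
        Thread(target=fib_handler, args=(client,), daemon=True).start()

def fib_handler(client):
    while True:
        req = client.recv(100)
        if not req:
            break
        result = fib(int(req))
        client.send(str(result).encode('ascii') + b'\n')
    client.close()
```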

Also note that David Beazley (google him if you're not aware) has a competitor to asyncio (and arguably a direct competitor to this as well) called curio:

http://curio.readthedocs.io/en/latest/

The caveat is that it uses the new async/await coroutine bits that just landed in Python 3.5, so it only works with Python 3.5+. He also gave a talk on concurrency in python recently at last year's PyCon:

https://www.youtube.com/watch?v=MCs5OvhV9S4

1st1
There are benchmarks of curio in the blog post, FWIW.
SEJeff
Good stuff. I saw him RT your announcement of this on twitter.
cderwin
Unless I'm mistaken, the decorator @asyncio.coroutine is equivalent to async def, and yield from is functionally a drop-in for await, so you should be able to use it with at least 3.4, maybe 3.3. Not that that's much better, though.
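The equivalence is easy to show side by side. A minimal sketch (note that asyncio.coroutine was deprecated in 3.8 and removed in 3.11, so the old spelling appears only in a comment, and asyncio.run requires 3.7+):

```python
import asyncio

# Python 3.5+ spelling
async def greet(name):
    await asyncio.sleep(0)  # stand-in for real async I/O
    return 'hello ' + name

# The pre-3.5 spelling of the same coroutine:
#
#   @asyncio.coroutine
#   def greet(name):
#       result = yield from asyncio.sleep(0)
#       return 'hello ' + name

print(asyncio.run(greet('world')))  # hello world
```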
masklinn
That makes it functionally equivalent to uvloop (which from my understanding is a drop-in replacement for the built-in asyncio eventloop).
I'm not sure I always agree with this. David Beazley's 2015 PyCon talk on concurrency (https://www.youtube.com/watch?v=MCs5OvhV9S4) was one of my favorite talks of the conference, and it was almost all just live coding.

Part of what made that talk compelling was that it took a concept that lots of people find complex/intimidating (how the internals of an asynchronous IO library work) and in ~30 minutes created a full working example in front of a live audience. Writing the code live in front of the audience helps to nail down the central theme of "this stuff isn't actually as scary as it looks".

There are certainly talks that would be better off just presenting snippets of code, but I think there's a time and a place for live coding examples as well.
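The core trick the talk builds up to can be sketched in a dozen lines (a simplified toy of my own, not Beazley's actual code): tasks are plain generators, and a deque plus a loop plays the role of the event loop.

```python
from collections import deque

def run_all(tasks):
    """Round-robin scheduler: each task is a generator that yields
    whenever it is willing to let another task run."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the task up to its next yield
        except StopIteration:
            continue            # task finished; drop it
        ready.append(task)      # otherwise, reschedule it

def countdown(label, n, log):
    while n > 0:
        log.append((label, n))
        n -= 1
        yield                   # cooperative "context switch"

log = []
run_all([countdown('a', 3, log), countdown('b', 2, log)])
print(log)  # [('a', 3), ('b', 2), ('a', 2), ('b', 1), ('a', 1)]
```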

notdonspaulding
I came to the comments on this one just to make sure this talk got mentioned as a counterpoint. Fantastic explanation of everything as he went along.

As I recall, he actually took the same conceptual problem and rewrote the solution in a handful of different concurrent styles.

And no, at least in this video I cannot think faster than Dave Beazley can type. By the time I've just about figured out what nuance of concurrency he's showing off in his last example, he's already got his next example typed out!

meej
Similarly, I really enjoyed Raymond Hettinger's 2015 PyCon talk, which also had a fair amount of live coding.

https://www.youtube.com/watch?v=wf-BqAjZb8M

winterismute
Agree with this. In the second year of my undergraduate Operating Systems course, one day (pretty soon after the start) the teacher decided to write a small terminal emulator in C to show us what it really does, just there, in the classroom. It took him two hours of coding, but it really changed my perspective on how things work in a UNIX-based system, and on always checking in depth whether something that sounds almost impossible really is.
inconshreveable
Yep, there are certainly exceptions! It's not a blanket rule. Rants aren't quite as much fun if you equivocate for the 5% case though. =)
rdtsc
> There are certainly talks that would be better off just presenting snippets of code, but I think there's a time and a place for live coding examples as well.

Step 1: be David Beazley. He really is such an engaging speaker, and I think his jokes and lightheartedness might make it look easy, but I don't think it is. Many probably think "I'll be just like David on stage," but they are not.

I have seen nice demos where everything is set up and they just run a command that builds or launches a VM; that's fine. But building code from scratch, watching it compile, dealing with off-by-one errors, or chasing some hidden bug that now everyone is debugging, is usually painful to sit through.

fuzzythinker
Although not a "true" live-coding demo in the usual sense, this talk wouldn't have nearly the same effect if it were done via video instead.

https://www.destroyallsoftware.com/talks/wat

ser_tyrion
Yea I remember that one, it is a great example and awesome talk.

However, as far as presenters go, this guy is a bit of an outlier. He is also a teacher (he offers Python mastery classes in Chicago), so he is more practiced at explaining and working through example code.

n0us
I came here to say exactly this and link to that talk. If you can't code live, then don't; Beazley apparently can. Python lends itself to these kinds of talks because of its brevity.
Bognar
I agree with you, but I'm probably biased because I've used live coding in one of my talks. However, the intent of coding live was similar to the talk you mentioned - it was to show people that what I was trying to accomplish isn't as hard as people think it is. In fact, it's easy enough that I can do it in an hour while explaining out loud what's happening.
jskulski
Thanks for the video, looks interesting.
jMyles
Being in the crowd during this talk was seriously like being at a rock concert.

Beazley was 'playing' the keyboard like an instrument. Every square inch of floor space had someone sitting or standing. The crowd was incredibly invested - nary an eye nor ear wavered. Even Guido looked on with a hawk eye.

I was in a small circle on the floor of people who had just smoked some amazing herb before the talk. I was hanging on his every word and every expression. I've rarely felt so engaged by a conference talk. I'll never forget this one.

He received a raucous standing ovation that is not evident from the conference video.

I asked a question at the end, and I was so giddy I had trouble getting it out. :-)

As a core contributor to an async framework, I felt that this talk gave me a lot more enthusiasm and confidence about my work which has lasted to this day. I think about it often. Definitely a track for the PyCon greatest hits album.

SourPatch
The people who really know what they are doing make the complicated stuff seem dirt simple. I had Dave as the instructor for my undergrad compilers and operating systems courses back in 2000-2001. His lectures then were every bit as enlightening as his PyCon talks today. Those courses were demanding but extremely fun.
dgrant
> I was in a small circle on the floor of people who had just smoked some amazing herb before the talk.

This is a thing at software conferences?

jMyles
I think it's a thing anywhere people gather, no? It's probably more a thing at community conferences than corporate conferences.
astrange
Did you stand in the designated smoking area?
I'm guessing that this project is related to this presentation: https://www.youtube.com/watch?v=MCs5OvhV9S4

From the youtube description: "There are currently three popular approaches to Python concurrency: threads, event loops, and coroutines. Each is shrouded by various degrees of mystery and peril. In this talk, all three approaches will be deconstructed and explained in a epic ground-up live coding battle."

(I may be wrong; I watched it 6 months ago. BTW, it is excellent.)

Jul 11, 2015 · 9 points, 0 comments · submitted by erbdex
Apr 11, 2015 · 3 points, 0 comments · submitted by cing
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.