HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Larry Hastings The Gilectomy How's It Going PyCon 2017

PyCon 2017 · Youtube · 76 HN points · 6 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention PyCon 2017's video "Larry Hastings The Gilectomy How's It Going PyCon 2017".
Youtube Summary
"Speaker: Larry Hastings

One of the most interesting projects in Python today is Larry Hastings' "Gilectomy" project: the removal of Python's Global Interpreter Lock, or "GIL". Come for an up-to-the-minute status report: what's been tried, what has and hasn't worked, and what performance is like now.

Slides can be found at: https://speakerdeck.com/pycon2017 and https://github.com/PyCon/2017-slides"

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
>First off, this is not why Python has a GIL, but let's leave that aside. Atomic writes are more expensive than non-atomic ones, but they are not slow operations in the grand scheme of things. If you properly implement acquire-release semantics, they are not even that slow under high contention. Compared to a GC which literally STOPS ALL THREADS, it's nothing.

This is actually part of why Python still has the GIL. A gilectomy was attempted, and multithreaded atomic refcounting made things a lot slower (and worse as the number of threads increased); even other methods were not sufficient for performance.

https://www.youtube.com/watch?v=pLqv11ScGsQ

There are versions of Python without RC: https://www.python.org/download/alternatives/. There is no doubt that Python would be better off without RC; the problem is that Python extensions rely on RC. So CPython (the main Python implementation) can't just make the switch.

If you want more information about this topic, there is a nice talk by Larry Hastings: https://www.youtube.com/watch?v=pLqv11ScGsQ
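To illustrate the point, CPython's reference counts are directly observable from Python itself; a minimal standard-library sketch of the semantics that C extensions manipulate via Py_INCREF/Py_DECREF:

```python
import sys

# CPython stores a reference count in every object. sys.getrefcount
# reports it, plus one for the temporary reference held by the call itself.
obj = []
print(sys.getrefcount(obj))  # at least 2: the name `obj` + the argument

alias = obj                  # binding another name bumps the count
print(sys.getrefcount(obj))

del alias                    # and unbinding drops it again
print(sys.getrefcount(obj))
```

Because extension modules read and write these counts directly through the C API, swapping refcounting for a tracing GC changes observable behavior they depend on.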

amelius
Thanks. It seems that the JyNI project is trying to build a bridge between CPython and Jython, so from that I take it that it is somehow possible to have extensions which more or less rely on RC (or at least the C API) while using a GC (Java's GC in this case) at the same time.

https://www.jyni.org/

I feel like the GIL is, at this point, Python's most infamous attribute. For a long time I thought it was also the biggest flaw with Python...but over time I care less and less about it.

I think the first thing to realize is that single-threaded performance is often significantly better with the GIL than without it. I think Larry Hastings' first Gilectomy talk was extremely insightful (about the GIL in general and about performance when removing the GIL):

https://youtu.be/P3AyI_u66Bw?t=23m52s

I am not sure I would, personally, trade single-threaded performance for enabling multi-threaded applications. I view Python as a high-level rapid prototyping language that is well suited for business logic and glue code. And for that type of workload I would value single-threaded performance over support for multi-threading.

Even now, a year later, the Gilectomy project is still slightly off performance-wise (although it looks really really close :) ):

https://youtu.be/pLqv11ScGsQ?t=27m32s

As noted elsewhere, multi-processing offers adequate parallelization for this type of logic. Also, coroutines and async libraries such as gevent and asyncio offer easily approachable event loops for maximizing single-threaded resource utilization.
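As a sketch of the multi-processing route (standard library only; the prime-counting workload here is a made-up, deliberately CPU-bound stand-in):

```python
from multiprocessing import Pool

def count_primes(n):
    # Naive trial division -- deliberately CPU-bound.
    count = 0
    for candidate in range(2, n):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    # Each worker is a separate interpreter process with its own GIL,
    # so the four calls genuinely run in parallel on a multicore machine.
    with Pool(processes=4) as pool:
        print(pool.map(count_primes, [20_000] * 4))
```

For the I/O-bound side, asyncio's event loop plays the analogous role within a single process.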

It's true that multi-processing is not a replacement for multi-threading. There definitely are tasks and workloads where multi-processing and its inherent overhead make it unsuitable as a solution. But for those tasks, I question whether or not Python itself (as an interpreted, dynamically typed language) is suitable.

But that's just my $0.02. If there is a way to remove the GIL without negatively impacting single-threaded performance or sacrificing reference counting for a more robust (and heavy) GC, then I am all for it. But if there is not...I would just as soon keep the GIL.

sametmax
Which is why multiple interpreters are a good solution. You keep the GIL and its benefits, but you lose the cost of serialization and can share memory.
VectorLock
The GIL has been a much bigger problem for perception than it ever has been for performance. Python has lost more mindshare over it than anything else. The few machine cycles that were ever saved by moving away from it were far outweighed by the waste of human cycles.
bsder
The few machine cycles that were ever saved by NOT moving away from it (which is the ONLY justification for keeping it) were far outweighed by the waste of human cycles.

If Python would simply suck it up and eat the 20% performance hit, we could stop talking about the GIL and start optimizing code to get the 20% back.

xenadu02
Many projects have solved this problem with dual compilation modes and provide two binaries the user can select from at runtime.

Eliminating the GIL doesn't have to mean actually eliminating it. You could certainly have #defines and/or alternate implementations that make the fine-grained locks no-ops when compiling in GIL mode. Conversely make the GIL a no-op in multithreaded mode.

The URLs:

Jake Vanderplas - Keynote: http://youtu.be/ZyjCqQEUa8o

Static Types for Python: http://youtu.be/7ZbwZgrXnwY

The Gilectomy How's It Going: http://youtu.be/pLqv11ScGsQ

Optimizing Pandas Code: http://youtu.be/HN5d490_KKk

Debugging in Python 3.6: Better, Faster, Stronger: http://youtu.be/NdObDUbLjdg

Instagram Keynote: http://youtu.be/66XoCk79kjM

Python from Space: http://youtu.be/rUUgLsspTZA

Factory Automation with Python: http://youtu.be/cEyVfiix1Lw

Dial M For Mentor: http://youtu.be/Wc1krFb5ifQ

Jun 09, 2017 · gshulegaard on PyPy v5.8 released
> And I know some smart Alec will trot out the usual 'downshift into C' line that everyone (including Guido) use as the final goto solution for performance but that is simply a disgrace in 2017.

Easy gluing of other languages together has long been something I considered a strength...but I suppose to each their own.

> Why can I not choose to write Python and it be fast??

Well there are lots of reasons...including implementation issues, and I don't know them all...but I think Python has a very clear productivity niche. Personally, I am OK with Python trading performance for productivity. For the most part, when Python has been a bottleneck, rewriting a very small part of the logic for performance has solved my use case.

> And yet Python 3 is getting slower. Don't agree?

Yeah, I don't agree...that benchmark uses Python 3.3. Python 3 performance started turning the corner relative to Python 2 around 3.4. Perhaps a talk from this year's PyCon will help illustrate:

https://www.youtube.com/watch?v=d65dCD3VH9Q

> But PyPy is proof that Python can be fast.

Indeed, I would say that Cython is even more proof that there are frontiers of performance that could be explored. But with PyPy (as with Cython) there are sacrifices you have to make.

Personally, I think the most promising performance improvement that is tantalizingly close for me is Larry Hastings' Gilectomy project:

https://www.youtube.com/watch?v=pLqv11ScGsQ

But at the same time, I am not sure that Python ever needs to be fast running in CPython. With `WASM` perhaps it is better to just compile Python.

I don't know; performance in Python has always been a mixed bag...but personally I think it doesn't get much focus because it doesn't really serve Python's target niche. I don't know if there ever will (or should) be one language to do everything...and as it is, Python is a good "productivity"-focused language to have in your toolbox, so to speak.

May 28, 2017 · 73 points, 54 comments · submitted by varunramesh
thomaslee
If any Python devs are out there reading: my understanding is that removing the GIL itself isn't the hard part so much as removing the GIL while satisfying certain constraints deemed necessary by GvR and/or the rest of the community. I know some of those constraints relate to compatibility with existing C extensions -- but there must be others too?

The reason I ask is that Larry's buffered-refcounting attempt surely has implications for single-threaded code that relies on the existing semantics -- e.g. a program like this may no longer reliably print "Deallocated!":

  Python 2.7.13 (default, Mar  5 2017, 00:33:10) 
  [GCC 6.3.0 20170205] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> class Foo(object):
  ...     def __del__(self):
  ...             print 'Deallocated!'
  ... 
  >>> foo = Foo()
  >>> foo = None
  Deallocated!
  >>> 
A bad example in some ways since in this particular case we could wait for all ref counting operations to be processed before letting the interpreter exit, but hopefully my point is still clear.

Similarly, what about multi-threaded Python code that isn't written to operate in a GIL-free environment -- absent locks, atomic reads/writes, etc.? At best, you might expect some bad results. At worst, segfaults.

Are these all bridges that need to be crossed once a realistic solution to the core GIL removal issue is proposed? As glad as I am that folks are still thinking hard about this problem, I'm personally sort of pessimistic that the GIL can be killed off without a policy change wrt backward compatibility. Still, I do sort of wonder if some rules of engagement wrt departures from existing semantics might help drive a solution.

bdarnell
The big constraint (aside from backwards compatibility) is performance: Guido has indicated that he is unwilling to accept much (if any) slowdown of single-threaded code in order to remove the GIL. It's (relatively) easy to remove the GIL and replace it with a bunch of fine-grained locks (or atomic increments, etc), but doing so tends to slow things down. The challenge is in figuring out how to avoid synchronization overhead for common operations (mainly reference counts).

It's true that buffered refcounting probably means that `__del__` would no longer be called immediately as it is now, but I'm not sure that's a requirement - PyPy and Jython don't do this either, and destructors are generally discouraged in favor of `with` blocks these days.
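A minimal sketch of the `with`-block alternative (the `Resource` class here is made up for the example): cleanup runs in `__exit__` at a well-defined point, regardless of when or how the object is eventually reclaimed.

```python
class Resource:
    def __enter__(self):
        print("Acquired!")
        return self

    def __exit__(self, exc_type, exc, tb):
        print("Released!")
        return False  # don't swallow exceptions

with Resource():
    pass  # "Released!" prints right here, not at some later GC pass
```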

fatbird
In his talk at last year's pycon, Hastings said the three constraints GvR laid out are:

1. Can't degrade single-threaded performance

2. Can't break existing extensions

3. Can't make the implementation of cpython much more complicated (i.e., can't raise the barrier to entry to participating in the development of python)

All of these are pretty reasonable, if tough, targets to meet, and Hastings agrees with all of them. For 1 and 2 he was generally looking at making GIL-less cpython a compiled mode so that the default was the single threaded version, thus retaining compatibility and performance, but offering a true multi-threaded binary for those who would use it.

jholman
If I'm understanding you, some or all of these questions are explicitly addressed in the Q&A. My apologies if you got that far and I simply didn't understand you.

For example, your first question seems to be asking about whether there's a semantic change coming from a lack of immediacy in when __del__ will run. And the answer is explicitly "yes, and the docs already told you not to count on that".

As for multi-threaded Python code... and perhaps also multi-threaded C code in extensions... I think the clear answer is "yes, our whole goal is to remove some guarantees that were previously provided, so if you counted on those guarantees you're in trouble". Again, c.f. the Q&A in case that helps.

From the talk, it doesn't look to me like Larry Hastings has a plan for the policy change in question; so maybe "bridges that need to be crossed once [the technical issues are smaller]" is correct?

WaxProlix
It's funny, I've written a lot of python in quite a few domains and haven't really struggled directly because of the GIL before. Is this more of a 'data scientist' problem? I feel like if I had a huge pile of data to crunch, python wouldn't be my first choice really.
bayesian_horse
Actually, the data science stack is pretty much unconstrained by the GIL right now. Most of the libraries do their work in C/Fortran, and release the GIL in-between.

For that matter you should look into Theano, Tensorflow and Numba to bring Python code to the GPU (no GIL there). Or use Dask to scale to multiple cores or nodes.

askvictor
Strangely, Python is the choice for heaps of (data and other) scientists, mainly because it's so easy to pick up and because of the tooling (Jupyter) - remember that these people aren't native coders. And even then, parallelism is handled behind the scenes by libraries such as SciPy and NumPy. I think the problems come up when you have a high-performance application that doesn't play nice with NumPy or SciPy.
a3n
If we didn't have GIL issues, you might have written Python in additional domains that the GIL makes impractical or ill-advised. And then, thought experiment: if the GIL were suddenly introduced, we'd be pissed.
chrisseaton
> if I had a huge pile of data to crunch, python wouldn't be my first choice really.

Right - and one reason that it wouldn't be your first choice would be the GIL, so let's remove the GIL, and then the other barriers.

gaius
one reason that it wouldn't be your first choice would be the GIL

I don't think this is true. GIL means Global Interpreter Lock - it isn't held when you are in native code, which is how NumPy et al. actually work: Python marshals the data, then the heavy lifting happens under the hood in C, Fortran and assembly that the end user never needs to see. So this is a non-problem. As the above comment says, the problem is if you want to write a server that handles a lot of concurrency and shared state.

chrisseaton
I think there's a bit of Amdahl's law to this. You could have most of your application in NumPy, running beautifully in parallel on your 64-core machine, but then you just need to drop back into Python to do a tiny bit of transformation or logging or something before you go back into NumPy. It's really quick, but because all 64 cores contend to do that work inside the GIL, you have yourself a sequential bottleneck. Even if it's small, it starts to dominate.
gaius
Sure but that would be the case if you were marshalling data to/from a GPU too.
ubernostrum
As the above comment says, the problem is if you want to write a server that handles a lot of concurrency and shared state.

If you write a server and it uses threading as the concurrency model and its workload is primarily CPU-bound, the GIL will be a problem for you.

Take away either of those conditions -- use a model other than threading, or have an I/O-bound workload -- then the normal background overhead of the GIL is not something you'll notice.
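A rough way to see the I/O-bound case (a minimal standard-library sketch; `time.sleep` stands in for a blocking network call, which likewise releases the GIL while waiting):

```python
import threading
import time

def fake_request(seconds):
    time.sleep(seconds)  # blocking "I/O": the GIL is released during the wait

start = time.perf_counter()
threads = [threading.Thread(target=fake_request, args=(0.2,)) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

elapsed = time.perf_counter() - start
# Ten 0.2s "requests" overlap: total wall time is roughly 0.2s, not 2s.
print(f"{elapsed:.2f}s")
```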

People who deploy Python applications as network daemons tend to use pools of worker processes (not threads) and have I/O-bound workloads, which is why the assertion that Python is somehow "bad for servers" is not one you'll typically hear from people who actually deploy Python on servers. Much like the comment you're referring to, which is from someone who seems to primarily have type-system complaints about Python (i.e., they wouldn't touch a dynamically-typed language to begin with) and is throwing in "oh and the GIL probably makes it unsuitable for a server" as an additional (but factually incorrect) reason not to use Python.

Meanwhile, if you are writing a server which uses threading and lots of shared state which will be modified concurrently, you're in for a world of pain no matter what language you choose. There's a reason why there are multiple up-and-coming or even moderately-popular languages now which have as a design feature the inability to do such a thing.

gaius
Right, hence me making that caveat.

multiple up-and-coming or even moderately-popular languages now which have as a design feature the inability to do such a thing

STM in Haskell seems quite promising, but the approach of just outsourcing that problem to Redis gets you pretty far.

jackmott
some things need performance some don't. that simple.
prewett
Servers are where the problem is. The GIL makes python functionally single-threaded, which is a bummer for your server at any kind of scale. So you end up having to have n cores' worth of server processes behind a load balancer, even if you only need one server machine, which is a bummer if you have a stateful server (such as a game server), as you now have to manage communicating state between processes by storing it in another process (frequently Redis).

But python is easy and fun to write code in, and "developer time is expensive, servers are cheap," so there are a lot of python servers which could benefit from a lack of GIL. Never mind that it's fast to write, but difficult to maintain since it is a dynamically typed language and one typo creates runtime errors that any statically typed language would catch at compile time. Or that it is slow. Or that a python process never really releases memory back to the system, just within itself, so the process slowly grows over the course of a few weeks. Or that the Twisted framework you're using for cooperative multitasking because of the GIL is really easy to block on a database query by accident, leading to uncooperative multitasking (= large lags), resulting in forced server restarts and loss of players (= loss of revenue). So yeah, "developer time is cheap" but it's sort of an expensive cheap. I came to the conclusion that python is unsuitable for servers, but until Go came out, there wasn't a realistic alternative, since C++ and Java are too heavyweight, and Ruby suffers from similar problems (don't know about a GIL).

ubernostrum
The GIL has two noticeable effects:

1. For CPU-bound applications which use threading, performance is severely degraded due to only one thread being executed at a time regardless of number of cores/CPUs.

2. For all other threaded applications, performance can be slightly but measurably degraded by obtaining and releasing the GIL.

If yours is an I/O-bound threaded application, the GIL is not something you're likely to be bothered by; it will probably get lost among the noise of all the other things that affect performance.

As a result, Python for "servers" -- assuming you mean network service daemons which are almost always I/O-bound -- is perfectly fine, including "at scale", as demonstrated by a number of large sites and services which get along just fine on Python.

You can also avoid the GIL entirely by using a parallelism construct other than threading.

And for completeness' sake, the oversimplified history of the GIL:

Python is an older language than people tend to realize. It predates Java. And Python came in part out of the Unix scripting-language tradition, where Unix approaches -- such as forking additional processes -- were the typical way to do things. Then along came Java, which had the limitation of being designed originally to run on set-top TV boxes which didn't have true multitasking. So Java imposed threading as the way to do multitasking, and Java became very popular.

Thus, Python was pressured to develop a story on threading. But since Python had been built in the Unix multi-process tradition, it wasn't implemented in a way that was friendly to threading. The GIL was the compromise that allowed Python to have threading: it would only seriously affect threaded CPU-bound applications on multi-CPU or multi-core hardware (since I/O-bound applications aren't affected nearly as much, and single-core, single-CPU hardware is only physically capable of running one thread at a time anyway).

Fast forward a couple decades and now we all have multi-CPU and/or multi-core computers, including literally carrying them in our pockets, and CPU-bound applications are more common. In retrospect, the GIL can look like the wrong tradeoff to make, which is why people want to get rid of it, but at the time it was quite reasonable.

thomaslee
> Servers are where the problem is. The GIL makes python functionally single-threaded, which is a bummer for your server at any kind of scale.

Right, agreed. I can imagine some of the frustration you might experience using CPython for high throughput systems: kind of like NodeJS without the benefits of a standard library written with async/non-blocking I/O in mind.

A bit curious about a few things you mention here, though:

> Or that a python process never really releases memory back to the system, just within itself, so the process slowly grows over the course of a few weeks.

I'm not sure this is true in general, is it? Can you elaborate? It's been a while since I've dug around in Python innards, but if Py_DECREF(x) leads to a refcount of zero IIRC free(x) is ultimately called -- albeit in an indirect manner via a layer or six of tp_dealloc calls and tp_free. :) I suppose calling free(x) may only return the memory associated with x to (g)libc's free list and not necessarily back to the OS [0]. No different to C/C++ in that regard, I guess.

> I came to the conclusion that python is unsuitable for servers, but until Go came out, there wasn't a realistic alternative, since C++ and Java are too heavyweight, and Ruby suffers from similar problems (don't know about a GIL).

"Too heavyweight" in that they're relatively difficult to write in comparison? Maybe true of Java-the-language, but the JVM itself is an absolute workhorse when it comes to high performance. Plenty of languages to choose from there, typically without a GIL. Jython, for example, has no GIL [1].

And yep, Ruby/MRI has a GIL (but JRuby does not).

[0] https://www.gnu.org/software/libc/manual/html_node/Freeing-a... [1] https://stackoverflow.com/questions/1120354/does-jython-have...

fiddlerwoaroof
Common Lisp implementations generally do multithreading really well and give you lovely syntactic abstraction capabilities while also running significantly faster than comparably high-level languages.
poooogles
>Or that a python process never really releases memory back to the system, just within itself, so the process slowly grows over the course of a few weeks.

This just isn't true anymore. Create a list with range(1000000) then delete the list and you can see the memory freed.

crdoconnor
>Never mind that it's fast to write, but difficult to maintain since it is a dynamically typed language and one typo creates runtime errors that any statically typed language will catch at compile time.

This distinction is only important if you do not have tests.

You need more tests if you are using a language that is weakly typed (e.g. C, JS), fewer if you are using a language that is largely type-safe (e.g. Python), and even fewer if it is very, very type-safe (e.g. Rust/Haskell). But the compile-time/runtime distinction doesn't change the number of tests you need to achieve a requisite level of code quality - it only changes whether type errors are caught by the compiler or by a behavioral test you should have been writing anyway.

prewett
Well, yeah, if you have 100% test coverage it doesn't matter. The thing is, with a dynamically typed language you have to have 100% test coverage to have any idea of whether the thing will even run. One misspelled variable will take down your server (something that frequently happened to me in development). With a statically typed language all you need to worry about is logic errors. Yeah, you should have tests (and with a server, there's really no reason not to), but the reality is that tests are a pain to maintain, and writing the test often takes as long as writing the code. So, along with documentation, the chances of 100% test coverage are minimal unless you are in control of the project. With a statically typed language, at least you know somebody didn't make a misspelling in an infrequent code path and doom your server from the start.
crdoconnor
Except you don't need 100% and it's not even a good idea to optimize for SLOC coverage. Doing TDD for 90+% of stories and reported bugs is sufficient.

Linting also catches variable misspellings.

jabl
> I came to the conclusion that python is unsuitable for servers, but until Go came out, there wasn't a realistic alternative, since C++ and Java are too heavyweight, and Ruby suffers from similar problems (don't know about a GIL).

If you're willing to go slightly outside the mainstream, there's stuff like erlang and haskell with kickass runtimes. And haskell, at least, is pretty strongly typed.

joobus
A good individual programmer can write Haskell, but I just can't imagine deploying Haskell in production and trying to hire and maintain a group of people who are all capable of writing Haskell, and reading each other's Haskell. Most programmers are just about average, after all.
mrfusion
I've noticed the not releasing memory thing. Why does no one talk about that? Is that something that could be fixed?
brianwawok
Many popular frameworks do fix this. A common python setting is to restart process if memory exceeds X.
dbcurtis
Yes, all true.

But with Twisted, and now with async def, it is straightforward to write performant cooperative multitasking code. For some definition of "straightforward" -- I have been doing cooperative real-time code since before Python existed, so I've learned to think that way. But it really is worth the effort to climb the learning curve on Twisted, and it really does make cooperative tasking painless.

The problem with the GIL and Python semantics is that collections imply a zillion fine-grained locks everywhere. The time spent acquiring/releasing locks, surprisingly, is not the issue. It is that every lock requires every CPU cache to sync on the lock. All that locking makes the process cache-invalidate bound.

ars
Gilectomy project: the removal of Python's Global Interpreter Lock, or "GIL".
chairmanwow
Is he being serious when he says he only has one test case? That really doesn't seem like a reasonable thing to do. Furthermore, would a recursive implementation of Fibonacci even benefit from multithreading?
wulfjack
The goal should be, and is kind of what Larry Hastings is looking for, that any program should run 8 times faster on an 8-core CPU than on a 1-core. And as said above, Python can basically only use one core b/c of the GIL. Actually, Python 2.7 multithreading runs much slower on a multicore CPU than on a single core, due to locking contention on the GIL.
marvy
What? No! Multithreaded programs should run faster on 8 cores than on one core. That's not very realistic for single-threaded programs, in any language.

I could be wrong, but I think Py2.7 is about the same speed on multicore vs 1 core. Where did you get that idea?

alfanerd
Some years ago I wrote pyworks (www.github.com/pylots/pyworks), a proposal for async objects (inspired by ABCL and cooC (Concurrent Object Oriented-C, https://www.researchgate.net/publication/220178380_Concurren...)).

The test case is 100 threads sending 1000 messages to each other in a ring. On an 8-core Mac, Jython and IronPython perform better than on 1 core, but Python 2.7 performs so badly that it never finishes.

The ideal scenario is probably that the CPython interpreter starts one thread per core running as many coroutines in parallel as possible, but that looks like a long way away for Python
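A scaled-down sketch of that ring test using only the standard library (thread and message counts are reduced here; the original used 100 threads and 1000 messages):

```python
import queue
import threading

N_THREADS, N_MESSAGES = 10, 100  # scaled down from 100 threads / 1000 messages

queues = [queue.Queue() for _ in range(N_THREADS)]

def worker(i):
    # Receive N_MESSAGES messages from my queue, forwarding each to my neighbour.
    nxt = queues[(i + 1) % N_THREADS]
    for _ in range(N_MESSAGES):
        msg = queues[i].get()
        nxt.put(msg)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(N_THREADS)]
for t in threads:
    t.start()

queues[0].put("token")  # inject a single message; it circulates the ring
for t in threads:
    t.join()
```

With a GIL, all this message passing is serialized through one lock, which is why heavy GIL contention shows up so clearly on benchmarks of this shape.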

diek
Python 2.7 has terrible thrashing in the way the GIL is acquired that is exacerbated as more threads are used. Dave Beazley has given great talks with the technical details: http://www.dabeaz.com/GIL/
thomaslee
> The goal should be, and is kind of what Larry Hastings is looking for, that any program should run 8 times faster on an 8-core CPU than on a 1-core.

A program that's inherently single-threaded is unlikely to benefit from more CPUs. When you say "any program" here, you mean "any program with >=8 threads", right?

alfanerd
I mean the developer (using a high-level language such as Python) ideally should get more performance on an 8-core than on a 1-core CPU.

Erlang, which btw. is older than Python, will perform better the more cores you have due to its message-oriented nature. Python (2.7), on the other hand, performed worse with multithreading on multicore.

I was hoping that Python would take the same direction in the future, but unfortunately we are getting the async/await mess, instead of a simple async object model (sorry, my pet peeve)

m_mueller
Basically, his point at the moment is not to construct some benchmarks that show some multithreaded benefit and then declare victory. Instead he wants to reduce the overhead so it won't make existing (thus unoptimised for the Gilectomy) programs much slower - which is probably necessary for mainline adoption. As such I find his approach really refreshing, interesting and honest: basically start with the worst case first, get everyone's expectations down, and then at the end show that the situation is actually not as hopeless as you might think.
mrfusion
Can Python copy how other languages like golang or java operate without a Gil? Why or why not?
thristian
As the talk mentions, Jython (Python on the JVM) is living proof that it's possible: it doesn't have a GIL and it works just fine.

The issue is "just" one of implementation. Sun and Oracle have spent millions of dollars paying ridiculously-smart people to make HotSpot as good as it is... while the CPython codebase has traditionally aimed at being simple enough for average C programmers to contribute to. I'll be interested to watch how that tension plays out.

tempay
How much existing Python code works with Jython? Particularly that depends on C extensions like numpy?
alfanerd
Both Jython (Java impl) and IronPython (C# impl) work fine without the GIL. The big issue is compatibility with C-based libraries. As I hear Larry Hastings, there are three levels of compatibility:

1) Fully compatible, nothing needs to be done

2) Fully compatible, but a recompile of C libs is needed

3) Almost compatible, but some updates to C libs are needed

Larry is aiming for 2)

bbayles
Yes, it can. The purpose of the project is to figure out how to do that while minimizing the number of APIs that have to break.
chrisseaton
Python has different semantics to those languages. I don't think it's formally specified, but people program in Python expecting the semantics that reference counting provides, and that unsynchronised concurrent access to data structures will not cause errors. Despite ongoing research, it appears to be hard to continue to provide these semantics without a GIL.

Golang (informally, I think) and Java (more formally) are not specified to provide reference-counting semantics, and not specified to guarantee that unsynchronised concurrent access to data structures will not cause errors.

So the languages have different semantics - that's why you can't copy-and-paste the solution from one to another.

Some alternative implementations of Python don't follow the above semantics, like Jython, but then some people aren't happy with that. It may not be acceptable to the community to drop those semantics, even if they were never formally given.

mrfusion
Interesting food for thought. So what exactly do you lose in jython?
alfanerd
You lose access to all the C libraries that come with Python. On the other hand, you get beautiful integration with all Java libraries.
munin
> and that unsynchronised concurrent access to data structures will not cause errors

Python doesn't ensure that unsynchronized concurrent access to data structures won't cause errors. As I understand and experience Python multithreading, all the GIL ensures is that of the "load - inc - store" stages running among N threads, each separate stage will be locked, but not the overall sequence. So you'll still have data races, even with the GIL, and you still need to use mutexes etc. in your Python program, which is why they are there.
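A sketch of that point: a compound update like `counter += 1` compiles to separate load/add/store bytecodes, so the GIL alone doesn't make it atomic, and an explicit lock is still needed to make the whole read-modify-write sequence safe.

```python
import threading

counter = 0
lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        with lock:          # without this, interleaving can lose updates
            counter += 1

threads = [threading.Thread(target=add_many, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 400000 -- guaranteed only because of the lock
```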

chrisseaton
I mean it won't cause errors within the basic data structure access operations. I'm not talking about composition. In Java a hash table write can fail with an exception if there is a concurrent write that conflicts. A Python dict write is atomic, because it happens within a single instruction as you say and so will not be interrupted and will never fail. That's what you aren't getting in Java. That expectation is very hard to provide without a GIL. Jython does it with blunt fine grained locking, but that's slow.
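
The dict-write guarantee described above can be demonstrated with a small sketch (my own illustration, not from the thread): many threads mutating one shared dict concurrently, with no write ever raising.

```python
import threading

# In CPython, each individual dict __setitem__ runs to completion
# under the GIL, so concurrent writers never see a failed or
# half-finished write -- unlike Java, where a conflicting concurrent
# hash table write can raise.

shared = {}

def writer(tid, n):
    for i in range(n):
        shared[(tid, i)] = i  # atomic per-write; never raises

threads = [threading.Thread(target=writer, args=(t, 10_000))
           for t in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(shared))  # 40000 -- every write landed
```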
amelius
Can't they use the GC techniques used in other languages? I've heard that Golang has a very efficient concurrent garbage collector.
peterhunt
Implementing a tracing gc was covered in the video.
amelius
Well, I didn't see the video yet, but I noticed they are referencing "The Garbage Collector Handbook", which is from 2011. The people from Golang have had some more recent successes with their concurrent garbage collector, which, as I've heard, is really efficient.
poooogles
>The people from Golang have had some more recent successes with their concurrent garbage collector, which, as I've heard, is really efficient.

Haven't they just sacrificed throughput for pause time though? Not a GC expert at all, but that's the gist I've got from speaking to people.

ubernostrum
Before making a comment which begins with:

"Why don't they just..."

"Can't they just..."

"Haven't they heard of..."

or other similar phrases, please

1. Read through the linked content (or if it's a video, watch it or find a transcript), and

2. Bonus points: familiarize yourself with the problem domain.

Doing (1) will usually answer these questions all on its own. If it doesn't, doing (2) will help. In this case, we're not talking about a random amateur starting fresh on a problem nobody's considered thoroughly yet. Python's global interpreter lock is extremely well-trod ground; the people working on it are not just experienced programmers but experienced with Python; and there's a lot of background out there, available in an easy Google search, explaining what's been tried and what's been ruled out over the history of the problem.

In particular, the two problems you'd learn about from following the advice above are:

* A lot of important existing Python code consists of modules which are partially or entirely written in C and depend on the documented Python/C API remaining stable. Breaking the C API, and thus forcing all that existing code to rewrite to a new API, is considered an unacceptable solution (this rules out many "why not just use (insert your favorite GC technique here)" approaches).

* Prior attempts at removing the global interpreter lock while not breaking existing Python/C code have caused major performance degradation for single-threaded Python code, which is also considered unacceptable.

comex
If you're the type that prefers to read text, here's LWN's writeup of the linked talk:

https://lwn.net/SubscriberLink/723514/f674d4a807264ba1/

scribu
I've enjoyed LWN articles in the past, but found this particular writeup very tedious to follow, compared to watching the video.

A direct transcript, perhaps with some light editing, would have been more useful, IMO.

HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.