HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Optimizations which made Python 3.6 faster than Python 3.5

pyvideo.org · 163 HN points · 0 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention pyvideo.org's video "Optimizations which made Python 3.6 faster than Python 3.5".

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Jun 05, 2017 · 163 points, 51 comments · submitted by mmastrac
rplnt
More reading linked at the end of the slides: https://faster-cpython.readthedocs.io/
DonbunEf7
Ah, yes, the "rewrite it all in C" theory of optimization. Meanwhile, PyPy continues to be a thing, writing most of its core in RPython and spanking CPython in benchmarks. Why do we keep doing this?
ATsch
Your comment is ironic considering RPython is (to oversimplify) nice syntactic sugar around C, and it is transpiled and then compiled with a normal C compiler.

So one could argue that pypy is just as much optimisation by "rewriting it in C".

ciupicri
From https://faster-cpython.readthedocs.io/pypy.html (emphasis mine):

> - Bad support of the Python C API: PyPy was written from scratch and uses different memory structures for objects. The cpyext module emulates the Python C API but it’s slow.

> - New features are first developed in CPython. In january 2016, PyPy only supports Python 2.7 and 3.2, whereas CPython is at the version 3.5. It’s hard to have a single code base for Python 2.7 and 3.2, Python 3.3 reintroduced u'...' syntax for example.

> - Not all modules are compatible with PyPy: see PyPy Compatibility Wiki. For example, numpy is not compatible with PyPy, but there is a project under development: pypy/numpy. PyGTK, PyQt, PySide and wxPython libraries are not compatible with PyPy; these libraries heavily depend on the Python C API. GUI applications (ex: gajim, meld) using these libraries don’t work on PyPy :-( Hopefully, a lot of popular modules are compatible with PyPy (ex: Django, Twisted).

> - PyPy is slower than CPython on some use cases. Django: “PyPy runs the templating engine faster than CPython, but so far DB access is slower for some drivers.”

eat_veggies
A lot of the db drivers are written in C so that would make sense. It's really just the same problem.
std_throwaway
More important than the optimizations themselves is the instrumentation they built up to reliably measure and monitor the performance.

One thing to keep in mind is that python 3 is still considerably slower than python 2 (edit) in some areas such as startup times while it has gotten faster in most other benchmarks.

mundanevoice
> One thing to keep in mind is that python 3 is still considerably slower than python 2.

You have wrong information.

jayflux
> The net result of the 3.0 generalizations is that Python 3.0 runs the pystone benchmark around 10% slower than Python 2.5. Most likely the biggest cause is the removal of special-casing for small integers. There’s room for improvement, but it will happen after 3.0 is released!

https://docs.python.org/release/3.0.1/whatsnew/3.0.html

estebank
Quoting a release note from 8 years ago to represent the current state of the project is disingenuous at best.
rspeer
I don't think you'll see anyone arguing that 3.0.1 was a good version of Python, but we are talking about 3.6 here.
coldtea
How is 3.0 era information relevant for 3.5, much less 3.6?
rplnt
Here's the full benchmark suite from speed.python.org showing what is faster and what is slower in 3.6 as compared to 2.7: https://speed.python.org/comparison/?exe=12%2BL%2B3.6%2C12%2...

It would be very hard for someone to say "python 3 is considerably slower". Startup is significantly slower, yes, but that's about it.

dom0
It's not like it should surprise anyone. The import machinery in Python 2 was mostly written in C, while it's pure Python in Python 3. And if you've seen what that stuff does it's no surprise that even fairly small applications can take .1-.2 s to reach main.

A tool gathering imports in one file if it is safe to do so would be really quite neat for deploying interactive Python applications (command line / GUIs).

On the more extreme end you'd have a C application linking not too many libraries (symbols are lazily loaded, libraries are not) that reaches main after 0.0001-0.0002 s (i.e. one to two hundred µs).

On the other extreme would be Java apps using heavy runtime-code-generating frameworks like Spring, where even a trivial app can take up to ten seconds to spring to life.

masklinn
> One thing to keep in mind is that python 3 is still considerably slower than python 2.

The video quite definitely states otherwise for most benchmarks. Only a few benches still show regressions over P2, and according to Victor they're unlikely to affect the vast majority of systems.

baldfat
I wish the move from Python 2 to Python 3 had been handled with more carrot and stick to get people over to 3. It has been over a decade and we still have a divided community.
jxramos
I take it on a project by project basis. We have a bunch of legacy 2.x stuff running at work but when I began to greenfield new projects I made them 3.0 from the start. Folks at work asked why I made our codebase mixed with Python versions and I answered because I'm creating isolated tool projects from scratch and have no real constraints to use Python 2. It's made some issues for folks having to install both versions of python but it's been a pretty minor issue thus far.
maxerickson
Python 3.0 was released December 3rd, 2008, 8.5 years ago.

I don't really remember all that clearly, but my recollection is that numerous caveats were made fairly clear (like the new IO not being optimized and so on). I guess something needed to go out the door but I wonder if the biggest mistake was calling that release the 3.0 release (rather than something that would have better tempered naive expectations).

eat_veggies
The changes were breaking enough that a 2.x release wouldn't have been appropriate, so I get why they did that.
maxerickson
I mean something like branding the initial release as a "Python 3 Developer Preview" or something. A line in the sand where the goal is release quality but with less implication that it is now the preferred version (because there was a big ecosystem of libraries to move over and such)
amenod
This. I don't care if I'm using Python 2 or 3, but I really hate switching between two similar, but not completely equal, languages. The rift did Python more damage than a forced switch would have, IMHO of course.
infogulch
It's gained much more momentum over just the last year from what I can see.
sametmax
The community is not divided anymore. Anybody who knows wants to use 3. Anybody who doesn't is told to. The only people using Python 2 are now people stuck with legacy code base. It's an important part, but the situation is very clear.
coldtea
>The only people using Python 2 are now people stuck with legacy code base.

So, just the largest and most important part of Python's non-hobbyist users?

sandGorgon
> The only people using Python 2 are now people stuck with legacy code base

Like tensorflow - https://github.com/tensorflow/tensorflow/issues/1

chronial
You do realize that that issue was closed in 2015, right?
IanOzsvald
tensorflow looks to work fine with Python 3.4+ https://github.com/tensorflow/tensorflow

I've just installed it using conda in Python 3.6 just fine.

user5994461
The situation is very clear, any code base that is older than 5 years and more than a million lines will be forever stuck with python 2.
joshuamorton
It's worth watching the video in its entirety. That's not true. Python 3.6 beats 2.7 in most benchmarks; it's just that there are a few places where 3 is slower (like startup time) that make a lot of things look bad, but that don't actually affect "speed" as most people would consider it.
gshulegaard
This is the right answer.

> One thing to keep in mind is that python 3 is still considerably slower than python 2.

IIRC this statement started to shift ~3.4. There are still areas where 2.7 is faster, but it is more of a gray area than black and white like it used to be.

dr_zoidberg
I have a codebase at work (which arguably was optimized for 2.7) that is about 2-5% slower when running in Py3*. It's not a huge slowdown, but it is consistent.

Hope 3.7 finally puts it on par with (or faster than) 2.7, but we've decided to migrate to 3.6 anyway. Don't get me wrong, I like having finally put 2.x behind us, but it still bothers me a bit. Maybe we just have to get used to optimizing for the "3.x series".

* this has been measured without startup time, just function calls, with horrible datetime.datetime.now() timings and %timeit magic-keyword from IPython -- always consistent.
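
For anyone wanting to reproduce this kind of comparison outside IPython, a minimal sketch with the standard-library `timeit` module might look like the following. The `work` function is a hypothetical stand-in for the code under test, not anything from the codebase described above. The idea is to run the same file under each interpreter and compare the numbers.

```python
import timeit

def work():
    # Hypothetical workload; replace with the function calls you care about.
    return sum(i * i for i in range(1000))

def best_time(func, number=1000, repeat=5):
    """Return the fastest total time, in seconds, for `number` calls of `func`."""
    # Taking the minimum of several repeats reduces noise from other processes.
    return min(timeit.repeat(func, number=number, repeat=repeat))

if __name__ == "__main__":
    # %-formatting keeps the script runnable on both Python 2.7 and 3.x.
    print("best of 5: %.4f s per 1000 calls" % best_time(work))
```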

dman
The startup time fiasco was one of the reasons why Java never took over for GUI apps. Even though it's unscientific, time to first usable interaction goes a long way in establishing "speed" in the user's mind. You should see the hoops that Chrome jumps through to excel on this metric.
kstrauser
True, but:

  $ echo exit | time python2
gives an average time of 0.015s on my laptop, and

  $ echo exit | time python3
works out to about 0.039s. If you're spamming hundreds of processes from a looping shell script, the Python 3 overhead would probably start to grate. For anything manually launched, 40 milliseconds is still substantially close to instant.
philipov
> If you're spamming hundreds of processes from a looping shell script,

You should absorb the loop into a single python process that instead calls the original script as a library...

Too
Mercurial has another approach. It has something called a command server that is launched once and communicates over a socket, for when you need to invoke it thousands of times. If startup time is a problem with Python, your biggest problem is startup time with Python, not the difference between 2 and 3.
intchanter
This may not make much difference to these benchmarks, but these commands and the nearly-equivalent:

  $ time python2 -c exit
cause Python to execute the following script:

  exit
This will look up the exit function, and then do nothing with it before the script exits due to reaching the end of the file. Interestingly, the representation of this function is set to generate the message that reminds you how to get out of the interactive interpreter if you type "exit".

You can get the results you want with:

  $ time python -c ''
meaning: Load Python, run a completely empty script, and clean up.
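
The same measurement can be sketched from inside Python, which makes it easier to repeat the run and keep the best time. This is only an illustration: `startup_seconds` is a made-up helper, it needs Python 3.5+ to run, and the figure it reports includes the cost of spawning the child process, so it will read a little higher than the shell's `time`.

```python
import subprocess
import sys
import time

def startup_seconds(interpreter=sys.executable, runs=5):
    """Best wall-clock time to start `interpreter`, run an empty script, and exit."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        # Equivalent to: python -c ''
        subprocess.run([interpreter, "-c", ""], check=True)
        best = min(best, time.perf_counter() - start)
    return best

if __name__ == "__main__":
    print("%.1f ms" % (startup_seconds() * 1000))
```

Passing a different executable name, for example `startup_seconds("python2")`, compares interpreters, assuming that name is on your PATH.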
nomel
I think the problem is pep8. For some reason, pep8 says you load your modules at the top of the file rather than when needed. Giant libraries often follow pep8, meaning something like "import pandas" loads hundreds of python files, which gives the absurd startup times.

I've seen some large code bases take 4 seconds to import, > 80% being unused code.

dragonwriter
> For some reason, pep8 says you load your modules at the top of the file

Clarity of dependencies is the “some reason”.

Like most style rules, there are times when breaking it is justified by other considerations, but it's not an arbitrary rule.

smitherfield
But couldn't that inefficiency be optimized away in the implementation? (Load imports lazily).
marmaduke
It can't be done automatically because imported modules can execute code. Since that's part of the semantics, you have to be lazy by hand.
deckiedan
Only on the first load ever. When modules are compiled to their byte code, they could get a "clean" flag if they don't execute code, which means they get loaded lazily next time.
marmaduke
Ok, but figuring out what executes code might require executing code: consider non-trivial use of metaclasses, where an inconspicuous subclass actually invokes e.g. a registration process, like a Django model class.
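
A minimal sketch of the kind of hidden side effect meant here. The names `registry`, `RegisteringMeta`, `Model` and `User` are invented for illustration and only loosely modeled on the registration idea; this is not how Django actually implements it.

```python
registry = {}

class RegisteringMeta(type):
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        if bases:
            # Runs whenever a subclass is *defined*, i.e. at import time.
            registry[name] = cls
        return cls

class Model(metaclass=RegisteringMeta):
    pass

class User(Model):
    # Looks like an inert class body, but defining it mutates `registry`.
    pass
```

Importing a module containing `User` changes global state even though no statement in it looks like a call, so deferring the import would change when (or whether) the registration happens.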
smitherfield
Sure, but, just spitballing here, could a substantial fraction of imported code benefit even with a very conservative heuristic?

In any case, given your points, it seems like a future Python version ought to introduce a new version of the import statement with lazy semantics (which, besides eliminating dead code, is also IMO the more correct/explicit behavior when importing symbols).

deckiedan
That's kind of what I was hoping. I suspect a vast proportion of normal Python code doesn't use the meta programming and run on import and so on.
joshuamorton
If a module imports anything, it cannot be assumed clean, since any of its imports may execute code, and they will not be available at compile time.
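
To make the "very conservative heuristic" concrete, here is a rough sketch using the standard-library `ast` module (assuming Python 3.8+, where literals parse as `ast.Constant`). The function names are hypothetical and the rule is deliberately crude: a module counts as clean only if its top level holds nothing but docstrings, `NAME = literal` assignments, and plain `def`s without decorators, defaults or annotations. Imports, classes and calls all count as unclean, in line with the objections above.

```python
import ast

def _plain_def(fn):
    """True if defining `fn` evaluates nothing at definition time."""
    args = fn.args
    return not (
        fn.decorator_list                # decorators run when the def executes
        or fn.returns                    # so do return annotations
        or args.defaults                 # and positional default values
        or any(args.kw_defaults)         # and keyword-only default values
        or any(a.annotation for a in ast.walk(args) if isinstance(a, ast.arg))
    )

def looks_clean(source):
    """Conservatively guess whether running `source` only binds names."""
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if not _plain_def(node):
                return False
        elif isinstance(node, ast.Expr) and isinstance(node.value, ast.Constant):
            continue  # docstring or bare literal
        elif (isinstance(node, ast.Assign)
              and isinstance(node.value, ast.Constant)
              and all(isinstance(t, ast.Name) for t in node.targets)):
            continue  # NAME = literal
        else:
            return False  # imports, classes, calls, anything else
    return True
```

How many real modules would pass a test this strict is exactly the open question in this subthread. The sketch only shows that the check itself needs no code execution.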
marmaduke
It'd be pretty easy to do the experiment, since you can override the import logic and do lazy imports, eg.

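    import importlib

    class LazyModule:
        def __init__(self, name):
            self.name = name
        def __getattr__(self, key):
            # Reached only when normal lookup fails, so the real import
            # happens on the first access to one of the module's attributes.
            if 'mod' not in self.__dict__:
                self.mod = importlib.import_module(self.name)
            return getattr(self.mod, key)

    class LazyImporter:
        def get_module(self, name):
            return LazyModule(name)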
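
For comparison, the standard library has shipped `importlib.util.LazyLoader` since Python 3.5. The sketch below follows the lazy-import recipe in the `importlib` documentation; the `lazy_import` name comes from that recipe, and `colorsys` is just an arbitrary small stdlib module picked for the demonstration.

```python
import importlib.util
import sys

def lazy_import(name):
    """Return a module object whose code runs on first attribute access."""
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    # With LazyLoader this only sets up the deferral; the module body
    # is executed later, when an attribute is first looked up.
    loader.exec_module(module)
    return module

colorsys = lazy_import("colorsys")   # nothing executed yet
convert = colorsys.rgb_to_hsv        # first attribute access triggers the load
```

The caveat raised earlier in the thread still applies: if the module has import-time side effects, they now happen at first use rather than at the import statement.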
vosper
Huh, I never thought about that before. Importing inside a function is very frowned-upon AFAIK, but now I wonder what kind of performance improvements you might get from importing things only when you need them.

I wonder if it would be possible to automatically rewrite a script to do that.
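
A minimal sketch of the function-level import being discussed, using the standard-library `json` module as a stand-in for a heavy dependency (the `to_json` helper is made up for illustration):

```python
def to_json(data):
    # Deferred import: the module is loaded the first time this function
    # runs, not when the enclosing file is imported.
    import json
    return json.dumps(data, sort_keys=True)

if __name__ == "__main__":
    print(to_json({"b": 1, "a": 2}))
```

After the first call the module is cached in `sys.modules`, so repeated calls only pay for a lookup. On Python 3.7 and later, `python -X importtime` reports how long each import takes, which helps decide which imports are worth deferring.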

microcolonel
and I would point out that on my machine (GCC 6.3-based x86-64 Linux) the times are much closer

    ~ time python2 -c exit
    real    0m0.014s
    user    0m0.011s
    sys     0m0.004s
vs.

    ~ time python3 -c exit
    real    0m0.022s
    user    0m0.018s
    sys     0m0.004s
So for me it's more like an 8ms realtime difference, and four milliseconds are spent in the system either way.

python2 seems to do about 700 system calls (!) to start up and exit immediately, including enumerating things like gtk-2 and wxwidgets, which boggles the mind, but it is what it is.

python3 seems to do about 470 system calls, much less but still bonkers to my mind. Also weird that they take the same ~4ms in the kernel given that python3 calls into the kernel so much less.

kstrauser
That's better yet - thanks for sharing! Most of my Python processes run for weeks at a go, so startup time isn't that important to me as long as it's not glacial. But 40ms (on my system) or 22ms (on yours) is totally acceptable for interactive shell usage.

Thanks for including the syscall counts. I know they'd been working to reduce that, but I hadn't seen how much progress they'd made yet. Are most of those to malloc() and open()?

microcolonel
Mostly open, read, and fstat for python2:

     ~ strace python2 -c exit 2>&1 | sort | grep "^[a-z_]*[(]" | sed -e "s/^\([a-z_]*\)[(].*/\1/" | uniq -c | sort -rn | head -n 8
        199 open
         98 read
         94 fstat
         92 stat
         68 rt_sigaction
         63 close
         27 mmap
         16 mprotect
Mostly stat, rt_sigaction, and read for python3:

     ~ strace python3 -c exit 2>&1 | sort | grep "^[a-z_]*[(]" | sed -e "s/^\([a-z_]*\)[(].*/\1/" | uniq -c | sort -rn | head -n 8
         92 stat
         68 rt_sigaction
         57 read
         54 fstat
         35 open
         35 close
         28 mmap
         24 lseek
Some of this might be noise though, it's possible that it has something to do with the installed packages, not sure exactly.
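
If the shell pipeline is hard to read, the same tally can be sketched in Python over saved `strace` output. `count_syscalls` is a hypothetical helper, and unlike the `grep` pattern above its regex also accepts digits in syscall names, so the counts may differ slightly.

```python
import re
from collections import Counter

# A syscall line starts with the call name followed by an opening parenthesis.
SYSCALL = re.compile(r"^([a-z_0-9]+)\(")

def count_syscalls(strace_output, top=8):
    """Return the `top` most frequent syscall names as (name, count) pairs."""
    matches = map(SYSCALL.match, strace_output.splitlines())
    return Counter(m.group(1) for m in matches if m).most_common(top)

if __name__ == "__main__":
    sample = (
        'open("/etc/ld.so.cache", O_RDONLY) = 3\n'
        'read(3, "", 832) = 0\n'
        'open("/lib/libc.so.6", O_RDONLY) = 3\n'
        "+++ exited with 0 +++\n"
    )
    print(count_syscalls(sample))
```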
joshuamorton
It absolutely is. Consider that that means that python takes 2-3 frames to start up. I'm not sure one should or could reasonably consider that "slow" for interactive use.
mkl
My times on x86-64 Linux are similar to yours, but on the Windows Subsystem for Linux (which emulates the Linux kernel) it's a different story:

  $ time python2 -c exit
  real    0m0.111s
  user    0m0.016s
  sys     0m0.078s

  $ time python3 -c exit
  real    0m0.063s
  user    0m0.016s
  sys     0m0.047s
Python 3 is consistently ~twice as fast. Both seem slower, but this is on a small laptop so the absolute measurements may not be comparable. This is on Creators Update.
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.