HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Christophe Pettus - PostgreSQL Proficiency for Python People - PyCon 2016

PyCon 2016 · Youtube · 3 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention PyCon 2016's video "Christophe Pettus - PostgreSQL Proficiency for Python People - PyCon 2016".
Youtube Summary
Speaker: Christophe Pettus

PostgreSQL has become the default database for most green-field development projects, and is the data storage architecture behind many major Python-based success stories, such as Instagram. Despite a reputation as being complex and fiddly, Postgres is easy to install, administer, maintain, and use... with just a little bit of orientation. This is that orientation.

Slides can be found at: https://speakerdeck.com/pycon2016 and https://github.com/PyCon/2016-slides
HN Theater Rankings

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
I found this to be great introductory that touches on all the major points to keep in mind with Postgres. It's pretty up-to-date too!

https://www.youtube.com/watch?v=knUitQQnpJo

It isn't quite what you ask for, but should be enough to get you high-level overview of major systems and contains solid practical advices (on vacuuming, mvcc, replication, backups, etc…), so you know what to google for deeper understanding.

hyperpape
Thanks! It's too long for me to watch right now, but I've saved it for the next time I can dedicate that much time to a video.
elmigranto
Feel free to skip to any topic, it is quite well structured to do that.
I recommend following Christophe Pettus - PostgreSQL Proficiency for Python People - PyCon 2016 video for postgresql basics and optimization

https://www.youtube.com/watch?v=knUitQQnpJo

ceyhunkazel
Here is the self notes from the video:

1)for debian use: pg_createcluster instead of initdb

pg_createcluster 9.5 main -D /data/9.5/main -E UTF8 --locale=en_US.UTF-8 -- -k

https://wiki.debian.org/PostgreSql

https://makandracards.com/makandra/18643-how-the-change-the-...

2) Create a separate database volume partition for data. do not put version number in the mountpoint (/data, not /data/9.5) then create a folder with version number EXT4 or XFS for filesystem (ZFS is extra for experts) EXT4 on average a litter faster than XFS but standard deviation is higher than XFS

3) postgresql.conf - most server settings pg_hba.conf - who gets to log in to what databases?

4) Important parameters:

  a) Logging


  b) Memory


  c) Checkpoints


  d) Planner

5) log use csvlog format http://cheng.logdown.com/posts/2016/04/08/enable-logging-in-...

http://bonesmoses.org/2014/08/05/on-postgresql-logging-verbo...

log_destination = 'csvlog' logging_collector = on log_min_duration_statement = 250ms log_checkpoints = on log_lock_waits = on log_temp_files = 0

https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Serv...

6) shared_buffers (cache for data) Sets the amount of memory the database server uses for shared memory buffers. Below 64GB set to %25 of total system memory above 64GB set to 16GB

7) work_mem (affect slow joins) Specifies the amount of memory to be used by internal sort operations and hash tables before writing to temporary disk files. The value defaults to four megabytes (4MB). Note that for a complex query, several sort or hash operations might be running in parallel; each operation will be allowed to use as much memory as this value specifies before it starts to write data into temporary files. Also, several running sessions could be doing such operations concurrently. Therefore, the total memory used could be many times the value of work_mem; it is necessary to keep this fact in mind when choosing the value. Sort operations are used for ORDER BY, DISTINCT, and merge joins. Hash tables are used in hash joins, hash-based aggregation, and hash-based processing of IN subqueries.

Start low: 32-64MB. Look for 'temporary file' lines in logs Set to 2-3x the largest temp file you see. Can cause a huge speed-up if set properly! But be careful: It can use that amount of memory per planner node.

8)maintenance_work_mem (integer) Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. It defaults to 64 megabytes (64MB). Since only one of these operations can be executed at a time by a database session, and an installation normally doesn't have many of them running concurrently, it's safe to set this value significantly larger than work_mem. Larger settings might improve performance for vacuuming and for restoring database dumps.

Note that when autovacuum runs, up to autovacuum_max_workers times this memory may be allocated, so be careful not to set the default value too high. It may be useful to control for this by separately setting autovacuum_work_mem.

* 10% of system memory, up to 1 GB. Maybe even higher if you are having VACUUM problems.

9) effective_cache_size effective_cache_size should be set to an estimate of how much memory is available for disk caching by the operating system and within the database itself, after taking into account what's used by the OS itself and other applications. This is a guideline for how much memory you expect to be available in the OS and PostgreSQL buffer caches, not an allocation! This value is used only by the PostgreSQL query planner to figure out whether plans it's considering would be expected to fit in RAM or not. If it's set too low, indexes may not be used for executing queries the way you'd expect. The setting for shared_buffers is not taken into account here--only the effective_cache_size value is, so it should include memory dedicated to the database too. Setting effective_cache_size to 1/2 of total memory would be a normal conservative setting, and 3/4 of memory is a more aggressive but still reasonable amount. You might find a better estimate by looking at your operating system's statistics. On UNIX-like systems, add the free+cached numbers from free or top to get an estimate. * Set to the amount of file system cache available. * If you don't know set it to 75% of total system memory free -m

For this, I highly recommend this tutorial by Christophe Pettus from PyCon: https://www.youtube.com/watch?v=knUitQQnpJo

It covers the basics of setting up Postgres as well as other things.

HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.
~ yaj@
;laksdfhjdhksalkfj more things
yahnd.com ~ Privacy Policy ~
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.