HN Books @HNBooksMonth

The best books of Hacker News.

Hacker News Comments on
Joe Celko's Trees and Hierarchies in SQL for Smarties (The Morgan Kaufmann Series in Data Management Systems)

Joe Celko · 4 HN comments
HN Books has aggregated all Hacker News stories and comments that mention "Joe Celko's Trees and Hierarchies in SQL for Smarties (The Morgan Kaufmann Series in Data Management Systems)" by Joe Celko.
View on Amazon [↗]
HN Books may receive an affiliate commission when you make purchases on sites after clicking through links on this page.
Amazon Summary
The demand for SQL information and training continues to grow with the need for a database behind every website capable of offering web-based information queries. SQL is the de facto standard for database retrieval, and if you need to access, update, or utilize data in a modern database management system, you will need SQL to do it. The Second Edition of Joe Celko's Trees and Hierarchies in SQL for Smarties covers two new sets of extensions over three entirely new chapters and expounds upon the changes that have occurred in SQL standards since the previous edition's publication. Benefit from mastering the challenging aspects of these database applications in SQL as taught by Joe Celko, one of the most-read SQL authors in the world.
HN Books Rankings

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this book.
If this topic interests people, take a look at Joe Celko's book called Trees and Hierarchies: https://www.amazon.com/Hierarchies-Smarties-Kaufmann-Managem...
WalterGR
Available on what was formerly O'Reilly Safari: https://learning.oreilly.com/library/view/joe-celkos-trees/9...
Why not add that functionality directly to SQLite via stored procs*

https://www.amazon.com/Hierarchies-Smarties-Kaufmann-Managem...

*https://github.com/wolfch/sqlite-3.7.3.p1

kevas
Can’t edit on my phone and meant to toss this in the comment too: https://www.amazon.com/Hierarchies-Smarties-Kaufmann-Managem...
pstuart
A more recent effort: https://cgsql.dev/
kevas
Thanks
claytongulick
I just read through the docs of this, what an amazing project.

I'm considering doing a js template string implementation for node.. cql`...` type thing with an internal compilation cache.

I don't know much about this but Joe Celko has a whole book on representing trees in SQL: http://www.amazon.co.uk/Hierarchies-Smarties-Kaufmann-Manage...

The obvious first question would be: why are you representing your data as a graph? Can you represent it better as sets of predicates (ie: relations)?

Using views should not reduce the number of queries: a view is just a query with a name. If you can do it by combining views you can do it by combining queries.

A good book on the subject is Joe Celko's Trees and Hierarchies in SQL for Smarties.

http://www.amazon.com/Hierarchies-Smarties-Edition-Kaufmann-...

He spends a chapter on each of the models outlined in this post: adjacency, path, and nested set models.

platz
I've done nested set before; it's interesting - very fast for queries, but requires a lengthy insert/update cost. Also, team members were absolutely clueless as to what was really going on. I'm not sure nested sets are really much faster than what modern rdbms's can provide today.
rickmode
One just needs to be aware that nested sets eventually hit a limit.

I experimented with a variation by Vadim Tropashko using something called "Farey fractions" [1]. These represent the intervals as a 2x2 matrix of four integers rather than two floating point values.

The numbers are effectively limited to 32-bit values in the matrix since a multiplication is required (resulting in 64-bit intermediate results).

It was very interesting, but couldn't model a file system hierarchy well. It could roughly 10^32 items in the best case, but a very small number (hundreds) in edge cases.

For example, it maxes out at a depth of 17 with 100 items at each level, or a depth of 34 with 10 items each. This might be fine for modeling some hierarchies, but definitely not a file system. The edge case is if the fraction extends along one edge. So if we have 10 items per level, and add a child at the leftmost edge each time, we create the most costly fractional subdivision. Do this 35 times and you hit an math overflow.

[1] Check out the Chapter 5 and Errata links: http://vadimtropashko.wordpress.com/“sql-design-patterns”-bo...

tempVariable
In addition to expensive insert/update, it is necessary to keep the left and right boundaries across ALL of your entries in perfect order. If the boundaries get out of whack, fixing the tree is a nightmare scenario.

I worked on a multi-tenant application with distinct trees present in one table and with one tree per table and so on. Fun fun fun!

alistairbayley
I think the nested intervals model, a refinement of nested sets, solves this slow update problem: http://www.rampant-books.com/art_vadim_nested_sets_sql_trees...

But I've never had to use it, so I am just guessing.

Same article, different site: https://communities.bmc.com/docs/DOC-9902

And a paper: http://www.sigmod.org/publications/sigmod-record/0506/p47-ar...

Here's a comparison of the different approaches in a matrix: http://vadimtropashko.wordpress.com/2008/08/09/one-more-nest...

mceachen
If you're storing a tree in an RDBMS, please look into the closure table algorithm rather than adjacency, nested set, or materialized paths.

If you're using Rails (3.2 through 4.1), try this: http://mceachen.github.io/closure_tree/

Like it says in the README:

* Fetch your whole ancestor lineage in 1 SELECT. * Grab all your descendants in 1 SELECT. * Get all your siblings in 1 SELECT. * Fetch all descendants as a nested hash in 1 SELECT. * Find a node by ancestry path in 1 SELECT. * 2 SQL INSERTs on node creation * 3 SQL INSERT/UPDATEs on node reparenting

None of the other approaches above have even remotely similar performance characteristics. If your tree is small (tens of nodes), you won't care. If it's bigger, you will.

Todd
I wasn't aware of closure trees before, thanks. The presentation that you link to by Bill Karwin, along with a few other resources are in the comments by coleifer and chdir.
HN Books is an independent project and is not operated by Y Combinator or Amazon.com.
~ yaj@
;laksdfhjdhksalkfj more things
yahnd.com ~ Privacy Policy ~
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.