Hacker News Comments on "AWS re:Invent 2018: Amazon DynamoDB Under the Hood: How We Built a Hyper-Scale Database (DAT321)" Amazon Web Services Youtube Video



Amazon Web Services · Youtube · 6 HN comments

HN Theater has aggregated all Hacker News stories and comments that mention Amazon Web Services's video "AWS re:Invent 2018: Amazon DynamoDB Under the Hood: How We Built a Hyper-Scale Database (DAT321)".

Youtube Summary

Come to this session to learn how Amazon DynamoDB was built as the hyper-scale database for internet-scale applications. In January 2012, Amazon launched DynamoDB, a cloud-based NoSQL database service designed from the ground up to support extreme scale, with the security, availability, performance, and manageability needed to run mission-critical workloads. This session discloses for the first time the underpinnings of DynamoDB, and how we run a fully managed nonrelational database used by more than 100,000 customers. We cover the underlying technical aspects of how an application works with DynamoDB for authentication, metadata, storage nodes, streams, backup, and global replication.


Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.

⬐

Jan 20, 2022 · schwarzmx on DynamoDB 10 years later

This one is pretty good for DynamoDB: https://youtu.be/yvBR71D0nAQ

⬐

Jan 20, 2022 · mjb on DynamoDB 10 years later

We don't have a paper on DynamoDB's internals (yet?), but here's a talk you might find interesting from one of the folks who built and ran DDB for a long time: https://www.youtube.com/watch?v=yvBR71D0nAQ
And Doug Terry talking through the details of how DynamoDB's transaction protocol works: https://www.usenix.org/conference/fast19/presentation/terry
If we did publish more about the internals of DDB, what would you be looking to learn? Architecture? Operational experience? Developer experience? There's a lot of material we could share, and it's useful to hear where people would like us to focus.

⬐ pow_pp_-1_v
All of it - architecture, operational experience, best practices etc.

⬐ ldrndll
Just want to second this. All of the above sounds really interesting to me!

⬐

Apr 14, 2021 · ec109685 on Latency Comparison: DynamoDB vs. FaunaDB vs. Redis

If people really want to know how DynamoDB works, this is a good tech talk: https://www.youtube.com/watch?v=yvBR71D0nAQ

⬐

Jun 23, 2019 · appwiz on Learning to Build Distributed Systems

Here are a couple of videos from re:Invent 2018:
Jaso talking about DynamoDB internals https://www.youtube.com/watch?v=yvBR71D0nAQ
Marc talking about Lambda internals https://www.youtube.com/watch?v=QdzV04T_kec

⬐

Jan 10, 2019 · talawahtech on Amazon DocumentDB, with MongoDB compatibility

Saying DynamoDB is built on top of InnoDB is a pretty big oversimplification of a much more complex distributed system[1], and for all we know they could have switched out the low-level storage engine on the backend for something like RocksDB or WiredTiger.
The Aurora storage subsystem is much more limited in terms of horizontal scalability and performance; they probably chose it because it was a better/quicker fit.
1. https://youtu.be/yvBR71D0nAQ

⬐ evil-olive
Yeah, I used to work on DynamoDB, I know it's more complicated (much more complicated than that video makes out - their code quality was atrocious, like 2000-5000 line Java classes in 3 or 4 deep inheritance hierarchies; no unit tests, only "smoke tests" that took 2 hours to run and were so prone to race conditions that common advice was to close everything else on your machine, run them, then leave them alone while you went to meetings)
There was work underway at the time I left to replace InnoDB with WiredTiger. It seemed to be very slow going, and I suspect WiredTiger being acquired by 10gen had a part in it. They also had only 1-2 engineers on the project of ripping out MySQL and replacing it, in a long-lived branch that constantly dealt with merge conflicts from more active feature development happening on mainline.
Aurora, simply by virtue of being newer and learning from DDB's mistakes (in the same way DDB learned from SimpleDB and the original Dynamo) probably has better extension points for supporting (MySQL, Postgres, Mongo) in a sane way.

⬐ talawahdotnet
Interesting, how long ago was that? I would be curious to know if the WiredTiger switch ever happened, and what that support relationship looks like now given the contentious relationship between MongoDB and AWS. The old Wired Tiger Inc website[1] still lists AWS as a customer.
Then again, the relationship between AWS and Oracle is even more contentious, and Aurora MySQL is one of AWS's most popular products, so I don't think they are terribly worried about building on competitors' technologies.
1. http://www.wiredtiger.com/

⬐ evil-olive
3+ years ago, so it's entirely possible that things have changed since I left. I don't have any more recent information on the state of the system.
At least when I was there, the strong focus was always on adding new features (global & local secondary indexes, change streams, cross-region replication, and so on) to keep up with the Joneses (MongoDB et al).
Meanwhile, a bunch of internal Amazon teams were taking a dependency on it instead of being their own DBAs, and those teams didn't care that much about the whiz-bang features, they just wanted a reliable scale-out datastore that someone else would get paged about when some component failed.
Adding features at a breakneck pace while keeping up umpteen-nines reliability and handful-of-milliseconds performance meant tech debt and non-user-facing improvements, including WiredTiger, all got sidelined. Around the time I left, our page load was around 200 per week. That's one page every 50 minutes, 24/7, if you're keeping score at home.

⬐ talawahtech
Given the scale and popularity of DynamoDB and its distributed nature, you would think that they could hire multiple teams just to work on improving it, but I guess it isn't as simple as that.
I would love to get a behind the scenes look at the process of gradually improving the components of DynamoDB with better technologies, while still maintaining reliability and performance.

⬐ manigandham
According to this post[1], the WiredTiger project seems to have been cancelled after the acquisition.
1. https://news.ycombinator.com/item?id=13170746#13173927

⬐

Dec 03, 2018 · manigandham on Amazon's cloud business is competing with its own customers

DynamoDB is different from their published paper, which is mostly about designing a highly available key/value system that can be run on top of any other datastore. The paper does mention MySQL as a storage option.
The recent MySQL info comes from this 2016 thread on DynamoDB storing empty strings: https://news.ycombinator.com/item?id=13170746
Confirmed by this comment specifically: https://news.ycombinator.com/item?id=13173927
The latest 2018 re:Invent deep dive on DynamoDB doesn't reveal anything but still fits if MySQL is powering the storage nodes: https://www.youtube.com/watch?v=yvBR71D0nAQ
