HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
AWS re:Invent 2018: Amazon DynamoDB Deep Dive: Advanced Design Patterns for DynamoDB (DAT401)

Amazon Web Services · Youtube · 16 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention Amazon Web Services's video "AWS re:Invent 2018: Amazon DynamoDB Deep Dive: Advanced Design Patterns for DynamoDB (DAT401)".
Youtube Summary
This session is for those who already have some familiarity with DynamoDB. The patterns and data models discussed in this session summarize a collection of implementations and best practices leveraged by Amazon.com to deliver highly scalable solutions for a wide variety of business problems. The session also covers strategies for global secondary index sharding and index overloading, scalable graph processing with materialized queries, relational modeling with composite keys, and executing transactional workflows on DynamoDB.
Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Maybe for short durations. But when it comes to an hour, Rick Houlihan might be world champion.

https://www.youtube.com/watch?v=HaEPXoXVf2k

"Amazon DynamoDB Deep Dive" https://www.youtube.com/watch?app=desktop&v=HaEPXoXVf2k&list...

why and how of the single table design approach

__derek__
That's exactly what I thought of when reading the title.
This is the video I recommend to others when working with dynamodb. The video is by Rick Houlihan about dynamodb modeling. In my experience most developers that complain about dynamodb don't fully understand it.

https://www.youtube.com/watch?v=HaEPXoXVf2k

newlisp
And many developers don't fully understand that he's a good salesman and can't see through the BS.
sass_muffin
All technologies have their pros and cons. They have use cases where they make sense and use cases where they don't. The job of an engineer is to decide which tool fits which use case. To dismiss a useful technology as "BS", especially one used by companies all over the world for over a decade, without any backing data seems a bit disingenuous.
newlisp
> All technologies have their pros and cons. They have use cases where they make sense and use cases where they don't. The job of an engineer is to decide which tool fits which use case.

Exactly. But that's not how he paints it. I have seen him bashing RDBMSs as being a thing of the past and presenting his promoted way of data modeling and "new" database technology as how companies should start today or be moving to.

SPBS
DynamoDB can model relational data just fine, if you're okay with setting your query access patterns in stone and never changing them again.
Jul 14, 2022 · mabbo on The DynamoDB Paper
Rick Houlihan did a talk a few years ago about designing the data layer for an application using DynamoDB. The most common reaction I get from people I show it to (most of them Amazon SDEs who operate services that use DynamoDB) is "Holy shit what is this wizardry?!"

https://youtu.be/HaEPXoXVf2k

One of the biggest mistakes people make with dynamo is thinking that it's just a relational database with no relations. It's not.

It's an incredible system, but it requires a lot of deep knowledge to get the full benefits, and it requires you, often, to design your data layer very well up-front. I actually don't recommend using it for a system that hasn't mostly stabilized in design.

But when used right, it's an incredibly performant beast of a data store.

time0ut
I also recommend Alex DeBrie's "The DynamoDB Book" (https://www.dynamodbbook.com/). It is a great resource that talks about these design patterns in depth. It has served me and my team well over the past few years.
aarondf
Seconded! Alex DeBrie is a great teacher.
ronjouch
For explicitness & searchability, commenting with the title of this talk, which is indeed excellent, not limited to DynamoDB, and which was kind of a revelation after years of using DynamoDB suboptimally:

Rick Houlihan - AWS re:Invent 2018: Amazon DynamoDB Deep Dive: Advanced Design Patterns for DynamoDB (DAT401) , https://www.youtube.com/watch?v=HaEPXoXVf2k

It should be watched along with reading the associated doc: https://docs.aws.amazon.com/amazondynamodb/latest/developerg...

itsmemattchung
Definitely one of my favorite talks by Rick and I apply lessons learned in that video on a daily basis.

Must have watched that video about 4-5 times before I really grasped the topics, since the start of my career burned the concept of relational databases into my head. Breaking from that pattern of thought was difficult, initially.

ngc248
Indeed, with GSIs etc. you can implement a priority queue or store data in the order you want. Once you are clear on the access patterns of your app, DynamoDB is amazing to model for and will scale with your app. But if you are not clear about your app's access patterns or need ad hoc queries, then DynamoDB is not a good fit.
0xbadcafebee
Can be performant, nowadays anyway. Worked with a team who built their own implementation because Amazon's was too slow and expensive.

It's a weird model. Too small of a dataset and it doesn't quite make sense to use Dynamo. Too big of a dataset and it's full of footguns. Medium-sized may be too expensive.

coredog64
Too-small seems to be the perfect use case for DDB. I need someplace to stash stuff and look it up by key. A full RDS is overkill, as is anything else that requires nodes that charge by the hour.
zurn
It's interesting how seldom "work harder on sw engineering and increase app complexity" style tradeoffs are questioned when targeting AWS, when it should be only for the very rare "web scale" app.
pfkurtz
Thank you for this recommendation, I'm on a DynamoDB contract job and... really learning to think hard about key structure and designing for efficient querying, rather than efficient storage.
LAC-Tech
Thanks, bookmarked this. It's good to see a proper take on data modelling on document stores instead of just "throw any old JSON in there it'll be fine!!!"
davidjfelix
It's worth noting that a lot of the early database designs, including this 2018 video, pre-date some dramatic improvements to DynamoDB usability.

I think the biggest ones were:

- an increase in the number of GSIs you can create (Dec 2018) [1]

- making on-demand possible [2]

- an increase in the default limit for number of tables you can create (Mar 2022) [3]

I don't think these new features necessarily make the single-table, overloaded GSI strategy that's discussed in the video obsolete, but they enable applications which are growing to adopt an incremental GSI approach and use multiple tables as their data access patterns mature.

Some other posters have recommended Alex DeBrie's dynamodb book and I also think that's an excellent resource, but I'd caution people who are getting into dynamodb not to be scared by the claims that dynamodb is inflexible to data access changes, since AWS has been adding a lot of functionality to support multi-table, unknown access patterns, emerging secondary indexes, etc.

- [1] https://aws.amazon.com/about-aws/whats-new/2018/12/amazon-dy...

- [2] https://aws.amazon.com/blogs/aws/amazon-dynamodb-on-demand-n...

- [3] https://aws.amazon.com/about-aws/whats-new/2022/03/amazon-dy...

pdhborges
People don't need to be scared they just need to do their homework.

In my opinion, having more tables and more GSIs available won't help you very much if you started with a flawed data model (unless you kept making the same design mistakes 256 times). A team that tries to claw back from a flawed table design by piling up GSIs is just in for a world of pain.

So if you are planning to go with Dynamo:

- Read about the data modeling techniques

- Figure out your access patterns

- Check if your application and model can withstand the eventual consistency of GSIs

- Have a plan to rework your data model if requirements change: Are you going to incrementally rewrite your table? Are you going to export it and bulk load a fixed data model? How much is that going to cost?

Twirrim
Something else important to mention is that dynamodb now re-consolidates tables.

This is a lousy explanation, but Read/Write quota is split evenly over all partitions. Each partition is created based on the hash-key used, and there's an upper limit on how much data can be stored in any given partition. So if you end up with a hot hash-key, lots of stuff in it, that data gets split over more and more and more partitions, and the overall throughput goes down (quota is split evenly over partitions).

I believe this is still a general risk, and you need to be extremely canny about your use of hash key to avoid it, but historically they couldn't reconsolidate partitions. So you'd end up with a table in a terrible state with quota having to be sky high to still get effective performance. The only option then was to completely rotate tables. New table with a better hash-key, migrate data (or whatever else you needed to do).

Now at least, once the data is gone, the partitions will reconsolidate, so an entire table isn't a complete loss.
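
One commonly described way to keep a single hash key from running hot, and the kind of "sharding" the talk summary mentions, is to append a calculated suffix to the key so that writes land on several key values instead of one. This is not taken from the comment above; it is a minimal sketch in which the key format, the shard count, and the names SHARD_COUNT, sharded_key, and all_shard_keys are all hypothetical.

```python
import hashlib
from typing import List

# Assumption: 8 shards per logical key; a real value depends on the workload.
SHARD_COUNT = 8


def sharded_key(base_key: str, item_id: str, shards: int = SHARD_COUNT) -> str:
    """Derive a stable shard suffix from the item id and append it to the key."""
    digest = hashlib.sha256(item_id.encode("utf-8")).digest()
    shard = int.from_bytes(digest[:4], "big") % shards
    return f"{base_key}#{shard}"


def all_shard_keys(base_key: str, shards: int = SHARD_COUNT) -> List[str]:
    """List every key a reader has to query to see the whole logical key."""
    return [f"{base_key}#{n}" for n in range(shards)]


# The same item always maps to the same shard, so it can be found again.
write_key = sharded_key("EVENTS#2022-07-14", "order-1001")
read_keys = all_shard_keys("EVENTS#2022-07-14")
```

The trade-off is on the read side: a query for the whole logical key has to fan out over every entry in read_keys and merge the results.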

GauntletWizard
This bit me badly - An application that did significant autoscaling, and hit a peak of 30,000 read/write requests per second - But typically did more like 300.

The conversation with the Amazon support engineer told us that we had over a hundred partitions (which even he admitted was high for that number), and so our quota was effectively giving us 0 iops per partition. This obviously didn't work, and their only solution was "scale it back up, copy everything to a new table". Which we did, but was an engineering effort I'd rather have avoided.
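
The arithmetic behind this can be sketched under the simplified even-split model described in these comments. The partition count of 100 is an assumed round figure (the comment only says "over a hundred"), per_partition_throughput is a hypothetical helper, and none of this is a statement about how DynamoDB allocates capacity today.

```python
def per_partition_throughput(provisioned_units: float, partitions: int) -> float:
    """Split table-level provisioned capacity evenly across partitions."""
    if partitions <= 0:
        raise ValueError("partitions must be positive")
    return provisioned_units / partitions


# Assumed figures: capacity sized for the 30,000 req/s peak, then scaled
# back to the typical 300 req/s, with the partition count staying at 100.
at_peak = per_partition_throughput(30_000, 100)
after_scale_down = per_partition_throughput(300, 100)
```

Under these assumptions each partition gets 300 units at peak but only 3 after scaling down, so any key busier than that is throttled even though the table as a whole looks adequately provisioned.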

Jan 20, 2022 · belter on DynamoDB 10 years later
You never heard of Rick Houlihan? He is 90% of DynamoDB evangelism... At the same time you are able to do these internal lookups? Do you work with DynamoDB?

AWS re:Invent 2018: Amazon DynamoDB Deep Dive: Advanced Design Patterns for DynamoDB (DAT401) https://youtu.be/HaEPXoXVf2k

AWS re:Invent 2019: [REPEAT 1] Amazon DynamoDB deep dive: Advanced design patterns (DAT403-R1) https://youtu.be/6yqfmXiZTlM

AWS re:Invent 2020: Amazon DynamoDB advanced design patterns – Part 1 https://youtu.be/MF9a1UNOAQo

AWS re:Invent 2020: Amazon DynamoDB advanced design patterns – Part 2 https://youtu.be/_KNrRdWD25M

AWS re:Invent 2021 - DynamoDB deep dive: Advanced design patterns https://youtu.be/xfxBhvGpoa0

Amazon DynamoDB | Office Hours with Rick Houlihan: Breaking down the design process for NoSQL applications https://www.twitch.tv/videos/761425806

awsthro00945
No, I haven't. There are thousands of reinvent sessions every year. I don't watch them all (I don't watch hardly any of them, and most people I know in Amazon watch a couple breakout sessions if that. Some don't even watch the keynotes). Their targeted audience is AWS customers, not internal engineers. Reinvent itself is a sales conference. If internal Amazonians want to learn about something like DDB, there are internal talks and documents given by the engineering leaders that we watch.

>At the same time you are able to do these internal lookups?

I looked him up on LinkedIn. Nothing internal about it.

amzn-throw
Do you expect the engineers on your team to know the top sales person at your company?

This person might be responsible for the majority of evangelism and revenue for the company. Do you expect the SDEs to know about him?

Again, no shot against Rick - he is amazing, smart, technical, competent, and a deep owner.

But the average SDE on the team won't know about these or watch these talks. There are too many deep internal engineering challenges to solve.

garethmcc
I have watched almost all those talks, as they are technically dense and full of very good and very useful technical knowledge that I would be much poorer for not watching. These are not sales videos but highly complex instructional content meant for developers on the ground.
belter
Are you calling the person who did the core DynamoDB Technical Deep Dive sessions at reInvent, for the last 4 years in a row, a sales person?
amzn-throw
What do you think Solutions Architects and Developer Advocates (between the two groups who do most Re:invent sessions) are?

Hell, what do you think re:Invent is? It's a sales conference.

In any company you have two groups of people: Those that build the product, and those that sell it. Ultimately, solutions architects and developer advocates are there to help sell the product.

Of course Amazon is customer obsessed. And genuinely interested in ensuring customers have a good experience, and their technical needs are met - through education, support, and architectural guidance. But ultimately, that's what it is.

belter
I think I understand now why he left...
awsthro00945
There are over a thousand breakout sessions at every reinvent every year. Some of the speakers are sales people, some are engineers, some are managers. There are L5 or junior engineers who give reinvent session talks. It's a fun gig, but it doesn't mean that the speaker is some top executive or anything like that.

Rick was in the sales org. His primary job was sales. Reinvent is a sales conference. Speaking at reinvent is a sales pitch. He was a salesperson. I'm not sure why you're so offended by that. Being a salesperson isn't bad, it's just an explanation for why engineers wouldn't have heard of him.

gigatexal
Maybe that was the problem. He cited that there was seemingly not enough effort in making DynamoDB better as evidenced by the many orthogonally very close other DBs that AWS promotes. If Rick was ears to the ground listening to customers and sending back feedback but it was falling on deaf ears that's enough ground for someone as high up and as influential and productive as him to leave. It also speaks to inner AWS turmoil at least at DynamoDB.
awsthro00945
>It also speaks to inner AWS turmoil at least at DynamoDB.

How? Rick wasn't part of the DynamoDB service team. He wasn't an engineer, nor a manager on the team, nor even a product manager. He was a salesperson that specialized in DDB. He most likely had very little interactions, if any, with the engineering team. I don't see how him leaving speaks at all to anything about the inner workings of the engineering teams.

Rick seems cool, and after skimming some of his chats he seems really knowledgeable about the customer-facing side of DDB, and I mean absolutely no disrespect to him. But I think you're making way too many assumptions about his "rank" and "influence" within the company.

amzn-throw
Based on what I know, that's not the case.

DDB is a steady ship. The explanation on https://news.ycombinator.com/item?id=30009611 is likely the best explanation. L7 TPMs make the same money as L6 SDEs.

Getting promoted to L8 - director - is a monumental effort and likely seemed much harder than pursuing a comparable position at MongoDB.

Good for him for doing it, and for making Amazon take a long hard look at every way they failed in not keeping him.

We realized how great Dynamo was only after we migrated off AWS.

Dynamo was a key factor for us when we were releasing the MVP of our News API [0]. We used Dynamo, ElasticSearch, and Lambda, and got it running in 60 days while employed full-time.

Also, the best tech talk I have seen was given by Rick Houlihan at re:Invent [1]

I highly recommend every engineer to watch it: it's a great overview of SQL vs NoSQL

[0] https://newscatcherapi.com/blog/how-we-built-a-news-api-beta...

[1] https://www.youtube.com/watch?v=HaEPXoXVf2k

pier25
BTW Rick Houlihan left AWS recently to work for Mongo.

https://twitter.com/houlihan_rick/status/1472969503575265283

On that thread he criticizes AWS regarding DynamoDB openly.

> I will always love DynamoDB, but the fact is it is losing ground fast because AWS focuses most of their resources on the half baked #builtfornopurpose database strategy. I always hated that idea, I just bit my tongue instead of saying it.

> The problem is the other half-baked database services that all compete for the same business. DocumentDB, Keyspaces, Timestream, Neptune, etc. Databases take decades to optimize, the idea that you can pump them out like web apps is silly.

> I was very tired of explaining over and over again that DynamoDB is actually not the dumbed down Key-Value store that the marketing message implied. When AWS created 6 different NoSQL databases they had to make up reasons for each one and the messaging makes no sense.

throwdbaaway
Interesting. MongoDB actually came to mind while I was reading the other comment here:

> No one uses DynamoDB alone: they bolt it onto Postgres after realizing they have availability or scale needs beyond what a relational database can do, then they bolt on Elasticsearch to enable querying, and then they bolt on Redis to make the disjointed backend feel fast. And I'm just talking operational use cases; ignoring analytics here.

Perhaps MongoDB is prime for a comeback?

pier25
In my circles Mongo has always been considered a bad database.

If Rick's vouching for it maybe it's time to give it a try. It must be pretty mature by now.

leetrout
It had some operational quirks 10 years ago (allocating giant chunks of space was more of an issue than data loss) and I've not used it directly in that many years. We lost some data during an OOM process kill but it was just twitter firehose data so not a huge deal.

Lots of good info in the response in this SO post

https://stackoverflow.com/questions/10560834/to-what-extent-...

leetrout
Not snark: did MongoDB ever go away?

I've seen it used in many places over the years.

Today I would choose JSON in Postgres before I would just jump to Mongo, but it certainly serves a purpose for many shops and it is still widely used AFAIK.

I _really_ miss RethinkDB.

jd_mongodb
What do you miss about RethinkDB?
leetrout
The table model with joins that mostly "just worked" and the sweet web ui that came with it by default.

Hat tip for Compass, very nice tool I was just using last week.

We use Atlas and it "just works" so no comment on administering mongo vs rethink haha.

Jan 20, 2022 · unfunco on DynamoDB 10 years later
I struggled at first but I watched Advanced Design Patterns for DynamoDB[0] a few times and it clicked. As other responses have suggested, generally you define your access patterns first and then structure the data later to fit those access patterns.

[0]: https://www.youtube.com/watch?v=HaEPXoXVf2k
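
The "access patterns first" approach can be illustrated without touching AWS at all. The sketch below is an in-memory stand-in, not DynamoDB code: the PK/SK attribute names, the CUSTOMER#/ORDER# key formats, and the query helper are assumptions chosen for illustration. It starts from two access patterns (fetch a customer's profile, list a customer's orders by date) and shapes the keys so each is a single lookup on one partition key.

```python
from typing import Dict, List

# Hypothetical items sharing one table; the key formats are illustrative only.
TABLE: List[Dict[str, str]] = [
    {"PK": "CUSTOMER#42", "SK": "PROFILE", "name": "Ada"},
    {"PK": "CUSTOMER#42", "SK": "ORDER#2022-01-20#1002", "total": "5.00"},
    {"PK": "CUSTOMER#42", "SK": "ORDER#2022-01-03#1001", "total": "19.99"},
    {"PK": "CUSTOMER#7", "SK": "PROFILE", "name": "Bob"},
]


def query(pk: str, sk_prefix: str = "") -> List[Dict[str, str]]:
    """Mimic a key query: exact partition key, sort-key prefix, sorted by sort key."""
    matches = [item for item in TABLE
               if item["PK"] == pk and item["SK"].startswith(sk_prefix)]
    return sorted(matches, key=lambda item: item["SK"])


# Access pattern 1: a customer's profile.
profile = query("CUSTOMER#42", "PROFILE")
# Access pattern 2: a customer's orders, oldest first (ISO dates sort lexically).
orders = query("CUSTOMER#42", "ORDER#")
```

A pattern that was not planned for, such as "all orders over a given total across every customer", has no key to lean on here and would need a scan or a new index, which is the rigidity several commenters warn about.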

The issue is that the keys can be used, to varying extents, to handle relational data (this re:Invent video shows this well: https://youtu.be/HaEPXoXVf2k?t=1363). What people miss, though, is that in order not to shoot yourself in the foot, you have to have stable, known access patterns which meet the constraints of DynamoDB.

Even the documentation of DynamoDB says so:

https://docs.aws.amazon.com/amazondynamodb/latest/developerg...

This means that it's a good fit for Amazon's shopping cart, since there's limited innovation possible there and it fits the bill nicely, but for a new application with relational data where the requirements change all the time, it's a terrible, terrible choice for which you will pay dearly.

If I may share some advice - don't bother. As another commenter pointed out, relationships are inherent in a lot of the data you'll store for your platform, and for those elements that you'd like to keep the full data hash, just use jsonb in Postgres.

Once you're at several million in revenue, then you can think about splitting off some parts of your data into document storage solutions for certain efficiency reasons, but that is by no means necessary.

This is a great video on the topic: https://www.youtube.com/watch?v=HaEPXoXVf2k

Check out this video from AWS's principal DynamoDB expert that touches on comparisons against relational DBs: https://www.youtube.com/watch?v=HaEPXoXVf2k
Rick has had the most-watched re:Invent talk the last three years.

Here are the links for each:

- 2019: https://www.youtube.com/watch?v=6yqfmXiZTlM

- 2018: https://www.youtube.com/watch?v=HaEPXoXVf2k

- 2017: https://www.youtube.com/watch?v=jzeKPKpucS0

Additional DynamoDB links here: https://github.com/alexdebrie/awesome-dynamodb

I've seen this done for new projects, and it works really well. If your data access patterns are truly relational (varied lookup paths) then it is probably not the right tool, but many apps can be modeled in a way that DDB handles well.

Highly recommended viewing: https://www.youtube.com/watch?v=HaEPXoXVf2k this talk explains how relational data can be efficiently modeled for key-value stores.

time0ut
A truly excellent talk. The free DynamoDB chapter of the book this article is advertising is really poor compared to the large amount of free documentation and training available from AWS and others.
AWS re:invent 2018 talks:

https://youtu.be/HaEPXoXVf2k

He has a sequence of 2-3 great talks on DynamoDB, the history of relational databases and the rise of access-pattern oriented db design.

If you go single table you can use filtered indexes, if your database supports them, to speed up queries instead of having to do a row by row search with a where clause.

Personally this seems like a job for DynamoDB but that’s a whole beast unto itself. You must know the queries you will do to the database beforehand, as that guides how you craft it. There’s an amazing talk at re:Invent that is given almost every year on creating a DynamoDB schema (https://m.youtube.com/watch?v=HaEPXoXVf2k)

I would normalize as the theory says and then denormalize as performance requires it.
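
The filtered-index suggestion above can be tried with SQLite, which calls them partial indexes. This is a minimal sketch, assuming a hypothetical single "entities" table with a "kind" column; the table, column, and index names are invented for illustration, and other databases use their own syntax for the same idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entities ("
    "id INTEGER PRIMARY KEY, kind TEXT NOT NULL, customer_id INTEGER, payload TEXT)"
)
conn.executemany(
    "INSERT INTO entities (kind, customer_id, payload) VALUES (?, ?, ?)",
    [("order", 42, "book"), ("order", 7, "lamp"),
     ("invoice", 42, "inv-1"), ("order", 42, "pen")],
)

# Partial index: only rows with kind = 'order' are indexed.
conn.execute(
    "CREATE INDEX idx_orders_by_customer ON entities (customer_id) "
    "WHERE kind = 'order'"
)

# The query repeats the index's WHERE term so the planner can consider it.
SQL = ("SELECT payload FROM entities "
       "WHERE kind = 'order' AND customer_id = ? ORDER BY id")
orders = [row[0] for row in conn.execute(SQL, (42,))]

# Planner description; whether the index is picked depends on the SQLite version.
plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + SQL, (42,))]
```

Inspecting plan shows whether the lookup is served by idx_orders_by_customer or falls back to scanning the table.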

While the move seems to be towards NewSQL databases (Spanner/CockroachDB), the answer to relations in NoSQL is to model the data differently. This can generally resolve a lot of the problems inherent in the need for relations.

This might not be useful advice for everyone, but I can highly recommend this youtube video on DynamoDB data modelling: https://www.youtube.com/watch?v=HaEPXoXVf2k

HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.