HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
“Fashion Is Hard. PostgreSQL Is Easy” · 154 HN points · 0 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention the video "Fashion Is Hard. PostgreSQL Is Easy".
Summary
Watch the video of Zalando's Valentin Gogichashvili keynoting this year's PGConf US.

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Jun 18, 2015 · 154 points, 135 comments · submitted by nickdotmulder
That title is really great, and also rings true. And of all open-source SQL DBs, PostgreSQL is superbly executed and a breeze to work with. I can only recommend it.
As someone who's used to MySQL, I keep trying to use PostgreSQL (better standards compliance, better behaviour in the face of bad data) - and I keep going back. The psql interface is awkward (too many magic backslash commands to memorize instead of SHOW CREATE TABLE) and always feels slightly laggy somehow.
The backslash commands are way better than having to type out SHOW TABLES; or SHOW DATABASES;, or CONNECT DATABASE X. All I have to do is \dt or \c x. It's so much better. Can't remember them? Just do \?.
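For readers keeping score, here is a rough mapping between the two clients (a sketch, not exhaustive - `\?` inside psql lists everything):

```sql
-- MySQL statement             psql meta-command
-- SHOW DATABASES;             \l           (list databases)
-- SHOW TABLES;                \dt          (describe tables)
-- DESCRIBE mytable;           \d mytable
-- SHOW CREATE TABLE mytable;  \d+ mytable  (closest built-in equivalent)
-- CONNECT mydb;               \c mydb
-- HELP;                       \?           (help on all meta-commands)
```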
That's like trying to say emacs is easier than eclipse. Sure, it may be more efficient once you are used to it, but it's not more intuitive.
eclipse and intuitive shouldn't ever be used in the same sentence. emacs ditto, but calling one more intuitive than the other is missing the point by about 1 AU.
Eclipse isn't great in that respect, but choosing something from a menu is definitely more intuitive than remembering the cryptic commands in emacs.
"The only intuitive interface is the nipple, after that it's all learned." - Bruce Ediger
Sure but a menu is still more intuitive than C-c.

(I even had to check the fact that C- is the ctrl key when looking up this command. I noticed commands listed as M-w - I have no idea what keys that represents).

M is short for Meta, and usually means "Alt".
\? is killer.

Thanks a ton, I didn't know that yet. I also find the MySQL cli very easy to work with. But that's going to change now, it seems.

That's also the problem I have: \dt, \c just don't make sense at all and I never remember them, whereas 'SHOW TABLES' does not even need explanation. It's the same thing with unix users; I never fully understood how postgres handles user permissions. I would say that Postgres itself is really, really good but the tooling around it is just not there yet.
I just remembered it as \d as display and \dt as display tables.

I often invent mnemonics that only mean something to me until muscle memory kicks in.

Here's a nice one-page cheat sheet, of which about 75% you will probably never need to use:
That's excellent, thanks :).
SHOW CREATE TABLE was so awkward to me after years of \d table. I remember I had to google even DESC table and SHOW DATABASES :-)

Anyway, you might want to check out a free alternative cli with completion. I started using it over the standard psql and I recommend it to everybody.

Admittedly, I haven't used the PG or MySQL cli in years. Coughed up money for a Navicat Premium license about 5 years ago and never looked back.
Same here. I muddled along with pgAdmin for a while, but now use Valentina Studio. Excellent free app (has an in-app purchase for some advanced features but doesn't feel crippled at all).
Same here. The Valentina Studio free app is very helpful and somewhat smooths the learning curve.
You should also check out
I'm fairly sure I signed up for the early access program months ago and still haven't heard back. It's possible that I meant to and didn't though.
> It is important to distinguish EAP from traditional pre-release software. Please note that the quality of EAP versions may at times be way below even usual beta standards.

That... scares me, more than a little. I'm okay with it crashing, or losing queries I wrote. I really want failure cases to be restricted to NOT affecting the database, though. I'll probably stay away or try it out on my personal machine on a toy database for now.

If you haven't already, check out pgadmin3. I mostly live in the cli, but if I need to do something complex or write a large query I often jump into pgadmin, as its display and proper editor are excellent!

I even prefer it to MySQL Workbench, which I liked.

I've recently made the switch to postgres, and then I found pgadmin3, and I'm really feeling some pain.

For instance, the "explain query" visualization is great, and very helpful, but there's no way I can find to zoom out, at all. And that window doesn't support mouse scrolling. And there's no way to export the visualization somewhere else. So when I'm trying to figure out why a very, very large query is running slowly, it's agony to use pgadmin3.

Do you know any good postgres query analysis tools?

It's not graphical, but pasting the explain analyze output into helps me see where postgres is spending most of its time.
That's a bit of a help, thanks!
Count another vote for pgAdmin.

I hate the backslash commands of postgres about as much as I hate the show commands of mysql (which is not nearly as much as I hate the oracle introspection tools).

Why couldn't SQL standardize database introspection?

This whole thread makes me happy to be a SQL Server user, where the UI tools are standard and robust.

90% of the time, I don't need to remember any special commands for introspecting and since it's on Windows - I can use the keyboard to navigate the entire UI with ease. When I write SQL, I get glorious Intellisense (that's autocomplete to you) for every single object (edit: and command) in the db.

I don't know why you are getting downvoted. I'm not personally crazy about SQL Server, but a lot of big companies are. In this situation some of us find ourselves in from time to time, SQL Management Studio is pretty handy.
I have no idea how they compare, but the open-source databases have proprietary database tools too, such as Navicat, which have auto-complete, a good UI (Navicat is hardly beautiful but it works well), etc. There are a few other options too.
It did.

I didn't know it was a standard, nor that it was widely supported.
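For context: the standard in question is presumably information_schema (part of SQL-92), which both PostgreSQL and MySQL implement to varying degrees. A portable table listing looks roughly like:

```sql
-- Works on both PostgreSQL and MySQL ('public' is the Postgres
-- default schema; on MySQL, use the database name instead).
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
  AND table_type = 'BASE TABLE'
ORDER BY table_name;
```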


I have the same problem. Because of this I use PostgreSQL only if I need it feature-wise and use MySQL as the default. This of course prevents me from getting used to PostgreSQL.
You are the first person I've read saying they like the mysql cli. Ctrl-C kills the whole shell - great, but not what I expect. Want to fix a small typo in your big query? In psql you can just edit your last command in your $EDITOR. Useful (but not perfect) autocomplete is also not in mysql.
In MySQL type \e and you can edit your query just the same.

I love that earlier up in this same thread people are complaining about the magic \x sequences in postgres and this is how you do query edits in the mysql cli :)

I've got my biases, but whenever something like this comes up, I just make some popcorn and enjoy the show.

It's not really possible to implement "edit the query I'm already partway through" with anything other than a magic sequence. Doesn't mean magic sequences are the right way to do everything.
The complaint was that in psql the backslash sequences are a mix of client operations (\copy, \html) and database metadata operations (\d), which, while extremely rich, form a huge list and are not very intuitive to learn. Many differ only in case (\df vs \dF). The help does mention the mnemonics for why \des and \deu are named that way: "external servers" listing "foreign servers" I can understand, but it seems that "external users" are not actually "external".

In MySQL the backslash sequences are commands to the mysql client for things that are implemented in the client and have to do solely with the command line client software. These things are not available using other clients. Vs things like listing columns and databases and functions have their own DDL language constructs entries and are queries in their own right that can be sent by any client software to the server and return iteratable results. MySQL is more like Oracle in this regard.

That distinction isn't obvious in psql, and that contributes to a larger hurdle when first using the command line client. At the same time, it makes psql attractive to power users, since there are a lot of shortcuts for things that would otherwise require longer DDL statements or queries against information_schema.


Whenever I do anything more than a oneoff SQL statement (and often even then) I'll use emacs with sql-mode. It supports all the major SQL dialects and cli utilities.
> In psql you can just edit your last command in your $EDITOR.

Cool! How?

Just type \e [ENTER]
`\e` will edit whatever is in the query buffer (`\p` shows the content of the query buffer)
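A sketch of how that plays out in a session (the query is made up):

```sql
-- Start typing a statement but leave off the terminating ';' ...
SELECT count(*) FROM orders WHERE created_at > '2015-01-01'
-- ... \p prints the current buffer, \e opens it in $EDITOR;
-- after save-and-quit, psql runs the edited statement if it ends in ';'.
\p
\e
```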
Autocomplete is in MySQL; you might just need to enable it by commenting out no-auto-rehash in the config (note that the -A flag actually disables it).
> Ctrl-c kills the whole shell. great, but not what I expect.

That is actually exactly what you should expect. SIGINT cancels most things you run in the CLI.

Interrupt a ping because you forgot a count parameter? Ctrl+C

Honestly, the reason I don't use Postgres has nothing to do with the CLI and everything to do with the ecosystem for Multi-Master being inferior to Galera. But yeah, I'm happy with mysql-cli as well and don't really see the Postgres equivalent as a noteworthy improvement.

In Postgres' psql, Ctrl-C cancels the current statement. That's a more useful behavior than killing the entire psql process, IMO.
Ping isn't a CLI, it's a single command that expects no further input. A database shell is more like bash than ping. And Ctrl-C does not exit bash.
Ctrl-C conventionally does not kill shell-like things. This is, technically, of course because of cooperation of those shell-like things and their handling of SIGINT, but that's not actually relevant.

The following things (off the top of my head, and quickly verified) all handle Ctrl-C so that it kills only the current command and not the containing process:

bash, zsh, csh, ksh, mail, gnuplot, gdb, psql, vsql, python (interactive), ghci

While I object to captive user interfaces in general, if you're going to have one then handling ctrl-c inside it is the right thing to do.

Many language REPLs don't exit on ctrl+c either, at least I know Ruby and Python don't. Node will, but it will make you do it twice.
I love PostgreSQL.

But for a personal project I went with MongoDB, because my data set was a perfect match for mongo's design.

Now I love mongo too; I'm amazed at how easy it has been to maintain 100% uptime on commodity hardware (one server is literally in a room in my apartment) through all the random server downtimes, upgrades, migrations, etc.

And now I have more ideas for some personal projects, and they would go very well with postgres, but I'm so missing replica sets from mongo.

If postgres had something similar to replica sets in mongodb, that would be amazing.

Do take a look at whether you can leverage the jsonb data type in postgres - you get 80% of the power of mongo and all the advantages of postgres.

Actually, the article you posted makes the case that postgres is not a good json database, because it cannot modify json documents in place.
Actually, without benchmark results, the claim that PostgreSQL has performance problems because of copy-on-update handling of JSONB is groundless.
PG 9.5 will improve this situation somewhat, e.g., builtin functions like jsonb_set(), and overloading the '-' operator for jsonb values:

That said, if you're doing a lot of mutation of large JSON documents stored in a single Postgres row value, the storage/concurrency control behavior still won't be ideal.
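For a concrete sketch of the 9.5 additions (the `docs` table is hypothetical):

```sql
-- jsonb_set rewrites one path inside the document server-side
-- (the full row value is still rewritten on disk, per MVCC):
UPDATE docs
SET body = jsonb_set(body, '{address,city}', '"Berlin"')
WHERE id = 42;

-- The new '-' operator deletes a key:
SELECT '{"a": 1, "b": 2}'::jsonb - 'b';   -- {"a": 1}
```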

Actually, if you really need to update large JSON documents efficiently, probably the only suitable technology would be ToroDB
I would love to hear more - this is the first I'm hearing of Toro - it never comes up in NoSQL talks or conversations. Any decent success stories?
Maybe I wasn't clear, I love everything about PostgreSQL.

But I would love it even more if it were easier to distribute (see replica sets in mongodb, with auto-failover and other goodies).

I'm genuinely interested in which dataset/use case is serviced better by MongoDB than PostgreSQL
I didn't mean that it was serviced better by MongoDB rather than PostgreSQL.

I meant that it was a valid/recommended use case for MongoDB since I didn't really have any relational data (every document inserted was pretty much standalone).

I've mentioned it because I've seen plenty of posts around HN where "mongo sucks" because they tried to fit a square peg into a round hole.

The extra goodies from MongoDB helped too.

Like automatic failover, I can literally go and unplug a node and everything will still be fine.

Having tail -f functionality in the db was also pretty handy (for my project).

The sysadmin in me was happy too, it's not everyday you see software that allows you to upgrade between major versions / storage engines without downtime (when using replica sets, not standalone, of course).

If it's running on a desktop in your apartment, you don't need replica sets.
Why not?

It is a desktop (as-in, it has normal desktop components, although used headless like a server), but it's got plenty of RAM, good CPU, RAID1 on WD Re hard drives, 100mbit connection and hooked up to a UPS.

In the last year it has been more stable than some of the cheaper hosting I was using.

Besides, it's not running the whole replica set, just one member (out of 3).

Because it can and will fail in odd ways, and that has one of two outcomes:

1) It doesn't matter, which means the time, energy and money spent setting it up was squandered when it could have been spent on marketing or product dev.


2) It does matter, which means now you have to blow even more time, energy and money recovering it and standing it back up. Hope you've rehearsed your DR plan!

I'm not trying to preach, I apologize if it's coming off that way. But this highly resembles tinkering, and tinkering doesn't generally pay the bills. Usually the opposite.

> Because it can and will fail in odd ways, and that has one of two outcomes:

Can't that happen anywhere? Regardless of the type of hardware.

> But this highly resembles tinkering

Guilty pleasure.

> [...], and tinkering doesn't generally pay the bills.

Thankfully, I was aware that it most likely won't be paying the bills, and considering I've made 35€ from it in the past year and a half, I guess I was right :-)

I've made it for myself (and opened it for the rest of the world if they need it), but I'm my most demanding customer, that's probably why I expected nothing less than 100% uptime since I launched it.

And I've managed to do that, without breaking the bank.

I don't know how my tone sounds (I'm not native), I'm just trying to emphasize that with the right tools, you don't need a shiny cloud for really good uptime.

If mongo impresses you, you should check out rethinkdb:
no official java driver :-/
I had a job interview at Zalando a while ago, and apart from the huge bunch of bananas at the entrance to the developer's den, the one thing I remember most about was the fact that apparently they're using stored procedures for basically any database transaction.

Which is probably a more unorthodox use of databases these days, at least for Postgres (I've heard it was more common for SQL Server and I once had the questionable joy of debugging a petri net solver in Oracle).

It's quite common for startups to have 1 or 2 sexy (and well-enough engineered) projects involving your favorite technology X, to run up the flagpole at conferences or in job ads, etc. But like as not, 80% of what they do (and of what you'll be doing as a developer there) is the same half-baked, post-"blitzkrieg" mop-up work you'll be doing pretty much anywhere else.
They were pretty straight about that: Java, ExtJS, stored procedures. Not exactly candidates for the sexiest development tools alive - and no "this will look awesome on your resume" tomfoolery.

It was a pretty great interview, and I almost took them up on their offer. So don't get me wrong, I neither want to praise their stack nor put it down; I just thought that their ubiquitous use of stored procedures was an interesting fact (and Postgres with its pluggable languages supports this pretty well - these days you can even put your js/v8 code in it, if you really want to JavaScript all the things).

They've got some code about it on github:

Alles klar. Thanks for clarifying.
I've seen this pattern used a few places (sprocs for every single interaction). Seems like a lot of development overhead, but at at least one place I saw them in use at, the db admin and one of his appointees were the only people allowed to even write these things. Devs were just given connection privileges to run those sprocs, nothing else. Things worked, but the process didn't seem terribly ... productive. Apparently it was partially in reaction to some previous developer who'd done all the SQL by hand, and no one else could understand it. They didn't seem to realize they'd just shifted the problem from a developer to a dba.
It was even mentioned during the talk: "Stored procedures are the core of our success."

He also mentioned that Zalando does 100 data model (schema?) changes per week, and that senior devs make code changes without code review.

Yes, the 100 data model changes are schema changes (each can be one or more table structure changes)
I use the holy hell out of stored procs and UDFs, and they have saved my bacon plenty of times.

Once you get past building todo lists, you'll end up with complicated transactions that do many different data mutations, usually along with some simple logic. You can either maintain your virtue and pay the latency price for each of those, or just code a proc that implements the whole thing. It might offend your software dogma, but it'll get shit done so you can get home to the wife in time for dinner. These days, that is the only deadline I care about.
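The pattern described - one round trip for a multi-statement transaction - might look roughly like this in PL/pgSQL (tables and logic invented for illustration):

```sql
-- One server-side call replaces several client round trips;
-- the function body runs atomically inside the calling transaction.
CREATE OR REPLACE FUNCTION place_order(p_user_id int, p_item_id int, p_qty int)
RETURNS int AS $$
DECLARE
    v_order_id int;
BEGIN
    IF p_qty <= 0 THEN
        RAISE EXCEPTION 'quantity must be positive';
    END IF;

    INSERT INTO orders (user_id) VALUES (p_user_id)
    RETURNING id INTO v_order_id;

    INSERT INTO order_items (order_id, item_id, qty)
    VALUES (v_order_id, p_item_id, p_qty);

    UPDATE inventory SET stock = stock - p_qty WHERE item_id = p_item_id;

    RETURN v_order_id;
END;
$$ LANGUAGE plpgsql;

-- Called from any client as a single statement:
-- SELECT place_order(1, 42, 3);
```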

Isn't that how you're supposed to do it? Using sprocs is more time consuming, but it does have several advantages for security and performance.
There's no way you're "supposed to" do it. There are tradeoffs and you pick the approach you like most given those tradeoffs.
Not sure I'd agree 100% with you: for standard LOB CRUD applications the use of sprocs is best practice.

Obviously if you're just writing a tool that will be used 2 or 3 times, you can make trade-offs - but you should never use $sql = "SELECT * FROM tblBobbins"

"Stored procedures for everything" is a bit unorthodox, but it's probably a better practice than "base tables for everything" (which seems to be the common trend these days), and is pretty much the closest real practice to the "views for everything" that was long the ideal for decoupling the interface each component using the DB saw from the underlying data model (an ideal that was, when it was made, often unachievable because each RDBMS had different limitations on what you could do with views, particularly on the update side).

Though, with modern PostgreSQL, views-for-everything would probably work at least as well as stored-procs-for-everything.

I don't think it's unorthodox - perhaps just uncommon these days. The application logic needs to live somewhere. I like doing it in stored procedures because you are close to the data, and that leaves your client code free to just implement the user interface. It's much easier to support different client platforms without having to rewrite all your core transactions. It's actually my favorite way to architect an application. I like views for the same reason, especially for use with 3rd party reporting/BI tools.
PostgreSQL is great, but I've gone back to MySQL because of the ease with which I can set up replication (which I like to use when possible, in addition to automated/periodic backups).
I'd love an honest, MODERN comparison of both systems. Sometimes I feel like the cargo-culting around PostgreSQL is as bad now as it was with MySQL 10 years ago, and MySQL is being badmouthed for mistakes from the MyISAM age.

MySQL has some features that I'm really missing in PostgreSQL, like more flexible compression, and batched index writeout.

RDS makes it trivial to have high availability if you can't afford a DBA worth their salt. You don't need 10x replication to beat the uptime of your shitty web app.
I thought so too, until I decided to use RDS for storing zabbix.

It looks great on paper, but unless you're doing development I would discourage its use. That said your original point about the replication is true, but there are shortcomings that come together with it.

Some things that you will learn if you use RDS:

- if you decide to increase volume size, change the type to SSD, or use provisioned IOPS, you might have the database down for an hour or more, regardless of whether you use a single instance or HA.

- want to upgrade 9.3.x to 9.4.x? Tough luck: you have to stop, dump the data, and provision a new instance. You can't use postgres' in-place upgrade method

- you have limited control regarding tuning, many settings require rebooting the whole thing, when normally you would just restart the process (with HA there's still several seconds of interruption)

- you can only use extensions that they provide, there's almost no extension to monitor performance

- you can't log in to a shell to monitor the process (obviously, but it's still a shortcoming)

- can't replicate data across regions or outside of RDS (it could resolve some of the issues above)

- it can fail: we had two failures caused by AWS, both during the backup period. Normally AWS does backups on the secondary database, but we learned that in those instances it got confused, attempted them on the primary, and rebooted it; for about 15 min the database was unavailable.

First off, thanks for sharing your experiences. I am standing up a system right now that will use PostgreSQL RDS and it's good to know what kinds of bumps in the road to expect.

I think I am fine with most of those, since I am saving on labor costs. Periodically taking an app down for maintenance is par for most courses. And even on the last bullet, I've seen really talented database guys make mistakes and have small amounts of downtime. Perhaps what I should have said was "high enough availability". If I needed 5 9's, I agree that RDS is probably not the tree I want to be barking up.


Yeah, RDS is ok as long as you are ok with the limitations.

I was more concerned about relying on it for 24/7 operation, such as for transactions on a webpage. The zabbix scenario in our case also requires 24/7 uptime, but if it goes down our site is still up; we just won't be alerted when something else breaks at the same time.

Postgres replication is probably not as easy to set up, but it can be extremely powerful, especially when you use tools like WAL-E as well.

We have a setup where our WAL log is rotated every minute and backed up to S3. Replica databases can either load from S3 or connect to the master for streaming - so they will catch up using S3 and then connect, reducing load on the master.
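Such a setup might look something like this in postgresql.conf (paths are placeholders; see the wal-e docs for the exact invocation):

```
# Ship each completed WAL segment to S3 via wal-e, and force a
# segment switch at least once a minute so standbys never lag far.
wal_level = hot_standby        # allow read-only queries on replicas
archive_mode = on
archive_command = 'envdir /etc/wal-e.d/env wal-e wal-push %p'
archive_timeout = 60
```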

Two years ago I switched from PHP and MySQL to Ruby and PostgreSQL. The only thing I'm missing is Sequel Pro. PG Admin is ok, does everything I need, but that interface is ugly and unintuitive.


Have been using postico for day to day queries and pgadmin for more hardcore operations
It's still beta it seems. I used PG Commander, but it was too basic.
Relevant Github Issue:
This has some promise.
You might try Valentina Studio.
When I tried that it crashed all the time. Huge pain to work with.
As of late, I've been doing php+pgsql and it's like easy+easy. My latest app was an absolute pleasure to code, and I haven't said that in a long time.

I agree that pgAdmin leaves a lot to be desired. It needs something on par with SSMS.

I found this video to be much more interesting

BTW I would like to send out a BIG THANK YOU to the guys that did the videos - this is such an important and great service for everybody who could not attend the conference, so: THANK YOU VERY MUCH!

There are definitely a lot of much more interesting videos from PGConf US 2015, especially the one from Robert. My talk was a 'keynote' and not really a conference talk :)
I recently used Postgres for a serious project for the first time and I am really impressed. Certainly much more impressed with it than MySQL or SQL Server. plpgsql is far better than what MySQL offers out of the box, plus you have access to nicer languages out of the box with plv8, plpythonu, plr, etc. Language extensions really are a killer feature and make moving application logic into the database a cinch.
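As a taste of those language extensions - assuming the plv8 extension is installed - here is a function whose body is JavaScript:

```sql
CREATE EXTENSION IF NOT EXISTS plv8;

-- Iterative Fibonacci, executed by v8 inside the server process:
CREATE OR REPLACE FUNCTION fib(n int) RETURNS int AS $$
    var a = 0, b = 1, t;
    for (var i = 0; i < n; i++) { t = a + b; a = b; b = t; }
    return a;
$$ LANGUAGE plv8;

-- SELECT fib(10);  -- returns 55
```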
Is there anything on the Postgres roadmap to allow simple, easy replication out of the box?
Are they using Postgres for content, or for data? (at a place where I can't really watch a movie)

There are some great content management tools out there, my favorite being apache jackrabbit, and on top of that, apache sling for displaying/accessing the content. It requires a different way of thinking than the standard mvc process - it is a more content focused display process.

I don't know... if you're using Postgres to store non-relational data you may want to reassess your strategy. You may find you're doing the right thing still, but I think most people would be better served by a system like jackrabbit/sling, which is used by publishers, fashion houses, etc. (via Adobe's product that is built on sling) to store and display their content.

(that said, you can use postgres on the back-end with sling)

> I don't know... if you're using Postgres to store non-relational data you may want to reassess your strategy.

What kind of data lacks any relations whatsoever? Yes, there are cases where fitting data to the relational model is more awkward than others. But I would be really, really cautious about thinking that I could design on my own a data model that's better than the relational model. Relational databases are incredibly battle-tested across every type of data storage problem out there.


  > What kind of data lacks any relations whatsoever?
You're right: all data has some kind of relation - when we say "non-relational data" we almost always really mean "loosely related." But some data is much more strongly related than others.

Think of a traditional RDBMS application such as an ordering system where there are strong relational constraints that need to be enforced before committing your data: ie, your order items had damn well better correlate with an order, which damn well better correlate with a user, or else nothing makes sense -- you had better raise an exception and should definitely not store those things in your database if the relational integrity is not there.

Now contrast that with other types of data collection where relational integrity is much more relaxed.

Generally, these examples would be (near) real time data collection - times when you're collecting data with a "collect first, analyze/correlate later" mentality. Specific examples would be a service that aggregates logs from a server farm in real time, or things like tracking user behavior statistics on the web. Or even an onboard computer that collects data from a car's engine - you want to read that data every 50ms no matter what, and maaaaaybe you will correlate it with events (accidents? combustion issues? whatever) later on oooooorrrr maybe not.

There are times where you really don't need an RDBMS, yes. Hard real-time embedded systems are their own bag, I certainly wouldn't want to tell the people who design the systems inside a car how to do their job. As for server logs... they're their own bag, too, although if you decide to roll your own setup there without fully investigating syslog and journald or whatever else, I think you're probably making a mistake here.

But if you're running a website and you have fewer than 100 servers and you think you need to start rethinking the entire relational data model, you may very well be getting ahead of yourself.

Apache Jackrabbit is not a content management tool. It is an implementation of JCR, a content repository. But yes I agree that it might suit some usecases better than a relational db. That being said, you still need something like postgres for data storage, even if you use JCR.
Postgres provides really nice features for non-relational data these days, though. JSON columns are nice. Hstore is fantastic. And you can even do pubsub. All without introducing another database system into your stack. Is it perfect for all software? I'm sure not. But 99% of everything I ever need can be had with Postgres + Elasticsearch.
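The pubsub mentioned is LISTEN/NOTIFY (channel and payload here are arbitrary):

```sql
-- Session A subscribes to a channel:
LISTEN order_events;

-- Session B publishes; listeners receive the payload asynchronously
-- once the notifying transaction commits:
NOTIFY order_events, 'order 42 shipped';

-- The same thing with a computed payload, via the pg_notify() function:
SELECT pg_notify('order_events', 'order ' || 42 || ' shipped');
```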
Coming up with your own ideas is hard. Cloning other businesses is easy.
Commenting on HN is easy. Delivering a presentation that gets on HN frontpage is hard.
While I agree in general, "sell shoes in a customer-friendly way" isn't exactly a revelatory business model.
"startup ideas are worthless"

Coming up with ideas is easy. Executing them well is hard.
Zalando is hardly executed well. The work conditions in their logistics centers are (infamously) atrocious. The strategy was distinctly described by one of the Samwer brothers in email conversation as "Blitzkrieg" and "the most aggressive plan in history". Work climate (not only for manual labor) reflects this attitude pretty well.
Yes, their website still sucks in the same ways it did 3 years ago. It boggles the mind.

They have a huge selection, expensive advertising campaigns and lenient shipping & return policies, fueled by huge sums of money, so they have a huge market share. It's nice, but execution is merely decent imho.

Try a rewrite at a company with more than 100 employees... cf
No need to rewrite anything - all they need are a few UI improvements. It's not like their backend is MongoDB or something. ^_^
As a former Rocket employee who came from gleaming big corp, I used to think that, but later realized that the slightly more aggressive, chaotic culture was fairly typical of the startup world. Big companies have had the resources and time to iron out HR and management issues over their half century of existence...

When you look at it more closely, the Samwers have effectively managed to reallocate billions of dollars into startups all over the world in places where raising money, or working for a startup, was practically impossible. Remember the second largest location for venture capital funding, the UK, only receives about 1/20th as much as the Bay Area alone.

The more mature Rocket companies (btw Zalando isn't, strictly, a Rocket company - it had a distinct set of actual founders) are now pretty desirable employers in their respective countries, and deliver relatively good customer service, certainly better than many much bigger and more resourced companies, and usually crushing the non-existent, overwhelmed service you can expect in most emerging markets. Is the backend a little messy and "move fast and break things", sure, but I was surprised by how BAD most backends actually were in the business world period, and the codebases I had a look into were, by those standards, alright (for example: enforcing foreign keys! a luxury in the Era of Mongo and "logic is in the ORM"). What about the warehouses? Perhaps not quite Amazon, but definitely a huge step up from the average e-commerce startup. At least you know where your inventory is, you're not somehow missing 3%.

Yes, Berlin can be a bit abrasive on egos. I left in part because of that, or at least it tipped me over the edge of setting up my own thing after dealing with one too many fake-angry German (hi H!). But with a bit of time and distance I really appreciate what the Samwers tried to do, and there's definitely a lot of founders who got their "change in life" thanks to a few months/years in a Rocket company, making their mistakes at someone else's expense.

It's also quite impressive that Oli is still as hands-on as he is, very numbers-driven despite flying continuously around the world, spending less than a day in each business yet knowing its numbers better than its managers. I think the tone of the email, and of his motivational speeches, is taken out of context; also, the "vision" is, when you strip the hyperbole away (in Australia: "I do not want to be before the wave, or after the wave, I want to surf the wave"), relatively accurate. By modern and especially American standards, he's quite a polite fellow, if demanding.

Hmm, looking a bit at their site[1]:

Adidas jogging pants -- 59,95 €

Volcom T-shirt print -- 17,45 €

Under Armour T-shirt -- 39,95 €

Like really now -- who needs this stuff? Especially when you can just go to Aldi or Tchibo (or any of a number of other discount retailers readily visible at nearly every shopping mall in Germany) and buy essentially the same stuff (minus the logos, of course) for 1/10th of the price. So perhaps a better title might be:

“Fashion generates revenue, and handsome pay-outs. But is basically pointless. And a soul-sucking waste of time.”


You always get what you pay for. I'd rather not buy too cheap and wear toxic stuff.
Valid point, but apparently the Greenpeace study only targeted Germany's largest (brick-and-mortar) discount retailers; there's no particular reason to believe that the products offered by premium retailers would score any better.

Or that the 10x factor in price overhead is in any substantive way directly invested in safer product standards. No matter how you slice it, to a large extent all we're really getting when we shop specifically at fashion outlets (as their marketing gurus know all too well) is... the brand.

Pointlessness is rather subjective.
Please don't buy from discount retailers - support your local tailors and fashion designers. For a bit more than what is offered at Zalando, you can get higher-quality, unique products from fashion design students.
Honest question: How do I find those? Online, local brick-and-mortar stores?
A t-shirt for 1.75 € is so cheap that it's practically guaranteed to have been manufactured in dangerous conditions in a place like Bangladesh or Ethiopia.

Big brands like Adidas or Under Armour don't have a clean track record either, but at least they are somewhat accountable to consumers because their #1 asset is their reputation. To fix global trade, it's better to do a bit of research into what kind of company you want to support, rather than blindly buying the cheapest imported thing "because they all suck anyway".

Sure, but is there any evidence that paying more for a brand name is going to avoid that? Nike doesn't exactly have the best reputation with this sort of thing.
You mean the money is actually going to poor people? Sounds like a plus to me.
Certainly not. The ultra-cheap stuff is cheap throughout the production pipeline. Costs are cut wherever possible: wages, tools, work environment, materials, packaging...

The "poor people" won't benefit from most of the branded product's premium, of course. But it's more likely that it's been produced by a subcontractor that at least respects local laws like minimum wage, construction standards and regulations concerning toxic materials.

Being blunt, how do we know? For example, when the buildings collapsed back in 2013, the discount chains selling unbranded clothes, like Primark and Loblaw, pledged to offer compensation, while Benetton tried to hide its relationship with the manufacturer.

Has it ever been studied whether the working conditions of the workers making branded stuff are generally better than those of their counterparts working on cheaper clothes?

I was exaggerating my comparison a bit; but the basic point is that the 10x price difference is in no way proportionate to any real infrastructure costs that would be necessary to guarantee safer (or non-exploitative) product standards. I can see the same no-name, barely-fitting (but otherwise generally quite adequate) apparel products from the major discounters costing 2x or 3x if produced to gold standards (environmentally or labor-wise), but not 10x.

That is, at the end of the day, 80% of the 10x overhead you pay at the premium retailers is just for the brand, not for safer products or better labor conditions (which are at best a secondary matter of concern to these outlets).

PostgreSQL is pretty much my default. Any other database has to have compelling arguments for how it fits my use case if I'm going to use it over Postgres, and even then, it usually gets used for a subset of the data, while Postgres gets everything else.

I think it's better to default to the tried-and-true piece of software over the new and shiny, rather than the other way around, where you'd have to argue why Postgres is a good fit for your data.

We used PostgreSQL at my old startup after switching away from MySQL. One of the things I liked about PostgreSQL is that you can do schema updates without rewriting and locking the table. The one thing we ran into, because of the way MVCC works, is that when you update a column the whole row gets rewritten. For example, if you have large rows and you update a Boolean column in a bunch of them, you will get a large write load. We ran on AWS and basically maxed out the PIOPS when updating a Boolean column on a ~2m-record table. Moreover, this would affect our read speeds, making our website super slow. There was no easy way to throttle the MVCC writes.
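A back-of-envelope sketch of the write amplification described above (the row size here is an assumed number, not a measurement): under MVCC, an UPDATE writes a complete new row version, not just the changed byte, and the usual mitigation is to update in batches so the bursts fit the provisioned IOPS budget.

```python
# Why flipping one boolean per row can saturate PIOPS: every UPDATE under
# MVCC writes a whole new row version. Numbers below are illustrative.

ROW_BYTES = 2_000         # assumed average row size (wide rows)
ROWS_UPDATED = 2_000_000  # the ~2m-row table from the comment above

naive_expectation = ROWS_UPDATED * 1      # "just flip 2m booleans": ~2 MB
mvcc_reality = ROWS_UPDATED * ROW_BYTES   # each row version rewritten: ~4 GB

print(f"naive: {naive_expectation / 1e6:.0f} MB, "
      f"MVCC: {mvcc_reality / 1e9:.0f} GB")

# One mitigation: issue the UPDATE in small id-range batches with pauses
# between them, so the write load is spread out instead of one huge burst.
def batches(ids, size=10_000):
    for i in range(0, len(ids), size):
        yield ids[i:i + size]
```

Each batch would then be its own `UPDATE ... WHERE id BETWEEN ...` statement, with a sleep between batches to let the checkpointer and replication catch up.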
That's probably the best way of looking at it. PostgreSQL is generally going to be the best solution for 90% of your data, short of having a subset that's write heavy to the point of streaming.
What is a good space to look at when you get write heavy data to the point of streaming?
Depending on your exact needs, Cassandra's not a bad idea for this space. O(n) scalability for write capacity (+), and even an individual node is pretty write-friendly: its sstable data structures stream to disk well, keeping spinning disks happy while avoiding write-amplification issues on SSDs. It does help if the data is nicely shardable, of course.

That's the one I'm familiar with, anyway.

(+) Write capacity is O(n) as you add machines but an individual write's time is pretty constant and cluster-wide maintenance operations do start taking longer as you add machines and they gossip to each other. It's not magic, obviously :)
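To make the sstable point concrete, here's a toy sketch of the LSM idea behind them (this is an illustration of the technique, not Cassandra's actual implementation): writes land in an in-memory table, and full memtables are flushed as immutable sorted runs, so disk writes are sequential appends.

```python
# Toy LSM store: cheap in-memory writes, sequential flushes, reads that
# check the memtable first and then the sorted runs, newest first.

class ToyLSM:
    def __init__(self, flush_at=3):
        self.memtable = {}    # in-memory writes, cheap to update in place
        self.sstables = []    # immutable sorted runs, newest first
        self.flush_at = flush_at

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_at:
            # flush: sort once, write sequentially, never modify in place
            self.sstables.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.sstables:  # newest run wins on duplicate keys
            for k, v in run:
                if k == key:
                    return v
        return None

db = ToyLSM()
for i in range(7):
    db.put(f"k{i}", i)
```

A real LSM engine adds a write-ahead log for durability and background compaction to merge runs, but the write path above is why these stores stay friendly to both spinning disks and SSDs.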

The filesystem.

OK, that's not completely serious, but almost completely.

It could work to dump the incoming stream to disk in some sensible text format and have an asynchronous queue process the data into the database.
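A minimal sketch of that "filesystem first" idea, assuming newline-delimited JSON as the text format: the hot path only appends to a file, and a separate consumer drains it into the database later.

```python
# Spool incoming records to a plain text file; the hot path never touches
# the database. A separate process (or cron job) drains the spool.

import json
import os
import tempfile

def append_record(path, record):
    # appends are cheap and sequential
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")

def drain(path):
    # the asynchronous consumer: parse everything written so far,
    # then (in a real system) batch-insert it into the database
    with open(path) as f:
        return [json.loads(line) for line in f]

spool = os.path.join(tempfile.mkdtemp(), "incoming.ndjson")
for i in range(3):
    append_record(spool, {"id": i, "flag": True})
```

A production version would rotate the spool file and fsync periodically, but the shape of the idea is just this.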
At that point maybe something like Kafka starts looking attractive.
It depends on what your needs are. If you just have a lot of data coming in quickly but you aren't doing constant analysis on it, you can still use Postgres. Switch to large batch writes for getting the data into the database to reduce the transaction overhead, and look at using a master-slave replica setup. The 9.4/9.5 log replication features worked really well last time I used them for handling streaming data. We had a write master and a read slave and optimized accordingly. It worked pretty well once we got the log replication tuned.
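The batching point is easy to show. This sketch uses the stdlib sqlite3 module as a stand-in for Postgres (against PG you'd use COPY or multi-row INSERTs, but the principle is the same): one transaction per batch instead of one per row.

```python
# Batch inserts: one transaction and one executemany per 1,000 rows,
# instead of a commit per row. sqlite3 stands in for Postgres here.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, payload TEXT)")

rows = [(i, f"event-{i}") for i in range(10_000)]

BATCH = 1_000
with conn:  # a single transaction wrapping all the batches
    for i in range(0, len(rows), BATCH):
        conn.executemany("INSERT INTO events VALUES (?, ?)",
                         rows[i:i + BATCH])

count = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Per-row commits force a durability sync each time; amortizing that over thousands of rows is usually where most of the throughput comes from.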
The kind of setup described above works quite well.
Depends on your application. In most cases your "write heavy" is going to be isolated to 1 or 2 tables worth of data (or equivalent) and for those cases some type of NoSQL solution can be a really good option. Especially since you can use a PG foreign data wrapper to allow PG to run queries against that information too.

If it's system-wide and you need PostgreSQL itself to handle it, using PG's async features is one potential option, but setting up and managing a Postgres-XC cluster would be the next best. XC allows scale-out for writes. If you're at a company with a budget for that type of thing, I think I remember reading that EnterpriseDB (the PG company) offers first-class support for PG XC.

In most cases though, I find that the write heavy parts of a system are so isolated that diverting them to a simple NoSQL solution tends to be easiest (Mongo, Couchbase, DynamoDB from AWS, etc).
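An illustrative sketch of that architecture (not a real foreign data wrapper, and the store names here are stand-ins I chose): the write-heavy stream goes to a simple append-only store, everything else stays relational, and reads join across both, which is the role `postgres_fdw`-style wrappers play for real.

```python
# Divert the write-heavy path (a clickstream) to a cheap append-only
# store, keep the rest in SQL, and join across both at read time.

import sqlite3

relational = sqlite3.connect(":memory:")  # stands in for PostgreSQL
relational.execute("CREATE TABLE users (id INTEGER, name TEXT)")
relational.execute("INSERT INTO users VALUES (1, 'alice')")

clickstream = []  # stands in for the NoSQL / append-only side

def record_click(user_id, url):
    clickstream.append((user_id, url))  # cheap append, no SQL overhead

def clicks_for(user_id):
    # the "foreign data wrapper" role: combine both stores in one query
    name = relational.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()[0]
    return name, [u for uid, u in clickstream if uid == user_id]

record_click(1, "/home")
record_click(1, "/cart")
```

The point of the isolation is that the relational schema never sees the write burst, yet queries can still treat the diverted data as if it were a table.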

HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.