Hacker News Comments on
Haunted by Data - Maciej Ceglowski
O'Reilly · Youtube · 11 HN points · 3 HN comments
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this video.
https://youtu.be/GAXLHM-1Psk?t=945 - I think this commentary by Maciej Ceglowski rings true here.
⬐ nv-vnGreat vid! Thanks for sharing
⬐ timonoviciAh, it comes a bit later, at 19:00 - "At that point people who are angry, mistrustful, and may not understand a thing about computers will regulate your industry into the ground. You'll be left like those poor saps who work in the nuclear plants, who have to fill out a form in triplicate anytime they want to sharpen a pencil."
I love how some of the tech industry is beginning to see data as a liability rather than an asset. It dramatically reduces the ability for government mass surveillance for two reasons:
1. If companies only collect what they need (to reduce their liability), governments can't demand more than that (or even hack in to get the data illegally).
2. If the industry culture is to limit data collection, governments can't just say, "Well every company does it, so why can't we."
There's a wonderful talk, Haunted By Data, that covers a lot of the societal downsides of treating data as an asset. Highly encourage watching/reading.
⬐ baxtrJust to be contrarian: I’m not so sure if that’s a good thing... Data can be used for great things, e.g. longitudinal data in healthcare. I think seeing data as a liability might reduce the speed of progress
⬐ rhizome⬐ daxorid
> Data can be used for great things, e.g. longitudinal data in healthcare. I think seeing data as a liability might reduce the speed of progress
I think the word "can" is acting as a euphemism or term of elision here, since "longitudinal data in healthcare" includes things like the Tuskegee Syphilis Experiment.
⬐ diafygiIn the talk, there's a parallel drawn between the nuclear industry 60 years ago and big data now, where nuclear power was originally touted as a miracle cure for everything, then disasters happened, then it never really got over the stigma despite its huge potential.
Society decided, for now, the upsides aren't worth the downsides.
Oil is another parallel where society is currently just at the point where we are starting to not value the upsides over the downsides.
⬐ ScootySeems like oil and nuclear energy are different because both can be replaced with alternative sources of energy. What are the alternative sources of user data?
⬐ r00fusWhy is data a requirement to keep?
Currently has value with externalized downsides.
⬐ tankerslayPaper records, human memory....
Comparing growth in data storage versus energy usage per capita is interesting.
Even if you look back to the founding of the U.S., the change in energy use per person is actually only a few fold, definitely less than an order of magnitude.
Harder to compare quantity of data storage but the change would seem much larger. How much data is there, per U.S. person?
> governments can't just say, "Well every company does it, so why can't we."
Odd. That's precisely what's happening with GDPR. While the EU governments are demanding that private companies limit collection and use of data, the intelligence arms of these same countries (I'm looking at you, GCHQ) and their FVEY partners are doing everything they can to hoover up and store every single bit of data on the world population, including their own citizens.
And somehow, we think this is fine. An entity with the power to disappear you and render you to black sites can have all the data on you they wish. But Facebook determining that there is an 83.7% chance you are stressed and serving up an ad for vacation rentals is completely verboten.
It's absolutely mad.
⬐ giancarlostoroWonder what will happen if the US passes laws requiring companies to retain data for at least a year for digital forensics in cybercrime cases. Also maybe on the other hand GDPR is good for VPN services to be much more transparent.
⬐ amelius⬐ bleachedsleetI suspect there will simply be a direct pipe for logging to the government, where companies don't need to retain anything.
⬐ giancarlostoroGDPR requires you to make that kind of information available.
⬐ dahaunsIf I'm not mistaken, GDPR has the usual exemptions in place for that kind of usage (national security etc.), hasn't it?
⬐ JumpCrisscrossDo we really believe large Chinese tech companies will be complying with GDPR?
⬐ chapium⬐ 13yearsThe thought anyone believes that makes me chuckle, thanks.
⬐ confoundedAre there any that do a lot of business in the EU that you’re thinking of? I can think of lots of hardware, but not many traditional consumer Internet companies. Maybe Alibaba for SMEs?
⬐ OperylTencent to name one.
However governments often have undisclosed access not even known to the holder of the information.
I doubt GDPR will significantly affect information to which the government wants access.
Can I ask the government to delete all information on me?
⬐ pjc50There will undoubtedly be a lawsuit again soon over whether companies are allowed to transfer data out of the EU in response to warrantless US government requests.
⬐ risotto_grouponI hope it's Greece so we can all be as transparent as possible about the real issue here... (Democracy)
I concur and would like to add that this view of data collection has become common only because of the slew of breaches. Hackers leaking massive customer databases has forced companies to review their collection policies because they don't want to deal with the fallout, technically, politically and otherwise. This is true of the hacker ethos, using radical, often criminal behavior to point out glaring flaws that others become complacent with. Of course, these days that goal is secondary, if it is considered at all, to the goal of financial gain and that's a damn shame.
⬐ NoneNone⬐ zhte415When young in my career, which was finance/banking, my 'Head' (as was the organisational naming structure) sent an email re-forwarding and re-emphasizing data retention policy.
I was inclined to never delete an email rather than comply with 'your inbox should not exceed 500MB'. That was 50 emails a day then, my inbox now exceeds 500MB on a daily basis.
Sure, you can save archive locally or on a shared drive.
But the key idea... we don't want customer data because of liability. If we're storing it for a specific purpose, be that regulatory reporting or clearly defined analytics, fine.
Before the term 'big data', simply 'lots of data' sounded nice.
But don't let it hang around. At some point it will be a liability. You're serving customers. The key part of this is keeping their data safe. Not having it anymore (or having it in a non-accessible lock-away auto-delete repository for legal purposes) is more than good enough; properly storing and managing it is what's better.
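The "keep it only for a specific purpose, then let it go" idea above can be sketched as a purpose-based retention purge. This is only an illustration under stated assumptions: the `Record` type, the purpose names, and the retention periods in `RETENTION` are hypothetical and are not taken from any real policy or regulation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Dict, List


@dataclass
class Record:
    customer_id: str
    collected_at: datetime
    purpose: str  # why the data was collected


# Hypothetical retention periods per clearly defined purpose.
# Anything collected for a purpose not listed here is not kept at all.
RETENTION: Dict[str, timedelta] = {
    "regulatory_reporting": timedelta(days=7 * 365),
    "analytics": timedelta(days=90),
}


def purge_expired(records: List[Record], now: datetime) -> List[Record]:
    """Return only the records that still have a defined purpose and
    are within that purpose's retention period."""
    kept = []
    for record in records:
        limit = RETENTION.get(record.purpose)
        if limit is not None and now - record.collected_at <= limit:
            kept.append(record)
    return kept


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    records = [
        Record("a", now - timedelta(days=10), "analytics"),
        Record("b", now - timedelta(days=100), "analytics"),
        Record("c", now - timedelta(days=1), "just_in_case"),
    ]
    for record in purge_expired(records, now):
        print(record.customer_id, record.purpose)
```

The design choice worth noting is the default: data with no stated purpose is dropped rather than kept, which is the opposite of the "never delete anything" habit described above.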
There's a great talk, Haunted by Data, by Maciej Ceglowski about how tech companies are making a mistake by wanting to collect more and more data on their users, because governments are just going to want to come in and take it.
Slides: http://idlewords.com/talks/haunted_by_data.htm
I want you to go through a visualization exercise with me. Really imagine it. Nixon's in your datacenter. He's got his laptop open. He's logged in! He's got root! What does he find? If you didn't break into a cold sweat at the thought, congratulations. You are a good steward of data. But if Tricky Dick in your data center scares you, then consider what you're doing.
⬐ sleepingeightsHorrible analogy, as what did Nixon do with data towards citizens? It'd be more like the FBI, Hoover, CIA, NSA, etc... who have the capacity to bend the data to invent facts to fit some crime and then act on it with force without fear of repercussion/retaliation.
Also, if there were a proponent of this kind of collection, wouldn't it be fine for a company like Google, Facebook, Microsoft, etc... if someone with the position of US President wanted to "sit in the datacenter with an open laptop"? Because then they'd be using data as a currency, which they are already very comfortable and capable of doing to meet their own and the "gov" agendas.
⬐ hordeallergyBoth governments and hackers (foreign governments). Just a bigger target all round.
⬐ vidarh⬐ ergotBut convincing people that their own government can come in with a subpoena is easier than convincing people that their security just isn't likely to be good enough to stop each and every external hacker that tries.
No matter how many examples we get of far better funded companies getting hacked.
The IPB is slightly tainted by the Snowden disclosures. It's an interesting thought experiment to apply Snowden's revelations to newly implemented surveillance measures by any government. Snowden produced documents which gave us all an intimate understanding of the mechanics and operational details of the NSA & GCHQ. It is clear that the apparatus is already in place for spying and is only a quick click away from being galvanized by broad and sweeping laws which allow such apparatus to operate out in the open.
I think the masses are not scared enough to encrypt their communications and that's why such an apparatus has crept in so brashly and abruptly, sort of a 'surveillance creep'.
The moment the masses are conscious of the fact we are going through our second 'crypto war' is also the moment they might encrypt. Not that crypto is some munition they can use, as is wrongly spouted by the cypherpunks (IMHO), but that crypto can provide viable amounts of privacy for their needs and it doesn't need to be absolute privacy as spouted by the 'go dark' movement. Just enough that I can surf the web without my eyeball hours being monetized or that the pressure cooker I am interested in buying is not a potential tool to be used in a terrorist attack several weeks later.
⬐ walrus01s/nixon/poindexter/
https://www.google.com/search?num=100&client=firefox-b&q=tot...
⬐ SixSigma⬐ mtgxLet's go the whole hog.
s/nixon/goebbels/
and there's no need to imagine it
⬐ _0ffhOr the story about how the Dutch thought it would be a swell idea to have the religious affiliation of all citizens in their government files. Nowhere else did the rounding up of the Jews go as smoothly as there, once the SS got their paws on those files.
⬐ SixSigmaAccording to the book, the French too, I can't quite remember the story but the guy in charge of the data managed to delay and confuse so not quite as smooth.
If the recently announced Yahoo data breach (which affects a lot of other sites as well if users re-used their passwords, and we know many did) taught us anything, it is that data is a liability not an asset, and that's how both governments and corporations should treat it. The government at least should've learned that with the OPM hack.
⬐ guitarbill⬐ NoneExcept that it's only hurting Yahoo because they're trying to sell. Counterexamples include the UK telco TalkTalk, who has managed to increase users and revenue despite the lack of basic security features.
It just doesn't matter that much, because the inconvenience is minimal for the average person, so the backlash is minimal. I mean, most people cannot even be bothered to use different passwords (!). That's how low the bar is. Say something gets hacked, unless you experience identity theft, nothing happens. Banks will reverse any fraudulent charges. Not even a minor inconvenience. So people won't learn and won't care. Brand damage is minimal. Not worth spending on infosec if the maximum fine is less than your CEO earns in a month. Meh.
None⬐ CaptSpifyWouldn't that be the ideal place for companies? If a government is dependent on a company to collect data, wouldn't that government support the company in hard times? Sure, if it was a choice between surviving and throwing the company under the bus, the government would choose the latter, but if given the choice, wouldn't the government try to keep one of its most powerful tools?
⬐ cryptarchIt works in Russia, China, and less so (perhaps) America (e.g. telcos).
It's similarly practical for a government to have unlimited intelligence on its populace.
That doesn't mean it's good for anyone not working for the government, explicitly (as an employee or contractor) or implicitly (as a data collecting company which can be forced to share).
IMO a) collecting data on users and b) doing it in a way that does not preserve user privacy makes you complicit in mass surveillance.
Edit: directly -> explicitly
⬐ CaptSpify⬐ pjc50
> That doesn't mean it's good for anyone not working for the government directly or indirectly.
I totally agree. I'd argue that it's objectively bad for anyone not working for the government. But I'm talking from the company's point of view.
And it was "good" for IG Farben to supply Zyklon B to the Nazis. Until they lost and several company executives served prison sentences for crimes against humanity. Amorality certainly pays the bills.
⬐ timonoviciI stumbled across this talk, and it really left an impression, although I'm just a humble web developer.
At the end of the presentation, he talks about sampling and using transient data, rather than storing and mining it. Are there any academic papers that support his statement that a little fresh data is better than a lot of it, with a big chunk being "stale"? What's your experience?
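One standard way to work with a small sample of transient data instead of storing everything is reservoir sampling (Algorithm R). Neither the talk nor the comment names this technique; it is swapped in here only to illustrate the idea. The function name `reservoir_sample` and the simulated event stream are assumptions made for the sketch.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int, rng: random.Random) -> List[T]:
    """Keep a uniform random sample of at most k items from a stream
    of unknown length, holding only k items in memory at any time."""
    sample: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # Pick a slot in [0, i]; the new item replaces an existing
            # one only if the slot falls inside the reservoir.
            j = rng.randint(0, i)
            if j < k:
                sample[j] = item
    return sample


if __name__ == "__main__":
    # Simulated event stream; nothing outside the sample is retained.
    events = (f"event-{n}" for n in range(100_000))
    print(reservoir_sample(events, 5, random.Random()))
```

Each event is looked at once and then discarded unless it lands in the sample, so the amount of data held never grows with the size of the stream.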