Hacker News Comments on
How a File Format Led to a Crossword Scandal - Saul Pwanson
csvconf · Youtube · 174 HN points · 0 HN comments
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this video.
⬐ thaumasiotes
> Despite Parker’s denial, many in the crossword world see willful plagiarism in Parker’s puzzles, and they see the database that revealed the repetition as a tool of justice. “It’s like a murder mystery solved 50 years later with DNA evidence,” Matt Gaffney, a professional crossword constructor, told me.
There's a response to postmodernism that says "reality is that which, when you ignore it, doesn't go away".
I have a hard time seeing this as a "scandal"; it's firmly in the class of things that aren't a problem unless someone tells you you have a problem. A murder has victims. But if you're unhappy that the crossword puzzle you solved last week was secretly a rerun from 10 years ago, it's not obvious whether the blame for your unhappiness should go to the guy who reran the puzzle, or the guy who told you it was a rerun.
⬐ stordoff ⬐ tzs
I wouldn't really see the players as victims, but A) crossword constructors are potentially having their work ripped off and/or receiving less work and B) Uclick/USA Today are paying someone to do something when they could have just rerun old puzzles and got a similar result. A comparison to a murder investigation is maybe a bit much, but I can see where people are coming from.
⬐ fsckboy ⬐ VMG
It should not be summed up as a comparison to a murder investigation, but rather as a comparison to DNA evidence.
> "reality is that which, when you ignore it, doesn't go away".
Reality is that which, when you stop believing in it, doesn't go away. Philip K. Dick, I Hope I Shall Arrive Soon
⬐ labster
If you’re unhappy after someone told you you worked a rerun crossword puzzle, maybe blame yourself? The only thing changing is your interpretation of your own experiences.
⬐ thaumasiotes ⬐ mkl
I was confused by this response, because it appears to just repeat the content of my original comment.
Now I'm more confused that my comment was upvoted and this was downvoted.
⬐ labster
I'm guessing it's because you said it in an indirect way, and I said it directly. And people don't like being told that their gut feelings and outrage are only in their own head. I'm never really sure which is the right way to approach people -- the indirect approach goes over some people's heads sometimes (like me, a little this time) but the direct approach often gets outright rejected from confirmation bias. Teaching is hard, man.
I don't think it's the rerunning that's the problem, but the misattribution, claiming others' work as their own or denying them credit (and presumably royalties).
⬐ thaumasiotes
If the originality of the crossword has no value, why would it matter whether someone's claim that it is original is true or false? The most logical basis for attributing value to the claim of originality is that there is value in the originality that bleeds through to the claim.
Compare e.g. someone being jailed for resisting arrest when there was no justification for arresting him in the first place.
⬐ ericsoderstrom
Sorry... what? Why would you say the originality of the crossword has no value? And what on earth does that have to do with resisting arrest?
⬐ TheRealPomax
The law tends to disagree: crossword puzzles are copyrightable material just like any other published text is, so their value comes from the material that they help sell, whether that's a newspaper, or a crossword puzzle book, or a website, or any other published, in the legal sense, work.
⬐ thaumasiotes
But misattribution is not a problem at all in that analysis. It's just as illegal to violate a copyright with proper attribution as it is if I claim the work is my own.
The law doesn't care whether you claim a copyrighted work is yours or not. It cares whether, if you copy a copyrighted work, you have the license to do so.
Here's the FiveThirtyEight article about this, mentioned a few times in the video [1].
[1] https://fivethirtyeight.com/features/a-plagiarism-scandal-is...
⬐ ireflect ⬐ abalaji
There's a footnote about Saul's interesting name, which leads to:
> Pawanson is a bit quirky: his unusual last name is the product of a decision he made years ago. "I was born Paul Swanson," he said. "But I thought, 'there are lots of Paul Swansons out there.' So I changed it."
Amazing!
⬐ mdonahoe
Nice! I too was intrigued by his unusual name, and went on a quest to see if he had it changed from the more common "Paul Swanson".
It's a very interesting choice to just swap the letters like that instead of going for a completely different name.
It would be funny if his name gets included in a crossword puzzle, and people second guess the spelling.
⬐ wolfhumble ⬐ xorcist
I don't know anything about him or his decision to change from Paul to Saul, but Paul/Saul is one of the most important Christian apostles. As both Jew and Roman citizen, his Jewish name was Saul (from the Jewish king Saul in the Old Testament maybe?) and his Roman name was Paul. So just changing the first letter might not be completely random. :-)
The question that directly pops up is why not Pwanson?
⬐ gruturo ⬐ servercobra
Actually it is Pwanson, indeed. A couple slides in the video confirm it.
Pawanson is a typo.
⬐ Erwin
Saul Pawnson would be a cute hacker alias, however.
As someone with a name completely unique in the history of the world (so far as I have found), there are certain advantages! I wouldn't be surprised if people do this more often in the future. It is pretty nice that if you Google my name, you only get results about me.
⬐ matt-attack
Interesting. I relish the fact that when you google my name you get pages and pages of a semi-famous figure that, honestly, most people haven't heard of.
I cherish the anonymity.
⬐ busyant
I have a teacher friend named Mike Pence.
He tells me that it's impossible for students to snoop on him because he is "google-proofed."
This is an awesome story; I especially like the speaker's organization of the narrative, taking us along for the ride. Maybe this will be the push I need to do a better job learning Unix tools.
⬐ colmvp ⬐ smitty1e
The delivery was very engaging and a good example to other engineers on how crafting a compelling narrative can help drive home the importance of your work.
> "It's not hoarding if it's organized."
Oh, that's getting thugged.
⬐ TheRealPomax ⬐ Erwin
If it's organized, now it's archiving.
The author's biography is quite fascinating. If there's a museum of visualization, it'd be an exhibit: https://www.saul.pw/biograph/
⬐ paxys
It's interesting that while sophisticated plagiarism detecting software is commonplace for writing submissions at newspapers, book publishers, universities etc., they don't bother doing the same with crosswords (and probably other puzzles like Sudoku).
⬐ xmprt
I didn't realize unix tools were this powerful. That's an amazing story.
⬐ fiddlerwoaroof ⬐ ericsoderstrom
Yeah, Unix utilities and the whole text processing paradigm can do some amazing things if you know how to design for it. I’ve been doing some CloudFormation work recently, and it’s so easy just to throw together dashboards to watch the progress and outcome of a deploy.
⬐ smabie
I think the point is that they're not, usually, this powerful. Saul made a deliberate choice to create a file format that would be extremely amenable to these tools.
Were the misattributed authors aware that they were being given credit for puzzles they didn't write? I'm assuming they must have been.
⬐ rabidrat ⬐ rafaelturk
The misattributed authors are fake names, admitted by Timothy Parker himself. "Tim Burr" is one mentioned in the talk.
Data is beautiful
⬐ rabidrat
Hi, this is Saul. If you like this kind of simplicity-first data-exploration approach, you might want to check out another project of mine, VisiData (visidata.org). It's specifically for lightweight data exploration and analysis and it runs directly in the terminal.
⬐ manjalyc
Hey, just wanted to say that this looks like a really cool and useful project. I work with a few medical databases and sometimes I just need a very quick breakdown of specific data and while I usually just write a short script, the utility and portability of this code looks fantastic to me.
Which brings me to a question: how well does this program handle moderately large databases (~100GB-1TB) in your (or anyone else's) experience? In other words, does it try to load everything into memory, or does it query as needed when given a database?
⬐ rabidrat ⬐ rolandog
It loads everything into memory, so it works well with <1GB datasets. The architecture could be changed to allow for larger datasets like yours, but that would likely be a large undertaking and would probably be a paid feature.
Hey Saul, your talk was great and engaging! Great work!
⬐ matthuggins
Good job selling yourself this time!