HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Greg Wilson - What We Actually Know About Software Development, and Why We Believe It's True

CUSEC · Vimeo · 43 HN points · 39 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention CUSEC's video "Greg Wilson - What We Actually Know About Software Development, and Why We Believe It's True".

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Apr 24, 2021 · jack9 on Am I Doctor Stallman?
"earn" as in was granted by a university? The purpose of a graduate degree (Masters, Doctorate, et al) is to further knowledge. Without churning out "dissertations", Universities would be nothing more than echo chambers for what's already accepted and considered "known". https://vimeo.com/9270320 - Greg Wilso mentions this, because it's important to all modern human industry and somehow that's rarely understood.

In that vein, there are discoveries, studies and experts that occur or develop outside of universities. This is the source of an honorary degree. Famously, others have been given various degrees for effort and demonstrated competence (e.g. https://en.wikipedia.org/wiki/Billy_Joel). People who scoff at honorary doctorates are elitists, at best. Those who do not accept that expertise can be cultivated through anything other than standardized processes understandably believe that no other process is legitimate, despite the fact that there are narrow and wide fields of study for which no framework exists.

There is some truth to the observation that age unfairly plays into academic accomplishment insofar as someone younger could not be granted a teaching position or honorary degree, even if they demonstrate comparable knowledge. Part of the leeway is due to a passion or commitment to topics that can only be demonstrated through a life-long pursuit of knowledge and criticality.

leephillips
Nobody is being an elitist or scoffing at anyone. The phrase “earned degree” has a particular meaning: it means “not honorary”. I think honorary degrees are great. But they are what they are. The convention is not to use the title “Dr.” on the basis of an honorary degree. That’s it. Stallman is a legend, and obviously has no need of titles.
Brian_K_White
What exactly are they that they are? Not earned? I bet there are more un-earned regular degrees than honorary.
jack9
Indeed, honorary degrees are earned. The accreditation process differs for every individual, at a micro level and for individuals across organizations (and time and discipline, et al) at a macro level. The idea that a degree is "earned" based on a lack of the "honorary" descriptor belies a bias of ignorance, at best.
dfnr2
All degrees are honorary.
jack9
> The phrase “earned degree” has a particular meaning: it means “not honorary”

Maybe you meant academic degree.

> But they are what they are.

Still not clear on what that means. Recognition of expertise? They are that.

> The convention is not to use the title “Dr.” on the basis of an honorary degree

Again, by those aforementioned.

leephillips
I meant exactly what I typed. As I just informed you, the phrase has a well-established meaning.
jack9
> As I just informed you

I think you have a flawed set of beliefs. Good luck with whatever.

Reminds me of the excellent 2010 talk, “What We Actually Know About Software Development, and Why We Believe It’s True”:

https://vimeo.com/9270320

Bit long, but well-presented and worth a listen for any practicing software developer (or person who manages developers).

Apologies for the "flexible" usage of the words "strongly typed". Arguably (as someone already pointed out) Python can be considered strongly typed (i.e. doing 1 + "1" won't cast the first operand to string, unlike e.g. JS) although there doesn't seem to be a clear definition of what strongly and weakly typed means (https://en.wikipedia.org/wiki/Strong_and_weak_typing). So let's stick with "statically" typed, which in the case of Python would mean using something like mypy.

Here are some references (there might be some overlap):

- https://labs.ig.com/static-typing-promise

- https://danluu.com/empirical-pl/

- https://vimeo.com/9270320

- https://medium.com/javascript-scene/the-shocking-secret-abou...

- https://www.researchgate.net/publication/259634489_An_empiri... (one of the original studies)

I hope it's clear I'm not implying that there are no advantages in using a statically typed language, only that it's often seen as a solution to a problem that doesn't originate there.

PS: what is up with the downvotes? What is this, Reddit?

> Why is the modern software industry in such a constant state of flux?

This is true of both language design and software practices. It's painfully obvious. There are no standards for what constitutes "better" or even "good". Is strict typing better? To what degree? What's it worth as a tradeoff? There are almost no quantitative analyses (https://quorumlanguage.com/ - an attempt was made), which is lamented by some (https://vimeo.com/9270320).

Even ROLES are not well defined (https://www.youtube.com/watch?v=GhfVK_ubk8U) because roles are dependent on known concerns, which are also fuzzy in this industry. Saying development is immature is so understated, it's laughable.

Obligatory (and still relevant): https://vimeo.com/9270320
ScottFree
I have his book and refer to it often.
I found a video about the paper: the presenter claims the author of the paper (not the same person) did a double-barrelled correlation and that no metric had better predictive value than "simply doing wc -l on the source code".

See minute 39:30: https://vimeo.com/9270320
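
(For anyone unfamiliar with that baseline: `wc -l` is nothing more than a raw line count. A throwaway Python sketch of using it as a "metric" - the ranking and file paths here are purely illustrative:)

    import sys
    from pathlib import Path

    def loc(path: Path) -> int:
        # The same number `wc -l <file>` prints: raw lines, nothing cleverer.
        with path.open(errors="ignore") as f:
            return sum(1 for _ in f)

    # Rank files by raw line count, the baseline mentioned above.
    for f in sorted((Path(p) for p in sys.argv[1:]), key=loc, reverse=True):
        print(f"{loc(f):8d}  {f}")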

0x445442
Thanks for finding that.
the_af
You're welcome. I've watched the whole video and it's pretty interesting. Not being Canadian or Australian I missed most of the jokes though :P
Dec 06, 2019 · 2 points, 0 comments · submitted by taeric
Bret Victor's talk Inventing on Principle (https://vimeo.com/36579366) and, even more so, Greg Wilson's talk from the same conference (https://vimeo.com/9270320) marked a pivotal moment for me in my work ethic. It really motivated me to expect better from the tools I use every day.
chrisweekly
Similar reaction to those talks. Also, if ever someone knew how to write an "evergreen" post (i.e., one that holds up over time), it's Bret Victor.

His "Ladder of Abstractions" is a great example:

http://worrydream.com/#!2/LadderOfAbstraction

kragen
I really enjoy his writing, but I think he has a way to go before he's in the same league as Homer, Plato, Lao Tse, and Shakespeare!
chrisweekly
haha, those guys sure knew how to write a good blog post!
kragen
Yup! They got people still clicking on their posts hundreds or even thousands of years later
mcphage
Also his essay Magic Ink: http://worrydream.com/MagicInk/
> https://vimeo.com/9270320

There's very little you can prove, other than "less code gives you fewer errors/bugs". Arguably, this is primarily because specification gaps are opaque and there is less logic holding state. You can't file a bug or raise an issue about a choice that was already made by a dynamic language.

> It's only a shame if you presume to know better than all of those developers.

Statistically, someone is going to be right. The "right tool for the job" trope does not extend to every facet of every choice. The totality of developers are not experts in every set of practices (while it still may apply to some cases). Less code is better and until there's a study to say something more, I'm not interested in the hand waving.

> > Many of the people who came up with or promoted the idea of dynamically typed languages

> Most of the designers of the initial statically typed languages came from that same era

This isn't an argument about better, but precisely the point you turn around and make. It's not worse. Less code is better, for sure.

> Without something like Smalltalk's "open a code editor when an error occurs" live debugging/editing experience, you'll never get the real benefits of dynamic typing.

Debuggers allow this. You generally don't want the user to have this power.

There is a shocking lack of science in the industry. The Quorum programming language made some attempts at advancement. It's been decades, and I still see the same squabbles and continue doing 10x more with dynamic languages than when I am forced back to something like Java. At some point, I have to assume that either I'm special or one of those language choices is crippling.

munificent
> Statistically, someone is going to be right.

Yes. If you ask 100 people what the answer to "3 + 5" is, you'll mostly get "8". But that's because you're asking them all for the right answer to the same problem.

If you ask 80 of them the answer to "3 + 5" and 20 the answer to "3 + 2", the right answer isn't 8 and the people who answered 5 for the latter aren't wrong. They are solving different problems.

Given the breadth of computing today, it seems very unlikely to me that all programmers are solving the same problem, and thus that there is a single objective right answer for what language or language paradigm is best. It certainly doesn't align with my own personal experience, where I can't point to a single language that I would prefer for all of the different kinds of programs I've written.

> Debuggers allow this.

Yes, with limitations. A SmallTalker will tell you that debuggers are a pale imitation of the full experience. (I don't have much first-hand experience with it myself, but I know people who get misty-eyed when you ask them about it, despite being very familiar with "modern" debuggers.)

> You generally don't want the user to have this power.

I don't disagree with you personally, but there's a counter-argument that forcibly separating people into "developers" and "users" is itself a moral failing akin to welding the hood shut on a car.

> There is a shocking lack of science in the industry.

I'd like more science too, but I don't find its absence that shocking. PL is very hard and expensive to study scientifically. Doing controlled experiments is very difficult when step one is "Design an entire programming language, implement it and all of its tools and ecosystem, and then get people to spend a long amount of time learning it to proficiency."

That's a lot more difficult than "Take a sip of two sodas and tell me which one you like more", and even that simple experiment turned out to be famously flawed.

civility
> I can't point to a single language that I would prefer for all of the different kinds of programs I've written.

I never liked the "pick the right tool for job" cliche in the context of programming languages. I'm curious if you can imagine a single programming language which you /would/ prefer for all of the different kinds of programs you've written. Not that it does exist, but could it?

Other than size, weight, power, and price limitations, I don't pick different computers for different problems. I mean I could live with one ISA for pretty much everything, including GPUs. I'm sure different people would pick different answers (one of your points), but after looking at a lot of languages over the years (including some of yours), I can't come up with two features I want which are inherently in conflict and necessitate being different languages.

I don't think it has to be a superset of all languages monstrosity either. And for the sake of argument, let's say this is just for one-person development. There's too much politics in trying to decide what features you /don't/ want your coworkers to abuse. :-)

mpweiher
> never liked the "pick the right tool for job" cliche in the context of programming languages

Me neither. Many of the differences are fairly random, at least in relation to the task they're being applied to.

Reminds me of the distinction we had in the late 80s and early 90s between "server" and "client" operating systems. "Client" operating systems had user friendly GUIs and crashed a lot. "Server" operating systems were solid but didn't have (nice) GUIs. Makes sense, right? Except that it was complete hogwash, there was no actual reason for it except random chance/history. As NeXTstep amply proved.

Why do we have Java with bytecodes on the server? This was initially invented for small machines, and the bytecodes/VM were for applets and "write once, run anywhere". How does that make sense on a server? You are deploying to a known machine. With a known instruction set architecture. It doesn't, that's how. But Java failed on the desktop and the server was all that was left.

> I don't think it has to be a superset of all languages monstrosity either.

Agreed. Most programming languages are actually quite similar. I am personally finding that the concepts I am adding to Objective-Smalltalk[1] work well, er, "synergistically" in (a) shell scripting (b) application scripting (c) GUI programming (d) server programming. Haven't really tried HPC or embedded yet.

[1] http://objective.st

jasode
>I'm curious if you can imagine a single programming language which you /would/ prefer for all of the different kinds of programs you've written. Not that it does exist, but could it?

>However, if the one true language already existed,

The one true language can't exist because we want to use a finite set of characters to express convenient programming syntax. (A previous comment about this.[0])

It might be possible to craft a single optimal language for only one particular programmer, but I doubt even that limited scenario is realistic. Consider trying to combine the syntax of 2 languages that many programmers use: (1) bash, (2) C Language.

In bash, running an external program is a first class concept. Therefore the syntax is simple. E.g.:

  gzip file.txt
  rsync $HOME /backup
Basically, whatever one types at a bash command prompt is just copy-pasted into a .sh file.

But in C Language, external programs are not first-class concepts so one must use a library call such as "system()":

  #include <stdlib.h>   /* needed for system() */

  int main(void)
  {
    system("gzip file.txt");
    system("rsync $HOME /backup");
    return 0;
  }
In C, we have to wrap each external program in system("..."). We have to add the noisier syntax of semicolons after each line. It's ugly and verbose for scripting work.

In the reverse example, C makes it easy to bit-shift a number using << and >>.

  y = x << 3;
How would one transfer that cleanly and conveniently to bash? Bash uses a bunch of special symbols for special functions.[1] Bash uses << and >> for input/output redirection. Therefore, bash would need noisier syntax such as "bitshiftleft(x, 3)".

So, if we attempt to create a Frankenstein language called "bashclang" that combines concepts of bash and C, which set of programmers do we inconvenience with the noisier syntax?

What if we just tweaked C's parsing rules so that naked syntax to run external programs would look like bash? Well, what if you have executable binaries with names like "void", "switch"? Those are reserved names in C Language.

Same thing happens with other concepts like matrices. In Julia and Mathematica, matrices are first class. You can type them conveniently without any special decoration. But in Python, they are bolted on with a package like NumPy. So one has to type out the noisier syntax of np.full() and np.matmul().
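
A short illustration of that "bolted on" point, using the NumPy calls named above (assumes NumPy is installed):

    import numpy as np

    a = np.full((2, 2), 3.0)        # 2x2 matrix of 3.0s, via a library call
    b = np.array([[1.0, 2.0],
                  [3.0, 4.0]])      # even a literal goes through np.array(...)
    print(np.matmul(a, b))          # matrix product, again a library function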

Convenient syntax to enable easy-to-read semantics in one language leads to contradictions and ambiguity in another language.

To add to munificent's comment, I also don't see how one language can offer both garbage-collected memory and manually allocated memory with convenient, concise syntax _and_ zero runtime performance penalty for manual memory. Those two goals contradict each other. When I want to write a line-of-business type app, I just use C# with GC strings. On the other hand, when I'm writing a server-side app that's processing terabytes of data, I can use C++ with manually allocated strings and no virtual-machine runtime overhead for max performance.

[0] https://news.ycombinator.com/item?id=15483141

[1] https://mywiki.wooledge.org/BashGuide/SpecialCharacters

civility
> So, if we attempt to create a Frankenstein language that combines concepts of bash and C, which set of programmers do we inconvenience with the noisier syntax?

I won't speak for others, but I can make that choice for myself, and I'm willing to give the C-like language the upper hand. If bash-like things were a high enough priority, I might change the name "system" to "run" so it was just a bit more concise, perhaps taking multi-line strings to tidy it all up. I'm not saying one language to rule them all, I'm just saying I could have one language for nearly everything I've done or want to do.

What I was talking about was more like what features does the language have. For instance, I like algebraic data types (sum/product types). I like generics/templates. I want an "any" (variant) type. I want complex numbers and matrices. I like operator overloading so I can implement new arithmetic types. I want simple and immutable strings. I want structs and unions. I want SIMD types. I could also list things I don't want.

Anyways, I could go on, but all of those fit in a single efficient and expressive language. Some current languages come close, but get important details wrong.

> I also don't see how one language can offer both garbage-collected memory and manually allocated memory with convenient, concise syntax _and_ zero runtime performance penalty for manual memory. Those two goals contradict each other.

There are a lot of details that matter, and I can already anticipate some of your objections, but I would be very happy with automatic reference counting on a type system which precludes reference cycles. I would not use atomic increments or decrements (which is one of the more costly aspects of reference counting), and I would not let threads share data directly. This provides deterministic memory management and performance not too short of what you get in C, C++, or Rust.

So not "zero-cost", but damned close. It's also simple enough to think about the implementation so you can easily keep the non-zero-cost parts out of the inner loops.

Of course someone else would disagree and say they can't accept this (minor) compromise.

Ousterhout had a famous quote about needing both a high and low level language. For him, that was Tcl and C. I think I could have everything I need/want for high and low level tasks in a single elegant language. You're not alone in disagreeing :-)

koolala
Have you ever considered what a VR programming language could look like? A language doesn't have to be black and white text. C and Bash don't have to directly overlap if technology creates simple syntax that expands beyond how keyboards type.
civility
> A language doesn't have to be black and white text

I agree. I'd like to be able to insert pictures to explain data structures and algorithms, or equations for the mathy bits. I'd like to be able to choose different font sizes for different parts of the code to indicate their relative importance. Instead of files in directories, I'd like to be able to organize functions clustered in 2D regions of a page. I'm not sure what you mean by VR (3D?), but I'd be curious to see it.

> C and Bash don't have to directly overlap if technology creates simple syntax that expands beyond how keyboards type.

I distinctly don't want a polyglot catch all set of languages. I mean you can almost do that in the .Net world where a project can use many languages. I don't have any idea what would be better than a keyboard or touch screen for entering the syntax.

cr0sh
I've never used it, but based on what I have read about it, this language seems fairly radical (well, not completely - LISP could be considered a prototypical form?):

https://www.jetbrains.com/mps/

It is open-source, too:

https://github.com/JetBrains/MPS

It seems to have active development, and - interesting aside - it is written in Java.

But again - it purports to do what you seem to be explaining here, and a bit more: It's a system that lets the programmer define the programming language as they use that same language, for the specific purpose at hand (aka, DSL - Domain Specific Language).

As I've noted - I've not used this tool, but I've kept it in the back of my mind as the concept seems very fascinating to me (I don't know if it is practical, workable, or anything else - but I do think it's a "neat" idea).

munificent
> I'm curious if you can imagine a single programming language which you /would/ prefer for all of the different kinds of programs you've written. Not that it does exist, but could it?

Nope. Granted, I may work on a greater breadth of software than the average programmer. But, at the very least, I have implemented language VMs and garbage collectors where I needed to work at the level of raw bytes and manual memory management. But I sure as hell prefer memory safe languages when I'm not doing that.

I like static types for decent-sized programs, but I also use config files and other "data languages" where that would be more frustrating than anything.

Even if there was a single language that was perfect for me for all of the code I write, I don't expect that that language would be perfect for others, and I don't think those people are wrong.

mpweiher
I find Objective-C to have that range. I have used it for implementing everything from kernel drivers (DriverKit, yay!) to programming languages to server apps and GUI apps.

Not perfect at the entire range, but it does have it.

And having used it and seen what worked well and what didn't, I have some ideas as to how to make it better.

I think it could be improved by having the Smalltalk-side be the default and then add mechanisms to move towards the machine again. Either very simply (add some primitive type declarations) or with greater power, from a less constrained base.

civility
I always liked the way Objective-C added the message passing syntax in a way which fit in well with C. The [squareBrackets means: "we're in SmallTalk land"] looks nice to me. The @ (at sign) sigils I don't appreciate as much.

> I think it could be improved by having the Smalltalk-side be the default

It does kind of seem like you'd want the higher level language on the outside and only dive into the lower language when you need it, but I could go either way for that.

civility
> I have implemented language VMs and garbage collectors where I needed to work at the level of raw bytes and manual memory management

Fair enough, and I guess I'm forced to agree a little. However, if the one true language already existed, the VM/GC problem wouldn't have to be solved twice. Somebody had to write the first assembler in machine code, too.

I've written a (Hans Boehm style) GC of my own, and I admit that wouldn't fit with what I had in mind either, but working with raw bytes is a solvable problem in almost any level of programming language as a few library functions. All the batch or command line utilities, GUI applications, back end server modules, and most of the one-off exploratory programs could fit in a single elegant language.

> I like static types for decent-sized programs, but I also use config files and other "data languages" where that would be more frustrating than anything.

Again I agree, but (to me) this has a solution. I could be very content with a statically typed language with a single variant type for when you need to handle JSON-ish type dynamic variables or hierarchical data. I think dynamic and static typing can coexist very nicely in one language.
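
A sketch, in Python/mypy terms, of what such a "single variant type" can look like; the recursive alias below is illustrative, not from any particular standard:

    from typing import Union

    # One recursive "variant" covers JSON-ish dynamic data...
    JSON = Union[None, bool, int, float, str, list["JSON"], dict[str, "JSON"]]

    # ...while the functions that consume it stay statically typed.
    def count_keys(doc: JSON) -> int:
        if isinstance(doc, dict):
            return len(doc) + sum(count_keys(v) for v in doc.values())
        if isinstance(doc, list):
            return sum(count_keys(v) for v in doc)
        return 0

    print(count_keys({"a": 1, "b": [{"c": 2}, None]}))   # -> 3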

> Even if there was a single language that was perfect for me for all of the code I write, I don't expect that that language would be perfect for others, and I don't think those people are wrong.

I did caveat that this was for single-person programs and that different people would make different choices. I think my point was just that, having looked at languages from Icon to Prolog to SQL to Scheme to OCaml to Rust to C++ and a lot of others, I think there is a point in the high-D trade space where I would be content to live and breathe for almost every programming problem.

I've never gotten anyone else to agree, but I think it's an interesting exercise to fill in the details. I mean, computers are so much more malleable than real-world tools - you could have a single thing which handles screws, nails, rivets, and bolts effectively.

...and youtube doesn't have everything.

e.g. "What We Actually Know About Software Development, and Why We Believe It's True".

https://vimeo.com/9270320

Seriously, what was the point of your comment?

1) Do not attempt to live code. Change your code via version control, build small videos, copy and paste... whatever you do, do not try to live code.

2) Read "The Presentation Secrets of Steve Jobs"... it's not really about Steve Jobs so much as it's about how to be a great presenter.

3) Building or writing a talk is significantly less important than the delivery... which means that more of your time should be spent on how to deliver the speech (recording yourself, I find, is the best way to practice).

4) Pick a single thing and do it well; do not attempt to put too much into a talk or tackle something massively complex.

5) Make sure that everyone is walking away with something they can actually use. This doesn't require it be a hard skill... "soft" or philosophical utilities are just as good.

6) Have fun and be funny... Speakers have a tendency to want to show off the size of their intellect and the level of their competency - having a speech be entertaining is way, way more important.

7) DO NOT READ FROM YOUR SLIDES. Know your shit. Slides are supplementary.

Extra Credit. Here's a collection of some fantastic technical speeches:

* https://vimeo.com/9270320
* https://www.destroyallsoftware.com/talks/wat
* https://youtu.be/o_TH-Y78tt4
* https://www.youtube.com/watch?v=csyL9EC0S0c
* https://youtu.be/a-BOSpxYJ9M
* https://youtu.be/kb-m2fasdDY

> IBM used to report that certain programmers might be as much as 100 times as productive as other workers, or more. This kind of thing happens.

yeah, but that report is worthless, in context.

https://vimeo.com/9270320

This whole article seems like it was written by someone who is still attending high school.

#Softwarish - I'm biased a bit more towards interface development:

Greg Wilson - What We Actually Know About Software Development, and Why We Believe It’s True - https://vimeo.com/9270320#t=3450s

Steve Wittens - Making WebGL Dance - the title is deceptive; it's in some ways a visual crash course in linear algebra - https://www.youtube.com/watch?v=GNO_CYUjMK8&t=84s

Glenn Vanderburg - Software Engineering Doesn't Work - https://www.youtube.com/watch?v=NCns726nBhQ

Chris Granger - In Search of Tomorrow - https://www.youtube.com/watch?v=VZQoAKJPbh8

Alan Kay - Tribute to Ted Nelson at Intertwingled Fest - https://www.youtube.com/watch?v=AnrlSqtpOkw

Bret Victor - this is already mentioned, but if I had to pick one it'd be "The Humane Representation of Thought" - https://vimeo.com/115154289

#Hardwarish:

Saul Griffith - Soft, Not Solid: Beyond Traditional Hardware Engineering - https://www.youtube.com/watch?v=gyMowPAJwqo

Deb Chachra - Architectural Biology and Biological Architectures - https://vimeo.com/232544872

#Getting more meta in technology and history:

James Burke - Connections

This sort of feedback would be much more helpful and credible if you added what sort of project you're using Rust on, your technical background, and concrete facts (not just metaphorical opinions / complaints) about what goes wrong. As it is, this article helps no one make informed technology decisions. See https://vimeo.com/9270320.
You might want to take a look at "How to Teach Programming"[0].

Bonus: A talk entitled "What We Actually Know About Software Development, and Why We Believe It's True"[1]:

[0]: http://third-bit.com/teaching/
[1]: https://vimeo.com/9270320

You will enjoy the video in its entirety, if you haven't seen it.

At ~37:00 of https://vimeo.com/9270320 he covers:

El Emam et al (2001): The Confounding Effect of Class Size on the Validity of Object-Oriented Metrics

http://ieeexplore.ieee.org/document/935855/

Shouldn't everyone demand more before promoting ideas without any basis? (https://vimeo.com/9270320)

Even if it is just "tribal wisdom", is HN where you recruit a tribe from a vapor posting? I would hope you at least have a tribe before upvoting it up the pile. It feels like it's being promoted for no reason or through a backchannel.

It didn't even give a wiki link to the Chesterton's fence analogy. Sigh.

Obligatory bit from Greg Wilson's "What We Actually Know About Software Development, and Why We Believe It's True".

https://vimeo.com/9270320#t=1525s

And the book recommendation: "Why Aren't More Women in Science?: Top Researchers Debate the Evidence". (researchers from either side of the arguments debate each other in the same book.)

https://vimeo.com/9270320 (Greg Wilson - What We Actually Know About Software Development, and Why We Believe It’s True).

It seems we have scientific data showing that the number of code lines one can review per day doesn't depend on the programming language. This means one would spend a lot more time reviewing this C++ code if one were looking at the manually-rewritten-to-assembly version instead.

This is a very good argument supporting programming at the highest possible level of abstraction, which is the exact opposite of boilerplate and verbosity.

So please let's not rationalize/Stockholm-syndrome over the limitations of any programming language; boilerplate and verbosity are always bad.

Mar 13, 2017 · js8 on A comment left on Slashdot
If you enjoyed the OP, especially the comments about how unscientific SWE is, I think you will enjoy anything from Greg Wilson, such as:

https://vimeo.com/9270320

https://www.youtube.com/watch?v=FtKO619O5g0

wallacoloo
> especially the comments about how unscientific SWE is

After trying for months to get into a CSE course at my university (they had a disagreeable policy wherein CSE courses were almost entirely open only to CSE majors), I was surprised by this aspect. We were taught methods for proving correctness for recursion-based code, re-expressing recursion as iteration, and things like loop invariants - the latter in particular presented a more methodical way to approach programming than I had seen before. But the general reaction was almost uniformly "yeah, sure, like I'm ever going to use this".

And sure, I don't think about loop invariants when I write most loops. And KISS tends to make most of my loops pretty trivial to understand. But for more algorithm-intensive work, the ideas behind these approaches have proven to be really useful tools! It's disappointing for me to see scientific approaches to SWE so quickly shrugged off. I seem to see the "cobble something together, and then patch it until nobody complains about bugs" approach to coding more often than "assemble this code in a way where it's demonstrably correct and easy to comprehend", and I'm often in the minority by strongly preferring the latter.
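
For readers who haven't run into those ideas, a toy Python sketch of what "re-expressing recursion as iteration" and a loop invariant look like in practice:

    # Recursive form: correctness follows the definition almost directly.
    def sum_to_rec(n: int) -> int:
        return 0 if n == 0 else n + sum_to_rec(n - 1)

    # Iterative form: the invariant "total == 1 + 2 + ... + i" holds after
    # every pass of the loop, which is what justifies the rewrite.
    def sum_to_iter(n: int) -> int:
        total = 0
        for i in range(1, n + 1):
            total += i
        return total

    assert sum_to_rec(10) == sum_to_iter(10) == 55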

logicchains
That sounds more like computer science than software engineering. At my university, they were different majors, with the focus of the latter being UML and building large inheritance hierarchies, so I imagine SWE means different things to different people.
This topic is addressed by Greg Wilson in his exceptional talk: What We Actually Know About Software Development, and Why We Believe It’s True

https://vimeo.com/9270320

Feb 26, 2017 · 1 points, 0 comments · submitted by vs2
Funny. I ran across this survey when I had to write a document structure analysis step for a document analysis pipeline recently. I thought of this paper in particular when watching https://vimeo.com/9270320 ("What We Actually Know About Software Development"), which I ran across in the recent thread on tech talks.

Not a computer scientist (barely a scientist at all!), so I don't know if I'm being unreasonable or not, but I was a little dismayed at the literature here. Is this quality of publication common? Is it getting better? The lack of experimental methods and reproducibility seems abominable.

jahewson
I've found that it's common for computer science papers to reveal only about 60% of what is necessary to reproduce the result. Crucial steps of the algorithm are omitted. Numerous constants are used with no value ever given. Preprocessing steps are alluded to but never discussed. Datasets are kept private, or worse, charged for.

Anecdotally, I'd say that this seems to have been a worse problem in the 1980s and 90s, judging papers from that era. But it's still an issue today, especially with systems papers.

matt4077
This specifically is a survey where it's uncommon to carry out your own "experiments". You use the numbers from the original papers and hope there's some overlap in the metrics they use.

Reproducibility is a problem, especially in the sense that source code was traditionally not published along with the paper. There are a few reasons for that, e.g.:

- The publish in "publish or perish" isn't talking about github

- The grad student wrote the code and it's 3000 lines of FORTRAN

- The professor wrote the code and it's 6000 lines of ALGOL60

- The university owns the copyright but there's no process for OSS releases

- The authors own the copyright and can't wait to turn this into a commercial spinoff.

- It's 30 lines of python, inextricably linked to the internal domain-specific toolset that's 25GB including multiple copies of test data and a lot of external source code with unclear licenses.

- The proof-of-concept is only about 20% of the work of a polished library that's fit to release

It's getting better now with some fields establishing standards for data analysis pipelines, and some large companies with product experience doing a lot of academic work (e.g. Facebook/TensorFlow). The AI/ML community is probably a shining example, considering there are polished, ready-to-use implementations available for almost all publications.

roflc0ptic
>> This specifically is a survey where it's uncommon to carry out your own "experiments". You use the numbers from the original papers and hope there's some overlap in the metrics they use.

My comment was unclear. I wasn't criticizing the lit review, I was criticizing the literature it is reviewing. The lit surveyed does a poor job even defining metrics.

The rest is very interesting, and heartening. Thanks for the thoughtful response.

"Greg Wilson - What We Actually Know About Software Development, and Why We Believe It’s True"

https://vimeo.com/9270320

What We Actually Know About Software Development, and Why We Believe It’s True

https://vimeo.com/9270320

Feb 18, 2016 · 1 points, 1 comments · submitted by drhayes9
dalke
Greg Wilson video from 6 years ago. Not a new one, as I had hoped.
Minimizing keystrokes is seen as 'false', but it has actually been shown that one can write (debug, maintain, ...) about 10 lines of code per hour. Those 10 lines can be 10 lines of assembly, 10 lines of C++ or 10 lines of Haskell. Of course, you can do much more in 10 lines of Haskell than you can do in 10 lines of assembly. It basically means you want to work at the highest level of abstraction that you can afford. It also means that economy of expression is important. Source: https://vimeo.com/9270320
jerf
It may help to know what he's talking about. He's probably [1] not referring to whether you use C-style || vs. Python's "or". He's probably referring to things like this:

    NB. Initialize a global, but not if it's already been initialized
    NB. Example: 'name' initifundef 5
    initifundef =: (, ('_'&([,],[))@(>@(18!:5)@(0&$)) ) ux_vy ((4 : '(x) =: y')^:(0:>(4!:0)@<@[))
From the bottom of http://code.jsoftware.com/wiki/Phrases/Language_Utilities . NB. appears to be the comment marker, so the last line is the payload.

Yeah, I picked the worst thing I could find, and my guess is that that code is doing something really nasty and hacky, because it is too large to be doing only what it is described as doing. On the other hand, the fact that I pretty much have to guess with almost no information is sorta my point.

There's creating a computer language to be concise and readable, and then there's just concise. You can also have a look at Perl, which can (and generally should) be programmed as a fairly normal dynamic object-oriented language, but has a subset of functions, operators, and implicit variables that allows you to spit out terribly small, dense lines of few characters that do horrible things, usually not just doing whatever they are putatively doing but also having other mysterious side effects in the process.

[1]: I'm reasonably confident I'm on the right track here, but don't want to put words in his mouth, so I'm hedging. Especially since there's a non-zero chance the original author will show up here. As I type this, that has not happened yet.

bcbrown
> NB. appears to be the comment marker

Likely in reference to https://en.wikipedia.org/wiki/Nota_bene

Avshalom
It's a 'false god' as in not something you should worship at the feet of. Don't prefer fr to for; feel free to require empty () when calling a function without arguments; lambda instead of \ is fine.

It also doesn't mean you can't, just that you shouldn't use it as a/the primary concern.

seanmcdirmid
Then of course APL is king, since a 10-line APL program is huge. But that doesn't really bear out, given APL's reputation as a write-only language.

Abstraction also has a cost. You can cram a lot into 10 lines, but if you have to think a long time about the most elegant way to do that, you might not be winning in productivity.

Jtsummers
Probably more a warning against premature optimization.

If you can eliminate all ; in most, or even all, circumstances in otherwise C-like syntax, what do you gain?

And most of the savings you see there are from having greater levels of abstraction available.

Assembly = arithmetic. You have to explicitly state each step, and on its own there's no understanding of what you're operating on, they're all bits (numbers).

C++ (ok, more C) = algebra. You have a bit more understanding of your structure, better names and concepts behind things (types = units, for instance). You're still operating on bits, but now most of those bits have a specifically, language encoded, meaning.

Haskell = calculus. Not only are your bits typed, so are your functions (ok, technically also typed in C and kin syntaxes, but not usually as useful) and you can pass them around and modify them.

EDIT:

I'm more picking on the class of syntax available.

When you include semantics OO languages offer a great deal more expressive power than most procedural languages. Probably should've just had C and not C++, but it is what I wrote.

More generally what I should have written was towards language classes: assembly (as most primitive procedural language), structured procedural languages (C, Go, Algol, Ada, etc.), OO languages (Java, C++, Smalltalk, etc.) and functional languages (lisps, Haskell, ML-family, etc.).

But then you have to break it down further. OO is not inherently more or less expressive than functional languages, it depends on the feature set of the OO and functional languages. Smalltalk is very expressive, easily comparable to the more expressive functional languages. Java, as it used to be, not familiar with developments post adding generics so I'm really rusty on it, was not so expressive. So a comparison between Smalltalk and Common Lisp would put them on a similar level of abstraction. But Common Lisp versus Java (again, with my outdated knowledge), Common Lisp is at a higher level of abstraction (both syntactically and semantically).

josteink
> Those 10 lines can be 10 lines of assembly, 10 lines of c++ or 10 lines of haskell.

I seriously doubt the validity of this. 10 lines of assembly is usually reasonably clear-cut about what it's doing.

10 lines of Haskell can take hours of mindfuck trying to peer through the functor and monad operators, trying to work out which operation does what and which operator takes precedence where. And then you need to start worrying about which data-type are you actually working on. And how does that type implement lift and bind. Etc etc.

Basically 10 seems like a BS number taken out of thin air, because it looks good in a base 0x0a number-system.

codygman
> 10 lines of Haskell can take hours of mindfuck trying to peer through the functor and monad operators,

Please stop. Can you even come up with one example that supports this? Alright, now how about 10 lines from a "real world"* Haskell project.

*real world meaning has at least 1 user besides its developer

saurik
I think the more important counter-argument is "that ten lines of assembly just barely did one thing; the ten lines of Haskell was most of the program: clearly it will be easier for you to understand a single wire or switch than to understand an entire computer, but that isn't a valuable insight".
codygman
Hyperbole (if that is what the commenter was using and didn't seriously think it) should be used carefully when it reinforces stereotypes that are so damaging.
Jtsummers
The origin of this idea, or at least my first introduction to it, was in The Mythical Man-Month. I don't have a copy at hand, so I can't read what Fred Brooks wrote specifically. 10 is very precise, but really the writing was about the order of magnitude that people can manage in a day. It's about the level of complexity involved.

10 lines of assembly is very clear about what exactly is happening. It's a sequence of adds, loads, branches, stores, moves, etc. Each step is incredibly clear, but what are they operating on? Now it's not so clear. What is R2 at this point in the program? Am I calling a subroutine? On this architecture is there some way to automatically handle this with storing the PC and everything, or am I going to use a regular branch instruction? Regular branch? Ok, now I'm storing the PC and other information away per our calling convention.

C: Oh, I'm calling a function. I'm calculating a formula. I have named variables, so it's clear what I'm operating on (hopefully, this is at least possible here, even if not well done in practice). With typedefs, good naming conventions, structs and enums, you can maintain an order of magnitude greater complexity in the same time as an assembly programmer, in an order of magnitude lower amount of code.

Haskell: I have functions, I have types, I operate over these types transforming them from one to another. Collecting objects of a type here, transforming them there, zipping them with those, and back again. And I've written a function that would take dozens of lines of C or grosses of lines of assembly, and it took me an hour. Again, an order of magnitude more complexity can be entertained in another order of magnitude less code.

Don't get stuck on the specific number, that's incredibly pedantic and unhelpful. It's about the degree and magnitude.

> The science of developing robust software is great and pretty consistent going back decades varying mostly in specific tools and tactics

Nope (since uh-huh/nuh-uh is the starting point you've taken). Most science is BAD SCIENCE and I have doubts about the claims made by either the author or the topics you reference. Almost all formal studies are about specific tools and tactics in small sample sizes. These aren't proven science so much as test results. - https://vimeo.com/9270320 + http://quorumlanguage.com/evidence.php should orient you toward what proof is and how we can reference it in a reasonable manner. There will be plenty of bias to argue about how the tests are conducted or what the data demonstrates.

nickpsecurity
You reply with a combination of trolling and endorsing a specific link that you believe represents doing it better. The first thing I looked at, the syntax study, was already weaker than prior work I've seen on the subject. The language choice was highly biased while not focusing on enough metrics for comparison of strengths and weaknesses. Between field use and its first "evidence," your Quorum is already weaker than most of what I presented in terms of empirical weight. Due to the trolling, I'm hesitant to even watch the video in case his presentation is something like yours in terms of content.

@ non-trolling, HN readers also concerned about science in software

Back to science, though, given other readers might be interested in that point. Software development is a combination of human- and machine-driven processes to turn some requirements into working, maintainable code. A subset of this, significant it turns out, can be analyzed for effectiveness with enough different case studies, sample sizes, objective measures, and repeats to make a claim about it with believable evidence. For process stuff, it's not going to be as logically formulated or minutely analyzed as a math equation or something. Not usually, anyway. Instead, it works more like this:

1. A recurring problem exists in the development process or artifact.

2. Measurements are taken of how often that problem occurs in general and for specific scenarios.

3. A hypothesis is formed as to why it exists and a solution to it.

4. The hypothesis is tested by applying the technique to several projects likely to experience the problem with objective records on outcomes and subjective accounts from users for a take on less tangible aspects.

5. Significant reductions in the problem in the objective data provide supporting evidence that the theory is correct.

6. The theory (solution) is tested by others against the problem to find more supporting evidence or counter-examples to the claim.

7. If it's correct, the results continue to speak for themselves in it being a solution with likely modifications as the problem and solution space are further explored.

Note: If it's performed by humans, it's wise to either look out for or control for issues like Hawthorne effect. Must be sure that it's the overall solution that's working rather than people's heightened attention to an issue being studied.

The above 1-7 have applied to my list of techniques to varying degrees ranging from strong analysis + repeated, field results (eg formal specs, code reviews) to clear theory, models, assessment, and massive field results (eg type-systems, usage-driven testing). The evidence is quite clear and favors them as effective with repeated successes that matched the theory. Many other aspects of programming I left off the list because the evidence or studies were not clear. Or were non-existent. Hence me not including the parent's strawmen ("most science") or red herring (Quorum) on a list of techniques with decades of measured results behind them across many software projects.

Science, what little exists in our field, is on my side if the reader is focused on evidence aspects such as the Why, How, and Got Job Done. They collectively make up The Right Thing in robust system development. YMMV for individual techniques or tools on a given project. Pick the best tools for the job, prioritize your development budget, and so on. Common sense. People looking at where to start dealing with recurring issues or overall robustness have an evidence-backed list in my main comment, though. There are also papers and books on each showing where and how to apply them most effectively.

jack9
> Quorum is already weaker than most of what I presented in terms of empirical weight

Saying you have presented empirical weight is not compelling when you haven't and, frankly, can't.

It's telling that the rest of your pointless counter-troll? is devoid of proof when leading with "back to science", which is sad.

nickpsecurity
"Saying you have presented empirical weight, is not compelling when you haven't and, frankly, can't."

The proof is in the literature I'm referencing rather than me dumping all of it in the comment box. Anyone studying PL, QA, and so on literature should have seen research on effectiveness of basic techniques. Here's a few examples of methods of evaluation I claimed were scientific vs people guessing that their stuff seems better in vague ways.

1. Experiments conducted between similarly skilled groups doing the same things, with the only difference being language or methodology. Those with the new technique do significantly better in ways that match the expected properties of the technique. Hypothesis based on real-world data, experimental confirmation, fix/reject if necessary, rinse, repeat.

2. Study of problems in software from a specific language identifies specific features as root cause, proposes a fix, the fix is implemented in significant test cases, and the problem goes away entirely or is significantly reduced. This is replicated across groups and codebases. Easiest example here is safe (Ada) or automatic (GC) techniques to manage memory to reduce or eliminate its many, common issues. Supported by theory, sometimes proofs, & countless applications in industry.

3. A certain verification approach espouses specific modeling, analysis, refinement, etc approach to achieve certain goals. The work on formal side translates into finding defects, enhancing performance, etc on the code side. Implies the model or method were correct, esp with diverse replication of results. This happened a lot in Orange Book days for catching security violations but most applicable to hardware with formal, equivalence checking. Theory, some proofs, and confirmation in production.

4. In some cases, properties are mathematically formulated, proven in theorem prover, checked by humans + machines, and tested on real-world examples to further validate them. Two words: type systems. Again, theories, proofs, and real-world validation. Tons of testing, too.

Most of the best evidence in the literature works something like that. You're saying none of the above constitute scientific evidence for a claim about a software tool or method? That's a big claim itself that contradicts what most of research community thinks. The people successfully finding bugs with Astree, avoiding out-of-bounds with languages that don't allow it, catching safety violations with protocol checkers, and so on will all be surprised about your claim. They and I thought the tools would work based on evidence presented and field results that confirmed the theories. Turns out we were all drinking spiked Kool-Aid, leading to hallucinations that followed. Or wrong on a technicality.

Got a link that briefly explains why all of that is unscientific and what methods count as scientific evidence in software tools? (without 1hr videos)

Note: Looking up more on Quorum is on my agenda. Already saved What Works Clearinghouse document. I'm not rejecting it outright in our discussion. Yet, you've indicated proven models, worked case studies, analysis techniques effective on 1+ million plus samples, mathematical proofs... that none of this is science. So, I'm starting with wanting an easy resource on what you think counts as scientific evaluation, argument, and method. Then, I can go from there in future discussions about specific topics and what science says about them.

So how many bugs remain?

Mostly rhetorical question, but can any extrapolation be done? If you go back five years, can any of those numbers correlate to the findings since? Do any metrics such as cyclomatic complexity, #defects/kLoC[1][2], unit tests or code coverage help?
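
(As a side note, defects/kLoC is just a ratio; a quick Python sketch of the arithmetic with made-up numbers:)

    defects, loc = 120, 85_000
    print(f"{defects / (loc / 1000):.2f} defects per kLoC")   # -> 1.41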

In most cases "defect" is not well-defined, nor in many cases easily comparable (e.g., a typo in a debug message compared to handling SSL flags wrong). Is it a requirements or documentation bug: was the specification given to the implementer not sufficiently clear, or ambiguous? Also, when do we start counting defects? If I misspelled a keyword and the compiler flagged it, does that count? Only after the code is committed? Caught by QA? Or after it is deployed or released in a product?

Is it related to the programming language? Programmer skill level and fluency with language/libraries/tools? Did they not get enough sleep the night before when they coded that section? Or were they deep in thought thinking about 4 edge cases for this method when someone popped their head in to ask about lunch plans and knocked one of them out? Does faster coding == more "productive" programmer == more defects long term?

I'm not sure if we're still programming cavemen or have created paleolithic programming tools yet[3][4].

p.s.: satisfied user of cURL since at least 1998!

    [1] http://www.infoq.com/news/2012/03/Defects-Open-Source-Commercial
    [2] http://programmers.stackexchange.com/questions/185660/is-the-average-number-of-bugs-per-loc-the-same-for-different-programming-languag
    [3] https://vimeo.com/9270320 - Greg Wilson - What We Actually Know About Software Development, and Why We Believe It's True
    (probably shorter, more recent talks exist (links appreciated))
    [4] https://www.youtube.com/watch?v=ubaX1Smg6pY - Alan Kay - Is it really "Complex"? Or did we just make it "Complicated"?
    (tangentially about software engineering, but eye-opening for how much more they were doing, and with fewer lines of code) (also, any of his talks)
Jan 17, 2015 · 1 points, 1 comments · submitted by StylifyYourBlog
dalke
It's a Greg Wilson talk, which immediately made my heart jump. Then I saw it was from 2010, and I've already seen it.
If you want an overview of the ideas behind this sort of research and a quick summary of some results, Greg Wilson gave a great talk on it[0].

I haven't read through the site to see what is there, but software engineering methodology and technique research* uses techniques from research of management techniques in business, making it closer to psychology or sociology. For more information, the blog "It Will Never Work in Theory"[1] does a good job of highlighting these sorts of results that are directly useful and has some explanation of the tools they are using to study software engineering practices. The book Making Software[2] goes into much more detail on software engineering research methodologies if you are interested.

*As opposed to CS theory research that could be used in software engineering, which is usually math.

[0] http://vimeo.com/9270320
[1] http://neverworkintheory.org/index.html
[2] http://shop.oreilly.com/product/9780596808303.do

mathattack
Thanks! I had Making Software on my bookshelf, and someone "borrowed" it. I'll need to "borrow" it back. :-) The challenge from its intro was that anyone in the field will overstate the truth in the research. I used to be a business book junkie until I realized what a weak foundation most of it was built on. I've gradually come back to the genre but more for context and story than predictive power.

The Halo Effect [0] amped up my skepticism. Of course it was a business book [1] that introduced me to the Halo Effect... :-)

[0] http://en.wikipedia.org/wiki/Halo_effect
[1] http://www.amazon.com/The-Halo-Effect-Business-Delusions/dp/...

I think you'll like this; http://vimeo.com/9270320
Dec 25, 2013 · 2 points, 0 comments · submitted by nvr219
It was mentioned in this talk: http://vimeo.com/9270320 and in the corresponding book "making software" http://shop.oreilly.com/product/9780596808303.do

have fun.

asabjorn
Have you been able to find the Thomas et al paper that supports this claim? I have been unable to find it, and others seem to have this problem as well:

http://www.gdb.me/computing/citations-greg-wilson-cusec.html

toolslive
the slides from the talk are here: http://www.slideshare.net/gvwilson/bits-of-evidence-2338367 (see slide 20)

I realized today I got it from somewhere else... (Facts and Fallacies of Software Engineering). Looking up the whole reference in that book gave me the title, and googling that gave me the paper:

http://drum.lib.umd.edu/bitstream/1903/703/2/CS-TR-3424.pdf

Sep 28, 2013 · spellboots on Is TDD Worth It?
It seems like there is a sea of absolute assertions with few facts backing them up - people seem to love saying either that TDD is better or that TDD is worse, and the only data ever really offered on the subject is anecdotal opinion.

We can do better than this. This study[1] at Microsoft reveals that TDD results in a defect reduction rate of between 40% and 90%, at a cost of 15%-35% increased development time. So, what's the best approach? It now becomes a business question. If speed of delivery is crucial and quality less so, don't use TDD. If quality is more important than speed, use TDD.
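
To make that business question concrete, a rough Python sketch using the study's ranges and invented baseline numbers (the pairing of the low/high ends is illustrative only):

    baseline_defects, baseline_hours = 100, 1_000
    for reduction, extra_time in [(0.40, 0.15), (0.90, 0.35)]:
        defects = baseline_defects * (1 - reduction)
        hours = baseline_hours * (1 + extra_time)
        print(f"{reduction:.0%} fewer defects: {defects:.0f} defects, {hours:.0f} hours")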

And finally, everyone involved in building software owes it to themselves to take an hour to watch this excellent video by Greg Wilson, "What We Actually Know About Software Development, and Why We Believe It's True" [2], and then perhaps dive into the list of references from it [3]; you will come away better informed than anecdote and opinion will ever get you.

[1] http://research.microsoft.com/en-us/groups/ese/nagappan_tdd.... [2] http://vimeo.com/9270320 [3] http://www.gdb.me/computing/citations-greg-wilson-cusec.html

dmpk2k
I've read the Microsoft paper, and I'm not clear on something:

The paper is comparing TDD vs non-TDD groups... but what are the non-TDD groups doing? Are they test-last? Are they ad hoc? Do they test at all?

Did I misread something?

Also, Greg Wilson's video is excellent.

gry
Thank you for Greg Wilson's talk.

I've been trying to find empirical information about software development and he's frank about the lack of it. He noted work by Lutz Prechelt [http://www.inf.fu-berlin.de/w/SE/OurPublications], for one.

RogerL
As the other poster pointed out, this paper is measuring nothing - or at least it is not written in a way that lets us know.

I find it unremarkable that a sudden focus on quality and testing reduced the bug rate. It does not follow that TDD is the cause, nor that TDD is the best of multiple options.

It's a well-known effect in experimental design: measurement alters what you measure. Get depressed people to watch and talk about a dog video, and I can predict that they will be less depressed - not so much because dog videos are great at reducing depression, but because all of a sudden so much experimental interest is being directed at them. It would be almost churlish not to feel better, if you know what I mean. So we don't design experiments that way: if your hypothesis is that dog videos cure depression, you need to run exactly the same protocol with cat videos, or truck videos, or whatever, so that both groups get the same attention, both are trying to reduce their depression, and so on. (I'm assuming an unblinded design here because the TDD paper was obviously unblinded; we obviously have better methods for a depression study than my suggestion here.)

Kudos for bringing empiricism into the discussion, but I do find the paper lacking.

I wrote above that I don't use TDD yet achieve low defect rates and good design. I think that's because, no matter how you do it, if you focus on those things you are more likely to achieve them than if you don't. I hypothesize (don't be mad!) that TDD works for some simply because it brings the issue of quality to the forefront of their mind. I'm already always thinking "is this line of code going to kill somebody"; I've tried TDD, and it always seemed to add little or to actively inhibit me. Your mileage will vary.

Watch Greg Wilson's talk "What We Actually Know About Software Development, and Why We Believe It's True" -- he cites research that supports working remotely (http://vimeo.com/9270320).
Summary of the links shared here:

http://blip.tv/clojure/michael-fogus-the-macronomicon-597023...

http://blog.fogus.me/2011/11/15/the-macronomicon-slides/

http://boingboing.net/2011/12/28/linguistics-turing-complete...

http://businessofsoftware.org/2010/06/don-norman-at-business...

http://channel9.msdn.com/Events/GoingNative/GoingNative-2012...

http://channel9.msdn.com/Shows/Going+Deep/Expert-to-Expert-R...

http://en.wikipedia.org/wiki/Leonard_Susskind

http://en.wikipedia.org/wiki/Sketchpad

http://en.wikipedia.org/wiki/The_Mother_of_All_Demos

http://io9.com/watch-a-series-of-seven-brilliant-lectures-by...

http://libarynth.org/selfgol

http://mollyrocket.com/9438

https://github.com/PharkMillups/killer-talks

http://skillsmatter.com/podcast/java-jee/radical-simplicity/...

http://stufftohelpyouout.blogspot.com/2009/07/great-talk-on-...

https://www.destroyallsoftware.com/talks/wat

https://www.youtube.com/watch?v=0JXhJyTo5V8

https://www.youtube.com/watch?v=0SARbwvhupQ

https://www.youtube.com/watch?v=3kEfedtQVOY

https://www.youtube.com/watch?v=bx3KuE7UjGA

https://www.youtube.com/watch?v=EGeN2IC7N0Q

https://www.youtube.com/watch?v=o9pEzgHorH0

https://www.youtube.com/watch?v=oKg1hTOQXoY

https://www.youtube.com/watch?v=RlkCdM_f3p4

https://www.youtube.com/watch?v=TgmA48fILq8

https://www.youtube.com/watch?v=yL_-1d9OSdk

https://www.youtube.com/watch?v=ZTC_RxWN_xo

http://vimeo.com/10260548

http://vimeo.com/36579366

http://vimeo.com/5047563

http://vimeo.com/7088524

http://vimeo.com/9270320

http://vpri.org/html/writings.php

http://www.confreaks.com/videos/1071-cascadiaruby2012-therap...

http://www.confreaks.com/videos/759-rubymidwest2011-keynote-...

http://www.dailymotion.com/video/xf88b5_jean-pierre-serre-wr...

http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hic...

http://www.infoq.com/presentations/click-crash-course-modern...

http://www.infoq.com/presentations/miniKanren

http://www.infoq.com/presentations/Simple-Made-Easy

http://www.infoq.com/presentations/Thinking-Parallel-Program...

http://www.infoq.com/presentations/Value-Identity-State-Rich...

http://www.infoq.com/presentations/We-Really-Dont-Know-How-T...

http://www.mvcconf.com/videos

http://www.slideshare.net/fogus/the-macronomicon-10171952

http://www.slideshare.net/sriprasanna/introduction-to-cluste...

http://www.tele-task.de/archive/lecture/overview/5819/

http://www.tele-task.de/archive/video/flash/14029/

http://www.w3.org/DesignIssues/Principles.html

http://www.youtube.com/watch?v=4LG-RtcSYUQ

http://www.youtube.com/watch?v=4XpnKHJAok8

http://www.youtube.com/watch?v=5WXYw4J4QOU

http://www.youtube.com/watch?v=a1zDuOPkMSw

http://www.youtube.com/watch?v=aAb7hSCtvGw

http://www.youtube.com/watch?v=agw-wlHGi0E

http://www.youtube.com/watch?v=_ahvzDzKdB0

http://www.youtube.com/watch?v=at7viw2KXak

http://www.youtube.com/watch?v=bx3KuE7UjGA

http://www.youtube.com/watch?v=cidchWg74Y4

http://www.youtube.com/watch?v=EjaGktVQdNg

http://www.youtube.com/watch?v=et8xNAc2ic8

http://www.youtube.com/watch?v=hQVTIJBZook

http://www.youtube.com/watch?v=HxaD_trXwRE

http://www.youtube.com/watch?v=j3mhkYbznBk

http://www.youtube.com/watch?v=KTJs-0EInW8

http://www.youtube.com/watch?v=kXEgk1Hdze0

http://www.youtube.com/watch?v=M7kEpw1tn50

http://www.youtube.com/watch?v=mOZqRJzE8xg

http://www.youtube.com/watch?v=neI_Pj558CY

http://www.youtube.com/watch?v=nG66hIhUdEU

http://www.youtube.com/watch?v=NGFhc8R_uO4

http://www.youtube.com/watch?v=Nii1n8PYLrc

http://www.youtube.com/watch?v=NP9AIUT9nos

http://www.youtube.com/watch?v=OB-bdWKwXsU&playnext=...

http://www.youtube.com/watch?v=oCZMoY3q2uM

http://www.youtube.com/watch?v=oKg1hTOQXoY

http://www.youtube.com/watch?v=Own-89vxYF8

http://www.youtube.com/watch?v=PUv66718DII

http://www.youtube.com/watch?v=qlzM3zcd-lk

http://www.youtube.com/watch?v=tx082gDwGcM

http://www.youtube.com/watch?v=v7nfN4bOOQI

http://www.youtube.com/watch?v=Vt8jyPqsmxE

http://www.youtube.com/watch?v=vUf75_MlOnw

http://www.youtube.com/watch?v=yJDv-zdhzMY

http://www.youtube.com/watch?v=yjPBkvYh-ss

http://www.youtube.com/watch?v=YX3iRjKj7C0

http://www.youtube.com/watch?v=ZAf9HK16F-A

http://www.youtube.com/watch?v=ZDR433b0HJY

http://youtu.be/lQAV3bPOYHo

http://yuiblog.com/crockford/

ricardobeat
And here they are with titles + thumbnails:

http://bl.ocks.org/ricardobeat/raw/5343140/

waqas-
how awesome are you? thanks
Expez
Thank you so much for this!
X4
This is cool :) Btw, the first link was somehow (re)moved. The blip.tv link is now: http://www.youtube.com/watch?v=0JXhJyTo5V8
Not too sure about "technical", but Greg Wilson's "What We Actually Know About Software Development, and Why We Believe It's True" has greatly influenced how I approach everything I have to look at in life.

http://vimeo.com/9270320

cpeterso
Greg Wilson expands on that topic in a book called "Making Software: What Really Works, and Why We Believe It".
The 10x figure is pseudoknowledge - something we "all know" that isn't actually true (or that, at least, we have no good reason to believe to be true), cf. http://vimeo.com/9270320. In brief: the study it is ultimately derived from had a tiny sample size (on the order of 30), and for several reasons the experimental design would have invalidated its results outside the study's very narrow original focus (the relative productivity of batch processing vs. interactive processing on the computing systems of 1968) even if the sample size had been larger.
JoeAltmaier
The 10X figure is true. My study is 30 years of working with a wide variety of people in startup environments. Adding some people makes a project later. Others can manage a whole project alone (if they are LEFT alone).
Jan 26, 2013 · 2 points, 0 comments · submitted by mef51
Jan 25, 2013 · 1 point, 0 comments · submitted by AYBABTME
Oct 13, 2012 · 4 points, 0 comments · submitted by sidcool
In the past six years or so since this was written, Greg Wilson has advocated gathering data on this and other programming-related topics (watch http://vimeo.com/9270320 if you have an hour, or get a flavor for the arguments here: http://blog.stackoverflow.com/2011/06/se-podcast-09/), and has compiled a book of essays (with research) on the topic: http://www.amazon.com/Making-Software-Really-Believe-ebook/d...

I haven't yet read the book, and so don't know its conclusions, if any, on the various methodologies, but if you're curious about research in the area it would be a great start.

The other thing worth pointing out is that, while the Google Yegge describes in the essay might be different in 2012 than it was in 2006, it's still a different level of organization than even the average software development company, let alone a non-software company that happens to employ internal or contract developers. And Steve is writing about developing within Google.

marcusf
Thanks, but I've seen Greg's talk and the book has been on my short list for a while. I wholeheartedly agree with trying to bring some science to the table. However, in the absence of any research on the subject, and given the tools at our disposal right now, we can only look at the metrics we have and draw conclusions from them.

Also, I disagree that Yegge discusses Google in particular. Sure, he blabbers on about how great Google is, but he dismisses agile methods outright. FWIW, the Googlers I know don't paint the same rosy picture of the place. What Google is seems to vary quite a bit with where you are and what project you're on.

Speaking as an actually credentialed computer scientist, I find this list ridiculous.

First of all, only one of these books is actually about theoretical computer science, and even then.

Secondly (feel free to disagree on this one), K&R is recommended more as a prestige book. So much of C is in the toolkit and libraries that I think it's a little silly to be recommending a 30-year-old intro that is actually kind of hard to read.

Thirdly, this question comes up all the time. Here's an actually serious version of this question: http://cstheory.stackexchange.com/questions/3253/what-books-...

If you only care about actually practical issues to your life as a programmer, give this list a shot http://news.ycombinator.com/item?id=3320813

Addendum: Although I'm credentialed as a CompSci, I really work as an engineer. The difference? Scientists read papers (Avi Bryant) http://vimeo.com/4763707 and think about the nature of our work (Greg Wilson) http://vimeo.com/9270320

veyron
What is a "Credentialed Computer Scientist"? Do you mean a college degree in CS/EE? I'd imagine a whole slew of people here on HN are Credentialed in that sense.
blario
I take it to mean Ph.D.; not from experience, but that's the only meaning that makes sense in the context.
tptacek
My read: he just means it in the sense that I mean "professional" when I say I'm a professional application security person, except that it's weird to say you're a "professional computer scientist" when your day job doesn't involve writing papers.
phillmv
That's pretty much it.

I got a nice piece of paper, but it's not like my day-to-day job involves much of what I learned during school.

ericmoritz
I apologize for any offense that I may have given by using the phrase "self-taught computer scientist".

This article is my personal list of books that I often recommend to junior developers that are smart but didn't complete a degree and often find their knowledge of basic computer science lacking.

tptacek
As a self-taught C software developer, I came here to say the same thing Phillip said.

To it, I would also add that C isn't the "Latin" of computer science (if any language is going to prove to be that, it's Lisp, but it's too early to say, other than that it isn't going to be C).

I'd also suggest that the best book to follow up K&R is Hanson's _C Interfaces and Implementations_.

If I wanted to teach someone to be a computer scientist, I'd look for a book that would help them read papers. I'd also point them towards compiler theory, not so much because it's fundamental computer science (it's a vital applied discipline), but because it exposes you to more real computer science than most other domains.

ericmoritz
Do you have a book recommendation to help someone read papers? I spent a couple of evenings going over MIT course notes on discrete mathematics to attempt to read a paper on Conflict-free Replicated Data Types: http://hal.archives-ouvertes.fr/docs/00/55/55/88/PDF/techrep...

What I would have given at the time for a book that would help me translate the following:

    merge (X, Y ) : payload Z
       let ∀i ∈ [0, n − 1] : Z.P[i] = max(X.P[i], Y.P[i])
Into

    def merge(X, Y):
        Z = ... new object ...
        for i in range(len(X.P)):  # all i in [0, n-1], matching the spec above
            Z.P[i] = max(X.P[i], Y.P[i])
        return Z
I eventually figured it out but it was a bit rough trying to figure out what all the symbols meant.
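
For reference, here is a minimal, runnable sketch of the structure that merge belongs to - it looks like the grow-only counter (G-Counter) from that paper, where the payload P is one integer slot per replica. The class and method names here are illustrative, not taken from the paper:

    # Minimal G-Counter sketch; names (GCounter, increment, value, merge)
    # are illustrative, not from the paper's pseudocode.
    class GCounter:
        def __init__(self, n, replica_id):
            self.P = [0] * n              # payload: one counter per replica
            self.replica_id = replica_id

        def increment(self):
            self.P[self.replica_id] += 1  # a replica only bumps its own slot

        def value(self):
            return sum(self.P)            # total count across all replicas

        def merge(self, other):
            # elementwise max over all slots, i.e. for all i in [0, n - 1]
            merged = GCounter(len(self.P), self.replica_id)
            merged.P = [max(a, b) for a, b in zip(self.P, other.P)]
            return merged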
stiff
There is a really nice book trying to teach exactly this (among other things) called "The Haskell Road To Logic, Maths and Programming":

http://www.amazon.com/Haskell-Logic-Maths-Programming-Comput...

gtani
Available for free, according to

http://hackershelf.com/topic/math/

phillmv
Well… it depends on the paper! Some stuff might need more than a single course of background to fully understand.

For more rudimentary papers, any undergrad course on discrete mathematics should get you started. I personally was forced to read http://www.amazon.com/Discrete-Mathematics-Applications-Susa... - and it's pretty decent.

ericmoritz
Thanks for the recommendation. That was one of the textbooks I looked at, but the sticker shock steered me towards the free MIT course notes.
bwarp
Buy it second-hand from Amazon - they are much cheaper! ($17)

+1 for the book recommendation - I also have a copy of that.

arghnoname
I don't know if it is true of that book in particular, but if I'm buying a textbook as a resource I usually look for the international editions. They sell near-identical versions of the books (with the problem sets shuffled around) elsewhere for much, much less.
jacques_chester
Lisp is the "Greek". Older, more free-wheeling. Intellectually more influential, but swamped in "the real world" by the Latin-speakers.
gjm11
And full of lambdas.
user24
> books that I often recommend to junior developers that are smart but didn't complete a degree and often find their knowledge of basic computer science lacking.

I think you may be confusing "Computer Science" with "Software Engineering".

phillmv
To be fair, I didn't take offense to "self-taught computer scientist". I took offense to "every self-taught computer scientist".

I'll be the first to admit that university is bullshit and mostly acts as a social signifier. Like I mentioned above, I'm credentialed, and I'll put that on my business cards, but between you and me and the internet at large, I mostly work as a software engineer.

There are even many problems with talking about "computer science" because for the most part it's treated like a branch of mathematics and its academic circle really hates dabbling in messy empirical data.

Yet, there's a degree of rigour in it. There's an actual underpinning behind a lot of this stuff.

So! Can you be a self-taught computer scientist? Certainly! K&R and The Little Schemer just have almost nothing to do with it - even if they might make you a better programmer :).

scott_s
There are even many problems with talking about "computer science" because for the most part it's treated like a branch of mathematics and its academic circle really hates dabbling in messy empirical data.

This gets repeated again and again on HN, but it's not true. I am a CS researcher, and I always have messy empirical data. Systems research almost always has tons of experimental results. A large chunk of our papers is dedicated to the experimental design and results.

I have made this point many times before:

http://news.ycombinator.com/item?id=968013

http://news.ycombinator.com/item?id=690798

http://news.ycombinator.com/item?id=1131606

One of these days I'm going to actually set up a blog, write this point up as an essay, and be able to point to it.

My own personal definition of computer science: everything concerned with computation, both in the abstract and in the implementation.

phillmv
Please do! And link to the rest of your research :)!

I'm just talking from my own (very) limited personal experience, though, and from the Greg Wilson video I linked to (though it's been two years since I last watched it).

scott_s
You can find everything that I have currently published in a peer-reviewed venue on my (old) webpage in my profile. I've done lots since graduating (lots), but we've hit a bunch of paper rejections (aarrrrggg), so they're not published yet. I was, however, involved in a paper that we've submitted to a journal; it has not gone through peer review yet, but we made a tech report so that others could cite it: http://researcher.ibm.com/files/us-hirzel/tr11-rc25215-opt-c...
gruseom
That looks interesting! I saved it for later.
wisty
From what I can tell (based on HN and other forums), systems research is empirical testing of systems, while theoretical computer science is math-heavy. The typical systems paper is "The design, implementation, and performance of a [application] system, using [technology]", while a theory paper is "Proof of the existence of a solution for [problem] in time [O(something)]".

These two branches do talk to each other, but not much.

scott_s
In general you've got it, but CS theorists do more than just algorithmic analysis. They may also be after some other desired, provably guaranteed effect with an algorithm, and they'll say in passing "By the way, this is a linear algorithm, so its performance isn't an issue." This is true in scheduling, cryptography and probably many other areas. (It's early, sorry.)

You got systems research pretty much head-on, but I would stress the design and implementation part more. That is, CS systems research tends towards engineering, where we claim we made a better whatzit, and then we need to provide a suite of experiments to support our claim.

Greg Wilson cited a "Barnes and Phillips 2002" study saying that true power is always concentrated in shadow committees (@1:00:40 in http://vimeo.com/9270320). Does anybody have a link to the paper?
Really, these are anecdotes, not evidence.

And I think he chose his words carefully: he said median, not average. Of course - to paraphrase Greg Wilson[1] - if one were to go further and compare the "best" driver to the "worst" driver, the difference would be infinite, because the worst driver is dead. The comparison is not useful.
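
As a toy illustration of why "median" is the careful word choice (the numbers below are made up):

    # Made-up "productivity" scores with one extreme outlier: the mean gets
    # dragged around by the tail, while the median barely moves.
    from statistics import mean, median

    scores = [8, 9, 10, 10, 11, 11, 12, 12, 13, 100]  # hypothetical
    print(mean(scores))    # 19.6 - pulled up by the single outlier
    print(median(scores))  # 11.0 - robust to it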

I'd love to see some studies on this sort of stuff though.

[1] "Greg Wilson - What We Actually Know About Software Development, and Why We Believe It's True" http://vimeo.com/9270320

greiskul
Great video. Do you know if the book he mentioned - originally to be called Beautiful Evidence, before he had to change the name - was published? And under which name?
bartschuller
Making Software http://shop.oreilly.com/product/9780596808303.do
Dec 10, 2011 · 1 point, 0 comments · submitted by kmfrk
The Mythical Man-Month

As always. We are making great software, but a lot of projects still seem to go over the estimated time... so I guess learning about that is what will bring us ahead.

Also this talk by Greg Wilson: http://vimeo.com/9270320

v21
I remember hearing about it while I was at uni, and checking it out of my uni library. I read through it, and it was a dusty old tome full of references to long-dead systems.

And it was good: while not saying anything I was hugely shocked by, it made clear and sensible points.

And now a few years down the line, I'm out of uni and working, and I find myself toying with buying it for my boss...

bjg
Do it. I bought Peopleware for a previous boss as a kind of thank-you present when I left that project. He really enjoyed it, I think.
Dec 03, 2011 · 13 points, 1 comment · submitted by ExpiredLink
pdhborges
Here at Hacker News there seems to be an obsession with how to hire 10x programmers and how to build great teams.

I'm submitting this link because I'm really tired of reading blog posts filled with anecdotes, making claims because some "scrum master" or whatever said so.

Jul 22, 2011 · 3 points, 1 comment · submitted by motters
antr
Great video. Thanks for the link.
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.