HN Books @HNBooksMonth

The best books of Hacker News.

Hacker News Comments on
Bioinformatics and Molecular Evolution

Paul G. Higgs, Teresa K. Attwood · 2 HN comments
HN Books has aggregated all Hacker News stories and comments that mention "Bioinformatics and Molecular Evolution" by Paul G. Higgs, Teresa K. Attwood.
Amazon Summary
In the current era of complete genome sequencing, Bioinformatics and Molecular Evolution provides an up-to-date and comprehensive introduction to bioinformatics in the context of evolutionary biology. This accessible text: provides a thorough examination of sequence analysis, biological databases, pattern recognition, and applications to genomics, microarrays, and proteomics; emphasizes the theoretical and statistical methods used in bioinformatics programs in a way that is accessible to biological science students; places bioinformatics in the context of evolutionary biology, including population genetics, molecular evolution, molecular phylogenetics, and their applications; features end-of-chapter problems and self-tests to help students synthesize the materials and apply their understanding; and is accompanied by a dedicated website - www.blackwellpublishing.com/higgs - containing downloadable sequences, links to web resources, answers to self-test questions, and all artwork in downloadable format (artwork also available to instructors on CD-ROM). This important textbook will equip readers with a thorough understanding of the quantitative methods used in the analysis of molecular evolution, and will be essential reading for advanced undergraduates, graduates, and researchers in molecular biology, genetics, genomics, computational biology, and bioinformatics courses.
Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this book.
Biology is completely different from Computer Science and metaphors between the fields build no understanding and can only be misleading, every time I hear someone comparing DNA to a computer program I fall into pieces. I recommend "Molecular Biology for Computer Scientists" instead for those willing to learn some actual biology:

http://www.biostat.wisc.edu/~craven/hunter.pdf

I think it's the first chapter of this book:

http://mitpress.mit.edu/books/processes-life

I once considered going into bioinformatics, and did an intense three weeks sprint trying to learn some molecular biology, ending in a seminar presentation to other people explaining the basics. I used this book back then which I also recommend strongly to those interested:

http://www.amazon.com/Bioinformatics-Molecular-Evolution-Pau...

It covers all the basics of molecular biology very understandably and at the same time the scientific/computational content is interesting even for a computer scientist. Still, learning this stuff takes hard work: you have to rehash some relevant chemistry first or you get nowhere, then biologists use a lot of both chemical and biological lingo which you have to understand, and only then does the actual biological content become clear. Once you do understand it, however, it's beautiful, beautiful stuff, one of the most beautiful things one can learn in general I think, of which you unfortunately won't get a sense from reading this article, or in general from trying to understand it by sloppy metaphors. Do yourself a favour and try to understand this for real.

ramsaysnuuhh
> Biology is completely different from Computer Science and metaphors between the fields build no understanding and can only be misleading, every time I hear someone comparing DNA to a computer program I fall into pieces.

That's the kind of attitude that pushes competent programmers away from bioinformatics. End result? Almost all bioinformatics software is a steaming pile of crap.

mtdewcmu
The problem is probably that you have to train in two separate fields to be competent in computer science and biology, and that's not going to be common.
ramsaysnuuhh
But the quality and sophistication of software in, say, astronomy makes bioinformatics software look like it was written by 14 year old spammers in the former soviet bloc. The real problem is that biology went from being essentially a liberal art, more like history or sociology than a science, to an extremely quantitative field in only a few academic generations. You can get a PhD in Molecular & Cell Biology from Berkeley without ever having taken a statistics, linear algebra, multivar calc, or CS course. Biologists need programmers more than the other way around, so they should start by taking a few courses with actual numbers and stuff and learn the bare basics first.
nl
> bioinformatics software look like it was written by 14 year old spammers in the former soviet bloc

So it's pretty damn good code, then? (Having worked with plenty of coders from the former soviet bloc: they have a pretty low tolerance of crap code)

dekhn
So, you would say that information theory has nothing to do with biology? Because that's what your first sentence implies.
zaras
This is an interesting question. I have always thought that computational models are the most interesting description of genetic mechanisms (DNA primary sequence, epigenetics, transcriptional control, etc.) and signaling networks (RNA, proteins, metabolic networks, etc.). They abstract away complexity and give us tools to make powerful predictions. What you are suggesting here is that biology is more than a universal Turing machine. What makes you believe so?
stiff
The statement "biology is a Turing machine" does not make any sense to me, so I won't discuss whether it is true or not. Just pursue the analogy of DNA as a program for a few steps more and you will see how it goes. If DNA is a program what would be the computer? Ribosome? The universe? What would be the outcome? I guess the protein. But then the result of the computation is dependent on hundreds of factors outside of both the DNA and the ribosome. The "result" immediately gets distorted by whatever physics applies to it, the "instructions" might have different effects on the "computer" depending on position in the strand and on hundreds of external factors etc. What insights does this metaphor give you besides endless confusion?

The "computational" models that have been successfully used to model DNA have nothing to do with treating it as a description of a computation; they are only computational in the sense that they use mathematics and computing for making predictions about biological phenomena. Those are mostly probabilistic models that treat the DNA as a string of discrete symbols, and encode the physical and chemical complexities associated with this approach in various ways. One good example is Hidden Markov Models, where the probabilities might encode physical or chemical knowledge:

http://en.wikipedia.org/wiki/Hidden_Markov_model
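
As a rough illustration of the kind of model meant here, the sketch below decodes a DNA string with a two-state hidden Markov model using the Viterbi algorithm. The state names (`AT`, `GC`) and every probability are made-up assumptions for the example, not values from the book or from any real genome.

```python
import math

# Hypothetical two-state HMM: "AT" (AT-rich background) vs "GC" (GC-rich region).
# All probabilities are invented for illustration only.
STATES = ("AT", "GC")
START = {"AT": 0.5, "GC": 0.5}
TRANS = {
    "AT": {"AT": 0.9, "GC": 0.1},
    "GC": {"AT": 0.1, "GC": 0.9},
}
EMIT = {
    "AT": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
    "GC": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}


def viterbi(seq):
    """Return the most probable hidden-state path for seq (log-space Viterbi)."""
    if not seq:
        return []
    # score[s] = best log-probability of any state path ending in state s
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    backpointers = []
    for symbol in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for arriving in s at this position
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][symbol])
            )
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final state
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Under these assumed parameters, `viterbi("ATATATATGCGCGCGC")` labels the first eight bases `AT` and the last eight `GC`. In a real model the emission and transition probabilities would be estimated from data, which is where the physical or chemical knowledge comes in.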

yshklarov
A correction: The chapter at

http://www.biostat.wisc.edu/~craven/hunter.pdf

is actually the first chapter of "Artificial Intelligence and Molecular Biology":

http://mitpress.mit.edu/books/artificial-intelligence-and-mo...

mtdewcmu
You have to understand things on their own terms. A lot of times you can find a biological process that resembles something in computer science, but you shouldn't take the analogy too far. Nature doesn't know what a computer is, so any resemblance between a natural process and a computer is incidental.

You can't conceive nature as an engineer and talk about how sloppy and random its designs are. You have to bear in mind that nature has made things that we can scarcely comprehend, let alone engineer, so if you're critiquing its designs as if you can read its mind and see what it was trying to do, you're probably missing the point. It pulls off incredible things, so nature is the master and we're the student.

delian66
>>You can't conceive nature as an engineer and talk about how sloppy and random its designs are.

Why not? We can talk about how sloppy something is from a given perspective; we do it all the time, and then try to improve things (there will be no progress otherwise), and if we succeed/fail, learn more ...

>>as if you can read its mind

Nature/evolution is not a sentient being, thus has no mind. Do you have evidence to the contrary?

mtdewcmu
When you look at something designed by a person, you can intuit what they were doing and judge it as good or sloppy. Nature requires a completely different type of thinking. It's ultimately unhelpful to anthropomorphize nature.
Datsundere
Not if our universe was a simulation of a computer program
jokoon
I don't think you should take this really seriously
ye
DNA is not just a sequence that encodes proteins. We just found out there's a secondary higher level code in it:

https://news.ycombinator.com/item?id=6896779

And then there's RNA, which can fold and form structures, sort of like proteins.

Anybody who has ever seriously dealt with things produced by evolution knows that evolution "cheats" and optimizes when constrained by resources, producing structure upon structure upon structure, crazy indirection and non-obvious side-effects, as long as it helps the goal (survival and reproduction usually).

That's why I don't think human brains will be understood any time soon. We might figure out all the layers, but to figure out all the hidden structures would take a superhuman AI.

EDIT:

Example of circuit design by evolutionary algorithms:

http://hforsten.com/evolutionary-algorithms-and-analog-elect...
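
For readers who have not seen one, here is a minimal sketch of an evolutionary algorithm, assuming a toy problem: evolving a random string toward a fixed target. It is not the method from the linked article; the function name `evolve`, the `ACGT` alphabet, and the target string are all hypothetical choices for illustration.

```python
import random


def evolve(target, alphabet="ACGT", seed=0, max_generations=100_000):
    """(1+1) evolutionary algorithm: mutate one candidate, keep the child if it is no worse.

    Returns (best_string, generations_used).
    """
    rng = random.Random(seed)  # seeded so runs are repeatable

    def fitness(candidate):
        # Number of positions that already match the target
        return sum(a == b for a, b in zip(candidate, target))

    parent = [rng.choice(alphabet) for _ in target]
    for generation in range(max_generations):
        if fitness(parent) == len(target):
            return "".join(parent), generation
        # Mutation: replace one randomly chosen position with a random symbol
        child = list(parent)
        pos = rng.randrange(len(target))
        child[pos] = rng.choice(alphabet)
        # Selection: the child survives only if it is at least as fit
        if fitness(child) >= fitness(parent):
            parent = child
    return "".join(parent), max_generations
```

Calling `evolve("GATTACAGATTA")` should converge on the target after a modest number of generations. Even this toy version shows the key property: selection only ever sees the fitness score, so nothing pushes the intermediate candidates toward being tidy or understandable.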

charlieflowers
Yes. Evolution is the ultimate "three star programmer": http://c2.com/cgi/wiki?ThreeStarProgrammer
badger288
circuits reference reminded me of this - http://www.damninteresting.com/on-the-origin-of-circuits/
timr
"DNA is not just a sequence that encodes proteins. We just found out there's a secondary higher level code in it"

So, I don't want to rain on anyone's parade, but we didn't "just" find this out. We've known for quite a long time about secondary (and even tertiary!) "codes" in DNA. That HN article was the result of a press release about something that was interesting, but certainly not an earth-shattering new theory. The reason it was in Science was because they did an extremely large-scale test of a whole bunch of different codons on thousands of different gene promoters, and directly quantified the impact of rare codons. That was an impressive way to settle an open debate.

Anyway, it's good to know that there are higher-order interactions encoded in DNA than just DNA -> RNA -> Protein, but you should realize that this is an old/deep area of research. This is pretty much what bioinformatics is about, actually: deducing the higher-order structures in DNA, RNA and proteins.

dnautics
>Biology is completely different from Computer Science and metaphors between the fields build no understanding and can only be misleading, every time I hear someone comparing DNA to a computer program I fall into pieces.

What? As someone who can write code in several different languages and who has also made de novo designer modifications to proteins at the low level (biological equivalent to assembler, if you will: http://ityonemo.github.io/acs2013/#/, if you're curious) I can attest to the metaphor between the two. It's not totally simple and not one-to-one, but in many situations it has helped me. Hopefully it keeps helping me as I move into my next stint (http://indysci.org/projectmarilyn)

If your idea of "biology" is "going into bioinformatics", well then no wonder your perception of the metaphor is skewed. Bioinformatics is more like "big data on biology"; what is more analogous to computer programming is Synthetic Biology.

Now, I would happen to disagree with many of the metaphors in the OP (and have, as a practitioner of both, better ones in mind that I use in my work), but that there are no connections is, I think, misguided.

stiff
I would not consider every digital code immediately a program, and even treating DNA just as a digital code can be confusing, since there is so much chemistry and physics happening until DNA is expressed as a protein, that you can't just expect that whatever you encode will really come out exactly as encoded. The nice simple mathematical models and metaphors we use for man-made artifacts fall on their face when applied in biology.

One can understand this better by looking for example at the difficulties encountered by people who do actually base computations on the DNA:

http://en.wikipedia.org/wiki/DNA_code_construction

dekhn
stiff, you've got multiple people who are experts in either field (including people like me, who are experts in both fields) telling you your posts are full of crap.

It's not clear to me what your link about the DNA code construction really means.

Consider instead:

1) people can encode terabytes of digital information in DNA and retrieve it using molecular techniques. To do so accurately enough requires coding.

2) Crick, who helped figure out many of the basic information coding mechanisms, widely acknowledged being influenced by the theory of information as proposed by Shannon.

3) both computer scientists and biologists frequently use the theory of control mechanisms (feedback, etc, also known as 'cybernetics' when I studied under David Huffman) to understand how their systems work, and engineer new ones.

4) you can make a computer out of DNA, and you can put computers into biological systems; they interact according to the laws of information theory, entropy, thermodynamics, etc.
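
To make point 1 concrete, here is a minimal sketch of storing bytes as a DNA string at two bits per base. The mapping (A=00, C=01, G=10, T=11) and the function names are assumptions for illustration. This naive version has no error correction at all, which is exactly the part that retrieving data accurately from real molecules would need.

```python
# Hypothetical mapping of 2-bit values to bases: A=00, C=01, G=10, T=11.
# This is a naive sketch with no redundancy or error correction.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Encode bytes as a DNA string, four bases per byte, most significant bits first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(dna: str) -> bytes:
    """Invert encode(). Raises ValueError on bad length or non-ACGT symbols."""
    if len(dna) % 4 != 0:
        raise ValueError("DNA length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(dna), 4):
        byte = 0
        for base in dna[i:i + 4]:
            # str.index raises ValueError if base is not one of A, C, G, T
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

For example, `encode(b"A")` gives `"CAAC"` (0x41 is 01 00 00 01 in binary), and `decode(encode(b"hello"))` returns `b"hello"`.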

Computing theory, as practiced today, and what biology has managed to spontaneously develop are intimately related at the most basic levels of physics. Understanding that is KEY to being a productive biologist or computer scientist, and especially so for bioinformaticians.

stiff
Do you understand the difference between a digital code and a computer program? I know information theory and coding theory can be used for studying DNA, when you make the models account for the physical and chemical complications that arise in biology, but the models used are way more complex than the ones used in traditional CS, so it doesn't make sloppy metaphors like the ones from the OP worthwhile for people who only know CS and no biology. Specific techniques from CS of course might be useful in understanding biology, but not metaphors of the kind OP uses, like calling introns comments. Also, while people do base computations on DNA, it doesn't work vice versa: none of the computational approaches to studying biology views DNA as a description of any sort of computation, as far as I know. They only treat it as an (often quite complicated) digital code or as a discrete stochastic process, as in the case of using hidden Markov models for modelling DNA.
dekhn
Yeah, I worked on hidden Markov models with David Haussler 20 years ago. HMMs are really a massive simplification of the underlying process of evolution, of course (I think that's what you're trying to say).

I'm not defending the original poster's use of analogy between introns and comments. Those are naive, simple analogies.

What I am responding to is your categorical statement that there are no legitimate analogies between CS and biology, and that's patently false.

I'm certainly not claiming - without evidence - that biological systems perform computation. I strongly suspect that biological systems carry out computation- quorum sensing being a canonical example- and in some sense, any sufficiently complex biological system can be considered an analog computer of some sort, with goal-seeking computational behavior.

One of my interests for some time has been building a digital circuit in DNA; one that can compute, using state, a Taylor series approximation of an interesting constant, such as pi, using feedback circuits, error correction, and other "digital" approaches. None of this is impossible, it's just technology. People have already started doing this. That the underlying systems are capable of being turned into computing systems is just an outcome of the fact that biological systems are sufficiently complex that they can be used to instantiate simpler digital computing systems.
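
For reference, this is what such a computation looks like in ordinary software, as a minimal sketch. It assumes the Taylor series of arctan evaluated at 1 (the Leibniz series, pi/4 = 1 - 1/3 + 1/5 - ...), which is only one possible choice of series. The variables `total` and `sign` are the state carried between steps; the function name is hypothetical, and nothing here models a DNA circuit.

```python
def pi_leibniz(terms: int) -> float:
    """Approximate pi from the first `terms` terms of the arctan(1) Taylor series."""
    total = 0.0  # running partial sum (state)
    sign = 1.0   # alternating sign of the next term (state)
    for k in range(terms):
        total += sign / (2 * k + 1)
        sign = -sign
    return 4.0 * total
```

The series converges slowly: the error after n terms is roughly 1/n, so `pi_leibniz(100_000)` is good to about four decimal places.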

stiff
I think it boils down to what you consider an "analogy": is a Markov model really an analogy between computer science and biology? I would consider that simply a mathematical concept that finds applications in both Computer Science and Biology, and it isn't even that related to computations or computers, it's just a useful mathematical tool. I don't deny concepts from CS or Mathematics can be applied to biology.

On the other hand, I think there are lots of CS people who apply blindly oversimplistic models to complex biological situations without understanding the biology, and that's what I was trying to protest against.

dekhn
I agree with your last sentence. But you made a categorical/general statement upthread, and I called you on that.
I am in an undergraduate-level bioinformatics course right now, and can echo what was said about statistics. The biology isn't too tough to pick up, but I am weak in stats and that is my greatest difficulty with this class. That said, I am enjoying it immensely. The book we use is a good overview of the subject - http://www.amazon.com/Bioinformatics-Molecular-Evolution-Pau... -Bioinformatics and Molecular Evolution. Most of the tools that bioinformaticians use are buggy, horribly designed and very unfriendly to the average user. One way to get involved is just improving the usability and stability of the tools. There are plenty of command line tools that you can add a GUI to if that is your thing:

http://bio-bwa.sourceforge.net/

http://abacus.gene.ucl.ac.uk/software/paml.html

http://samtools.sourceforge.net/

You don't need to know too much to contribute. If you are in Utah and interested I could get you some contacts with professors.

HN Books is an independent project and is not operated by Y Combinator or Amazon.com.