HN Books @HNBooksMonth

The best books of Hacker News.

Hacker News Comments on
The Design of the UNIX Operating System

Maurice Bach · 9 HN comments
HN Books has aggregated all Hacker News stories and comments that mention "The Design of the UNIX Operating System" by Maurice Bach.
HN Books may receive an affiliate commission when you make purchases on sites after clicking through links on this page.
Amazon Summary
Classic description of the internal algorithms and structures that form the basis of the UNIX operating system, and of their relationship to the programmer interface. The leading-selling UNIX internals book on the market.
HN Books Rankings

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this book.
Interesting idea, might be important to brush up on the implementation rationale of Unix.

To that end there is The Design of The Unix Operating System by Bach [1]. Or A Fast File System for Unix [2]. Or The Design and Implementation of a Log-Structured File System [3]. It seems to me that files solve the problem of how to deliver an economical memory management system, and they do it by paging cheap disk space so that it becomes an effective proxy for fast-but-expensive memory. Files, with their file pointers, can certainly be used as data structures, but this 'understanding' of what each section means is clearly variable, so it is best left to specific applications rather than built into the system as a default. Different applications will want different structures for optimal performance...

[1] https://www.amazon.com/Design-UNIX-Operating-System/dp/01320...

[2] https://people.eecs.berkeley.edu/~brewer/cs262/FFS.pdf

[3] https://people.eecs.berkeley.edu/~brewer/cs262/LFS.pdf

A classic: The Design of the UNIX Operating System by Maurice Bach. https://www.amazon.com/Design-UNIX-Operating-System/dp/01320...

I learned more about operating systems from this book than from any academic material used at university, and it gave me a good foundational handle on *NIXes.

There's also the great "The Design of the UNIX Operating System":

https://www.amazon.com/Design-UNIX-Operating-System/dp/01320...

It's probably less relevant today in terms of directly understanding the implementation, but it's an interesting and enlightening read. Things were much simpler and more fundamental back in the 1980s, which makes them easier to understand. Then layer on top.

nicklaf
Slightly more recent: The Design and Implementation of the 4.4BSD Operating System, which includes some of the Berkeley additions to the kernel, such as TCP/IP and sockets.

And then there's xv6 [1], a small Unix running on vx32 from MIT for teaching purposes, full of comments, and available as a booklet that is directly inspired by Lions' commentary on the 6th edition of Unix.

I actually agree, though: to really appreciate the classics you should start with (Maurice) Bach. :-P

[1] https://pdos.csail.mit.edu/6.828/2018/xv6.html

nicklaf
Correction: xv6 runs on QEMU, not vx32 (although Russ Cox co-authored both xv6 and vx32).
I'm currently reading The Design of the UNIX Operating System [0], so this is super interesting for me to watch. I love how he creates a spellchecking program live, while being filmed.

[0] http://www.amazon.com/The-Design-UNIX-Operating-System/dp/01...

Good idea. I would buy two kinds of books: 1) one that aggregates all the "How do I write an OS from scratch" info under the links below into one coherent whole, and 2) a tutorial on how to build a domain-specific OS, say an OS that only knows how to run a web server, an OS for an FPGA or a Raspberry Pi, or a mobile OS that knows messaging and nothing else.

1. http://www.quora.com/How-do-I-write-an-operating-system

2. http://www.quora.com/If-you-were-to-write-a-new-operating-sy...

3. http://www.cs.bham.ac.uk/~exr/lectures/opsys/10_11/lectures/...

4. http://www.amazon.com/The-Design-UNIX-Operating-System/dp/01...

5. http://www.amazon.com/The-Elements-Computing-Systems-Princip...

6. http://www.returninfinity.com/baremetal.html

Nov 27, 2012 · robomartin on I’m writing my own OS
OK. Suppose you don't have any real experience in low-level embedded coding (relevant to device drivers), RTOS or OS design in general, file systems, data structures, algorithms, interfaces, etc.; you have "hobby level" experience with assembler, C and C++; and your intent is to write a desktop OS, from the ground up, without making use of existing technologies, drivers, file systems, memory management, POSIX, etc. Then here's a list of books that could be considered required reading before you can really start to write specifications and code. Pick twenty of these and that might be a good start.

In no particular order:

1- http://www.amazon.com/C-Programming-Language-2nd-Edition/dp/...

2- http://www.amazon.com/The-Answer-Book-Solutions-Programming/...

3- http://www.amazon.com/The-Standard-Library-P-J-Plauger/dp/01...

4- http://www.amazon.com/C-Traps-Pitfalls-Andrew-Koenig/dp/0201...

5- http://www.amazon.com/Expert-Programming-Peter-van-Linden/dp...

6- http://www.amazon.com/Data-Structures-In-Noel-Kalicharan/dp/...

7- http://www.amazon.com/Data-Structures-Using-Aaron-Tenenbaum/...

8- http://www.amazon.com/Mastering-Algorithms-C-Kyle-Loudon/dp/...

9- http://www.amazon.com/Code-Complete-Practical-Handbook-Const...

10- http://www.amazon.com/Design-Patterns-Elements-Reusable-Obje...

11- http://www.amazon.com/The-Mythical-Man-Month-Engineering-Ann...

12- http://www.amazon.com/The-Programming-Language-4th-Edition/d...

13- http://www.amazon.com/The-Standard-Library-Tutorial-Referenc...

14- http://www.amazon.com/API-Design-C-Martin-Reddy/dp/012385003...

15- http://www.amazon.com/The-Linux-Programming-Interface-Handbo...

16- http://www.amazon.com/Computer-Systems-Programmers-Perspecti...

17- http://www.amazon.com/System-Programming-Unix-Adam-Hoover/dp...

18- http://www.amazon.com/Memory-Programming-Concept-Frantisek-F...

19- http://www.amazon.com/Memory-Management-Implementations-Prog...

20- http://www.amazon.com/UNIX-Filesystems-Evolution-Design-Impl...

21- http://www.amazon.com/PCI-System-Architecture-4th-Edition/dp...

22- http://www.amazon.com/Universal-Serial-System-Architecture-E...

23- http://www.amazon.com/Introduction-PCI-Express-Hardware-Deve...

24- http://www.amazon.com/Serial-Storage-Architecture-Applicatio...

25- http://www.amazon.com/SATA-Storage-Technology-Serial-ATA/dp/...

26- http://www.amazon.com/Beyond-BIOS-Developing-Extensible-Inte...

27- http://www.amazon.com/Professional-Assembly-Language-Program...

28- http://www.amazon.com/Linux-Kernel-Development-3rd-Edition/d...

29- http://www.amazon.com/Version-Control-Git-collaborative-deve...

30- http://www.amazon.com/Embedded-Software-Primer-David-Simon/d...

31- http://www.amazon.com/Programming-Embedded-Systems-C/dp/1565...

32- http://www.amazon.com/Making-Embedded-Systems-Patterns-Softw...

33- http://www.amazon.com/Operating-System-Concepts-Abraham-Silb...

34- http://www.amazon.com/Performance-Preemptive-Multitasking-Mi...

35- http://www.amazon.com/Design-Operating-System-Prentice-Hall-...

36- http://www.amazon.com/Unix-Network-Programming-Sockets-Netwo...

37- http://www.amazon.com/TCP-Illustrated-Volume-Addison-Wesley-...

38- http://www.amazon.com/TCP-IP-Illustrated-Vol-Implementation/...

39- http://www.amazon.com/TCP-Illustrated-Vol-Transactions-Proto...

40- http://www.amazon.com/User-Interface-Design-Programmers-Spol...

41- http://www.amazon.com/Designing-Interfaces-Jenifer-Tidwell/d...

42- http://www.amazon.com/Designing-Interfaces-Jenifer-Tidwell/d...

43- http://www.amazon.com/Programming-POSIX-Threads-David-Butenh...

44- http://www.intel.com/p/en_US/embedded/hwsw/software/hd-gma#d...

45- http://www.intel.com/content/www/us/en/processors/architectu...

46- http://www.intel.com/p/en_US/embedded/hwsw/hardware/core-b75...

47- http://www.hdmi.org/index.aspx

48- http://en.wikipedia.org/wiki/Digital_Visual_Interface

49- http://www.amazon.com/Essential-Device-Drivers-Sreekrishnan-...

50- http://www.amazon.com/Making-Embedded-Systems-Patterns-Softw...

51- http://www.amazon.com/Python-Programming-Introduction-Comput...

52- http://www.amazon.com/Practical-System-Design-Dominic-Giampa...

53- http://www.amazon.com/File-Systems-Structures-Thomas-Harbron...

54- ...well, I'll stop here.

Of course, the equivalent knowledge can be obtained by trial and error, which would take longer and might result in costly errors and imperfect design. The greater danger here is that a sole developer, without the feedback and interaction of even a small group of capable and experienced programmers, could simply burn a lot of time repeating the mistakes made by those who have already trodden that territory.

If the goal is to write a small RTOS on a small but nicely-featured microcontroller, then the C books and the uC/OS book might be a good shove in the right direction. Things start getting complicated if you need to write such things as a full USB stack, PCIe subsystem, graphics drivers, etc.

tagada
The guy doesn't want to write a perfectly designed OS, nor does he want to design an RTOS, by the way... He didn't even specify whether his OS would be multitasking. If not, no need for any scheduling at all, huh... He just "writes his own OS". It's a fun experience, very rich and very instructive. But at least now we can all admire the breadth of your "unlimited knowledge", especially in terms of book titles. After all, there's no need to create a new OS if you get all your inspiration from things that already exist. You will end up with Yet Another *nix... Not sure he wants a YanOS, though.

Without getting into filesystems, drivers, memory management or existing norms and standards (which it seems he wants to avoid... which is not that stupid... all of them having been designed 40 years ago... who knows, maybe he'll have a revolutionary idea by beginning from scratch), going from scratch with no exact idea of where you are headed could be a good thing. Definitely. You solve the problems only when you face them. Step by step, sequentially. Very virile, man.

I would advise the guy to start, obviously, with the boot loader. Having a look at the code of GRUB (v1), LILO, or better yet at the graphical ones could be a great source of inspiration, and a way to begin playing directly with the graphical environment: BURG, GAG, or this good one, XOSL, which is completely GUI-driven, with a mouse pointer, screens up to 1600x1200, and many supported filesystems, all written in ASM... it could actually be a hacking basis for a basic OS in itself.

You could also have a look at things like this http://viralpatel.net/taj/tutorial/hello_world_bootloader.ph... and of course at Kolibri OS http://kolibrios.org/en/?p=Screenshots

tagada
Blah, blah, blah ...

Better advice would be: Try, make mistakes, learn from them, continue.

Way, way, way less discouraging, pessimistic and above all pedantic.

derefr
> If the goal is to write a small RTOS on a small but nicely-featured microcontroller, then the C books and the uC/OS book might be a good shove in the right direction. Things start getting complicated if you need to write such things as a full USB stack, PCIe subsystem, graphics drivers, etc.

I've always wondered whether you could skip this step in [research] OS prototyping by creating a shared library (exokernel?) of just drivers, while leaving the "design decisions" of the OS (system calls, memory management, scheduling, filesystems, &c.: you know, the things people get into OS development to play with) to the developer.

People already sort of do this by targeting an emulator like VirtualBox to begin with--by doing so, you only (initially) need one driver for each feature you want to add, and the emulator takes care of portability. But this approach can't be scaled up to a hypervisor (Xen) or KVM, because those expect their guest operating systems to also have relevant drivers for (at least some of) the hardware.

I'm wondering at this point if you could, say, fork Linux to strip it down to "just the drivers" to start such a project (possibly even continuing to merge in driver-related commits from upstream) or if this would be a meaningless proposition--how reliant are various drivers of an OS on OS kernel-level daemons that themselves rely on the particular implementation of OS process management, OS IPC, etc.? Could you code for the Linux driver-base without your project growing strongly isomorphic structures such as init, acpid, etc.?

Because, if you could--if the driver-base could just rely on a clean, standardized, exported C API from the rest of the kernel, then perhaps (and this is the starry-eyed dream of mine) we could move "hardware support development" to a separate project from "kernel development", and projects like HURD and Plan9 could "get off the ground" in terms of driver support.

robomartin
A lot depends on the platform. If the OS is for a WinTel motherboard it is one thing. If, however, the idea is to bypass driver development for a wide range of platforms it gets complicated.

In my experience, one of the most painful aspects of bringing up an OS on a new platform is exactly this issue of drivers, as well as file systems. A little googling quickly reveals that these are some of the areas where one might have to spend big bucks in the embedded world to license such modules as FFS (Flash File System) with wear leveling and other features, as well as USB and networking stacks. Rolling your own as a solo developer, or even a small team, could very well fit the definition of insanity. I have done a good chunk of a special-purpose high-performance FFS. It was an all-absorbing project for months and, realistically, in the end it did not match all of the capabilities of what could be had commercially.

This is where it is easy to justify moving to a more advanced platform in order to leverage Embedded Linux. Here you get to benefit from the work of tens of thousands of developers devoted to scratching very specific itches.

The downside, of course, is that if what you need isn't implemented in the board support package for the processor you happen to be working with, well, you are screwed. The idea that you can just write it yourself because it's Linux is only applicable if you or your team are well-versed in Linux development at a low enough level. If that is not the case, you are back to square one. If you have to go that route, you have to hire an additional developer who knows this stuff inside out. That could mean $100K per year. So now you are, once again, back at square one: hiring a dev might actually be more expensive than licensing a commercial OS with support, drivers, etc.

I was faced with exactly that conundrum a few years ago. We ended up going with Windows CE (as ugly as that may sound). There are many reasons for that, but the most compelling one may have been that we could identify an OEM board with the right I/O, features, form factor, price and full support for all of the drivers and subsystems we needed. In other words, we could focus on developing the actual product rather than having to dig deeper and deeper into low-level issues.

It'd be great if low level drivers could be universal and platform independent to the degree that they could be used as you suggest. Obviously VM-based platforms like Java can offer something like that so long as someone has done the low-level work for you. All that means is that you don't have to deal with the drivers.

To go a little further, part of the problem is that no standard interface exists for talking to chips. In other words, configuring and running a DVI transmitter, a serial port and a Bluetooth interface are vastly different tasks even when you might be doing some of the same things. Setting up data rates, clocks, etc. can be night and day from chip to chip.

I haven't really given it much thought. My knee-jerk reaction is that it would be very hard to create a unified, discoverable, platform-independent mechanism for programming chips. The closest one could come to this idea would be if chip makers were expected to provide drivers written to a common interface. Not likely or practical.

Not an easy problem.

derefr
> It'd be great if low level drivers could be universal and platform independent to the degree that they could be used as you suggest. Obviously VM-based platforms like Java can offer something like that so long as someone has done the low-level work for you. All that means is that you don't have to deal with the drivers.

Another thought: if not just a package of drivers, then how stripped down (for the purpose of raw efficiency) could you make an operating system intended only to run an emulator (IA32, JVM, BEAM, whatever) for "your" operating system? Presumably you could strip away scheduling, memory security, etc. since the application VM could be handling those if it wanted to. Is there already a major project to do this for Linux?

Nov 26, 2012 · robomartin on I’m writing my own OS
Anyone who has ever written a small RTOS on a small 8-bit embedded processor will only laugh at the OP. And, I hate to say, it would be justified. There are about twenty or thirty books between where he is now and where he'd have to be in order to even start talking about designing an OS for a desktop platform. Add to that 10,000 hours of coding low- to high-level projects across embedded and desktop platforms.

A quick read of the "About" page is probably in order:

http://gusc.lv/about/

What to say?

"Someone holding a cat by the tail learns something he can learn in no other way" --Mark Twain.

Here's the tip of the tail:

http://www.amazon.com/Embedded-Controller-Forth-8051-Family/...

http://www.amazon.com/Operating-System-Concepts-Abraham-Silb...

http://www.amazon.com/Performance-Preemptive-Multitasking-Mi...

http://www.amazon.com/Design-Operating-System-Prentice-Hall-...

Have fun.

gosu
At my university, students in groups of 1 or 2 write a fully preemptible, multiprocessor unixy kernel and userspace on x86. Essentially nothing is provided except a bootloader, a build system, a syscall spec, and some moderate protection from the details of some hardware beyond the CPU. This is considered doable in a semester by college students, some of whom have no preparation except a general "here's what a CPU register is, here's what caching is" course. No one reads 20-30 books; rather, most students work from first principles and wind up reinventing the usual algorithms on their own.

I thought I'd offer a more optimistic counterpoint. I think that the 10,000 hours figure is way, way overestimating the amount of time needed in order to create something usable enough to make you satisfied, and which could teach you enough to understand any real operating system at the level of source code.

Although, yeah, if you've got commercial aspirations like OP, then I think you're in for it.

robomartin
> Although, yeah, if you've got commercial aspirations like OP, then I think you're in for it.

I think you nailed it right there. If the OP had said something akin to "I want to write a small OS with a command line interface to learn about the topic" it would have been an entirely different question. I would encourage anyone to do that. It would be an excellent learning experience and the foundation for a lot more interesting work.

If you go back and read the post, this is no small OS; any one of the many sub-projects he is proposing is a huge undertaking for a developer who self-describes this way: "I spend most of my days in the world of PHP, JavaScript (I love jQuery) and a little bit of HTML, CSS and ActionScript 3.0".

With regard to your comment about writing a "fully preemptible, multiprocessor unixy kernel and userspace on x86" in school: sure, of course. But keep in mind that you are actually being TAUGHT how to do this and guided throughout the process. You also mentioned that "a bootloader, a build system, a syscall spec, and some moderate protection from the details of some hardware beyond the CPU" are provided. The OP is talking about writing everything!

For example, he seems to talk about writing PCIe, SATA and USB interfaces. That alone could take a newbie a whole year to figure out. Particularly if coming from being a web developer.

Insane? Yes. Impossible? Of course not. Probable? Nope.

gosu
Agreed. I just wanted to talk in general to make sure that people weren't too intimidated. I think a hobby OS is a great way to become a better programmer.

About the course: the extent to which we were guided was minimal by design. It wasn't a matter of "here's a skeleton, here are steps A, B, and C to get it working", but rather "here's an API, implement it; here are some general design ideas". I think that this is comparable to what you'd get if you sat down with a book on your own, and so I hope that it might give people a decent idea of the level of difficulty of such a project.

I emphatically agree with your point about the difficulty of a newbie + PCI situation, although a year still seems steep.

meaty
Bullshit. A simple operating system is not much work with respect to getting a kernel off the ground.

Back in the old days, before the PC took off and people started expecting abstractions for everything conceivable, this was a normal part of a project. At university, I was tasked with building an RTOS platform for the m68k. It took about a month to go from zero knowledge to a working multitasking OS with DMA, memory management and protection, and a virtual machine which ran PLC-style ladder logic.

The only problem is that if you start with x86, you're going to have to fight the layers of fluff that have built up since the 8086 (LDT/GDT, long mode, segments, the PCI bus, a shitty instruction set, etc.).

I'd go for ARM.

robomartin
> Bullshit. A simple operating system is not much work with respect to getting a kernel off the ground.

Did you read the article? There's nothing there that hints at a simple operating system at all.

So, yeah: Bullshit. The operating system he is talking about is no hobby project.

Also, define "simple operating system". What is it? What can it do? What can't it do?

meaty
Successive abstractions. What he/she proposed was an end game, not the first steps.

jacquesm
I wrote a micro kernel based OS. Then I wrote a graphics driver to get bit mapped graphics. Then a window manager using that bit mapped graphics driver. Then a bunch of applications using that window manager.

And I never read any one of those books you're listing there, other than K&R and the 486 reference manuals. Sure enough, I had a fair grasp of the x86 processor architecture before starting this, and I'd done a lot of low-level 8-bit work. But on the whole I spent more time waiting for it to reboot than I did writing code or reading books, and I still managed to do all this in about two years.

This is doable. It's hard, but it is doable, and it is a lot easier now than when I did it. For one you have VMs now which make it a thousand times easier to debug a kernel. No more need to use a dos-extender to bootstrap your fledgling kernel code and so on.

This guy is way out of his depth, that's for sure. But what is also for sure is that he's going to learn quickly and seems on the right road for that (acknowledging what he doesn't know yet).

Don't tell other people what they can't do. Just wait and see, they just might surprise you. You'd have been talking to Linus like that just the same. And you would have been right about him being out of his depth, and you would have been wrong about him not being able to achieve his goal in the longer term.

Maybe this guy will get discouraged, maybe he won't. But there's no need to kill his enthusiasm with a negative attitude. If you're so smart, why not give him a hand and point him in the right direction on the concrete questions he's asking, rather than literally throwing the book (or in this case a whole library) at him and telling him he's clueless.

He probably already knows that anyway, but at least he's willing to learn.

qznc
From a previous comment of yours: "Every new programmer needs to start with C. In fact, I am convinced that every new programmer needs to start with C and be tasked with writing an RTOS on a small memory-limited 8 bit processor. And then write several applications that run within that RTOS."

He is not that far off from your suggestions. Why do you think 8-bit and realtime is better than AMD64? The architecture is a lot more complicated. On the other hand, it is probably much better documented as well.

derleth
> Every new programmer needs to start with C.

Wow, what an idiotic thing to say. C is one of the worst languages for learning the actual fundamentals of programming, which are algorithms and data structures.

robomartin
Absolutely not true. You can learn all of that and more with C. You might not like C and that is a different issue.

Let's take it down even further: every language you care to suggest ultimately ends up as machine language. You can implement ANY algorithm or data-structure management you care to mention in assembler. So assembler isn't any less capable in that regard than any language anyone might care to propose.

Now, of course, as a practical matter, doing object-oriented programming (as an example) in assembler would be extremely painful, so yeah, it would not be the first choice.

Nobody who has done a reasonable amount of programming across tools and platforms would, for a minute, suggest that C is the be-all and end-all of programming languages. That I have never said anywhere. In fact, in a recent post I believe I suggested a progression involving assembler, Forth, C, Lisp, C++, Java (or other OO options). Even less popular languages like APL have huge lessons to teach.

As the level of abstraction increases one can focus on more complex problems, algorithms and data structures. That is true.

One of the problems with a lot of programmers I run into these days is that a lot of what happens behind the code they write is absolute magic to them. They have almost zero clue as to what happens behind the scenes. That's why I tend to like the idea of starting out with something like C. It is very raw and it can get as complex as you care to make it.

One can use C to write everything from device drivers, operating systems, mission critical embedded systems, database managers, boot loaders, image processors, file managers, genetic solvers, complex state machines and more. There's virtually nothing that cannot be done with C.

Is it ideal? No such language exists. However, I'll go out on a limb and say that if I have two programmers in front of me and one only learned, say, Objective-C and nothing more while the other started out with C and then moved to Objective-C, the second programmer will be far better and write better code than the first.

All of that said, there is no magic bullet here. Start with whatever you want. No two paths are the same. Just different opinions.

derleth
> So, assembler isn't any less capable in that regard than any language anyone might care to propose.

You're arguing against a strawman here.

> Nobody who has done a reasonable amount of programming across tools and platforms would, for a minute, suggest that C is the be-all and end-all of programming languages.

And I never used that strawman.

> As the level of abstraction increases one can focus on more complex problems, algorithms and data structures.

And this is my point. You can focus on what you're learning without having to waste time on anything else.

Why don't you advocate a return to punch cards?

> One of the problems with a lot of programmers I run into these days is that a lot of what happens behind the code they write is absolute magic to them.

And most programmers don't know enough physics to understand how a transistor works, either. You can learn stuff like that as and when you need it. The actual core needs to come first.

> There's virtually nothing that cannot be done with C.

Ditto machine language, as you just said. So why didn't you say everyone needs to start with machine language?

robomartin
I merely used the fact that all languages ultimately compile to (or are interpreted by) machine code to illustrate that your calling me an idiot, and asserting that "C is one of the worst languages for learning the actual fundamentals of programming, which are algorithms and data structures", is, well, misplaced.

You can learn just as much with assembler. It would be a huge pain in the ass. And, just in case there's any doubt, I am not proposing that anyone use assembler to learn complex algorithms, patterns or data structures.

Your original comment "what an idiotic thing to say" is just false. You can learn ALL fundamentals of programming with C. And, yes, you can learn ALL fundamentals of data structures with C.

Classes and OO are not "fundamentals". That's the next level. And there's a whole movement proposing that there are huge issues with OO to boot.

I have a question. You were quick to call me an idiot for suggesting that newbies need to start with C. OK. I have a thick skin. Thanks.

Now, let's move on. I noticed that you did not offer a solution. What would you suggest someone should start with? Why? How is it better than starting with C?

Now, keep in mind that we are talking about STARTING here. We are not talking about --and I have never suggested that-- C is the ONLY language someone should learn. Quite the contrary.

Your ball.

derleth
> you calling me an idiot

I NEVER DID THAT. I merely said an idea was idiotic.

I will not proceed until you acknowledge that. It's a question of honesty.

robomartin
Hmm, semantics? Don't know:

http://dictionary.reference.com/browse/idiotic

    1. of, pertaining to, or characteristic of an idiot.
    2. senselessly foolish or stupid: an idiotic remark.
Either way, not a wonderful statement to make. But that's OK; I can take the criticism, even if misplaced. I am far more interested in how you would answer my questions. Keeping in mind that we are talking about what might constitute a reasonable choice for someone's very first programming language, the questions were:

    What would you suggest someone should start with? 
    Why? 
    How is it better than starting with C?
And I'll add:

    What will they learn that they cannot learn with C?
    How would learning C as their first language hinder them?
    Why is C an idiotic first choice?
derleth
> Hmm, semantics? Dont' know:

Newton held idiotic ideas. Was Newton an idiot? No. Did I just call Newton an idiot? No.

> What would you suggest someone should start with?

It depends on the person and why they want to program.

> Why?

Because I don't think C is the best choice for all tasks. In fact, I think C is a poor choice for most of the reasons people start programming.

> How is it better than starting with C?

Because C forces the programmer to prioritize machine efficiency above everything else. Algorithms get contorted to account for the fact the programmer must explicitly allocate and release all resources. Data structures get hammered down into whatever form will fit C's simplistic (and not very machine efficient) memory model.

In short, everything is simplified and contorted to fit the C worldview. The programmer is forced to act as their own compiler, turning whatever program they want to write into something the C compiler will accept.

> What will they learn that they cannot learn with C?

A clearer understanding of things like recursive data structures, which are complicated with excess allocation, deallocation, and error-checking noise code in C.

Compare a parser written in Haskell to one written in C: The string-handling code is reduced to a minimum, whereas in C it must be performed with obscene verbosity.

> How would learning C as their first language hinder them?

> Why is C an idiotic first choice?

It is purely wasteful to have new programmers worry about arbitrary complexities in addition to essential complexities. It is wasteful to have new programmers writing the verbose nonsense C imposes on them every time they want to do anything with a block of text. That time should be spent learning more about the theory behind programming, the stuff that won't change in a few years because it is built on sound logic, not accidents of the current generation of hardware design.

robomartin
> Because C forces the programmer to prioritize machine efficiency above everything else. Algorithms get contorted to account for the fact the programmer must explicitly allocate and release all resources. Data structures get hammered down into whatever form will fit C's simplistic (and not very machine efficient) memory model.

Well. We couldn't disagree more.

I love APL because it absolutely removes you from nearly everything low-level and allows you to focus on the problem at hand with an incredible ability to express ideas. I did about ten years of serious work with APL. I would not suggest that a new programmer start with APL. You really need to know the low level stuff. Particularly if we are talking about writing an operating system and drivers.

Nobody is suggesting that a programmer must never stray outside of C. That would be, to echo your sentiment, idiotic. A good foundation in C makes all else non-magical, which is important.

javert
Coding algs and data structures in C lets you see how those things _actually work_ in the computer. A lot is hidden by the abstractions of higher-level languages.

In particular, I am thinking about pointers and memory management, but there are other things.

derleth
> Coding algs and data structures in C lets you see how those things _actually work_ in the computer.

No. Not really. C doesn't show you any of the essential parts of cache, opcode reordering, how multicore interacts with your code, or much of anything else that actually makes hardware fast.

C makes you act as if your computer was a VAX.

robomartin
I am holding my breath to learn what language you are going to propose, as a first language, that teaches all of those things.
derleth
> I am holding my breath to learn what language you are going to propose, as a first language, that teaches all of those things.

You're not reading my other posts, then. I explicitly said programmers can learn those things as and when they need to.

aidenn0
Robomartin: I learned in C. I also find C to be superior to e.g. Java for learning data structures and algorithms. On the other hand you are losing this argument:

Below let X represent roughly the sentiment "C is a good learning language, since it teaches you what happens at a low level"

derleth: C sucks as an intro language
robomartin: No it doesn't, because X
derleth: X was true 30 years ago but isn't anymore
robomartin: well, C is still better because there is no language that does X

A better refutation is that I cannot predict the order of complexity for an algorithm written in Haskell that I could trivially do in C. Haskell presents immutable semantics, but underneath it all, the compiler will do fancy tricks to reuse storage in a way that is not trivially predictable for a beginner.

Similarly with Java, you end up having to explain pointers and memory and all that nastiness the first time the GC freezes for 1-2 seconds when they are testing the scaling of an algorithm they implemented in it.

Yes there is a "learn that when you need it" for a lot of stuff, but for someone actually learning fundamentals like data-structures and algorithms, we are talking about a professional or at least a serious student of CS. Someone in that boat will need to be exposed to these low-level concepts early and often because it is a major stumbling block for a lot of people.

If you just want to write a webapp, use PHP. If you want to learn these fundamentals you will also need to be exposed to the mess underneath, and it needs to happen sooner than most people think.

derleth
> A better refutation is that I cannot predict the order of complexity for an algorithm written in Haskell that I could trivially do in C. Haskell presents immutable semantics, but underneath it all, the compiler will do fancy tricks to reuse storage in a way that is not trivially predictable for a beginner.

Except this is also true for C at this point. Maybe the order won't change, but maybe it will at that, if the compiler finds a way to parallelize the right loops.

C compilers have to translate C code, which implicitly assumes a computer with a very simplistic memory model (no registers, no cache), into performant machine code. This means C compilers have to deal with the register scheduling and the cache all by themselves, leading to code beginners have a hard time predicting, let alone understanding.

Add to that little tricks like using MMX registers for string handling and complex loop manipulation and you have straightforward C being transformed into, at best, with a good compiler, machine code that you need to be fairly well-versed in a specific platform to understand.

This is why I get so annoyed when people say C is closer to the machine. No. The last machine C was especially close to was the VAX. C has gotten a lot further away from the machine in the last few decades.

The implication here is that you should teach C as an end in itself, not as an entry point into machine language. If you want to teach machine language, do it in its own course that has a strong focus on the underlying hardware. And don't claim C is 'just like' assembly.

aidenn0
1) Nowhere in my comment did I say C is closer to the machine.

2) Despite #1 C is still closer to the machine than Haskell, and I'm not sure how you could maintain otherwise

3) Nearly all of the C optimizations will, at best, make a speedup by a constant factor. Things that add (or remove) an O(n) factor in Haskell can and do happen.

robomartin
> Robomartin: I learned in C. I also find C to be superior to e.g. Java for learning data structures and algorithms. On the other hand you are losing this argument

I appreciate your sentiment. However, I think you made the mistake of assuming that there is an argument here. :)

I find that most software engineers who, if I may use the phrase, "know their shit", understand the value of coming-up from low level code very well. I have long given-up on the idea of making everyone understand this. Some get it, some don't. Some are receptive to reason, others are not.

I am working on what I think is an interesting project. Next summer I hope to launch a local effort to start a tech summer camp for teenagers. Of course, we will, among other things, teach programming.

They are going to start with C in the context of robotics. I have been teaching my kid using the excellent RobotC from CMU. This package hides some of the robotics sausage-making but it is still low-level enough to be very useful. After that we might move them to real C with a small embedded project on something like a Microchip PIC or an 8051 derivative.

In fact, I am actually thinking really hard about the idea of teaching them microcode. The raw concept would be to actually design a very simple 4 bit microprocessor with an equally simple ALU and sequencer. The kids could then set the bit patterns in the instruction sequencer to create a set of simple machine language instructions. This is very do-able if you keep it super-simple. It is also really satisfying to see something like that actually execute code and work. From that to understanding low-level constructs in C is a very easy step.

After C we would move to Java using the excellent GreenFoot framework.

So, the idea at this point would be Microcode -> RobotC -> full C -> Java.

Anyone interested in this please contact me privately.

robomartin
Yup. Exactly why I think C is a great starting point.

This is also why I think we have so much bloated code these days. Everything has to be an object with a pile of methods and properties, whether you need them or not. Meanwhile nobody seems to be able to figure out that you might be able to solve the problem with a simple lookup table and clean, fast C code. There was a blog post somewhere about exactly that example recently but I can't remember where I saw it.

I wrote a GA in Objective-C because, well, I got lazy. Then, after seeing the dismal performance I got I re-coded it in C. It's been a couple of years but I think that the performance was hundreds of times faster than anything the optimized Objective-C code could achieve. The heavy bloated NS data types just don't cut it when it comes to raw performance.

Someone who has only been exposed to OO languages simply has no clue as to what is happening when they are filling out the objects they are creating with all of those methods and properties or instantiating a pile of them.

derleth
> This is also why I think we have so much bloated code these days.

'Bloat' is a snarl term. It's meaningless. It literally means nothing, except to express negative emotion.

> I wrote a GA in Objective-C because, well, I got lazy. Then, after seeing the dismal performance I got I re-coded it in C.

Did you try any other algorithms? Any other data structures? Simply picking a new language is laziness.

robomartin
No, it's not a snarl term. It's very real.

When dealing with an array is 400 times slower in a "modern OO language" than in raw C, well, the code is fucking bloated.

When you can use a simple data structure and some code to solve a problem and, instead, write an object with a pile of properties and methods because, well, that's all you know, that's bloated code.

Of course there are lots of places where OO makes absolute sense. And the fat and slow code is the compromise you might have to make. That's the way it goes.

With regards to my GA example. No, I had to implement a GA. That's what was required to even attempt to solve the problem at hand. Later on we used it to train a NN, which made the ultimate solution faster. But, the GA was required. There was no way around it and Objective-C was such an absolute pig at it that it made it unusable.

> Simply picking a new language is laziness

See, there's the difference. I started programming at a very low level and have experience with programming languages and approaches above that, from C, to C++, Forth, Lisp, APL, Python, Java, etc.

I have even done extensive hardware design with reconfigurable hardware like PLD, PLA's and FPGA's using Verilog/VHDL. I have designed my own DDR memory controllers as well as raw-mode driver controllers and written all of the driver software for the required embedded system. My last design was a combination embedded DSP and FPGA that processed high resolution image data in real time at a rate of approximately SIX BILLION bytes per second.

So, yes, I am an idiot and make really fucking dumb suggestions.

Because of that I would like to think that, if the choice exists --and very often it does not-- I do my best to pick the best tool for the job.

More often than not, when it's pedal-to-the-metal time C is the most sensible choice. It used to be that you had to get down to assembler to really optimize things, but these days you can get a way with a lot if C is used smartly.

derleth
> When dealing with an array is 400 times slower in a "modern OO language" than in raw C, well, the code is fucking bloated.

Social science numbers do not impress me. Besides, what is a "modern OO language"? Haskell? How can you give any numbers without even specifying that detail?

> Of course there are lots of places where OO makes absolute sense. And the fat and slow code is the compromise you might have to make.

Your idea that "OO = fat and slow" is blown away by actual benchmarks.

http://www.haskell.org/haskellwiki/Shootout

(And, yes, unless and until you define what "OO" is to you, I'll pick Haskell as a perfectly reasonable OO language. Given that I've seen C called OO by people with better writing skills than you, this is hardly a strange choice in this context.)

> So, yes, I am an idiot

Again, I did not call you an idiot. The only one calling you an idiot here is you.

> More often than not, when it's pedal-to-the-metal time C is the most sensible choice.

I agree fully with this. However, I disagree that "pedal-to-the-metal time" is all of the time, or even most of the time. Especially when you're trying to teach programming.

Do you teach new drivers in an F1 racecar? Why or why not?

tptacek
Which begs the question of what a freaking moron Knuth must be to have presented TAOCP in terms of assembly language.
derleth
Knuth is Knuth. You are not Knuth.
robomartin
I think that there's huge value in starting with something very simple and as "raw", if you will, as possible. Any one of the various small 8 bit processors out there are very easy to understand. You have a few registers, interrupts, a small amount of RAM, perhaps some Flash, serial ports, etc. Simple.

The idea here is to really get down to the basics and understand them with a series of incremental projects.

Of course, there are no universally true rules about this stuff. This happens to be my opinion based on quite a few years of developing and shipping products that entail both electronics and software.

As an example, I am teaching my own son how to program with C and Java almost simultaneously. Why? Well, he is not learning on his own; his Dad happens to know this stuff pretty well and we are spending a lot of time on all of it. So, navigating two languages at the same time is working out OK. I've also had him sit with me while I work in Objective-C and ask questions as I go along.

In about three months we are going to build a real physical alarm clock using a small microprocessor and LED displays. I am going to bootstrap Forth on that processor. The job will also require writing a simple screen text editor in order to make the clock its own development system.

So, by the middle of next year he will have been exposed to raw C, Java and a threaded interpreted language like Forth. I want to expose him to Lisp as well but don't yet know when it will make sense to do that. Maybe in a year or so. With three programming paradigms on the table it will be far more important to explore algorithms and data structures/data representation and understand what they look like with each technology.

rayiner
Load of crap. You can write a small OS in a semester using Tanenbaum's book. It's trivial.
robomartin
There's nothing trivial about it. Following a book while being taught at a University over a semester is vastly different from what the OP is talking about. For example, he is talking about writing his own PCIe, SATA and USB drivers as well as, well, everything. He is not talking about starting with MINIX and a well-digested book that guides you through the process.

In fact, your suggestion is exactly on point: The OP should pick up Tanenbaum's book and take a year to implement everything in the book. Why a year? He is a web developer and, I presume, working. It will take time just to learn what he does not know in order to even do the work. So, let's say a year.

I would suspect that after doing that his view of what he proposed might just be radically different.

For example, wait until he figures out that he has to write drivers for chip-to-chip interfaces such as SPI and I2C. Or that implementing a full USB stack will also require supporting the very legacy interfaces he wants to avoid. Or that writing low-level code to configure and boot devices such as DRAM memory controllers and graphics chips might just be a little tougher than he thought.

There's a reason why Linux has had tens of thousands of contributors and millions of lines of code:

http://royal.pingdom.com/2012/04/16/linux-kernel-development...

...and that's just the kernel.

rayiner
I didn't say what the OP wanted to do was trivial, but rather that bringing up some sort of OS on a desktop platform is trivial.
robomartin
I would urge you to really think that statement through. Are you proposing that writing all the drivers and low-level code that already exists is trivial? I don't think you are. What, then, is bringing up an OS on a desktop platform?

Writing some kind of a minimalist hobby OS on top of the huge body of work that is represented by the drivers and code that serve to wake up the machine is very different from having to start from scratch.

My original comment has nothing whatsoever to do with anything other than the originally linked blog post which describes almost literally starting from scratch, ignoring decades of wisdom and re-writing everything. That is simply not reasonable for someone whose experience is limited to doing web coding and dabbling with C for hobby projects. In that context, just writing the PCI driver code is an almost insurmountable task.

If I were advising this fellow I'd suggest that he study and try to implement the simplest of OS's on a small embedded development board. This cuts through all the crud. Then, if he survives that, I might suggest that he moves on to Tanenbaum's book and take the time to implement all of that. Again, in the context of a working web professional, that's easily a year or more of work.

After that --with far more knowledge at hand-- I might suggest that he start to now ask the right questions and create a list of modifications for the product that came out of the book.

Far, very, very far from the above is the idea of starting with a completely blank slate and rolling a new OS that takes advantage of nearly nothing from prior generations of OS's. And to do that all by himself.

MatthewPhillips
That's a bummer. Reading this article was inspirational.
endlessvoid94
Don't let this squash your inspiration. It won't stop this guy from learning what he needs to learn.
marssaxman
I've written a small RTOS for an 8-bit embedded processor, and ported it to run on a 32-bit embedded processor too. I don't laugh at the OP; I think he's got the true hacker spirit and I wish him well. It sounds like a fun project.

What's the harm here? He's going to dive in and learn something, and he's probably going to get further along than you expect, because this stuff just isn't as complicated as people like to think it is.

JoeAltmaier
I remember my 1st OS - CTOS running on Convergent Technologies hardware. After a couple weeks on the job, I had read all the code and changed many modules, and I remember thinking "Wait, I never saw any magic. It's all just code!"
robomartin
I'll bet you didn't write drivers for USB, Hard disk, DDR3 controller, wear-leveling Flash storage, Ethernet/Networking, Graphics, etc.

You have to keep in mind that the OP is talking about such things as writing his own PCIe and USB stacks as well as everything else. He is leaving all history and prior work on the floor and re-inventing the wheel.

That's very far from writing a small RTOS for an 8-bit processor. In fact, my suggestion is that he should do just what you did in order to understand the subject a lot better. There's a lot of good Computer Science that can be learned with a small 8-bit processor.

andrewflnr
What makes you think he's incapable of getting through those twenty or thirty books? Did you have the experience to write an OS when you started your first OS? No, that's why you write your first OS. I think he knows what he's doing when he grabs the cat's tail.
robomartin
Writing your own OS is not the problem here. My first one was actually bootstrapping Forth on an embedded system. Let's call that an OS if we are going to stretch things. It did involve writing a pile of drivers for I/O --parallel, serial, i2c, etc. It also involved designing and building my own floppy disk controller as well as programming the driver to run it. Then I had to write my own screen editor to be able to edit code on the little machine itself.

Did I know all of that stuff? Of course not. Did it compare in complexity to what this article is proposing? Nope. The OS described in the article is far, far more complex than what I just described.

Is he incapable of doing it? Nope. I did not say that. I think I said that anyone who has written a non-trivial RTOS would laugh at the idea of what he described. Why? Because it is a monumental job for one person, particularly if they've done almost zero real development at the embedded level and they also have to work for a living.

I got started designing my own microprocessor boards, bootstrapping them and writing assembly, Forth and C programs when I was about 14 years old. By the time I got to college I knew low-level programming pretty well. As the challenge to start diving into writing real RTOS's presented itself I could devote every waking hour to the task. Someone starting as a web developer --who presumably still needs to keep working-- and wanting to develop such an extensive OS is just, well, let's just say it's hard.

rthomas6
It's like a pilot writing about building his or her own plane.
Someone
Compared to building an OS and support libraries, building your own plane is trivial. Flying it may take balls, though. For the best example I know of: short video at http://www.joostconijn.org/film/vliegtuig/index.php.

The guy built three planes. One didn't take off, one crashed, one brought him across Africa in short hops with landings at rebel-occupied airports where he didn't always manage to announce his arrival.

Short thread at http://www.homebuiltairplanes.com/forums/hangar-flying/12196....

(Apologies for deviating from the subject, but I figure hackers might find this interesting)

javajosh
Whoah. That's some serious balls - def someone Colbert needs to have on the show.
swah
What about writing a small RTOS? How hard do you think that is?
Hytosys
Judging by his arrogance complex, I'm sure he thinks it's pretty tough!
robomartin
Not hard at all if you take the time to study the subject.

If you really have a good grasp of the concepts sometimes the hard part is the drudgery of possibly having to write all the device drivers for the various devices and peripherals that the RTOS has to service.

In the case of the OP, he seems to be talking about rewriting everything from the most basic device drivers on up to bootloaders and even the GUI. That's a ton of work and it requires knowledge across a bunch of areas he is not yet well-versed in.

Also, when it comes to the idea of writing an RTOS, there's a huge difference between an RTOS for, say, a non-mission-critical device of some sort and something that could kill somebody (plane, powerful machinery, etc.). That is hard not because the concepts require superior intellect but rather because you really have to understand the code and potential issues behind how you wrote it very well and test like a maniac.

I have written RTOS's for embedded control of devices that could take someone's arm off in a fraction of a second. Hard? Extremely, when you consider what the stakes are and particularly so if it is your own business, your own money and your own reputation on the line. There's a lot more to programming than bits and bytes.

ricardobeat
Anyone who has ever built a high-speed rail network across the US will only laugh at you. And, I hate to say, it would be justified. There are about five to six hundred books between where you are now and where you'd have to be in order to even start talking about designing an electric railway system for transporting cargo at >200mph. Add to that 50,000 hours of working in engineering projects across the transportation and logistics industries.

In case you're lost: http://news.ycombinator.com/item?id=4815463

You could have just posted the book recommendations with a handwave. Dismissing other people's ideas because "they don't know better" (or you know better) doesn't add anything and is harmful to discussion.

robomartin
And attacking the messenger does not make your argument at all.

There's no comparison whatsoever here. My high-speed cargo rail proposal/idea is actually DESIGNED for criticism out of the self realization that I am no expert in the field. I gathered as much data as I could. Did a bunch of math. Studied some of the issues involved and devoted a non-trivial amount of time to understanding the underlying issues.

Had you engaged me privately you would have also realized that I am very aware of the near-impossibility of the project as I proposed it due to a myriad of issues, not the least of which are political and environmental. Of course there's the simple practical fact that it is probably nearly impossible to trench new territory to build a new railroad system in the US today.

The more important point of raising the issue was to highlight just how badly ocean-based container shipping methods are polluting our planet, creating a situation that has escalated into the proverbial elephant in the room.

So, yes, I've done a bit more work than the OP has done in truly understanding --in his case-- what operating systems are about, how to write them, why things are done in certain ways, the history behind some of the approaches, what works, what definitely does not work, and more.

And, yes, I have written several real-time operating systems for embedded systems, some of them mission critical. And, no, in retrospect it would have been a far better idea to license existing technology but as a programmer sometimes you don't have the option to make those decisions if your employer is dead set on a given approach.

No, I have never written a workstation-class OS. I know better than that. Today, it would be lunacy to even suggest it, particularly for a solo programmer, even with a ton of experience.

Anyhow, you succeeded at getting a rise out of me. Congratulations. I hope you are happy. It still doesn't change the fact that attacking the messenger does not invalidate anything I have said or prove whatever your fucking point might be.

Mar 13, 2012 · gghh on Help, Linux ate my RAM
I found a good introduction on unix memory caching in chapter 3 (The Buffer Cache) of 'Design of the UNIX Operating System', http://www.amazon.com/Design-Operating-System-Prentice-Hall-... . At least it was good for me (mathematician by training, programmer by profession)
Aug 04, 2009 · Locke1689 on Roll Your Own UNIX Clone
Good, but a little outdated. For general systems overview, as well as introduction to UNIX architecture, I'd actually recommend my textbook, which is the CMU intro systems book, Computer Systems: A Programmer's Perspective[1]. For OS theory, our OS/advanced OS textbooks are fine, but for actual implementation my coworkers recommended The Design of the UNIX Operating System[2], Linux System Programming[3], and Understanding the Linux Kernel[4].

[1] http://www.amazon.com/Computer-Systems-Programmers-Randal-Br...

[2] http://www.amazon.com/Design-Operating-System-Prentice-Softw...

[3] http://oreilly.com/catalog/9780596009588/

[4] http://oreilly.com/catalog/9780596005658/

michael_dorfman
Outdated? I thought the Linux Kernel was outdated in comparison. The Minix kernel was rewritten from scratch in 2006 for version 3.
Locke1689
OSDI (2nd) was written in 1997, perhaps you were thinking of this? http://www.amazon.com/Modern-Operating-Systems-3rd-GOAL/dp/0...

Minix 3 is a whole different ball game though. It's an interesting kernel with some interesting ideas, but I still recommend Linux as it's both more popular and reflects the majority of UNIX design decisions today. Even the OSs with praise for microkernel design tend to incorporate a number of monolithic features.

EDIT: Sorry! I had no idea there was a Third Edition of OSDI out http://www.pearsonhighered.com/educator/academic/product/0,,.... However, I should mention that I meant those Linux books to be read in conjunction with the kernel code -- the book is still valid today despite the additions to the kernel in the past couple years.

michael_dorfman
The 3rd edition of OSDI is definitely worth reading. The MINIX3 kernel is only 4000 lines of code; very clean, and easy to wrap your head around.
ericmc
I also recommend Computer Systems: A Programmer's Perspective to anyone who wants an intro to systems. One of the best textbooks I've read.
HN Books is an independent project and is not operated by Y Combinator or Amazon.com.