HN Books @HNBooksMonth

The best books of Hacker News.

Hacker News Comments on
Building High Integrity Applications with SPARK

John W. McCormick, Peter C. Chapin · 8 HN comments
HN Books has aggregated all Hacker News stories and comments that mention "Building High Integrity Applications with SPARK" by John W. McCormick, Peter C. Chapin.
View on Amazon [↗]
Amazon Summary
Software is pervasive in our lives. We are accustomed to dealing with the failures of much of that software - restarting an application is a very familiar solution. Such solutions are unacceptable when the software controls our cars, airplanes and medical devices or manages our private information. These applications must run without error. SPARK provides a means, based on mathematical proof, to guarantee that a program has no errors. SPARK is a formally defined programming language and a set of verification tools specifically designed to support the development of software used in high integrity applications. Using SPARK, developers can formally verify properties of their code such as information flow, freedom from runtime errors, functional correctness, security properties and safety properties. Written by two SPARK experts, this is the first introduction to the just-released 2014 version. It will help students and developers alike master the basic concepts for building systems with SPARK.

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this book.
Sep 09, 2022 · eggy on Game Jam 2 Results
Hah, Chevy Spark, and it's a gasoline car!

I usually put "SPARK2014 programming" or "SPARK2014 Ada" in my searches. I used to get annoyed by all of the Apache Spark stuff that came up, but that seems to have waned a bit this past year or two.

Vermont Technical College (VTC) used SPARK2014 to program their CubeSat, and the book "Building High Integrity Applications with SPARK" [1] is a great introduction to it, set in a nice context of practical use.

There was also an article on the AdaCore site about someone who rewrote some drone firmware in SPARK2014, which is another nice real-world application. [2]

[1] https://www.amazon.com/Building-High-Integrity-Applications-...

[2] https://blog.adacore.com/how-to-prevent-drone-crashes-using-...

unwind
Gasoline cars have sparkplugs, so it's not really a strange association. At least not in my opinion.

In fact, I guess in gas cars sparks are "better" since they are engineered in, and doing useful work.

eggy
I didn't say it was strange. I found it amusing that this is what came up during the other person's search for SPARK2014, and, yes, I thought it sounded like a suitable electric vehicle name. Chevy called their electric offering the Volt, and volts are in the battery of a gas car too, but I think you get the picture.
Nov 17, 2018 · nickpsecurity on SPARK by Example
If you want to learn it, here's the main book people are using:

https://www.amazon.com/Building-High-Integrity-Applications-...

Cleanroom just uses functional decomposition, the simplest control-flow primitives, and math you learned in middle or high school. It got results on lots of projects.

http://infohost.nmt.edu/~al/cseet-paper.html

https://pdfs.semanticscholar.org/12d5/23e586ffde5cbe020cb3fa...

(The second link has a table and description showing the results they got. Stavely's book distills it into a lightweight, less process-heavy form.)

Design by Contract itself is like assertions on steroids. If your language lacks support, you can build it into your functions at the beginning, middle, and end. In an OOP language, you might use constructors and destructors. If you're trying to understand it, I have a link that you can give even to project managers.

https://www.win.tue.nl/~wstomv/edu/2ip30/references/design-b...
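
To make the "beginning, middle and end" idea concrete, here is a minimal sketch in plain Ada using pragma Assert; the function name and bounds are invented for illustration, and the checks only fire when assertions are enabled (e.g. GNAT's -gnata switch):

  --  Precondition checked on entry, postcondition checked before return;
  --  build with assertions enabled so a violation raises Assertion_Error
  --  at the exact failed check.
  function Withdraw (Balance, Amount : Natural) return Natural is
     Result : Natural;
  begin
     pragma Assert (Amount <= Balance);           --  "beginning": precondition
     Result := Balance - Amount;
     pragma Assert (Result + Amount = Balance);   --  "end": postcondition
     return Result;
  end Withdraw;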

One benefit of making it a formal spec is that you can generate tests directly from your specs. This is an old, old technique being rediscovered recently; various names include specification-based, model-based, contract-based, and property-based test generation. The last thing you can do is manually or automatically convert your specs into runtime checks for the above and/or fuzz testing. A failure then takes you exactly to the property that was violated.
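
As a hedged sketch of what that can look like (the names and ranges here are invented): write the spec as Ada 2012 Pre/Post contracts, enable runtime contract checking (e.g. -gnata with GNAT), and drive the subprogram with random inputs; any violated contract stops at that exact property.

  with Ada.Numerics.Discrete_Random;

  procedure Fuzz_Withdraw is
     subtype Small is Natural range 0 .. 10_000;
     package Rand is new Ada.Numerics.Discrete_Random (Small);
     Gen : Rand.Generator;

     --  The contract is the spec; with checks on, Post is re-verified
     --  on every call during the fuzz loop below.
     function Withdraw (Balance, Amount : Small) return Natural is
       (Balance - Amount)
       with Pre  => Amount <= Balance,
            Post => Withdraw'Result + Amount = Balance;

     Ignored : Natural;
  begin
     Rand.Reset (Gen);
     for I in 1 .. 1_000 loop
        declare
           B : constant Small := Rand.Random (Gen);
           A : constant Small := Rand.Random (Gen);
        begin
           if A <= B then                  --  only call within the precondition
              Ignored := Withdraw (B, A);  --  a Post failure points right here
           end if;
        end;
     end loop;
  end Fuzz_Withdraw;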

Finally, for just the most critical things, you can try formal proof. SPARK Ada is a great example used in industry, and the book is pretty easy to follow.

https://www.amazon.com/Building-High-Integrity-Applications-...

The nice thing about that toolset, especially the proprietary version, is that you can use automated provers to avoid having to do mathematical proof by hand. If something doesn't pass, you have several options: do some manual work on hints to the automated provers, or actual proofs in a proof assistant; monkey around with the code to see if a different structure or algorithm gets it through; or put in runtime checks for just the properties you couldn't prove. If you do the last one and keep programming while the solvers run, the productivity is similar to Design-by-Contract, with the extra benefit that some properties might hold in all cases. You might also get a performance boost by removing unnecessary safety checks wherever proofs were successful.
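
A hedged sketch of that workflow in SPARK 2014 (the package and names are made up): annotate the code with contracts, run gnatprove over the project, and leave anything it can't discharge as a runtime check, or help the provers with extra assertions or pragma Assume and try again.

  --  "gnatprove -P project.gpr" tries to discharge these checks; an
  --  unproved property can stay as a runtime check instead.
  package Counters with SPARK_Mode is

     Max : constant := 1_000;
     subtype Count is Natural range 0 .. Max;

     procedure Bump (C : in out Count) with
       Pre  => C < Max,          --  rules out the range check failing below
       Post => C = C'Old + 1;    --  functional property to prove

  end Counters;

  package body Counters with SPARK_Mode is

     procedure Bump (C : in out Count) is
     begin
        C := C + 1;              --  provably within Count, given the Pre
     end Bump;

  end Counters;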

They probably should've started with a Wirth-style language plus Design-by-Contract and static analysis. That would knock out tons of stuff at compile time and runtime. If you're looking for a mature language, the best one for predictability is SPARK Ada, hands down. It was designed for easy analysis by automated provers, and its automation level is a lot higher than it used to be. Folks wanting provable programs should either be considering using it or prototyping their programs in it with a semantics-preserving translation to the blockchain language/VM.

https://en.wikipedia.org/wiki/SPARK_(programming_language)

https://www.amazon.com/Building-High-Integrity-Applications-...

Note: The book is designed for people without a formal methods background. I can't say how easy or hard it will be since I just got it. :)

Are we there yet? 20 Years of Industrial Theorem Proving with SPARK

https://proteancode.com/keynote.pdf

IRONSIDES DNS illustrates it can handle bigger problems

http://ironsides.martincarlisle.com

Frama-C and SPARK both build on the Why3 platform with its intermediate language WhyML. Work built on any of these three can usually be ported to the others. So, I'll drop a recent example in Frama-C that's really math-heavy, with a style amenable to high automation.

https://hal.archives-ouvertes.fr/hal-01681134/document

“but after studying many other ICO, I am convinced now that this is not just a short-lived fad, but a real new model to fund projects, and especially open-source projects related to blockchains.”

There’s been quite a few of us suggesting people jump into blockchain just to get funding for more important stuff (e.g. language-level tooling or libraries) they’ll produce as part of the coin offering they sell to investors. Plus, with the focus on correctness, it’s easier than ever to get the private sector to invest in high-assurance tooling. Problem is they keep trying to clean-slate everything when there's piles of work to build on with better benefits in the long term. And this work is hard even when doing simple things if you want end result to be widely usable. So, you better have a good justification for trying to redo decades worth of hard work on your own with some VC funding.

Anyone wanting a shortcut should build their language on top of the capabilities of the SPARK Ada language [1] with this book [2]. Your smart contracts get all the benefits it currently offers, plus whatever extra you build, plus whatever they build with the influx of money for the Pro edition. Recent projects already add some pointer [3] and floating-point [4] safety to what they already mostly automate with that tooling. This is also a mature tool whose development and commercial use go back decades [5].

[1] https://en.wikipedia.org/wiki/SPARK_(programming_language)

[2] https://www.amazon.com/Building-High-Integrity-Applications-...

[3] https://arxiv.org/abs/1710.07047

[4] http://lists.forge.open-do.org/pipermail/spark2014-discuss/2...

[5] http://www.spark-2014.org/uploads/itp_2014_r610.pdf

greggirwin
Thanks for the info @nickpsecurity!
[Copying/pasting my reply to the author on Disqus since I'm at work.]

Good write-up on the difficulties. The first I saw was Guttman’s paper that set back the hype quite a bit:

http://www.cypherpunks.to/~peter/04_verif_techniques.pdf

His post and yours both oversell the problems while underselling the successes. It's best to keep the big picture in mind. Yes, it's extremely hard. Yes, we shouldn't rely exclusively on it or have too high expectations. On Guttman's end, the Orange Book of TCSEC (later, EAL7 of the Common Criteria) required many activities like the ones I included in my own security framework:

http://pastebin.com/y3PufJ0V

Each was supposed to check the others. Each time, you want experienced people doing it, since it's the people that make these processes work. What's mostly missing in this post, and especially in Guttman's, is all the evidence of defect reduction. Most of the work he cites tracked problems found throughout the lifecycle. They all said the formal specifications caught bugs due to ambiguity. Some found formal proofs beneficial, often mentioning "deep" errors they likely wouldn't have found with testing or code review. TLA+ use at Amazon was a great example, where errors showed up 30+ steps into protocols. Most of the old verifications found they caught errors just by simplifying their specs for the prover, even if they never ran the proofs. And for high-assurance security, the A1-class systems would go through 2-5 years of analysis and pentesting, where evaluators said they did way better than systems built with mainstream practices. Most of those can't even tell you how many covert channels they have.

If you’re interested in methods that helped in practice, I dropped a start on a survey paper on Lobste.rs trying to shorten several decades worth of material into an outline post:

https://pastebin.com/uyNfvqcp

Now, all that said, full proof is hard, with a lot of problems. You hit the nail on the head about the infrastructure problem. I've always strongly encouraged people to concentrate on building up one or two projects to have infrastructure for the most common ways of solving problems in programming, with as much automation as possible. They have at least been building most things on Coq, Isabelle/HOL, HOL4, and ACL2. Having prior work to build on has been helping them. As you've seen, though, they've not made this a priority: the tooling is still horrible. So, horribly difficult work with horribly difficult tooling. That leads to the next proposal I made, which might help you or some of your readers who want the 80/20 rule of most benefits without most of the problems:

https://news.ycombinator.com/item?id=15781508

https://lobste.rs/s/n7v658/your_thoughts_on_this_advice_thos...

Here’s some of those methods on top of the TLA+ and Alloy links:

https://www.eiffel.com/values/design-by-contract/introductio...

https://www.win.tue.nl/~wstomv/edu/2ip30/references/design-b...

http://www.anthonyhall.org/c_by_c_secure_system.pdf

https://www.amazon.com/Building-High-Integrity-Applications-...

Here’s my recommendation in general. Learn lightweight methods such as Meyer’s Design-by-Contract for most stuff, TLA+ for concurrency/order, Alloy for structure, SPARK for low-level code, and so on. TLA+ on distributed stuff seems to sell people the most. With a formal method in hand, then apply this methodology: memory-safe language + lightweight method to model-check important stuff + Design-by-Contract + tooling to generate tests from specs/contracts + fuzzer running with runtime checks enabled. This combination will bring the most quality improvement for least effort. In parallel, we have the experts formally verifying key components such as kernels, filesystems, network stacks, type systems, compilers, security protocols, and so on. They do so modelling real-world usage with each one coming with contracts and guidance on exactly how they should be used. The baseline goes up dramatically with specific components getting nearly bullet-proof.

Also, you said you wanted the entire system to be reliable. There was one that set the gold standard on that, by noting how everything failed and systematically mitigating it through prevention and/or recovery:

http://www.hpl.hp.com/techreports/tandem/TR-85.7.pdf

https://en.wikipedia.org/wiki/NonStop_(server_computers)

The OpenVMS clustering software alone let their systems run for years at a time; one supposedly ran 17 years at a rail yard or something. The NonStop systems took that to the hardware itself, with them just running and running and running. Applying a combination of balanced assurance like I describe, NonStop techniques for reliability, and Cambridge's CHERI for security should make one hell of a system. You can baseline it all at medium assurance using tools that already exist, with formal verification applied to one piece of the design at a time. The work by Rockwell-Collins on their commercially deployed AAMP7G shows that's already reachable.

http://www.ccs.neu.edu/home/pete/acl206/slides/hardin.pdf

So, there’s you some weekend reading. Merry Christmas! :)

For people wanting to learn, this article on trying to use it for audio applications will give you a nice taste of the language:

http://www.electronicdesign.com/embedded-revolution/assessin...

This Barnes book shows how it’s systematically designed for safety at every level:

https://www.adacore.com/books/safe-and-secure-software

Note: The AdaCore website has a section called Gems that gives tips on a lot of useful ways to apply Ada.

Finally, if you do Ada, you get the option of using Design-by-Contract (built into Ada 2012) and/or the SPARK language. One gives you clear specifications of program behavior that take you right to the source of errors when fuzzing or the like. The other is a smaller variant of Ada that integrates with automated theorem provers to try to prove your code free of common errors in all cases, versus just the ones you think of, as with testing. Those errors include things like integer overflow or divide by zero. Here are some resources on those:

http://www.eiffel.com/developers/design_by_contract_in_detai...

https://en.wikipedia.org/wiki/SPARK_(programming_language)

https://www.amazon.com/Building-High-Integrity-Applications-...
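
As a small, hedged illustration of the divide-by-zero point (the package and function names are invented): in SPARK, the precondition is what lets gnatprove show the division can never fail, instead of relying on a runtime check.

  package Math_Utils with SPARK_Mode is

     --  Without the Pre, gnatprove flags a possible divide-by-zero;
     --  with it, the check is proved for all callers.
     function Average (Total, Count : Natural) return Natural with
       Pre => Count > 0;

  end Math_Utils;

  package body Math_Utils with SPARK_Mode is

     function Average (Total, Count : Natural) return Natural is
     begin
        return Total / Count;   --  safe: Pre guarantees Count > 0
     end Average;

  end Math_Utils;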

The book, and even the language, was designed for people without a background in formal methods. I've gotten positive feedback from a few people on it. Also, I encouraged some people to try SPARK for safer native methods in languages such as Go. It's kludgier than things like Rust, which was designed with that in mind, but it still works.

GPL download for AdaCore GNAT:

https://www.adacore.com/community

mcguire
SPARK might be kludgier than Rust, but the guarantees it can make are stronger. And it's less kludgy than the dependent type systems I've seen.
nickpsecurity
I meant integrating it into FFIs expecting C code. It's true that it makes stronger guarantees. I also have a concept called Brute Force Assurance, where one source gets converted into Rust, Frama-C, and SPARK. The idea is that the static analysis tools for each knock out errors the others can't catch. The final result is portable C.
touisteur
I'd also reference Rosetta Code (http://rosettacode.org/wiki/Rosetta_Code) to get a taste of Ada. I use it sometimes when I forget some simple stuff.

For C++ and Java practitioners: https://www.adacore.com/books/ada-for-the-c-or-java-develope...

There's also http://university.adacore.com .

For SPARK2014 you might want to start with AdaCore University as well, or, if you're in Paris in December, there is a public training session: https://www.adacore.com/public-spark-training

Then you might want to look at the implementation guidance: https://www.adacore.com/books/implementation-guidance-spark

And no I don't work for AdaCore :-D.

For small modules, SPARK with this book, based on the limited feedback I have:

https://www.amazon.com/Building-High-Integrity-Applications-...

Verification of larger programs is hard for a lot of reasons. The easiest method is to do Design-by-Contract in a memory-safe language. Anything easy to specify in contracts goes there, while stuff that's easier to test for is done that way. If your language doesn't support it, you can do it with asserts, conditionals in functions, or constructors/destructors in OOP.

http://www.eiffel.com/developers/design_by_contract_in_detai...
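
For the constructor/destructor flavor, Ada 2012's Type_Invariant aspect is one built-in way to get it; this is just a sketch with made-up names, where the invariant is re-checked whenever a Stack value comes back out of the package's operations.

  package Bounded_Stacks is

     Capacity : constant := 100;

     --  The invariant is checked each time a Stack value leaves one of
     --  these operations, much like checks in a constructor/destructor.
     type Stack is private
       with Type_Invariant => Depth (Stack) <= Capacity;

     function Depth (S : Stack) return Natural;

     procedure Push (S : in out Stack; Item : Integer)
       with Pre => Depth (S) < Capacity;

  private

     type Int_Array is array (1 .. Capacity) of Integer;

     type Stack is record
        Top  : Natural := 0;
        Data : Int_Array := (others => 0);
     end record;

     function Depth (S : Stack) return Natural is (S.Top);

  end Bounded_Stacks;

  package body Bounded_Stacks is

     procedure Push (S : in out Stack; Item : Integer) is
     begin
        S.Top := S.Top + 1;       --  Pre keeps Top within Capacity
        S.Data (S.Top) := Item;
     end Push;

  end Bounded_Stacks;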

Then you hit the app with tests generated from the contracts and with fuzz testing, both run while the contract conditions have runtime checks in the code. That means a failure will take you right to the precondition, invariant, or postcondition that was broken.

For analyzing distributed stuff, TLA+ is your best bet, with an accessible tutorial for a subset available here:

https://learntla.com/introduction/

HN Books is an independent project and is not operated by Y Combinator or Amazon.com.