HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
CppCon 2017: Carl Cook “When a Microsecond Is an Eternity: High Performance Trading Systems in C++”

CppCon · Youtube · 22 HN points · 8 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention CppCon's video "CppCon 2017: Carl Cook “When a Microsecond Is an Eternity: High Performance Trading Systems in C++”".
Youtube Summary
http://CppCon.org

Presentation Slides, PDFs, Source Code and other presenter materials are available at: https://github.com/CppCon/CppCon2017

Automated trading involves submitting electronic orders rapidly when opportunities arise. But it’s harder than it seems: either your system is the fastest and you make the trade, or you get nothing.

This is a considerable challenge for any C++ developer - the critical path is only a fraction of the total codebase, it is invoked infrequently and unpredictably, yet must execute quickly and without delay. Unfortunately we can’t rely on the help of compilers, operating systems and standard hardware, as they typically aim for maximum throughput and fairness across all processes.

This talk describes how successful low latency trading systems can be developed in C++, demonstrating common coding techniques used to reduce execution times. While automated trading is used as the motivation for this talk, the topics discussed are equally valid to other domains such as game development and soft real-time processing.

Carl Cook: Optiver, Software Engineer

Carl has a Ph.D. from the University of Canterbury, New Zealand, graduating in 2006. He currently works for Optiver, a global electronic market maker, where he is tasked with adding new trading features into the execution stack while continually reducing latencies. Carl is also an active member of SG14, making sure that requirements from the automated trading industry are represented. He is currently assisting with several proposals, including non-allocating standard functions, fast containers, and CPU affinity/cache control.

Videos Filmed & Edited by Bash Films: http://www.BashFilms.com


Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
I find this stuff fascinating, and this article is way above average for online posts about proprietary/algorithmic/quantitative/low-latency trading (very leaky Venn diagram there). I have a few nitpicks but overall it's informative and it's an interesting format: viewing an industry through the lens of a particular firm, especially one as fascinating as Jane. Anything that develops literacy in modern finance amongst the lay public is a good thing in my book.

If this stuff floats your boat I'd also recommend any of Carl Cook's talks, e.g. https://www.youtube.com/watch?v=NH1Tta7purM. Optiver is AFAIK in a somewhat different business than Jane, but they're also players (or were last I had any inside baseball).

Too many people got their worldview on this industry from "Flash Boys", which, and I say this as a Lewis fan, is criminally stupid at best and in bad faith at worst (if you want a well-researched, accessible alternative: https://www.amazon.com/Trading-Speed-Light-Algorithms-Transf... is about a zillion times better).

It's a pretty short list of places I'd ever go through some grueling and semi-arbitrary gauntlet to work for, but Jane is on it for sure.

I hope the author(s) do Medallion next.

ivanech
+1 to Trading at the Speed of Light. A great read, particularly for any engineer curious about clinging to the limits of physics. An example: microwave towers are used to beam data from Chicago to New York because it's faster than fiber optic cables. Even crazier: these microwave towers have their repeater hardware at the top of the tower (microwave towers usually have it at the bottom) so that they don't lose time in wires going from the top to the bottom.
gautamdivgi
A 2019 article on the various companies competing for this [0]

0. https://www.bloomberg.com/news/features/2019-03-08/the-gazil...

Bluecobra
Even microwave is too slow; they are using shortwave radio for certain signals. Maybe neutrinos are next?

https://sniperinmahwah.wordpress.com/2018/05/07/shortwave-tr...

samatman
I'm not sure what's going on with this comment so let me make three observations:

The speed (latency) of the EM spectrum is the effective celerity of the medium: this doesn't differ in the air between microwave and shortwave in any meaningful way.

The speed (throughput) achievable on a given frequency is limited by the period of that frequency: higher frequencies (shorter wavelengths) can modulate more signal. Microwaves are higher frequency than shortwave by definition, so they have a higher throughput.

Neutrinos don't exceed the speed of light, and make a very bad medium of transmission given the near-complete lack of interaction with baryonic matter.

dejerpha
Neutrinos don’t need to go around the earth, so in theory you have a pi/2 advantage over an EM signal when sending to an antipodal location, for instance. In practice of course, throughput is utterly horrible for the reason you indicate.
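The pi/2 figure can be checked with a little arithmetic: antipodal points are half a circumference (pi*R) apart along the surface, but only a diameter (2R) apart straight through the Earth. A quick back-of-the-envelope sketch, assuming vacuum light speed on both paths (the radius and constant below are just standard reference values):

```java
import java.util.Locale;

// Compare the surface (great-circle) path with the through-Earth chord
// for antipodal points; their ratio is pi/2 regardless of the radius.
public class PathRatio {
    public static void main(String[] args) {
        double R = 6371.0;            // mean Earth radius, km
        double c = 299_792.458;       // speed of light in vacuum, km/s

        double surface = Math.PI * R; // half the circumference
        double chord = 2.0 * R;       // a diameter

        System.out.printf(Locale.ROOT, "surface: %.1f km (%.1f ms)%n",
                surface, surface / c * 1000);
        System.out.printf(Locale.ROOT, "chord:   %.1f km (%.1f ms)%n",
                chord, chord / c * 1000);
        System.out.printf(Locale.ROOT, "ratio:   %.3f%n", surface / chord); // ~1.571
    }
}
```

That is roughly 67 ms versus 43 ms for the antipodal case, a large edge in HFT terms, which is why the idea keeps coming up despite the throughput problem.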
Scoundreller
All you need is a couple bits, but yeah, making sure it’s the neutrino your buddy sent and not some other one is where it gets complicated.
simiones
I think neutrino detectors capture at best something like 10 neutrinos per year or so, so the throughput would be severely limited.
samatman
This assumes that neutrinos aren't slowed by a dense medium as light is.

That's a maybe. Still, good point.

SamReidHughes
We already fire neutrino beams through the Earth's crust, and they travel at the speed of light, and the core isn't that much more dense.
AbrahamParangi
Latency would also be degraded by the interaction property. The probability that you detect the first packet of neutrinos is extremely low.
kristjansson
The game is repeaters and path lengths, not the speed of the medium itself
pushrax
Shortwave radio can be transmitted around the curve of Earth by ionospheric reflection and refraction so fewer repeaters are needed. This allows crossing vast oceans where microwave infrastructure might not be possible.

As you say the downside is available bandwidth and throughput.

Scoundreller
That’s why most markets close during night time.
matred
No it isn’t.

Most markets, in terms of their daily volume, are open at night but very thinly traded until EU hours, though some do see action in Asian hours. It's just about liquidity.

Maybe you’re thinking of single name equity markets, which are a fraction of daily trading.

vlovich123
I think they’re probably asking about US markets. Afaik those are only open during business hours, and I’m not sure how after-hours trading happens, but it might not be available to most people.

I think one of the main reasons is government concern about shenanigans happening overnight without oversight, or a flash crash. That being said, I presume the breakers that would halt trading activity are probably automatic, but potentially not all of them (I think the US did things like that during the housing collapse).

samatman
Another good point, and one I thought about before replying, but that doesn't make microwaves slower, it makes them inapplicable.
pushrax
In theory having fewer repeaters improves latency, probably in the range of 100ns per repeater. I don't know how much of a practical effect that has, likely very minimal with modern implementations.

Either way it's more sensible to build high throughput microwave networks given the tiny amount of shortwave bandwidth we have.

ericbarrett
I bet it's way lower than 100ms for this application. Single digits. Maybe less.
pushrax
I wrote 100ns, and meant an analog repeater. Full digital regeneration is probably in the 1-10μs range.
melony
Since this is Hacker News, let's not beat about the bush. Here's a channel that actually goes through derivatives pricing without hiding the math:

https://youtube.com/c/QuantPy/videos

benreesman
Thank you kindly for what looks like a great resource!

I've been trying to put myself through YouTube night school on some of this stuff, and MIT OCW has great resources as well at significantly less cost than going to MIT ;)

This is a pretty reasonable jumping off point for their corpus of financial engineering stuff: https://www.youtube.com/watch?v=HdHlfiOAJyE.

I'm fortunate enough to work with a person who actually understands derivatives trades with some sophistication, but that's a happy accident and the more people have access to good online resources the better!

Edit: I forgot to mention this book (https://www.amazon.com/Algorithmic-Trading-DMA-introduction-...) in the spirit of something more technical than the general-audience one I linked above. I have some nitpicks with it as well, but I've gotten value out of it.

thornewolf
Followed quantpy tutorials to implement my own black scholes and heston pricing models last year. Highly recommend.
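For reference, the closed-form Black-Scholes call price those tutorials start from fits in a few lines. This is a textbook sketch, not QuantPy's code; the normal CDF here uses the Abramowitz & Stegun 26.2.17 polynomial approximation (accurate to about 7.5e-8):

```java
import java.util.Locale;

public class BlackScholes {
    // Standard normal CDF via the Abramowitz & Stegun 26.2.17
    // polynomial approximation.
    static double cnd(double x) {
        double t = 1.0 / (1.0 + 0.2316419 * Math.abs(x));
        double d = 0.3989422804014327 * Math.exp(-x * x / 2.0); // phi(x)
        double p = d * t * (0.319381530 + t * (-0.356563782 + t * (1.781477937
                + t * (-1.821255978 + t * 1.330274429))));
        return x >= 0 ? 1.0 - p : p;
    }

    // Black-Scholes price of a European call: S spot, K strike,
    // r risk-free rate, sigma volatility, T time to expiry in years.
    static double call(double S, double K, double r, double sigma, double T) {
        double d1 = (Math.log(S / K) + (r + sigma * sigma / 2) * T)
                / (sigma * Math.sqrt(T));
        double d2 = d1 - sigma * Math.sqrt(T);
        return S * cnd(d1) - K * Math.exp(-r * T) * cnd(d2);
    }

    public static void main(String[] args) {
        // At-the-money call: S=100, K=100, r=5%, sigma=20%, T=1y.
        System.out.printf(Locale.ROOT, "%.2f%n", call(100, 100, 0.05, 0.20, 1.0));
    }
}
```

The at-the-money example prints about 10.45, the standard sanity-check value for these parameters.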
lordnacho
Whenever the topic comes up, I throw out a reference to Hull's Futures, Options and other derivatives, Wilmott's Quantitative Finance, and possibly also Taleb's Dynamic Hedging.

That's more than enough on the instrument math side, most of what you'll see is pretty mundane stuff, unless you end up on an exotics structuring desk.

I'd also note that JS and other MMs mostly don't do anything requiring you to know the intimate details of these things, a lot of it is understanding how the market works rather than the deep instrument math. That might mean other kinds of math of course.

blitzar
Hull's Futures, Options and other derivatives was on the bookshelf of a friend who worked at JS - it was their bible.

I always thought market microstructure was more important, but they insisted a disciplined application of the maths (as per the bible of Hull) was where the magic really was.

adament
If you come from a pure-math, theory-first background, I would advise starting out with Björk's "Arbitrage Theory in Continuous Time". I personally found the lack of rigor and superfluous examples in Hull frustrating, and found Björk much more approachable; then you can look into Hull for real-life practicalities like day-count conventions, etc. If you want to go into complex derivatives pricing, I would advise looking at the Andersen and Piterbarg trilogy.
pvitz
I would also suggest to forget about Hull and Wilmott and would suggest to start with the excellent book by Shreve: "Stochastic Calculus for Finance. Volume II: Continuous Time Models".

Then, you can quickly read Bjoerk, work through Brigo/Mercurio (if you like that style) or Andersen/Piterbarg. Alternatively, if you want to fully dive into the subject after Shreve, Musiela/Rutkowski: "Martingale Methods in Financial Modelling" is wonderful.

dash2
I'd say that Shreve and Musiela lack rigour and are too focused on trivia, name-dropping and anecdotes. Jiao's "Infinite-dimensional methods in amassing vast bags of gold" is unsurpassed. For its coverage of Black-Scholes, I'd also recommend Schulz's "Good Grief, Charlie Brown". Sorry, everyone seemed to be doing this so I felt I should contribute too.
bob29
I find a lot of value in the analysis put forth here

https://www.jwz.org/blog/

gcapell
I honestly can't tell exactly when the thread meandered into satire.
topologie
This. Exactly this.
keithalewis
My writeup: https://keithalewis.github.io/math/um.html for modeling and https://keithalewis.github.io/math/uf.html on how to more accurately reflect the real world. I have taught Derivative Securities at NYU, Columbia, Cornell, and Rutgers over the past 14 years, but my day job is turning math into software that produces numbers people running a business will pay for. The textbooks are missing some important things.
matred
Nicely organized and dense.

Thanks for the material!

JackFr
Michael Lewis is a great writer, but the closer you are to the subject the more his shortcomings are exposed. I felt the same way about The Big Short and to some extent Liar’s Poker. He has an annoying tendency to assume that if he doesn’t understand something, either it’s completely inscrutable to everyone or simply BS.

(And to pile on, The Blind Side was the touching story of how Lewis’s prep school classmate, an Ole Miss booster, gamed the system to provide improper benefits to a high school recruit.)

FabHK
While I agree that Flash Boys was below par, what's wrong with The Big Short? I thought that was well done, accessible, and largely accurate.
mooreds
I liked The Big Short.

On a different but related note, I also really enjoyed the Compleat Ubernerd, written by Tanta, all about mortgage servicing in the mid 2000s: https://www.calculatedriskblog.com/2007/07/compleat-ubernerd...

I'm not sure how it has aged (no Dodd-Frank updates, the author has passed away) but it was glorious in its time.

JackFr
Tanta was the ABSOLUTE BEST writing on the financial crisis as it was happening. You've made it when Federal Reserve Bank of NY cites your blog in a footnote in their research report.

https://www.newyorkfed.org/medialibrary/media/research/staff...

The CR blog was not the same after she passed away.

FabHK
Agreed, Calculated Risk was required reading at the time. So much insight.

(On Tanta's passing: https://www.calculatedriskblog.com/2008/11/sad-news-tanta-pa...)

JackFr
I think it's probably because I was there for it. His construction of the narrative, while better than many (including many straight journalists) ends up sort of falsely casting people into hero/fool/villain roles that make the book work as an entertainment, but don't fully hold up.

It's a decent book, and a decent movie (kudos for one particular scene where I recognized data from the actual LoanPerformance database). I actually prefer the movie Margin Call for more accurately capturing the feel of the crisis from inside a bank.

adamsmith143
But wasn't the point of the Big Short to show the perspective of people "outside" the mainstream who made big bets against the system/banks? Not surprising then that it didn't really show what was happening in the banks themselves.
refulgentis
No, the book is very different from the movie, I'd say it's almost the opposite in that it was mostly narrated from the perspective of the banks.

Another compounding factor is that people often assume the message is "banks bad", but it's more "this system was so complex that no one individual understood the impact of their decisions", much more than "everyone was playing super fast and loose from their particular perspective".

gadders
Liar's Poker was autobiographical, though. He should have got that right :-)
murbard2
Having worked as a quant GS on HFT, +1 on Flash Boys being criminally stupid at best and bad faith at worst.
the_watcher
The Diff is easily one of the best values I get despite being >$200/year. I’m not sure how I’d rank it relative to Stratechery; I personally enjoy The Diff more, but Stratechery is more relevant to my work and is also excellent. Byrne churns out an all-timer like this every other month or so, and his average posts still consistently include the best sentences I read all day. It’s one of the only newsletters that, if I get behind on reading it, I make sure I catch up on every missed issue.
bmitc
> It's a pretty short list of places I'd ever go through some grueling and semi-arbitrary gauntlet to work for, but Jane is on it for sure.

It certainly seems like an interesting place to work, but I find their hiring process to be a bit of a red flag. Places that hire like that confuse me, because it seems it's going to apply a very selective filter to the applicants that make it through. And I don't mean selective in the sense of technical ability, but more in emotional, social, and thinking styles. I get incredibly nervous in technical interviews, and with a wide background, I don't always know certain bits of computer science. So I do terribly in this style of interview, because it does nothing to expose what I do know or how I really think on projects.

As another point of why I don't think they work: they are almost never two-way. And if they were, it would show the pointlessness of them. If I asked interviewers a bunch of questions about things that I know about, then we'd just be trading blows, which is pointless.

dQw4w9WgXcQ
> I find their hiring process as a bit of a red flag

This is a bit like saying you find the hiring process at the NFL or for SEAL teams a red flag. You either fall through the selection sieve or you don't, that's the whole point. The filter has served its purpose, and they end up with the elite high-performing teams they desire. There's nothing wrong with you or the filter if you don't make it. The filter clearly does work because Jane St prints cash like the Fed and is one of the top MM firms out there.

bmitc
I understand the perspective, although it's different and not really comparable to those other things you mentioned, especially since the NFL and the SEAL teams do not evaluate people solely based upon a few hours of tunnel-visioned interviews. They are primarily evaluated on their performance in real-life scenarios and training across a decent amount of time.

Of course Jane Street is successful! There's no question there. However, I don't think it's much of a question that they probably miss out on some great candidates that don't fit into that type of hiring process. It's possible they view those as acceptable losses.

fossuser
The author is Byrne Hobart: https://twitter.com/ByrneHobart and his substack The Diff is just generally great.
data_maan
I find the article to be a poorly written piece of propaganda for Jane Street, as you could swap "Jane Street" for "Citadel", for example, and entire paragraphs would still be true.

> it's an interesting format: viewing an industry through the lens of a particular firm

What's interesting about that? Almost every other article does that.

paulpauper
I hope the author(s) do Medallion next.

Medallion has probably gotten more scrutiny than any other fund, yet three decades later it's still as opaque as ever beyond vague 'statistical methods'. It makes a lot of money no matter what. It's more tight-lipped and exclusive than Jane Street. I don't think anyone even knows whether it's doing market making or not, or if it's making short-term directional bets. You would think after 30 years stuff would leak and the edge would be gone. Employees are paid enough not to disclose, and are likely told only a small part of the overall method/system, so only a handful of employees will know how it works in its entirety. What it's doing has to be on a very large scale and in a big and liquid market to be so consistent and profitable.

bko
I would recommend the book The Man Who Solved the Market about Medallion founder Jim Simons. It doesn't go over the strategy extensively but goes through the history and culture of the firm. From reading it I would attribute their performance to execution. They have an incredible pipeline and hire almost entirely engineers and scientists. They have a rigorous scientific method in finding and executing on signals. And they've resisted taking more money and earning more on the management fee, opting for performance.

Everything about them is boring. They're well paid, sure, but they're based on Long Island, hire mostly greybeards, and don't overhire. Compare that to Jane Street hiring interns jumping through silly hoops like betting poker chips on puzzles. It's a bit of a farce.

Theoretically other firms could copy this, but the main goal of a hedge fund manager is keeping AUM. High AUM and poor performance is better than low AUM and strong performance. So it's a lot easier to optimize for maximizing AUM and managing your brand. There aren't a lot of mathematicians that start hedge funds, so the people starting them already seed the company with the wrong culture to replicate RenTech.

nly
It's probably pretty easy to keep the returns on a pot as small as $10bn sweet if you just reserve all your best alphas for that fund. There are proprietary trading firms trading pots that size for a single shareholder.

What I've been told is that RenTech also effectively uses its public funds as a source of revenue to juice development of its proprietary platform, so some of it is business cunning rather than a hard technical edge.

They also got on the quant train very early.

jeffreyrogers
They were successful long before they had those public funds though.
carnitine
To even have those ‘best alphas’ in the first place and then select them in advance for your best fund is the impressive part. The returns are insane even if the fund is capped.
bmitc
RenTech is also a bit different than Jane Street. While smart people no doubt work at Jane Street, some very serious brains have worked at RenTech and contributed to Medallion, people like Elwyn Berlekamp.
benreesman
Oh yeah, RenTech is just fascinating, and the opacity only lends to the mystique around it. People are talking a lot about how hard it is to get a gig at Jane, and AFAIK it's fucking hard, but one of the best mathematicians who was also a super-hacker I've ever met crushed the Jane interview and got bounced out in the RenTech screen.

Of course, the 30%+ annual returns almost every year for 30 years doesn't hurt the mystique either ;)

It's interesting that their other funds are far more mundane in terms of performance and last I heard Medallion can't hold much capital (~10B or so I've heard in whispers), but there is definitely something interesting as hell going on there.

Near as I can tell it's the hardest job to get on Earth. Rumor mill is that they pre-screen candidates based on their citation record in the literature, though that's obviously hearsay and I don't know if it's true.

Inconel
In addition to RenTech, TGS is another intriguing place that mostly flies under the radar and from all rumors seems to have been fantastically successful over 3 decades. It’d be very interesting to hear about other less known firms with stellar, albeit likely smaller in absolute terms, levels of success.
VirusNewbie
TGS is just weird. Friend of mine making very good money at staff level had them reach out to get him to come interview, saying they would at least double his comp.

Another friend at G said the “smartest person in the office was poached by this company TGS, have you heard of them?”

chucksmash
Had "The Man Who Solved The Market: How Jim Simons Launched The Quant Revolution" on my shelf for several years as an out-of-the-blue birthday present but I finally got around to reading it earlier this year and I'd absolutely recommend it.

The emphasis on published work rang a bell, but thumbing through the book I can't find it off hand.

benreesman
I enthusiastically second "The Man Who Solved The Market".
lextuto
I also enjoyed the book and tried to find out more which I wrote about here (in my non-monetized blog): https://sileret.com/blog/renaissance_technologies/
Do you know this for a fact? I've done some work in the industry where I needed to make fast software, but never the like sub-microsecond tick-to-trade type fast, so I really don't know.

There was a great presentation from 2017 about some of Optiver's low latency techniques[1]. I had assumed they released it because they had obviated all of them by switching to FPGAs, but I don't know. Either way, he suggested that if you ever needed to ping main memory for anything, you had already lost. So I wouldn't have thought DDIO plays into their thinking much.

[1] https://www.youtube.com/watch?v=NH1Tta7purM

isogon
The idea is precisely that you want to avoid pinging main memory at all, which is possible (in the happy case) if you do things correctly with DDIO. Not everything is done in hardware where I am. I am wary of saying much because my employer frowns on it, and admittedly I work on the software more than the hardware, but DDIO is certainly important to us.
kolbe
Kernel bypass networking? My guess is the cards use DDIO.
I’ve seen this sentiment before and worked on both a “low latency Java” team and low latency C++ teams.

I have some sympathy for the idea that the JVM is better since it means you won’t spend all your time chasing crash reports. The thing is, as another comment hinted at, this issue is generally more reflective of the environment you build in than the technology choice.

Here’s a good talk on the reasons for and limits of using C++ for low latency systems: https://m.youtube.com/watch?v=NH1Tta7purM

If I had to rank in terms of reliability of trading infrastructure I’ve used/worked on,

* Low Latency C / C++ infra at fully automated trading firm. Very much run by the programmers and quants and also fairly small. By far the fastest (near limits of what you could do) and also most reliable.

* Low Latency Java execution infrastructure. Pretty reliable, not that fast, had some issues with GC battling and manual memory management to avoid GC, etc. There was a pretty clear latency floor (still quite low) even when “doing everything right” that serious native infrastructure beat.

* C++ market making infra at a firm run by manual traders. It was by far the slowest and least reliable. Echoes the experience of “spent hours debugging weird crashes”. The culture was very “A trader asked for this and needs it done yesterday. Also, this refactoring business doesn’t sound like adding new features, drop it”.

What I saw is that if you hire people who have a good idea of what they’re doing and keep a culture of technical excellence, C++ is definitely a better choice if you care about latency. You really have to maintain a culture of high quality code, testing, and in general caring about the technology.

This is only really possible when all of the stakeholders are involved in the technology, or at least understand the benefits. Once you start down the path of “well this feature could be done a day quicker if...” this goes down the drain, and a few years later you find yourself getting run over on latency AND with an impossible to use trading system. It’s really the worst of both worlds.

knuthsat
From my experience, JVM can't do much when there's a deep stack.

I remember just refactoring all of foreach-loops/iterators into for-loops and got insane speedups.

Functions with inefficient iterations would get optimized if called directly, but deep in the call stack, nothing happens.

These kinds of things are impossible with C++.

It's possible to write efficient Java but you have to not use some of the language features to do so.

iamcreasy
> I remember just refactoring all of foreach-loops/iterators into for-loops and got insane speedups.

Wow! Can you please provide a simple example when/where it happens?

> It's possible to write efficient Java but you have to not use some of the language features to do so.

Please tell us more.

knuthsat
Let's say you write an Iterable class so that you can use foreach on your data structure. It's quite possible the iteration won't be completely inlined, and the overhead of calling hasNext & next can become non-trivial.

An even more extreme case that yielded massive speedups is just inlining everything. I believe modern IDEs can automatically inline all invocations of a function. You'd be surprised how much performance you can get with that.

As for writing efficient Java, it comes down to using primitive and value types.

Although, my experience is mostly with Java 8. I have no idea how streams or other new syntax works. After optimizing a really massive Java code base I've never used it since.
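A minimal sketch of the kind of rewrite being described (the class and numbers below are illustrative, not from the code base discussed): summing via a boxed foreach, which goes through Iterator.hasNext()/next() and unboxing, versus the hand-inlined equivalent, a plain indexed loop over a primitive array.

```java
import java.util.ArrayList;
import java.util.List;

public class InlineLoop {
    // Boxed foreach: each step calls Iterator.hasNext()/next() and
    // unboxes an Integer; deep in a call stack the JIT may fail to
    // inline any of this.
    static long sumForeach(List<Integer> xs) {
        long total = 0;
        for (int x : xs) total += x;
        return total;
    }

    // Hand-"inlined" equivalent: indexed loop over a primitive array,
    // no iterator object, no boxing.
    static long sumIndexed(int[] xs) {
        long total = 0;
        for (int i = 0; i < xs.length; i++) total += xs[i];
        return total;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        List<Integer> boxed = new ArrayList<>(n);
        int[] primitive = new int[n];
        for (int i = 0; i < n; i++) { boxed.add(i); primitive[i] = i; }

        long a = sumForeach(boxed);
        long b = sumIndexed(primitive);
        if (a != b) throw new AssertionError("sums differ");
        System.out.println(b);
    }
}
```

Whether the JIT actually deoptimizes the foreach version depends on the JVM, call depth, and megamorphism at the call site, so any speedup claim has to be measured case by case.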

iamcreasy
But you still have to make the next and hasNext calls in a for loop. What guarantees that they would be inline in a for loop?
knuthsat
I inline them manually.
iamcreasy
How much was the performance improvement?
knuthsat
The biggest improvement I measured was about 15x. I'd say you can get 3x most of the time (go down to 33% of previous runtime).
iamcreasy
That is insane speedup! Thanks for sharing. How do I find next and hasNext is the bottleneck? Should I run profiler to see if they are being called a large number of times?
knuthsat
Unfortunately, any profiler I tried does not show these issues, because the code you are profiling will be optimized completely differently. Although it makes sense to do the optimizations/inlining inside the top-level functions that the profiler reports as taking a lot of time (even if the iteration happens in deeper levels).

You just have to try it and see. I remember it was quite easy in IntelliJ, I believe there's an option to inline all invocations of a function. I just went crazy with that option but did it systematically to really find where the bottleneck is and make a minimal change.

dralley
Why would refactoring iterators into for-loops yield speedups? Curious.
rramadass
>These kinds of things are impossible with C++

What? Anything you can do in C, you can do in C++ at the same performance level (pay only for what you use); just be aware of what you are doing and using.

arthurcolle
I thought the actual approach taken these days is to disable GC for any JVM-based trading system and use arena-based architectures or similar for applications where you need super low latency, like a LOB mirror or execution engine, no?
dehrmann
I've heard of people doing that 10-15 years ago. I'm sure it still works, but hopefully the GC situation has also improved.
thu2111
There are open source 'pauseless' (really: ultra low pause) GCs for the JVM now. Azul made good money selling such a thing to the finance industry for a long time.

Typical pauses may be far less than a millisecond in these GCs because they do almost everything in parallel.

simonh
A pause is still a pause though. One system I know of had a solution. Throw in a huge amount of RAM on the server and delay garbage collection until end of day.
overkalix
Why waste money hire best engineers when more ram does trick?
ska
As I understand it (before my time), this was a practical solution found on original symbolics lisp machines; trigger the GC and go get lunch...
2sk21
Interesting but doesn't allocation get slower as the heap increases?
masklinn
Normally yes, but that’s mostly a factor of heap fragmentation. If the GC never runs, the “old” buffers are always full, so you never bother going through them.
maccard
But a pause that can happen when you want it is acceptable. A 3 second stall might be totally acceptable as long as it happens outside of a trade, and you can guarantee that.
simonh
If you can predict 3 seconds in advance if you’re going to need to trade or not, you’d make millions very quickly. In trading terms, that’s basically having a time machine.
maccard
You don't need to predict in advance, you just need the capability in the system. If a system takes X time from the beginning of a trade to being ready to process the next trade, with Y of that time being GC, it doesn't matter how long Y is if you can execute the entire trade before all of the GC happens.
simonh
I used to support a derivatives trading system at a company called Patsystems, so I’ve seen this. Our legacy system could process trade triggers from 4ms to 7ms, but tracing the behaviour of the new system it could do it in 3ms, great. Except every now and then a GC pause would halt order processing for around 50ms. Our customers hit the roof.

Trading systems trigger almost all their orders based on detecting market changes, which you cannot predict. If a symbol hits a price that triggers one of your trades you want that trade in the market ASAP every millisecond matters. Randomly add 50ms on to that and your trading system is out of business. The customers will go elsewhere.

maccard
(Sorry, I missed this over the weekend) - I think the issue in this case is you're selling this as a guaranteed 3ms, going down from 4-7ms. If you have to pause for this 50ms GC once every X trades, then you force the GC to happen _after_ the X-1th trade has completed, but you need to provision an extra 15(?) machines to cover the pause. Otherwise, you're not delivering 3ms trades, you're delivering (3 + GCTime/X) ms. It doesn't matter if GC takes 1ms or 1s, you "simply" need to have the capacity to cover the extra GC time.
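The "force the GC to happen after the trade" idea can be sketched roughly as follows. This is purely illustrative: the hot path touches only preallocated state, and the collection is invited at a quiescent point afterwards. Note that System.gc() is only a hint the JVM may ignore; real systems rely on allocation-free hot paths plus GC tuning flags rather than this literal call.

```java
public class ControlledGc {
    // Hot path: touch only preallocated state, allocate nothing,
    // so no GC can be triggered mid-trade by this code.
    static long processTrade(long[] book, long px) {
        book[0] = px; // pretend fill into preallocated order book state
        return book[0];
    }

    public static void main(String[] args) {
        long[] book = new long[16]; // preallocated working state
        long fill = processTrade(book, 101);

        // Quiescent point after the trade completes: invite a
        // collection now rather than risking a pause mid-trade.
        // Only a hint; behavior depends on the collector and flags
        // such as -XX:+ExplicitGCInvokesConcurrent.
        System.gc();

        System.out.println(fill);
    }
}
```

The provisioning argument above then becomes: if the pause is always taken here, between trades, its length stops mattering for per-trade latency and turns into a throughput/capacity cost instead.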
opportune
What is an arena based architecture? A quick google didn't show any results. My assumption is that you mean pre-allocating all the memory you want to use at once, but I'm not sure and I'm curious.
dan-robertson
Basically yeah. You allocate out of pools or arenas. In a language like C this usually means using a block of memory for objects that can all be freed together, so instead of calling malloc and free for each little object in your critical path (eg from request to response on some server, or a frame in a videogame), you somehow determine a maximum size, allocate that (and write enough bytes to have the OS actually give you the memory), and then allocations are just bumping a pointer into this block. At the end of the request, free the block (or return it to a pool to be used later).

Sometimes an arena may instead refer to a region of memory for allocations that are all the same size/type. Because everything is the same size, the data structures for tracking allocations may be simpler.

In a language like Java, you basically preallocate an array of objects of the same type with a bunch of fields set to zero, and then have some data structure to give you a fast interface a bit like malloc. Because you don’t do any extra allocation, the GC can be safely turned off.
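A minimal C++ sketch of the bump-pointer variant described above (hypothetical names, illustration only, no error handling beyond an out-of-space check):

```cpp
#include <cstddef>
#include <cstdlib>

// Bump-pointer arena: allocate one block up front, hand out memory by
// advancing an offset, and "free" everything at once by resetting it.
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buf_(static_cast<std::byte*>(std::malloc(capacity))),
          capacity_(capacity), used_(0) {}
    ~Arena() { std::free(buf_); }

    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        std::size_t offset = (used_ + align - 1) & ~(align - 1); // round up
        if (offset + size > capacity_) return nullptr;           // out of space
        used_ = offset + size;
        return buf_ + offset;
    }

    void reset() { used_ = 0; } // releases every allocation in O(1)

private:
    std::byte* buf_;
    std::size_t capacity_;
    std::size_t used_;
};
```

On the critical path each allocation is just an add and a compare; at the end of the request (or frame) a single `reset()` replaces all the individual `free` calls.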

vgatherps
Yeah, basically everything has to be that way.

At the super fast speeds you start running into things like:

* why won’t the devirtualizer trigger?

* these object headers are sure wasting a lot of cache

* there’s a lot of forced pointer indirection

And you just end up spending vast amounts of time and effort trying to shave off those few microseconds you're wasting in the JVM. Once you add in all the effort trying to get around the GC, it's a bleak picture for competing with quality C++ without spending far more effort.

chrisseaton
What do you mean by ‘devirtualizer’ and why would you want that triggered? Sounds like something you wouldn’t want to trigger?
votepaunchy
Java will optimistically inline method calls.

https://mrale.ph/blog/2015/01/11/whats-up-with-monomorphism....

https://shipilev.net/blog/2015/black-magic-method-dispatch/

wolf550e
See Aleksey Shipilёv on JVM method dispatch: https://shipilev.net/blog/2015/black-magic-method-dispatch/
chrisseaton
Doesn't mention the term?
vgatherps
The devirtualizer (maybe this is the wrong JVM terminology; it's basically what clang/GCC call it) is part of the optimizer which sees that you have a virtual function call (like most in Java) with a unique possible target, so you can replace the virtual call with a direct one and possibly inline it.

In the JVM I think this can only be done speculatively (you have to double check the type), but it still matters.

chrisseaton
Ah right yes conflict of terminology.

In the JVM devirtualising means making a virtual object a full object again, so the opposite of what you want to be happening.

I don't think the JVM really names the optimisation you're talking about, but it does do it, through either a global assumption based on the class hierarchy, or an inline cache with a local guard.

masklinn
> In the JVM devirtualising means making a virtual object a full object again, so the opposite of what you want to be happening.

No, it’s exactly what you want: for the jit to emit a static call instead of a virtual call. Devirtualisation also makes inlining possible, by definition you can’t inline a virtual call since you’ve got no idea what you’ll end up calling.

gameswithgo
the jvm names the optimization exactly like he said
thu2111
IIRC it calls it monomorphisation.
masklinn
What do you mean by “virtual object”, an object which has been broken up on the stack instead of a “real” object living as a single entity on the heap?
bluGill
Remember back in CS101 where you had class animal, with subclasses dog and cat. You can call speak() on animal and get either ruff or meow depending on what animal is. Animal is a virtual object. The system needs to do some work to figure out which speak to call every time; the work isn't too bad, but it turns out that CPUs basically never have the right thing in the cache, so it ends up being very slow compared to most of what you are doing.

When we say devirtualize we mean that we know animal is always a cat so we don't have to look up which speak to use.

masklinn
Please don't take others for complete morons.

And that makes literally no sense in-context, you're mapping those words directly to the concept of virtual methods but that's the opposite of the way chrisseaton uses them (and they're clearly familiar with the concept of virtual calls and it's not what they're using "virtual" for), hence asking them what they mean specifically by this terminology.

chrisseaton
> What do you mean by “virtual object”, an object which has been broken up on the stack instead of a “real” object living as a single entity on the heap?

Yes... but let's not say 'broken up on the stack' - the object's fields become dataflow edges. The object doesn't exist reified on the stack - fields may exist on the stack, or in registers, or not at all.

aweinstock
LLVM calls the process of breaking a struct/object into dataflow variables "Scalar Replacement of Aggregates".

https://llvm.org/doxygen/classllvm_1_1SROA.html#details

chrisseaton
Yes, that's what the JVM calls it - and then the node that ties together the state is the virtual object.
blackrock
Why not just write it in C? And pre-allocate all the data structures, make them global, put them into a queue, and constantly reuse them. No more need to instantiate objects.
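What blackrock describes, preallocating everything and recycling it through a queue, might be sketched like this (hypothetical names; shown in C++ for consistency with the thread, though the same pattern works in plain C with a struct and functions):

```cpp
#include <cstddef>
#include <vector>

// Fixed pool of message objects, all allocated once at startup and
// recycled through a free list; the hot path never calls malloc/free.
struct Message {
    int type = 0;
    char payload[256] = {};
    Message* next_free = nullptr; // free-list link
};

class MessagePool {
public:
    explicit MessagePool(std::size_t n) : storage_(n) {
        for (auto& m : storage_) { m.next_free = free_; free_ = &m; }
    }
    Message* acquire() {            // O(1); returns nullptr when exhausted
        Message* m = free_;
        if (m) free_ = m->next_free;
        return m;
    }
    void release(Message* m) { m->next_free = free_; free_ = m; }

private:
    std::vector<Message> storage_;  // all memory reserved up front
    Message* free_ = nullptr;
};
```

The trade-off is that you must size the pool for the worst case and decide what to do on exhaustion, which is exactly the kind of discipline the GC normally hides from you.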
chrisseaton
Why not just write it in assembly while you’re at it?

The answer to both questions is that many people prefer writing in higher-level languages with more safety guarantees.

opportune
They mention not even using most of the Java features (exceptions, built in GC) that make it safe. So I think it's a pretty fair question, since they are essentially using a very stripped down version of the language that removes most of the compelling reasons to use it, and seemingly fighting the language runtime along the way.
chrisseaton
The reason these people still use Java is that everything else can be nice high-level Java code.

So they use regular Java for the build system, deployment, testing, logging, loading configuration, debugging, etc etc. There's a small core written in this strange way... but everything else is easier. And things like your profiler and debugger still work on the core as well.

mcqueenjordan
This is a bit of a false equivalency -- there aren't many perf benefits in rewriting a C program in asm. The cost you pay for that rewrite is also much greater.

Yes, of course $HIGH_LEVEL_LANG is preferable for many many use cases. In this context, we're discussing "high speed trading systems", for which native implementations are going to be the favorite.

chrisseaton
> for which native implementations are going to be the favorite

...but the point of the article is they aren't always the favourite.

acqq
I've been writing C++ for almost 30 years now, and always for workloads where the cost of allocations was plainly visible, especially in the critical parts of the applications. So I have never stopped writing "non idiomatic" C++, at least from the point of view of typical language lawyers (and C++ has attracted a lot of them over the years). And I'm surely not the only one: in different environments it was recognized, over time, first, that creating and destroying many objects was very bad, and later, with the growth of the C++ standard library, that not everything in the standard library should be treated the same, that there are better solutions than what's already available, and that third-party libraries often bring even more potential danger.

Depending on the goal, one has to be very careful when choosing what one uses. The good side is, C++ kept its C features: if I'm deciding how I'll do something, I don't have to follow the rules of the "language lawyers." I can do my work producing what is measurably efficient. And the compiler can still help me avoid some types of errors -- others can anyway be discovered only with testing (and additional tools). In the end, knowing well what one wants is the most important aspect of the whole endeavor.

rramadass
Yep. I always say start with the "better C" part of C++ to get stronger type checking, and then add in other features only as needed. All abstractions should have minimal overhead, with a strict pay-only-for-what-you-use policy.
SomeoneFromCA
Yep, the same thing I do - just write in C++ as if it was good old C, with very occasional use of templates, containers and exceptions.
acqq
To those who miss the point: nobody here is denying that C++ has something to bring. It's just that what it brings isn't what those who promote some fashion claim must be universally used, and, honestly, there's no actual reason to believe such claims, for they are no more true now than in the times when "making complex OOP hierarchies" was the most popular advice -- I remember those times too. Or the times when managers wanted to believe that everybody would just use Rational Rose to draw nice diagrams and actual programming wouldn't be needed at all. Every time has its hypes:

http://www.jot.fm/issues/issue_2003_01/column1/

https://wiki.c2.com/?UmlCaseVultures

One size doesn't fit all. Some solutions to some problems could be and are provably better than those typically promoted or "generally known" at some point of time.

If that all still doesn't mean anything to you, please read carefully and very, very slowly "The Summer of 1960", seen on HN some 9 years ago:

https://news.ycombinator.com/item?id=2856567

Edit: Answering the parallel post writing "When you proclaim to ignore language lawyers, it sounds like you are knowingly breaking the rules of the C++ standard."

No. The language lawyers, in my perception, religiously follow everything that enters the standard and proclaim that all of it has to be used, because it's standardized -- including the standard libraries and some specific stuff there that isn't optimal for the problem I'm trying to solve. And especially that whatever more recently entered the standard is automatically better. It's understandable that they support their own existence by doing all that -- it's about becoming "more important" just by following/promoting some book or some obligatory rituals (it's an easy and time-proven strategy through the centuries, and that's why I call it "religious", of course -- and I'm also not surprised that somebody who identifies with being one of the "lawyers" wouldn't like this perspective -- you are free to suggest a better name). But it should also be very obvious that following it is not necessarily optimal for me, as soon as I can decide what I'm doing. And yes, it's different in an environment where the "company policy" is sacred. There one has the company "policy lawyers", and typically every attempt at change can die if one isn't one of them.

pjmlp
I guess this is for you, in case you haven't seen it already,

"Orthodox C++"

https://gist.github.com/bkaradzic/2e39896bc7d8c34e042b

acqq
Haven't seen, but thanks. Lived and worked while following some of the ideas mentioned.

E.g. at the whole bottom of the page in some comment is a link to:

"Why should I have written ZeroMQ in C, not C++ (part II)"

https://250bpm.com/blog:8/

where the author writes "Let's compare how a C++ programmer would implement a list of objects..." and then "The real reason why any C++ programmer won't design the list in the C way is that the design breaks the encapsulation principle" etc.

I have indeed more than once used intrusive data structures in non-trivial C++ code, and the result was easy to read and very efficient. There I really didn't care about some "thou shalt not break the encapsulation principle", because whoever thinks at that level of "verbot" 100% of the time is just wrong.

The "encapsulation principle" is an OK principle for some levels of abstraction, but nobody says that one has to hold to it religiously (exactly my point before). I would of course always make an API which would hide what's behind it. But where I implement something ("the guts" of something), I of course have my freedom to use intrusive elements, if that solves the problem better. I have even created some "extremely intrusive" stuff (with a variable number of intrusive links in the structures). It worked perfectly. Insisting on doing anything as "ritual" all the time is just so, so wrong.
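For readers unfamiliar with the term, an intrusive list keeps the links inside the element itself, so no per-node allocation is needed. A minimal sketch (hypothetical types):

```cpp
// Intrusive singly linked list: the next-pointer is a member of the
// element, so insertion and removal never touch the allocator.
struct Order {
    int id = 0;
    double price = 0.0;
    Order* next = nullptr; // intrusive link
};

struct OrderList {
    Order* head = nullptr;
    void push_front(Order& o) { o.next = head; head = &o; } // O(1), no malloc
    Order* pop_front() {
        Order* o = head;
        if (o) head = o->next;
        return o;
    }
};
```

Elements can live in a preallocated pool or on the stack, and an object can carry several such hooks to sit in multiple lists at once -- which is the "extremely intrusive" variant mentioned above.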

MauranKilom
> I don't have to follow the rules of the "language lawyers."

I'm not exactly sure what specifically you are trying to imply here. When you proclaim to ignore language lawyers, it sounds like you are knowingly breaking the rules of the C++ standard. That takes a lot of faith in compilers doing what you meant to do, despite writing code that is incompatible with the standard those compilers implement...

vgatherps
This is why C++ is used over Java for very fast stuff, or “just” fast stuff if you don’t want to bother fighting the JVM
arthurcolle
Thanks for the insights - I must admit I only briefly ever even used Java so I am not an expert. Do you have any recommendations for documents or a PDF that covers the JVM memory model and scheduler in any depth that makes it appropriate, possibly with applications to hard-realtime systems?

Again, thanks for the response, this was insightful and I'd love to learn more.

Also, an aside - do you have any thoughts on the use of Rust in these same systems? It's a little bit more bleeding edge, but I'm curious to hear an expert's thoughts!

bboygravity
Am I understanding correctly that those firms really really care about latency & have unlimited money & chose C++/Java with a standard off-the-shelf compiler + an OS and/or JVM??

If you'd really really care about latency and have the resources, wouldn't it be more effective to just go bare metal? Bare metal as in: no OS (talk to the hardware registers directly from code), possibly a custom compiler, use all the relevant instructions that your CPU offers you. Or at the very least mess with the compiler to optimize it more to the specific make/model of the CPU/hardware that it will run on?

What are the considerations in those companies for or against this?

fma
Well...the article says:

"If you have an unlimited amount of time and resources, the best solution for speed will be coded in FPGA," says Lawrey, referring to the language that directly codes field-programmable gate arrays. "If you still have a lot of time and resources and you want to create multiple exchange adaptors, you'll choose C++. - But if you want to engage with 20+ exchanges, to go to market quickly, and implement continuous performance tuning, you'll choose Java."

layoutIfNeeded
I used to work at a market maker where they used FPGAs to beat HFTs. Of course it was a constant arms race, so I’m not sure what they would be using nowadays...
bboygravity
I would imagine they switched to ASICs which are more optimized than FPGAs (much smaller feature size, less general reconfigurable bloat, higher clocks, etc).
vgatherps
Only a few have, and even then only for a few fixed things. They have extreme versions of the inflexibility problems that FPGAs have, and few exchanges have low enough jitter to justify ASICs over FPGAs.

It might surprise people that relatively few HFTs do latency arb (and the ones that do often have some other structural advantage), and even those are more and more mixing in serious short-term alpha. In that context speed is simply becoming table stakes for playing the game, not the one true race to win it all.

chrisseaton
I don’t know what you think the practical benefit of running with no OS is?
toolslive

    - raw, unshared access to devices.
    - no more interrupts.
You can get really close to this via virtualization though. Run a guest machine without operating system on the hypervisor. It doesn't even have to be low level code to get most of the benefits (for example MirageOS)
chrisseaton
> raw, unshared access to devices

Get the OS to set up DMA for you.

> no more interrupts

If the OS has no need to interrupt you, it will never interrupt you. The OS doesn't context switch from a usefully running application on its own core unless you ask it to.

Don't need virtualisation.

bboygravity
More processing capacity for your application (no need to share with the OS), no more interruptions by the OS that are unpredictable in duration, timing, and priority, and above all -- since this seems to be the main goal of the software we're talking about -- lower overall latency. Standard C++/Java (as far as I know) talks to hardware through a software layer (the OS's system calls). This takes time. Time you can save by cutting out the middle man.

There are apparently ways to do this with an OS running in parallel as well, when you have multiple cores available (see another comment in this thread).

chrisseaton
You can get all this by pinning to a core and DMA to hardware. Think of the OS like a library - if you don't call it and don't ask it to call you it stays out of the way. There's no need to throw away the OS.
vgatherps
This is exactly what is done. OSes are really nice to have around and will stay out of the way. In the day and age of FPGAs, there's no reason to ruin the software side to maybe, kind of, sort of get a bit faster in your software.
bluGill
There is only a limited amount of money to be made. If your algorithm generates 100k in gross profit a year, that doesn't even pay for the programmers to maintain it, unless they can do other things as well. For a lot of algorithms (which I'd guess is common) the sum total doesn't pay. Where it does pay, you are better off spending the money on an FPGA, or even an ASIC.
scatters
I don't know about Java, but when using C++ you can use the OS to start the program, set up the hardware mappings, and then move it to an isolated core where you talk to the hardware directly without ever issuing a system call.

Compilers already offer flags to optimize for specific processor generations (and use the newly available instructions); you aren't going to be able to do better than that with a custom compiler.

bboygravity
Oh, that's cool. However, you'd still be sharing at least 1 core (and all peripheral hardware and memory access in the system) with the OS. A core, other hardware and memory that you could otherwise access directly from your application without having to wait for the OS to finish its business?

I'm not sure whether it's worth going through this trouble (writing for bare metal vs just using an off-the-shelf OS) in a real-life trading-app scenario. Hence my wonder :)

scatters
Accessing other hardware isn't latency-sensitive, though; that's usually disk and the internal network, for logging, telemetry and control. Typically a trading application will keep a thread running on the shared core and send messages between that and its dedicated core thread using lock-free queues, or simple atomics. As you say, writing for bare metal is a lot of hassle and it's unnecessary for the parts of a trading application that don't need to be super fast; using a commodity OS and isolating resources gives the best of both worlds.
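The lock-free queue between the dedicated core and the shared core could be sketched as a single-producer/single-consumer ring buffer (a minimal illustration, not production code -- e.g. no cache-line padding between the two indices):

```cpp
#include <array>
#include <atomic>
#include <cstddef>

// SPSC ring buffer: the hot thread pushes, the shared-core thread pops.
// N must be a power of two so wrap-around is a cheap mask.
template <typename T, std::size_t N>
class SpscQueue {
public:
    bool push(const T& v) {
        std::size_t head = head_.load(std::memory_order_relaxed);
        std::size_t next = (head + 1) & (N - 1);
        if (next == tail_.load(std::memory_order_acquire))
            return false; // full
        buf_[head] = v;
        head_.store(next, std::memory_order_release); // publish the slot
        return true;
    }
    bool pop(T& out) {
        std::size_t tail = tail_.load(std::memory_order_relaxed);
        if (tail == head_.load(std::memory_order_acquire))
            return false; // empty
        out = buf_[tail];
        tail_.store((tail + 1) & (N - 1), std::memory_order_release);
        return true;
    }

private:
    std::array<T, N> buf_{};
    std::atomic<std::size_t> head_{0}; // written only by the producer
    std::atomic<std::size_t> tail_{0}; // written only by the consumer
};
```

Both sides poll; push and pop fail rather than block, which is what lets the trading thread avoid syscalls and stay on its isolated core.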
sakex
I've never seen that, how do you do it? Also, how do you deal with VM and scheduling?
ericbarrett
Brief intro, should bootstrap your knowledge: https://codywu2010.wordpress.com/2015/09/27/isolcpus-numactl...
throwaway_dcnt
In addition to the peer comments: using JNI, these facilities can also be used from Java. Chronicle Queue does this.
scatters
Isolating a core: isolcpus and taskset, or possibly cgroups, plus nohz and there's a patch set that offers even stronger isolation guarantees (keeping kernel threads off the core). Hardware mappings are done as a ring buffer in memory mapped to the device; a kernel module helps set up that mapping but once it's in place reads and writes to those virtual addresses go directly (well - via MMU and memory bus) to the hardware. The general term for this is user-space networking (although obviously you can do it with other hardware as well). Some addresses may be mapped to device registers that trigger hardware actions on read/write.

VM - the key is to get everything mapped in at the start and avoid page faults subsequently. madvise can help with this and obviously you need to avoid memory leaks that could result in sbrk - but in any case allocating at all in the trading thread is generally unnecessary (after startup) and frowned on. This does mean that the (lock-free) queue back to the shared core needs to recycle memory and both sides need to service it regularly or you need a strategy to deal with exhaustion. Scheduling isn't an issue - you have a single thread per core and (with isolcpus) the kernel scheduler knows to avoid placing other threads on it.
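On the code side, the page-locking and pinning steps above can be sketched as follows (assuming Linux/glibc; the core number is an example matching a hypothetical `isolcpus=3` boot parameter):

```cpp
#include <pthread.h>
#include <sched.h>
#include <sys/mman.h>

// Pin the calling thread to one core. With isolcpus (or cgroups) keeping
// everything else off that core, this thread then runs undisturbed.
inline bool pin_to_core(int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set) == 0;
}

// Lock current and future pages into RAM so the hot path never takes a
// page fault. May require CAP_IPC_LOCK or a raised `ulimit -l`.
inline bool lock_memory() {
    return mlockall(MCL_CURRENT | MCL_FUTURE) == 0;
}
```

Typical startup order: map the device ring buffers, `lock_memory()`, warm everything up, then `pin_to_core(3)` and enter the polling loop.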

rramadass
Thank you for that writeup. Can you please list some resources (links/articles/books etc.) for us to read up on the above?
funcDropShadow
The JIT compiler of typical JVMs automatically uses the instruction sets (with limits, AFAIK) that are available on the hardware it is running on.
roel_v
Which one of these firms made the most money?
vgatherps
The third one was definitely the least, it was losing money (and specifically to trader-invisible things that never got worked on).

The whole of the trading strategy (and capital traded, return profiles, etc) between the first two was fairly different, so it’s hard to compare.

I would say though that the first firm was significantly outperforming its peers in a way the second firm did not, although both were very successful.

The technology wasn’t the whole story in either case, but the first firm had significantly more and better opportunities as a result of just having a better stack all around

Ragib_Zaman
It's generally pretty rare for a market maker to lose money over a whole year. Colour me surprised.
hnracer
True, but I feel the market-maker label gets applied pretty loosely nowadays; it could've been a firm that mixes in a lot of intraday position-taking trades and blew up.
lordnacho
There are costs beyond the PnL of buy-low-sell-high: the coders, the data, the infra. All cost money, a lot of money. Add to that the competitive aspect of the models interacting with each other, and there's no slam dunk.

There's at least a couple of MMs struggling these days.

Roritharr
Interestingly I have seen this pattern in all kinds of software businesses, but for most it tended to be less of a core business problem.

I'm still wondering if the businesses would work better if a focus on technical excellence could be instilled, and especially which parts of the company would have to suffer for it and what the ultimate outcome would be.

I'm convinced that at least for some companies, it simply can't be the winning strategy.

fizixer
You missed one comparison: C infra (no C++).

(well strictly speaking, additionally you could also have hand-written assembly infra).

humanrebar
You can write C-style code in C++ and still get access to features that are interesting in the problem domain: constexpr functions, typesafe enums, static assertions, and destructors come to mind. No need to even touch a template or inheritance if you don't want to.

The main benefits of C per se are ABI stability and availability of compilers.
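A few of those features in one place (a hypothetical sketch, nothing domain-specific):

```cpp
#include <cstdio>

// Typesafe enum: scoped, and no implicit conversion to int.
enum class Side : char { Buy = 'B', Sell = 'S' };

// constexpr + static_assert: computed and checked at compile time,
// zero runtime cost.
constexpr int lot_size(int contracts) { return contracts * 100; }
static_assert(lot_size(5) == 500, "evaluated by the compiler");

// Destructor = RAII cleanup that plain C lacks: the file is closed on
// every exit path, with no goto-cleanup boilerplate.
struct FileHandle {
    std::FILE* f = nullptr;
    explicit FileHandle(const char* path) : f(std::fopen(path, "r")) {}
    ~FileHandle() { if (f) std::fclose(f); }
};
```

None of this requires templates or inheritance, which is the point: it is C-shaped code with extra compiler checking.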

fizixer
https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

https://www.youtube.com/watch?v=WDXp11LBX1o

95th
Well there is a middle ground - Rust
aldanor
From what perspective exactly is Rust a middle ground between Java and C++?
splix
Maybe dev comfort like with Java, and performance like with C++?
pjmlp
I like Rust, but it definitely isn't dev comfort like Java.
phildawes
Memory safety with no unpredictable gc latency
jandrewrogers
Rust improves on Java in that it doesn't have a GC, so no pauses, and the code gen is similar to C++ in terms of performance.

Rust is worse than C++ in that it struggles to express some idiomatic high-performance code architectures, such as using DMA for I/O, because they violate axioms of Rust's memory safety model, or in other cases because Rust currently lacks the language features. To make it work in Rust requires writing more code and disabling safety features.

Rust fixes many language design and performance issues with Java, while offering some similar types of safety, but (like Java) it is missing some elements of expressiveness of C++ that are important for high-performance architectures.

aldanor
Hm... unsafe {} + use pointers and you're back to C++ style? You had unsound code in one language, now you have unsound code in the other language; how come it's suddenly worse, other than having to type unsafe explicitly?

I've done my fair share of C++, including low-latency stuff, and in the grand scheme of things I'd say "expressiveness of C++" is completely overshadowed by its complexity and occasional ambiguity, lack of a proper type system and proper generics, lack of a proper module system, lack of dependency tracking, lack of a unified build system, etc. I'm not exactly sure what you mean by expressiveness though.

jandrewrogers
The issue, beyond having no visible references at compile-time, is that DMA has no concept of objects and has its own rules for how and when it interacts with regions of memory. It is oblivious to ownership semantics. In things like databases, most of your address space is DMA-able memory and therefore "unsafe". In C++, you can completely hide these mechanics behind constructs that look like unique_ptr/shared_ptr but which are guaranteed to be DMA-safe. In this context, all references are treated as mutable because mutability is not a property of references per se. Conflicts can only be resolved at runtime.

Instead of using an ownership-based memory safety model, which is clearly broken for systems that use DMA, they can use schedule-based memory safety models, which are formally verifiable. These don't rely on immutable references for safety, and also eliminate most need for locking -- many of the concepts originate in research on automated lock-graph conflict resolution in databases. The evolution away from ownership-based to schedule-based models is evident in database kernels over the decades. It was originally motivated by improved concurrency; suitability for later DMA-based designs was a happy accident and C++ allows it to be expressed idiomatically.

As for expressiveness: beyond the ability to construct alternative safety models, modern C++ metaprogramming facilities enable not only compile-time optimization but also verification of code correctness that would be impractical in languages like Rust.

vmception
VHDL is faster no?

Isn’t everything you wrote from the wrong decade?

vgatherps
FPGAs have a huge amount of operational and practical difficulties. They're great for fixed-function stuff like feed parsing or dumb triggers your software can direct in real time, but absolutely terrible for anything where you need the ability to be flexible and adaptive.

Pure latency arb isn’t the only hft trade there is anyways.

angry_octet
The worst (messiest, laziest and most chaotic) engineer I know works at a brokerage, coding obscenely large and unreliable edifices (and in a number of languages, from C++ to Excel+VB)... He does what he is asked, and doesn't worry about telling them what they need. They never demand code that doesn't crash, they see that as a fact of life, and developers are an overhead. Basically someone else owns the risk, the brokers don't, so why should he?

People who don't like this style (or the code) leave.

The direction for quality has to come from the top and be supported by systematic mechanisms. In fact that would be a question to ask in interviews.

shitgoose
I know exactly the type of guys you are talking about. They write tools for traders, a mix of Excel, VB, .NET, you name it. They are not obsessed with architectural beauty, patterns, and frameworks. They just write what users ask for (and they do not tell traders what traders need - the response would be swift and insulting). The traders take these tools and generate profit. And they do not give a shit about test coverage or sound architecture principles. If software generates profit, then it works.
Karellen
^^This^^ is why you shouldn't call software developers "engineers" (disclaimer, am software developer).

"No-one ever explicitly specified that the bridge shouldn't fall down and kill everyone who was on it at the time", said no engineer, ever.

travisporter
I think software developers have a range in terms of responsibility from plumbers/electricians to civil engineers. We could save the designation “software engineer” for someone who has to go through the requisite ethics and case study courses like all other engineers. Otherwise maybe programmer/developer. (I’m gatekeeping I guess)
RHSeeger
I would think it's less about ethics and more about understanding consequences; being able to reason about what happens based on the choices made while implementing something.
zanellato19
Given the same conditions (i.e. people don't die) the same outcome would present itself. We have a history of bridges falling down, so much so that Wikipedia has a list on that: https://en.wikipedia.org/wiki/List_of_bridge_failures

A particular one that caught my eye: the Cimarron River Rail Crossing (Dover, Oklahoma Territory, United States), 18 September 1906. A wooden railroad trestle washed out under pressure from debris during high water; the entire span was lost and rebuilt, with 4 to 100+ killed. The bridge was to be temporary, *but replacement was delayed for financial reasons*.[8][9][10] The number of deaths is uncertain; estimates range from 4 to over 100.[11]

Emphasis mine. How is that different from software engineering?

cobbzilla
The sarcastic/negative implication of your example is that software engineering today is where civil engineering was 100+ years ago.
nickpeterson
I don’t think that is too far off the mark. Given 8 more decades of progress we might even be able to produce reliable code. Webpages will be 100GB of JavaScript though
m_mueller
I think reliability is, after a certain point, inversely proportional to the amount of code involved. So I'd disagree that we'll have even larger webpages then - the same way that in modern construction, heavy and thick stone and brick pillars have been replaced with comparatively light and strong steel-reinforced concrete. Maybe the equivalent would be just enough strongly typed code running on top of just enough well-tested and secure WASM?
MaxBarraclough
> Given 8 more decades of progress we might even be able to produce reliable code

At the risk of nitpicking: we already can produce reliable code, but as piaste's comment states, it's rare to make the necessary investment.

Avionics software and medical systems software are held to a very high standard, at great expense, but the average website or desktop application is not.

> Webpages will be 100GB of JavaScript though

I agree that the problem of bloat is likely to continue to worsen in the web, but that's just the web. Other sectors, like high-performance game engines, will continue to compete on efficiency.

elros
Given software engineering is less than 200 years old while civil engineering is a few thousand years old, I don't see this as a negative at all.
zanellato19
I don't think that is the case - but that is certainly one interpretation of it. They were still engineers at the time though, and so are we.
nomoreusernames
made me chuckle.
piaste
Yes, but not because software engineers are worse human beings than civil engineers, which is all too often the undertone of these discussions.

Rather, because software engineering hasn't killed enough people to advance; or when it has, it wasn't obviously the culprit.

In which fields of software engineering do we find actual solid procedures and standards? Avionics. Medical hardware. Fields where the link between bug and death is as short as possible, and so the stakeholders have demanded solid engineering.

What are the consequences of GP's acquaintance writing poor code? Some trading firm becomes slightly less efficient at trading. Impossible to evaluate the net human loss from it (if it's a loss at all).

The civil engineering equivalent would be something like façade design or layout. If your building is ugly or confusing, it will annoy or waste the time of the people who live and work in it, but as long as it doesn't fall on their heads nobody is going to withdraw your certification.

LoSboccacc
> software engineering hasn't killed enough people to advance

and where it has, you get a pretty serious control and certification process put in place to avoid it (Boeing notwithstanding)

MaxBarraclough
The 737 Max issues were not with the software.
fsflyer
There seems to be plenty of software issues found:

The MCAS software was modified to read from both angle of attack sensors and to be less aggressive in pushing the nose of the plane down. The software that controlled the indicator light that illuminated when the two angle of attack sensors disagreed was also fixed.

While reviewing the software systems, a number of other software issues were found.

The wiring bundle issue was also found during these reviews.

https://www.barrons.com/articles/these-6-issues-are-preventi...

https://simpleflying.com/boeing-737-max-software-update-3/

MaxBarraclough
I phrased that poorly, I should have said The 737 Max issues that caused the crashes were not with the software implementation. They were at the level of requirements and high-level design, and were not specific to the discipline of software engineering. We're not talking about a missing break statement here.

> The MCAS software was modified to read from both angle of attack sensors and to be less aggressive in pushing the nose of the plane down.

That strikes me as a design issue in the domain of aeronautical engineering, rather than in software engineering. Software engineers aren't the ones with the domain expertise to determine the right aggression parameters.

> The software that controlled the indicator light that illuminated when the two angle of attack sensors disagreed was also fixed.

I thought the issue was that the sensor-disagree warning light was not included as standard, it was sold as an optional extra. [0]

> While reviewing the software systems, a number of other software issues were found.

Interesting, I wasn't aware of that. If I understand correctly these issues aren't thought to have a direct bearing on the crashes.

[0] https://www.nytimes.com/2019/05/05/business/boeing-737-max-w...

angry_octet
Exactly. People seem to have skipped the V&V lectures. The code did exactly what it was asked to do. But the system/aero engineers made that decision. Even the lights issue was specified that way in the system diagram.

But the biggest defect was the decision that the pilots didn't have to be told about the risk (to avoid retraining and certification costs). If they had been told the pilots/airlines may well have protested about the lack of redundancy. All those higher ups in Boeing and the FAA should be in prison.

MaxBarraclough
> not because software engineers are worse human beings than civil engineers, which is all too often the undertone of these discussions

I think the undertone is usually that the software developer at fault had no idea what they were doing, and had no place working on critical systems, rather than that they had ill intent.

ku-man
There are definitely several ancient and recent cases of bad design or execution in civil/mechanical/chemical engineering. The difference is that there is a very strict regulatory body that reviews and sanctions such malpractices (not to mention the exams you need to write to become a member of such bodies).

Unfortunately, when it comes to software development we have people who modify wordpress templates and call themselves engineers...

I cringe whenever I read about the new 'engineering disciplines' such as 'react engineer' or 'dev-ops engineer' or 'gis engineer'.

Karellen
The point isn't that bridges never fall down. The point is that engineers who build bridges know that it's their responsibility to do everything in their power to make sure it doesn't, even if no-one else around them seems to care about that, or specifically asked for it. That they should walk away from a project rather than sign off on one where they will not have the resources or support to do a competent job.

That said, all bridges have design specifications and limits. If someone drives a convoy of trucks carrying gold bullion over a bridge and exceeds its design max weight by a factor of 10, and it doesn't hold up, that's not the engineer's fault. If a bridge is designed for a geologically active area and is designed to withstand a quake 10x more powerful than the strongest that's ever been recorded in the area, or is predicted by seismologists, and the area gets hit by a 100x quake, that's not the engineer's fault.

And if a bridge designed to last, say, 5 years is neither re-certified for use nor simply closed when its intended lifespan is up, then that's not the engineer's fault either.

But the engineer is responsible for finding out what the likely max load of a bridge will be, or how powerful the strongest quake will be, or how long its intended lifespan is - and for adding in safety margins on top of that. They can't just assume that any new bridge will only be needed for 5 years because no-one told them it will be needed for longer, and then claim "well, it was only temporary" afterwards.

zanellato19
> The point isn't that bridges never fall down. The point is that engineers who build bridges know that it's their responsibility to do everything in their power to make sure it doesn't, even if no-one else around them seems to care about that, or specifically asked for it. That they should walk away from a project rather than sign off on one where they will not have the resources or support to do a competent job.

The law takes care of that, though. The law and inspection. If it were up to the business, any engineer building bridges who refused to build them faster on those grounds would be fired.

bigbubba
A thief thinks every man steals. An apathetic engineer thinks all engineers share his apathy.
levosmetalo
"I need that bridge yesterday. If my army doesn't cross the river today, we are all dead. I don't care if the bridge collapses tomorrow, or if the rain wears it down, as long as I can use it today."

If it's your job to do what is requested of you, then you do it. It's not like having brittle, unmaintainable code is morally wrong. It's not the engineer's responsibility to judge the use case of the customer paying for the bridge.

fishermanbill
That is a really weird analogy. Using extreme scenarios to argue your point is never going to put you on a strong footing.

Anyway, for most systems deployed, the customer usually wants to maintain a business using them. If the system constantly falls over, can't be readily changed, etc., that is going to cost the customer's business compared to its competitors.

Add that a lot of developers work for the same company as the customer and that is just as much the developers responsibility.

99.999% of bridges built would be expected to hold up 10 years later and require minimal maintenance. I'd wager all bridges, even temporary army ones, would be expected not to fail unforeseeably. See the Morandi Bridge tragedy.

lowtech
Engineers were not asked to make the Voyagers last a hundred years, yet they're still up and running. Personally I think the relationship between code quality and time to delivery is not linear. People can cram more work and quality into the same amount of time if they want to and have the right incentives to do so.
throw93232
Don't argue with that. There is a common trend of blaming poor managerial decisions on lower-level developers. It just shows how crooked the system is.
Enginerrrd
>It's not engineers responsibility to judge use case of the customer paying for the bridge.

Actually yes, yes it is! I am a civil engineer and there's a standard of ethics and personal responsibility among engineers that is very, very high. When we graduate, most engineers participate in a ring ceremony. They get a funny, angled ring on their right pinky. It was originally made from the metal from a bridge that fell down and killed people. It was meant to be a daily reminder, worn on the drafting hand, that every single drawing you signed carried the weight of public trust of their lives. Even pedestrian bridges get built with vehicle load standards, because you can't just assume that it will only be used by pedestrians.

Civil engineering is a licensure, and no matter what liability structure you use, when you stamp a drawing, you carry PERSONAL liability for anything that might happen if that design fails. Even small violations of ethics or operating outside your scope of knowledge are dealt with very aggressively by the licensing board.

In school, if we misplaced a decimal or got a calculation wrong or failed to adhere to a very specific standard, it was automatic failure because if you do that in real life, people die.

ryandrake
This is a key common thread in all these discussions. At so many development shops, developers have no agency to push back against poor quality. They don't have a license on the line that they can stand behind and firmly say "NO we will not do it that way or I could be personally liable for its failure!" When they point out that they're shipping buggy crap, the response simply is: Ship the crap on time.

It's strange, too: in other threads you hear about how high the demand for software engineers is, how high their salaries are, how much negotiating power they have with their companies. But when it comes to the actual product content, suddenly they have no power at all and it's just "Yes boss, whatever you say, boss." How can both be true?

RestlessMind
> I am a civil engineer and there's a standard of ethics and personal responsibility among engineers that is very, very high.

And that is because no one can just go to a 6-month bootcamp and call themselves a Civil engineer. And even if they did, no one would hire them or the employer would go to jail.

If we want Software Engineering to be as rigorous as other Engineering disciplines, we need to erect similar barriers to become a Software Engineer.

Enginerrrd
Honestly, I think it would be sensible to have a software engineering licensure, though it shouldn't be required for all developers or development. I'd say something like an engineering stamp would be pretty reasonable for anything requiring encryption, financial data or transactions, health data, etc. And that's not to say that developers couldn't work on it, but just that a licensed software engineer should be in responsible charge of their work to ensure that it has been performed according to best practices and standards.
quonn
Is this the same in Europe? I doubt it and at least in Germany an engineer (even a software engineer) is someone who has studied a technical subject at university. It has nothing to do with liability here.
Enginerrrd
The exact protected term varies by country but there is an equivalent license in Europe. The EU has a "Eur Ing." title issued by FEANI which is roughly equivalent. As I understand it, Germany is a bit unique even for the EU. They still have some things which require a certification of sorts, but I don't know exactly how that works.
ampdepolymerase
Found yer old union gatekeeper. Canadian engineer?
ku-man
It's not about gatekeeping, it's about responsibility.

Would you be okay with acupuncturists calling themselves doctors?

Enginerrrd
No, on both accounts, though the ring thing did originate in Canada.
dmz73
It is difficult to compare software development with civil engineering as they operate on different levels. Software development is better compared to building houses. No one expects a single-story house to support 4 extra levels, or internal walls to be resistant to hammer blows.

Imagine what civil engineering would produce if you were building a bridge from a to b and then, 3/4 of the way through, you were required to add two extra exits c and d miles apart from a and b, another 4 lanes, and an extra level for train traffic... all in the same time frame and for no extra cost. Civil engineers would not be able to do it, and no one would even expect them to, but this is a regular occurrence for software developers, and most of them actually manage to deliver.

Sure, there are bugs, but bridges need constant maintenance too, some quite substantial, and that is expected. For some reason, lots of non-software engineers expect software to be perfect when their own products are not, not even close.
AlexTWithBeard
The state of software engineering these days resembles that of civil engineering in the late 19th century: some masterpieces, some failures, and lots of experiments.

Do we put the cockpit in front of the boiler or behind?

Do we make a walking steam engine? How about using gears instead of wheels?

Which gauge to pick for the rails?

We'll get there in software engineering as well. It will be very reliable and equally boring.

sorokod
There may be reasons for not calling software developers engineers but this is not one of them. The software "bridge" in this case has the following properties:

1. It can be rebuilt in a matter of hours by a single person.

2. The customers that ordered the bridge are also the people on the bridge and are totally fine with the bridge crashing on them from time to time.

ryandrake
#2 was the most frustrating thing (to me) about writing software, and ultimately what got me to move out of coding over to other areas of the business. This universal acceptance of low quality and crashing. You always have the classic "Schedule, cost, quality, pick 2" tradeoff, and 99 out of 100 places you'll work will throw quality under the bus when push comes to shove.

I remember the exact turning point. We had a super buggy Windows application that had tons of crashes in it. Instead of root causing each crash and fixing them, I was asked to simply write a launcher app that sat in the background, waited for the application to crash, then re-launch it. That was the great solution. And it was totally acceptable to the customer. Arghhhhh! I remember thinking: I didn't spend four years in university to shit this kind of finished product out.

RestlessMind
I guess the key reason why software developers are not engineers is that there is no barrier to entry. If a necessary requirement to build a bridge is to hire a certified Civil Engineer who has undergone the necessary ethics and reliability training, then you have no choice but to pay up for that certification cost. You can't outsource that work to Eastern EU or Asia or hire another one willing to "hack" the bridge to deliver it on your tight timeline.

On the other hand, if someone takes software reliability seriously, either they are going to burn themselves out to meet the crazy deadlines or are going to "not deliver on time", in which case they will be replaced by someone who can "get shit done" cheaper and faster.

fma
If you wanna design a bridge...you need an engineering background. If you wanna design a car...you need an engineering background.

If you want to design the software that designs the bridge or the car... you just need an 8-week bootcamp.

krzyk
You shouldn't compare bridge building to developing trading apps (one causes death, the other merely financial loss).

Compare bridge building to developing software for a pacemaker.

Now compare number of failures in both cases.

You get what you pay for, if you pay for developer you'll get a developer.

qayxc
> You get what you pay for, if you pay for developer you'll get a developer.

That's way too oversimplified. You can be as good and as thorough as you want, but if the root of the problem is that software is seen as a cost not as an integral part of the solution, you get bad results.

This often starts with unclear/vague requirements that change every other day as new information and understanding is gained.

But it doesn't stop there - unrealistic deadlines, lack of defined processes and quality control as well as disregard (and refusal to budget and schedule) for background tasks (documentation, refactoring, ...) are also contributing factors that can turn even the best and most diligent software developer into a messy code cowboy.

If gaming the system is rewarded more or actually doing a good job is even penalised, why do a good job?

julienfr112
I remember reading about bad software in a radiotherapy machine causing many deaths, and also about software in Toyota cars failing and likewise causing deaths because the code was a huge mess. In the end, there is no such thing as a clear separation between "engineers" and "developers".
ku-man
There is definitely one. Engineers need to write technical and ethical exams, and need to have years of accountable experience in order to become engineers.

To become a software developer there are no official requirements whatsoever. This is why it is wrong to call software developers engineers.

Whenever asked I always say I am a software developer.

pps43
You can convert between financial loss and loss of human life using implied cost of averting a fatality [1], which is about $10 million.

[1] https://en.wikipedia.org/wiki/Value_of_life

angry_octet
That is not a bidirectional transform. A financial loss in an investment fund just means that other traders made more profit.
astral303
The Bridge analogy is often applied with flaws. Bridges are not free to build. Software essentially is.

Here is a 1992 truth bomb rewind about it:

https://www.developerdotstar.com/printable/mag/articles/reev...

Basically engineers care about the quality of the code because they know that is the design. As soon as caring about codebase quality leaves the building, so will the quality of the product.

criddell
A civil engineer will put their stamp on the design of a bridge certifying that it will not fail. If it does fail, they will be held responsible and may lose their license to practice.

Software engineers often hide behind a contract saying they aren't responsible for anything.

astral303
Software engineers are held responsible for plenty of things (HIPAA) when there is a will to do so.
angry_octet
He is a qualified elec eng though, he's just found a non-engineering job that pays a lot more.

Software engineering is real and does require you to have a real understanding of risk (including through formal methods, model-based design, systematic testing, etc.) in regard to the consequences. If you work in a more regulated industry (e.g. railways, aviation, automotive) these things are taken more seriously. This is why things like 'partial autonomy' driving need more regulation -- no one is requiring Tesla to have a Chief Engineer sign off on anything meeting any standard.

RHSeeger
There are _plenty_ of places in engineering where defects / quality / etc are lesser concerns, just like there are places in software development where they're a major concern. There's also a _lot_ more call for quality up high (ie, regulations) for the various engineering disciplines.

I would expect that there are plenty of engineers that look at faults in the systems they're working on and say, "management doesn't care, not my problem".

firethief
If software developers aren't engineers, why are we judging this person's engineering? I think this case is an argument that software developers are engineers, even when their employer fails to see it that way
ku-man
Because this person is not doing engineering, this person is writing code.
gbin
On the opposite end you have planned obsolescence, for example, which is definitely an engineering art. It takes talent and effort to make things fail.
2OEH8eoCRo0
Sounds like defense software. Feature A is blatantly obvious but there is no "shall" requirement for it.
MrDresden
Except those who go through university level software engineering or computer science degrees
erikerikson
I would suggest that some in the field are developers while others are engineers. The former write some code that does what they tested it for; the latter design code to account for every detail. The former think it does the job; the latter know. The former rush through, while the latter code in ways that avoid redundant busywork, to allow for the higher up-front investment. The former slow down over time as they slog through the "ball of mud" they call inevitable, while the latter increasingly gain and share insight into the deeper problems being solved and the opportunities afoot.

TL;DR: I think the extent to which we are engineers is a choice we individually make.

burnthrow
Nice flamebait.
derefr
I think it’s not that software engineers aren’t engineers; but more that they’re most equivalent to combat engineers — engineers operating under time-pressure and shipping-pressure, where the systems they build need to “work” (in the sense of a bridge getting people across a ravine) but may only need to work once; and where it may be perfectly fine/expected/required to need to “baby” the resulting engineered system along (i.e. letting a batch of people over the bridge, then closing it to traffic, examining the piles for faults, and shoring up the ones that are slipping; then reopening the bridge, in a cycle.)

It’s not that this type of engineering is “not engineering”; it’s that it’s engineering where the engineer themselves (and their ability to actively re-stabilize the system) is considered a load-bearing element in the design, such that the system will very likely fall apart once that element is removed.

Combat engineering is still engineering, in the sense that there are still tolerances to be achieved, and it’s still a disaster if the bridge falls over while it’s in active use for the mission it was built for. It’s just not considered a problem if it falls over later, once that mission is accomplished.

dnautics
you mean they're more like train engineers, or maybe, to be charitable, like Geordi La Forge engineers: someone who runs an engine.
First of all, it leaves the team behind if you use features that are too advanced. Makes collaboration hard if you have wildly diverging skills (in both ways).

Also, it did in our case take you away so far from the bare metal, that the code was elegant, but slow. It did not play well with our static allocators and allocated/deallocated way too much, especially temporaries.

You might, for example, check out talks like [0].

[0] https://m.youtube.com/watch?v=NH1Tta7purM

deng
I agree. Another thing to consider is portability. You might have to port your software to obscure platforms with bad C++ compilers where compiler bugs are not uncommon at all, so it's better to stay clear of more advanced features. Also, you might have a legacy system which does not have modern C++ compilers, so you might have to restrict yourself to C++03. Another thing is code size, so templates should be used judiciously.
dmurray
Neither of these replies sounds like it describes someone with "excellent" C++ skills. It sounds more like someone who knows a lot of advanced features of the language but doesn't make the correct tradeoffs considering portability and/or performance.

Maybe it could be that he didn't know all the requirements up front and his mediocre colleague accidentally wrote code that better suited the company's unstated requirements (e.g. portability to niche systems). But that's hardly the most likely explanation.

jeffreygoesto
Ah, ok. In my head I do have a distinction between excellent C++ and excellent domain skills. So that person had excellent C++ skills but did not properly choose which C++ subset to use for the domain. From your comment I think you would say "excellent" implies "in all relevant aspects", whereas I just meant the pure programming language. What I tried to write in my first comment is that it is more and more difficult to find both combined in one person.
Basically Java, .NET and C++, with heavy focus on C++.

Being able to write allocation free algorithms, even on GC languages, lock free data structures and good knowledge of all multi-core programming paradigms and distributed computing.

Here are some talks that will give you a small overview into that world,

CppCon 2017: Carl Cook “When a Microsecond Is an Eternity: High Performance Trading Systems in C++”

https://www.youtube.com/watch?v=NH1Tta7purM

Core C++ 2019 :: Nimrod Sapir :: High Frequency Trading and Ultra Low Latency development techniques

https://www.youtube.com/watch?v=_0aU8S-hFQI

Open source Java HFT code from Chronicle Software, https://github.com/OpenHFT

"Writing and Testing High-Frequency Trading Engines", from Cliff Click

https://www.youtube.com/watch?v=iINk7x44MmM

However in these domains every ms counts, even how long cables are, so also expect Verilog, VHDL, and plenty of custom made algorithms running directly on hardware.

"A Low-Latency Library in FPGA Hardware for High-Frequency Trading"

https://www.youtube.com/watch?v=nXFcM1pGOIE

You can get an overview of the typical expectations here, https://www.efinancialcareers.com/

mraza007
Thanks for sharing all this
willcipriano
> allocation free algorithms

Do you have any good sources for information about this concept? I'm having trouble understanding how an algorithm could produce useful work without allocating anything.

scott00
The goal is to prevent garbage collection from happening during fast path execution. So the program doesn't need to be literally allocation free, just free of heap allocations during the steady state of the program. So heap allocations during the initialization phase are okay, as are stack allocations at any time. And the no heap allocations during steady state isn't as bad as it sounds, as using an object pool that has preallocated objects in it is usually a reasonable replacement.
phyalow
I used to write Java code which didn't have any objects, just byte arrays. It would never call GC and ran lightning fast.
non-entity
Is there any particular reason you were using Java then? Seems like you could get the same effect with much more suitable languages.
pgwhalen
The general thinking is that 99% of code at even an HFT doesn’t have to look that funny, so it might make sense to write funny looking java for the 1% that does so you don’t have to fragment your codebase and talent pool.
pjmlp
Tooling, productivity, developer pool.

Usually these are very niche use cases that can be packed into a special purpose library.

Check the openHFT repository from my original post.

aliceryhl
An algorithm might be: Take an array of integers as input and return their sum. Returning or adding integers does not require allocating memory.
Frost1x
I presume you mean allocating new memory. You could sum an array and keep track of state/store cumulative sum in the existing array i.e. in-place algorithmic approaches: https://en.m.wikipedia.org/wiki/In-place_algorithm

You still have to have memory and use it though, somewhere.

aliceryhl
Yes of course. If the algorithm operates on existing allocations, then it doesn't allocate any memory. Similarly the algorithm might operate on a fixed set of integers, which would not require allocations at all, even in Java.
imglorp
Here's a good one by Fowler. They were trying to saturate a data channel with many simultaneous writers and didn't want locking or allocation. They ended up using a static ring buffer.

https://martinfowler.com/articles/lmax.html

xiphias2
You can learn Rust to get a good general understanding of implementing everything without allocations. The knowledge later transfers to other languages (though you'll miss the guarantees that Rust provides).
dooglius
Startup/initialization performance generally doesn't matter because that can be done outside of trading hours, so allocation is fine there; the important thing is performance after that.
FpUser
Never mind algo. I write decent size firmware every once in a while and the whole thing is allocation free.
mraza007
Where can I find OpenHFT?
pgwhalen
OpenHFT is actually known as "Chronicle" now. Chronicle used to refer to a specific library under the OpenHFT umbrella, but a few years ago it was all rebranded to be Chronicle.

https://chronicle.software/

zarkov99
You use the stack or pre-allocate memory and work with that budget. You can also use custom allocators.
jacoblambda
Not an expert by any means but another way to look at "allocation free" algorithms is by looking at them as "statically allocated".

They don't allocate any new memory, but they do perform operations on an input and write it to an output. At least in the embedded world, this is often how it's done. Considering the overlap between embedded software and HFT, I'd imagine most of these algorithms are going to be something along the lines of "Take in a massive but fixed-size block of read-only data and output a fixed-size result." You can have allocated memory, but it needs to be allocated at startup and be able to be consistently reused without reallocation.

If you want super low latency, the last thing you want is a memory allocator blocking in the middle of your highly optimised algorithm. Instead you allocate everything ahead of time and just reuse the memory over and over again.

realtalk_sp
Exactly this. It's annoying but also not as intractable as it sounds. NASA has a similar requirement for mission critical software operating under strict constraints.
Enough theory,

"Scientific Computing: C++ Versus Fortran" (1997)

https://www.drdobbs.com/cpp/scientific-computing-c-versus-fo...

"Micro-Optimisation in C++: HFT and Beyond"

http://research.ma.cx/NDCTechTown_2017_JMMcG_v1_1.pdf

"The Speed Game: Automated Trading Systems in C++"

https://www.youtube.com/watch?v=ulOLGX3HNCI

"When a Microsecond Is an Eternity: High Performance Trading Systems in C++"

https://www.youtube.com/watch?v=NH1Tta7purM

mratsim
It might be the Internet and the difficulty of communicating emotion across it, but you sound quite worked up about this issue.

Anyway, I stand by what I say and I'm backed by my high performance code:

- Writing matrix multiplication that is as fast as Assembly, complete with analysis and control on register allocations, L1 and L2 cache tiling and avoiding TLB cache miss:

- https://github.com/numforge/laser/blob/master/laser/primitiv...

- Code, including caveat about hyperthreading: https://github.com/numforge/laser/blob/master/laser/primitiv...

- The code is all pure Nim and is as fast as or faster than OpenBLAS when multithreaded; caveat: the single-threaded kernels are slightly slower, but it scales better on multiple cores.

- I've also written my own multithreading runtime. It scales better and has lower overhead than Intel TBB. There is no constexpr; you need type erasure to handle everything people can use a multithreading runtime for. Same comparison on GEMM: https://github.com/mratsim/weave/tree/v0.4.0/benchmarks/matm...

- More resources on the importance of memory bandwidth: optimization convolutions https://github.com/numforge/laser/wiki/Convolution-optimisat...

- Optimizing matrix multiplication on GPUs: https://github.com/NervanaSystems/maxas/wiki/SGEMM, again it's all about memory and caches optimization

- Let's switch to another domain with critical perf needs: cryptography. Even when the bounds of iterating on a bigint are known at compile time, compilers are very bad at producing optimized code; see GCC vs Clang https://gcc.godbolt.org/z/2h768y

- And crypto is the one domain where integer templates are very useful, since you know the bounds.

- Another domain? VM interpretation. The slowness there comes from function-call overhead and/or switch dispatching, and from not properly using the hardware prefetchers. Same thing: C++ constexpr doesn't help, since the problem is lower-level. See: https://github.com/status-im/nimbus/wiki/Interpreter-optimiz...
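To make the interpreter-dispatch point concrete, here is a minimal sketch with a hypothetical three-opcode stack bytecode (this is an illustration, not code from the linked resources). The switch at the top of the loop funnels every instruction through one shared, hard-to-predict indirect branch; that per-instruction dispatch cost is the overhead referred to above, and no amount of constexpr removes it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 3-opcode stack bytecode, purely for illustration.
enum Op : std::uint8_t { PUSH, ADD, HALT };

// Classic switch dispatch: every instruction re-enters the same
// indirect branch, which real-world bytecode makes hard to predict.
// Computed-goto or tail-call dispatch gives each opcode its own
// branch site, which is why those techniques pay off here.
int run(const std::vector<std::uint8_t>& code) {
    std::vector<int> stack;
    std::size_t pc = 0;
    for (;;) {
        switch (static_cast<Op>(code[pc++])) {
        case PUSH:                        // push the next byte as an immediate
            stack.push_back(code[pc++]);
            break;
        case ADD: {                       // pop two values, push their sum
            int b = stack.back();
            stack.pop_back();
            stack.back() += b;
            break;
        }
        case HALT:                        // return the top of the stack
            return stack.back();
        }
    }
}
```

The semantics are trivial on purpose: the interesting part is the dispatch structure, not the opcodes.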

Also, all the polyhedral research and deep-learning compiler research, including Halide, Taichi, Tiramisu, Legion, and DaCE, confirms that memory is the big bottleneck.

Now, since you want to stop with the theory and you mentioned HPC, pick your algorithm: matrix multiplication, QR decomposition, Cholesky, ... Any fast C++ code (or C, or Fortran, or assembly) that you find will be fast because of careful memory layout and careful use of all levels of cache, not because of constexpr.

If you have your own library in one of those domains, I would also be very happy to have a look.

As a simple example, take an out-of-place matrix transposition kernel. Show me how you would use constexpr and template metaprogramming to speed it up. Here is a detailed analysis of the impact of 1D and 2D tiling: https://github.com/numforge/laser/blob/master/benchmarks/tra... Throughput can be increased 4x with proper use of the memory caches.
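For readers who want the shape of the challenge, here is a minimal sketch of the naive kernel next to a 2D-tiled variant (an illustration, not the benchmarked laser code; the tile size of 32 is an assumed value that would be tuned to the target CPU's caches). Both functions compute the same result; only the iteration order differs, which is exactly the point: the win comes from the memory access pattern, not from any compile-time machinery.

```cpp
#include <algorithm>
#include <cstddef>

// Naive out-of-place transpose: reads are sequential, but writes
// stride through dst by `rows` floats, so stores miss the cache
// constantly once the matrix no longer fits in L1/L2.
void transpose_naive(const float* src, float* dst,
                     std::size_t rows, std::size_t cols) {
    for (std::size_t i = 0; i < rows; ++i)
        for (std::size_t j = 0; j < cols; ++j)
            dst[j * rows + i] = src[i * cols + j];
}

// 2D-tiled transpose: process TILE x TILE blocks so that both the
// sequential reads and the strided writes stay within a small,
// cache-resident working set.
void transpose_tiled(const float* src, float* dst,
                     std::size_t rows, std::size_t cols) {
    constexpr std::size_t TILE = 32;  // assumed; tuned per cache in practice
    for (std::size_t ii = 0; ii < rows; ii += TILE)
        for (std::size_t jj = 0; jj < cols; jj += TILE) {
            const std::size_t iend = std::min(ii + TILE, rows);
            const std::size_t jend = std::min(jj + TILE, cols);
            for (std::size_t i = ii; i < iend; ++i)
                for (std::size_t j = jj; j < jend; ++j)
                    dst[j * rows + i] = src[i * cols + j];
        }
}
```

The tiled version adds nothing the type system can see: it is the same arithmetic reordered for locality, which is why the linked benchmark attributes the 4x throughput gain to cache behavior alone.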

pjmlp
Ah, so now the opinions of experts in the matter don't count unless I prove it myself?

I guess that is why NVidia has spent 10 years doing hardware design to optimize their cards for C++ execution.

Apparently that was wasted money, they should have kept using C.

mratsim
I mentioned theory and experts; you said enough theory.

I switched to practical applications and walked the talk by showing my code, and then you backed off and wanted to go back to opinions.

I see now that you want me to back myself up with experts, since reproducible code and runnable benchmarks are not enough.

Apparently you recognize Nvidia as an expert, so let's talk about CuDNN, where optimizing convolutions is all about memory layout (source: https://github.com/soumith/convnet-benchmarks/issues/93#issu...) and not about C vs C++ vs PTX.

Or let's hear what Nvidia says about optimizing GEMM: https://github.com/NVIDIA/cutlass/blob/master/media/docs/eff...; it's all about memory locality and tiling.

Or maybe Stanford, the US government, and Nvidia Research are also wrong to pour significant research into Legion? https://legion.stanford.edu/

> Legion is a data-centric parallel programming system for writing portable high performance programs targeted at distributed heterogeneous architectures. Legion presents abstractions which allow programmers to describe properties of program data (e.g. independence, locality). By making the Legion programming system aware of the structure of program data, it can automate many of the tedious tasks programmers currently face, including correctly extracting task- and data-level parallelism and moving data around complex memory hierarchies. A novel mapping interface provides explicit programmer controlled placement of data in the memory hierarchy and assignment of tasks to processors in a way that is orthogonal to correctness, thereby enabling easy porting and tuning of Legion applications to new architectures.

Are you saying they should have just called it a day once they were done with C++?

Or you can read the DaCE paper on how to beat CuBLAS and CuDNN: https://arxiv.org/pdf/1902.10345.pdf; it's all about data movement. In Section 6.4 (Case Study III: Quantum Transport, used to optimize transistor heat dissipation), Nvidia's strided matrix multiplication was improved upon by over 30%, and that part is pure assembly; the improvement came from better utilizing the hardware caches.

pjmlp
Nah, I was answering the whole "C vs C++" issue.

But then, since you saw it was a losing battle going down that path, you pulled the hardware rabbit out of the magician's hat.

So we moved from the assertion that C++ is not faster than C, to memory layouts, hardware design, and data representation.

Now you are even asserting that it's not about C vs C++ vs PTX, and going down the quantum transport lane?

Yeah, whatever.

mratsim
Obviously you didn't read what I posted. Quantum transport is a compute-intensive physics problem with a lot of optimization research behind it. One of the main bottlenecks in solving it is strided matrix multiplication.

There is no C vs C++ issue. You keep saying that constexpr and template metaprogramming matter in high-performance computing and GPGPU; I have given you links, benchmarks, and actual code showing that what makes the difference is memory locality.

Ergo, as long as your language is low-level enough to control that locality, be it C, C++, Fortran, Rust, Nim, Zig, ... you can achieve speedups of several orders of magnitude, and doing so is absolutely required for high performance.

Constexpr and template metaprogramming don't matter in high-performance computing. Prove me wrong: walk the talk, don't drink the Kool-Aid.

There are plenty of well-studied computation kernels you can use: matrix multiplication, convolution, ray tracing, recurrent neural networks, Laplacian, video encoding, Cholesky decomposition, Gaussian filter, Jacobi, Heat, Gauss-Seidel, ...

> I can imagine few cases where first-to-right-the-bell performance on a single core determines if you get a specific quote in HFT but that's that.

That is actually the case, from what I've heard. A lot of them buy consumer chips, then disable all but one core and overclock it to the max.

Here is a guy from Optiver talking about their process at CppCon: https://www.youtube.com/watch?v=NH1Tta7purM

softawre
What a great talk, thanks for sharing!

Thanks for the praise, and for the insight from your experience. Sounds like a great gig!

I looked you up and see that you likely work exclusively in Haskell, but thought you might derive some value from Carl Cook's presentation on optimizing HFT code at the most recent CppCon. I recognize the languages are distinctly different, but he makes some interesting theoretical points along the way that might be advantageous to you.

Here's a link: https://www.youtube.com/watch?v=NH1Tta7purM&feature=youtu.be

Also, I can sympathize with your plight to run scalable sims. I've done some work in bioinformatics, and am building a small cluster at home so that I can learn to write code that'll scale to more massively parallel systems that are de rigueur in that domain.

Cheers!

carterschonwald
I’ll have a gander! Thx for the link. I do code in other stuff when it makes sense ;)

Yeah building stuff that works easily in the small and sanely in the large is a fun challenge. Also hard.

Possibly because of your original comment, or perhaps unrelatedly, I've lately been saying "finance done right is the lock-free, wait-free scheduling algorithm for moving society's resources around"; all the other stuff folks think of as finance is really just icing and fancy wrapping on top of that core truth.

Oct 09, 2017 · 2 points, 0 comments · submitted by dpc94
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.