HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Inside the AMD Microcode ROM

media.ccc.de · 170 HN points · 2 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention media.ccc.de's video "Inside the AMD Microcode ROM".
media.ccc.de Summary
Microcode runs in most modern CPUs and translates the outer instruction set (e.g. x86) into a simpler form (usually a RISC architecture)....

Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
Other links and abstract for this talk are available on the CCC media site: https://media.ccc.de/v/35c3-9614-inside_the_amd_microcode_ro...
I don't see a difference between RYF-ness of CPU microcode and SSD firmware. The main practical difference in the context of TFA is that distros typically bundle CPU microcode but not SSD firmware, hence the question is unlikely to come up in that context for SSD firmware, but that doesn't mean that it's not a problem.

Btw, the AMD CPU microcode is fused into the CPU, otherwise it wouldn't boot; there's an additional SRAM area where the microcode update is applied; for details see https://media.ccc.de/v/35c3-9614-inside_the_amd_microcode_ro...

To elaborate on your last point, I'd always use software encryption with SSDs, because with the opaque firmware wear-leveling it's essentially impossible to be sure that anything is actually physically deleted from SSDs, and if it isn't physically deleted it can be read by a custom firmware.

mindslight
I was referring to AMD's graphics cards. They have a binary firmware blob that basically must be loaded, but the loading itself is done by Free software. I don't see how this is any different from a RYF perspective than if AMD had put another flash chip on the BOM for storing it cold. That is, assuming a competent signature scheme for both.

So the card itself doesn't RYF, but it can be used to display the output from a RYF computer. And unfortunately barring a better graphics option based on open firmware, these are the compromises we have to make. RMS recognizes this - I just think the manner in which he framed the compromise is a bit unnuanced and out of date. Rather than finding reasons to ignore least-worst blobs, we should be talking about boundaries between Free/non-free components.

SSDs are an interesting case, because the hard drive interface abstraction is so simple and longstanding, we just kind of assume it's a good boundary. But if you want to pop back into abstract Freedom land, imagine what the market would look like if vendors weren't able to market around decommoditizing software features. For example, if the FTL were done by Free software (perhaps on the main CPU), there would be no worries about certain lines of drives getting corrupted due to power failures!

(And yes, totally agree about FDE. I actually just changed my router back to being a general purpose Linux box, and it felt quite odd installing that with no FDE).

the_why_of_y
I'm sorry, no idea how I managed to jump from AMD GPUs to AMD CPUs :-)
Dec 28, 2018 · 170 points, 36 comments · submitted by DyslexicAtheist
snovv_crash
I wonder whether it would be possible to add aftermarket AVX-512 instruction handling? Not for performance necessarily, but for compatibility.
TazeTSchnitzel
The tricky thing there would be those aren't just new instructions, they extend existing registers and add new ones. Where do you fit the extra bytes?
en4bz
AMD already implements 256bit operations in terms of 2 128bit operations. Seems like going to 4 wouldn't be a stretch, at least for some of the simpler operations. Seems like a subset of operations would be possible.
bayindirh
I think Zen2 implements 256 bit instructions natively [0]. For AVX512, the new instructions [1] rather than the floating point arithmetic will be a problem IMHO. Emulating them with the microcode will be expensive and will provide no performance gains.

[0]: https://en.wikichip.org/wiki/amd/microarchitectures/zen_2

[1]: https://en.wikipedia.org/wiki/AVX-512

TazeTSchnitzel
> AMD already implements 256bit operations in terms of 2 128bit operations.

Current Zen has actual 256-bit registers, it just doesn't have the execution units to process the whole register at once. It's not really the same thing.
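
As a rough illustration of the distinction drawn above (a full-width register whose contents are processed in two narrower passes), here is a toy Python model. It is only a sketch: the lane width, the function names add_128 and add_256_split, and the list-of-integers register representation are assumptions for illustration, not a description of AMD's actual hardware.

```python
MASK32 = 0xFFFFFFFF  # each lane is modelled as an unsigned 32-bit integer


def add_128(a, b):
    """Add four 32-bit lanes with wraparound; stands in for one 128-bit operation."""
    assert len(a) == len(b) == 4
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def add_256_split(a, b):
    """Add eight 32-bit lanes by issuing two independent 128-bit halves.

    The "register" (the 8-element list) is full width; only the execution
    is split, which is the point made in the comment above.
    """
    assert len(a) == len(b) == 8
    low = add_128(a[:4], b[:4])    # lower 128 bits
    high = add_128(a[4:], b[4:])   # upper 128 bits
    return low + high
```

Lane-wise operations split cleanly like this because no lane depends on another; operations that move data across lanes would not decompose as simply.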

ah-
Unlikely, IIRC you only had something on the order of 32 three-instruction slots of memory.
rzzzt
I guess it would be easier to do that with virtualization. Advertise the capability to the VM, catch illegal instructions, emulate the missing pieces.
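
The trap-and-emulate idea described above can be sketched with a toy model. Everything here is hypothetical: the opcode names, the IllegalInstruction exception, and the hypervisor_run loop are stand-ins for a real hypervisor's illegal-instruction exit handling, which is far more involved.

```python
class IllegalInstruction(Exception):
    """Raised by the toy CPU for an opcode it does not implement."""


# Opcodes the toy CPU executes directly.
NATIVE_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
}

# Hypothetical software fallbacks for opcodes the toy CPU lacks.
EMULATED_OPS = {
    "fma": lambda a, b, c: a * b + c,
}


def cpu_execute(op, *args):
    """Run one instruction natively, or fault if the opcode is unknown."""
    if op not in NATIVE_OPS:
        raise IllegalInstruction(op)
    return NATIVE_OPS[op](*args)


def hypervisor_run(program):
    """Run a program, catching faults and emulating the missing opcodes."""
    results = []
    for op, *args in program:
        try:
            results.append(cpu_execute(op, *args))
        except IllegalInstruction:
            if op not in EMULATED_OPS:
                raise  # nothing can emulate it, so the guest sees the fault
            results.append(EMULATED_OPS[op](*args))
    return results
```

For example, hypervisor_run([("add", 1, 2), ("fma", 2, 3, 4)]) runs the first instruction natively and the second through the fallback table.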
ip26
We have robust support for alternate code paths based on CPU flags after many years of this kind of thing. Is that really necessary or useful?

Custom microcode handling is a lot more brittle and chip-specific, yet nearly equivalent to overloading your call to some AVX-512 op in software.
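
A minimal sketch of the software-side alternative mentioned above, choosing a code path from CPU flags. It assumes Linux on x86, where /proc/cpuinfo has a "flags" line and AVX-512 Foundation appears as "avx512f"; the functions sum_generic, sum_wide, and select_sum are hypothetical placeholders, and sum_wide only imitates a wide path by working in chunks.

```python
def read_cpu_flags(path="/proc/cpuinfo"):
    """Return the set of CPU feature flags, or an empty set if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return set(line.split(":", 1)[1].split())
    except OSError:
        pass  # not Linux, or the file is unreadable
    return set()


def sum_generic(values):
    """Baseline path that works everywhere."""
    return sum(values)


def sum_wide(values, width=16):
    """Placeholder for a wide-vector path: processes `width` items per step."""
    values = list(values)
    total = 0
    for i in range(0, len(values), width):
        total += sum(values[i:i + width])
    return total


def select_sum(cpu_flags):
    """Pick an implementation once, based on the advertised features."""
    return sum_wide if "avx512f" in cpu_flags else sum_generic
```

A caller would typically do the selection once at startup, for example best_sum = select_sum(read_cpu_flags()), and then call best_sum everywhere.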

raverbashing
You're right, it isn't needed

The "last time" this was done was in the x87 days, where if the math coprocessor wasn't installed you could trap the corresponding interrupt and handle it to emulate the instructions.

mschuster91
> The "last time" this was done was in the x87 days, where if the math coprocessor wasn't installed you could trap the corresponding interrupt and handle it to emulate the instructions.

Wasn't Hackintosh (or getting new OS X running on too old hardware) also using this technology to "support" CPUs without the newest SSEx instruction sets?

bpye
This is done for MIPS FP emulation too - https://www.linux-mips.org/wiki/Floating_point#The_Linux_ker...
cafxx
I wonder if it would be possible to dump, from microcode, the contents of the microcode ROM. This would neatly sidestep the problems inherent in decoding the ROM contents from pictures of decapped chips.
choonway
Is it possible to hack the microcode so that it can run ARM assembly natively?
bayindirh
It would be hard, because the ISA is tightly bound to the underlying silicon's structure.

Some of the commands cannot be translated to the silicon effectively, or at all.

e.g.: MIPS have 64 x 64bit registers. You can use any of them as a source or a destination, however x86 always designates EAX as the ALU accumulator. This has some profound effects on silicon design.

gpderetta
> x86 always designates EAX as the ALU accumulator

Actually no. After decoding there is nothing special about the EAX register.

AMD at some point was going to release K10 which was basically Zen but with an ARM decoder. It got cancelled when Zen proved viable and AMD decided it was better to compete with Intel than all the ARM vendors.

floatboth
https://softiron.com/development-tools/overdrive-1000/

They actually "released" (to one manufacturer it seems) the Opteron A1100. With stock Cortex-A57 cores, not "Zen with an ARM decoder".

bayindirh
> Actually no. After decoding there is nothing special about the EAX register.

The microcode, or more specifically modern x86 processors, use register renaming to move things around, but the actual ASM instructions imply that the results should end up in the EAX register. You cannot arbitrarily do a MUL and get the result from EBX, for example [3]; i.e. x86 assembly dictates where the results end up.
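
To make the implicit-operand point concrete: the one-operand 32-bit MUL multiplies EAX by its source operand and writes the 64-bit product to the EDX:EAX pair. A small Python model of just that architectural behaviour follows; the function name mul32 is made up for illustration, and flags are not modelled.

```python
MASK32 = 0xFFFFFFFF


def mul32(eax, src):
    """Model one-operand unsigned MUL: EDX:EAX = EAX * src.

    Returns (edx, eax). The caller chooses only `src`; the destination
    registers are fixed by the instruction itself.
    """
    product = (eax & MASK32) * (src & MASK32)
    edx = (product >> 32) & MASK32  # high half of the 64-bit product
    new_eax = product & MASK32      # low half of the 64-bit product
    return edx, new_eax
```

Register renaming changes which physical register backs EAX at any moment, but not the fact that the instruction names EAX and EDX as its destinations.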

AMD played with two ideas: A pure ARM core, and a hybrid x86 core with ARM co-processor. The ARM core missed the performance targets [0], and they also abandoned the ARM accelerated x86 core [1], but I don't know why.

They never intended to go full TransMeta and transcode the x86 ASM into something proprietary or ARM.

Bonus: It seems they are still mulling the idea of an x86/ARM hybrid [2].

[0]: https://www.theregister.co.uk/2018/11/27/amazon_aws_graviton...

[1]: https://www.extremetech.com/computing/205078-amds-project-sk...

[2]: https://www.reddit.com/r/AMD_Stock/comments/8x4sba/the_retur...

[3]: https://c9x.me/x86/html/file_module_x86_id_210.html

pkaye
The instruction decoder that breaks up the variable length instruction set into micro-ops is likely hard coded.
monocasa
Nope. The chip is very much designed around x86 decoding, even before you get to the ucode ROM/RAM. Additionally, you only have a handful of patch RAM locations.
shmerl
Why is AMD microcode not open source to begin with?
jshap70
because there's a lot of proprietary stuff in microcode that's used for accelerations. gfx drivers too. it's the reason the closed amd drivers are so much faster than the open mesa ones.
monocasa
Mesa almost always uses proprietary firmware. The fail0verflow guys did some work last year to at least document it for the PS4's GPU to patch a bug. But the upstream Radeon Mesa guys are really hesitant to upstream it to avoid pissing off AMD. https://github.com/fail0verflow/radeon-tools/tree/master/f32

Of course that's all sorta orthogonal because that's all not really microcode or firmware in the classic definition, but just "code for an embedded processor I don't want to document."

shmerl
> it's the reason the closed amd drivers are so much faster than the open mesa ones.

On the contrary, Mesa is faster than their blob. AMD themselves are working on replacing blob with Mesa in the long term.

Firmware doesn't offer any acceleration advantages, it's used for different purposes.

jshap70
yeah... I don't know what numbers you're looking at but that's not true in the general case. and this isn't firmware, it's microcode. firmware is already on the chip. microcode is used so the os can take advantage of chip specific features, like security patches or even acceleration.
shmerl
They clearly said in the presentation, that microcode is a form of CPU firmware.
atq2119
Do you have actual benchmarks which show the closed source OpenGL driver significantly faster than the open source one? In Phoronix benchmarks I've seen, the open source driver beats the closed source one by a large margin.
jshap70
https://www.phoronix.com/scan.php?page=news_item&px=AMDGPU-P...
dralley
A lot has changed in the last two years. Nowadays you have an occasional game that is faster on the blob driver, but most are faster under Mesa, often significantly so.
shmerl
That's years ago and is outdated. Today Mesa beats the blob point blank, thanks to AMD themselves working on optimizing radeonsi.
anonymouzz
It's firmware. Very little firmware is open source. In an information-theoretic sense, it's much more surprising if some firmware is open source.
shmerl
I'm asking why. Is there some reason for them not to open it? AMD are quite positive about opening up other things, like GPU drivers for example. So why not firmware as well?

In the GPU case I know the reason - it's the DRM garbage (HDCP and Co.). Support for DRM dictates for them to keep it closed. But even there, they could provide alternative firmware without DRM, and make it open. But for CPU, there is no real reason it seems.

slededit
GPU vendors refused to open source their drivers and firmware long before HDCP was a thing.
shmerl
Things have changed for drivers. Not for firmware though, and DRM is the reason.
atq2119
According to a question at the end, this is about very old CPUs, K8/K10, because the newer ones authenticate microcode updates with public key cryptography which hasn't been broken. Still pretty amazing stuff.
loeg
Yeah, the description says "up to 2013." I think that's likely a bit more recent than K10 but I don't know.
TazeTSchnitzel
That's just the tail end of K10 production (2012 according to Wikipedia). Its successor, Bulldozer, came out in 2011, but a new architecture being out doesn't mean its predecessor immediately stops production.
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.