Hacker News Comments on
Inside the AMD Microcode ROM
media.ccc.de · 170 HN points · 2 HN comments
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this video.
Other links and the abstract for this talk are available on the CCC media site: https://media.ccc.de/v/35c3-9614-inside_the_amd_microcode_ro...
I don't see a difference between RYF-ness of CPU microcode and SSD firmware. The main practical difference in the context of TFA is that distros typically bundle CPU microcode but not SSD firmware, hence the question is unlikely to come up in that context for SSD firmware, but that doesn't mean that it's not a problem.
Btw, the AMD CPU microcode is fused into the CPU, otherwise it wouldn't boot; there's an additional SRAM area where the microcode update is applied; for details see https://media.ccc.de/v/35c3-9614-inside_the_amd_microcode_ro...
To elaborate on your last point, I'd always use software encryption with SSDs, because with the opaque firmware wear-leveling it's essentially impossible to be sure that anything is actually physically deleted from SSDs, and if it isn't physically deleted it can be read by a custom firmware.
⬐ mindslight I was referring to AMD's graphics cards. They have a binary firmware blob that basically must be loaded, but the loading itself is done by Free software. I don't see how this is any different from a RYF perspective than if AMD had put another flash chip on the BOM for storing it cold. That is, assuming a competent signature scheme for both.
So the card itself doesn't RYF, but it can be used to display the output from a RYF computer. And unfortunately barring a better graphics option based on open firmware, these are the compromises we have to make. RMS recognizes this - I just think the manner in which he framed the compromise is a bit unnuanced and out of date. Rather than finding reasons to ignore least-worst blobs, we should be talking about boundaries between Free/non-free components.
SSDs are an interesting case, because the hard drive interface abstraction is so simple and longstanding, we just kind of assume it's a good boundary. But if you want to pop back into abstract Freedom land, imagine what the market would look like if vendors weren't able to market around decommoditizing software features. For example, if the FTL were done by Free software (perhaps on the main CPU), there would be no worries about certain lines of drives getting corrupted due to power failures!
(And yes, totally agree about FDE. I actually just changed my router back to being a general purpose Linux box, and it felt quite odd installing that with no FDE).
⬐ the_why_of_y I'm sorry, no idea how I managed to jump from AMD GPUs to AMD CPUs :-)
⬐ snovv_crash I wonder whether it would be possible to add aftermarket AVX-512 instruction handling? Not for performance necessarily, but for compatibility.
⬐ TazeTSchnitzel ⬐ cafxx The tricky thing there is that those aren't just new instructions; they extend existing registers and add new ones. Where do you fit the extra bytes?
⬐ en4bz ⬐ ah- AMD already implements 256-bit operations in terms of two 128-bit operations. Going to four doesn't seem like a stretch, at least for some of the simpler operations, so a subset of operations seems possible.
⬐ bayindirh I think Zen2 implements 256-bit instructions natively [0]. For AVX512, the new instructions [1], rather than the floating point arithmetic, will be a problem IMHO. Emulating them with microcode will be expensive and will provide no performance gains.
[0]: https://en.wikichip.org/wiki/amd/microarchitectures/zen_2
⬐ TazeTSchnitzel > AMD already implements 256bit operations in terms of 2 128bit operations.
Current Zen has actual 256-bit registers, it just doesn't have the execution units to process the whole register at once. It's not really the same thing.
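To make the "256-bit operation as two 128-bit operations" idea above concrete, here is a toy sketch. It is only an illustration of the splitting principle, not a model of AMD's actual hardware or microcode: the function names `add_128` and `add_256_split` are made up for this example, and a vector is modelled as a plain list of 32-bit unsigned lanes.

```python
MASK32 = 0xFFFFFFFF  # each lane is a 32-bit unsigned integer


def add_128(a, b):
    """Lane-wise add of two 128-bit vectors (4 lanes of 32 bits each)."""
    if len(a) != 4 or len(b) != 4:
        raise ValueError("a 128-bit vector needs exactly 4 lanes")
    # Wrap around on overflow, as a packed integer add would.
    return [(x + y) & MASK32 for x, y in zip(a, b)]


def add_256_split(a, b):
    """Lane-wise add of two 256-bit vectors, done as two 128-bit halves."""
    if len(a) != 8 or len(b) != 8:
        raise ValueError("a 256-bit vector needs exactly 8 lanes")
    low = add_128(a[:4], b[:4])    # first 128-bit operation
    high = add_128(a[4:], b[4:])   # second 128-bit operation
    return low + high              # concatenate the two halves


if __name__ == "__main__":
    print(add_256_split([1] * 8, [2] * 8))
```

This works cleanly because a lane-wise add never needs data from the other half. Operations that move data across the 128-bit boundary do not split this simply, which fits the "at least for some of the simpler operations" caveat.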
Unlikely, IIRC you only had something on the order of 32 three-instruction slots of memory.
⬐ rzzzt I guess it would be easier to do that with virtualization. Advertise the capability to the VM, catch illegal instructions, emulate the missing pieces.
⬐ ip26 We have robust support for alternate code paths based on CPU flags after many years of this kind of thing. Is that really necessary or useful?
Custom microcode handling is a lot more brittle and chip-specific, yet nearly equivalent to overloading your call to some AVX-512 op in software.
⬐ raverbashing You're right, it isn't needed.
The "last time" this was done was in the x87 days, where if the math coprocessor wasn't installed you could trap the corresponding interrupt and handle it to emulate the instructions.
⬐ mschuster91 > The "last time" this was done was in the x87 days, where if the math coprocessor wasn't installed you could trap the corresponding interrupt and handle it to emulate the instructions.
Wasn't Hackintosh (or getting new OS X running on too old hardware) also using this technology to "support" CPUs without the newest SSEx instruction sets?
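The trap-and-emulate pattern described in the comments above (advertise the capability, catch the illegal instruction, emulate the missing piece) can be sketched with a toy interpreter. Everything here is hypothetical: the opcode names, the `NATIVE` and `EMULATED` tables, and the `IllegalInstruction` exception stand in for the real CPU fault and the OS or hypervisor handler, and are not taken from any actual kernel or hypervisor.

```python
class IllegalInstruction(Exception):
    """Stands in for the fault a CPU raises on an unsupported opcode."""

    def __init__(self, opcode, operands):
        super().__init__(f"illegal instruction: {opcode}")
        self.opcode = opcode
        self.operands = operands


# Opcodes the toy "hardware" supports directly.
NATIVE = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
}

# Opcodes the trap handler knows how to emulate in software (hypothetical).
EMULATED = {
    "fmul": lambda a, b: a * b,
}


def execute_native(opcode, operands):
    """Run one instruction on the toy hardware, faulting if it is unknown."""
    handler = NATIVE.get(opcode)
    if handler is None:
        raise IllegalInstruction(opcode, operands)
    return handler(*operands)


def run(program):
    """Execute a program, emulating trapped instructions where possible."""
    results = []
    for opcode, operands in program:
        try:
            results.append(execute_native(opcode, operands))
        except IllegalInstruction as trap:
            emulator = EMULATED.get(trap.opcode)
            if emulator is None:
                raise  # nothing can handle it: propagate the fault
            results.append(emulator(*trap.operands))
    return results


if __name__ == "__main__":
    print(run([("add", (1, 2)), ("fmul", (1.5, 2.0))]))
```

The program never learns that `fmul` was not handled natively; it only pays the cost of the trap, which is why this approach is usually about compatibility rather than performance.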
⬐ bpye This is done for MIPS FP emulation too - https://www.linux-mips.org/wiki/Floating_point#The_Linux_ker...
I wonder if it would be possible to dump, from microcode, the contents of the microcode ROM. This would neatly sidestep the problems inherent in decoding the ROM contents from pictures of decapped chips.
⬐ choonway Is it possible to hack the microcode so that it can run ARM assembly natively?
⬐ bayindirh ⬐ shmerl It would be hard, because the ISA is tightly bound to the underlying silicon's structure.
Some of the commands can be translated to the silicon only inefficiently, or not at all.
E.g., MIPS has 64 x 64-bit registers. You can use any of them as a source or a destination; x86, however, always designates EAX as the ALU accumulator. This has some profound effects on silicon design.
⬐ gpderetta ⬐ pkaye > x86 always designates EAX as the ALU accumulator
Actually no. After decoding there is nothing special in the EAX register.
AMD at some point was going to release K12, which was basically Zen but with an ARM decoder. It got cancelled when Zen proved viable and AMD decided it was better to compete with Intel than with all the ARM vendors.
⬐ floatboth https://softiron.com/development-tools/overdrive-1000/
They actually "released" (to one manufacturer it seems) the Opteron A1100. With stock Cortex-A57 cores, not "Zen with an ARM decoder".
⬐ bayindirh > Actually no. After decoding there is nothing special in the EAX register.
Modern x86 processors use register renaming to move things around, but the actual ASM commands imply that the results should end up in the EAX register. You cannot arbitrarily do a MUL and get the result from EBX, for example [3]. I.e., x86 assembly dictates where the results end up.
AMD played with two ideas: A pure ARM core, and a hybrid x86 core with ARM co-processor. The ARM core missed the performance targets [0], and they also abandoned the ARM accelerated x86 core [1], but I don't know why.
They never intended to go full TransMeta and transcode the x86 ASM into something proprietary or ARM.
Bonus: It seems they are still mulling the idea of an x86/ARM hybrid [2].
[0]: https://www.theregister.co.uk/2018/11/27/amazon_aws_graviton...
[1]: https://www.extremetech.com/computing/205078-amds-project-sk...
[2]: https://www.reddit.com/r/AMD_Stock/comments/8x4sba/the_retur...
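As a worked illustration of the MUL point in the comment above: the one-operand form of x86 `MUL` with a 32-bit operand multiplies EAX by the operand and writes the 64-bit product to the EDX:EAX pair, so the destination is implied by the instruction rather than chosen by the programmer. The sketch below models only that architectural behaviour; `mul32` and the register dictionary are made up for this example and say nothing about how the hardware implements it after decoding.

```python
MASK32 = 0xFFFFFFFF


def mul32(regs, operand):
    """Model one-operand MUL r/m32: EDX:EAX = EAX * operand (unsigned)."""
    product = (regs["eax"] & MASK32) * (operand & MASK32)
    regs["eax"] = product & MASK32          # low 32 bits of the product
    regs["edx"] = (product >> 32) & MASK32  # high 32 bits of the product
    return regs


if __name__ == "__main__":
    regs = {"eax": 0xFFFFFFFF, "ebx": 2, "edx": 0}
    mul32(regs, regs["ebx"])  # the equivalent of "mul ebx"
    print({name: hex(value) for name, value in regs.items()})
```

Note that EBX is only read here, never written: the result lands in EDX:EAX regardless of which register supplied the operand.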
The instruction decoder that breaks up the variable length instruction set into micro-ops is likely hard coded.
⬐ monocasa Nope. The chip is very much designed around x86 decoding, even before you get to the ucode ROM/RAM. Additionally, you only have a handful of patch RAM locations.
Why is AMD microcode not open source to begin with?
⬐ jshap70 ⬐ atq2119 because there's a lot of proprietary stuff in microcode that's used for accelerations. gfx drivers too. it's the reason the closed amd drivers are so much faster than the open mesa ones.
⬐ monocasa ⬐ anonymouzz Mesa almost always uses proprietary firmware. The fail0verflow guys did some work last year to at least document it for the PS4's GPU to patch a bug. But the upstream Radeon Mesa guys are really hesitant to upstream it to avoid pissing off AMD. https://github.com/fail0verflow/radeon-tools/tree/master/f32
Of course that's all sorta orthogonal because that's all not really microcode or firmware in the classic definition, but just "code for an embedded processor I don't want to document."
⬐ shmerl > it's the reason the closed amd drivers are so much faster than the open mesa ones.
On the contrary, Mesa is faster than their blob. AMD themselves are working on replacing the blob with Mesa in the long term.
Firmware doesn't offer any acceleration advantages, it's used for different purposes.
⬐ jshap70 yeah... I don't know what numbers you're looking at but that's not true in the general case. and this isn't firmware, it's microcode. firmware is already on the chip. microcode is used so the os can take advantage of chip specific features, like security patches or even acceleration.
⬐ shmerl ⬐ None They clearly said in the presentation that microcode is a form of CPU firmware.
⬐ atq2119 Do you have actual benchmarks which show the closed source OpenGL driver significantly faster than the open source one? In Phoronix benchmarks I've seen, the open source driver beats the closed source one by a large margin.
⬐ jshap70 https://www.phoronix.com/scan.php?page=news_item&px=AMDGPU-P...
⬐ dralley A lot has changed in the last two years. Nowadays you have an occasional game that is faster on the blob driver, but most are faster under Mesa, often significantly so.
⬐ shmerl That's years ago and is outdated. Today Mesa beats the blob point blank, thanks to AMD themselves working on optimizing radeonsi.
None It's firmware. Very little firmware is open source. In an information-theoretic sense, it would be much more surprising if some firmware were open source.
⬐ shmerl I'm asking why. Is there some reason for them not to open it? AMD are quite positive about opening up other things, like GPU drivers for example. So why not firmware as well?
In the GPU case I know the reason - it's the DRM garbage (HDCP and Co.). Support for DRM dictates for them to keep it closed. But even there, they could provide alternative firmware without DRM, and make it open. But for CPU, there is no real reason it seems.
⬐ slededit GPU vendors refused to open source their drivers and firmware long before HDCP was a thing.
⬐ shmerl Things have changed for drivers. Not for firmware though, and DRM is the reason.
According to a question at the end, this is about very old CPUs, K8/K10, because the newer ones authenticate microcode updates with public key cryptography which hasn't been broken. Still pretty amazing stuff.
⬐ loeg Yeah, the description says "up to 2013." I think that's likely a bit more recent than K10 but I don't know.
⬐ TazeTSchnitzel That's just the tail end of K10 production (2012 according to Wikipedia). Its successor, Bulldozer, came out in 2011, but a new architecture being out doesn't mean its predecessor immediately stops production.