Hacker News Comments on Breaking the x86 Instruction Set
Black Hat · Youtube · 190 HN points · 23 HN comments
Hacker News Stories and Comments
All the comments and stories posted to Hacker News that reference this video.
What are the implications of such hardware on user security?
Threat persistence, i.e. persistent rootkits, persistent remote monitoring if one has privs to tickle the CPU instructions that give access. It is just a matter of time before there are public tools for script kiddies to manage the OS within the CPU. Here [1] is an example of someone slowly making progress decoding all the undocumented instructions and here is a talk with a brief overview of the tools. [2] This is not specific to Intel's ME but it is the way people will eventually tame/exploit that beast in my opinion. There are more recent talks that get deeper into security rings in the CPU. This video [3] is more related to your question; it is not specific to ME, but the technique can be used to access ME and much more. If this appears too difficult and time consuming, just know that some folks out there have the documentation for the undocumented instructions.
You may find some tools on github and other public repos for disabling ME. Use with care. Test first on an identical system (same model, same firmware/BIOS version, same CPU model), as those tools can brick your CPU. As @rocket_surgeron stated, one can buy CPUs with ME disabled but that will not disable the "God Mode" referenced in [3].
[1] - https://github.com/xoreaxeaxeax
⬐ intelVISA
Got hired by Intel to shut up not long after that vid, Sandsifter hasn't been updated by Domas since 2017 iirc.
⬐ LinuxBender
That suggests to me that he was turning over rocks that were not expected to be turned over. I leave it to the open source community to fork his code and continue the work. If nothing else it shows me that there are a myriad of undocumented tools in the CPU that could be used by anyone that has the patience to enumerate them.
There are even more instructions: see this lecture on finding undocumented instructions https://www.youtube.com/watch?v=KrksBdWcZgQ
⬐ uberman
Highly recommend watching this as it was a great presentation. Thank you for posting this
And secret microcode / hidden instructions in every major x86 CPU, presumably for the NSA.
⬐ runnerup
If the downvotes are due to poor quality discussion, that's fair enough. If they're due to skepticism, here's his talk from the next year:
[0]: https://www.youtube.com/watch?v=jmTwlEh8L7g "DEF CON 26 - Christopher Domas - GOD MODE UNLOCKED Hardware Backdoors in redacted x86"
⬐ icedchai
I'm not skeptical they exist, I'm skeptical that they're "for the NSA."
I watched a super interesting Black Hat video on youtube that talked about discovering secret instructions on CPUs by iterating over each bit of opcodes until you get an illegal instruction, and thereby discovering if an opcode is valid or not.
He set up a room full of PCs running his code and had hardware to auto-reset them when they crashed.
I would be really interested to see how often malicious software has used undocumented opcodes that disassemblers interpret incorrectly, leading a security researcher down a rabbit hole while the actual opcode does something different. Like the 66e9xxxxxxxx and 66e8xxxxxxxx opcodes in x86_64 [1]
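The length mismatch behind the 66e9/66e8 case can be sketched in a few lines. This is only an illustration under an assumption: as I understand the sandsifter project's description, some disassemblers read the 0x66 prefix as shrinking the jump displacement to 16 bits, while some processors in 64-bit mode keep a 32-bit displacement. The helper name and the sample bytes are made up for this example.

```python
def next_instruction_offset(code: bytes, honor_66_prefix: bool) -> int:
    """Return the offset of the instruction that follows a leading 66 e8/e9.

    Hypothetical helper: it models only the displacement width, nothing else.
    """
    if code[0] != 0x66 or code[1] not in (0xE8, 0xE9):
        raise ValueError("expected a 66 e8 or 66 e9 instruction")
    displacement_bytes = 2 if honor_66_prefix else 4
    return 1 + 1 + displacement_bytes  # prefix + opcode + displacement


# 66 e9, four zero bytes, then two NOPs (0x90).
sample = bytes.fromhex("66e9000000009090")

as_rel16 = next_instruction_offset(sample, honor_66_prefix=True)
as_rel32 = next_instruction_offset(sample, honor_66_prefix=False)
print(as_rel16, as_rel32)
```

A tool that assumes the 16-bit form resumes decoding two bytes earlier than one that assumes the 32-bit form, so everything after the jump can be read differently from what actually executes.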
If my understanding is correct, Stuxnet incorporated bytecode for the PLCs in S7comm, a protocol that was not open at the time. Though this is different than including undocumented opcodes for the system being targeted directly.
How about undocumented instructions? Sandsifter[1] is an interesting project and the video from BlackHat[2] is a good watch. There's also a previous discussion of it on HN[3].
[1] https://github.com/Battelle/sandsifter
⬐ da_big_ghey
Yes, and sandsifter keeps finding more; the recent undocumented microcode-modifying instructions were found with it.
⬐ sandinmyjoints
Very cool project! Thanks for the pointer.
⬐ stevemk14ebr
Also see https://blog.can.ac/2021/03/22/speculating-x86-64-isa-with-o...
⬐ gbrown_
Ah cool, seems I missed that when it appeared on HN a few weeks back. Thanks for the pointer.
There was an interesting black hat talk about automatically finding undocumented instructions. [0]
Interesting comment on twitter to the instruction of the original post[1]
xoreaxeaxeax's videos about how to systematically parse the asm space of x86 really hit home to me how bad the encoding is.
"I thought you were saying that the use of microcode would complicate verification of the hardware."
I do think that microcode opens up the possibility of subtle injections to follow code/hardware paths you might not easily predict.
A really good talk on reverse engineering an ISA implementation: https://youtu.be/KrksBdWcZgQ
⬐ MaxBarraclough
Agreed, this is an issue with complexity more generally, but microcode seems like a particularly important case.
As far as I can tell from the software world though, it's pretty rare for anyone to try this kind of thing. When their software is Free and Open Source, some companies put telemetry in there (e.g. Visual Studio Code), but it's very rare for there to be anything this malicious hidden away in publicly viewable code.
The Linux world has collectively agreed to trust SELinux, for instance, despite that it originates from the NSA.
This was also linked elsewhere but a security researcher was able to identify a significant number of undocumented instructions: https://www.youtube.com/watch?v=KrksBdWcZgQ
Related talk about this from Blackhat a few years ago:
⬐ anon73044
Unfortunately Chris works for Intel now so I don't think he'll be giving any more of these talks in the future. (At least until his NDA expires)
⬐ kchoudhu
All good things get eaten by the majors eventually, it would seem.
Great project and write-up. I'm reminded of a couple other projects.
MC Hammer project for LLVM tests round-trip properties of the ARM assembler. http://llvm.org/devmtg/2012-04-12/Slides/Richard_Barton.pdf
Sandsifter x86 fuzzer. https://github.com/xoreaxeaxeax/sandsifter https://youtu.be/KrksBdWcZgQ
⬐ woodruffw
(Author here.)
Yep! Both sandsifter and MC Hammer provided inspiration for mishegos.
Watch this Blackhat 2017 talk and maybe it'll change your mind: https://www.youtube.com/watch?v=KrksBdWcZgQ
⬐ lawnchair_larry
Very familiar with it, nothing indicating a backdoor there.
There was a really good Blackhat talk a year or two ago in a sort of similar vein: they developed an approach to iterate and find undocumented opcodes.
The software they developed is called Sandsifter.
⬐ xelxebar
Oh yeah. That's Christopher Domas, the same guy that built a functioning compiler that only emits MOV instructions [0] [1].
His other talks are quite interesting, as well!
[0]: https://github.com/xoreaxeaxeax/movfuscator
[1]: https://www.youtube.com/watch?v=2VF_wPkiBJY
Breaking the x86 Instruction Set by Christopher Domas. https://youtu.be/KrksBdWcZgQ
I found it easy to follow and pretty entertaining.
Reveal talk at Blackhat showing this off: https://www.youtube.com/watch?v=KrksBdWcZgQ
⬐ artificial
Fascinating! Plus a very easy to follow presentation. Thanks for the link.
The talk is worth its weight in gold https://www.youtube.com/watch?v=KrksBdWcZgQ
Related talk (relevant part starts at 38:43) https://www.youtube.com/watch?v=KrksBdWcZgQ
Quote from the talk: I don't want to make it sound like the sky is falling. This was found on one very esoteric processor that is not used in widespread production. I think it's mostly interesting from an academic perspective that we have a tool that is able to find these kinds of things now.
I do also wonder if some speculative prediction / branching stuff can be controlled through undocumented CPU instructions: https://www.youtube.com/watch?v=KrksBdWcZgQ
my favorite were:
- How the reputation economy is creating data-driven conformity https://media.ccc.de/v/34c3-8797-social_cooling_-_big_data_s...
- DEF CON 25 (2017) - Weaponizing Machine Learning https://www.youtube.com/watch?v=wbRx18VZlYA&t=2121s
- BlackHat 2017: Breaking the x86 Instruction Set https://www.youtube.com/watch?v=KrksBdWcZgQ
One of my favourite talks I watched was "Low Cost Non-Invasive Biomedical Imaging - An Open Electrical Impedance Tomography Project" https://media.ccc.de/v/34c3-8948-low_cost_non-invasive_biome...
As it presented an interesting technique I'd never heard of before, along with an implementation.
Also I thought the 'Breaking the x86 Instruction Set' talk was extremely clever - https://www.youtube.com/watch?v=KrksBdWcZgQ
Data Breach Transparency
Social Security Number
- At a minimum, there should be a penalty that grows from the time the breach was learned to when they disclose it publicly.
- There should also be penalties for not being transparent about what exact data was leaked for what users.
User Data Rights
- SSN is similar to a password- you want to keep it hidden, and if it leaked, you should change it. However, we can't change it. Perhaps it should be considered more as a password?
Adoption from USA Nutrition Label
- People should know what personal data companies have on them. A good example of this is Equifax storing people's home addresses- this could be disclosed. On the other hand, it is probably fine to exclude other types of data, such as an advertiser storing your zip code- people probably don't care as much.
- Should people have a right to have certain kinds of data (e.g. SSN) removed from websites?
Technology Improvement
- Is it a good idea to mandate companies disclose the security they use? For example, at one time reddit had their passwords stored as plaintext and they got hacked. Disclosing basic security hygiene (e.g. password storage) somewhere standardized in the website would make it much less outrageous.
(edit trying to figure this formatting out)
- Certain technologies enable hackers more than others. SQL seems to enable a lot of hacking. Should we discourage it?
- Get rid of Intel ME technologies - https://schd.ws/hosted_files/osseu17/84/Replace%20UEFI%20with%20Linux.pdf
- Get rid of Intel hidden instructions - https://www.youtube.com/watch?v=KrksBdWcZgQ
- Get rid of Simon and Speck - https://www.reuters.com/article/us-cyber-standards-insight/distrustful-u-s-allies-force-spy-agency-to-back-down-in-encryption-fight-idUSKCN1BW0GV
- What is "best for National Security" is actually worst for our own. It feels like people don't have a democratic say in the right balance either.
⬐ thethirdone
> SSN is similar to a password- you want to keep it hidden, and if it leaked, you should change it. However, we can't change it. Perhaps it should be considered more as a password?
I assume you meant
> SSN is similar to a password- you want to keep it hidden, and if it leaked, you should change it. However, we can't change it. Perhaps it should be considered more as a username?
⬐ zilitor
Actually, yes. On second thought, that is a much better idea.
⬐ zilitor
Yeah that is a good point. Either way we need a universal password.
> I trust its disassembler (especially for mainstream languages) more than almost any other disassembler
Might I suggest Christopher Domas' Black Hat talk "Breaking the x86 ISA", along the way of which he demonstrates the limitations of all disassemblers out there, including IDA's :)
Talk: https://youtu.be/KrksBdWcZgQ
Slides: https://www.blackhat.com/docs/us-17/thursday/us-17-Domas-Bre...
Paper: https://www.blackhat.com/docs/us-17/thursday/us-17-Domas-Bre...
⬐ Lramseyer
"If your processor has an errata in it and you update the documentation to allow that errata, is it still an errata? I think it is, but apparently it's allowed in the newest version of the AMD manuals" [37:48]
I don't understand why more vendors don't do this (if anyone wants to comment to this, I would be interested to get another opinion.) While my experience is admittedly limited to obscure chips that require NDAs for access to the specs, I was always a little annoyed that almost every time there was not even a reference to the errata documentation that the vendor provided when a new version of the spec would come out.
Now in AMD's case, I would argue that they should be more clear that it's an errata, and mention that the updated spec differed from previous versions (which it alluded to by saying "This behavior is model-dependent".) Ultimately, the spec is THE document on how a user should expect the chip to behave. So sue me if I am blurring the lines between an errata and a mistake in the spec, but I just want my documentation to tell me what the chip does without having to refer to a dozen other secondary documents dang it!
⬐ userbinator ⬐ jacksonR
For that specific case (pagefault vs undefined instruction) I can see why the behaviour differs from the spec, since it's highly dependent on how the processor decodes each instruction; others have noticed similar things:
https://www.symantec.com/connect/blogs/x86-fetch-decode-anom...
Ah, this is the talk behind the sandsifter tool that was making the rounds a few weeks back. Nice to get the deeper picture.
github: https://github.com/xoreaxeaxeax/sandsifter
white paper: https://github.com/xoreaxeaxeax/sandsifter/blob/master/refer...
⬐ j_s ⬐ xk98qB
Sandsifter: find undocumented instructions and bugs on x86 CPU | https://news.ycombinator.com/item?id=14872418 (91 comments, July 2017)
heh, by the same guy who made that compiler that translates C into only mov instructions: https://github.com/xoreaxeaxeax/movfuscator
⬐ im3w1l
Obvious question: Are there programs with these instructions in the wild?
⬐ inetknght ⬐ dcomp
I would love to know the answer to that. Something tells me that we need to fix disassemblers before we can answer it though.
⬐ k__
Do they have any benefit over the valid instructions?
⬐ tyingq ⬐ AlyssaRowan
Has happened in the past...
Yes.
This kind of technique, and the exploitation of minor CPU errata, can be used to help differentiate processor models and steppings.
That in turn allows a currently widespread DRM system to download personalised portions of object code that rely on properties specific to the licensed hardware in order to execute properly, in an attempt to counter debugging, emulation and transfer - continuing a tradition practised in copy protection techniques since at least the 6502, maybe even earlier.
⬐ amelius
Is there a term for that, similar to "security by obscurity"?
⬐ Open-Sourcery
How does "Identification by exploitation" sound?
Can't wait to see what 2017's f00f bug equivalent is after it's released. Maybe I should just run his tunneling programs and not wait for the disclosure.
⬐ wyldfire ⬐ m00dy
The f00f bug is one that was discovered when cmpxchg didn't do what it was supposed to.
If this just searches the space looking for packets that shouldn't decode but end up getting executed, then it's unlikely to be anywhere as interesting as f00f.
In all likelihood we have already seen 2017's big silicon bug and it was AMD's Ryzen 7 1800X issue.
⬐ twiddlydee ⬐ anfractuosity
He says at the end he found a new f00f bug though
He did seem to indicate that was on an esoteric processor though so maybe it's not on an Intel/AMD chip.
I don't know much about CPU internals, but would it actually be possible to 'patch' that through updated microcode?
It's such a clever program, will be intrigued to see what else it can find!
⬐ NoneNone
To summarize: the guy built a random CPU instruction generator for x86. An instruction can be at most 15 bytes long, so the solution space is quite huge. He cut the solution space to 100k by generating candidates in a DFS-style fashion and validating them through CPU exceptions and flags. In the end, there's a kind of map-reduce-style distiller to analyse hidden and valid instructions.
Nice job though
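The depth-first idea in the summary above can be sketched against a made-up instruction set. This is a toy under stated assumptions, not sandsifter's code: `toy_oracle` is a hypothetical stand-in for running the bytes on real hardware and observing exceptions, the four-byte limit replaces the real 15-byte maximum, and the opcodes it accepts are invented for the example.

```python
from typing import Callable, List, Optional, Tuple

Outcome = Tuple[int, bool]  # (instruction length in bytes, executed without fault)


def toy_oracle(buf: bytes) -> Outcome:
    """Hypothetical stand-in for executing the bytes and watching CPU exceptions."""
    op = buf[0]
    if op == 0x90:
        return (1, True)          # one-byte instruction
    if op == 0xB0:
        return (2, True)          # opcode plus one operand byte
    if op == 0x0F:                # two-byte opcode escape
        if buf[1] == 0x0B:
            return (2, True)
        if buf[1] == 0xA7:
            return (3, True)      # pretend "hidden" instruction with one operand byte
        return (2, False)
    return (1, False)


def tunnel(oracle: Callable[[bytes], Outcome], max_len: int = 4) -> Tuple[List[bytes], int]:
    buf = bytearray(max_len)      # candidate bytes, starting at all zeros
    marker = 0                    # index of the byte currently being incremented
    previous: Optional[Outcome] = None
    found: List[bytes] = []
    probes = 0
    while True:
        outcome = oracle(bytes(buf))
        probes += 1
        if outcome != previous:
            # Behaviour changed: note it and move to the end of the new instruction.
            length, valid = outcome
            if valid:
                found.append(bytes(buf[:length]))
            marker = length - 1
            previous = outcome
        # Increment at the marker; on overflow, reset that byte and back up one.
        while marker >= 0 and buf[marker] == 0xFF:
            buf[marker] = 0
            marker -= 1
        if marker < 0:
            return found, probes
        buf[marker] += 1


found, probes = tunnel(toy_oracle)
print([f.hex() for f in found], probes)
```

Because only the last byte of the current instruction is varied, operand bytes cost 256 probes each instead of multiplying the search, which is why the probe count stays far below the 256**4 byte strings of that length. The sketch records an instruction only when the observed outcome changes, so it reports one representative per run of identical results.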
⬐ smegel
What's the bet the really secret instructions are hidden behind special conditional decode logic? I.e. the cpu won't even ask for the next byte if some register value is not set, possibly a secret register that first needs to be set via some other hidden instruction. Make that a sequence of 3 hidden instructions combined with arbitrary register and immediate values, and you won't get close to identifying them in a billion years.
I mean if you worked for Intel and your manager said "make me a really secret instruction" would your best response be "let's just not document it and hope no one notices"?
What I would give to read the full microcode of the latest Intel processor. I am guessing it is stored in a vault with the real nuclear codes, Alien cadavers and the Holy Grail.
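For a rough sense of scale on the "billion years" remark above, here is a back-of-the-envelope calculation. The probe rate and the choice of 64-bit magic values are assumptions picked only for illustration; nothing in the thread specifies them.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
PROBES_PER_SECOND = 1e9  # assumed rate, chosen only for illustration


def years_to_exhaust(bits: int, rate: float = PROBES_PER_SECOND) -> float:
    """Years needed to try every value of a `bits`-wide secret at `rate` tries per second."""
    return (2 ** bits) / rate / SECONDS_PER_YEAR


one_register = years_to_exhaust(64)          # a single 64-bit magic value
three_registers = years_to_exhaust(3 * 64)   # three independent 64-bit values
print(f"{one_register:.0f} years, {three_registers:.2e} years")
```

Under these assumptions a single gating value already takes centuries to brute-force, and chaining three of them pushes the figure far past a billion years.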
⬐ micheljones ⬐ userbinator
There are so many ways to implement backdoors in CPUs, even completely analog ones:
https://hackaday.com/2017/04/25/an-analog-charge-pump-fabric...
⬐ mickronome
Such decoding logic might actually be detectable by something like differential power analysis, though it could be excessively difficult if someone really wanted it hidden.
I suspect that really keeping it out of view would also cost both silicon and propagation delays in what would probably be some of the most critical paths, but then I'm not a vlsi engineer, or whatever the correct title would be :)
⬐ smegel ⬐ greenpenguin
When you say might...are there any case studies showing how the internals of a CPU can be exposed using this technique?
I'm not really qualified to answer this, but I suspect the instruction decoder(s?) would be decoupled from register state as much as possible (unless x86 is even weirder than I thought).
Given this, I suspect wiring in a path all the way from the relevant versions of the relevant registers might be quite expensive. Plus part of the decode logic now needs to block on a register value - so a timing based attack might find these.
More qualified comments welcome...
⬐ dfox ⬐ vardump
i386 instruction decoding at least partially depends on what descriptor is loaded into (shadow) CS. For example the effects of 0x66 prefix are reversed between 16b and 32b code.
You don't really need even that.
You just need a set of magic register values, like how CPUID [0] instruction already works.
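The 0x66 prefix behaviour mentioned above fits in a few lines. This is a minimal sketch that models only the operand-size toggle; the function name is made up for the example, and everything else that influences decoding is ignored.

```python
def effective_operand_size(default_bits: int, has_66_prefix: bool) -> int:
    """Operand size in bits, given the code segment's default size and the 0x66 prefix."""
    if default_bits not in (16, 32):
        raise ValueError("default operand size must be 16 or 32")
    if not has_66_prefix:
        return default_bits
    # The prefix selects the non-default size, so its effect flips with the mode.
    return 32 if default_bits == 16 else 16


for default in (16, 32):
    print(default,
          effective_operand_size(default, has_66_prefix=False),
          effective_operand_size(default, has_66_prefix=True))
```

The same byte sequence therefore decodes to different operand widths depending on state held outside the instruction stream, which is the point being made about decode logic depending on loaded descriptors.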
⬐ jevinskie
Yup, this was done in this year's USENIX with AMD microcode. See the exploits that check for magic register values at [0] and the paper at [1].
[0]: https://github.com/RUB-SysSec/Microcode/tree/master/updates
[1]: http://syssec.rub.de/research/publications/microcode-reversi...
In case anyone is wondering what the "hidden instructions" he found are, many of them are documented elsewhere:
IMHO the 1-byte opcode map has basically been completely explored and documented, perhaps with the exception of some of the x87 stuff. It's the 2-byte (0F xx) ones where things start to get really interesting.
- 0f0d/0-7 were all prefetch instructions, but probably behave like NOPs if not supported
- 0f18/0-7 are HINT_NOPs
- 0f{1a-1f} are also HINT_NOPs
- 0fae is a bunch of assorted instructions (FXSAVE, FXRSTOR, LDMXCSR, etc.)
- dbe0 is FNENI
- dbe1 is FNDISI
- df{c0-c7} are x87 ops
- f1 is ICEBP
- c0,c1,d0,d1,d2,d3 groups have a few aliases (SAL/SHL)
- f6/1 and f7/1 are aliases of f6/0 and f7/0
- 0f0f are 3DNow instructions and it wouldn't surprise me if there were many aliases there
- 0fa7 was briefly used for the IBTS instruction on the very earliest 386s and then CMPXCHG for the very earliest 486s (http://datasheets.chipdb.org/Intel/x86/486/Intel486.htm); perhaps VIA continued to use it for a CMPXCHG alias
⬐ twiddlydee
I think this is a bit of an oversimplification. I’m seeing some of these appear on sandpile.org, it does look like a lot of them are vestigial/legacy things. But, I think the implied meaning of “documented” for the presentation is “documented by the manufacturer”. Just because some of these have been reverse engineered by others doesn’t really make them “documented”, at least, not in the spirit of what the project is trying to find. He also points out that some of these (0f0d, 0f18, etc) were added to the documentation in the last year - the concern is that they were hidden for years and years before that. 0f0f and 0fa7 look like the most interesting to me - 0f0f is probably 3Dnow, but I can’t find any information on the gaps in the 3Dnow set; he says in the presentation 0fa7 is the via padlock instructions, I can’t find any references on the gaps in that range either.
⬐ userbinator
http://linux.via.com.tw/support/beginDownload.action?eleid=1...
Interesting. 0fa7xx is indeed where the VIA Padlock instructions live, but the last byte is probably being partially decoded.
⬐ jaybosamiya
Slides: https://github.com/xoreaxeaxeax/sandsifter/blob/master/refer...
Related GitHub project (sandsifter): https://github.com/xoreaxeaxeax/sandsifter
Whitepaper: https://github.com/xoreaxeaxeax/sandsifter/blob/master/refer...