HN Theater @HNTheaterMonth

The best talks and videos of Hacker News.

Hacker News Comments on
Breaking the x86 Instruction Set

Black Hat · Youtube · 190 HN points · 23 HN comments
HN Theater has aggregated all Hacker News stories and comments that mention Black Hat's video "Breaking the x86 Instruction Set".
Youtube Summary
A processor is not a trusted black box for running code; on the contrary, modern x86 chips are packed full of secret instructions and hardware bugs. In this talk, we'll demonstrate how page fault analysis and some creative processor fuzzing can be used to exhaustively search the x86 instruction set and uncover the secrets buried in your chipset.

Full Abstract & Presentation Materials:
https://www.blackhat.com/us-17/briefings.html#breaking-the-x86-instruction-set
Hacker News Stories and Comments

All the comments and stories posted to Hacker News that reference this video.
What are the implications of such hardware on user security?

Threat persistence, i.e. persistent rootkits and persistent remote monitoring, if one has the privs to tickle the CPU instructions that give access. It is just a matter of time before there are public tools for script kiddies to manage the OS within the CPU. Here [1] is an example of someone slowly making progress decoding all the undocumented instructions, and here is a talk with a brief overview of the tools. [2] This is not specific to Intel's ME, but it is the way people will eventually tame/exploit that beast, in my opinion. There are more recent talks that get deeper into security rings in the CPU. This video [3] is more related to your question; it is not specific to ME, but what it shows can be used to access ME and much more. If this appears too difficult and time consuming, just know that some folks out there have the documentation for the undocumented instructions.

You may find some tools on github and other public repos for disabling ME. Use with care. Test on an identical model of system that is at the same firmware/BIOS and has the same model of CPU, as those tools can brick your CPU. As @rocket_surgeron stated, one can buy CPUs with ME disabled, but that will not disable the "God Mode" referenced in [3].

[1] - https://github.com/xoreaxeaxeax

[2] - https://www.youtube.com/watch?v=KrksBdWcZgQ

[3] - https://www.youtube.com/watch?v=jmTwlEh8L7g

intelVISA
Got hired by Intel to shut up not long after that vid, Sandsifter hasn't been updated by Domas since 2017 iirc.
LinuxBender
That suggests to me that he was turning over rocks that were not expected to be turned over. I leave it to the open source community to fork his code and continue the work. If nothing else it shows me that there are a myriad of undocumented tools in the CPU that could be used by anyone that has the patience to enumerate them.
There are even more instructions: see this lecture on finding undocumented instructions https://www.youtube.com/watch?v=KrksBdWcZgQ
uberman
Highly recommend watching this as it was a great presentation. Thank you for posting this
May 16, 2022 · 2 points, 0 comments · submitted by d1stc
And secret microcode / hidden instructions in every major x86 CPU, presumably for the NSA.

[0]: https://www.youtube.com/watch?v=KrksBdWcZgQ

runnerup
If the downvotes are due to poor quality discussion, that's fair enough. If they're due to skepticism, here's his talk from the next year:

[0]: https://www.youtube.com/watch?v=jmTwlEh8L7g "DEF CON 26 - Christopher Domas - GOD MODE UNLOCKED Hardware Backdoors in redacted x86"

icedchai
I'm not skeptical they exist, I'm skeptical that they're "for the NSA."
I watched a super interesting Black Hat video on youtube that talked about discovering secret instructions on CPUs by iterating over each bit of opcodes until you get an illegal instruction, and thereby discovering if an opcode is valid or not.

He set up a room full of PCs running his code and had hardware to auto-reset them when they crashed.

https://youtu.be/KrksBdWcZgQ
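
A toy sketch of that probing idea, for readers who want to see the loop written down. It does not execute real machine code: `toy_decode` is a made-up decoder that knows three instructions and stands in for the CPU, and `measure_length` imitates the page-boundary trick as I understand it from the talk (expose one more byte of the candidate at a time, and treat a fault on the hidden bytes as "the instruction is longer than this"). All names here are hypothetical.

```python
class PageFault(Exception):
    """Raised when the toy decoder fetches a byte that is not 'mapped'."""


class IllegalInstruction(Exception):
    """Raised when the toy decoder does not recognise the opcode."""


def toy_decode(fetch):
    """Stand-in for the CPU: return the instruction length or raise.

    Knows only NOP (0x90), MOV EAX, imm32 (0xB8) and the 0x66 prefix,
    which in this toy shrinks the immediate to 16 bits.
    """
    i = 0
    operand16 = False
    op = fetch(i)
    i += 1
    if op == 0x66:
        operand16 = True
        op = fetch(i)
        i += 1
    if op == 0x90:
        return i
    if op == 0xB8:
        for _ in range(2 if operand16 else 4):
            fetch(i)
            i += 1
        return i
    raise IllegalInstruction


def measure_length(code, max_len=15):
    """Expose 1, 2, 3, ... bytes until the decoder stops page-faulting."""
    buf = bytes(code) + bytes(max_len)  # zero padding behind the candidate
    for visible in range(1, max_len + 1):
        def fetch(i, visible=visible):
            if i >= visible:  # byte sits on the "unmapped" page
                raise PageFault
            return buf[i]
        try:
            toy_decode(fetch)
            return visible, "ok"
        except PageFault:
            continue  # instruction is longer than what is visible so far
        except IllegalInstruction:
            return visible, "illegal"
    return None, "too long"


if __name__ == "__main__":
    for candidate in ([0x90], [0xB8, 1, 2, 3, 4], [0x66, 0xB8, 1, 2], [0x0F]):
        print(bytes(candidate).hex(), measure_length(candidate))
```

On real hardware the same loop would need a signal handler and an executable page followed by a non-executable one; none of that is modelled here.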

I would be really interested to see how often malicious software utilized undocumented opcodes that disassemblers incorrectly interpret and thus lead a security researcher down a rabbit hole while the actual opcode does something different. Like the

66e9xxxxxxxx and 66e8xxxxxxxx opcodes in x86_64 [1]

If my understanding is correct, Stuxnet incorporated bytecode for the PLCs in S7comm, a protocol that was not open at the time. Though this is different than including undocumented opcodes for the system being targeted directly.

[1] https://youtu.be/KrksBdWcZgQ?t=1767

How about undocumented instructions? Sandsifter[1] is an interesting project and the video from BlackHat[2] is a good watch. There's also a previous discussion of it on HN[3].

[1] https://github.com/Battelle/sandsifter

[2] https://www.youtube.com/watch?v=KrksBdWcZgQ

[3] https://news.ycombinator.com/item?id=18179212

da_big_ghey
Yes, and sandsifter continues to find more; recently, undocumented microcode-modifying instructions were found with it.
sandinmyjoints
Very cool project! Thanks for the pointer.
stevemk14ebr
Also see https://blog.can.ac/2021/03/22/speculating-x86-64-isa-with-o...
gbrown_
Ah cool, seems I missed that when it appeared on HN a few weeks back. Thanks for the pointer.
There was an interesting black hat talk about automatically finding undocumented instructions. [0]

Interesting comment on Twitter about the instruction in the original post[1]

[0] https://www.youtube.com/watch?v=KrksBdWcZgQ&t=3

[1] https://twitter.com/eigma/status/1373155650432290819

xoreaxeaxeax's videos about how to systematically parse the asm space of x86 really hit home to me how bad the encoding is.

https://www.youtube.com/watch?v=KrksBdWcZgQ

"I thought you were saying that the use of microcode would complicate verification of the hardware."

I do think that microcode opens up the possibility of subtle injections to follow code/hardware paths you might not easily predict.

A really good talk on reverse engineering an ISA implementation: https://youtu.be/KrksBdWcZgQ

MaxBarraclough
Agreed, this is an issue with complexity more generally, but microcode seems like a particularly important case.

As far as I can tell from the software world though, it's pretty rare for anyone to try this kind of thing. When their software is Free and Open Source, some companies put telemetry in there (e.g. Visual Studio Code), but it's very rare for there to be anything this malicious hidden away in publicly viewable code.

The Linux world has collectively agreed to trust SELinux, for instance, despite that it originates from the NSA.

This was also linked elsewhere but a security researcher was able to identify a significant number of undocumented instructions: https://www.youtube.com/watch?v=KrksBdWcZgQ
Related talk about this from Blackhat a few years ago:

https://www.youtube.com/watch?v=KrksBdWcZgQ

anon73044
Unfortunately Chris works for Intel now so I don't think he'll be giving any more of these talks in the future. (At least until his NDA expires)
kchoudhu
All good things get eaten by the majors eventually, it would seem.
Great project and write-up. I'm reminded of a couple other projects.

MC Hammer project for LLVM tests round-trip properties of the ARM assembler. http://llvm.org/devmtg/2012-04-12/Slides/Richard_Barton.pdf

Sandsifter x86 fuzzer. https://github.com/xoreaxeaxeax/sandsifter https://youtu.be/KrksBdWcZgQ

woodruffw
(Author here.)

Yep! Both sandsifter and MC Hammer provided inspiration for mishegos.

Watch this Blackhat 2017 talk and maybe it'll change your mind: https://www.youtube.com/watch?v=KrksBdWcZgQ
lawnchair_larry
Very familiar with it, nothing indicating a backdoor there.
There was a really good Blackhat talk a year or two ago in a sort of similar vein: they developed an approach to iterate and find undocumented opcodes.

https://youtu.be/KrksBdWcZgQ

The software they developed is called Sandsifter.

xelxebar
Oh yeah. That's Christopher Domas, the same guy that built a functioning compiler that only emits MOV instructions [0] [1].

His other talks are quite interesting, as well!

[0]:https://github.com/xoreaxeaxeax/movfuscator [1]:https://www.youtube.com/watch?v=2VF_wPkiBJY

Breaking the x86 Instruction Set by Christopher Domas. https://youtu.be/KrksBdWcZgQ

I found it easy to follow and pretty entertaining.

Reveal talk at Blackhat showing this off: https://www.youtube.com/watch?v=KrksBdWcZgQ
artificial
Fascinating! Plus a very easy to follow presentation. Thanks for the link.
Related talk (relevant part starts at 38:43) https://www.youtube.com/watch?v=KrksBdWcZgQ

Quote from the talk: I don't want to make it sound like the sky is falling. This was found on one very esoteric processor that is not used in widespread production. I think it's mostly interesting from an academic perspective that we have a tool that is able to find these kinds of things now.

I do also wonder if some speculative prediction / branching stuff can be controlled through undocumented CPU instructions: https://www.youtube.com/watch?v=KrksBdWcZgQ
my favorite were:

- How the reputation economy is creating data-driven conformity https://media.ccc.de/v/34c3-8797-social_cooling_-_big_data_s...

- DEF CON 25 (2017) - Weaponizing Machine Learning https://www.youtube.com/watch?v=wbRx18VZlYA&t=2121s

- BlackHat 2017: Breaking the x86 Instruction Set https://www.youtube.com/watch?v=KrksBdWcZgQ

One of my favourite talks I watched was "Low Cost Non-Invasive Biomedical Imaging - An Open Electrical Impedance Tomography Project"

https://media.ccc.de/v/34c3-8948-low_cost_non-invasive_biome...

As it presented an interesting technique I'd never heard of before, along with an implementation.

Also I thought the 'Breaking the x86 Instruction Set' talk was extremely clever - https://www.youtube.com/watch?v=KrksBdWcZgQ

Data Breach Transparency

  - At a minimum, there should be a penalty that grows from the time the breach was learned to when they disclose it publicly.
  - There should also be penalties for not being transparent about what exact data was leaked for what users.
Social Security Number

  - SSN is similar to a password- you want to keep it hidden, and if it leaked, you should change it. However, we can't change it. Perhaps it should be considered more as a password?
User Data Rights

  - People should know what personal data companies have on them. A good example of this is Equifax storing people's home addresses- this could be disclosed. On the other hand, it is probably fine to exclude other types of data, such as an advertiser storing your zip code- people probably don't care as much.
  - Should people have a right to have certain kinds of data (e.g. SSN) removed from websites?
Adoption from USA Nutrition Label

  - Is it a good idea to mandate companies disclose the security they use? For example, at one time reddit had their passwords stored as plaintext and they got hacked. Disclosing basic security hygiene (e.g. password storage) somewhere standardized in the website would make it much less outrageous.
Technology Improvement

  - Certain technologies enable hackers more than others. SQL seems to enable a lot of hacking. Should we discourage it?
  - Get rid of Intel ME technologies     - https://schd.ws/hosted_files/osseu17/84/Replace%20UEFI%20with%20Linux.pdf
  - Get rid of Intel hidden instructions - https://www.youtube.com/watch?v=KrksBdWcZgQ
  - Get rid of Simon and Speck           - https://www.reuters.com/article/us-cyber-standards-insight/distrustful-u-s-allies-force-spy-agency-to-back-down-in-encryption-fight-idUSKCN1BW0GV
  - What is "best for National Security" is actually worst for our own. It feels like people don't have a democratic say in the right balance either.
(edit: trying to figure this formatting out)
thethirdone
> SSN is similar to a password- you want to keep it hidden, and if it leaked, you should change it. However, we can't change it. Perhaps it should be considered more as a password?

I assume you meant

> SSN is similar to a password- you want to keep it hidden, and if it leaked, you should change it. However, we can't change it. Perhaps it should be considered more as a username?

zilitor
Actually, yes. On second thought, that is a much better idea.
zilitor
Yeah, that is a good point. Either way we need a universal password.
Sep 15, 2017 · AceJohnny2 on IDA: What's new in 7.00
> I trust its disassembler (especially for mainstream languages) more than almost any other disassembler

Might I suggest Christopher Domas' Black Hat talk "Breaking the x86 ISA", along the way of which he demonstrates the limitations of all disassemblers out there, including IDA's :)

Talk: https://youtu.be/KrksBdWcZgQ

Slides: https://www.blackhat.com/docs/us-17/thursday/us-17-Domas-Bre...

Paper: https://www.blackhat.com/docs/us-17/thursday/us-17-Domas-Bre...

Sep 10, 2017 · 1 points, 0 comments · submitted by sillysaurus3
Sep 09, 2017 · 174 points, 28 comments · submitted by old-gregg
Lramseyer
"If your processor has an errata in it and you update the documentation to allow that errata, is it still an errata? I think it is, but apparently it's allowed in the newest version of the AMD manuals" [37:48]

I don't understand why more vendors don't do this (if anyone wants to comment to this, I would be interested to get another opinion.) While my experience is admittedly limited to obscure chips that require NDAs for access to the specs, I was always a little annoyed that almost every time there was not even a reference to the errata documentation that the vendor provided when a new version of the spec would come out.

Now in AMD's case, I would argue that they should be more clear that it's an errata, and mention that the updated spec differed from previous versions (which it alluded to by saying "This behavior is model-dependent".) Ultimately, the spec is THE document on how a user should expect the chip to behave. So sue me if I am blurring the lines between an errata and a mistake in the spec, but I just want my documentation to tell me what the chip does without having to refer to a dozen other secondary documents dang it!

userbinator
For that specific case (pagefault vs undefined instruction) I can see why the behaviour differs from the spec, since it's highly dependent on how the processor decodes each instruction; others have noticed similar things:

https://www.symantec.com/connect/blogs/x86-fetch-decode-anom...

jacksonR
Ah, this is the talk behind the sandsifter tool that was making its round a few weeks back. Nice to get the deeper picture.

github: https://github.com/xoreaxeaxeax/sandsifter white paper: https://github.com/xoreaxeaxeax/sandsifter/blob/master/refer...

j_s
Sandsifter: find undocumented instructions and bugs on x86 CPU | https://news.ycombinator.com/item?id=14872418 (91 comments, July 2017)
xk98qB
heh, by the same guy who made that compiler that translates C into only mov instructions: https://github.com/xoreaxeaxeax/movfuscator
im3w1l
Obvious question: Are there programs with these instructions in the wild?
inetknght
I would love to know the answer to that. Something tells me that we need to fix disassemblers before we can answer it though.
k__
Do they have any benefit over the valid instructions?
tyingq
Has happened in the past...

http://www.rcollins.org/secrets/opcodes/SALC.html

AlyssaRowan
Yes.

This kind of technique, and the exploitation of minor CPU errata, can be used to help differentiate processor models and steppings.

That in turn allows a currently widespread DRM system to download personalised portions of object code that rely on properties specific to the licensed hardware in order to execute properly, in an attempt to counter debugging, emulation and transfer - continuing a tradition practised in copy protection techniques since at least the 6502, maybe even earlier.

amelius
Is there a term for that, similar to "security by obscurity"?
Open-Sourcery
how does "Identification by exploitation" sound.
dcomp
Can't wait to see what 2017's f00f bug equivalent is after it's released. Maybe I should just run his tunneling programs and not wait for the disclosure.
wyldfire
The f00f bug is one that was discovered when cmpxchg didn't do what it was supposed to.

If this just searches the space looking for packets that shouldn't decode but end up getting executed, then it's unlikely to be anywhere near as interesting as f00f.

In all likelihood we have already seen 2017's big silicon bug and it was AMD's Ryzen 7 1800X issue.

twiddlydee
He says at the end he found a new f00f bug though
anfractuosity
He did seem to indicate that was on an esoteric processor though so maybe it's not on an Intel/AMD chip.

I don't know much about CPU internals, but would it actually be possible to 'patch' that through updated microcode?

It's such a clever program, will be intrigued to see what else it can find!

m00dy
To summarize: the guy built a random CPU instruction generator for x86. An instruction can be at most 15 bytes long, so the solution space is quite huge. He cut the solution space to 100k by generating candidates in a DFS-style fashion and validating them through CPU exceptions and flags. In the end, there's a kind of map-reduce-style distiller to analyse hidden and valid instructions.

Nice job though
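
A rough sketch of how a depth-first search like that can shrink the space, under stated assumptions. This is my reading of the idea, not the tool's actual code: `toy_length` is a hypothetical length oracle for a made-up ISA (on real hardware the length would come from executing the bytes and watching for faults), and the sketch only reacts to changes in length, whereas the summary above says exceptions and flags are watched too.

```python
from typing import Callable, Iterator, List


def toy_length(buf: List[int]) -> int:
    """Hypothetical oracle for a made-up ISA, not real x86.

    0x0F takes one extra byte, 0xB8 takes a two-byte immediate,
    everything else is a single byte.
    """
    op = buf[0]
    if op == 0x0F:
        return 2
    if op == 0xB8:
        return 3
    return 1


def tunnel(length_of: Callable[[List[int]], int], max_len: int = 3) -> Iterator[bytes]:
    """Yield candidate instructions, bumping one 'marker' byte at a time."""
    buf = [0] * max_len
    prev_len = length_of(buf)
    marker = prev_len - 1          # start at the last byte of the instruction
    yield bytes(buf[:prev_len])
    while True:
        # On rollover, reset the byte and move the marker one byte left.
        while marker >= 0 and buf[marker] == 0xFF:
            buf[marker] = 0
            marker -= 1
        if marker < 0:
            return                 # the first byte rolled over: search done
        buf[marker] += 1
        n = length_of(buf)
        yield bytes(buf[:n])
        if n != prev_len:
            # Behaviour changed: descend to the end of the new instruction.
            marker = n - 1
            prev_len = n


if __name__ == "__main__":
    candidates = list(tunnel(toy_length))
    exhaustive = 254 + 256 + 256 * 256   # every encoding of the toy ISA
    print(len(candidates), "candidates visited out of", exhaustive)
```

The saving comes from immediates: once changing a byte stops changing the observed behaviour, the search walks each byte position once instead of trying every combination.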

smegel
What's the bet the really secret instructions are hidden behind special conditional decode logic? I.e. the CPU won't even ask for the next byte if some register value is not set, possibly a secret register that first needs to be set via some other hidden instruction. Make that a sequence of 3 hidden instructions combined with arbitrary register and immediate values, and you won't get close to identifying them in a billion years.

I mean, if you worked for Intel and your manager said "make me a really secret instruction", would your best response be "let's just not document it and hope no one notices"?

What I would give to read the full microcode of the latest Intel processor. I am guessing it is stored in a vault with the real nuclear codes, Alien cadavers and the Holy Grail.

micheljones
There are so many ways to implement backdoors in CPUs, even completely analog ones:

https://hackaday.com/2017/04/25/an-analog-charge-pump-fabric...

mickronome
Such decoding logic might actually be detectable by something like differential power analysis, though it could be excessively difficult if someone really wanted it hidden.

I suspect that really keeping it out of view would also cost both silicon and propagation delays in what would probably be some of the most critical paths, but then I'm not a vlsi engineer, or whatever the correct title would be :)

smegel
When you say might...are there any case studies showing how the internals of a CPU can be exposed using this technique?
greenpenguin
I'm not really qualified to answer this, but I suspect the instruction decoder(s?) would be decoupled from register state as much as possible (unless x86 is even weirder than I thought).

Given this, I suspect wiring in a path all the way from the relevant versions of the relevant registers might be quite expensive. Plus part of the decode logic now needs to block on a register value - so a timing based attack might find these.

More qualified comments welcome...

dfox
i386 instruction decoding at least partially depends on what descriptor is loaded into (shadow) CS. For example, the effects of the 0x66 prefix are reversed between 16-bit and 32-bit code.
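
A minimal sketch of that reversal, assuming only what the comment says: the code segment sets a default operand size of 16 or 32 bits, and a 0x66 prefix selects the other one. `operand_size` is a hypothetical helper, not part of any real decoder.

```python
def operand_size(default_bits: int, has_66_prefix: bool) -> int:
    """Effective operand size for a given code-segment default.

    default_bits is the size implied by the loaded CS descriptor (16 or 32).
    The 0x66 prefix does not mean "16-bit"; it toggles away from the default.
    """
    if default_bits not in (16, 32):
        raise ValueError("default_bits must be 16 or 32")
    if not has_66_prefix:
        return default_bits
    return 32 if default_bits == 16 else 16


if __name__ == "__main__":
    for default in (16, 32):
        for prefixed in (False, True):
            print(default, prefixed, operand_size(default, prefixed))
```

So the same byte sequence can carry a different immediate width, and therefore a different total length, depending on state that lives outside the instruction stream.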
vardump
You don't really need even that.

You just need a set of magic register values, like how CPUID [0] instruction already works.

[0]: http://www.sandpile.org/x86/cpuid.htm

jevinskie
Yup, this was done at this year's USENIX with AMD microcode. See the exploits that check for magic register values at [0] and the paper at [1].

[0]: https://github.com/RUB-SysSec/Microcode/tree/master/updates

[1]: http://syssec.rub.de/research/publications/microcode-reversi...

userbinator
In case anyone is wondering what the "hidden instructions" he found are, many of them are documented elsewhere:

    0f0d/0-7 were all prefetch instructions, but probably behave like NOPs if not supported
    0f18/0-7 are HINT_NOPs
    0f{1a-1f} are also HINT_NOPs
    0fae is a bunch of assorted instructions (FXSAVE, FXRSTOR, LDMXCSR, etc.)
    dbe0 is FNENI
    dbe1 is FNDISI
    df{c0-c7} are x87 ops
    f1 is ICEBP
    c0,c1,d0,d1,d2,d3 groups have a few aliases (SAL/SHL)
    f6/1 and f7/1 are aliases of f6/0 and f7/0
    0f0f are 3DNow instructions and it wouldn't surprise me if there were many aliases there
    0fa7 was briefly used for the IBTS instruction on the very earliest 386s and then CMPXCHG for the very earliest 486s
        (http://datasheets.chipdb.org/Intel/x86/486/Intel486.htm)
        perhaps VIA continued to use it for a CMPXCHG alias
IMHO the 1-byte opcode map has basically been completely explored and documented, perhaps with the exception of some of the x87 stuff. It's the 2-byte (0F xx) ones where things start to get really interesting.
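
For readers unfamiliar with the `opcode/digit` notation in the list above: the digit is the 3-bit `reg` field of the ModRM byte that follows the opcode, so one opcode byte can select up to eight operations. A small sketch with hypothetical helper names, using the F6 group as the example. The `/1` entry is listed as TEST only because the comment reports it as an alias of `/0`; treat that slot as an observation, not as documented behaviour.

```python
from typing import Tuple

# Operations selected by ModRM.reg for opcode F6. Slot 1 follows the
# comment's claim that f6/1 behaves as an alias of f6/0 (TEST).
F6_GROUP = ["TEST", "TEST", "NOT", "NEG", "MUL", "IMUL", "DIV", "IDIV"]


def split_modrm(modrm: int) -> Tuple[int, int, int]:
    """Split a ModRM byte into its (mod, reg, rm) bit fields."""
    if not 0 <= modrm <= 0xFF:
        raise ValueError("ModRM must be a single byte")
    mod = (modrm >> 6) & 0b11    # top two bits: addressing mode
    reg = (modrm >> 3) & 0b111   # middle three bits: the "/digit"
    rm = modrm & 0b111           # low three bits: register or memory operand
    return mod, reg, rm


def f6_operation(modrm: int) -> str:
    """Name the F6-group operation that a given ModRM byte selects."""
    _, reg, _ = split_modrm(modrm)
    return F6_GROUP[reg]


if __name__ == "__main__":
    for modrm in (0xC0, 0xC8, 0xD8):
        print(f"f6 {modrm:02x}", split_modrm(modrm), f6_operation(modrm))
```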
twiddlydee
I think this is a bit of an oversimplification. I’m seeing some of these appear on sandpile.org, it does look like a lot of them are vestigial/legacy things. But, I think the implied meaning of “documented” for the presentation is “documented by the manufacturer”. Just because some of these have been reverse engineered by others doesn’t really make them “documented”, at least, not in the spirit of what the project is trying to find. He also points out that some of these (0f0d, 0f18, etc) were added to the documentation in the last year - the concern is that they were hidden for years and years before that. 0f0f and 0fa7 look like the most interesting to me - 0f0f is probably 3Dnow, but I can’t find any information on the gaps in the 3Dnow set; he says in the presentation 0fa7 is the via padlock instructions, I can’t find any references on the gaps in that range either.
userbinator
http://linux.via.com.tw/support/beginDownload.action?eleid=1...

Interesting. 0fa7xx is indeed where the VIA Padlock instructions live, but the last byte is probably being partially decoded.

Sep 06, 2017 · 3 points, 0 comments · submitted by bga
Sep 04, 2017 · 3 points, 0 comments · submitted by adamnemecek
Sep 03, 2017 · 4 points, 0 comments · submitted by gbrown_
HN Theater is an independent project and is not operated by Y Combinator or any of the video hosting platforms linked to on this site.