Thursday, January 4, 2018

Is PowerPC susceptible to Spectre? Yep.

UPDATE: Yes, TenFourFox will implement relevant Spectre-hardening features being deployed to Firefox, and the changes to performance.now will be part of FPR5 final. We also don't support SharedArrayBuffer anyway and right now are not likely to implement it any time soon.

UPDATE the 2nd: This post is getting a bit of attention and was really only intended as a quick skim, so if you're curious whether all PowerPC chips are vulnerable in the same fashion and why, read on for a deeper dive.

If you've been under a rock the last couple days, then you should read about Meltdown and Spectre (especially if you are using an Intel CPU).

Meltdown is specific to x86 processors made by Intel; it does not appear to affect AMD. But virtually every CPU going back decades that has a feature called speculative execution is vulnerable to some variant of the Spectre attack. In short, for those processors that execute "future" code downstream in anticipation of what the results of certain branching operations will be, Spectre exploits the timing differences that occur when certain kinds of speculatively executed code change what's in the processor cache. The attacker may not be able to read the memory directly, but (s)he can find out if it's in the cache by looking at those differences (in broad strokes, stuff in the cache is accessed more quickly), and/or exploit those timing changes as a way of signaling the attacking software with the actual data itself. Although only certain kinds of code can be vulnerable to this technique, an attacker could trick the processor into mistakenly speculatively executing code it wouldn't ordinarily run. These side effects are intrinsic to the processor's internal implementation of this feature, though the attack is made easier if you have the source code of the victim process, which is increasingly common.
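To make the timing trick a little more concrete, here is a toy model in Python. It is only a sketch under loud assumptions: the "cache" is a simulated set of line numbers, the hit and miss costs are made-up numbers, and `ToyCache`, `victim`, `attacker_recover` and `leak_byte` are names invented for this illustration. Nothing here measures real hardware or runs a real attack; it just shows how "which line is fast?" can stand in for "what was the secret byte?".

```python
# Toy model only: a simulated cache with invented costs, not real hardware timing.
CACHE_LINE = 64                 # assumed cache line size in bytes
HIT_COST, MISS_COST = 1, 100    # made-up "cycle" counts for a hit and a miss


class ToyCache:
    """Tracks which cache lines are resident and reports a fake access cost."""

    def __init__(self):
        self.lines = set()

    def flush(self):
        self.lines.clear()

    def access(self, address):
        # Return the simulated cost of touching `address`, then keep its line resident.
        line = address // CACHE_LINE
        cost = HIT_COST if line in self.lines else MISS_COST
        self.lines.add(line)
        return cost


def victim(cache, secret_byte, probe_base):
    # Stands in for a mis-speculated load: the result is thrown away,
    # but the cache line it touched stays in the cache.
    cache.access(probe_base + secret_byte * CACHE_LINE)


def attacker_recover(cache, probe_base):
    # "Time" all 256 candidate lines; the one fast line reveals the byte.
    timings = [cache.access(probe_base + guess * CACHE_LINE) for guess in range(256)]
    return min(range(256), key=timings.__getitem__)


def leak_byte(secret_byte, probe_base=0x10000):
    cache = ToyCache()
    cache.flush()                              # start with nothing cached
    victim(cache, secret_byte, probe_base)     # victim touches one secret-dependent line
    return attacker_recover(cache, probe_base)
```

In this model `leak_byte(ord("P"))` returns 80, the value of the "secret" byte, even though the attacker function never reads it directly. On a real machine the costs are noisy timer readings rather than clean integers, which is part of why such attacks are slow.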

Power ISA is fundamentally vulnerable going back even to the days of the original PowerPC 601, as are virtually all current architectures, and there are no simple fixes. So what's the practical impact on Power Macs? Well, not much. As far as directly executing an attacking application, there are a billion more effective ways to write a Trojan horse than this, and they would have to be PowerPC-specific (possibly even CPU-family specific due to microarchitectural changes) to be functional. It's certainly possible to devise JavaScript that could attack the cache in a similar fashion, especially since TenFourFox implements a PowerPC JIT, but such an attack would -- surprise! -- almost certainly have to be PowerPC-specific too, and the TenFourFox JIT doesn't easily give up the instruction sequences necessary. Either way, even if the attacker knew exactly the memory they wanted to read and went to its address immediately, the attack would be rather slow on a Power Mac and you'd definitely notice the CPU usage whether it succeeded or not.

There are ways to stop speculative execution using certain instructions the processor must serialize, but this can seriously harm performance: speculative execution, after all, is a way to keep the processor busy with (hopefully) useful work while it waits for previous instructions to complete. On PowerPC, cache manipulation instructions, some kinds of special-purpose register accesses and even instructions like b .+4 (branch to the next instruction, essentially a no-op) can halt speculative execution with a sometimes notable time penalty. I think there may be some ways we can harden the TenFourFox JIT with these instructions used selectively to reduce their overhead, though as I say, I don't find such attacks very practical on our geriatric machines in general.
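As a rough sketch of what "used selectively" could mean, the toy pass below walks a list of assembly-like strings and drops a barrier only between a conditional branch and a load that immediately follows it, which is the shape of a bounds check guarding a memory access. Everything here is an assumption for illustration: the `harden` function, the mnemonic lists and the `SERIALIZE` placeholder are invented, and this is not how the TenFourFox JIT is actually structured.

```python
# Hypothetical pass over instruction strings; not TenFourFox's actual JIT code.
# "SERIALIZE" is a placeholder for whichever serializing instruction is chosen.
CONDITIONAL_BRANCHES = ("bc", "beq", "bne", "blt", "bge", "bgt", "ble")
LOADS = ("lbz", "lhz", "lwz", "lbzx", "lhzx", "lwzx")


def mnemonic(instruction):
    # First whitespace-separated token, e.g. "lbzx" from "lbzx r5, r6, r3".
    return instruction.split()[0]


def harden(instructions, barrier="SERIALIZE"):
    """Insert `barrier` only where a conditional branch is directly followed by a load."""
    hardened = []
    for index, instruction in enumerate(instructions):
        hardened.append(instruction)
        has_next = index + 1 < len(instructions)
        if (
            has_next
            and mnemonic(instruction) in CONDITIONAL_BRANCHES
            and mnemonic(instructions[index + 1]) in LOADS
        ):
            # The fall-through path is a load guarded by the branch:
            # keep it from running ahead of the branch's resolution.
            hardened.append(barrier)
    return hardened
```

Given `["cmplw r3, r4", "bge out_of_bounds", "lbzx r5, r6, r3"]`, the pass puts one barrier between the `bge` and the `lbzx` and leaves branches that are not followed by a load untouched, so the serialization cost is paid only on the guarded loads.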

Anyway, you can sleep well, because everybody's all in the same boat. Perhaps it's time to dust off those old strict CPUs. The world needs a port of Classilla to the Commodore 64. :)

7 comments:

  1. Early (well, up to the early '80s) microprocessors were so nice for that reason. Always below 16 MHz, 2 cycles/instruction at best, but the code would execute the way it appears, not get dispatched to 12 different processing units (each of which has N execution stages) after being rescheduled internally. What I deplore most is the fact that microcomputers progressively became money-making tools for corporations. They'll never come back to being as friendly as they were 30 years ago :( Just felt like expressing myself here.

    Cameron, would additional TenFourFox builders (for things like testing changesets) be of any help?

    1. Wanted to add the cache-related vulnerability made me think about the nice eieio instruction that forced in-order uncached memory operations. Sad it is not part of the POWER ISA (PowerPC only)

    2. See more on that in the next blog post. :)

      If you'd like to be a test builder, be my guest. At minimum it keeps the builds reproducible.

    3. I was asking because I already managed to get a build running. MacPorts demands a bit more patience on Tiger than on Leopard, but it ended up working.

      I had to compile a source in libyuv manually with -O2 instead of -O3 to prevent an ICE with gcc-4.8.5 (haven't looked thoroughly for a 4.8.2 portfile since this worked out). Something else I observed is that the libgcc/c++ from your TFF binary packages perform faster than my system's (i.e., in browser benchmarks). That's mainly what differs.

      Posting this comment with the resulting build. Can this count as the first Mozilla app I manage to build? What I remember from trying to build Fx from source is a lot scarier. Haha.

    4. Hey, congrats! I build Fx on my i7 MBA and, yeah, it feels like a bigger deal. I don't think mach has really improved the build system much.

      Interesting about the ICE with 4.8.5. What file went bad? Was it one of the unified source files?

      I don't remember doing anything special with the distribution libgcc/c++; they were basically copied and relinked directly out of /opt.

    5. I did not keep the output, but I remember /media/source/libyuv/row_common.cc made some references that triggered a list of something like a dozen undefined asm symbols (LC01 and so on) in config/rs6000/rs6000.c. I had 4.9.4 installed beforehand and decided to test what the outcome would be, since libyuv isn't far past the start of the build. 4.9.4 had the same exact bug, so I reverted to 4.8.5 and the file compiled as soon as I replaced O3 with O2.

      And I'm confused about the libraries; they also aren't the same size. Maybe it's a question of package versions.

      The build itself took around 5 hours to complete on the dual 1500 MHz MDD. Is -pipe usable with your G5 that has maxed RAM?

