The SGX instructions are essentially key management for full memory encryption. Does he really think memory encryption could be implemented in microcode?
That's an obvious flaw of his proposal.
I'm really surprised no paper reviewer caught that. What happened to peer review?
Yes? Why not? This argument rather relies on knowing what the (trade secret) capabilities of the microcode are. I don't see why this warrants a smear on peer review.
If you read the paper carefully, there are many other arguments one can make against it. However, it is an interesting paper and its message is worth considering.
Peer-review is not a perfect process. I think the research community is better off with publishing such a paper.
- Tooling is terrible and non-free
- Power consumption
- Tiny die area FPGA not that useful, huge die FPGA is wasting die you could use for cache
- Cost of the nonintegrated solution is huge e.g. http://www.nallatech.com/store/fpga-accelerated-computing/pc...
It is really hard to get a young programmer excited about programming FPGAs when the tools are archaic, buggy, proprietary, and non-portable. The higher-level interfaces (C abstractions etc.) are also lacking.
There's also a disease of people trying to write "C to HDL" converters without thinking of the very different paradigms involved there. While it can be made to work it will always be extremely inefficient.
It would be the equivalent of a newbie programmer looking at GCC and saying: "This program that makes other programs looks so archaic, it runs on a 70's computer terminal."
oh right, and the insistence on using feature wizards, which somehow work fine for the one example case but fail in mysterious ways or aren't capable of expressing what you actually want to do
it's really a long way away from "gcc hello.c ; ./a.out"
then multiply that with the tool licensing and various feature sets, the IP licensing...support for different parts, entirely different tooling for different vendors.
part of it as you say is that reality is a lot more inherently complicated than software...but the tooling really is a mess. time and cost are a huge barrier to entry.
You can design and simulate without your FPGA, most of the tools allow for simulation without calling a GUI. It will be no different from calling GCC. Worst case scenario, you can always use this open source Verilog simulator, it's good enough if you are only designing for FPGAs:
Again, you don't need any FPGA to do the design and you shouldn't even use it, it's a waste of time synthesizing the design every time you make a change.
Place and Route comes after the design gets to a complete state and you should only need to do it once.
If you are doing something different from this flow, you are most likely doing it wrong.
It's good for things that are compute heavy, but not for anything else.
I'd also wager that ASICs might make more sense than FPGAs, which seems to be the path we are going down.
X <= X + 1;
Y <= X;
Once you start wrapping your head around these concepts reasoning isn't so hard (and having been a CMPE major once I had the basis for this academically, but it was far in the past so had been forgotten). Certainly not harder than trying to understand distributed systems and coordination between them, in fact it's a lot of the same problems just applied at a different scale.
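To make the two assignments above concrete, here is a minimal Python sketch. It assumes they are Verilog non-blocking assignments (or VHDL signal assignments) inside a clocked block, which the snippet itself does not state. The function names `clock_edge`, `sequential`, and `run` are made up for illustration. The point it tries to show is that every right-hand side is sampled before any register changes, so Y lags X by one cycle instead of tracking it.

```python
def clock_edge(state):
    """Apply `X <= X + 1; Y <= X;` for one clock edge.

    Assumption: non-blocking semantics, so every right-hand side is
    read from the old state before any register is updated.
    """
    return {
        "X": state["X"] + 1,  # X <= X + 1
        "Y": state["X"],      # Y <= X  (old X, not the incremented one)
    }


def sequential(state):
    """The same two lines read as ordinary software statements."""
    x = state["X"] + 1
    y = x  # sees the already-incremented X
    return {"X": x, "Y": y}


def run(step, cycles, start=None):
    """Apply `step` repeatedly and record (X, Y) after each cycle."""
    state = dict(start or {"X": 0, "Y": 0})
    trace = []
    for _ in range(cycles):
        state = step(state)
        trace.append((state["X"], state["Y"]))
    return trace


if __name__ == "__main__":
    print("hardware:", run(clock_edge, 3))  # [(1, 0), (2, 1), (3, 2)]
    print("software:", run(sequential, 3))  # [(1, 1), (2, 2), (3, 3)]
```

Read as software, Y always equals X. Read as clocked hardware, Y is X delayed by one cycle, which is the kind of shift in reasoning the comment is describing.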
- most programmers have a poor grasp on what the real advantages and disadvantages of FPGAs are, instead treating them as magic sauce
- you need to go and benchmark: https://electronics.stackexchange.com/questions/140618/can-f... also https://electronics.stackexchange.com/questions/132034/how-c...
Commercially they're mostly used for low-latency predictable-throughput signal processing.
In the embedded world they're available and not even super new, the Zynq from Xilinx has been out for a few years. It combines an FPGA fabric with hard ARM CPU core(s).
Intel (through the acquisition of Altera) also has such products, like Stratix which also combines FPGA and ARM. I seem to remember seeing some x86+FPGA for embedded applications from Intel, but didn't find it with a quick search.
I feel you might be understating this! FPGAs and ASICs have in fact been in use for decades now, and they are very much ingrained in many high-end products. Any digital oscilloscope has an FPGA, and has done for a very long time, for example. These are often used in combination with a CPU for the UI part. The CPU might either be an IC on the PCB or, as is quite popular, a synthesised one programmed into the FPGA.
ARM, for example, have a whole architecture specifically designed for optimal FPGA synthesis.
There are a number of relatively cheap SBCs out there with an FPGA too. http://linuxgizmos.com/tiny-sbc-runs-linux-on-xilinx-zynq-ar...
I suspect these have gone out to "select customers".
I always thought this too, but turnaround time (time to market) in hardware isn't fast enough, and as the article argues, Intel has a vested interest in adding instructions. Most large companies are motivated to add complexity and keep the users spending their time on the company's projects. It doesn't pay to make simpler products that might become commodities.
Examples that I don't think would move from FPGA to CPU:
- Specialized algorithms that don't have enough users.
- Changing algorithms: encryption, HTTP2 parser, data structure/index scanning
- Tied to specific software: a game that's popular, but not enough to justify CPU instructions
- Portable FPGA: imagine an M.2 or USB style connection fast enough to run an add-on FPGA. It could be the start of making a reconfigurable computer; just add the chips for your software.
You aren't left with a niche of people who are making decisions about what CPU to buy based on what FPGAs they have, as a group, with sufficiently similar specs that they look like a single market to Intel or a chip maker. (That is, someone who just wants "a couple of instructions" and someone who wants "an FPGA for parsing HTTP2 requests" are dissimilar enough not to be the same thing.)
In this case, being able to name a multiplicity of very edge use cases isn't an advantage; it is another reason why it doesn't happen. You need one big use that enough people agree on that it's economical to centralize it into the CPU, not a ton of scattered ones with radically different implications in terms of die size, how the data flows in and out, what percentage of users are using it, etc. In fact, in your use cases, if we take them seriously and put them in the system in the place to maximize performance, because otherwise why bother, the FPGAs you're calling for don't even live in all the same places.
And when you get that one use case, you get specialized hardware, like GPUs.
(Incidentally, GPUs are themselves another reason. Many of the things you might previously have cited as reasons to have FPGAs twenty years ago are now efficiently implemented in GPUs. No, not by any means all such problems, but enough that it carves another massive hunk of the use-case-space out of the FPGA-on-the-CPU argument.)
Another way of looking at it is that the fundamental problem that FPGAs have is that it is physically impossible for them to beat dedicated hardware in the same configuration that you'd load into an FPGA, so once you have that popular use case, it gets put into normal silicon instead and outperforms the FPGAs.
The ARM+FPGA chips work well too, the IO blocks are decoupled enough from the CPU that you can add your own offload engines.
Unless you are using word-thinking and think that "general purpose computing" is some sort of promise that the maximally-general-purpose hardware is guaranteed to be available, in accordance with some particular idiosyncratic definition of "maximally general purpose hardware"? General purpose computing refers more to your ability to run any program you choose rather than particular hardware capabilities. It stands in opposition to platforms where the programs are locked down, like gaming consoles.
That being said, as many others have pointed out in this thread, there are still tool-chain and conceptual hurdles to doing this effectively. If/when FPGA acceleration becomes more commonplace, I think you'll see tighter integration of the FPGA with the server hardware platforms, perhaps to the point of it being on-package, if not on-die. The demand of the marketplace will dictate the pace and level of integration. That is of course also if the major CPU makers are in tune and responsive to these market demands.
I wouldn't be shocked if Amazon, Google, etc... were designing their own server CPU chipsets from scratch, to include some features they can't get from Intel/AMD. But at least with the Intel acquisitions of Nervana and Altera, I'd say Intel seems to be aware of evolving computing needs.
But the major data center operators have been building their own solutions further up the "hardware stack" for some time. Amazon, Facebook and Google have bypassed the major equipment makers (Dell, HP, Cisco, EMC, etc...) and gone ahead with their own server and switch hardware designs to suit their specific datacenter needs.
1 - https://aws.amazon.com/ec2/instance-types/f1/
2 - https://www.recode.net/2016/8/9/12413600/intel-buys-nervana-...
3 - https://newsroom.intel.com/news-releases/intel-completes-acq...
4 - https://www.wired.com/2016/03/google-facebook-designing-open...
MIPS did something like that. Each MIPS variant required different compiler flags to get good performance. Software distribution was a huge pain, with multiple versions. MIPS peaked in the 1990s.