Dave Mugridge also wrote some articles about the ARM1, focusing more on the ALU and registers: https://daveshacks.blogspot.com/2015/12/inside-alu-of-armv1-...
For http://visual6502.org/sim/varm/armgl.html, it would be much nicer if dragging panned the view rather than "3D rotating" it. Panning with wasd is too slow and doesn't work with some keyboard layouts.
And of course zooming around the mouse cursor, rather than around the center of the screen, would also make it easier to zoom towards the part you want.
The 3D rotation is gimmicky but not actually useful to see the gates, and the current UI just doesn't let me zoom to gates I want without spending too much effort fighting the slow panning and the zooming target.
>"One very nice thing about the 32-bit instruction set is its pervasive conditional execution, which helps one avoid branching over code. For example, this sequence of instructions resets the register r0 to 0 if its value is equal to or less than zero, or forces its value to 1 if its value is greater than zero:
    CMP   r0, #0    ; if (r0 <= 0)
    MOVLE r0, #0    ;   r0 = 0;
    MOVGT r0, #1    ; else r0 = 1
Without the conditional moves (MOVLE and MOVGT) after the compare (CMP), you'd have to branch after the compare, which is wasteful."
How are those two conditional moves after the CMP operation more efficient than branching? Aren't they kind of branches themselves? What would the alternative "branching" sequence look like then?
The big deal is the conditional branch (the bgt). If the processor gets it wrong it's a pipeline flush. And best case you still have extra instructions for the branches. The conditional mov example is a fixed cost of a single "wasted" cycle, which matches the best case of the branching example (branch correctly predicted to mov r0,#1 and fall through). The worst case for the branching version is probably somewhere ~15 cycles depending on the uArch, but is still 1 cycle for the conditional move.
        cmp r0,#0
        bgt 1f
        mov r0,#0
        b   2f
    1:  mov r0,#1
    2:
All of that being said, the branching version tends to be nicer for OoO cores since there are no longer data dependencies on the flags register, which is why you see RISC ISAs designed for OoO cores removing conditional execution for most instructions (AArch64 and RISC-V stand out here).
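To put rough numbers on that comparison, here is a back-of-the-envelope Python sketch. It is only a toy model under loud assumptions: one cycle per instruction, correctly predicted branches cost nothing extra, and a flat flush penalty of 15 cycles borrowed from the "~15 cycles" guess above. The function names are made up for illustration, and real cores will differ.

```python
# Toy cycle model. Assumptions: 1 cycle per instruction, a correctly
# predicted branch adds nothing, a misprediction adds a flat flush penalty.
FLUSH_PENALTY = 15  # assumed, taken from the "~15 cycles" estimate above


def predicated_cycles():
    """CMP + MOVLE + MOVGT: all three always issue, one MOV does nothing."""
    return 3


def branching_cycles(taken, mispredicted, flush_penalty=FLUSH_PENALTY):
    """Cost of the cmp/bgt/mov/b version for one pass.

    taken path (r0 > 0):   cmp, bgt, mov r0,#1        -> 3 instructions
    fall-through (r0 <= 0): cmp, bgt, mov r0,#0, b 2f -> 4 instructions
    """
    base = 3 if taken else 4
    return base + (flush_penalty if mispredicted else 0)


def expected_branching_cycles(p_taken, p_mispredict,
                              flush_penalty=FLUSH_PENALTY):
    """Average cost given how often bgt is taken and how often it is mispredicted."""
    base = p_taken * 3 + (1.0 - p_taken) * 4
    return base + p_mispredict * flush_penalty


if __name__ == "__main__":
    print("predicated:", predicated_cycles())
    for miss in (0.0, 0.05, 0.25):
        print("branching, miss rate", miss, "->",
              expected_branching_cycles(0.5, miss))
```

In this model the branching version only ties the predicated one when the branch is taken and predicted correctly; any misprediction rate tips the average in favour of the conditional moves. The model deliberately ignores the flag dependency cost on OoO cores mentioned above.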
So for a simple if/else, it was usually both less code and faster to use a straight line of conditional instructions. In more complicated cases, if the programmer was feeling clever, it was possible to update the status flags to get three-way (or more!) conditionals in straight-line branchless code. Fun!
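For anyone who wants to poke at the semantics, here is a small Python model of how the flags drive conditional execution. It is a sketch, not an emulator: `cmp_flags`, `COND`, `clamp_to_bool` and `signum` are names invented for this example, and only four of the condition codes are modelled. The `signum` routine is one possible three-way example off a single compare (the -1 would be loaded with MVN on real hardware), not necessarily the kind of trick the parent had in mind.

```python
MASK = 0xFFFFFFFF  # registers are 32 bits wide


def cmp_flags(a, b):
    """Return (N, Z, C, V) as set by CMP a, b, i.e. by computing a - b."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    n = result >> 31                        # sign bit of the result
    z = int(result == 0)                    # result was zero
    c = int(a >= b)                         # carry set means no borrow
    v = ((a ^ b) & (a ^ result)) >> 31      # signed overflow on subtract
    return n, z, c, v


# Condition predicates, following the ARM definitions for signed compares.
COND = {
    "EQ": lambda n, z, c, v: z == 1,
    "LT": lambda n, z, c, v: n != v,
    "GT": lambda n, z, c, v: z == 0 and n == v,
    "LE": lambda n, z, c, v: z == 1 or n != v,
}


def clamp_to_bool(r0):
    """The quoted sequence: every instruction is visited, none is jumped over."""
    flags = cmp_flags(r0, 0)        # CMP   r0, #0
    if COND["LE"](*flags):          # MOVLE r0, #0
        r0 = 0
    if COND["GT"](*flags):          # MOVGT r0, #1
        r0 = 1
    return r0


def signum(r0):
    """Three-way result from one compare: -1, 0 or 1."""
    flags = cmp_flags(r0, 0)        # CMP   r0, #0
    if COND["LT"](*flags):          # MVNLT r0, #0   (r0 = -1)
        r0 = -1
    if COND["GT"](*flags):          # MOVGT r0, #1
        r0 = 1
    return r0                       # EQ case: r0 is already 0


if __name__ == "__main__":
    for value in (-3, 0, 5):
        print(value, clamp_to_bool(value), signum(value))
```

The MOVs leave the flags alone, so both conditional instructions test the result of the same CMP. That is what lets the whole if/else run as straight-line code.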
The idea here is that a branch results in a pipeline flush which takes a couple of cycles to refill.
In practice, most CPUs have very good branch predictors these days and conditional moves aren’t all that useful anymore.
That’s probably the reason why they don’t exist for later ISAs such as RISC-V.
>"The ARM2 had pretty much the same instruction set as the ARM1, although featured new multiplication and (later) atomic swap instructions."
Does this mean that the ARM1 didn't support any atomic operations or were they using something else besides "compare and swap"?