Here is the summary.
It runs at 3.75 MHz to 4 MHz (depending on date of manufacture and model: G, G+ or GX), and that clock is probably subdivided internally into quarter cycles (we're not sure yet; it could be clock stretching). As if that weren't weird enough, as long as the LCD is enabled (which is pretty much always), about 10% of CPU time is spent refreshing the display. So, practically speaking, we have roughly 3.5 million CPU cycles ("ticks") available each second. That might sound like a lot, but it really (really) isn't.
Memory runs at 2 MHz. It's a "DX2".
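To see where the "roughly 3.5 million" comes from, here is a quick back-of-the-envelope sketch. It only uses the figures quoted above (the clock range and the "about 10%" LCD overhead); the function name is made up for illustration, and the real overhead may well differ.

```python
# Rough cycle budget, using only the figures quoted in the text.
CLOCK_HZ_RANGE = (3.75e6, 4.0e6)  # depends on model and date of manufacture
LCD_REFRESH_SHARE = 0.10          # "about 10%" while the LCD is enabled


def usable_ticks_per_second(clock_hz, lcd_share=LCD_REFRESH_SHARE):
    """CPU cycles left per second after the display refresh takes its cut."""
    return clock_hz * (1.0 - lcd_share)


low, high = (usable_ticks_per_second(hz) for hz in CLOCK_HZ_RANGE)
print(f"{low / 1e6:.3f} to {high / 1e6:.3f} million ticks per second")
```

So the budget lands somewhere between about 3.4 and 3.6 million ticks per second, which is where the round 3.5 million figure sits.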
What does it have?
Four 64bit, semi-general purpose registers: A,B,C and D.
Five 64bit, "scratch" registers: R0, R1, R2, R3 and R4.
Two 20bit address registers: D0 and D1.
One 16bit status register: ST.
One 4bit pointer register: P.
One 20bit, 8-level hardware stack: "RSTK".
One 20bit program counter: PC.
One carry flag: "CRY" or "C" (not to be confused with the C register).
Four hardware status flags: MP, SR, SB, XM.
Its address space is 512 KB, not 1 MB: the data bus is 4 bits wide, not 8, so each 20bit address points at a nibble rather than a byte, and 2^20 nibbles come to 512 KB.
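The arithmetic, spelled out as a tiny sketch (the constant names are ours):

```python
# Address-space size for a nibble-addressed machine with 20-bit addresses.
ADDRESS_BITS = 20     # width of D0, D1 and PC
BITS_PER_NIBBLE = 4   # one address = one 4-bit nibble

addressable_nibbles = 2 ** ADDRESS_BITS
total_bytes = addressable_nibbles * BITS_PER_NIBBLE // 8

print(addressable_nibbles, "nibbles =", total_bytes // 1024, "KB")
```

A byte-addressed CPU with the same 20 address bits would reach 1 MB; here every address covers only half a byte.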
What can you do with those?
With A, B, C and D, you can...
... clear it: r=0,
negate it (2's complement): r=-r,
invert it (logical NOT): r=-r-1,
increment or decrement it: r=r+1, r=r-1,
add or subtract a constant from 1 to 16: r=r+CON, r=r-CON,
shift it one bit to the left or right: r=r+r, rSRB,
shift it one nibble (4bits) to the left or right: rSL, rSR,
shift it one nibble to the left or right *circularly*: rSLC, rSRC,
test it: ?r=0, ?r#0,
(for r = A, B and D) add, subtract, OR or AND with C: r=r+C, r=r-C, r=C-r, r=r!C, r=r&C,
(for r = A, B and D) exchange it with C: rCEX,
(for r = A, B and D) compare it with C: ?r<C, ?r<=C, ?r=C, ?r>=C, ?r>C, ?r#C.
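Some of those mnemonics look odd until you model them. Below is a small sketch of a few of the single-register operations on a plain 64-bit integer. It assumes binary (not decimal) arithmetic on the whole register and ignores the carry flag and the other status flags entirely; the Python function names are ours, only the mnemonics in the comments come from the list above.

```python
# Toy model of a few single-register operations on a 64-bit register.
# Assumptions: binary arithmetic, whole register, carry/status flags ignored.
MASK64 = (1 << 64) - 1


def neg(r):      # r=-r    : 2's complement
    return (-r) & MASK64


def invert(r):   # r=-r-1  : "negate, then subtract one" is a logical NOT
    return (-r - 1) & MASK64


def shl_bit(r):  # r=r+r   : adding a value to itself shifts it left one bit
    return (r + r) & MASK64


def srb(r):      # rSRB    : shift right one bit
    return r >> 1


def sl(r):       # rSL     : shift left one nibble
    return (r << 4) & MASK64


def sr(r):       # rSR     : shift right one nibble
    return r >> 4


def slc(r):      # rSLC    : rotate left one nibble
    return ((r << 4) | (r >> 60)) & MASK64


def src(r):      # rSRC    : rotate right one nibble
    return ((r >> 4) | (r << 60)) & MASK64


print(hex(invert(0x0F)))  # every bit flipped
print(hex(src(0x12)))     # the low nibble wraps around to the top
```

This is why there is no dedicated NOT or "shift left one bit" mnemonic in the list: both fall out of ordinary arithmetic.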
You can not do anything between A and D, or between B and D.
So, A can interact with B and C.
B can interact with A and C.
C can interact with A, B and D.
D can interact only with C.
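If you find yourself forgetting which pairs are legal, the four lines above fit in a tiny lookup table. This is just a sketch for checking a planned register allocation on paper; the names are ours.

```python
# Which working registers can be combined in a two-register operation,
# taken directly from the list above.
CAN_INTERACT = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}


def can_interact(x, y):
    """True if registers x and y can appear together in one instruction."""
    return y in CAN_INTERACT[x]


print(can_interact("A", "B"))  # True
print(can_interact("B", "D"))  # False: route the value through C instead
```

The practical upshot is that C is the hub: anything that needs to get into or out of D has to pass through it.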
With A and C you can also...
access the address registers: D0=A, AD0EX and similarly for D1 and/or C,
read from and write to memory: A=DAT0, C=DAT0, A=DAT1, C=DAT1, DAT0=A, DAT0=C, ...
access R0...R4: A=R0, R1=A, AR0EX, AR3EX, CR2EX, etc
test their first 16 bits: ?ABIT=0 13
set/clear their first 16 bits: ABIT=0 12, CBIT=1 10
With C you can also...
access the stack: C=RSTK, RSTK=C,
access the program counter: C=PC, PC=C, PC=(C), CPCEX
access the pointer register: C=P, P=C, CPEX, C=C+P+1 (lolwat)
One note: for most of the commands, you can (or must) specify a "field" operand. For example, "A=0 B" clears only the first byte, leaving the other 14 nibbles (7 bytes) unaffected. This extends to arithmetic, logic and some other operations. There are the B, X, XS, M, S, A, W and P fields.
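To make fields concrete, here is a sketch that turns a field name into a bit mask and uses it to model "A=0 B". The nibble ranges in the table are the ones we know from the usual Saturn documentation (the post above doesn't spell them out, so treat them as an assumption), and the helper names are ours. The P field covers the single nibble that the P register points at.

```python
# Field selectors as (lowest nibble, highest nibble), nibble 0 = least significant.
# Assumption: these ranges follow the commonly documented Saturn field layout.
MASK64 = (1 << 64) - 1
FIELDS = {
    "B": (0, 1),    # lowest byte
    "X": (0, 2),    # lowest three nibbles
    "XS": (2, 2),   # nibble 2 only
    "A": (0, 4),    # lowest five nibbles (address-sized)
    "M": (3, 14),   # the twelve nibbles in the middle
    "S": (15, 15),  # top nibble
    "W": (0, 15),   # whole register
}


def field_mask(field, p=0):
    """Bit mask covering the given field; p is the current value of P."""
    lo, hi = (p, p) if field == "P" else FIELDS[field]
    width_bits = (hi - lo + 1) * 4
    return ((1 << width_bits) - 1) << (lo * 4)


def clear_field(reg, field, p=0):
    """Model of 'r=0 field': zero the field, keep everything else."""
    return reg & ~field_mask(field, p) & MASK64


a = MASK64
print(hex(clear_field(a, "B")))  # only the lowest two nibbles become 0
```

The same masking idea applies to the arithmetic and logic operations: the instruction only sees (and only changes) the nibbles inside the field.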
You can do conditional (GOC, GONC) and unconditional (GOTO, GOLONG) jumps. You can also call subroutines (GOSUB) and return from them (RTN).
There are various other bits and pieces here and there, but these should keep your noggin dizzy for a while.
Next time, we'll attempt some graphics and observe just how miserably slow this CPU is.