Tengen Lockout Chip Dump Files
------------------------------

11/29/06
K.Horton

----

lock1.bin - dump of the lockout sequence when TEST (pin 4) is pulled high.
lock2.bin - dump of the sequence when TEST is held low until 48h (72) clocks
            after reset.

I have constructed and run a special test device that dumps data from the
Tengen lockout chip using the built-in test modes.

Here's a pinout of the Tengen chip:

         .--V--.
  Dout -|1   16|- VCC
   Din -|      |- T7
  Mode -|      |- T6
  Test -|      |- T5
    T0 -|      |- T4
   CLK -|      |- T3
   RST -|      |- T2
   GND -|8    9|- T1
         `-----'

I have arbitrarily designated the 8 test output pins T0 through T7.

Here's how the mode/test pins work:

To put the chip into test mode, pull TEST high. This takes the 8 Tx pins
out of tristate mode, and they start to reflect the internal state of the
chip! If this pin is pulled low, these outputs are disabled and are open
circuit.

You must pull TEST high AFTER the master lockout chip (i.e. the one in the
NES) has finished resetting the chip, however. This took exactly 72 clocks
from when I pulled "user reset" low on the master lockout chip. "user reset"
is the reset button input on the master.

The "mode" pin then selects 1 of 2 sets of data that come out of the 8 test
pins.

There's one more quirk with the test pin: if it's high when the lockout
chips start communicating, the chip goes into a secondary test mode and
executes code at C0-FFh.

The lock1.bin file is what happens if TEST is held high. lock2.bin is what
happens when TEST is held low until 72 cycles after reset is deasserted on
the master.

As for the format of the file, it is set up like so:

byte:      0           1           2

       7       0   7       0   7       0
       ---------   ---------   ---------
       LLLL QQQQ   HHHH BBBB   CCCC DQxR

L: lower 4 address bits
H: upper 4 address bits
Q (byte 0): ALU output
B: 2 RAM control pins, 2 clock phases
C: 4 bit counter, incremented every clock cycle
D: pin 2 of the Tengen chip - Din to chip
Q (byte 2): pin 1 of the Tengen chip - Dout from chip
R: pin 7 of the Tengen chip - reset

After every 3 byte record, the chips were clocked once.
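
The record layout above can be unpacked with a few shifts and masks. Here is a minimal Python sketch, assuming the leftmost letter of each byte in the diagram is bit 7 (as the "7 ... 0" ruler suggests). The names Record and parse_records are my own and not part of the dump tooling.

```python
from typing import Iterator, NamedTuple


class Record(NamedTuple):
    """One 3-byte sample from lock1.bin / lock2.bin (field names are hypothetical)."""
    address: int   # H:L, 8-bit address
    alu: int       # Q in byte 0, 4-bit ALU output
    ram_clk: int   # B, 2 RAM control pins + 2 clock phases
    counter: int   # C, 4-bit counter incremented every clock
    din: int       # D, pin 2 (Din to chip)
    dout: int      # Q in byte 2, pin 1 (Dout from chip)
    reset: int     # R, pin 7 (reset)


def parse_records(data: bytes) -> Iterator[Record]:
    """Yield one Record per complete 3-byte group; a trailing partial group is ignored."""
    usable = len(data) - len(data) % 3
    for i in range(0, usable, 3):
        b0, b1, b2 = data[i:i + 3]
        yield Record(
            address=(b1 & 0xF0) | (b0 >> 4),  # upper nibble from byte 1, lower from byte 0
            alu=b0 & 0x0F,
            ram_clk=b1 & 0x0F,
            counter=b2 >> 4,
            din=(b2 >> 3) & 1,
            dout=(b2 >> 2) & 1,
            reset=b2 & 1,                     # bit 1 of byte 2 ("x") is skipped
        )
```

Typical use would be something like list(parse_records(open("lock1.bin", "rb").read())), then stepping through the records one clock at a time.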

--------

tengenq.bin bit order:

The ROM is set up as an array of 12 bit words. The ROM is physically
designed as 12 separate 256 bit blocks.

Here's a pictorial of the entire ROM array. The "xxx" on the inside edges
is the decode logic, which terminates in a single bit line. There are
physically 8 columns in each block, and 32 rows. The 32 rows terminate in
the single bit line. The columns are shared across each set of 6 blocks.

NOTE: The ROM array is mirrored about the Y axis.
NOTE: numbers are the bit # of that block.

      /decode\               /decode\
     +--------+             +--------+
     |        |x           x|        |
     |   11   |x-bit   bit-x|   5    |
     |        |x           x|        |
     +--------+             +--------+
     |        |x           x|        |
     |   10   |x-bit   bit-x|   4    |
     |        |x           x|        |
     +--------+             +--------+
     |        |x           x|        |
     |   9    |x-bit   bit-x|   3    |
     |        |x           x|        |
     +--------+             +--------+
     |        |x           x|        |
     |   8    |x-bit   bit-x|   2    |
     |        |x           x|        |
     +--------+             +--------+
     |        |x           x|        |
     |   7    |x-bit   bit-x|   1    |
     |        |x           x|        |
     +--------+             +--------+
     |        |x           x|        |
     |   6    |x-bit   bit-x|   0    |
     |        |x           x|        |
     +--------+             +--------+

All 12 blocks are identical, except that half of them are mirror images.
The decoding is thus the same in the Y axis, but mirrored on the X axis
like so:

(Block closeup, left hand block. Right hand blocks have columns swapped.)

        from column decoder
   +----------------------------+
   | FF         3F         1F   |x
   | FE         3E         1E   |x
   | FD         3D         1D   |xx   <- binary muxer tree
   | .          .          .    |xx
   | .          .          .    |xxx
   | .   * * *  .          .    |xxxx- bit line
   | .          .          .    |xxx
   | .          .          .    |xx
   | E2         22         2    |xx
   | E1         21         1    |x
   | E0         20         0    |x
   +----------------------------+

Clock generator timing signals:

The clock generator is composed of a walking ring counter, 4 decoding
gates, and finally 4 latches. The latches latch on the FALLING edge of the
clock - their clock inputs are swapped with respect to the walking ring
counter's clocks! This effectively stretches things out a bit. I have
indicated this below by showing the rising (r) and falling (f) edges.

The walking ring counter, decoder, and latch outputs are below, along with
misc. signals of interest.

    (ctr)   (decoder)      (latch)        (CLK #)       (rd) (rd)(wr)
    42 40   64 65 61 41    62 66 63 43    0  1  2  3    93  14  14  141 141 92  93
----------------------------------------------------------------------------------
r   0  0    0  1  1  1     1  1  1  0     1  0  0  0    1   1   1   0   0   d   1
f   0  0    0  1  1  1     0  1  1  1     0  1  0  0    1   1   1   0   0   d   1
r   1  0    1  0  1  1     0  1  1  1     0  1  0  0    0   1   1   1   1   1   0
f   1  0    1  0  1  1     1  0  1  1     0  0  1  0    0   0   1   1   d   1   0
r   1  1    1  1  0  1     1  0  1  1     0  0  1  0    0   0   1   1   d   1   0
f   1  1    1  1  0  1     1  1  0  1     0  0  0  1    0   1   1   1   1   1   0
r   0  1    1  1  1  0     1  1  0  1     0  0  0  1    0   1   1   1   1   1   0
f   0  1    1  1  1  0     1  1  1  0     1  0  0  0    1   1   1   0   0   d   1

d: delayed signal, approx. 1/4 clock, I suspect.

Stretched out with the delay, the read signal looks like this:

144 is the RAM_RD line, 148 is the A4 latch update line.

           (idle) (read)
     CLK2   144    144    148
r     0      0      0      0
r     0      0      0      0
f     0      0      0      0
f     0      0      0      0
r     0      1      1      0
r     0      1      1      0
f     1      1      1      0
f     1      1      0      0
r     1      1      0      0
r     1      1      0      0
f     0      1      0      1
f     0      1      1      0
r     0      1      1      0
r     0      1      1      0
f     0      0      0      0
f     0      0      0      0

Update order: this is the order in which latches are updated, which can
show pipeline/multiple operation order.

CLK#
0     Program counter is updated
0.5   RAM written to
1     Dout pin, RAM A4 holding register (both from carry flag), adder latch
1.5   RAM data latch updated
2     RAM address latch, accumulator updated
3     ROM data latched, RAM data updated, A4 holding transferred to A4 latch

--- --- --------------------- --- ---

CPU behaviour
-------------

Now that the schematic for the CPU has been traced out, I have started to
decode the opcodes that the chip can execute. So far, here is what I know.

These write to the accumulator:

0000 1xxx xxxx
0010 0xxx xxxx
0010 1xxx xxxx
0100 1xxx xxxx
0110 0xxx xxxx
0110 1xxx xxxx
1000 1xxx xxxx
1010 0xxx xxxx
1010 1xxx xxxx

Select operand A:

xxxx xx0x xxxx   select ROM
xxxx xx1x xxxx   select RAM

Select operand B:

xxxx x0xx xxxx   select RAM pointer
xxxx x1xx xxxx   select accumulator

Zero operand A:

01xx xxxx xxxx   normal operation (do not zero anything)
00xx xxxx xxxx   zero ROM data input (but not RAM data)
10xx xxxx xxxx   zero ROM data input (but not RAM data)
11xx xxxx xxxx   zero ROM data input (but not RAM data, never used
                 directly since this is JMP)

Zero operand B:

xx0x xxxx xxxx   zero RAM address latch (but not the accumulator)
xx1x xxxx xxxx   normal operation (do not zero anything)

Select carry source:

xx00 0xxx xxxx   Carry source is RAM/ROM D0, selected by D5
                 (ALU oper. A input bit 0, basically)

Accept/deny carry:

0xxx xxx1 xxxx   carry is accepted into the carry flag

Force RAM address:

10xx xxxx x1xx   force RAM address 0

Select ALU carry input:

10xx xxx1 xxxx   Din pin
00xx xxxx xx00   zero
00xx xxxx xx01   carry flag
00xx xxxx xx10   one
00xx xxxx xx11   inverted carry flag

Carry chain zapping:

00xx xxxx x1xx   zap carry chains on ALU - turning adder into XNOR

i.e. result = operand A + operand B
turns into: result = operand A XNOR operand B.

No carry inputs are accepted, though the output carry from D3 could be fed
into the carry flag if enabled via the other bits.
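
To make the carry input selection and the carry chain zapping concrete, here is a rough Python sketch of the 4 bit ALU for opcodes of the form 00xx xxxx xxxx only. It follows the tables above literally; the function names are hypothetical, and the carry out in zapped mode is returned as None because these notes do not pin down its value.

```python
from typing import Optional, Tuple


def select_carry_in(opcode: int, carry_flag: int) -> int:
    """Carry input for 00xx xxxx xxNN opcodes: 00=zero, 01=carry, 10=one, 11=!carry."""
    sel = opcode & 0b11
    return (0, carry_flag & 1, 1, (carry_flag & 1) ^ 1)[sel]


def alu4(a: int, b: int, opcode: int, carry_flag: int) -> Tuple[int, Optional[int]]:
    """Return (4-bit result, carry out) for a 12-bit opcode whose top two bits are 00."""
    if (opcode >> 10) & 0b11 != 0b00:
        raise ValueError("this sketch only models 00xx xxxx xxxx opcodes")
    a &= 0xF
    b &= 0xF
    if opcode & 0b100:
        # 00xx xxxx x1xx: carry chains zapped, the adder degenerates to XNOR.
        # No carry input is used; the carry out is left unspecified here.
        return (~(a ^ b)) & 0xF, None
    total = a + b + select_carry_in(opcode, carry_flag)
    return total & 0xF, total >> 4


if __name__ == "__main__":
    print(alu4(0x9, 0x8, 0b0000_0000_0000, 0))        # plain add, carry in = zero
    print(alu4(0b1100, 0b1010, 0b0000_0000_0100, 1))  # zapped: XNOR of the operands
```

Whether the real chip's operands are zeroed or swapped by the other opcode bits is ignored here; this only illustrates the carry handling.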

Adder latch control:

00xx xxxx xxxx   update adder latch
01xx xxxx xxxx   update adder latch
10xx xxxx xxxx   update adder latch
11xx xxxx xxxx   do not update adder latch (for jumps)

xxxx 00xx xxxx   write ALU result to RAM address latch (unless a JMP
                 opcode is executing)

10xx xxxx xx1x   latch carry flag to Dout pin
10xx xxxx xxx1   latch carry flag to RAM A4 when address latch is updated

10xx xxxx 1xxx   update RAM data latches into ALU from RAM array
00xx xxxx 1xxx   same as above

RAM writes:

0001 0xxx xxxx
0010 0xxx xxxx
0011 0xxx xxxx
0101 0xxx xxxx
0110 0xxx xxxx
0111 0xxx xxxx
1001 0xxx xxxx
1010 0xxx xxxx
1011 0xxx xxxx

RAM reads:

xxx1 1xxx xxxx   perform RAM read

Fully decoded opcodes:
----------------------

110x AAAA AAAA : JMP A      Unconditional jump
111x AAAA AAAA : JMP NC,A   Conditional jump - jump if carry is clear
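
The bit patterns in these notes can be checked mechanically. Below is a small Python sketch of a pattern matcher ("x" and "A" are treated as don't-care positions, with the leftmost character taken as bit 11), used to test for the RAM write patterns and to decode the two jump opcodes. The helper names are made up for illustration.

```python
from typing import Optional, Tuple

RAM_WRITE_PATTERNS = [
    "0001 0xxx xxxx", "0010 0xxx xxxx", "0011 0xxx xxxx",
    "0101 0xxx xxxx", "0110 0xxx xxxx", "0111 0xxx xxxx",
    "1001 0xxx xxxx", "1010 0xxx xxxx", "1011 0xxx xxxx",
]


def matches(word: int, pattern: str) -> bool:
    """True if the 12-bit word agrees with every 0/1 in the pattern."""
    bits = pattern.replace(" ", "")
    if len(bits) != 12:
        raise ValueError("pattern must describe exactly 12 bits")
    for i, ch in enumerate(bits):
        if ch in "01" and ((word >> (11 - i)) & 1) != int(ch):
            return False
    return True


def writes_ram(word: int) -> bool:
    """True if the opcode matches any pattern in the RAM writes list."""
    return any(matches(word, p) for p in RAM_WRITE_PATTERNS)


def decode_jump(word: int) -> Optional[Tuple[str, int]]:
    """Return (mnemonic, 8-bit target) for the two decoded jumps, else None."""
    if matches(word, "110x AAAA AAAA"):
        return ("JMP", word & 0xFF)
    if matches(word, "111x AAAA AAAA"):
        return ("JMP NC", word & 0xFF)
    return None


if __name__ == "__main__":
    print(decode_jump(0xC42))  # 1100 0100 0010
    print(writes_ram(0x100))   # 0001 0000 0000
```

The same matches() helper should work for any of the other pattern tables, as long as they stay 12 bits wide.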