Bringing Up a Reverse Engineered CPU

I have previously written about reverse engineering a CPU, and how we extract a Verilog netlist from a layout. Getting the netlist is only the half-way point. Now it must be verified.

The netlist of the CPU might have 50 to 100 external connections, and since we extracted it from a physical layout, we don’t really know which signal is which. Loading the netlist into a Verilog simulator will generate errors on all of the inputs, since they will not be connected to anything. Most of the inputs can be identified by tracing the signals back to IO pads. If a line goes to the Reset pin, then it is probably reset, and we can identify the clock the same way. The databus is important to identify, though it can be difficult to determine which line is D0 and which line is D7. Other inputs can be tied off to a valid logic level and identified later on.

An initial milestone is to get the circuit to compile correctly in the simulator. The simulator performs a check on the netlist before it starts, so if the simulation reaches time 0 nanoseconds, all of the inputs have been identified and there are no syntax errors in the netlist.

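Now it is time to do something. Reset the part, start the clock, and see what happens. Initially, we put a NOP instruction on the databus, driven at a weak strength. This avoids any bus directionality problems: whenever the CPU drives the bus itself, its stronger drivers override the weak NOP instead of conflicting with it.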
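To illustrate why the weak drive helps, here is a toy Python model of strength resolution on a bus. It is only a sketch, not how a Verilog simulator is implemented: it resolves a whole byte at once rather than bit by bit, and the NOP encoding and the 0x5A write value are made-up placeholders.

```python
# Toy model of drive-strength resolution on one bus (not a Verilog simulator).
STRONG, WEAK, HIGHZ = 2, 1, 0
NOP = 0x00  # hypothetical NOP encoding; the real value depends on the CPU


def resolve(drivers):
    """Return the bus value given (value, strength) pairs.

    The strongest driver wins. Equal-strength drivers that disagree
    give 'x'. A bus with no active driver floats ('z').
    """
    active = [(v, s) for v, s in drivers if s != HIGHZ]
    if not active:
        return "z"
    top = max(s for _, s in active)
    values = {v for v, s in active if s == top}
    return values.pop() if len(values) == 1 else "x"


# CPU is reading (its drivers are off): the weak NOP reaches the bus.
read_cycle = resolve([(None, HIGHZ), (NOP, WEAK)])
# CPU is writing 0x5A: its strong driver overrides the weak NOP.
write_cycle = resolve([(0x5A, STRONG), (NOP, WEAK)])
# If the NOP were driven at full strength, the same write would conflict.
conflict = resolve([(0x5A, STRONG), (NOP, STRONG)])
```

With the weak NOP, the test bench does not need to know when the CPU is reading or writing, which is exactly what we have not worked out yet at this stage.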

The biggest problem we face at this stage is unknowns propagating throughout the simulation. Old designs frequently did not reset every storage element. For a divide-by-two in the clock circuit, for example, the state it powered up in was a don’t care. In real life, it might not matter, but it certainly does matter in a logic simulation, where an uninitialized toggle flop never leaves the unknown state. So, we frequently have to add a global reset to all latches and flops that either don’t have a reset, or whose reset is not controlled by the reset pin.
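The divide-by-two case can be shown with a small three-valued model in Python. This is a sketch of the simulator's behavior, not of the real circuit, and the `reset_cycles` parameter stands in for the global reset we add.

```python
X = "x"  # unknown logic value


def inv(a):
    """Three-valued inverter: NOT of an unknown is still unknown."""
    return X if a == X else 1 - a


def divide_by_two(cycles, reset_cycles=0):
    """Simulate a toggle flop (q <= ~q) for a number of clock edges.

    The flop powers up as X. During the first `reset_cycles` edges a
    (hypothetical) added reset forces q to 0.
    """
    q = X
    trace = []
    for n in range(cycles):
        q = 0 if n < reset_cycles else inv(q)
        trace.append(q)
    return trace


no_reset = divide_by_two(4)       # ['x', 'x', 'x', 'x']
with_reset = divide_by_two(4, 1)  # [0, 1, 0, 1]
```

Without the reset, the unknown feeds back on itself forever, and everything clocked from that divider goes unknown with it.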

Once the power-on reset (POR) issues are dealt with, the next step is to look at the unknowns that are still generated in the simulation. These can be the result of either a modeling issue that creates conflicts or setup and hold violations. Once identified, both are easy to fix.

The next step is a quick look at all of the opcodes. Do they run? Do they appear to work? This check reveals gross errors that need to be fixed before the next level of checks.

This is also the time to modify the test bench so that the CPU is executing code out of a Verilog memory, with read and write control. We also have to set up scripts so that we can write an assembly language program, assemble it, and then translate the HEX file into a Verilog memory. This makes it very easy to generate stimulus for the CPU.
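As a sketch of the last step, here is one way the translation could look in Python. It assumes the assembler emits Intel HEX and that the test bench memory is byte-wide and loaded with $readmemh. The function name is made up for illustration, and extended-address records are simply skipped.

```python
def intel_hex_to_readmemh(hex_text):
    """Convert Intel HEX text to lines for Verilog's $readmemh.

    Only data records (type 00) and end-of-file (type 01) are handled.
    Extended-address records are ignored, so this suits a 64 KB space.
    """
    out = []
    for line in hex_text.splitlines():
        line = line.strip()
        if not line.startswith(":"):
            continue
        raw = bytes.fromhex(line[1:])
        # All bytes of a record, checksum included, sum to 0 modulo 256.
        if sum(raw) & 0xFF != 0:
            raise ValueError(f"bad checksum: {line}")
        count, addr, rtype = raw[0], (raw[1] << 8) | raw[2], raw[3]
        if rtype == 0x01:  # end-of-file record
            break
        if rtype != 0x00:  # skip anything that is not a data record
            continue
        data = raw[4:4 + count]
        out.append(f"@{addr:04X}")  # $readmemh address marker
        out.append(" ".join(f"{b:02X}" for b in data))
    return "\n".join(out) + "\n"


# Three arbitrary bytes at address 0x0010, followed by the EOF record.
example = ":03001000010203E7\n:00000001FF\n"
mem_text = intel_hex_to_readmemh(example)
```

The resulting text can be written to a file and loaded by the test bench memory, so changing the stimulus only means editing the assembly source and rerunning the script.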

Now we go into a detailed check. At this level, we have a basic routine that initializes the processor, runs an opcode, and dumps the processor status so that we can see the results of the opcode, including any flags that might be set or cleared. We want each opcode exercised so that every flag it affects has been both set and cleared, and every conditional branch has been both taken and skipped. We make these tests self-checking, which reduces the amount of engineering time required to analyze the results. We also extract a test vector from these simulations, and use it to test the original part. This is how we confirm that we have created an exact copy.
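A minimal sketch of the self-checking idea, in Python. The test names, register names, and dictionary layout are all hypothetical; the real status dump depends on the processor and on how the test bench writes it out.

```python
def check_results(results, expected):
    """Compare simulated status dumps against expected values.

    Both arguments map a test name to a dict of register/flag values.
    Returns a list of mismatch messages (an empty list means pass).
    """
    failures = []
    for name, want in expected.items():
        got = results.get(name)
        if got is None:
            failures.append(f"{name}: no result")
            continue
        for field, value in want.items():
            if got.get(field) != value:
                failures.append(
                    f"{name}: {field} = {got.get(field)!r}, expected {value!r}"
                )
    return failures


def flag_coverage(results, flags):
    """Report (flag, state) pairs never observed across all tests."""
    missing = []
    for flag in flags:
        seen = {r[flag] for r in results.values() if flag in r}
        for state in (0, 1):
            if state not in seen:
                missing.append((flag, state))
    return missing


# Hypothetical dumps from two runs of an ADD opcode.
results = {
    "add_no_carry": {"A": 0x03, "C": 0, "Z": 0},
    "add_carry": {"A": 0x00, "C": 1, "Z": 1},
}
failures = check_results(results, results)
uncovered = flag_coverage(results, ["C", "Z", "N"])
```

Here `failures` is empty because the dumps are compared with themselves, and `uncovered` lists the N flag in both states, which tells us another test case is needed.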

Later, I will discuss design verification through the use of FPGA-based emulators.