| Benchmark | C | ASM | Assembler Output (html) | Assembler Output (mem) | Description |
|---|---|---|---|---|---|
| 1A | c | asm | html | mem | Copy a 16-word block of memory a total of 16 times. |
| 1B | c | asm | html | mem | Copy a 16-word block of memory a total of 16 times (same as 1A, using lw and sw instead of mtm). |
| 2 | c | asm | html | mem | Calculate the 16th Fibonacci number recursively. |
Benchmarks 1A and 1B perform the same function. We implemented it first (1A) with our mtm (memory-to-memory copy) operation and second (1B) with lw and sw. This lets us compare the two versions and estimate how effective mtm is. Since the RTL for mtm required no changes to the datapath and only minimal changes to the control, it is unlikely that removing it would change our maximum clock speed. Thus, it is fair to directly compare the running times of Benchmarks 1A and 1B.
We calculated the following values for each benchmark:
| Benchmark | Lines of code (LOC) | Instructions executed (I) | Cycles (CYC) | Cycles per instruction (CPI) | Running time (RT) |
|---|---|---|---|---|---|
| 1A | 17 | 1607 | 6408 | 3.988 | 0.650360 ms |
| 1B | 18 | 1863 | 7432 | 3.989 | 0.754288 ms |
| 2 | 36 | 67801 | 243080 | 3.585 | 24.671 ms |
These calculations are based on a clock speed of 9.853 MHz, the maximum clock speed reported by the Xilinx Foundation Series software.
The calculations were performed in Excel 2000.
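The derived columns can be recomputed from the instruction and cycle counts. The following is a minimal sketch in Python rather than the original spreadsheet. It assumes CPI is cycles divided by instructions executed and that running time is cycles divided by the clock frequency. The function names `cpi` and `run_time_ms` are illustrative only.

```python
# Recompute CPI and running time from the I and CYC columns of the table.
# Assumption: CPI = CYC / I and RT = CYC / clock frequency.
CLOCK_HZ = 9.853e6  # maximum clock speed quoted in the text (9.853 MHz)

# benchmark name -> (instructions executed, total cycles), copied from the table
benchmarks = {
    "1A": (1607, 6408),
    "1B": (1863, 7432),
    "2": (67801, 243080),
}


def cpi(instructions, cycles):
    """Average cycles per instruction."""
    return cycles / instructions


def run_time_ms(cycles, clock_hz=CLOCK_HZ):
    """Running time in milliseconds at the given clock frequency."""
    return cycles / clock_hz * 1e3


for name, (instructions, cycles) in benchmarks.items():
    print(f"{name}: CPI = {cpi(instructions, cycles):.3f}, "
          f"RT = {run_time_ms(cycles):.6f} ms")
```

Changing `CLOCK_HZ` shows how the running times would scale if a different maximum clock speed were reported.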
The mtm instruction saved 1024 cycles (0.104 ms) on Benchmark 1 compared with using lw and sw, a reduction of about 13.8% in running time.
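The savings can be checked from the table values with a short Python sketch. The observation that the saved instruction count equals the number of words copied (16 words, 16 times) is an inference from the numbers, not something stated in this report. The variable names are illustrative.

```python
# Compare Benchmark 1A (mtm) with Benchmark 1B (lw and sw) using the table values.
CLOCK_HZ = 9.853e6  # clock speed quoted in the text

instr_1a, cycles_1a = 1607, 6408  # mtm version
instr_1b, cycles_1b = 1863, 7432  # lw/sw version

word_copies = 16 * 16                    # 16-word block copied 16 times
instr_saved = instr_1b - instr_1a        # instructions avoided by using mtm
cycles_saved = cycles_1b - cycles_1a     # cycles avoided by using mtm
time_saved_ms = cycles_saved / CLOCK_HZ * 1e3
reduction = cycles_saved / cycles_1b     # fraction of 1B's running time saved

# Inference only: one instruction saved per word copied.
print(f"word copies: {word_copies}, instructions saved: {instr_saved}")
print(f"cycles saved: {cycles_saved} ({time_saved_ms:.3f} ms)")
print(f"cycles per saved instruction: {cycles_saved / instr_saved:.1f}")
print(f"running-time reduction: {reduction:.1%}")
```

Because both versions are assumed to run at the same clock speed, the ratio of cycle counts is also the ratio of running times.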