Benchmarks

| Benchmark | C | ASM | Assembler Output | Description |
|---|---|---|---|---|
| 1A | c | asm | html, mem | Copy a 16-word block of memory a total of 16 times. |
| 1B | c | asm | html, mem | (same as 1A) |
| 2 | c | asm | html, mem | Calculate the 16th Fibonacci number recursively. |

Benchmarks 1A and 1B both perform the same function. We implemented it first (1A) with our mtm (memory-to-memory copy) instruction, and second (1B) with lw and sw. This allows us to compare the two versions and estimate how effective mtm is. Since the RTL for mtm required no changes to the datapath and only minimal changes to the control, it is unlikely that removing it would change our maximum clock speed. Thus, it is fair to directly compare the running times of Benchmarks 1A and 1B.


Performance Data

We calculated the following values for each benchmark:

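| Benchmark | LOC (lines of code) | I (instructions) | CYC (cycles) | CPI | RT (running time) |
|---|---|---|---|---|---|
| 1A | 17 | 1607 | 6408 | 3.986 | 0.650360 ms |
| 1B | 18 | 1863 | 7432 | 3.989 | 0.754288 ms |
| 2 | 36 | 67801 | 243080 | 3.585 | 24.671 ms |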

These calculations are based on a clock speed of 9.853 MHz, the maximum clock speed reported by the Xilinx Foundation Series software.
The calculations were performed in Excel 2000.


The mtm instruction saved 1024 cycles (0.104 ms) in Benchmark 1, compared to using lw and sw.