Date of Award:


Document Type:


Degree Name:

Master of Science (MS)


Electrical and Computer Engineering


Aravind Dasu


Fast Block Matching (FBM) algorithms for video compression are well suited for acceleration using parallel data-path architectures on Field Programmable Gate Arrays (FPGAs). However, designing an efficient on-chip memory subsystem to provide the required throughput to this parallel data-path architecture is a complex problem. This thesis presents a memory architecture template that can be parameterized for a given FBM algorithm, number of parallel Processing Elements (PEs), and block size. The template can be parameterized with well known exploration techniques to design efficient on-chip memory subsystems. The memory subsystems are derived for two existing FBM algorithms and are implemented on a Xilinx Virtex 4 family of FPGAs. Results show that the derived memory subsystem in the best case supports up to 27 more parallel PEs than the three existing subsystems and processes integer pixels in a 1080p video sequence up to a rate of 73 frames per second. The speculative execution of an FBM algorithm for the same number of PEs increases the number of frames processed per second by 49%.