Abstract (EN):
Instructions that concurrently process data units smaller than a whole CPU word (subword operations) are useful in areas such as multimedia processing and cryptography. Since the processors used in FPGA-based embedded systems lack support for such operations, this paper proposes mapping sequences of subword operations to a set of hardware components and generating the corresponding FPGA partial configurations at run-time. The technique is aimed at adaptive embedded systems that employ run-time reconfiguration to achieve high flexibility and performance. New partial configurations for circuits implementing sets of subword operations are created by merging the relocated partial configurations of the hardware components (taken from a predefined library) with the configurations of the switch matrices used to connect the components. The paper presents and discusses results obtained for a 300 MHz PowerPC CPU in a Virtex-II Pro platform FPGA. For the set of benchmarks analyzed, the complete configuration creation process takes between 5 s and 60 s. The hardware versions generated at run-time achieved speed-ups between 17 and 73 over the software versions.
Language:
English
Type (Professor's evaluation):
Scientific
Contact:
jcf@fe.up.pt
Notes:
An extended version of this paper was published in the journal Microprocessors and Microsystems.