Você está em: Start > Publications > View > On the Performance Effect of Loop Trace Window Size on Scheduling for Configurable Coarse Grain Loop Accelerators

Map of Premises

Publication

Publication Search

Publications

On the Performance Effect of Loop Trace Window Size on Scheduling for Configurable Coarse Grain Loop Accelerators

Title

On the Performance Effect of Loop Trace Window Size on Scheduling for Configurable Coarse Grain Loop AcceleratorsExport publication in the APA format Export publication in the EXCEL format Export publication in the RIS format

Type

Article in International Conference Proceedings Book

Date

2021

Title

On the Performance Effect of Loop Trace Window Size on Scheduling for Configurable Coarse Grain Loop Accelerators

Type

Article in International Conference Proceedings Book

Year

2021

Authors

Santos, T

(Author)

Other

The person does not belong to the institution. The person does not belong to the institution. The person does not belong to the institution. Without AUTHENTICUS Without ORCID

Nuno Paulino

(Author)

FEUP

View Personal Page You do not have permissions to view the institutional email. Search for Participant Publications View Authenticus page View ORCID page

João Bispo

(Author)

FEUP

View Personal Page Send message Search for Participant Publications View Authenticus page View ORCID page

João M. P. Cardoso

(Author)

FEUP

View Personal Page Send message Search for Participant Publications View Authenticus page View ORCID page

João Canas Ferreira

(Author)

FEUP

View Personal Page Send message Search for Participant Publications View Authenticus page View ORCID page

Conference proceedings International

Title: International Conference on Field-Programmable Technology, (IC)FPT 2021, Auckland, New Zealand, December 6-10, 2021 Search for Conference Proceedings Publications

Pages: 65-68

20th International Conference on Field-Programmable Technology, ICFPT 2021

6 December 2021 through 10 December 2021

Indexing

ISI Web of Knowledge - 0 Citations

Scopus - 0 Citations

Other information

Authenticus ID: P-00V-V19

DOI: 10.1109/icfpt52863.2021.9609868

Abstract (EN): By using Dynamic Binary Translation, instruction traces from pre-compiled applications can be offloaded, at runtime, to FPGA-based accelerators, such as Coarse-Grained Loop Accelerators, in a transparent way. However, scheduling onto coarse-grain accelerators is challenging, with two of current known issues being the density of computations that can be mapped, and the effects of memory accesses on performance. Using an in-house framework for analysis of instruction traces, we explore the effect of different window sizes when applying list scheduling, to map the window operations to a coarse-grain loop accelerator model that has been previously experimentally validated. For all window sizes, we vary the number of ALUs and memory ports available in the model, and comment how these parameters affect the resulting latency. For a set of benchmarks taken from the PolyBench suite, compiled for the 32-bit MicroBlaze softcore, we have achieved an average iteration speedup of 5.10x for a basic block repeated 5 times and scheduled with 8 ALUs and memory ports, and an average speedup of 5.46x when not considering resource constraints. We also identify which benchmarks contribute to the difference between these two speedups, and breakdown their limiting factors. Finally, we reflect on the impact memory dependencies have on scheduling.

Language: English

Type (Professor's evaluation): Scientific

No. of pages: 4

Documents

We could not find any documents associated to the publication.

Recommend this page Top

Copyright 1996-2025 © Faculdade de Direito da Universidade do Porto I Terms and Conditions I Acessibility I Index A-Z
Page created on: 2025-08-26 at 21:32:12 | Privacy Policy | Personal Data Protection Policy | Whistleblowing