ajuda

Você está em: Início > Publicações > Visualização > On the Performance Effect of Loop Trace Window Size on Scheduling for Configurable Coarse Grain Loop Accelerators

Mapa das Instalações

Publicação

Pesquisa de Publicações

On the Performance Effect of Loop Trace Window Size on Scheduling for Configurable Coarse Grain Loop Accelerators

Título

On the Performance Effect of Loop Trace Window Size on Scheduling for Configurable Coarse Grain Loop AcceleratorsExportar publicação no formato APA Exportar publicação no formato EXCEL Exportar publicação no formato RIS

Tipo

Artigo em Livro de Atas de Conferência Internacional

Data

2021

Título

On the Performance Effect of Loop Trace Window Size on Scheduling for Configurable Coarse Grain Loop Accelerators

Tipo

Artigo em Livro de Atas de Conferência Internacional

Ano

2021

Autores

Santos, T

(Autor)

Outra

A pessoa não pertence à instituição. A pessoa não pertence à instituição. A pessoa não pertence à instituição. Sem AUTHENTICUS Sem ORCID

Nuno Paulino

(Autor)

FEUP

Ver página pessoal Sem permissões para visualizar e-mail institucional Pesquisar Publicações do Participante Ver página do Authenticus Ver página ORCID

João Bispo

(Autor)

FEUP

Ver página pessoal Enviar mensagem Pesquisar Publicações do Participante Ver página do Authenticus Ver página ORCID

João M. P. Cardoso

(Autor)

FEUP

Ver página pessoal Enviar mensagem Pesquisar Publicações do Participante Ver página do Authenticus Ver página ORCID

João Canas Ferreira

(Autor)

FEUP

Ver página pessoal Enviar mensagem Pesquisar Publicações do Participante Ver página do Authenticus Ver página ORCID

Ata de Conferência Internacional

Título: International Conference on Field-Programmable Technology, (IC)FPT 2021, Auckland, New Zealand, December 6-10, 2021 Pesquisar Publicações da Ata de Conferência

Páginas: 65-68

20th International Conference on Field-Programmable Technology, ICFPT 2021

6 December 2021 through 10 December 2021

Indexação

ISI Web of Knowledge - 0 Citações

Scopus - 0 Citações

Outras Informações

ID Authenticus: P-00V-V19

DOI: 10.1109/icfpt52863.2021.9609868

Abstract (EN): By using Dynamic Binary Translation, instruction traces from pre-compiled applications can be offloaded, at runtime, to FPGA-based accelerators, such as Coarse-Grained Loop Accelerators, in a transparent way. However, scheduling onto coarse-grain accelerators is challenging, with two of current known issues being the density of computations that can be mapped, and the effects of memory accesses on performance. Using an in-house framework for analysis of instruction traces, we explore the effect of different window sizes when applying list scheduling, to map the window operations to a coarse-grain loop accelerator model that has been previously experimentally validated. For all window sizes, we vary the number of ALUs and memory ports available in the model, and comment how these parameters affect the resulting latency. For a set of benchmarks taken from the PolyBench suite, compiled for the 32-bit MicroBlaze softcore, we have achieved an average iteration speedup of 5.10x for a basic block repeated 5 times and scheduled with 8 ALUs and memory ports, and an average speedup of 5.46x when not considering resource constraints. We also identify which benchmarks contribute to the difference between these two speedups, and breakdown their limiting factors. Finally, we reflect on the impact memory dependencies have on scheduling.

Idioma: Inglês

Tipo (Avaliação Docente): Científica

Nº de páginas: 4

Documentos

Não foi encontrado nenhum documento associado à publicação.

Recomendar Página Voltar ao Topo

Copyright 1996-2025 © Faculdade de Medicina Dentária da Universidade do Porto I Termos e Condições I Acessibilidade I Índice A-Z
Página gerada em: 2025-07-31 às 00:06:32 | Política de Privacidade | Política de Proteção de Dados Pessoais | Denúncias | Livro Amarelo Eletrónico