Effective data access patterns on massively parallel processors (Contribution in volume (chapter or essay))

Type
Label
  • Effective data access patterns on massively parallel processors (Contribution in volume (chapter or essay)) (literal)
Year
  • 2014-01-01T00:00:00+01:00 (literal)
Alternative label
  • Capannini G., Baraglia R., Silvestri F., Nardini F. M. (2014)
    Effective data access patterns on massively parallel processors
    John Wiley & Sons Inc., Hoboken (United States of America) in High-Performance Computing on Complex Environments, 2014
    (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autori
  • Capannini G., Baraglia R., Silvestri F., Nardini F. M. (literal)
Start page
  • 115 (literal)
End page
  • 134 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#titoloVolume
  • High-Performance Computing on Complex Environments (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#pagineTotali
  • 11 (literal)
Note
  • PuMa (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#affiliazioni
  • D&IT Chalmers, Göteborg, Sweden; CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy; CNR-ISTI, Pisa, Italy (literal)
Title
  • Effective data access patterns on massively parallel processors (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#isbn
  • 978-1-118-71205-4 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#curatoriVolume
  • Emmanuel Jeannot, Julius Zilinskas (literal)
Abstract
  • The new generation of microprocessors incorporates a huge number of cores on the same chip. This trades off single-core performance for the total amount of work done across multiple threads of execution. Graphics Processing Units (GPUs) are an example of this kind of architecture. The first generation of GPUs was designed to support a fixed set of rendering functions. Nowadays, GPUs are becoming easier to program. Therefore, they can be used for applications that have traditionally been handled by CPUs. The reasons for using General Purpose GPUs (GPGPUs) in high-performance computations are: raw computing power, good performance per watt, and low cost. However, some important issues limit a wide exploitation of GPGPUs. The main one concerns the heterogeneous and distributed nature of the memory hierarchy. As a consequence, the speed-up of some applications depends on being able to efficiently access the data so that all cores are able to work at the same time. This chapter discusses the characteristics and the issues of the memory systems of this kind of architecture. We analyze these architectures from a theoretical point of view by using K-model, a model for capturing their performance constraints. K-model is used to estimate the complexity of a given algorithm defined on this model. This chapter describes how K-model can also be used to design efficient data access patterns for implementing efficient GPU algorithms. To this end, we use K-model to derive an efficient realization of two popular algorithms, i.e., prefix sum and sorting. By means of reproducible experiments, we validate theoretical results showing that the optimization of an algorithm based on K-model corresponds to an actual optimization in practice. (literal)