A Multi-GPU Implementation of a D2Q37 Lattice Boltzmann Code (Conference proceedings contribution)

Type
Label
  • A Multi-GPU Implementation of a D2Q37 Lattice Boltzmann Code (Conference proceedings contribution) (literal)
Year
  • 2012-01-01T00:00:00+01:00 (literal)
Alternative label
  • Biferale, Luca and Mantovani, Filippo and Pivanti, Marcello and Pozzati, Fabio and Sbragaglia, Mauro and Scagliarini, Andrea and Schifano, Sebastiano Fabio and Toschi, Federico and Tripiccione, Raffaele (2012)
    A Multi-GPU Implementation of a D2Q37 Lattice Boltzmann Code
    in 9th International Conference on Parallel Processing and Applied Mathematics (PPAM), Torun, POLAND, SEP 11-14, 2011
    (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#autori
  • Biferale, Luca and Mantovani, Filippo and Pivanti, Marcello and Pozzati, Fabio and Sbragaglia, Mauro and Scagliarini, Andrea and Schifano, Sebastiano Fabio and Toschi, Federico and Tripiccione, Raffaele (literal)
Start page
  • 640 (literal)
End page
  • 650 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#altreInformazioni
  • 9th International Conference on Parallel Processing and Applied Mathematics (PPAM), Torun, POLAND, SEP 11-14, 2011 (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#titoloVolume
  • PARALLEL PROCESSING AND APPLIED MATHEMATICS, PT I (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#numeroVolume
  • 7203 (literal)
Note
  • ISI Web of Science (WOS) (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#affiliazioni
  • Biferale, L (Reprint Author), Univ Roma Tor Vergata, Rome, Italy. Biferale, Luca; Sbragaglia, Mauro, Univ Roma Tor Vergata, Rome, Italy. (literal)
Title
  • A Multi-GPU Implementation of a D2Q37 Lattice Boltzmann Code (literal)
Http://www.cnr.it/ontology/cnr/pubblicazioni.owl#isbn
  • 978-3-642-31463-6 (literal)
Abstract
  • We describe a parallel implementation of a compressible Lattice Boltzmann code on a multi-GPU cluster based on Nvidia Fermi processors. We analyze how to optimize the algorithm for GP-GPU architectures, describe the implementation choices that we have adopted, and compare our performance results with an implementation optimized for latest-generation multi-core CPUs. Our program runs at approximately 30% of the double-precision peak performance of one GPU and shows almost linear scaling when run on the multi-GPU cluster. (literal)