Conference paper, Year: 2012

Mixed-precision Fused Multiply and Add

Abstract

The standard floating-point fused multiply and add (FMA) computes R=AB+C with a single rounding. This article investigates a variant of this operator where the addend C and the result R are of a larger format, for instance binary64 (double precision), while the multiplier inputs A and B are of a smaller format, for instance binary32 (single precision). With minor modifications, this operator is also able to perform the standard FMA in the smaller format, and the standard addition in the larger format. For sum-of-product applications, the proposed mixed-precision FMA provides the accumulation accuracy of the larger format, at a cost that is close to that of a classical FMA in the smaller format. Besides, it is fully compatible with existing arithmetic and language standards. The architectural cost of this operator is analysed in detail. An implementation of a mixed binary32/binary64 operator fully supporting subnormal numbers, binary64 addition and binary32 FMA is demonstrated and evaluated: its area overhead is one third over the classical binary32 FMA. Similarly, in high-end processors, a mixed binary64/binary128 FMA could provide an adequate solution to the binary128 requirements of very large scale computing applications.
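
The behaviour described in the abstract can be illustrated in software. The following is a minimal sketch, not the hardware operator evaluated in the paper. It relies on the fact that the product of two binary32 values has at most 48 significant bits and is therefore exact in binary64, so the final addition is the only rounding. The helper names (to_binary32, mixed_fma, fma32_emulated, dot) are hypothetical and chosen for illustration. The binary32 FMA is only approximated here: it rounds to binary64 and then to binary32, which can differ from a true single-rounding FMA in rare cases. The example values are chosen so that the intermediate result is exact.

```python
import struct


def to_binary32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def mixed_fma(a: float, b: float, c: float) -> float:
    """Emulate R = A*B + C with A, B in binary32 and C, R in binary64.

    The binary32 x binary32 product is exact in binary64, so the
    addition below performs the single rounding of the operation.
    """
    a32, b32 = to_binary32(a), to_binary32(b)
    return a32 * b32 + c


def fma32_emulated(a: float, b: float, c: float) -> float:
    """Approximate a binary32 FMA (assumption: double rounding is tolerated)."""
    a32, b32, c32 = (to_binary32(v) for v in (a, b, c))
    return to_binary32(a32 * b32 + c32)


def dot(xs, ys, step, acc=0.0):
    """Sum of products, accumulating with the given multiply-add step."""
    for x, y in zip(xs, ys):
        acc = step(x, y, acc)
    return acc


xs = [1.0] * 8
ys = [1.0] * 8
start = float(2 ** 24)  # binary32 cannot represent 2**24 + 1

wide = dot(xs, ys, mixed_fma, start)         # binary64 accumulator
narrow = dot(xs, ys, fma32_emulated, start)  # binary32 accumulator

print(wide)    # 16777224.0
print(narrow)  # 16777216.0
```

With a binary32 accumulator, each added product falls exactly halfway between two representable values and is rounded away, so the sum never moves. With the binary64 accumulator, all eight products are retained, which is the accumulation accuracy the abstract refers to.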
Main file
mpfma.pdf (127.87 KB)
Origin: Files produced by the author(s)

Dates and versions

ensl-00642157, version 1 (17-11-2011)

Identifiers

  • HAL Id: ensl-00642157, version 1

Cite

Nicolas Brunie, Florent de Dinechin, Benoît de Dinechin. Mixed-precision Fused Multiply and Add. 45th Asilomar Conference on Signals, Systems & Computers, Nov 2011, United States. pp.165-169. ⟨ensl-00642157⟩
