W. S. Briggs, T. B. Brightman, and D. W. Matula, Method and apparatus for performing the square root function using a rectangular aspect ratio multiplier, p.566, 1992.

W. S. Briggs and D. W. Matula, A 17x69-bit multiply and add unit with redundant binary feedback and single cycle latency, 11th Symposium on Computer Arithmetic, pp.163-170, 1993.

M. Cornea, J. Harrison, and P. T. Tang, Scientific Computing on Itanium®-based Systems, 2002.

F. de Dinechin, M. Joldes, and B. Pasca, Automatic generation of polynomial-based hardware architectures for function evaluation, ASAP 2010, 21st IEEE International Conference on Application-specific Systems, Architectures and Processors, 2010.
DOI : 10.1109/ASAP.2010.5540952
URL : https://hal.archives-ouvertes.fr/ensl-00470506

F. de Dinechin, C. Klein, and B. Pasca, Generating high-performance custom floating-point pipelines, 2009 International Conference on Field Programmable Logic and Applications, pp.59-64, 2009.
DOI : 10.1109/FPL.2009.5272553
URL : https://hal.archives-ouvertes.fr/ensl-00379154

F. de Dinechin and B. Pasca, Large multipliers with fewer DSP blocks, 2009 International Conference on Field Programmable Logic and Applications, pp.250-255, 2009.
DOI : 10.1109/FPL.2009.5272296

J. Detrey and F. de Dinechin, A Tool for Unbiased Comparison between Logarithmic and Floating-point Arithmetic, The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, vol.23, issue.1, pp.161-175, 2007.
DOI : 10.1007/s11265-007-0048-7
URL : https://hal.archives-ouvertes.fr/ensl-00542212

J. Detrey, F. de Dinechin, and X. Pujol, Return of the hardware floating-point elementary function, 18th Symposium on Computer Arithmetic, pp.161-168, 2007.
URL : https://hal.archives-ouvertes.fr/ensl-00117386

M. D. Ercegovac and T. Lang, Digital Arithmetic, 2003.
URL : https://hal.archives-ouvertes.fr/ensl-00542215

C. Jeannerod, H. Knochel, C. Monat, and G. Revy, Faster floating-point square root for integer processors, IEEE Symposium on Industrial Embedded Systems (SIES'07), 2007.
DOI : 10.1109/sies.2007.4297353
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.106.6002

T. Lang and P. Montuschi, Very high radix square root with prescaling and rounding and a combined division/square root unit, IEEE Transactions on Computers, vol.48, issue.8, pp.827-841, 1999.
DOI : 10.1109/12.795124

M. Langhammer, Foundation for FPGA acceleration, Fourth Annual Reconfigurable Systems Summer Institute, 2008.

B. Lee and N. Burgess, Parameterisable floating-point operators on FPGAs, 36th Asilomar Conference on Signals, Systems, and Computers, pp.1064-1068, 2002.
DOI : 10.1109/acssc.2002.1196947

D. Lee, A. Gaffar, O. Mencer, and W. Luk, Optimizing Hardware Function Evaluation, IEEE Transactions on Computers, vol.54, issue.12, pp.1520-1531, 2005.
DOI : 10.1109/TC.2005.201

Y. Li and W. Chu, Implementation of single precision floating point square root on FPGAs, FPGAs for Custom Computing Machines, pp.56-65, 1997.

P. Markstein, IA-64 and Elementary Functions: Speed and Precision. Hewlett-Packard Professional Books, 2000.

J. Muller, Elementary Functions, Algorithms and Implementation. Birkhäuser, 2006.
URL : https://hal.archives-ouvertes.fr/ensl-00000008

J. A. Pineiro and J. D. Bruguera, High-speed double-precision computation of reciprocal, division, square root, and inverse square root, IEEE Transactions on Computers, vol.51, issue.12, pp.1377-1388, 2002.
DOI : 10.1109/TC.2002.1146704

D. M. Russinoff, A mechanically checked proof of correctness of the AMD K5 floating point square root microcode, Formal Methods in System Design, pp.75-125, 1999.

X. Wang, S. Braganza, and M. Leeser, Advanced Components in the Variable Precision Floating-Point Library, 2006 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, pp.249-258, 2006.
DOI : 10.1109/FCCM.2006.21