Index: gcc/gmp/doc/gmp.texi |
diff --git a/gcc/gmp/doc/gmp.texi b/gcc/gmp/doc/gmp.texi |
deleted file mode 100644 |
index af21941ad8131a239b9946ab8ad8f7c4ca57df94..0000000000000000000000000000000000000000 |
--- a/gcc/gmp/doc/gmp.texi |
+++ /dev/null |
@@ -1,10450 +0,0 @@ |
-\input texinfo @c -*-texinfo-*- |
-@c %**start of header |
-@setfilename gmp.info |
-@documentencoding ISO-8859-1 |
-@include version.texi |
-@settitle GNU MP @value{VERSION} |
-@synindex tp fn |
-@iftex |
-@afourpaper |
-@end iftex |
-@comment %**end of header |
- |
-@copying |
-This manual describes how to install and use the GNU multiple precision |
-arithmetic library, version @value{VERSION}. |
- |
-Copyright 1991, 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002, |
-2003, 2004, 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc. |
- |
-Permission is granted to copy, distribute and/or modify this document under |
-the terms of the GNU Free Documentation License, Version 1.2 or any later |
-version published by the Free Software Foundation; with no Invariant Sections, |
-with the Front-Cover Texts being ``A GNU Manual'', and with the Back-Cover |
-Texts being ``You have freedom to copy and modify this GNU Manual, like GNU |
-software''. A copy of the license is included in |
-@ref{GNU Free Documentation License}. |
-@end copying |
-@c Note the @ref above must be on one line, a line break in an @ref within |
-@c @copying will bomb in recent texinfo.tex (eg. 2004-04-07.08 which comes |
-@c with texinfo 4.7), with messages about missing @endcsname. |
- |
- |
-@c Texinfo version 4.2 or up will be needed to process this file. |
-@c |
-@c The version number and edition number are taken from version.texi provided |
-@c by automake (note that it's regenerated only if you configure with |
-@c --enable-maintainer-mode). |
-@c |
-@c Notes discussing the present version number of GMP in relation to previous |
-@c ones (for instance in the "Compatibility" section) must be updated |
-@c manually though. |
-@c |
-@c @cindex entries have been made for function categories and programming |
-@c topics. The "mpn" section is not included in this, because a beginner |
-@c looking for "GCD" or something is only going to be confused by pointers to |
-@c low level routines. |
-@c |
-@c @cindex entries are present for processors and systems when there's |
-@c particular notes concerning them, but not just for everything GMP |
-@c supports. |
-@c |
-@c Index entries for files use @code rather than @file, @samp or @option, |
-@c since the latter come out with quotes in TeX, which are nice in the text |
-@c but don't look so good in index columns. |
-@c |
-@c Tex: |
-@c |
-@c A suitable texinfo.tex is supplied, a newer one should work equally well. |
-@c |
-@c HTML: |
-@c |
-@c Nothing special is done for links to external manuals, they just come out |
-@c in the usual makeinfo style, eg. "../libc/Locales.html". If you have |
-@c local copies of such manuals then this is a good thing, if not then you |
-@c may want to search-and-replace to some online source. |
-@c |
- |
-@dircategory GNU libraries |
-@direntry |
-* gmp: (gmp). GNU Multiple Precision Arithmetic Library. |
-@end direntry |
- |
-@c html <meta name="description" content="..."> |
-@documentdescription |
-How to install and use the GNU multiple precision arithmetic library, version @value{VERSION}. |
-@end documentdescription |
- |
-@c smallbook |
-@finalout |
-@setchapternewpage on |
- |
-@ifnottex |
-@node Top, Copying, (dir), (dir) |
-@top GNU MP |
-@end ifnottex |
- |
-@iftex |
-@titlepage |
-@title GNU MP |
-@subtitle The GNU Multiple Precision Arithmetic Library |
-@subtitle Edition @value{EDITION} |
-@subtitle @value{UPDATED} |
- |
-@author by The GMP developers |
-@c @email{tege@@gmplib.org} |
- |
-@c Include the Distribution inside the titlepage so |
-@c that headings are turned off. |
- |
-@tex |
-\global\parindent=0pt |
-\global\parskip=8pt |
-\global\baselineskip=13pt |
-@end tex |
- |
-@page |
-@vskip 0pt plus 1filll |
-@end iftex |
- |
-@insertcopying |
-@ifnottex |
-@sp 1 |
-@end ifnottex |
- |
-@iftex |
-@end titlepage |
-@headings double |
-@end iftex |
- |
-@c Don't bother with contents for html, the menus seem adequate. |
-@ifnothtml |
-@contents |
-@end ifnothtml |
- |
-@menu |
-* Copying:: GMP Copying Conditions (LGPL). |
-* Introduction to GMP:: Brief introduction to GNU MP. |
-* Installing GMP:: How to configure and compile the GMP library. |
-* GMP Basics:: What every GMP user should know. |
-* Reporting Bugs:: How to usefully report bugs. |
-* Integer Functions:: Functions for arithmetic on signed integers. |
-* Rational Number Functions:: Functions for arithmetic on rational numbers. |
-* Floating-point Functions:: Functions for arithmetic on floats. |
-* Low-level Functions:: Fast functions for natural numbers. |
-* Random Number Functions:: Functions for generating random numbers. |
-* Formatted Output:: @code{printf} style output. |
-* Formatted Input:: @code{scanf} style input. |
-* C++ Class Interface:: Class wrappers around GMP types. |
-* BSD Compatible Functions:: All functions found in BSD MP. |
-* Custom Allocation:: How to customize the internal allocation. |
-* Language Bindings:: Using GMP from other languages. |
-* Algorithms:: What happens behind the scenes. |
-* Internals:: How values are represented behind the scenes. |
- |
-* Contributors:: Who brings you this library? |
-* References:: Some useful papers and books to read. |
-* GNU Free Documentation License:: |
-* Concept Index:: |
-* Function Index:: |
-@end menu |
- |
- |
-@c @m{T,N} is $T$ in tex or @math{N} otherwise. This is an easy way to give |
-@c different forms for math in tex and info. Commas in N or T don't work, |
-@c but @C{} can be used instead. \, works in info but not in tex. |
-@iftex |
-@macro m {T,N} |
-@tex$\T\$@end tex |
-@end macro |
-@end iftex |
-@ifnottex |
-@macro m {T,N} |
-@math{\N\} |
-@end macro |
-@end ifnottex |
- |
-@macro C {} |
-, |
-@end macro |
- |
-@c @ms{V,N} is $V_N$ in tex or just vn otherwise. This suits simple |
-@c subscripts like @ms{x,0}. |
-@iftex |
-@macro ms {V,N} |
-@tex$\V\_{\N\}$@end tex |
-@end macro |
-@end iftex |
-@ifnottex |
-@macro ms {V,N} |
-\V\\N\ |
-@end macro |
-@end ifnottex |
- |
-@c @nicode{S} is plain S in info, or @code{S} elsewhere. This can be used |
-@c when the quotes that @code{} gives in info aren't wanted, but the |
-@c fontification in tex or html is wanted. Doesn't work as @nicode{'\\0'} |
-@c though (gives two backslashes in tex). |
-@ifinfo |
-@macro nicode {S} |
-\S\ |
-@end macro |
-@end ifinfo |
-@ifnotinfo |
-@macro nicode {S} |
-@code{\S\} |
-@end macro |
-@end ifnotinfo |
- |
-@c @nisamp{S} is plain S in info, or @samp{S} elsewhere. This can be used |
-@c when the quotes that @samp{} gives in info aren't wanted, but the |
-@c fontification in tex or html is wanted. |
-@ifinfo |
-@macro nisamp {S} |
-\S\ |
-@end macro |
-@end ifinfo |
-@ifnotinfo |
-@macro nisamp {S} |
-@samp{\S\} |
-@end macro |
-@end ifnotinfo |
- |
-@c Usage: @GMPtimes{} |
-@c Give either \times or the word "times". |
-@tex |
-\gdef\GMPtimes{\times} |
-@end tex |
-@ifnottex |
-@macro GMPtimes |
-times |
-@end macro |
-@end ifnottex |
- |
-@c Usage: @GMPmultiply{} |
-@c Give * in info, or nothing in tex. |
-@tex |
-\gdef\GMPmultiply{} |
-@end tex |
-@ifnottex |
-@macro GMPmultiply |
-* |
-@end macro |
-@end ifnottex |
- |
-@c Usage: @GMPabs{x} |
-@c Give either |x| in tex, or abs(x) in info or html. |
-@tex |
-\gdef\GMPabs#1{|#1|} |
-@end tex |
-@ifnottex |
-@macro GMPabs {X} |
-@abs{}(\X\) |
-@end macro |
-@end ifnottex |
- |
-@c Usage: @GMPfloor{x} |
-@c Give either \lfloor x\rfloor in tex, or floor(x) in info or html. |
-@tex |
-\gdef\GMPfloor#1{\lfloor #1\rfloor} |
-@end tex |
-@ifnottex |
-@macro GMPfloor {X} |
-floor(\X\) |
-@end macro |
-@end ifnottex |
- |
-@c Usage: @GMPceil{x} |
-@c Give either \lceil x\rceil in tex, or ceil(x) in info or html. |
-@tex |
-\gdef\GMPceil#1{\lceil #1 \rceil} |
-@end tex |
-@ifnottex |
-@macro GMPceil {X} |
-ceil(\X\) |
-@end macro |
-@end ifnottex |
- |
-@c Math operators already available in tex, made available in info too. |
-@c For example @bmod{} can be used in both tex and info. |
-@ifnottex |
-@macro bmod |
-mod |
-@end macro |
-@macro gcd |
-gcd |
-@end macro |
-@macro ge |
->= |
-@end macro |
-@macro le |
-<= |
-@end macro |
-@macro log |
-log |
-@end macro |
-@macro min |
-min |
-@end macro |
-@macro leftarrow |
-<- |
-@end macro |
-@macro rightarrow |
--> |
-@end macro |
-@end ifnottex |
- |
-@c New math operators. |
-@c @abs{} can be used in both tex and info, or just \abs in tex. |
-@tex |
-\gdef\abs{\mathop{\rm abs}} |
-@end tex |
-@ifnottex |
-@macro abs |
-abs |
-@end macro |
-@end ifnottex |
- |
-@c @cross{} is a \times symbol in tex, or an "x" in info. In tex it works |
-@c inside or outside $ $. |
-@tex |
-\gdef\cross{\ifmmode\times\else$\times$\fi} |
-@end tex |
-@ifnottex |
-@macro cross |
-x |
-@end macro |
-@end ifnottex |
- |
-@c @times{} made available as a "*" in info and html (already works in tex). |
-@ifnottex |
-@macro times |
-* |
-@end macro |
-@end ifnottex |
- |
-@c Usage: @W{text} |
-@c Like @w{} but working in math mode too. |
-@tex |
-\gdef\W#1{\ifmmode{#1}\else\w{#1}\fi} |
-@end tex |
-@ifnottex |
-@macro W {S} |
-@w{\S\} |
-@end macro |
-@end ifnottex |
- |
-@c Usage: \GMPdisplay{text} |
-@c Put the given text in an @display style indent, but without turning off |
-@c paragraph reflow etc. |
-@tex |
-\gdef\GMPdisplay#1{% |
-\noindent |
-\advance\leftskip by \lispnarrowing |
-#1\par} |
-@end tex |
- |
-@c Usage: \GMPhat |
-@c A new \hat that will work in math mode, unlike the texinfo redefined |
-@c version. |
-@tex |
-\gdef\GMPhat{\mathaccent"705E} |
-@end tex |
- |
-@c Usage: \GMPraise{text} |
-@c For use in a $ $ math expression as an alternative to "^". This is good |
-@c for @code{} in an exponent, since there seems to be no superscript font |
-@c for that. |
-@tex |
-\gdef\GMPraise#1{\mskip0.5\thinmuskip\hbox{\raise0.8ex\hbox{#1}}} |
-@end tex |
- |
-@c Usage: @texlinebreak{} |
-@c A line break as per @*, but only in tex. |
-@iftex |
-@macro texlinebreak |
-@* |
-@end macro |
-@end iftex |
-@ifnottex |
-@macro texlinebreak |
-@end macro |
-@end ifnottex |
- |
-@c Usage: @maybepagebreak |
-@c Allow tex to insert a page break, if it feels the urge. |
-@c Normally blocks of @deftypefun/funx are kept together, which can lead to |
-@c some poor page break positioning if it's a big block, like the sets of |
-@c division functions etc. |
-@tex |
-\gdef\maybepagebreak{\penalty0} |
-@end tex |
-@ifnottex |
-@macro maybepagebreak |
-@end macro |
-@end ifnottex |
- |
-@c Usage: @GMPreftop{info,title} |
-@c Usage: @GMPpxreftop{info,title} |
-@c |
-@c Like @ref{} and @pxref{}, but designed for a reference to the top of a |
-@c document, not a particular section. The TeX output for plain @ref insists |
-@c on printing a particular section, GMPreftop gives just the title. |
-@c |
-@c The texinfo manual recommends putting a likely section name in references |
-@c like this, eg. "Introduction", but it seems better to just give the title. |
-@c |
-@iftex |
-@macro GMPreftop{info,title} |
-@i{\title\} |
-@end macro |
-@macro GMPpxreftop{info,title} |
-see @i{\title\} |
-@end macro |
-@end iftex |
-@c |
-@ifnottex |
-@macro GMPreftop{info,title} |
-@ref{Top,\title\,\title\,\info\,\title\} |
-@end macro |
-@macro GMPpxreftop{info,title} |
-@pxref{Top,\title\,\title\,\info\,\title\} |
-@end macro |
-@end ifnottex |
- |
- |
-@node Copying, Introduction to GMP, Top, Top |
-@comment node-name, next, previous, up |
-@unnumbered GNU MP Copying Conditions |
-@cindex Copying conditions |
-@cindex Conditions for copying GNU MP |
-@cindex License conditions |
- |
-This library is @dfn{free}; this means that everyone is free to use it and |
-free to redistribute it on a free basis. The library is not in the public |
-domain; it is copyrighted and there are restrictions on its distribution, but |
-these restrictions are designed to permit everything that a good cooperating |
-citizen would want to do. What is not allowed is to try to prevent others |
-from further sharing any version of this library that they might get from |
-you.@refill |
- |
-Specifically, we want to make sure that you have the right to give away copies |
-of the library, that you receive source code or else can get it if you want |
-it, that you can change this library or use pieces of it in new free programs, |
-and that you know you can do these things.@refill |
- |
-To make sure that everyone has such rights, we have to forbid you to deprive |
-anyone else of these rights. For example, if you distribute copies of the GNU |
-MP library, you must give the recipients all the rights that you have. You |
-must make sure that they, too, receive or can get the source code. And you |
-must tell them their rights.@refill |
- |
-Also, for our own protection, we must make certain that everyone finds out |
-that there is no warranty for the GNU MP library. If it is modified by |
-someone else and passed on, we want their recipients to know that what they |
-have is not what we distributed, so that any problems introduced by others |
-will not reflect on our reputation.@refill |
- |
-The precise conditions of the license for the GNU MP library are found in the |
-Lesser General Public License version 3 that accompanies the source code, |
-see @file{COPYING.LIB}. Certain demonstration programs are provided under the |
-terms of the plain General Public License version 3, see @file{COPYING}. |
- |
- |
-@node Introduction to GMP, Installing GMP, Copying, Top |
-@comment node-name, next, previous, up |
-@chapter Introduction to GNU MP |
-@cindex Introduction |
- |
-GNU MP is a portable library written in C for arbitrary precision arithmetic |
-on integers, rational numbers, and floating-point numbers. It aims to provide |
-the fastest possible arithmetic for all applications that need higher |
-precision than is directly supported by the basic C types. |
- |
-Many applications use just a few hundred bits of precision; but some |
-applications may need thousands or even millions of bits. GMP is designed to |
-give good performance for both, by choosing algorithms based on the sizes of |
-the operands, and by carefully keeping the overhead at a minimum. |
- |
-The speed of GMP is achieved by using fullwords as the basic arithmetic type, |
-by using sophisticated algorithms, by including carefully optimized assembly |
-code for the most common inner loops for many different CPUs, and by a general |
-emphasis on speed (as opposed to simplicity or elegance). |
- |
-There is assembly code for these CPUs: |
-@cindex CPU types |
-ARM, |
-DEC Alpha 21064, 21164, and 21264, |
-AMD 29000, |
-AMD K6, K6-2, Athlon, and Athlon64, |
-Hitachi SuperH and SH-2, |
-HPPA 1.0, 1.1 and 2.0, |
-Intel Pentium, Pentium Pro/II/III, Pentium 4, generic x86, |
-Intel IA-64, i960, |
-Motorola MC68000, MC68020, MC88100, and MC88110, |
-Motorola/IBM PowerPC 32 and 64, |
-National NS32000, |
-IBM POWER, |
-MIPS R3000, R4000, |
-SPARCv7, SuperSPARC, generic SPARCv8, UltraSPARC, |
-DEC VAX, |
-and |
-Zilog Z8000. |
-There are also some optimizations for |
-Cray vector systems, |
-Clipper, |
-IBM ROMP (RT), |
-and |
-Pyramid AP/XP. |
- |
-@cindex Home page |
-@cindex Web page |
-@noindent |
-For up-to-date information on GMP, please see the GMP web pages at |
- |
-@display |
-@uref{http://gmplib.org/} |
-@end display |
- |
-@cindex Latest version of GMP |
-@cindex Anonymous FTP of latest version |
-@cindex FTP of latest version |
-@noindent |
-The latest version of the library is available at |
- |
-@display |
-@uref{ftp://ftp.gnu.org/gnu/gmp/} |
-@end display |
- |
-Many sites around the world mirror @samp{ftp.gnu.org}; please use a mirror |
-near you. See @uref{http://www.gnu.org/order/ftp.html} for a full list. |
- |
-@cindex Mailing lists |
-There are three public mailing lists of interest: one for release |
-announcements, one for general questions and discussions about usage of the GMP |
-library, and one for bug reports. For more information, see |
- |
-@display |
-@uref{http://gmplib.org/mailman/listinfo/}. |
-@end display |
- |
-The proper place for bug reports is @email{gmp-bugs@@gmplib.org}. See |
-@ref{Reporting Bugs} for information about reporting bugs. |
- |
-@sp 1 |
-@section How to use this Manual |
-@cindex About this manual |
- |
-Everyone should read @ref{GMP Basics}. If you need to install the library |
-yourself, then read @ref{Installing GMP}. If you have a system with multiple |
-ABIs, then read @ref{ABI and ISA}, for the compiler options that must be used |
-on applications. |
- |
-The rest of the manual can be used for later reference, although it is |
-probably a good idea to glance through it. |
- |
- |
-@node Installing GMP, GMP Basics, Introduction to GMP, Top |
-@comment node-name, next, previous, up |
-@chapter Installing GMP |
-@cindex Installing GMP |
-@cindex Configuring GMP |
-@cindex Building GMP |
- |
-GMP has an autoconf/automake/libtool based configuration system. On a |
-Unix-like system a basic build can be done with |
- |
-@example |
-./configure |
-make |
-@end example |
- |
-@noindent |
-Some self-tests can be run with |
- |
-@example |
-make check |
-@end example |
- |
-@noindent |
-And you can install (under @file{/usr/local} by default) with |
- |
-@example |
-make install |
-@end example |
- |
-If you experience problems, please report them to @email{gmp-bugs@@gmplib.org}. |
-See @ref{Reporting Bugs}, for information on what to include in useful bug |
-reports. |
- |
-@menu |
-* Build Options:: |
-* ABI and ISA:: |
-* Notes for Package Builds:: |
-* Notes for Particular Systems:: |
-* Known Build Problems:: |
-* Performance optimization:: |
-@end menu |
- |
- |
-@node Build Options, ABI and ISA, Installing GMP, Installing GMP |
-@section Build Options |
-@cindex Build options |
- |
-All the usual autoconf configure options are available; run @samp{./configure |
---help} for a summary. The file @file{INSTALL.autoconf} has some generic |
-installation information too. |
- |
-@table @asis |
-@item Tools |
-@cindex Non-Unix systems |
-@samp{configure} requires various Unix-like tools. See @ref{Notes for |
-Particular Systems}, for some options on non-Unix systems. |
- |
-It might be possible to build without the help of @samp{configure} (certainly |
-all the code is there), but unfortunately you'll be on your own. |
- |
-@item Build Directory |
-@cindex Build directory |
-To compile in a separate build directory, @command{cd} to that directory, and |
-prefix the configure command with the path to the GMP source directory. For |
-example |
- |
-@example |
-cd /my/build/dir |
-/my/sources/gmp-@value{VERSION}/configure |
-@end example |
- |
-Not all @samp{make} programs have the necessary features (@code{VPATH}) to |
-support this. In particular, SunOS and Solaris @command{make} have bugs that |
-make them unable to build in a separate directory. Use GNU @command{make} |
-instead. |
- |
-@item @option{--prefix} and @option{--exec-prefix} |
-@cindex Prefix |
-@cindex Exec prefix |
-@cindex Install prefix |
-@cindex @code{--prefix} |
-@cindex @code{--exec-prefix} |
-The @option{--prefix} option can be used in the normal way to direct GMP to |
-install under a particular tree. The default is @samp{/usr/local}. |
- |
-@option{--exec-prefix} can be used to direct architecture-dependent files like |
-@file{libgmp.a} to a different location. This can be used to share |
-architecture-independent parts like the documentation, but separate the |
-dependent parts. Note however that @file{gmp.h} and @file{mp.h} are |
-architecture-dependent since they encode certain aspects of @file{libgmp}, so |
-it will be necessary to ensure both @file{$prefix/include} and |
-@file{$exec_prefix/include} are available to the compiler. |
- |
-@item @option{--disable-shared}, @option{--disable-static} |
-@cindex @code{--disable-shared} |
-@cindex @code{--disable-static} |
-By default both shared and static libraries are built (where possible), but |
-one or the other can be disabled. Shared libraries result in smaller executables |
-and permit code sharing between separate running processes, but on some CPUs |
-are slightly slower, having a small cost on each function call. |
- |
-@item Native Compilation, @option{--build=CPU-VENDOR-OS} |
-@cindex Native compilation |
-@cindex Build system |
-@cindex @code{--build} |
-For normal native compilation, the system can be specified with |
-@samp{--build}. By default @samp{./configure} uses the output from running |
-@samp{./config.guess}. On some systems @samp{./config.guess} can determine |
-the exact CPU type, on others it will be necessary to give it explicitly. For |
-example, |
- |
-@example |
-./configure --build=ultrasparc-sun-solaris2.7 |
-@end example |
- |
-In all cases the @samp{OS} part is important, since it controls how libtool |
-generates shared libraries. Running @samp{./config.guess} is the simplest way |
-to see what it should be, if you don't know already. |
- |
-@item Cross Compilation, @option{--host=CPU-VENDOR-OS} |
-@cindex Cross compiling |
-@cindex Host system |
-@cindex @code{--host} |
-When cross-compiling, the system used for compiling is given by @samp{--build} |
-and the system where the library will run is given by @samp{--host}. For |
-example when using a FreeBSD Athlon system to build GNU/Linux m68k binaries, |
- |
-@example |
-./configure --build=athlon-pc-freebsd3.5 --host=m68k-mac-linux-gnu |
-@end example |
- |
-Compiler tools are sought first with the host system type as a prefix. For |
-example @command{m68k-mac-linux-gnu-ranlib} is tried, then plain |
-@command{ranlib}. This makes it possible for a set of cross-compiling tools |
-to co-exist with native tools. The prefix is the argument to @samp{--host}, |
-and this can be an alias, such as @samp{m68k-linux}. But note that tools |
-don't have to be set up this way; it's enough to just have a @env{PATH} with a |
-suitable cross-compiling @command{cc} etc. |
- |
-Compiling for a different CPU in the same family as the build system is a form |
-of cross-compilation, though very possibly this would merely be special |
-options on a native compiler. In any case @samp{./configure} avoids depending |
-on being able to run code on the build system, which is important when |
-creating binaries for a newer CPU since they very possibly won't run on the |
-build system. |
- |
-In all cases the compiler must be able to produce an executable (of whatever |
-format) from a standard C @code{main}. Although only object files will go to |
-make up @file{libgmp}, @samp{./configure} uses linking tests for various |
-purposes, such as determining what functions are available on the host system. |
- |
-Currently a warning is given unless an explicit @samp{--build} is used when |
-cross-compiling, because it may not be possible to correctly guess the build |
-system type if the @env{PATH} has only a cross-compiling @command{cc}. |
- |
-Note that the @samp{--target} option is not appropriate for GMP@. It's for use |
-when building compiler tools, with @samp{--host} being where they will run, |
-and @samp{--target} what they'll produce code for. Ordinary programs or |
-libraries like GMP are only interested in the @samp{--host} part, being where |
-they'll run. (Some past versions of GMP used @samp{--target} incorrectly.) |
- |
-@item CPU types |
-@cindex CPU types |
-In general, if you want a library that runs as fast as possible, you should |
-configure GMP for the exact CPU type your system uses. However, this may mean |
-the binaries won't run on older members of the family, and might run slower on |
-other members, older or newer. The best idea is always to build GMP for the |
-exact machine type you intend to run it on. |
- |
-The following CPUs have specific support. See @file{configure.in} for details |
-of what code and compiler options they select. |
- |
-@itemize @bullet |
- |
-@c Keep this formatting, it's easy to read and it can be grepped to |
-@c automatically test that CPUs listed get through ./config.sub |
- |
-@item |
-Alpha: |
-@nisamp{alpha}, |
-@nisamp{alphaev5}, |
-@nisamp{alphaev56}, |
-@nisamp{alphapca56}, |
-@nisamp{alphapca57}, |
-@nisamp{alphaev6}, |
-@nisamp{alphaev67}, |
-@nisamp{alphaev68}, |
-@nisamp{alphaev7} |
- |
-@item |
-Cray: |
-@nisamp{c90}, |
-@nisamp{j90}, |
-@nisamp{t90}, |
-@nisamp{sv1} |
- |
-@item |
-HPPA: |
-@nisamp{hppa1.0}, |
-@nisamp{hppa1.1}, |
-@nisamp{hppa2.0}, |
-@nisamp{hppa2.0n}, |
-@nisamp{hppa2.0w}, |
-@nisamp{hppa64} |
- |
-@item |
-IA-64: |
-@nisamp{ia64}, |
-@nisamp{itanium}, |
-@nisamp{itanium2} |
- |
-@item |
-MIPS: |
-@nisamp{mips}, |
-@nisamp{mips3}, |
-@nisamp{mips64} |
- |
-@item |
-Motorola: |
-@nisamp{m68k}, |
-@nisamp{m68000}, |
-@nisamp{m68010}, |
-@nisamp{m68020}, |
-@nisamp{m68030}, |
-@nisamp{m68040}, |
-@nisamp{m68060}, |
-@nisamp{m68302}, |
-@nisamp{m68360}, |
-@nisamp{m88k}, |
-@nisamp{m88110} |
- |
-@item |
-POWER: |
-@nisamp{power}, |
-@nisamp{power1}, |
-@nisamp{power2}, |
-@nisamp{power2sc} |
- |
-@item |
-PowerPC: |
-@nisamp{powerpc}, |
-@nisamp{powerpc64}, |
-@nisamp{powerpc401}, |
-@nisamp{powerpc403}, |
-@nisamp{powerpc405}, |
-@nisamp{powerpc505}, |
-@nisamp{powerpc601}, |
-@nisamp{powerpc602}, |
-@nisamp{powerpc603}, |
-@nisamp{powerpc603e}, |
-@nisamp{powerpc604}, |
-@nisamp{powerpc604e}, |
-@nisamp{powerpc620}, |
-@nisamp{powerpc630}, |
-@nisamp{powerpc740}, |
-@nisamp{powerpc7400}, |
-@nisamp{powerpc7450}, |
-@nisamp{powerpc750}, |
-@nisamp{powerpc801}, |
-@nisamp{powerpc821}, |
-@nisamp{powerpc823}, |
-@nisamp{powerpc860}, |
-@nisamp{powerpc970} |
- |
-@item |
-SPARC: |
-@nisamp{sparc}, |
-@nisamp{sparcv8}, |
-@nisamp{microsparc}, |
-@nisamp{supersparc}, |
-@nisamp{sparcv9}, |
-@nisamp{ultrasparc}, |
-@nisamp{ultrasparc2}, |
-@nisamp{ultrasparc2i}, |
-@nisamp{ultrasparc3}, |
-@nisamp{sparc64} |
- |
-@item |
-x86 family: |
-@nisamp{i386}, |
-@nisamp{i486}, |
-@nisamp{i586}, |
-@nisamp{pentium}, |
-@nisamp{pentiummmx}, |
-@nisamp{pentiumpro}, |
-@nisamp{pentium2}, |
-@nisamp{pentium3}, |
-@nisamp{pentium4}, |
-@nisamp{k6}, |
-@nisamp{k62}, |
-@nisamp{k63}, |
-@nisamp{athlon}, |
-@nisamp{amd64}, |
-@nisamp{viac3}, |
-@nisamp{viac32} |
- |
-@item |
-Other: |
-@nisamp{a29k}, |
-@nisamp{arm}, |
-@nisamp{clipper}, |
-@nisamp{i960}, |
-@nisamp{ns32k}, |
-@nisamp{pyramid}, |
-@nisamp{sh}, |
-@nisamp{sh2}, |
-@nisamp{vax}, |
-@nisamp{z8k} |
-@end itemize |
- |
-CPUs not listed will use generic C code. |
- |
-@item Generic C Build |
-@cindex Generic C |
-If some of the assembly code causes problems, or if otherwise desired, the |
-generic C code can be selected with CPU @samp{none}. For example, |
- |
-@example |
-./configure --host=none-unknown-freebsd3.5 |
-@end example |
- |
-Note that this will run quite slowly, but it should be portable and should at |
-least make it possible to get something running if all else fails. |
- |
-@item Fat binary, @option{--enable-fat} |
-@cindex Fat binary |
-@cindex @option{--enable-fat} |
-Using @option{--enable-fat} selects a ``fat binary'' build on x86, where |
-optimized low level subroutines are chosen at runtime according to the CPU |
-detected. This means more code, but gives good performance on all x86 chips. |
-(This option might become available for more architectures in the future.) |
- |
-@item @option{ABI} |
-@cindex ABI |
-On some systems GMP supports multiple ABIs (application binary interfaces), |
-meaning data type sizes and calling conventions. By default GMP chooses the |
-best ABI available, but a particular ABI can be selected. For example |
- |
-@example |
-./configure --host=mips64-sgi-irix6 ABI=n32 |
-@end example |
- |
-See @ref{ABI and ISA}, for the available choices on relevant CPUs, and what |
-applications need to do. |
- |
-@item @option{CC}, @option{CFLAGS} |
-@cindex C compiler |
-@cindex @code{CC} |
-@cindex @code{CFLAGS} |
-By default the C compiler used is chosen from among some likely candidates, |
-with @command{gcc} normally preferred if it's present. The usual |
-@samp{CC=whatever} can be passed to @samp{./configure} to choose something |
-different. |
- |
-For various systems, default compiler flags are set based on the CPU and |
-compiler. The usual @samp{CFLAGS="-whatever"} can be passed to |
-@samp{./configure} to use something different or to set good flags for systems |
-GMP doesn't otherwise know. |
- |
-The @samp{CC} and @samp{CFLAGS} used are printed during @samp{./configure}, |
-and can be found in each generated @file{Makefile}. This is the easiest way |
-to check the defaults when considering changing or adding something. |
- |
-Note that when @samp{CC} and @samp{CFLAGS} are specified on a system |
-supporting multiple ABIs it's important to give an explicit |
-@samp{ABI=whatever}, since GMP can't determine the ABI just from the flags and |
-won't be able to select the correct assembly code. |
- |
-If just @samp{CC} is selected then normal default @samp{CFLAGS} for that |
-compiler will be used (if GMP recognises it). For example @samp{CC=gcc} can |
-be used to force the use of GCC, with default flags (and default ABI). |
- |
-@item @option{CPPFLAGS} |
-@cindex @code{CPPFLAGS} |
-Any flags like @samp{-D} defines or @samp{-I} includes required by the |
-preprocessor should be set in @samp{CPPFLAGS} rather than @samp{CFLAGS}. |
-Compiling is done with both @samp{CPPFLAGS} and @samp{CFLAGS}, but |
-preprocessing uses just @samp{CPPFLAGS}. This distinction is because most |
-preprocessors won't accept all the flags the compiler does. Preprocessing is |
-done separately in some configure tests, and in the @samp{ansi2knr} support |
-for K&R compilers. |
- |
-@item @option{CC_FOR_BUILD} |
-@cindex @code{CC_FOR_BUILD} |
-Some build-time programs are compiled and run to generate host-specific data |
-tables. @samp{CC_FOR_BUILD} is the compiler used for this. It doesn't need |
-to be in any particular ABI or mode, it merely needs to generate executables |
-that can run. The default is to try the selected @samp{CC} and some likely |
-candidates such as @samp{cc} and @samp{gcc}, looking for something that works. |
- |
-No flags are used with @samp{CC_FOR_BUILD} because a simple invocation like |
-@samp{cc foo.c} should be enough. If some particular options are required |
-they can be included as for instance @samp{CC_FOR_BUILD="cc -whatever"}. |
- |
-@item C++ Support, @option{--enable-cxx} |
-@cindex C++ support |
-@cindex @code{--enable-cxx} |
-C++ support in GMP can be enabled with @samp{--enable-cxx}, in which case a |
-C++ compiler will be required. As a convenience @samp{--enable-cxx=detect} |
-can be used to enable C++ support only if a compiler can be found. The C++ |
-support consists of a library @file{libgmpxx.la} and header file |
-@file{gmpxx.h} (@pxref{Headers and Libraries}). |
- |
-A separate @file{libgmpxx.la} has been adopted rather than having C++ objects |
-within @file{libgmp.la} in order to ensure dynamic linked C programs aren't |
-bloated by a dependency on the C++ standard library, and to avoid any chance |
-that the C++ compiler could be required when linking plain C programs. |
- |
-@file{libgmpxx.la} will use certain internals from @file{libgmp.la} and can |
-only be expected to work with @file{libgmp.la} from the same GMP version. |
-Future changes to the relevant internals will be accompanied by renaming, so a |
-mismatch will cause unresolved symbols rather than perhaps mysterious |
-misbehaviour. |
- |
-In general @file{libgmpxx.la} will be usable only with the C++ compiler that |
-built it, since name mangling and runtime support are usually incompatible |
-between different compilers. |
- |
-@item @option{CXX}, @option{CXXFLAGS} |
-@cindex C++ compiler |
-@cindex @code{CXX} |
-@cindex @code{CXXFLAGS} |
-When C++ support is enabled, the C++ compiler and its flags can be set with |
-variables @samp{CXX} and @samp{CXXFLAGS} in the usual way. The default for |
-@samp{CXX} is the first compiler that works from a list of likely candidates, |
-with @command{g++} normally preferred when available. The default for |
-@samp{CXXFLAGS} is to try @samp{CFLAGS}, @samp{CFLAGS} without @samp{-g}, then |
-for @command{g++} either @samp{-g -O2} or @samp{-O2}, or for other compilers |
-@samp{-g} or nothing. Trying @samp{CFLAGS} this way is convenient when using |
-@samp{gcc} and @samp{g++} together, since the flags for @samp{gcc} will |
-usually suit @samp{g++}. |
- |
-It's important that the C and C++ compilers match, meaning their startup and |
-runtime support routines are compatible and that they generate code in the |
-same ABI (if there's a choice of ABIs on the system). @samp{./configure} |
-isn't currently able to check these things very well itself, so for that |
-reason @samp{--disable-cxx} is the default, to avoid a build failure due to a |
-compiler mismatch. Perhaps this will change in the future. |
- |
-Incidentally, it's normally not good enough to set @samp{CXX} to the same as |
-@samp{CC}. Although @command{gcc} for instance recognises @file{foo.cc} as |
-C++ code, only @command{g++} will invoke the linker the right way when |
-building an executable or shared library from C++ object files. |
- |
-@item Temporary Memory, @option{--enable-alloca=<choice>} |
-@cindex Temporary memory |
-@cindex Stack overflow |
-@cindex @code{alloca} |
-@cindex @code{--enable-alloca} |
-GMP allocates temporary workspace using one of the following three methods, |
-which can be selected with for instance |
-@samp{--enable-alloca=malloc-reentrant}. |
- |
-@itemize @bullet |
-@item |
-@samp{alloca} - C library or compiler builtin. |
-@item |
-@samp{malloc-reentrant} - the heap, in a re-entrant fashion. |
-@item |
-@samp{malloc-notreentrant} - the heap, with global variables. |
-@end itemize |
- |
-For convenience, the following choices are also available. |
-@samp{--disable-alloca} is the same as @samp{no}. |
- |
-@itemize @bullet |
-@item |
-@samp{yes} - a synonym for @samp{alloca}. |
-@item |
-@samp{no} - a synonym for @samp{malloc-reentrant}. |
-@item |
-@samp{reentrant} - @code{alloca} if available, otherwise |
-@samp{malloc-reentrant}. This is the default. |
-@item |
-@samp{notreentrant} - @code{alloca} if available, otherwise |
-@samp{malloc-notreentrant}. |
-@end itemize |
- |
-@code{alloca} is reentrant and fast, and is recommended. It actually allocates |
-just small blocks on the stack; larger ones use malloc-reentrant. |
- |
-@samp{malloc-reentrant} is, as the name suggests, reentrant and thread safe, |
-but @samp{malloc-notreentrant} is faster and should be used if reentrancy is |
-not required. |
- |
-The two malloc methods in fact use the memory allocation functions selected by |
-@code{mp_set_memory_functions}, these being @code{malloc} and friends by |
-default. @xref{Custom Allocation}. |
- |
-An additional choice @samp{--enable-alloca=debug} is available, to help when |
-debugging memory related problems (@pxref{Debugging}). |
- |
-@item FFT Multiplication, @option{--disable-fft} |
-@cindex FFT multiplication |
-@cindex @code{--disable-fft} |
-By default multiplications are done using Karatsuba, 3-way Toom, and |
-Fermat FFT@. The FFT is only used on large to very large operands and can be |
-disabled to save code size if desired. |
- |
-@item Berkeley MP, @option{--enable-mpbsd} |
-@cindex Berkeley MP compatible functions |
-@cindex BSD MP compatible functions |
-@cindex @code{--enable-mpbsd} |
-The Berkeley MP compatibility library (@file{libmp}) and header file |
-(@file{mp.h}) are built and installed only if @option{--enable-mpbsd} is used. |
-@xref{BSD Compatible Functions}. |
- |
-@item Assertion Checking, @option{--enable-assert} |
-@cindex Assertion checking |
-@cindex @code{--enable-assert} |
-This option enables some consistency checking within the library. This can be |
-of use while debugging, @pxref{Debugging}. |
- |
-@item Execution Profiling, @option{--enable-profiling=prof/gprof/instrument} |
-@cindex Execution profiling |
-@cindex @code{--enable-profiling} |
-Enable profiling support, in one of various styles, @pxref{Profiling}. |
- |
-@item @option{MPN_PATH} |
-@cindex @code{MPN_PATH} |
-Various assembly versions of each mpn subroutine are provided. For a given |
-CPU, a search is made through a path to choose a version of each. For example |
-@samp{sparcv8} has |
- |
-@example |
-MPN_PATH="sparc32/v8 sparc32 generic" |
-@end example |
- |
-which means look first for v8 code, then plain sparc32 (which is v7), and |
-finally fall back on generic C@. Knowledgeable users with special requirements |
-can specify a different path. Normally this is completely unnecessary. |
- |
-@item Documentation |
-@cindex Documentation formats |
-@cindex Texinfo |
-The source for the document you're now reading is @file{doc/gmp.texi}, in |
-Texinfo format, see @GMPreftop{texinfo, Texinfo}. |
- |
-@cindex Postscript |
-@cindex DVI |
-@cindex PDF |
-Info format @samp{doc/gmp.info} is included in the distribution. The usual |
-automake targets are available to make PostScript, DVI, PDF and HTML (these |
-will require various @TeX{} and Texinfo tools). |
- |
-@cindex DocBook |
-@cindex XML |
-DocBook and XML can be generated by the Texinfo @command{makeinfo} program |
-too, see @ref{makeinfo options,, Options for @command{makeinfo}, texinfo, |
-Texinfo}. |
- |
-Some supplementary notes can also be found in the @file{doc} subdirectory. |
- |
-@end table |
- |
- |
-@need 2000 |
-@node ABI and ISA, Notes for Package Builds, Build Options, Installing GMP |
-@section ABI and ISA |
-@cindex ABI |
-@cindex Application Binary Interface |
-@cindex ISA |
-@cindex Instruction Set Architecture |
- |
-ABI (Application Binary Interface) refers to the calling conventions between |
-functions, meaning what registers are used and what sizes the various C data |
-types are. ISA (Instruction Set Architecture) refers to the instructions and |
-registers a CPU has available. |
- |
-Some 64-bit ISA CPUs have both a 64-bit ABI and a 32-bit ABI defined, the |
-latter for compatibility with older CPUs in the family. GMP supports some |
-CPUs like this in both ABIs. In fact within GMP @samp{ABI} means a |
-combination of chip ABI, plus how GMP chooses to use it. For example in some |
-32-bit ABIs, GMP may support a limb as either a 32-bit @code{long} or a 64-bit |
-@code{long long}. |
- |
-By default GMP chooses the best ABI available for a given system, and this |
-generally gives significantly greater speed. But an ABI can be chosen |
-explicitly to make GMP compatible with other libraries, or particular |
-application requirements. For example, |
- |
-@example |
-./configure ABI=32 |
-@end example |
- |
-In all cases it's vital that all object code used in a given program is |
-compiled for the same ABI. |
- |
-Usually a limb is implemented as a @code{long}. When a @code{long long} limb |
-is used this is encoded in the generated @file{gmp.h}. This is convenient for |
-applications, but it does mean that @file{gmp.h} will vary, and can't be just |
-copied around. @file{gmp.h} remains compiler independent though, since all |
-compilers for a particular ABI will be expected to use the same limb type. |
- |
-Currently no attempt is made to follow whatever conventions a system has for |
-installing library or header files built for a particular ABI@. This will |
-probably only matter when installing multiple builds of GMP, and it might be |
-as simple as configuring with a special @samp{libdir}, or it might require |
-more than that. Note that builds for different ABIs need to be done separately, |
-with a fresh @command{./configure} and @command{make} each. |
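- |
-As a rough sketch of such separate builds, the following shell function only |
-prints the commands for one complete build per ABI. The ABI names, the |
-@samp{/opt/gmp} prefix and the per-ABI @samp{libdir} layout are assumptions |
-made for illustration, not conventions that GMP itself follows. |
- |
```shell
# Print the commands for one complete GMP build per ABI.
# Assumptions (not taken from the manual): the ABI names passed in suit
# the host, and the "lib<ABI>" directory layout is only an illustration.
gmp_abi_build_commands() {
  prefix=$1
  shift
  for abi in "$@"; do
    echo "./configure ABI=$abi --prefix=$prefix --libdir=$prefix/lib$abi"
    echo "make"
    echo "make check"
    echo "make install"
    # Return to a clean tree before configuring the next ABI.
    echo "make distclean"
  done
}

# Review the printed commands first; to run them, pipe the output to
# "sh -e" from the top of the GMP source tree.
gmp_abi_build_commands /opt/gmp 64 32
```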
- |
-@sp 1 |
-@table @asis |
-@need 1000 |
-@item AMD64 (@samp{x86_64}) |
-@cindex AMD64 |
-On AMD64 systems supporting both 32-bit and 64-bit modes for applications, the |
-following ABI choices are available. |
- |
-@table @asis |
-@item @samp{ABI=64} |
-The 64-bit ABI uses 64-bit limbs and pointers and makes full use of the chip |
-architecture. This is the default. Applications will usually not need |
-special compiler flags, but for reference the option is |
- |
-@example |
-gcc -m64 |
-@end example |
- |
-@item @samp{ABI=32} |
-The 32-bit ABI is the usual i386 conventions. This will be slower, and is not |
-recommended except for inter-operating with other code not yet 64-bit capable. |
-Applications must be compiled with |
- |
-@example |
-gcc -m32 |
-@end example |
- |
-(In GCC 2.95 and earlier there's no @samp{-m32} option, it's the only mode.) |
-@end table |
- |
-@sp 1 |
-@need 1000 |
-@item HPPA 2.0 (@samp{hppa2.0*}, @samp{hppa64}) |
-@cindex HPPA |
-@cindex HP-UX |
-@table @asis |
-@item @samp{ABI=2.0w} |
-The 2.0w ABI uses 64-bit limbs and pointers and is available on HP-UX 11 or |
-up. Applications must be compiled with |
- |
-@example |
-gcc [built for 2.0w] |
-cc +DD64 |
-@end example |
- |
-@item @samp{ABI=2.0n} |
-The 2.0n ABI means the 32-bit HPPA 1.0 ABI and all its normal calling |
-conventions, but with 64-bit instructions permitted within functions. GMP |
-uses a 64-bit @code{long long} for a limb. This ABI is available on hppa64 |
-GNU/Linux and on HP-UX 10 or higher. Applications must be compiled with |
- |
-@example |
-gcc [built for 2.0n] |
-cc +DA2.0 +e |
-@end example |
- |
-Note that current versions of GCC (eg.@: 3.2) don't generate 64-bit |
-instructions for @code{long long} operations and so may be slower than for |
-2.0w. (The GMP assembly code is the same though.) |
- |
-@item @samp{ABI=1.0} |
-HPPA 2.0 CPUs can run all HPPA 1.0 and 1.1 code in the 32-bit HPPA 1.0 ABI@. |
-No special compiler options are needed for applications. |
-@end table |
- |
-All three ABIs are available for CPU types @samp{hppa2.0w}, @samp{hppa2.0} and |
-@samp{hppa64}, but for CPU type @samp{hppa2.0n} only 2.0n or 1.0 are |
-considered. |
- |
-Note that GCC on HP-UX has no options to choose between 2.0n and 2.0w modes, |
-unlike HP @command{cc}. Instead it must be built for one or the other ABI@. |
-GMP will detect how it was built, and skip to the corresponding @samp{ABI}. |
- |
-@sp 1 |
-@need 1500 |
-@item IA-64 under HP-UX (@samp{ia64*-*-hpux*}, @samp{itanium*-*-hpux*}) |
-@cindex IA-64 |
-@cindex HP-UX |
-HP-UX supports two ABIs for IA-64. GMP performance is the same in both. |
- |
-@table @asis |
-@item @samp{ABI=32} |
-In the 32-bit ABI, pointers, @code{int}s and @code{long}s are 32 bits and GMP |
-uses a 64 bit @code{long long} for a limb. Applications can be compiled |
-without any special flags since this ABI is the default in both HP C and GCC, |
-but for reference the flags are |
- |
-@example |
-gcc -milp32 |
-cc +DD32 |
-@end example |
- |
-@item @samp{ABI=64} |
-In the 64-bit ABI, @code{long}s and pointers are 64 bits and GMP uses a |
-@code{long} for a limb. Applications must be compiled with |
- |
-@example |
-gcc -mlp64 |
-cc +DD64 |
-@end example |
-@end table |
- |
-On other IA-64 systems, GNU/Linux for instance, @samp{ABI=64} is the only |
-choice. |
- |
-@sp 1 |
-@need 1000 |
-@item MIPS under IRIX 6 (@samp{mips*-*-irix[6789]}) |
-@cindex MIPS |
-@cindex IRIX |
-IRIX 6 always has a 64-bit MIPS 3 or better CPU, and supports ABIs o32, n32, |
-and 64. n32 or 64 are recommended, and GMP performance will be the same in |
-each. The default is n32. |
- |
-@table @asis |
-@item @samp{ABI=o32} |
-The o32 ABI is 32-bit pointers and integers, and no 64-bit operations. GMP |
-will be slower than in n32 or 64; this option only exists to support old |
-compilers, eg.@: GCC 2.7.2. Applications can be compiled with no special |
-flags on an old compiler, or on a newer compiler with |
- |
-@example |
-gcc -mabi=32 |
-cc -32 |
-@end example |
- |
-@item @samp{ABI=n32} |
-The n32 ABI is 32-bit pointers and integers, but with a 64-bit limb using a |
-@code{long long}. Applications must be compiled with |
- |
-@example |
-gcc -mabi=n32 |
-cc -n32 |
-@end example |
- |
-@item @samp{ABI=64} |
-The 64-bit ABI is 64-bit pointers and integers. Applications must be compiled |
-with |
- |
-@example |
-gcc -mabi=64 |
-cc -64 |
-@end example |
-@end table |
- |
-Note that MIPS GNU/Linux, as of kernel version 2.2, doesn't have the necessary |
-support for n32 or 64 and so only gets a 32-bit limb and the MIPS 2 code. |
- |
-@sp 1 |
-@need 1000 |
-@item PowerPC 64 (@samp{powerpc64}, @samp{powerpc620}, @samp{powerpc630}, @samp{powerpc970}, @samp{power4}, @samp{power5}) |
-@cindex PowerPC |
-@table @asis |
-@item @samp{ABI=aix64} |
-@cindex AIX |
-The AIX 64 ABI uses 64-bit limbs and pointers and is the default on PowerPC 64 |
-@samp{*-*-aix*} systems. Applications must be compiled with |
- |
-@example |
-gcc -maix64 |
-xlc -q64 |
-@end example |
- |
-@item @samp{ABI=mode64} |
-The @samp{mode64} ABI uses 64-bit limbs and pointers, and is the default on |
-64-bit GNU/Linux, BSD, and Mac OS X/Darwin systems. Applications must be |
-compiled with |
- |
-@example |
-gcc -m64 |
-@end example |
- |
-@item @samp{ABI=mode32} |
-@cindex AIX |
-The @samp{mode32} ABI uses a 64-bit @code{long long} limb but with the chip |
-still in 32-bit mode and using 32-bit calling conventions. This is the default |
-for systems where the true 64-bit ABIs are unavailable. No special compiler |
-options are needed for applications. |
- |
-@item @samp{ABI=32} |
-This is the basic 32-bit PowerPC ABI, with a 32-bit limb. No special compiler |
-options are needed for applications. |
-@end table |
- |
-GMP speed is greatest in @samp{aix64} and @samp{mode32}. In @samp{ABI=32} |
-only the 32-bit ISA is used and this doesn't make full use of a 64-bit chip. |
-On a suitable system we could perhaps use more of the ISA, but there are no |
-plans to do so. |
- |
-@sp 1 |
-@need 1000 |
-@item Sparc V9 (@samp{sparc64}, @samp{sparcv9}, @samp{ultrasparc*}) |
-@cindex Sparc V9 |
-@cindex Solaris |
-@cindex Sun |
-@table @asis |
-@item @samp{ABI=64} |
-The 64-bit V9 ABI is available on the various BSD sparc64 ports, recent |
-versions of Sparc64 GNU/Linux, and Solaris 2.7 and up (when the kernel is in |
-64-bit mode). GCC 3.2 or higher, or Sun @command{cc} is required. On |
-GNU/Linux, depending on the default @command{gcc} mode, applications must be |
-compiled with |
- |
-@example |
-gcc -m64 |
-@end example |
- |
-On Solaris applications must be compiled with |
- |
-@example |
-gcc -m64 -mptr64 -Wa,-xarch=v9 -mcpu=v9 |
-cc -xarch=v9 |
-@end example |
- |
-On the BSD sparc64 systems no special options are required, since 64-bits is |
-the only ABI available. |
- |
-@item @samp{ABI=32} |
-For the basic 32-bit ABI, GMP still uses as much of the V9 ISA as it can. In |
-the Sun documentation this combination is known as ``v8plus''. On GNU/Linux, |
-depending on the default @command{gcc} mode, applications may need to be |
-compiled with |
- |
-@example |
-gcc -m32 |
-@end example |
- |
-On Solaris, no special compiler options are required for applications, though |
-using something like the following is recommended. (@command{gcc} 2.8 and |
-earlier only support @samp{-mv8} though.) |
- |
-@example |
-gcc -mv8plus |
-cc -xarch=v8plus |
-@end example |
-@end table |
- |
-GMP speed is greatest in @samp{ABI=64}, so it's the default where available. |
-The speed is partly because there are extra registers available and partly |
-because 64-bits is considered the more important case and has therefore had |
-better code written for it. |
- |
-Don't be confused by the names of the @samp{-m} and @samp{-x} compiler |
-options, they're called @samp{arch} but effectively control both ABI and ISA@. |
- |
-On Solaris 2.6 and earlier, only @samp{ABI=32} is available since the kernel |
-doesn't save all registers. |
- |
-On Solaris 2.7 with the kernel in 32-bit mode, a normal native build will |
-reject @samp{ABI=64} because the resulting executables won't run. |
-@samp{ABI=64} can still be built if desired by making it look like a |
-cross-compile, for example |
- |
-@example |
-./configure --build=none --host=sparcv9-sun-solaris2.7 ABI=64 |
-@end example |
-@end table |
- |
- |
-@need 2000 |
-@node Notes for Package Builds, Notes for Particular Systems, ABI and ISA, Installing GMP |
-@section Notes for Package Builds |
-@cindex Build notes for binary packaging |
-@cindex Packaged builds |
- |
-GMP should present no great difficulties for packaging in a binary |
-distribution. |
- |
-@cindex Libtool versioning |
-@cindex Shared library versioning |
-Libtool is used to build the library and @samp{-version-info} is set |
-appropriately, having started from @samp{3:0:0} in GMP 3.0 (@pxref{Versioning, |
-Library interface versions, Library interface versions, libtool, GNU |
-Libtool}). |
- |
-The GMP 4 series will be upwardly binary compatible in each release and will |
-be upwardly binary compatible with all of the GMP 3 series. Additional |
-function interfaces may be added in each release, so on systems where libtool |
-versioning is not fully checked by the loader an auxiliary mechanism may be |
-needed to express that a dynamic linked application depends on a new enough |
-GMP. |
- |
-An auxiliary mechanism may also be needed to express that @file{libgmpxx.la} |
-(from @option{--enable-cxx}, @pxref{Build Options}) requires @file{libgmp.la} |
-from the same GMP version, since this is not done by the libtool versioning, |
-nor otherwise. A mismatch will result in unresolved symbols from the linker, |
-or perhaps the loader. |
- |
-When building a package for a CPU family, care should be taken to use |
-@samp{--host} (or @samp{--build}) to choose the least common denominator among |
-the CPUs which might use the package. For example this might mean plain |
-@samp{sparc} (meaning V7) for SPARCs. |
- |
-For x86s, @option{--enable-fat} sets things up for a fat binary build, making a |
-runtime selection of optimized low level routines. This is a good choice for |
-packaging to run on a range of x86 chips. |
- |
-Users who care about speed will want GMP built for their exact CPU type, to |
-make best use of the available optimizations. Providing a way to suitably |
-rebuild a package may be useful. This could be as simple as making it |
-possible for a user to omit @samp{--build} (and @samp{--host}) so |
-@samp{./config.guess} will detect the CPU@. But a way to manually specify a |
-@samp{--build} will be wanted for systems where @samp{./config.guess} is |
-inexact. |
- |
-On systems with multiple ABIs, a packaged build will need to decide which |
-among the choices is to be provided, see @ref{ABI and ISA}. A given run of |
-@samp{./configure} etc will only build one ABI@. If a second ABI is also |
-required then a second run of @samp{./configure} etc must be made, starting |
-from a clean directory tree (@samp{make distclean}). |
- |
-As noted under ``ABI and ISA'', currently no attempt is made to follow system |
-conventions for install locations that vary with ABI, such as |
-@file{/usr/lib/sparcv9} for @samp{ABI=64} as opposed to @file{/usr/lib} for |
-@samp{ABI=32}. A package build can override @samp{libdir} and other standard |
-variables as necessary. |
- |
-Note that @file{gmp.h} is a generated file, and will be architecture and ABI |
-dependent. When attempting to install two ABIs simultaneously it will be |
-important that an application compile gets the correct @file{gmp.h} for its |
-desired ABI@. If compiler include paths don't vary with ABI options then it |
-might be necessary to create a @file{/usr/include/gmp.h} which tests |
-preprocessor symbols and chooses the correct actual @file{gmp.h}. |
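- |
-One possible shape for such a wrapper is sketched below. The header names |
-@file{gmp-32.h} and @file{gmp-64.h} are hypothetical, and the test on |
-@code{__LP64__} assumes a compiler that defines that symbol for the 64-bit |
-ABI (GCC does); other compilers may need a different preprocessor symbol. |
- |
```shell
# Write a dispatching gmp.h into the directory given as $1.
# Assumptions: the real per-ABI headers were installed under the
# hypothetical names gmp-32.h and gmp-64.h, and the compilers in use
# define __LP64__ when targeting the 64-bit ABI.
write_gmp_wrapper() {
  incdir=$1
  cat > "$incdir/gmp.h" <<'EOF'
/* Wrapper: choose the gmp.h generated for the ABI being compiled. */
#if defined (__LP64__)
#include "gmp-64.h"
#else
#include "gmp-32.h"
#endif
EOF
}

# Example: write the wrapper into a scratch directory and display it.
scratch=$(mktemp -d)
write_gmp_wrapper "$scratch"
cat "$scratch/gmp.h"
```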
- |
- |
-@need 2000 |
-@node Notes for Particular Systems, Known Build Problems, Notes for Package Builds, Installing GMP |
-@section Notes for Particular Systems |
-@cindex Build notes for particular systems |
-@cindex Particular systems |
-@cindex Systems |
-@table @asis |
- |
-@c This section is more or less meant for notes about performance or about |
-@c build problems that have been worked around but might leave a user |
-@c scratching their head. Fun with different ABIs on a system belongs in the |
-@c above section. |
- |
-@item AIX 3 and 4 |
-@cindex AIX |
-On systems @samp{*-*-aix[34]*} shared libraries are disabled by default, since |
-some versions of the native @command{ar} fail on the convenience libraries |
-used. A shared build can be attempted with |
- |
-@example |
-./configure --enable-shared --disable-static |
-@end example |
- |
-Note that the @samp{--disable-static} is necessary because in a shared build |
-libtool makes @file{libgmp.a} a symlink to @file{libgmp.so}, apparently for |
-the benefit of old versions of @command{ld} which only recognise @file{.a}, |
-but unfortunately this is done even if a fully functional @command{ld} is |
-available. |
- |
-@item ARM |
-@cindex ARM |
-On systems @samp{arm*-*-*}, versions of GCC up to and including 2.95.3 have a |
-bug in unsigned division, giving wrong results for some operands. GMP |
-@samp{./configure} will demand GCC 2.95.4 or later. |
- |
-@item Compaq C++ |
-@cindex Compaq C++ |
-Compaq C++ on OSF 5.1 has two flavours of @code{iostream}, a standard one and |
-an old pre-standard one (see @samp{man iostream_intro}). GMP can only use the |
-standard one, which unfortunately is not the default but must be selected by |
-defining @code{__USE_STD_IOSTREAM}. Configure with for instance |
- |
-@example |
-./configure --enable-cxx CPPFLAGS=-D__USE_STD_IOSTREAM |
-@end example |
- |
-@item Floating Point Mode |
-@cindex Floating point mode |
-@cindex Hardware floating point mode |
-@cindex Precision of hardware floating point |
-@cindex x87 |
-On some systems, the hardware floating point has a control mode which can set |
-all operations to be done in a particular precision, for instance single, |
-double or extended on x86 systems (x87 floating point). The GMP functions |
-involving a @code{double} cannot be expected to operate to their full |
-precision when the hardware is in single precision mode. Of course this |
-affects all code, including application code, not just GMP. |
- |
-@item MacOS 9 |
-@cindex MacOS 9 |
-The @file{macos} directory contains an unsupported port to MacOS 9 on Power |
-Macintosh, see @file{macos/README}. Note that MacOS X ``Darwin'' should use |
-the normal Unix-style @samp{./configure}. |
- |
-@item MS-DOS and MS Windows |
-@cindex MS-DOS |
-@cindex MS Windows |
-@cindex Windows |
-@cindex Cygwin |
-@cindex DJGPP |
-@cindex MINGW |
-On an MS-DOS system DJGPP can be used to build GMP, and on an MS Windows |
-system Cygwin, DJGPP and MINGW can be used. All three are excellent ports of |
-GCC and the various GNU tools. |
- |
-@display |
-@uref{http://www.cygwin.com/} |
-@uref{http://www.delorie.com/djgpp/} |
-@uref{http://www.mingw.org/} |
-@end display |
- |
-@cindex Interix |
-@cindex Services for Unix |
-Microsoft also publishes an Interix ``Services for Unix'' which can be used to |
-build GMP on Windows (with a normal @samp{./configure}), but it's not free |
-software. |
- |
-@item MS Windows DLLs |
-@cindex DLLs |
-@cindex MS Windows |
-@cindex Windows |
-On systems @samp{*-*-cygwin*}, @samp{*-*-mingw*} and @samp{*-*-pw32*} by |
-default GMP builds only a static library, but a DLL can be built instead using |
- |
-@example |
-./configure --disable-static --enable-shared |
-@end example |
- |
-Static and DLL libraries can't both be built, since certain export directives |
-in @file{gmp.h} must be different. |
- |
-A MINGW DLL build of GMP can be used with Microsoft C@. Libtool doesn't |
-install a @file{.lib} format import library, but it can be created with MS |
-@command{lib} as follows, and copied to the install directory. Similarly for |
-@file{libmp} and @file{libgmpxx}. |
- |
-@example |
-cd .libs |
-lib /def:libgmp-3.dll.def /out:libgmp-3.lib |
-@end example |
- |
-MINGW uses the C runtime library @samp{msvcrt.dll} for I/O, so applications |
-wanting to use the GMP I/O routines must be compiled with @samp{cl /MD} to do |
-the same. If one of the other C runtime library choices provided by MS C is |
-desired then the suggestion is to use the GMP string functions and confine I/O |
-to the application. |
- |
-@item Motorola 68k CPU Types |
-@cindex 68000 |
-@samp{m68k} is taken to mean 68000. @samp{m68020} or higher will give a |
-performance boost on applicable CPUs. @samp{m68360} can be used for CPU32 |
-series chips. @samp{m68302} can be used for ``Dragonball'' series chips, |
-though this is merely a synonym for @samp{m68000}. |
- |
-@item OpenBSD 2.6 |
-@cindex OpenBSD |
-@command{m4} in this release of OpenBSD has a bug in @code{eval} that makes it |
-unsuitable for @file{.asm} file processing. @samp{./configure} will detect |
-the problem and either abort or choose another m4 in the @env{PATH}. The bug |
-is fixed in OpenBSD 2.7, so either upgrade or use GNU m4. |
- |
-@item Power CPU Types |
-@cindex Power/PowerPC |
-In GMP, CPU types @samp{power*} and @samp{powerpc*} will each use instructions |
-not available on the other, so it's important to choose the right one for the |
-CPU that will be used. Currently GMP has no assembly code support for using |
-just the common instruction subset. To get executables that run on both, the |
-current suggestion is to use the generic C code (CPU @samp{none}), possibly |
-with appropriate compiler options (like @samp{-mcpu=common} for |
-@command{gcc}). CPU @samp{rs6000} (which is not a CPU but a family of |
-workstations) is accepted by @file{config.sub}, but is currently equivalent to |
-@samp{none}. |
- |
-@item Sparc CPU Types |
-@cindex Sparc |
-@samp{sparcv8} or @samp{supersparc} on relevant systems will give a |
-significant performance increase over the V7 code selected by plain |
-@samp{sparc}. |
- |
-@item Sparc App Regs |
-@cindex Sparc |
-The GMP assembly code for both 32-bit and 64-bit Sparc clobbers the |
-``application registers'' @code{g2}, @code{g3} and @code{g4}, the same way |
-that the GCC default @samp{-mapp-regs} does (@pxref{SPARC Options,, SPARC |
-Options, gcc, Using the GNU Compiler Collection (GCC)}). |
- |
-This makes that code unsuitable for use with the special V9 |
-@samp{-mcmodel=embmedany} (which uses @code{g4} as a data segment pointer), |
-and for applications wanting to use those registers for special purposes. In |
-these cases the only suggestion currently is to build GMP with CPU @samp{none} |
-to avoid the assembly code. |
- |
-@item SunOS 4 |
-@cindex SunOS |
-@command{/usr/bin/m4} lacks various features needed to process @file{.asm} |
-files, and instead @samp{./configure} will automatically use |
-@command{/usr/5bin/m4}, which we believe is always available (if not then use |
-GNU m4). |
- |
-@item x86 CPU Types |
-@cindex x86 |
-@cindex 80x86 |
-@cindex i386 |
-@samp{i586}, @samp{pentium} or @samp{pentiummmx} code is good for its intended |
-P5 Pentium chips, but quite slow when run on Intel P6 class chips (PPro, P-II, |
-P-III)@. @samp{i386} is a better choice when making binaries that must run on |
-both. |
- |
-@item x86 MMX and SSE2 Code |
-@cindex MMX |
-@cindex SSE2 |
-If the CPU selected has MMX code but the assembler doesn't support it, a |
-warning is given and non-MMX code is used instead. This will be an inferior |
-build, since the MMX code that's present is there because it's faster than the |
-corresponding plain integer code. The same applies to SSE2. |
- |
-Old versions of @samp{gas} don't support MMX instructions, in particular |
-version 1.92.3 that comes with FreeBSD 2.2.8 or the more recent OpenBSD 3.1 |
-doesn't. |
- |
-Solaris 2.6 and 2.7 @command{as} generate incorrect object code for register |
-to register @code{movq} instructions, and so can't be used for MMX code. |
-Install a recent @command{gas} if MMX code is wanted on these systems. |
-@end table |
- |
- |
-@need 2000 |
-@node Known Build Problems, Performance optimization, Notes for Particular Systems, Installing GMP |
-@section Known Build Problems |
-@cindex Build problems known |
- |
-@c This section is more or less meant for known build problems that are not |
-@c otherwise worked around and require some sort of manual intervention. |
- |
-You might find more up-to-date information at @uref{http://gmplib.org/}. |
- |
-@table @asis |
-@item Compiler link options |
-The version of libtool currently in use rather aggressively strips compiler |
-options when linking a shared library. This will hopefully be relaxed in the |
-future, but for now if this is a problem the suggestion is to create a little |
-script to hide them, and for instance configure with |
- |
-@example |
-./configure CC=gcc-with-my-options |
-@end example |
- |
-@item DJGPP (@samp{*-*-msdosdjgpp*}) |
-@cindex DJGPP |
-The DJGPP port of @command{bash} 2.03 is unable to run the @samp{configure} |
-script; it exits silently, having died writing a preamble to |
-@file{config.log}. Use @command{bash} 2.04 or higher. |
- |
-@samp{make all} was found to run out of memory during the final |
-@file{libgmp.la} link on one system tested, despite having 64Mb available. |
-Running @samp{make libgmp.la} directly helped, perhaps recursing into the |
-various subdirectories uses up memory. |
- |
-@item GNU binutils @command{strip} prior to 2.12 |
-@cindex Stripped libraries |
-@cindex Binutils @command{strip} |
-@cindex GNU @command{strip} |
-@command{strip} from GNU binutils 2.11 and earlier should not be used on the |
-static libraries @file{libgmp.a} and @file{libmp.a} since it will discard all |
-but the last of multiple archive members with the same name, like the three |
-versions of @file{init.o} in @file{libgmp.a}. Binutils 2.12 or higher can be |
-used successfully. |
- |
-The shared libraries @file{libgmp.so} and @file{libmp.so} are not affected by |
-this and any version of @command{strip} can be used on them. |
- |
-@item @command{make} syntax error |
-@cindex SCO |
-@cindex IRIX |
-On certain versions of SCO OpenServer 5 and IRIX 6.5 the native @command{make} |
-is unable to handle the long dependencies list for @file{libgmp.la}. The |
-symptom is a ``syntax error'' on the following line of the top-level |
-@file{Makefile}. |
- |
-@example |
-libgmp.la: $(libgmp_la_OBJECTS) $(libgmp_la_DEPENDENCIES) |
-@end example |
- |
-Either use GNU Make, or as a workaround remove |
-@code{$(libgmp_la_DEPENDENCIES)} from that line (which will make the initial |
-build work, but if any recompiling is done @file{libgmp.la} might not be |
-rebuilt). |
- |
-@item MacOS X (@samp{*-*-darwin*}) |
-@cindex MacOS X |
-@cindex Darwin |
-Libtool currently only knows how to create shared libraries on MacOS X using |
-the native @command{cc} (which is a modified GCC), not a plain GCC@. A |
-static-only build should work though (@samp{--disable-shared}). |
- |
-@item NeXT prior to 3.3 |
-@cindex NeXT |
-The system compiler on old versions of NeXT was a massacred and old GCC, even |
-if it called itself @file{cc}. This compiler cannot be used to build GMP; you |
-need to get a real GCC and install that. (NeXT may have fixed this in |
-release 3.3 of their system.) |
- |
-@item POWER and PowerPC |
-@cindex Power/PowerPC |
-Bugs in GCC 2.7.2 (and 2.6.3) mean it can't be used to compile GMP on POWER or |
-PowerPC@. If you want to use GCC for these machines, get GCC 2.7.2.1 (or |
-later). |
- |
-@item Sequent Symmetry |
-@cindex Sequent Symmetry |
-Use the GNU assembler instead of the system assembler, since the latter has |
-serious bugs. |
- |
-@item Solaris 2.6 |
-@cindex Solaris |
-The system @command{sed} prints an error ``Output line too long'' when libtool |
-builds @file{libgmp.la}. This doesn't seem to cause any obvious ill effects, |
-but GNU @command{sed} is recommended, to avoid any doubt. |
- |
-@item Sparc Solaris 2.7 with gcc 2.95.2 in @samp{ABI=32} |
-@cindex Solaris |
-A shared library build of GMP seems to fail in this combination: it builds but |
-then fails the tests, apparently due to some incorrect data relocations within |
-@code{gmp_randinit_lc_2exp_size}. The exact cause is unknown; |
-@samp{--disable-shared} is recommended. |
-@end table |
- |
- |
-@need 2000 |
-@node Performance optimization, , Known Build Problems, Installing GMP |
-@section Performance optimization |
-@cindex Optimizing performance |
- |
-@c At some point, this should perhaps move to a separate chapter on optimizing |
-@c performance. |
- |
-For optimal performance, build GMP for the exact CPU type of the target |
-computer, see @ref{Build Options}. |
- |
-Unlike most other programs, GMP is typically not very sensitive to the choice |
-of compiler, since it uses assembly language for the most critical |
-operations. |
- |
-In particular for long-running GMP applications, and applications demanding |
-extremely large numbers, building and running the @code{tuneup} program in the |
-@file{tune} subdirectory can be important. For example, |
- |
-@example |
-cd tune |
-make tuneup |
-./tuneup |
-@end example |
- |
-will generate better contents for the @file{gmp-mparam.h} parameter file. |
- |
-To use the results, put the output in the file indicated in the |
-@samp{Parameters for ...} header. Then recompile from scratch. |
- |
-The @code{tuneup} program takes one useful parameter, @samp{-f NNN}, which |
-instructs the program how long to check FFT multiply parameters. If you're |
-going to use GMP for extremely large numbers, you may want to run @code{tuneup} |
-with a large NNN value. |
- |
- |
-@node GMP Basics, Reporting Bugs, Installing GMP, Top |
-@comment node-name, next, previous, up |
-@chapter GMP Basics |
-@cindex Basics |
- |
-@strong{Using functions, macros, data types, etc.@: not documented in this |
-manual is strongly discouraged. If you do so your application is guaranteed |
-to be incompatible with future versions of GMP.} |
- |
-@menu |
-* Headers and Libraries:: |
-* Nomenclature and Types:: |
-* Function Classes:: |
-* Variable Conventions:: |
-* Parameter Conventions:: |
-* Memory Management:: |
-* Reentrancy:: |
-* Useful Macros and Constants:: |
-* Compatibility with older versions:: |
-* Demonstration Programs:: |
-* Efficiency:: |
-* Debugging:: |
-* Profiling:: |
-* Autoconf:: |
-* Emacs:: |
-@end menu |
- |
-@node Headers and Libraries, Nomenclature and Types, GMP Basics, GMP Basics |
-@section Headers and Libraries |
-@cindex Headers |
- |
-@cindex @file{gmp.h} |
-@cindex Include files |
-@cindex @code{#include} |
-All declarations needed to use GMP are collected in the include file |
-@file{gmp.h}. It is designed to work with both C and C++ compilers. |
- |
-@example |
-#include <gmp.h> |
-@end example |
- |
-@cindex @code{stdio.h} |
-Note however that prototypes for GMP functions with @code{FILE *} parameters |
-are only provided if @code{<stdio.h>} is included too. |
- |
-@example |
-#include <stdio.h> |
-#include <gmp.h> |
-@end example |
- |
-@cindex @code{stdarg.h} |
-Likewise @code{<stdarg.h>} (or @code{<varargs.h>}) is required for prototypes |
-with @code{va_list} parameters, such as @code{gmp_vprintf}. And |
-@code{<obstack.h>} for prototypes with @code{struct obstack} parameters, such |
-as @code{gmp_obstack_printf}, when available. |
- |
-@cindex Libraries |
-@cindex Linking |
-@cindex @code{libgmp} |
-All programs using GMP must link against the @file{libgmp} library. On a |
-typical Unix-like system this can be done with @samp{-lgmp}, for example |
- |
-@example |
-gcc myprogram.c -lgmp |
-@end example |
- |
-@cindex @code{libgmpxx} |
-GMP C++ functions are in a separate @file{libgmpxx} library. This is built |
-and installed if C++ support has been enabled (@pxref{Build Options}). For |
-example, |
- |
-@example |
-g++ mycxxprog.cc -lgmpxx -lgmp |
-@end example |
- |
-@cindex Libtool |
-GMP is built using Libtool and an application can use that to link if desired, |
-@GMPpxreftop{libtool, GNU Libtool}. |
- |
-If GMP has been installed to a non-standard location then it may be necessary |
-to use @samp{-I} and @samp{-L} compiler options to point to the right |
-directories, and some sort of run-time path for a shared library. |
- |
- |
-@node Nomenclature and Types, Function Classes, Headers and Libraries, GMP Basics |
-@section Nomenclature and Types |
-@cindex Nomenclature |
-@cindex Types |
- |
-@cindex Integer |
-@tindex @code{mpz_t} |
-In this manual, @dfn{integer} usually means a multiple precision integer, as |
-defined by the GMP library. The C data type for such integers is @code{mpz_t}. |
-Here are some examples of how to declare such integers: |
- |
-@example |
-mpz_t sum; |
- |
-struct foo @{ mpz_t x, y; @}; |
- |
-mpz_t vec[20]; |
-@end example |
- |
-@cindex Rational number |
-@tindex @code{mpq_t} |
-@dfn{Rational number} means a multiple precision fraction. The C data type |
-for these fractions is @code{mpq_t}. For example: |
- |
-@example |
-mpq_t quotient; |
-@end example |
- |
-@cindex Floating-point number |
-@tindex @code{mpf_t} |
-@dfn{Floating point number} or @dfn{Float} for short, is an arbitrary precision |
-mantissa with a limited precision exponent. The C data type for such objects |
-is @code{mpf_t}. For example: |
- |
-@example |
-mpf_t fp; |
-@end example |
- |
-@tindex @code{mp_exp_t} |
-The floating point functions accept and return exponents in the C type |
-@code{mp_exp_t}. Currently this is usually a @code{long}, but on some systems |
-it's an @code{int} for efficiency. |
- |
-@cindex Limb |
-@tindex @code{mp_limb_t} |
-A @dfn{limb} means the part of a multi-precision number that fits in a single |
-machine word. (We chose this word because a limb of the human body is |
-analogous to a digit, only larger, and containing several digits.) Normally a |
-limb is 32 or 64 bits. The C data type for a limb is @code{mp_limb_t}. |
- |
-@tindex @code{mp_size_t} |
-Counts of limbs are represented in the C type @code{mp_size_t}. Currently |
-this is normally a @code{long}, but on some systems it's an @code{int} for |
-efficiency. |
- |
-@cindex Random state |
-@tindex @code{gmp_randstate_t} |
-@dfn{Random state} means an algorithm selection and current state data. The C |
-data type for such objects is @code{gmp_randstate_t}. For example: |
- |
-@example |
-gmp_randstate_t rstate; |
-@end example |
- |
-Also, in general @code{unsigned long} is used for bit counts and ranges, and |
-@code{size_t} is used for byte or character counts. |
- |
- |
-@node Function Classes, Variable Conventions, Nomenclature and Types, GMP Basics |
-@section Function Classes |
-@cindex Function classes |
- |
-There are six classes of functions in the GMP library: |
- |
-@enumerate |
-@item |
-Functions for signed integer arithmetic, with names beginning with |
-@code{mpz_}. The associated type is @code{mpz_t}. There are about 150 |
-functions in this class. (@pxref{Integer Functions}) |
- |
-@item |
-Functions for rational number arithmetic, with names beginning with |
-@code{mpq_}. The associated type is @code{mpq_t}. There are about 40 |
-functions in this class, but the integer functions can be used for arithmetic |
-on the numerator and denominator separately. (@pxref{Rational Number |
-Functions}) |
- |
-@item |
-Functions for floating-point arithmetic, with names beginning with |
-@code{mpf_}. The associated type is @code{mpf_t}. There are about 60 |
-functions in this class. (@pxref{Floating-point Functions}) |
- |
-@item |
-Functions compatible with Berkeley MP, such as @code{itom}, @code{madd}, and |
-@code{mult}. The associated type is @code{MINT}. (@pxref{BSD Compatible |
-Functions}) |
- |
-@item |
-Fast low-level functions that operate on natural numbers. These are used by |
-the functions in the preceding groups, and you can also call them directly |
-from very time-critical user programs. These functions' names begin with |
-@code{mpn_}. The associated type is array of @code{mp_limb_t}. There are |
-about 30 (hard-to-use) functions in this class. (@pxref{Low-level Functions}) |
- |
-@item |
-Miscellaneous functions. Functions for setting up custom allocation and |
-functions for generating random numbers. (@pxref{Custom Allocation}, and |
-@pxref{Random Number Functions}) |
-@end enumerate |
- |
- |
-@node Variable Conventions, Parameter Conventions, Function Classes, GMP Basics |
-@section Variable Conventions |
-@cindex Variable conventions |
-@cindex Conventions for variables |
- |
-GMP functions generally have output arguments before input arguments. This |
-notation is by analogy with the assignment operator. The BSD MP compatibility |
-functions are exceptions, having the output arguments last. |
- |
-GMP lets you use the same variable for both input and output in one call. For |
-example, the main function for integer multiplication, @code{mpz_mul}, can be |
-used to square @code{x} and put the result back in @code{x} with |
- |
-@example |
-mpz_mul (x, x, x); |
-@end example |
- |
-Before you can assign to a GMP variable, you need to initialize it by calling |
-one of the special initialization functions. When you're done with a |
-variable, you need to clear it out, using one of the functions for that |
-purpose. Which function to use depends on the type of variable. See the |
-chapters on integer functions, rational number functions, and floating-point |
-functions for details. |
- |
-A variable should only be initialized once, or at least cleared between each |
-initialization. After a variable has been initialized, it may be assigned to |
-any number of times. |
- |
-For efficiency reasons, avoid excessive initializing and clearing. In |
-general, initialize near the start of a function and clear near the end. For |
-example, |
- |
-@example |
-void |
-foo (void) |
-@{ |
- mpz_t n; |
- int i; |
- mpz_init (n); |
- for (i = 1; i < 100; i++) |
- @{ |
- mpz_mul (n, @dots{}); |
- mpz_fdiv_q (n, @dots{}); |
- @dots{} |
- @} |
- mpz_clear (n); |
-@} |
-@end example |
- |
- |
-@node Parameter Conventions, Memory Management, Variable Conventions, GMP Basics |
-@section Parameter Conventions |
-@cindex Parameter conventions |
-@cindex Conventions for parameters |
- |
-When a GMP variable is used as a function parameter, it's effectively a |
-call-by-reference, meaning if the function stores a value there it will change |
-the original in the caller. Parameters which are input-only can be designated |
-@code{const} to provoke a compiler error or warning on attempting to modify |
-them. |
- |
-When a function is going to return a GMP result, it should designate a |
-parameter that it sets, like the library functions do. More than one value |
-can be returned by having more than one output parameter, again like the |
-library functions. A @code{return} of an @code{mpz_t} etc doesn't return the |
-object, only a pointer, and this is almost certainly not what's wanted. |
- |
-Here's an example accepting an @code{mpz_t} parameter, doing a calculation, |
-and storing the result to the indicated parameter. |
- |
-@example |
-void |
-foo (mpz_t result, const mpz_t param, unsigned long n) |
-@{ |
- unsigned long i; |
- mpz_mul_ui (result, param, n); |
- for (i = 1; i < n; i++) |
- mpz_add_ui (result, result, i*7); |
-@} |
- |
-int |
-main (void) |
-@{ |
- mpz_t r, n; |
- mpz_init (r); |
- mpz_init_set_str (n, "123456", 0); |
- foo (r, n, 20L); |
- gmp_printf ("%Zd\n", r); |
- mpz_clear (r); |
- mpz_clear (n); |
- return 0; |
-@} |
-@end example |
- |
-@code{foo} works even if the mainline passes the same variable for |
-@code{param} and @code{result}, just like the library functions. But |
-sometimes it's tricky to make that work, and an application might not want to |
-bother supporting that sort of thing. |
- |
-For interest, the GMP types @code{mpz_t} etc are implemented as one-element |
-arrays of certain structures. This is why declaring a variable creates an |
-object with the fields GMP needs, but then using it as a parameter passes a |
-pointer to the object. Note that the actual fields in each @code{mpz_t} etc |
-are for internal use only and should not be accessed directly by code that |
-expects to be compatible with future GMP releases. |
- |
- |
-@need 1000 |
-@node Memory Management, Reentrancy, Parameter Conventions, GMP Basics |
-@section Memory Management |
-@cindex Memory management |
- |
-The GMP types like @code{mpz_t} are small, containing only a couple of sizes, |
-and pointers to allocated data. Once a variable is initialized, GMP takes |
-care of all space allocation. Additional space is allocated whenever a |
-variable doesn't have enough. |
- |
-@code{mpz_t} and @code{mpq_t} variables never reduce their allocated space. |
-Normally this is the best policy, since it avoids frequent reallocation. |
-Applications that need to return memory to the heap at some particular point |
-can use @code{mpz_realloc2}, or clear variables no longer needed. |
- |
-@code{mpf_t} variables, in the current implementation, use a fixed amount of |
-space, determined by the chosen precision and allocated at initialization, so |
-their size doesn't change. |
- |
-All memory is allocated using @code{malloc} and friends by default, but this |
-can be changed, see @ref{Custom Allocation}. Temporary memory on the stack is |
-also used (via @code{alloca}), but this can be changed at build-time if |
-desired, see @ref{Build Options}. |
- |
- |
-@node Reentrancy, Useful Macros and Constants, Memory Management, GMP Basics |
-@section Reentrancy |
-@cindex Reentrancy |
-@cindex Thread safety |
-@cindex Multi-threading |
- |
-@noindent |
-GMP is reentrant and thread-safe, with some exceptions: |
- |
-@itemize @bullet |
-@item |
-If configured with @option{--enable-alloca=malloc-notreentrant} (or with |
-@option{--enable-alloca=notreentrant} when @code{alloca} is not available), |
-then naturally GMP is not reentrant. |
- |
-@item |
-@code{mpf_set_default_prec} and @code{mpf_init} use a global variable for the |
-selected precision. @code{mpf_init2} can be used instead, and in the C++ |
-interface an explicit precision to the @code{mpf_class} constructor. |
- |
-@item |
-@code{mpz_random} and the other old random number functions use a global |
-random state and are hence not reentrant. The newer random number functions |
-that accept a @code{gmp_randstate_t} parameter can be used instead. |
- |
-@item |
-@code{gmp_randinit} (obsolete) returns an error indication through a global |
-variable, which is not thread safe. Applications are advised to use |
-@code{gmp_randinit_default} or @code{gmp_randinit_lc_2exp} instead. |
- |
-@item |
-@code{mp_set_memory_functions} uses global variables to store the selected |
-memory allocation functions. |
- |
-@item |
-If the memory allocation functions set by a call to |
-@code{mp_set_memory_functions} (or @code{malloc} and friends by default) are |
-not reentrant, then GMP will not be reentrant either. |
- |
-@item |
-If the standard I/O functions such as @code{fwrite} are not reentrant then the |
-GMP I/O functions using them will not be reentrant either. |
- |
-@item |
-It's safe for two threads to read from the same GMP variable simultaneously, |
-but it's not safe for one to read while another might be writing, nor for |
-two threads to write simultaneously. It's not safe for two threads to |
-generate a random number from the same @code{gmp_randstate_t} simultaneously, |
-since this involves an update of that variable. |
-@end itemize |
- |
- |
-@need 2000 |
-@node Useful Macros and Constants, Compatibility with older versions, Reentrancy, GMP Basics |
-@section Useful Macros and Constants |
-@cindex Useful macros and constants |
-@cindex Constants |
- |
-@deftypevr {Global Constant} {const int} mp_bits_per_limb |
-@findex mp_bits_per_limb |
-@cindex Bits per limb |
-@cindex Limb size |
-The number of bits per limb. |
-@end deftypevr |
- |
-@defmac __GNU_MP_VERSION |
-@defmacx __GNU_MP_VERSION_MINOR |
-@defmacx __GNU_MP_VERSION_PATCHLEVEL |
-@cindex Version number |
-@cindex GMP version number |
-The major and minor GMP version, and patch level, respectively, as integers. |
-For GMP i.j, these numbers will be i, j, and 0, respectively. |
-For GMP i.j.k, these numbers will be i, j, and k, respectively. |
-@end defmac |
- |
-@deftypevr {Global Constant} {const char * const} gmp_version |
-@findex gmp_version |
-The GMP version number, as a null-terminated string, in the form ``i.j.k''. |
-This release is @nicode{"@value{VERSION}"}. Note that before version 4.3.0 |
-the format ``i.j'' was used when k was zero. |
-@end deftypevr |
- |
- |
-@node Compatibility with older versions, Demonstration Programs, Useful Macros and Constants, GMP Basics |
-@section Compatibility with older versions |
-@cindex Compatibility with older versions |
-@cindex Past GMP versions |
-@cindex Upward compatibility |
- |
-This version of GMP is upwardly binary compatible with all 4.x and 3.x |
-versions, and upwardly compatible at the source level with all 2.x versions, |
-with the following exceptions. |
- |
-@itemize @bullet |
-@item |
-@code{mpn_gcd} had its source arguments swapped as of GMP 3.0, for consistency |
-with other @code{mpn} functions. |
- |
-@item |
-@code{mpf_get_prec} counted precision slightly differently in GMP 3.0 and |
-3.0.1, but in 3.1 reverted to the 2.x style. |
-@end itemize |
- |
-There are a number of compatibility issues between GMP 1 and GMP 2 that of |
-course also apply when porting applications from GMP 1 to GMP 4. Please |
-see the GMP 2 manual for details. |
- |
-The Berkeley MP compatibility library (@pxref{BSD Compatible Functions}) is |
-source and binary compatible with the standard @file{libmp}. |
- |
-@c @enumerate |
-@c @item Integer division functions round the result differently. The obsolete |
-@c functions (@code{mpz_div}, @code{mpz_divmod}, @code{mpz_mdiv}, |
-@c @code{mpz_mdivmod}, etc) now all use floor rounding (i.e., they round the |
-@c quotient towards |
-@c @ifinfo |
-@c @minus{}infinity). |
-@c @end ifinfo |
-@c @iftex |
-@c @tex |
-@c $-\infty$). |
-@c @end tex |
-@c @end iftex |
-@c There are a lot of functions for integer division, giving the user better |
-@c control over the rounding. |
- |
-@c @item The function @code{mpz_mod} now compute the true @strong{mod} function. |
- |
-@c @item The functions @code{mpz_powm} and @code{mpz_powm_ui} now use |
-@c @strong{mod} for reduction. |
- |
-@c @item The assignment functions for rational numbers do no longer canonicalize |
-@c their results. In the case a non-canonical result could arise from an |
-@c assignment, the user need to insert an explicit call to |
-@c @code{mpq_canonicalize}. This change was made for efficiency. |
- |
-@c @item Output generated by @code{mpz_out_raw} in this release cannot be read |
-@c by @code{mpz_inp_raw} in previous releases. This change was made for making |
-@c the file format truly portable between machines with different word sizes. |
- |
-@c @item Several @code{mpn} functions have changed. But they were intentionally |
-@c undocumented in previous releases. |
- |
-@c @item The functions @code{mpz_cmp_ui}, @code{mpz_cmp_si}, and @code{mpq_cmp_ui} |
-@c are now implemented as macros, and thereby sometimes evaluate their |
-@c arguments multiple times. |
- |
-@c @item The functions @code{mpz_pow_ui} and @code{mpz_ui_pow_ui} now yield 1 |
-@c for 0^0. (In version 1, they yielded 0.) |
- |
-@c In version 1 of the library, @code{mpq_set_den} handled negative |
-@c denominators by copying the sign to the numerator. That is no longer done. |
- |
-@c Pure assignment functions do not canonicalize the assigned variable. It is |
-@c the responsibility of the user to canonicalize the assigned variable before |
-@c any arithmetic operations are performed on that variable. |
-@c Note that this is an incompatible change from version 1 of the library. |
- |
-@c @end enumerate |
- |
- |
-@need 1000 |
-@node Demonstration Programs, Efficiency, Compatibility with older versions, GMP Basics |
-@section Demonstration programs |
-@cindex Demonstration programs |
-@cindex Example programs |
-@cindex Sample programs |
-The @file{demos} subdirectory has some sample programs using GMP@. These |
-aren't built or installed, but there's a @file{Makefile} with rules for them. |
-For instance, |
- |
-@example |
-make pexpr |
-./pexpr 68^975+10 |
-@end example |
- |
-@noindent |
-The following programs are provided: |
- |
-@itemize @bullet |
-@item |
-@cindex Expression parsing demo |
-@cindex Parsing expressions demo |
-@samp{pexpr} is an expression evaluator, the program used on the GMP web page. |
-@item |
-@cindex Expression parsing demo |
-@cindex Parsing expressions demo |
-The @samp{calc} subdirectory has a similar but simpler evaluator using |
-@command{lex} and @command{yacc}. |
-@item |
-@cindex Expression parsing demo |
-@cindex Parsing expressions demo |
-The @samp{expr} subdirectory is yet another expression evaluator, a library |
-designed for ease of use within a C program. See @file{demos/expr/README} for |
-more information. |
-@item |
-@cindex Factorization demo |
-@samp{factorize} is a Pollard-Rho factorization program. |
-@item |
-@samp{isprime} is a command-line interface to the @code{mpz_probab_prime_p} |
-function. |
-@item |
-@samp{primes} counts or lists primes in an interval, using a sieve. |
-@item |
-@samp{qcn} is an example use of @code{mpz_kronecker_ui} to estimate quadratic |
-class numbers. |
-@item |
-@cindex @code{perl} |
-@cindex GMP Perl module |
-@cindex Perl module |
-The @samp{perl} subdirectory is a comprehensive perl interface to GMP@. See |
-@file{demos/perl/INSTALL} for more information. Documentation is in POD |
-format in @file{demos/perl/GMP.pm}. |
-@end itemize |
- |
-As an aside, consideration has been given at various times to some sort of |
-expression evaluation within the main GMP library. Going beyond something |
-minimal quickly leads to matters like user-defined functions, looping, fixnums |
-for control variables, etc, which are considered outside the scope of GMP |
-(much closer to language interpreters or compilers, @pxref{Language Bindings}). |
-Something simple for program input convenience may yet be a possibility, a |
-combination of the @file{expr} demo and the @file{pexpr} tree back-end |
-perhaps. But for now the above evaluators are offered as illustrations. |
- |
- |
-@need 1000 |
-@node Efficiency, Debugging, Demonstration Programs, GMP Basics |
-@section Efficiency |
-@cindex Efficiency |
- |
-@table @asis |
-@item Small Operands |
-@cindex Small operands |
-On small operands, the time for function call overheads and memory allocation |
-can be significant in comparison to actual calculation. This is unavoidable |
-in a general purpose variable precision library, although GMP attempts to be |
-as efficient as it can on both large and small operands. |
- |
-@item Static Linking |
-@cindex Static linking |
-On some CPUs, in particular the x86s, the static @file{libgmp.a} should be |
-used for maximum speed, since the PIC code in the shared @file{libgmp.so} will |
-have a small overhead on each function call and global data address. For many |
-programs this will be insignificant, but for long calculations there's a gain |
-to be had. |
- |
-@item Initializing and Clearing |
-@cindex Initializing and clearing |
-Avoid excessive initializing and clearing of variables, since this can be |
-quite time consuming, especially in comparison to otherwise fast operations |
-like addition. |
- |
-A language interpreter might want to keep a free list or stack of |
-initialized variables ready for use. It should be possible to integrate |
-something like that with a garbage collector too. |
- |
-@item Reallocations |
-@cindex Reallocations |
-An @code{mpz_t} or @code{mpq_t} variable used to hold successively increasing |
-values will have its memory repeatedly @code{realloc}ed, which could be quite |
-slow or could fragment memory, depending on the C library. If an application |
-can estimate the final size then @code{mpz_init2} or @code{mpz_realloc2} can |
-be called to allocate the necessary space from the beginning |
-(@pxref{Initializing Integers}). |
- |
-It doesn't matter if a size set with @code{mpz_init2} or @code{mpz_realloc2} |
-is too small, since all functions will do a further reallocation if necessary. |
-Badly overestimating memory required will waste space though. |
- |
-@item @code{2exp} Functions |
-@cindex @code{2exp} functions |
-It's up to an application to call functions like @code{mpz_mul_2exp} when |
-appropriate. General purpose functions like @code{mpz_mul} make no attempt to |
-identify powers of two or other special forms, because such inputs will |
-usually be very rare and testing every time would be wasteful. |
- |
-@item @code{ui} and @code{si} Functions |
-@cindex @code{ui} and @code{si} functions |
-The @code{ui} functions and the small number of @code{si} functions exist for |
-convenience and should be used where applicable. But if for example an |
-@code{mpz_t} contains a value that fits in an @code{unsigned long} there's no |
-need to extract it and call a @code{ui} function, just use the regular |
-@code{mpz} function. |
- |
-@item In-Place Operations |
-@cindex In-place operations |
-@code{mpz_abs}, @code{mpq_abs}, @code{mpf_abs}, @code{mpz_neg}, @code{mpq_neg} |
-and @code{mpf_neg} are fast when used for in-place operations like |
-@code{mpz_abs(x,x)}, since in the current implementation only a single field |
-of @code{x} needs changing. On suitable compilers (GCC for instance) this is |
-inlined too. |
- |
-@code{mpz_add_ui}, @code{mpz_sub_ui}, @code{mpf_add_ui} and @code{mpf_sub_ui} |
-benefit from an in-place operation like @code{mpz_add_ui(x,x,y)}, since |
-usually only one or two limbs of @code{x} will need to be changed. The same |
-applies to the full precision @code{mpz_add} etc if @code{y} is small. If |
-@code{y} is big then cache locality may be helped, but that's all. |
- |
-@code{mpz_mul} is currently the opposite, a separate destination is slightly |
-better. A call like @code{mpz_mul(x,x,y)} will, unless @code{y} is only one |
-limb, make a temporary copy of @code{x} before forming the result. Normally |
-that copying will only be a tiny fraction of the time for the multiply, so |
-this is not a particularly important consideration. |
- |
-@code{mpz_set}, @code{mpq_set}, @code{mpq_set_num}, @code{mpf_set}, etc, make |
-no attempt to recognise a copy of something to itself, so a call like |
-@code{mpz_set(x,x)} will be wasteful. Naturally that would never be written |
-deliberately, but if it might arise from two pointers to the same object then |
-a test to avoid it might be desirable. |
- |
-@example |
-if (x != y) |
- mpz_set (x, y); |
-@end example |
- |
-Note that it's never worth introducing extra @code{mpz_set} calls just to get |
-in-place operations. If a result should go to a particular variable then just |
-direct it there and let GMP take care of data movement. |
- |
-@item Divisibility Testing (Small Integers) |
-@cindex Divisibility testing |
-@code{mpz_divisible_ui_p} and @code{mpz_congruent_ui_p} are the best functions |
-for testing whether an @code{mpz_t} is divisible by an individual small |
-integer. They use an algorithm which is faster than @code{mpz_tdiv_ui}, but |
-which gives no useful information about the actual remainder, only whether |
-it's zero (or a particular value). |
- |
-However when testing divisibility by several small integers, it's best to take |
-a remainder modulo their product, to save multi-precision operations. For |
-instance to test whether a number is divisible by any of 23, 29 or 31 take a |
-remainder modulo @math{23@times{}29@times{}31 = 20677} and then test that. |
- |
-The division functions like @code{mpz_tdiv_q_ui} which give a quotient as well |
-as a remainder are generally a little slower than the remainder-only functions |
-like @code{mpz_tdiv_ui}. If the quotient is only rarely wanted then it's |
-probably best to just take a remainder and then go back and calculate the |
-quotient if and when it's wanted (@code{mpz_divexact_ui} can be used if the |
-remainder is zero). |
- |
-@item Rational Arithmetic |
-@cindex Rational arithmetic |
-The @code{mpq} functions operate on @code{mpq_t} values with no common factors |
-in the numerator and denominator. Common factors are checked-for and cast out |
-as necessary. In general, cancelling factors every time is the best approach |
-since it minimizes the sizes for subsequent operations. |
- |
-However, applications that know something about the factorization of the |
-values they're working with might be able to avoid some of the GCDs used for |
-canonicalization, or swap them for divisions. For example when multiplying by |
-a prime it's enough to check for factors of it in the denominator instead of |
-doing a full GCD@. Or when forming a big product it might be known that very |
-little cancellation will be possible, and so canonicalization can be left to |
-the end. |
- |
-The @code{mpq_numref} and @code{mpq_denref} macros give access to the |
-numerator and denominator to do things outside the scope of the supplied |
-@code{mpq} functions. @xref{Applying Integer Functions}. |
- |
-The canonical form for rationals allows mixed-type @code{mpq_t} and integer |
-additions or subtractions to be done directly with multiples of the |
-denominator. This will be somewhat faster than @code{mpq_add}. For example, |
- |
-@example |
-/* mpq increment */ |
-mpz_add (mpq_numref(q), mpq_numref(q), mpq_denref(q)); |
- |
-/* mpq += unsigned long */ |
-mpz_addmul_ui (mpq_numref(q), mpq_denref(q), 123UL); |
- |
-/* mpq -= mpz */ |
-mpz_submul (mpq_numref(q), mpq_denref(q), z); |
-@end example |
- |
-@item Number Sequences |
-@cindex Number sequences |
-Functions like @code{mpz_fac_ui}, @code{mpz_fib_ui} and @code{mpz_bin_uiui} |
-are designed for calculating isolated values. If a range of values is wanted |
-it's probably best to call one of them to get a starting point and iterate |
-from there. |
- |
-@item Text Input/Output |
-@cindex Text input/output |
-Hexadecimal or octal are suggested for input or output in text form. |
-Power-of-2 bases like these can be converted much more efficiently than other |
-bases, like decimal. For big numbers there's usually nothing of particular |
-interest to be seen in the digits, so the base doesn't matter much. |
- |
-Maybe we can hope octal will one day become the normal base for everyday use, |
-as proposed by King Charles XII of Sweden and later reformers. |
-@c Reference: Knuth volume 2 section 4.1, page 184 of second edition. :-) |
-@end table |
- |
- |
-@node Debugging, Profiling, Efficiency, GMP Basics |
-@section Debugging |
-@cindex Debugging |
- |
-@table @asis |
-@item Stack Overflow |
-@cindex Stack overflow |
-@cindex Segmentation violation |
-@cindex Bus error |
-Depending on the system, a segmentation violation or bus error might be the |
-only indication of stack overflow. See @samp{--enable-alloca} choices in |
-@ref{Build Options}, for how to address this. |
- |
-In new enough versions of GCC, @samp{-fstack-check} may be able to ensure an |
-overflow is recognised by the system before too much damage is done, or |
-@samp{-fstack-limit-symbol} or @samp{-fstack-limit-register} may be able to |
-add checking if the system itself doesn't do any (@pxref{Code Gen Options,, |
-Options for Code Generation, gcc, Using the GNU Compiler Collection (GCC)}). |
-These options must be added to the @samp{CFLAGS} used in the GMP build |
-(@pxref{Build Options}), adding them just to an application will have no |
-effect. Note also they're a slowdown, adding overhead to each function call |
-and each stack allocation. |
- |
-@item Heap Problems |
-@cindex Heap problems |
-@cindex Malloc problems |
-The most likely cause of application problems with GMP is heap corruption. |
-Failing to @code{init} GMP variables will have unpredictable effects, and |
-corruption arising elsewhere in a program may well affect GMP@. Initializing |
-GMP variables more than once or failing to clear them will cause memory leaks. |
- |
-@cindex Malloc debugger |
-In all such cases a @code{malloc} debugger is recommended. On a GNU or BSD |
-system the standard C library @code{malloc} has some diagnostic facilities, |
-see @ref{Allocation Debugging,, Allocation Debugging, libc, The GNU C Library |
-Reference Manual}, or @samp{man 3 malloc}. Other possibilities, in no |
-particular order, include |
- |
-@display |
-@uref{http://www.inf.ethz.ch/personal/biere/projects/ccmalloc/} |
-@uref{http://dmalloc.com/} |
-@uref{http://www.perens.com/FreeSoftware/} @ (electric fence) |
-@uref{http://packages.debian.org/stable/devel/fda} |
-@uref{http://www.gnupdate.org/components/leakbug/} |
-@uref{http://people.redhat.com/~otaylor/memprof/} |
-@uref{http://www.cbmamiga.demon.co.uk/mpatrol/} |
-@end display |
- |
-The GMP default allocation routines in @file{memory.c} also have a simple |
-sentinel scheme which can be enabled with @code{#define DEBUG} in that file. |
-This is mainly designed for detecting buffer overruns during GMP development, |
-but might find other uses. |
- |
-@item Stack Backtraces |
-@cindex Stack backtrace |
-On some systems the compiler options GMP uses by default can interfere with |
-debugging. In particular on x86 and 68k systems @samp{-fomit-frame-pointer} |
-is used and this generally inhibits stack backtracing. Recompiling without |
-such options may help while debugging, though the usual caveats about it |
-potentially moving a memory problem or hiding a compiler bug will apply. |
- |
-@item GDB, the GNU Debugger |
-@cindex GDB |
-@cindex GNU Debugger |
-A sample @file{.gdbinit} is included in the distribution, showing how to call |
-some undocumented dump functions to print GMP variables from within GDB@. Note |
-that these functions shouldn't be used in final application code since they're |
-undocumented and may be subject to incompatible changes in future versions of |
-GMP. |
- |
-@item Source File Paths |
-GMP has multiple source files with the same name, in different directories. |
-For example @file{mpz}, @file{mpq} and @file{mpf} each have an |
-@file{init.c}. If the debugger can't already determine the right one it may |
-help to build with absolute paths on each C file. One way to do that is to |
-use a separate object directory with an absolute path to the source directory. |
- |
-@example |
-cd /my/build/dir |
-/my/source/dir/gmp-@value{VERSION}/configure |
-@end example |
- |
-This works via @code{VPATH}, and might require GNU @command{make}. |
-Alternatively it might be possible to change the @code{.c.lo} rules |
-appropriately. |
- |
-@item Assertion Checking |
-@cindex Assertion checking |
-The build option @option{--enable-assert} is available to add some consistency |
-checks to the library (see @ref{Build Options}). These are likely to be of |
-limited value to most applications. Assertion failures are just as likely to |
-indicate memory corruption as a library or compiler bug. |
- |
-Applications using the low-level @code{mpn} functions, however, will benefit |
-from @option{--enable-assert} since it adds checks on the parameters of most |
-such functions, many of which have subtle restrictions on their usage. Note |
-however that only the generic C code has checks, not the assembly code, so |
-CPU @samp{none} should be used for maximum checking. |
- |
-@item Temporary Memory Checking |
-The build option @option{--enable-alloca=debug} arranges that each block of |
-temporary memory in GMP is allocated with a separate call to @code{malloc} (or |
-the allocation function set with @code{mp_set_memory_functions}). |
- |
-This can help a malloc debugger detect accesses outside the intended bounds, |
-or detect memory not released. In a normal build, on the other hand, |
-temporary memory is allocated in blocks which GMP divides up for its own use, |
-or may be allocated with a compiler builtin @code{alloca} which will go |
-nowhere near any malloc debugger hooks. |
- |
-@item Maximum Debuggability |
-To summarize the above, a GMP build for maximum debuggability would be |
- |
-@example |
-./configure --disable-shared --enable-assert \ |
- --enable-alloca=debug --host=none CFLAGS=-g |
-@end example |
- |
-For C++, add @samp{--enable-cxx CXXFLAGS=-g}. |
- |
-@item Checker |
-@cindex Checker |
-@cindex GCC Checker |
-The GCC checker (@uref{http://savannah.nongnu.org/projects/checker/}) can be |
-used with GMP@. It contains a stub library which means GMP applications |
-compiled with checker can use a normal GMP build. |
- |
-A build of GMP with checking within GMP itself can be made. This will run |
-very very slowly. On GNU/Linux for example, |
- |
-@cindex @command{checkergcc} |
-@example |
-./configure --host=none-pc-linux-gnu CC=checkergcc |
-@end example |
- |
-@samp{--host=none} must be used, since the GMP assembly code doesn't support |
-the checking scheme. The GMP C++ features cannot be used, since current |
-versions of checker (0.9.9.1) don't yet support the standard C++ library. |
- |
-@item Valgrind |
-@cindex Valgrind |
-The valgrind program (@uref{http://valgrind.org/}) is a memory |
-checker for x86s. It translates and emulates machine instructions to do |
-strong checks for uninitialized data (at the level of individual bits), memory |
-accesses through bad pointers, and memory leaks. |
- |
-Recent versions of Valgrind are getting support for MMX and SSE/SSE2 |
-instructions; for past versions GMP will need to be configured not to use |
-those, ie.@: for an x86 without them (for instance plain @samp{i486}). |
- |
-@item Other Problems |
-Any suspected bug in GMP itself should be isolated to make sure it's not an |
-application problem, see @ref{Reporting Bugs}. |
-@end table |
- |
- |
-@node Profiling, Autoconf, Debugging, GMP Basics |
-@section Profiling |
-@cindex Profiling |
-@cindex Execution profiling |
-@cindex @code{--enable-profiling} |
- |
-Running a program under a profiler is a good way to find where it's spending |
-most time and where improvements can be best sought. The profiling choices |
-for a GMP build are as follows. |
- |
-@table @asis |
-@item @samp{--disable-profiling} |
-The default is to add nothing special for profiling. |
- |
-It should be possible to just compile the mainline of a program with @code{-p} |
-and use @command{prof} to get a profile consisting of timer-based sampling of |
-the program counter. Most of the GMP assembly code has the necessary symbol |
-information. |
- |
-This approach has the advantage of minimizing interference with normal program |
-operation, but on most systems the resolution of the sampling is quite low (10 |
-milliseconds for instance), requiring long runs to get accurate information. |
- |
-@item @samp{--enable-profiling=prof} |
-@cindex @code{prof} |
-Build with support for the system @command{prof}, which means @samp{-p} added |
-to the @samp{CFLAGS}. |
- |
-This provides call counting in addition to program counter sampling, which |
-allows the most frequently called routines to be identified, and an average |
-time spent in each routine to be determined. |
- |
-The x86 assembly code has support for this option, but on other processors |
-the assembly routines will be as if compiled without @samp{-p} and therefore |
-won't appear in the call counts. |
- |
-On some systems, such as GNU/Linux, @samp{-p} in fact means @samp{-pg} and in |
-this case @samp{--enable-profiling=gprof} described below should be used |
-instead. |
- |
-@item @samp{--enable-profiling=gprof} |
-@cindex @code{gprof} |
-Build with support for @command{gprof}, which means @samp{-pg} added to the |
-@samp{CFLAGS}. |
- |
-This provides call graph construction in addition to call counting and program |
-counter sampling, which makes it possible to count calls coming from different |
-locations. For example the number of calls to @code{mpn_mul} from |
-@code{mpz_mul} versus the number from @code{mpf_mul}. The program counter |
-sampling is still flat though, so only a total time in @code{mpn_mul} would be |
-accumulated, not a separate amount for each call site. |
- |
-The x86 assembly code has support for this option, but on other processors |
-the assembly routines will be as if compiled without @samp{-pg} and therefore |
-not be included in the call counts. |
- |
-On x86 and m68k systems @samp{-pg} and @samp{-fomit-frame-pointer} are |
-incompatible, so the latter is omitted from the default flags in that case, |
-which might result in poorer code generation. |
- |
-Incidentally, it should be possible to use the @command{gprof} program with a |
-plain @samp{--enable-profiling=prof} build. But in that case only the |
-@samp{gprof -p} flat profile and call counts can be expected to be valid, not |
-the @samp{gprof -q} call graph. |
- |
-@item @samp{--enable-profiling=instrument} |
-@cindex @code{-finstrument-functions} |
-@cindex @code{instrument-functions} |
-Build with the GCC option @samp{-finstrument-functions} added to the |
-@samp{CFLAGS} (@pxref{Code Gen Options,, Options for Code Generation, gcc, |
-Using the GNU Compiler Collection (GCC)}). |
- |
-This inserts special instrumenting calls at the start and end of each |
-function, allowing exact timing and full call graph construction. |
- |
-This instrumenting is not normally a standard system feature and will require |
-support from an external library, such as |
- |
-@cindex FunctionCheck |
-@cindex fnccheck |
-@display |
-@uref{http://sourceforge.net/projects/fnccheck/} |
-@end display |
- |
-This should be included in @samp{LIBS} during the GMP configure so that test |
-programs will link. For example, |
- |
-@example |
-./configure --enable-profiling=instrument LIBS=-lfc |
-@end example |
- |
-On a GNU system the C library provides dummy instrumenting functions, so |
-programs compiled with this option will link. In this case it's only |
-necessary to ensure the correct library is added when linking an application. |
- |
-The x86 assembly code supports this option, but on other processors the |
-assembly routines will be as if compiled without |
-@samp{-finstrument-functions} meaning time spent in them will effectively be |
-attributed to their caller. |
-@end table |
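For the `--enable-profiling=instrument` choice, GCC's `-finstrument-functions` emits calls to `__cyg_profile_func_enter` and `__cyg_profile_func_exit`. The following is a minimal sketch of what a supporting library might provide, here just counting calls. The counter names and the helper `instrumented_calls_in_flight` are hypothetical; a real profiler would record `this_fn`, `call_site` and timestamps.

```c
/* Hooks called by code compiled with GCC's -finstrument-functions.
   The hooks themselves must not be instrumented, or they would recurse. */
static unsigned long enter_count = 0;
static unsigned long exit_count = 0;

__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    (void) this_fn;             /* address of the function being entered */
    (void) call_site;           /* address it was called from */
    enter_count++;
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void) this_fn;
    (void) call_site;
    exit_count++;
}

/* Number of instrumented functions entered but not yet exited. */
__attribute__((no_instrument_function))
unsigned long instrumented_calls_in_flight(void)
{
    return enter_count - exit_count;
}
```

Linking an object file like this into the application replaces the dummy hooks mentioned above for GNU systems.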
- |
- |
-@node Autoconf, Emacs, Profiling, GMP Basics |
-@section Autoconf |
-@cindex Autoconf |
- |
-Autoconf based applications can easily check whether GMP is installed. The |
-only thing to be noted is that GMP library symbols from version 3 onwards have |
-prefixes like @code{__gmpz}. The following therefore would be a simple test, |
- |
-@cindex @code{AC_CHECK_LIB} |
-@example |
-AC_CHECK_LIB(gmp, __gmpz_init) |
-@end example |
- |
-This just uses the default @code{AC_CHECK_LIB} actions for found or not found, |
-but an application that must have GMP would want to generate an error if not |
-found. For example, |
- |
-@example |
-AC_CHECK_LIB(gmp, __gmpz_init, , |
- [AC_MSG_ERROR([GNU MP not found, see http://gmplib.org/])]) |
-@end example |
- |
-If functions added in some particular version of GMP are required, then one of |
-those can be used when checking. For example @code{mpz_mul_si} was added in |
-GMP 3.1, |
- |
-@example |
-AC_CHECK_LIB(gmp, __gmpz_mul_si, , |
- [AC_MSG_ERROR( |
- [GNU MP not found, or not 3.1 or up, see http://gmplib.org/])]) |
-@end example |
- |
-An alternative would be to test the version number in @file{gmp.h} using say |
-@code{AC_EGREP_CPP}. That would make it possible to test the exact version, |
-if some particular sub-minor release is known to be necessary. |
- |
-In general it's recommended that applications should simply demand a new |
-enough GMP rather than trying to provide supplements for features not |
-available in past versions. |
- |
-Occasionally an application will need or want to know the size of a type at |
-configuration or preprocessing time, not just with @code{sizeof} in the code. |
-This can be done in the normal way with @code{mp_limb_t} etc, but GMP 4.0 or |
-up is best for this, since prior versions needed certain @samp{-D} defines on |
-systems using a @code{long long} limb. The following would suit Autoconf 2.50 |
-or up, |
- |
-@example |
-AC_CHECK_SIZEOF(mp_limb_t, , [#include <gmp.h>]) |
-@end example |
- |
- |
-@node Emacs, , Autoconf, GMP Basics |
-@section Emacs |
-@cindex Emacs |
-@cindex @code{info-lookup-symbol} |
- |
-@key{C-h C-i} (@code{info-lookup-symbol}) is a good way to find documentation |
-on C functions while editing (@pxref{Info Lookup, , Info Documentation Lookup, |
-emacs, The Emacs Editor}). |
- |
-The GMP manual can be included in such lookups by putting the following in |
-your @file{.emacs}, |
- |
-@c This isn't pretty, but there doesn't seem to be a better way (in emacs |
-@c 21.2 at least). info-lookup->mode-value could be used for the "assoc"s, |
-@c but that function isn't documented, whereas info-lookup-alist is. |
-@c |
-@example |
-(eval-after-load "info-look" |
- '(let ((mode-value (assoc 'c-mode (assoc 'symbol info-lookup-alist)))) |
- (setcar (nthcdr 3 mode-value) |
- (cons '("(gmp)Function Index" nil "^ -.* " "\\>") |
- (nth 3 mode-value))))) |
-@end example |
- |
- |
-@node Reporting Bugs, Integer Functions, GMP Basics, Top |
-@comment node-name, next, previous, up |
-@chapter Reporting Bugs |
-@cindex Reporting bugs |
-@cindex Bug reporting |
- |
-If you think you have found a bug in the GMP library, please investigate it |
-and report it. We have made this library available to you, and it is not too |
-much to ask you to report the bugs you find. |
- |
-Before you report a bug, check it's not already addressed in @ref{Known Build |
-Problems}, or perhaps @ref{Notes for Particular Systems}. You may also want |
-to check @uref{http://gmplib.org/} for patches for this release. |
- |
-Please include the following in any report, |
- |
-@itemize @bullet |
-@item |
-The GMP version number, and if pre-packaged or patched then say so. |
- |
-@item |
-A test program that makes it possible for us to reproduce the bug. Include |
-instructions on how to run the program. |
- |
-@item |
-A description of what is wrong. If the results are incorrect, say in what |
-way. If you get a crash, say so. |
- |
-@item |
-If you get a crash, include a stack backtrace from the debugger if it's |
-informative (@samp{where} in @command{gdb}, or @samp{$C} in @command{adb}). |
- |
-@item |
-Please do not send core dumps, executables or @command{strace}s. |
- |
-@item |
-The configuration options you used when building GMP, if any. |
- |
-@item |
-The name of the compiler and its version. For @command{gcc}, get the version |
-with @samp{gcc -v}, otherwise perhaps @samp{what `which cc`}, or similar. |
- |
-@item |
-The output from running @samp{uname -a}. |
- |
-@item |
-The output from running @samp{./config.guess}, and from running |
-@samp{./configfsf.guess} (might be the same). |
- |
-@item |
-If the bug is related to @samp{configure}, then the compressed contents of |
-@file{config.log}. |
- |
-@item |
-If the bug is related to an @file{asm} file not assembling, then the contents |
-of @file{config.m4} and the offending line or lines from the temporary |
-@file{mpn/tmp-<file>.s}. |
-@end itemize |
- |
-Please make an effort to produce a self-contained report, with something |
-definite that can be tested or debugged. Vague queries or piecemeal messages |
-are difficult to act on and don't help the development effort. |
- |
-It is not uncommon that an observed problem is actually due to a bug in the |
-compiler; the GMP code tends to explore interesting corners in compilers. |
- |
-If your bug report is good, we will do our best to help you get a corrected |
-version of the library; if the bug report is poor, we won't do anything about |
-it (except maybe ask you to send a better report). |
- |
-Send your report to: @email{gmp-bugs@@gmplib.org}. |
- |
-If you think something in this manual is unclear, or downright incorrect, or if |
-the language needs to be improved, please send a note to the same address. |
- |
- |
-@node Integer Functions, Rational Number Functions, Reporting Bugs, Top |
-@comment node-name, next, previous, up |
-@chapter Integer Functions |
-@cindex Integer functions |
- |
-This chapter describes the GMP functions for performing integer arithmetic. |
-These functions start with the prefix @code{mpz_}. |
- |
-GMP integers are stored in objects of type @code{mpz_t}. |
- |
-@menu |
-* Initializing Integers:: |
-* Assigning Integers:: |
-* Simultaneous Integer Init & Assign:: |
-* Converting Integers:: |
-* Integer Arithmetic:: |
-* Integer Division:: |
-* Integer Exponentiation:: |
-* Integer Roots:: |
-* Number Theoretic Functions:: |
-* Integer Comparisons:: |
-* Integer Logic and Bit Fiddling:: |
-* I/O of Integers:: |
-* Integer Random Numbers:: |
-* Integer Import and Export:: |
-* Miscellaneous Integer Functions:: |
-* Integer Special Functions:: |
-@end menu |
- |
-@node Initializing Integers, Assigning Integers, Integer Functions, Integer Functions |
-@comment node-name, next, previous, up |
-@section Initialization Functions |
-@cindex Integer initialization functions |
-@cindex Initialization functions |
- |
-The functions for integer arithmetic assume that all integer objects are |
-initialized. You do that by calling the function @code{mpz_init}. For |
-example, |
- |
-@example |
-@{ |
- mpz_t integ; |
- mpz_init (integ); |
- @dots{} |
- mpz_add (integ, @dots{}); |
- @dots{} |
- mpz_sub (integ, @dots{}); |
- |
- /* Unless the program is about to exit, do ... */ |
- mpz_clear (integ); |
-@} |
-@end example |
- |
-As you can see, you can store new values any number of times, once an |
-object is initialized. |
- |
-@deftypefun void mpz_init (mpz_t @var{integer}) |
-Initialize @var{integer}, and set its value to 0. |
-@end deftypefun |
- |
-@deftypefun void mpz_init2 (mpz_t @var{integer}, unsigned long @var{n}) |
-Initialize @var{integer}, with space for @var{n} bits, and set its value to 0. |
- |
-@var{n} is only the initial space, @var{integer} will grow automatically in |
-the normal way, if necessary, for subsequent values stored. @code{mpz_init2} |
-makes it possible to avoid such reallocations if a maximum size is known in |
-advance. |
-@end deftypefun |
- |
-@deftypefun void mpz_clear (mpz_t @var{integer}) |
-Free the space occupied by @var{integer}. Call this function for all |
-@code{mpz_t} variables when you are done with them. |
-@end deftypefun |
- |
-@deftypefun void mpz_realloc2 (mpz_t @var{integer}, unsigned long @var{n}) |
-Change the space allocated for @var{integer} to @var{n} bits. The value in |
-@var{integer} is preserved if it fits, or is set to 0 if not. |
- |
-This function can be used to increase the space for a variable in order to |
-avoid repeated automatic reallocations, or to decrease it to give memory back |
-to the heap. |
-@end deftypefun |
- |
- |
-@node Assigning Integers, Simultaneous Integer Init & Assign, Initializing Integers, Integer Functions |
-@comment node-name, next, previous, up |
-@section Assignment Functions |
-@cindex Integer assignment functions |
-@cindex Assignment functions |
- |
-These functions assign new values to already initialized integers |
-(@pxref{Initializing Integers}). |
- |
-@deftypefun void mpz_set (mpz_t @var{rop}, mpz_t @var{op}) |
-@deftypefunx void mpz_set_ui (mpz_t @var{rop}, unsigned long int @var{op}) |
-@deftypefunx void mpz_set_si (mpz_t @var{rop}, signed long int @var{op}) |
-@deftypefunx void mpz_set_d (mpz_t @var{rop}, double @var{op}) |
-@deftypefunx void mpz_set_q (mpz_t @var{rop}, mpq_t @var{op}) |
-@deftypefunx void mpz_set_f (mpz_t @var{rop}, mpf_t @var{op}) |
-Set the value of @var{rop} from @var{op}. |
- |
-@code{mpz_set_d}, @code{mpz_set_q} and @code{mpz_set_f} truncate @var{op} to |
-make it an integer. |
-@end deftypefun |
- |
-@deftypefun int mpz_set_str (mpz_t @var{rop}, char *@var{str}, int @var{base}) |
-Set the value of @var{rop} from @var{str}, a null-terminated C string in base |
-@var{base}. White space is allowed in the string, and is simply ignored. |
- |
-The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading |
-characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and |
-@code{0B} for binary, @code{0} for octal, or decimal otherwise. |
- |
-For bases up to 36, case is ignored; upper-case and lower-case letters have |
-the same value. For bases 37 to 62, upper-case letters represent the usual |
-10..35 while lower-case letters represent 36..61. |
- |
-This function returns 0 if the entire string is a valid number in base |
-@var{base}. Otherwise it returns @minus{}1. |
-@c |
-@c It turns out that it is not entirely true that this function ignores |
-@c white-space. It does ignore it between digits, but not after a minus sign |
-@c or within or after ``0x''. Some thought was given to disallowing all |
-@c whitespace, but that would be an incompatible change, whitespace has been |
-@c documented as ignored ever since GMP 1. |
-@c |
-@end deftypefun |
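The digit rules described for `mpz_set_str` can be modelled in a few lines of plain C. This is only an illustrative model, not GMP's own code: `digit_value` is a hypothetical helper, and it assumes a character set such as ASCII in which the letters are contiguous.

```c
#include <assert.h>

/* Value of character c as a digit in the given base (2..62),
   or -1 if c is not a valid digit in that base. */
int digit_value(int c, int base)
{
    int v;

    assert(base >= 2 && base <= 62);

    if (c >= '0' && c <= '9')
        v = c - '0';
    else if (c >= 'A' && c <= 'Z')
        v = c - 'A' + 10;               /* upper case is always 10..35 */
    else if (c >= 'a' && c <= 'z')
        v = (base <= 36) ? c - 'a' + 10 /* case ignored up to base 36 */
                         : c - 'a' + 36; /* 36..61 for bases 37..62 */
    else
        return -1;

    return v < base ? v : -1;
}
```

For instance `digit_value('f', 16)` and `digit_value('F', 16)` both give 15, while in base 62 `'A'` is 10 and `'a'` is 36.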
- |
-@deftypefun void mpz_swap (mpz_t @var{rop1}, mpz_t @var{rop2}) |
-Swap the values @var{rop1} and @var{rop2} efficiently. |
-@end deftypefun |
- |
- |
-@node Simultaneous Integer Init & Assign, Converting Integers, Assigning Integers, Integer Functions |
-@comment node-name, next, previous, up |
-@section Combined Initialization and Assignment Functions |
-@cindex Integer assignment functions |
-@cindex Assignment functions |
-@cindex Integer initialization functions |
-@cindex Initialization functions |
- |
-For convenience, GMP provides a parallel series of initialize-and-set functions |
-which initialize the output and then store the value there. These functions' |
-names have the form @code{mpz_init_set@dots{}}. |
- |
-Here is an example of using one: |
- |
-@example |
-@{ |
- mpz_t pie; |
- mpz_init_set_str (pie, "3141592653589793238462643383279502884", 10); |
- @dots{} |
- mpz_sub (pie, @dots{}); |
- @dots{} |
- mpz_clear (pie); |
-@} |
-@end example |
- |
-@noindent |
-Once the integer has been initialized by any of the @code{mpz_init_set@dots{}} |
-functions, it can be used as the source or destination operand for the ordinary |
-integer functions. Don't use an initialize-and-set function on a variable |
-already initialized! |
- |
-@deftypefun void mpz_init_set (mpz_t @var{rop}, mpz_t @var{op}) |
-@deftypefunx void mpz_init_set_ui (mpz_t @var{rop}, unsigned long int @var{op}) |
-@deftypefunx void mpz_init_set_si (mpz_t @var{rop}, signed long int @var{op}) |
-@deftypefunx void mpz_init_set_d (mpz_t @var{rop}, double @var{op}) |
-Initialize @var{rop} with limb space and set the initial numeric value from |
-@var{op}. |
-@end deftypefun |
- |
-@deftypefun int mpz_init_set_str (mpz_t @var{rop}, char *@var{str}, int @var{base}) |
-Initialize @var{rop} and set its value like @code{mpz_set_str} (see its |
-documentation above for details). |
- |
-If the string is a correct base @var{base} number, the function returns 0; |
-if an error occurs it returns @minus{}1. @var{rop} is initialized even if |
-an error occurs. (I.e., you have to call @code{mpz_clear} for it.) |
-@end deftypefun |
- |
- |
-@node Converting Integers, Integer Arithmetic, Simultaneous Integer Init & Assign, Integer Functions |
-@comment node-name, next, previous, up |
-@section Conversion Functions |
-@cindex Integer conversion functions |
-@cindex Conversion functions |
- |
-This section describes functions for converting GMP integers to standard C |
-types. Functions for converting @emph{to} GMP integers are described in |
-@ref{Assigning Integers} and @ref{I/O of Integers}. |
- |
-@deftypefun {unsigned long int} mpz_get_ui (mpz_t @var{op}) |
-Return the value of @var{op} as an @code{unsigned long}. |
- |
-If @var{op} is too big to fit an @code{unsigned long} then just the least |
-significant bits that do fit are returned. The sign of @var{op} is ignored, |
-only the absolute value is used. |
-@end deftypefun |
- |
-@deftypefun {signed long int} mpz_get_si (mpz_t @var{op}) |
-If @var{op} fits into a @code{signed long int} return the value of @var{op}. |
-Otherwise return the least significant part of @var{op}, with the same sign |
-as @var{op}. |
- |
-If @var{op} is too big to fit in a @code{signed long int}, the returned |
-result is probably not very useful. To find out if the value will fit, use |
-the function @code{mpz_fits_slong_p}. |
-@end deftypefun |
- |
-@deftypefun double mpz_get_d (mpz_t @var{op}) |
-Convert @var{op} to a @code{double}, truncating if necessary (ie.@: rounding |
-towards zero). |
- |
-If the exponent from the conversion is too big, the result is system |
-dependent. An infinity is returned where available. A hardware overflow trap |
-may or may not occur. |
-@end deftypefun |
- |
-@deftypefun double mpz_get_d_2exp (signed long int *@var{exp}, mpz_t @var{op}) |
-Convert @var{op} to a @code{double}, truncating if necessary (ie.@: rounding |
-towards zero), and returning the exponent separately. |
- |
-The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the |
-exponent is stored to @code{*@var{exp}}. @m{@var{d} * 2^{exp}, @var{d} * |
-2^@var{exp}} is the (truncated) @var{op} value. If @var{op} is zero, the |
-return is @math{0.0} and 0 is stored to @code{*@var{exp}}. |
- |
-@cindex @code{frexp} |
-This is similar to the standard C @code{frexp} function (@pxref{Normalization |
-Functions,,, libc, The GNU C Library Reference Manual}). |
-@end deftypefun |
- |
-@deftypefun {char *} mpz_get_str (char *@var{str}, int @var{base}, mpz_t @var{op}) |
-Convert @var{op} to a string of digits in base @var{base}. The base argument |
-may vary from 2 to 62 or from @minus{}2 to @minus{}36. |
- |
-For @var{base} in the range 2..36, digits and lower-case letters are used; for |
-@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62, |
-digits, upper-case letters, and lower-case letters (in that significance order) |
-are used. |
- |
-If @var{str} is @code{NULL}, the result string is allocated using the current |
-allocation function (@pxref{Custom Allocation}). The block will be |
-@code{strlen(str)+1} bytes, that being exactly enough for the string and |
-null-terminator. |
- |
-If @var{str} is not @code{NULL}, it should point to a block of storage large |
-enough for the result, that being @code{mpz_sizeinbase (@var{op}, @var{base}) |
-+ 2}. The two extra bytes are for a possible minus sign, and the |
-null-terminator. |
- |
-A pointer to the result string is returned, being either the allocated block, |
-or the given @var{str}. |
-@end deftypefun |
- |
- |
-@need 2000 |
-@node Integer Arithmetic, Integer Division, Converting Integers, Integer Functions |
-@comment node-name, next, previous, up |
-@section Arithmetic Functions |
-@cindex Integer arithmetic functions |
-@cindex Arithmetic functions |
- |
-@deftypefun void mpz_add (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) |
-@deftypefunx void mpz_add_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2}) |
-Set @var{rop} to @math{@var{op1} + @var{op2}}. |
-@end deftypefun |
- |
-@deftypefun void mpz_sub (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) |
-@deftypefunx void mpz_sub_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2}) |
-@deftypefunx void mpz_ui_sub (mpz_t @var{rop}, unsigned long int @var{op1}, mpz_t @var{op2}) |
-Set @var{rop} to @var{op1} @minus{} @var{op2}. |
-@end deftypefun |
- |
-@deftypefun void mpz_mul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) |
-@deftypefunx void mpz_mul_si (mpz_t @var{rop}, mpz_t @var{op1}, long int @var{op2}) |
-@deftypefunx void mpz_mul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2}) |
-Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}. |
-@end deftypefun |
- |
-@deftypefun void mpz_addmul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) |
-@deftypefunx void mpz_addmul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2}) |
-Set @var{rop} to @math{@var{rop} + @var{op1} @GMPtimes{} @var{op2}}. |
-@end deftypefun |
- |
-@deftypefun void mpz_submul (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) |
-@deftypefunx void mpz_submul_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2}) |
-Set @var{rop} to @math{@var{rop} - @var{op1} @GMPtimes{} @var{op2}}. |
-@end deftypefun |
- |
-@deftypefun void mpz_mul_2exp (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2}) |
-@cindex Bit shift left |
-Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to |
-@var{op2}}. This operation can also be defined as a left shift by @var{op2} |
-bits. |
-@end deftypefun |
- |
-@deftypefun void mpz_neg (mpz_t @var{rop}, mpz_t @var{op}) |
-Set @var{rop} to @minus{}@var{op}. |
-@end deftypefun |
- |
-@deftypefun void mpz_abs (mpz_t @var{rop}, mpz_t @var{op}) |
-Set @var{rop} to the absolute value of @var{op}. |
-@end deftypefun |
- |
- |
-@need 2000 |
-@node Integer Division, Integer Exponentiation, Integer Arithmetic, Integer Functions |
-@section Division Functions |
-@cindex Integer division functions |
-@cindex Division functions |
- |
-Division is undefined if the divisor is zero. Passing a zero divisor to the |
-division or modulo functions (including the modular powering functions |
-@code{mpz_powm} and @code{mpz_powm_ui}), will cause an intentional division by |
-zero. This lets a program handle arithmetic exceptions in these functions the |
-same way as for normal C @code{int} arithmetic. |
- |
-@c Separate deftypefun groups for cdiv, fdiv and tdiv produce a blank line |
-@c between each, and seem to let tex do a better job of page breaks than an |
-@c @sp 1 in the middle of one big set. |
- |
-@deftypefun void mpz_cdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d}) |
-@deftypefunx void mpz_cdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d}) |
-@deftypefunx void mpz_cdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d}) |
-@maybepagebreak |
-@deftypefunx {unsigned long int} mpz_cdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}}) |
-@deftypefunx {unsigned long int} mpz_cdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}}) |
-@deftypefunx {unsigned long int} mpz_cdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}}) |
-@deftypefunx {unsigned long int} mpz_cdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}}) |
-@maybepagebreak |
-@deftypefunx void mpz_cdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{b}}) |
-@deftypefunx void mpz_cdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{b}}) |
-@end deftypefun |
- |
-@deftypefun void mpz_fdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d}) |
-@deftypefunx void mpz_fdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d}) |
-@deftypefunx void mpz_fdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d}) |
-@maybepagebreak |
-@deftypefunx {unsigned long int} mpz_fdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}}) |
-@deftypefunx {unsigned long int} mpz_fdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}}) |
-@deftypefunx {unsigned long int} mpz_fdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}}) |
-@deftypefunx {unsigned long int} mpz_fdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}}) |
-@maybepagebreak |
-@deftypefunx void mpz_fdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{b}}) |
-@deftypefunx void mpz_fdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{b}}) |
-@end deftypefun |
- |
-@deftypefun void mpz_tdiv_q (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d}) |
-@deftypefunx void mpz_tdiv_r (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d}) |
-@deftypefunx void mpz_tdiv_qr (mpz_t @var{q}, mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d}) |
-@maybepagebreak |
-@deftypefunx {unsigned long int} mpz_tdiv_q_ui (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{d}}) |
-@deftypefunx {unsigned long int} mpz_tdiv_r_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}}) |
-@deftypefunx {unsigned long int} mpz_tdiv_qr_ui (mpz_t @var{q}, mpz_t @var{r}, @w{mpz_t @var{n}}, @w{unsigned long int @var{d}}) |
-@deftypefunx {unsigned long int} mpz_tdiv_ui (mpz_t @var{n}, @w{unsigned long int @var{d}}) |
-@maybepagebreak |
-@deftypefunx void mpz_tdiv_q_2exp (mpz_t @var{q}, mpz_t @var{n}, @w{unsigned long int @var{b}}) |
-@deftypefunx void mpz_tdiv_r_2exp (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{b}}) |
-@cindex Bit shift right |
- |
-@sp 1 |
-Divide @var{n} by @var{d}, forming a quotient @var{q} and/or remainder |
-@var{r}. For the @code{2exp} functions, @m{@var{d}=2^b, @var{d}=2^@var{b}}. |
-The rounding is in three styles, each suiting different applications. |
- |
-@itemize @bullet |
-@item |
-@code{cdiv} rounds @var{q} up towards @m{+\infty, +infinity}, and @var{r} will |
-have the opposite sign to @var{d}. The @code{c} stands for ``ceil''. |
- |
-@item |
-@code{fdiv} rounds @var{q} down towards @m{-\infty, @minus{}infinity}, and |
-@var{r} will have the same sign as @var{d}. The @code{f} stands for |
-``floor''. |
- |
-@item |
-@code{tdiv} rounds @var{q} towards zero, and @var{r} will have the same sign |
-as @var{n}. The @code{t} stands for ``truncate''. |
-@end itemize |
- |
-In all cases @var{q} and @var{r} will satisfy |
-@m{@var{n}=@var{q}@var{d}+@var{r}, @var{n}=@var{q}*@var{d}+@var{r}}, and |
-@var{r} will satisfy @math{0@le{}@GMPabs{@var{r}}<@GMPabs{@var{d}}}. |
- |
-The @code{q} functions calculate only the quotient, the @code{r} functions |
-only the remainder, and the @code{qr} functions calculate both. Note that for |
-@code{qr} the same variable cannot be passed for both @var{q} and @var{r}, or |
-results will be unpredictable. |
- |
-For the @code{ui} variants the return value is the remainder, and in fact |
-returning the remainder is all the @code{div_ui} functions do. For |
-@code{tdiv} and @code{cdiv} the remainder can be negative, so for those the |
-return value is the absolute value of the remainder. |
- |
-For the @code{2exp} variants the divisor is @m{2^b,2^@var{b}}. These |
-functions are implemented as right shifts and bit masks, but of course they |
-round the same as the other functions. |
- |
-For positive @var{n} both @code{mpz_fdiv_q_2exp} and @code{mpz_tdiv_q_2exp} |
-are simple bitwise right shifts. For negative @var{n}, @code{mpz_fdiv_q_2exp} |
-is effectively an arithmetic right shift treating @var{n} as twos complement |
-the same as the bitwise logical functions do, whereas @code{mpz_tdiv_q_2exp} |
-effectively treats @var{n} as sign and magnitude. |
-@end deftypefun |
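- |
-The three rounding styles can be illustrated on ordinary C integers. The |
-following is a minimal sketch, not GMP code: the type @code{qr_t} and the |
-functions @code{tdiv_qr}, @code{fdiv_qr} and @code{cdiv_qr} are hypothetical |
-names working on @code{long}, and they assume that @var{d} is non-zero and |
-that no overflow occurs. |
- |
```c
/* Quotient and remainder pair, mirroring the q and r outputs above. */
typedef struct { long q; long r; } qr_t;

/* tdiv: C's own / and % truncate towards zero (C99 and later), so the
   remainder has the same sign as n. */
qr_t tdiv_qr(long n, long d)
{
    qr_t out = { n / d, n % d };
    return out;
}

/* fdiv: round q towards -infinity, so r has the same sign as d. */
qr_t fdiv_qr(long n, long d)
{
    qr_t out = tdiv_qr(n, d);
    if (out.r != 0 && ((out.r < 0) != (d < 0))) {
        out.q -= 1;
        out.r += d;
    }
    return out;
}

/* cdiv: round q towards +infinity, so r has the opposite sign to d. */
qr_t cdiv_qr(long n, long d)
{
    qr_t out = tdiv_qr(n, d);
    if (out.r != 0 && ((out.r < 0) == (d < 0))) {
        out.q += 1;
        out.r -= d;
    }
    return out;
}
```
- |
-With @math{@var{n}=-7} and @math{@var{d}=2}, this sketch gives @math{(-3,-1)} |
-for @code{tdiv_qr}, @math{(-4,1)} for @code{fdiv_qr} and @math{(-3,-1)} for |
-@code{cdiv_qr}, each satisfying @math{@var{n}=@var{q}*@var{d}+@var{r}}. |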
- |
-@deftypefun void mpz_mod (mpz_t @var{r}, mpz_t @var{n}, mpz_t @var{d}) |
-@deftypefunx {unsigned long int} mpz_mod_ui (mpz_t @var{r}, mpz_t @var{n}, @w{unsigned long int @var{d}}) |
-Set @var{r} to @var{n} @code{mod} @var{d}. The sign of the divisor is |
-ignored; the result is always non-negative. |
- |
-@code{mpz_mod_ui} is identical to @code{mpz_fdiv_r_ui} above, returning the |
-remainder as well as setting @var{r}. See @code{mpz_fdiv_ui} above if only |
-the return value is wanted. |
-@end deftypefun |
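- |
-As a rough illustration of the result range, the same rule can be written for |
-plain @code{long} values. The helper @code{mod_nonneg} is a hypothetical name, |
-not part of GMP, and it assumes that @var{d} is non-zero and can be negated |
-without overflow. |
- |
```c
/* Hypothetical helper: n mod d with the sign of d ignored and a result in
   the range 0 .. |d|-1.  Assumes d != 0 and d != LONG_MIN. */
long mod_nonneg(long n, long d)
{
    long m = d < 0 ? -d : d;    /* ignore the sign of the divisor */
    long r = n % m;             /* the C remainder has the sign of n */
    return r < 0 ? r + m : r;   /* move negative remainders into range */
}
```
- |
-For example @code{mod_nonneg (-7, 3)} and @code{mod_nonneg (-7, -3)} both |
-give 2. |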
- |
-@deftypefun void mpz_divexact (mpz_t @var{q}, mpz_t @var{n}, mpz_t @var{d}) |
-@deftypefunx void mpz_divexact_ui (mpz_t @var{q}, mpz_t @var{n}, unsigned long @var{d}) |
-@cindex Exact division functions |
-Set @var{q} to @var{n}/@var{d}. These functions produce correct results only |
-when it is known in advance that @var{d} divides @var{n}. |
- |
-These routines are much faster than the other division functions, and are the |
-best choice when exact division is known to occur, for example reducing a |
-rational to lowest terms. |
-@end deftypefun |
- |
-@deftypefun int mpz_divisible_p (mpz_t @var{n}, mpz_t @var{d}) |
-@deftypefunx int mpz_divisible_ui_p (mpz_t @var{n}, unsigned long int @var{d}) |
-@deftypefunx int mpz_divisible_2exp_p (mpz_t @var{n}, unsigned long int @var{b}) |
-@cindex Divisibility functions |
-Return non-zero if @var{n} is exactly divisible by @var{d}, or in the case of |
-@code{mpz_divisible_2exp_p} by @m{2^b,2^@var{b}}. |
- |
-@var{n} is divisible by @var{d} if there exists an integer @var{q} satisfying |
-@math{@var{n} = @var{q}@GMPmultiply{}@var{d}}. Unlike the other division |
-functions, @math{@var{d}=0} is accepted and following the rule it can be seen |
-that only 0 is considered divisible by 0. |
-@end deftypefun |
- |
-@deftypefun int mpz_congruent_p (mpz_t @var{n}, mpz_t @var{c}, mpz_t @var{d}) |
-@deftypefunx int mpz_congruent_ui_p (mpz_t @var{n}, unsigned long int @var{c}, unsigned long int @var{d}) |
-@deftypefunx int mpz_congruent_2exp_p (mpz_t @var{n}, mpz_t @var{c}, unsigned long int @var{b}) |
-@cindex Divisibility functions |
-@cindex Congruence functions |
-Return non-zero if @var{n} is congruent to @var{c} modulo @var{d}, or in the |
-case of @code{mpz_congruent_2exp_p} modulo @m{2^b,2^@var{b}}. |
- |
-@var{n} is congruent to @var{c} mod @var{d} if there exists an integer @var{q} |
-satisfying @math{@var{n} = @var{c} + @var{q}@GMPmultiply{}@var{d}}. Unlike |
-the other division functions, @math{@var{d}=0} is accepted and following the |
-rule it can be seen that @var{n} and @var{c} are considered congruent mod 0 |
-only when exactly equal. |
-@end deftypefun |
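- |
-The treatment of @math{@var{d}=0} in the divisibility and congruence tests |
-can be sketched with machine integers. The functions below are hypothetical |
-illustrations of the stated rules, not GMP code, and they assume that |
-@math{@var{n}-@var{c}} does not overflow. |
- |
```c
/* Non-zero if some integer q satisfies n = q*d.  With d = 0 that
   requires n = 0. */
int divisible_long_p(long n, long d)
{
    if (d == 0)
        return n == 0;
    return n % d == 0;
}

/* Non-zero if some integer q satisfies n = c + q*d.  With d = 0 that
   requires n = c exactly. */
int congruent_long_p(long n, long c, long d)
{
    if (d == 0)
        return n == c;
    return (n - c) % d == 0;
}
```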
- |
- |
-@need 2000 |
-@node Integer Exponentiation, Integer Roots, Integer Division, Integer Functions |
-@section Exponentiation Functions |
-@cindex Integer exponentiation functions |
-@cindex Exponentiation functions |
-@cindex Powering functions |
- |
-@deftypefun void mpz_powm (mpz_t @var{rop}, mpz_t @var{base}, mpz_t @var{exp}, mpz_t @var{mod}) |
-@deftypefunx void mpz_powm_ui (mpz_t @var{rop}, mpz_t @var{base}, unsigned long int @var{exp}, mpz_t @var{mod}) |
-Set @var{rop} to @m{base^{exp} \bmod mod, (@var{base} raised to @var{exp}) |
-modulo @var{mod}}. |
- |
-Negative @var{exp} is supported if an inverse @math{@var{base}^@W{-1} @bmod |
-@var{mod}} exists (see @code{mpz_invert} in @ref{Number Theoretic Functions}). |
-If an inverse doesn't exist then a divide by zero is raised. |
-@end deftypefun |
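- |
-For readers unfamiliar with modular powering, the following sketch shows the |
-classic square-and-multiply method on 64-bit integers. It is an illustration |
-only and is not claimed to be the algorithm GMP uses. The name |
-@code{powm_u64} is hypothetical, @var{mod} is assumed to be non-zero and to |
-fit in 32 bits so that the products cannot overflow, and negative exponents |
-are not handled. |
- |
```c
#include <stdint.h>

/* Hypothetical helper: (base raised to exp) modulo mod.
   Assumes 0 < mod <= 2^32 so every product fits in 64 bits. */
uint64_t powm_u64(uint64_t base, uint64_t exp, uint64_t mod)
{
    uint64_t result = 1 % mod;      /* gives 0 when mod is 1 */

    base %= mod;
    while (exp > 0) {
        if (exp & 1)                /* this exponent bit contributes */
            result = (result * base) % mod;
        base = (base * base) % mod; /* square for the next bit */
        exp >>= 1;
    }
    return result;
}
```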
- |
-@deftypefun void mpz_pow_ui (mpz_t @var{rop}, mpz_t @var{base}, unsigned long int @var{exp}) |
-@deftypefunx void mpz_ui_pow_ui (mpz_t @var{rop}, unsigned long int @var{base}, unsigned long int @var{exp}) |
-Set @var{rop} to @m{base^{exp}, @var{base} raised to @var{exp}}. The case |
-@math{0^0} yields 1. |
-@end deftypefun |
- |
- |
-@need 2000 |
-@node Integer Roots, Number Theoretic Functions, Integer Exponentiation, Integer Functions |
-@section Root Extraction Functions |
-@cindex Integer root functions |
-@cindex Root extraction functions |
- |
-@deftypefun int mpz_root (mpz_t @var{rop}, mpz_t @var{op}, unsigned long int @var{n}) |
-Set @var{rop} to @m{\lfloor\root n \of {op}\rfloor@C{},} the truncated integer |
-part of the @var{n}th root of @var{op}. Return non-zero if the computation |
-was exact, i.e., if @var{op} is @var{rop} to the @var{n}th power. |
-@end deftypefun |
- |
-@deftypefun void mpz_rootrem (mpz_t @var{root}, mpz_t @var{rem}, mpz_t @var{u}, unsigned long int @var{n}) |
-Set @var{root} to @m{\lfloor\root n \of {u}\rfloor@C{},} the truncated |
-integer part of the @var{n}th root of @var{u}. Set @var{rem} to the |
-remainder, @m{(@var{u} - @var{root}^n), |
-@var{u}@minus{}@var{root}**@var{n}}. |
-@end deftypefun |
- |
-@deftypefun void mpz_sqrt (mpz_t @var{rop}, mpz_t @var{op}) |
-Set @var{rop} to @m{\lfloor\sqrt{@var{op}}\rfloor@C{},} the truncated |
-integer part of the square root of @var{op}. |
-@end deftypefun |
- |
-@deftypefun void mpz_sqrtrem (mpz_t @var{rop1}, mpz_t @var{rop2}, mpz_t @var{op}) |
-Set @var{rop1} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part |
-of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{rop2} to the |
-remainder @m{(@var{op} - @var{rop1}^2), |
-@var{op}@minus{}@var{rop1}*@var{rop1}}, which will be zero if @var{op} is a |
-perfect square. |
- |
-If @var{rop1} and @var{rop2} are the same variable, the results are |
-undefined. |
-@end deftypefun |
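- |
-The relation between the truncated root and the remainder can be shown on |
-64-bit integers. This is a sketch using Newton's iteration, with the |
-hypothetical name @code{sqrtrem_u64}; it is not claimed to be how GMP |
-computes square roots. |
- |
```c
#include <stdint.h>

/* Set *root to floor(sqrt(op)) and *rem to op - (*root)*(*root). */
void sqrtrem_u64(uint64_t op, uint64_t *root, uint64_t *rem)
{
    uint64_t x = op;

    if (op >= 2) {
        uint64_t y = op / 2 + (op & 1);   /* ceil(op/2), cannot overflow */
        while (y < x) {                   /* iterates decrease to the root */
            x = y;
            y = (x + op / x) / 2;
        }
    }
    *root = x;
    *rem = op - x * x;                    /* zero for a perfect square */
}
```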
- |
-@deftypefun int mpz_perfect_power_p (mpz_t @var{op}) |
-@cindex Perfect power functions |
-@cindex Root testing functions |
-Return non-zero if @var{op} is a perfect power, i.e., if there exist integers |
-@m{a,@var{a}} and @m{b,@var{b}}, with @m{b>1, @var{b}>1}, such that |
-@m{@var{op}=a^b, @var{op} equals @var{a} raised to the power @var{b}}. |
- |
-Under this definition both 0 and 1 are considered to be perfect powers. |
-Negative values of @var{op} are accepted, but of course can only be odd |
-perfect powers. |
-@end deftypefun |
- |
-@deftypefun int mpz_perfect_square_p (mpz_t @var{op}) |
-@cindex Perfect square functions |
-@cindex Root testing functions |
-Return non-zero if @var{op} is a perfect square, i.e., if the square root of |
-@var{op} is an integer. Under this definition both 0 and 1 are considered to |
-be perfect squares. |
-@end deftypefun |
- |
- |
-@need 2000 |
-@node Number Theoretic Functions, Integer Comparisons, Integer Roots, Integer Functions |
-@section Number Theoretic Functions |
-@cindex Number theoretic functions |
- |
-@deftypefun int mpz_probab_prime_p (mpz_t @var{n}, int @var{reps}) |
-@cindex Prime testing functions |
-@cindex Probable prime testing functions |
-Determine whether @var{n} is prime. Return 2 if @var{n} is definitely prime, |
-return 1 if @var{n} is probably prime (without being certain), or return 0 if |
-@var{n} is definitely composite. |
- |
-This function does some trial divisions, then some Miller-Rabin probabilistic |
-primality tests. @var{reps} controls how many such tests are done; 5 to 10 is |
-a reasonable number, and more will reduce the chances of a composite being |
-returned as ``probably prime''. |
- |
-Miller-Rabin and similar tests can be more properly called compositeness |
-tests. Numbers which fail are known to be composite but those which pass |
-might be prime or might be composite. Only a few composites pass, hence those |
-which pass are considered probably prime. |
-@end deftypefun |
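- |
-To make the ``compositeness test'' idea concrete, here is a sketch of a |
-single Miller-Rabin round on 32-bit integers. The names @code{mr_powmod} and |
-@code{miller_rabin_round} are hypothetical and the code is not taken from |
-GMP. It assumes that @var{n} is odd and greater than 2, and that the base |
-@var{a} lies strictly between 1 and @math{@var{n}-1}. |
- |
```c
#include <stdint.h>

/* (b raised to e) modulo m, for m below 2^32 so products fit in 64 bits. */
uint64_t mr_powmod(uint64_t b, uint64_t e, uint64_t m)
{
    uint64_t result = 1 % m;

    b %= m;
    while (e > 0) {
        if (e & 1)
            result = (result * b) % m;
        b = (b * b) % m;
        e >>= 1;
    }
    return result;
}

/* Return 0 if base a proves n composite, 1 if n passes this round. */
int miller_rabin_round(uint32_t n, uint32_t a)
{
    uint64_t d = (uint64_t) n - 1;
    unsigned s = 0;
    uint64_t x;

    while ((d & 1) == 0) {          /* write n-1 as d * 2^s with d odd */
        d >>= 1;
        s++;
    }
    x = mr_powmod(a, d, n);
    if (x == 1 || x == (uint64_t) n - 1)
        return 1;
    for (unsigned i = 1; i < s; i++) {
        x = (x * x) % n;
        if (x == (uint64_t) n - 1)
            return 1;
    }
    return 0;
}
```
- |
-The composite @math{2047 = 23*89} passes this round for base 2 but fails for |
-base 3, which shows why several rounds with different bases are used. |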
- |
-@deftypefun void mpz_nextprime (mpz_t @var{rop}, mpz_t @var{op}) |
-@cindex Next prime function |
-Set @var{rop} to the next prime greater than @var{op}. |
- |
-This function uses a probabilistic algorithm to identify primes. For |
-practical purposes it's adequate; the chance of a composite passing is |
-extremely small. |
-@end deftypefun |
- |
-@c mpz_prime_p not implemented as of gmp 3.0. |
- |
-@c @deftypefun int mpz_prime_p (mpz_t @var{n}) |
-@c Return non-zero if @var{n} is prime and zero if @var{n} is a non-prime. |
-@c This function is far slower than @code{mpz_probab_prime_p}, but then it |
-@c never returns non-zero for composite numbers. |
- |
-@c (For practical purposes, using @code{mpz_probab_prime_p} is adequate. |
-@c The likelihood of a programming error or hardware malfunction is orders |
-@c of magnitudes greater than the likelihood for a composite to pass as a |
-@c prime, if the @var{reps} argument is in the suggested range.) |
-@c @end deftypefun |
- |
-@deftypefun void mpz_gcd (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) |
-@cindex Greatest common divisor functions |
-@cindex GCD functions |
-Set @var{rop} to the greatest common divisor of @var{op1} and @var{op2}. |
-The result is always positive even if one or both input operands |
-are negative. |
-@end deftypefun |
- |
-@deftypefun {unsigned long int} mpz_gcd_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long int @var{op2}) |
-Compute the greatest common divisor of @var{op1} and @var{op2}. If |
-@var{rop} is not @code{NULL}, store the result there. |
- |
-If the result is small enough to fit in an @code{unsigned long int}, it is |
-returned. If the result does not fit, 0 is returned, and the result is equal |
-to the argument @var{op1}. Note that the result will always fit if @var{op2} |
-is non-zero. |
-@end deftypefun |
- |
-@deftypefun void mpz_gcdext (mpz_t @var{g}, mpz_t @var{s}, mpz_t @var{t}, mpz_t @var{a}, mpz_t @var{b}) |
-@cindex Extended GCD |
-@cindex GCD extended |
-Set @var{g} to the greatest common divisor of @var{a} and @var{b}, and in |
-addition set @var{s} and @var{t} to coefficients satisfying |
-@math{@var{a}@GMPmultiply{}@var{s} + @var{b}@GMPmultiply{}@var{t} = @var{g}}. |
-The value in @var{g} is always positive, even if one or both of @var{a} and |
-@var{b} are negative. The values in @var{s} and @var{t} are chosen such that |
-@math{@GMPabs{@var{s}} @le{} @GMPabs{@var{b}}} and @math{@GMPabs{@var{t}} |
-@le{} @GMPabs{@var{a}}}. |
- |
-If @var{t} is @code{NULL} then that value is not computed. |
-@end deftypefun |
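- |
-The identity @math{@var{a}*@var{s} + @var{b}*@var{t} = @var{g}} can be |
-checked with a small extended Euclidean sketch on @code{long} values. The |
-function @code{gcdext_long} is a hypothetical name and is not GMP's |
-implementation. It assumes no overflow, and it makes no attempt to reproduce |
-the particular choice of @var{s} and @var{t} described above. |
- |
```c
/* Return g = gcd(a, b), non-negative, and set *s and *t so that
   a * (*s) + b * (*t) == g. */
long gcdext_long(long a, long b, long *s, long *t)
{
    long r0 = a, r1 = b;    /* remainder sequence */
    long s0 = 1, s1 = 0;    /* coefficients of a */
    long t0 = 0, t1 = 1;    /* coefficients of b */

    /* Invariant: r0 == a*s0 + b*t0 and r1 == a*s1 + b*t1. */
    while (r1 != 0) {
        long q = r0 / r1;
        long tmp;

        tmp = r0 - q * r1;  r0 = r1;  r1 = tmp;
        tmp = s0 - q * s1;  s0 = s1;  s1 = tmp;
        tmp = t0 - q * t1;  t0 = t1;  t1 = tmp;
    }
    if (r0 < 0) {           /* normalise the sign of the result */
        r0 = -r0;
        s0 = -s0;
        t0 = -t0;
    }
    *s = s0;
    *t = t0;
    return r0;
}
```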
- |
-@deftypefun void mpz_lcm (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) |
-@deftypefunx void mpz_lcm_ui (mpz_t @var{rop}, mpz_t @var{op1}, unsigned long @var{op2}) |
-@cindex Least common multiple functions |
-@cindex LCM functions |
-Set @var{rop} to the least common multiple of @var{op1} and @var{op2}. |
-@var{rop} is always positive, irrespective of the signs of @var{op1} and |
-@var{op2}. @var{rop} will be zero if either @var{op1} or @var{op2} is zero. |
-@end deftypefun |
- |
-@deftypefun int mpz_invert (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) |
-@cindex Modular inverse functions |
-@cindex Inverse modulo functions |
-Compute the inverse of @var{op1} modulo @var{op2} and put the result in |
-@var{rop}. If the inverse exists, the return value is non-zero and @var{rop} |
-will satisfy @math{0 @le{} @var{rop} < @var{op2}}. If an inverse doesn't exist |
-the return value is zero and @var{rop} is undefined. |
-@end deftypefun |
- |
-@deftypefun int mpz_jacobi (mpz_t @var{a}, mpz_t @var{b}) |
-@cindex Jacobi symbol functions |
-Calculate the Jacobi symbol @m{\left(a \over b\right), |
-(@var{a}/@var{b})}. This is defined only for @var{b} odd. |
-@end deftypefun |
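- |
-For reference, the Jacobi symbol can be computed by the textbook binary |
-reciprocity method, sketched here on @code{long} values. The name |
-@code{jacobi_long} is hypothetical, the code is not GMP's implementation, and |
-it assumes that @var{b} is odd and positive. |
- |
```c
/* Jacobi symbol (a/b) for odd positive b: returns -1, 0 or 1. */
int jacobi_long(long a, long b)
{
    int result = 1;

    a %= b;
    if (a < 0)
        a += b;                     /* reduce a into 0 .. b-1 */

    while (a != 0) {
        while (a % 2 == 0) {        /* pull out factors of 2 */
            long r = b % 8;
            a /= 2;
            if (r == 3 || r == 5)   /* (2/b) is -1 for b = 3, 5 mod 8 */
                result = -result;
        }
        {                           /* quadratic reciprocity: swap a, b */
            long tmp = a;
            a = b;
            b = tmp;
        }
        if (a % 4 == 3 && b % 4 == 3)
            result = -result;
        a %= b;
    }
    return b == 1 ? result : 0;     /* 0 when a and b share a factor */
}
```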
- |
-@deftypefun int mpz_legendre (mpz_t @var{a}, mpz_t @var{p}) |
-@cindex Legendre symbol functions |
-Calculate the Legendre symbol @m{\left(a \over p\right), |
-(@var{a}/@var{p})}. This is defined only for @var{p} an odd positive |
-prime, and for such @var{p} it's identical to the Jacobi symbol. |
-@end deftypefun |
- |
-@deftypefun int mpz_kronecker (mpz_t @var{a}, mpz_t @var{b}) |
-@deftypefunx int mpz_kronecker_si (mpz_t @var{a}, long @var{b}) |
-@deftypefunx int mpz_kronecker_ui (mpz_t @var{a}, unsigned long @var{b}) |
-@deftypefunx int mpz_si_kronecker (long @var{a}, mpz_t @var{b}) |
-@deftypefunx int mpz_ui_kronecker (unsigned long @var{a}, mpz_t @var{b}) |
-@cindex Kronecker symbol functions |
-Calculate the Jacobi symbol @m{\left(a \over b\right), |
-(@var{a}/@var{b})} with the Kronecker extension @m{\left(a \over |
-2\right) = \left(2 \over a\right), (a/2)=(2/a)} when @math{a} odd, or |
-@m{\left(a \over 2\right) = 0, (a/2)=0} when @math{a} even. |
- |
-When @var{b} is odd the Jacobi symbol and Kronecker symbol are |
-identical, so @code{mpz_kronecker_ui} etc can be used for mixed |
-precision Jacobi symbols too. |
- |
-For more information see Henri Cohen section 1.4.2 (@pxref{References}), |
-or any number theory textbook. See also the example program |
-@file{demos/qcn.c} which uses @code{mpz_kronecker_ui}. |
-@end deftypefun |
- |
-@deftypefun {unsigned long int} mpz_remove (mpz_t @var{rop}, mpz_t @var{op}, mpz_t @var{f}) |
-@cindex Remove factor functions |
-@cindex Factor removal functions |
-Remove all occurrences of the factor @var{f} from @var{op} and store the |
-result in @var{rop}. The return value is how many such occurrences were |
-removed. |
-@end deftypefun |
- |
-@deftypefun void mpz_fac_ui (mpz_t @var{rop}, unsigned long int @var{op}) |
-@cindex Factorial functions |
-Set @var{rop} to @var{op}!, the factorial of @var{op}. |
-@end deftypefun |
- |
-@deftypefun void mpz_bin_ui (mpz_t @var{rop}, mpz_t @var{n}, unsigned long int @var{k}) |
-@deftypefunx void mpz_bin_uiui (mpz_t @var{rop}, unsigned long int @var{n}, @w{unsigned long int @var{k}}) |
-@cindex Binomial coefficient functions |
-Compute the binomial coefficient @m{\left({n}\atop{k}\right), @var{n} over |
-@var{k}} and store the result in @var{rop}. Negative values of @var{n} are |
-supported by @code{mpz_bin_ui}, using the identity |
-@m{\left({-n}\atop{k}\right) = (-1)^k \left({n+k-1}\atop{k}\right), |
-bin(-n@C{}k) = (-1)^k * bin(n+k-1@C{}k)}, see Knuth volume 1 section 1.2.6 |
-part G. |
-@end deftypefun |
- |
-@deftypefun void mpz_fib_ui (mpz_t @var{fn}, unsigned long int @var{n}) |
-@deftypefunx void mpz_fib2_ui (mpz_t @var{fn}, mpz_t @var{fnsub1}, unsigned long int @var{n}) |
-@cindex Fibonacci sequence functions |
-@code{mpz_fib_ui} sets @var{fn} to @m{F_n,F[n]}, the @var{n}'th Fibonacci |
-number. @code{mpz_fib2_ui} sets @var{fn} to @m{F_n,F[n]}, and @var{fnsub1} to |
-@m{F_{n-1},F[n-1]}. |
- |
-These functions are designed for calculating isolated Fibonacci numbers. When |
-a sequence of values is wanted it's best to start with @code{mpz_fib2_ui} and |
-iterate the defining @m{F_{n+1} = F_n + F_{n-1}, F[n+1]=F[n]+F[n-1]} or |
-similar. |
-@end deftypefun |
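- |
-The iteration suggested above can be sketched with 64-bit integers standing |
-in for @code{mpz_t} values. The function @code{fib_fill} is a hypothetical |
-helper: it takes a starting pair such as @code{mpz_fib2_ui} would provide and |
-applies the defining recurrence. It assumes the values stay within 64 bits. |
- |
```c
#include <stddef.h>
#include <stdint.h>

/* Fill out[0..count-1] with F[n], F[n+1], ... given fn = F[n] and
   fnsub1 = F[n-1]. */
void fib_fill(uint64_t fn, uint64_t fnsub1, uint64_t *out, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint64_t next = fn + fnsub1;    /* F[n+1] = F[n] + F[n-1] */

        out[i] = fn;
        fnsub1 = fn;
        fn = next;
    }
}
```
- |
-Starting from @math{F[10]=55} and @math{F[9]=34}, five steps give 55, 89, |
-144, 233 and 377. |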
- |
-@deftypefun void mpz_lucnum_ui (mpz_t @var{ln}, unsigned long int @var{n}) |
-@deftypefunx void mpz_lucnum2_ui (mpz_t @var{ln}, mpz_t @var{lnsub1}, unsigned long int @var{n}) |
-@cindex Lucas number functions |
-@code{mpz_lucnum_ui} sets @var{ln} to @m{L_n,L[n]}, the @var{n}'th Lucas |
-number. @code{mpz_lucnum2_ui} sets @var{ln} to @m{L_n,L[n]}, and @var{lnsub1} |
-to @m{L_{n-1},L[n-1]}. |
- |
-These functions are designed for calculating isolated Lucas numbers. When a |
-sequence of values is wanted it's best to start with @code{mpz_lucnum2_ui} and |
-iterate the defining @m{L_{n+1} = L_n + L_{n-1}, L[n+1]=L[n]+L[n-1]} or |
-similar. |
- |
-The Fibonacci numbers and Lucas numbers are related sequences, so it's never |
-necessary to call both @code{mpz_fib2_ui} and @code{mpz_lucnum2_ui}. The |
-formulas for going from Fibonacci to Lucas can be found in @ref{Lucas Numbers |
-Algorithm}, and the reverse is straightforward too. |
-@end deftypefun |
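- |
-As a small illustration of going from a Fibonacci pair to a Lucas pair, the |
-sketch below uses the standard identities @math{L[n] = F[n] + 2*F[n-1]} and |
-@math{L[n-1] = 2*F[n] - F[n-1]} on 64-bit integers. The function |
-@code{lucas_from_fib} is a hypothetical name, and the code assumes |
-@math{@var{n} @ge{} 1} and no overflow. |
- |
```c
#include <stdint.h>

/* Given fn = F[n] and fnsub1 = F[n-1], set *ln to L[n] and *lnsub1 to
   L[n-1]. */
void lucas_from_fib(uint64_t fn, uint64_t fnsub1,
                    uint64_t *ln, uint64_t *lnsub1)
{
    *ln = fn + 2 * fnsub1;      /* L[n]   = F[n] + 2*F[n-1] */
    *lnsub1 = 2 * fn - fnsub1;  /* L[n-1] = 2*F[n] - F[n-1] */
}
```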
- |
- |
-@node Integer Comparisons, Integer Logic and Bit Fiddling, Number Theoretic Functions, Integer Functions |
-@comment node-name, next, previous, up |
-@section Comparison Functions |
-@cindex Integer comparison functions |
-@cindex Comparison functions |
- |
-@deftypefn Function int mpz_cmp (mpz_t @var{op1}, mpz_t @var{op2}) |
-@deftypefnx Function int mpz_cmp_d (mpz_t @var{op1}, double @var{op2}) |
-@deftypefnx Macro int mpz_cmp_si (mpz_t @var{op1}, signed long int @var{op2}) |
-@deftypefnx Macro int mpz_cmp_ui (mpz_t @var{op1}, unsigned long int @var{op2}) |
-Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} > |
-@var{op2}}, zero if @math{@var{op1} = @var{op2}}, or a negative value if |
-@math{@var{op1} < @var{op2}}. |
- |
-@code{mpz_cmp_ui} and @code{mpz_cmp_si} are macros and will evaluate their |
-arguments more than once. @code{mpz_cmp_d} can be called with an infinity, |
-but results are undefined for a NaN. |
-@end deftypefn |
- |
-@deftypefn Function int mpz_cmpabs (mpz_t @var{op1}, mpz_t @var{op2}) |
-@deftypefnx Function int mpz_cmpabs_d (mpz_t @var{op1}, double @var{op2}) |
-@deftypefnx Function int mpz_cmpabs_ui (mpz_t @var{op1}, unsigned long int @var{op2}) |
-Compare the absolute values of @var{op1} and @var{op2}. Return a positive |
-value if @math{@GMPabs{@var{op1}} > @GMPabs{@var{op2}}}, zero if |
-@math{@GMPabs{@var{op1}} = @GMPabs{@var{op2}}}, or a negative value if |
-@math{@GMPabs{@var{op1}} < @GMPabs{@var{op2}}}. |
- |
-@code{mpz_cmpabs_d} can be called with an infinity, but results are undefined |
-for a NaN. |
-@end deftypefn |
- |
-@deftypefn Macro int mpz_sgn (mpz_t @var{op}) |
-@cindex Sign tests |
-@cindex Integer sign tests |
-Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and |
-@math{-1} if @math{@var{op} < 0}. |
- |
-This function is actually implemented as a macro. It evaluates its argument |
-multiple times. |
-@end deftypefn |
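- |
-The practical consequence of a macro evaluating its argument more than once |
-can be shown with a stand-in. @code{SIGN_OF}, @code{next_value} and |
-@code{eval_count} below are hypothetical names, not GMP definitions; the |
-point is only that an argument with side effects should be stored in a |
-variable first. |
- |
```c
/* Hypothetical macro that, like the macros described above, mentions its
   argument more than once. */
#define SIGN_OF(x) ((x) < 0 ? -1 : (x) > 0)

int eval_count = 0;             /* counts calls to next_value */

int next_value(void)
{
    eval_count++;
    return 5;
}

/* The call is expanded into both comparisons, so it runs twice here. */
int sign_unsafe(void)
{
    return SIGN_OF(next_value());
}

/* Evaluating once into a variable avoids the repeated side effect. */
int sign_safe(void)
{
    int v = next_value();
    return SIGN_OF(v);
}
```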
- |
- |
-@node Integer Logic and Bit Fiddling, I/O of Integers, Integer Comparisons, Integer Functions |
-@comment node-name, next, previous, up |
-@section Logical and Bit Manipulation Functions |
-@cindex Logical functions |
-@cindex Bit manipulation functions |
-@cindex Integer logical functions |
-@cindex Integer bit manipulation functions |
- |
-These functions behave as if twos complement arithmetic were used (although |
-sign-magnitude is the actual implementation). The least significant bit is |
-number 0. |
- |
-@deftypefun void mpz_and (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) |
-Set @var{rop} to @var{op1} bitwise-and @var{op2}. |
-@end deftypefun |
- |
-@deftypefun void mpz_ior (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) |
-Set @var{rop} to @var{op1} bitwise inclusive-or @var{op2}. |
-@end deftypefun |
- |
-@deftypefun void mpz_xor (mpz_t @var{rop}, mpz_t @var{op1}, mpz_t @var{op2}) |
-Set @var{rop} to @var{op1} bitwise exclusive-or @var{op2}. |
-@end deftypefun |
- |
-@deftypefun void mpz_com (mpz_t @var{rop}, mpz_t @var{op}) |
-Set @var{rop} to the one's complement of @var{op}. |
-@end deftypefun |
- |
-@deftypefun {unsigned long int} mpz_popcount (mpz_t @var{op}) |
-If @math{@var{op}@ge{}0}, return the population count of @var{op}, which is |
-the number of 1 bits in the binary representation. If @math{@var{op}<0}, the |
-number of 1s is infinite, and the return value is @var{ULONG_MAX}, the largest |
-possible @code{unsigned long}. |
-@end deftypefun |
- |
-@deftypefun {unsigned long int} mpz_hamdist (mpz_t @var{op1}, mpz_t @var{op2}) |
-If @var{op1} and @var{op2} are both @math{@ge{}0} or both @math{<0}, return |
-the hamming distance between the two operands, which is the number of bit |
-positions where @var{op1} and @var{op2} have different bit values. If one |
-operand is @math{@ge{}0} and the other @math{<0} then the number of bits |
-different is infinite, and the return value is @var{ULONG_MAX}, the largest |
-possible @code{unsigned long}. |
-@end deftypefun |
- |
-@deftypefun {unsigned long int} mpz_scan0 (mpz_t @var{op}, unsigned long int @var{starting_bit}) |
-@deftypefunx {unsigned long int} mpz_scan1 (mpz_t @var{op}, unsigned long int @var{starting_bit}) |
-@cindex Bit scanning functions |
-@cindex Scan bit functions |
-Scan @var{op}, starting from bit @var{starting_bit}, towards more significant |
-bits, until the first 0 or 1 bit (respectively) is found. Return the index of |
-the found bit. |
- |
-If the bit at @var{starting_bit} is already what's sought, then |
-@var{starting_bit} is returned. |
- |
-If there's no bit found, then @var{ULONG_MAX} is returned. This will happen |
-in @code{mpz_scan0} past the end of a negative number, or @code{mpz_scan1} |
-past the end of a nonnegative number. |
-@end deftypefun |
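- |
-The ``past the end'' cases follow from viewing the number as an infinitely |
-sign-extended twos complement value. The sketch below models this on a |
-@code{long}; @code{bit_at} and @code{scan_bit} are hypothetical names and the |
-code is not GMP's implementation. |
- |
```c
#include <limits.h>

#define LONG_BITS (sizeof(long) * CHAR_BIT)

/* Bit i of op, viewing op as infinitely sign-extended twos complement. */
int bit_at(long op, unsigned long i)
{
    if (i >= LONG_BITS)
        return op < 0;          /* every higher bit copies the sign */
    return (int) (((unsigned long) op >> i) & 1UL);
}

/* Scan upwards from start for the first bit equal to want (0 or 1).
   Return its index, or ULONG_MAX if there is no such bit. */
unsigned long scan_bit(long op, unsigned long start, int want)
{
    for (unsigned long i = start; i < LONG_BITS; i++)
        if (bit_at(op, i) == want)
            return i;
    /* Beyond the stored bits, every bit equals the sign bit. */
    if ((op < 0) == want)
        return start > LONG_BITS ? start : LONG_BITS;
    return ULONG_MAX;
}
```
- |
-Here @code{scan_bit (12, 4, 1)} and @code{scan_bit (-1, 0, 0)} both return |
-@code{ULONG_MAX}, matching the two cases described above. |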
- |
-@deftypefun void mpz_setbit (mpz_t @var{rop}, unsigned long int @var{bit_index}) |
-Set bit @var{bit_index} in @var{rop}. |
-@end deftypefun |
- |
-@deftypefun void mpz_clrbit (mpz_t @var{rop}, unsigned long int @var{bit_index}) |
-Clear bit @var{bit_index} in @var{rop}. |
-@end deftypefun |
- |
-@deftypefun void mpz_combit (mpz_t @var{rop}, unsigned long int @var{bit_index}) |
-Complement bit @var{bit_index} in @var{rop}. |
-@end deftypefun |
- |
-@deftypefun int mpz_tstbit (mpz_t @var{op}, unsigned long int @var{bit_index}) |
-Test bit @var{bit_index} in @var{op} and return 0 or 1 accordingly. |
-@end deftypefun |
- |
-@node I/O of Integers, Integer Random Numbers, Integer Logic and Bit Fiddling, Integer Functions |
-@comment node-name, next, previous, up |
-@section Input and Output Functions |
-@cindex Integer input and output functions |
-@cindex Input functions |
-@cindex Output functions |
-@cindex I/O functions |
- |
-These functions perform input from, or output to, a stdio stream. Passing a |
-@code{NULL} pointer for a @var{stream} argument to any of these functions |
-makes the input functions read from @code{stdin} and the output functions |
-write to @code{stdout}. |
- |
-When using any of these functions, it is a good idea to include @file{stdio.h} |
-before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes |
-for these functions. |
- |
-@deftypefun size_t mpz_out_str (FILE *@var{stream}, int @var{base}, mpz_t @var{op}) |
-Output @var{op} on stdio stream @var{stream}, as a string of digits in base |
-@var{base}. The base argument may vary from 2 to 62 or from @minus{}2 to |
-@minus{}36. |
- |
-For @var{base} in the range 2..36, digits and lower-case letters are used; for |
-@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62, |
-digits, upper-case letters, and lower-case letters (in that significance order) |
-are used. |
- |
-Return the number of bytes written, or if an error occurred, return 0. |
-@end deftypefun |
- |
-@deftypefun size_t mpz_inp_str (mpz_t @var{rop}, FILE *@var{stream}, int @var{base}) |
-Input a possibly white-space preceded string in base @var{base} from stdio |
-stream @var{stream}, and put the read integer in @var{rop}. |
- |
-The @var{base} may vary from 2 to 62, or if @var{base} is 0, then the leading |
-characters are used: @code{0x} and @code{0X} for hexadecimal, @code{0b} and |
-@code{0B} for binary, @code{0} for octal, or decimal otherwise. |
- |
-For bases up to 36, case is ignored; upper-case and lower-case letters have |
-the same value. For bases 37 to 62, upper-case letters represent the usual |
-10..35 while lower-case letters represent 36..61. |
- |
-Return the number of bytes read, or if an error occurred, return 0. |
-@end deftypefun |
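- |
-The prefix and digit rules above can be summarised in a short sketch. The |
-functions @code{base_from_prefix} and @code{digit_value} are hypothetical |
-helpers, not GMP functions. They assume an ASCII-compatible character set |
-and that @code{s} points just past any leading white space. |
- |
```c
/* Base implied by the leading characters when base 0 is requested. */
int base_from_prefix(const char *s)
{
    if (s[0] != '0')
        return 10;
    if (s[1] == 'x' || s[1] == 'X')
        return 16;
    if (s[1] == 'b' || s[1] == 'B')
        return 2;
    return 8;
}

/* Value of digit c in the given base (2 to 62), or -1 if c is not a
   valid digit in that base. */
int digit_value(char c, int base)
{
    int v;

    if (c >= '0' && c <= '9')
        v = c - '0';
    else if (c >= 'A' && c <= 'Z')
        v = c - 'A' + 10;
    else if (c >= 'a' && c <= 'z')
        /* case is ignored up to base 36, then lower case means 36..61 */
        v = c - 'a' + (base <= 36 ? 10 : 36);
    else
        return -1;
    return v < base ? v : -1;
}
```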
- |
-@deftypefun size_t mpz_out_raw (FILE *@var{stream}, mpz_t @var{op}) |
-Output @var{op} on stdio stream @var{stream}, in raw binary format. The |
-integer is written in a portable format, with 4 bytes of size information, and |
-that many bytes of limbs. Both the size and the limbs are written in |
-decreasing significance order (i.e., in big-endian). |
- |
-The output can be read with @code{mpz_inp_raw}. |
- |
-Return the number of bytes written, or if an error occurred, return 0. |
- |
-The output of this cannot be read by @code{mpz_inp_raw} from GMP 1, because |
-of changes necessary for compatibility between 32-bit and 64-bit machines. |
-@end deftypefun |
- |
-@deftypefun size_t mpz_inp_raw (mpz_t @var{rop}, FILE *@var{stream}) |
-Input from stdio stream @var{stream} in the format written by |
-@code{mpz_out_raw}, and put the result in @var{rop}. Return the number of |
-bytes read, or if an error occurred, return 0. |
- |
-This routine can read the output from @code{mpz_out_raw} also from GMP 1, in |
-spite of changes necessary for compatibility between 32-bit and 64-bit |
-machines. |
-@end deftypefun |
- |
- |
-@need 2000 |
-@node Integer Random Numbers, Integer Import and Export, I/O of Integers, Integer Functions |
-@comment node-name, next, previous, up |
-@section Random Number Functions |
-@cindex Integer random number functions |
-@cindex Random number functions |
- |
-The random number functions of GMP come in two groups: older functions |
-that rely on a global state, and newer functions that accept a state |
-parameter that is read and modified. Please see @ref{Random Number |
-Functions}, for more information on how to use and not to use random |
-number functions. |
- |
-@deftypefun void mpz_urandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, unsigned long int @var{n}) |
-Generate a uniformly distributed random integer in the range 0 to @m{2^n-1, |
-2^@var{n}@minus{}1}, inclusive. |
- |
-The variable @var{state} must be initialized by calling one of the |
-@code{gmp_randinit} functions (@ref{Random State Initialization}) before |
-invoking this function. |
-@end deftypefun |
- |
-@deftypefun void mpz_urandomm (mpz_t @var{rop}, gmp_randstate_t @var{state}, mpz_t @var{n}) |
-Generate a uniform random integer in the range 0 to @math{@var{n}-1}, |
-inclusive. |
- |
-The variable @var{state} must be initialized by calling one of the |
-@code{gmp_randinit} functions (@ref{Random State Initialization}) |
-before invoking this function. |
-@end deftypefun |
- |
-@deftypefun void mpz_rrandomb (mpz_t @var{rop}, gmp_randstate_t @var{state}, unsigned long int @var{n}) |
-Generate a random integer with long strings of zeros and ones in the |
-binary representation. Useful for testing functions and algorithms, |
-since random numbers of this kind have proven to be more likely to |
-trigger corner-case bugs. The random number will be in the range |
-0 to @m{2^n-1, 2^@var{n}@minus{}1}, inclusive. |
- |
-The variable @var{state} must be initialized by calling one of the |
-@code{gmp_randinit} functions (@ref{Random State Initialization}) |
-before invoking this function. |
-@end deftypefun |
- |
-@deftypefun void mpz_random (mpz_t @var{rop}, mp_size_t @var{max_size}) |
-Generate a random integer of at most @var{max_size} limbs. The generated |
-random number doesn't satisfy any particular requirements of randomness. |
-Negative random numbers are generated when @var{max_size} is negative. |
- |
-This function is obsolete. Use @code{mpz_urandomb} or |
-@code{mpz_urandomm} instead. |
-@end deftypefun |
- |
-@deftypefun void mpz_random2 (mpz_t @var{rop}, mp_size_t @var{max_size}) |
-Generate a random integer of at most @var{max_size} limbs, with long strings |
-of zeros and ones in the binary representation. Useful for testing functions |
-and algorithms, since random numbers of this kind have proven to be more |
-likely to trigger corner-case bugs. Negative random numbers are generated |
-when @var{max_size} is negative. |
- |
-This function is obsolete. Use @code{mpz_rrandomb} instead. |
-@end deftypefun |
- |
- |
-@node Integer Import and Export, Miscellaneous Integer Functions, Integer Random Numbers, Integer Functions |
-@section Integer Import and Export |
- |
-@code{mpz_t} variables can be converted to and from arbitrary words of binary |
-data with the following functions. |
- |
-@deftypefun void mpz_import (mpz_t @var{rop}, size_t @var{count}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, const void *@var{op}) |
-@cindex Integer import |
-@cindex Import |
-Set @var{rop} from an array of word data at @var{op}. |
- |
-The parameters specify the format of the data. @var{count} many words are |
-read, each @var{size} bytes. @var{order} can be 1 for most significant word |
-first or -1 for least significant first. Within each word @var{endian} can be |
-1 for most significant byte first, -1 for least significant first, or 0 for |
-the native endianness of the host CPU@. The most significant @var{nails} bits |
-of each word are skipped; this can be 0 to use the full words. |
- |
-No sign is taken from the data; @var{rop} will simply be a non-negative |
-integer. An application can handle any sign itself, and apply it for instance |
-with @code{mpz_neg}. |
- |
-There are no data alignment restrictions on @var{op}; any address is allowed. |
- |
-Here's an example converting an array of @code{unsigned long} data, most |
-significant element first, and host byte order within each value. |
- |
-@example |
-unsigned long a[20]; |
-mpz_t z; |
-mpz_init (z); |
-/* ... fill a[] with the data to convert ... */ |
-mpz_import (z, 20, 1, sizeof(a[0]), 0, 0, a); |
-@end example |
- |
-This example assumes the full @code{sizeof} bytes are used for data in the |
-given type, which is usually true, and certainly true for @code{unsigned long} |
-everywhere we know of. However on Cray vector systems it may be noted that |
-@code{short} and @code{int} are always stored in 8 bytes (and with |
-@code{sizeof} indicating that) but use only 32 or 46 bits. The @var{nails} |
-feature can account for this, by passing for instance |
-@code{8*sizeof(int)-INT_BIT}. |
-@end deftypefun |
- |
-@deftypefun {void *} mpz_export (void *@var{rop}, size_t *@var{countp}, int @var{order}, size_t @var{size}, int @var{endian}, size_t @var{nails}, mpz_t @var{op}) |
-@cindex Integer export |
-@cindex Export |
-Fill @var{rop} with word data from @var{op}. |
- |
-The parameters specify the format of the data produced. Each word will be |
-@var{size} bytes and @var{order} can be 1 for most significant word first or |
--1 for least significant first. Within each word @var{endian} can be 1 for |
-most significant byte first, -1 for least significant first, or 0 for the |
-native endianness of the host CPU@. The most significant @var{nails} bits of |
-each word are unused and set to zero, this can be 0 to produce full words. |
- |
-The number of words produced is written to @code{*@var{countp}}, or |
-@var{countp} can be @code{NULL} to discard the count. @var{rop} must have |
-enough space for the data, or if @var{rop} is @code{NULL} then a result array |
-of the necessary size is allocated using the current GMP allocation function |
-(@pxref{Custom Allocation}). In either case the return value is the |
-destination used, either @var{rop} or the allocated block. |
- |
-If @var{op} is non-zero then the most significant word produced will be |
-non-zero. If @var{op} is zero then the count returned will be zero and |
-nothing written to @var{rop}. If @var{rop} is @code{NULL} in this case, no |
-block is allocated, just @code{NULL} is returned. |
- |
-The sign of @var{op} is ignored, just the absolute value is exported. An |
-application can use @code{mpz_sgn} to get the sign and handle it as desired. |
-(@pxref{Integer Comparisons}) |
- |
-There are no data alignment restrictions on @var{rop}, any address is allowed. |
- |
-When an application is allocating space itself the required size can be |
-determined with a calculation like the following. Since @code{mpz_sizeinbase} |
-always returns at least 1, @code{count} here will be at least one, which |
-avoids any portability problems with @code{malloc(0)}, though if @code{z} is |
-zero no space at all is actually needed (or written). |
- |
-@example |
-numb = 8*size - nails; |
-count = (mpz_sizeinbase (z, 2) + numb-1) / numb; |
-p = malloc (count * size); |
-@end example |
-@end deftypefun |
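The buffer-size calculation given for mpz_export can be wrapped in a small function. This is a sketch in plain C that does not call GMP: the name export_word_count is hypothetical, and the bits argument stands in for the value that mpz_sizeinbase (z, 2) would return.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of words needed to hold `bits` significant
   bits when each word is `size` bytes and the top `nails` bits of every
   word are left unused. */
static size_t
export_word_count (size_t bits, size_t size, size_t nails)
{
  size_t numb;

  assert (size > 0 && nails < 8 * size);

  numb = 8 * size - nails;          /* usable bits per word */
  return (bits + numb - 1) / numb;  /* ceiling division */
}
```

Because the bit count is at least 1 even for a zero operand, the result is at least 1, matching the remark about avoiding malloc(0). The caller would then allocate export_word_count (...) * size bytes.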
- |
- |
-@need 2000 |
-@node Miscellaneous Integer Functions, Integer Special Functions, Integer Import and Export, Integer Functions |
-@comment node-name, next, previous, up |
-@section Miscellaneous Functions |
-@cindex Miscellaneous integer functions |
-@cindex Integer miscellaneous functions |
- |
-@deftypefun int mpz_fits_ulong_p (mpz_t @var{op}) |
-@deftypefunx int mpz_fits_slong_p (mpz_t @var{op}) |
-@deftypefunx int mpz_fits_uint_p (mpz_t @var{op}) |
-@deftypefunx int mpz_fits_sint_p (mpz_t @var{op}) |
-@deftypefunx int mpz_fits_ushort_p (mpz_t @var{op}) |
-@deftypefunx int mpz_fits_sshort_p (mpz_t @var{op}) |
-Return non-zero iff the value of @var{op} fits in an @code{unsigned long int}, |
-@code{signed long int}, @code{unsigned int}, @code{signed int}, @code{unsigned |
-short int}, or @code{signed short int}, respectively. Otherwise, return zero. |
-@end deftypefun |
- |
-@deftypefn Macro int mpz_odd_p (mpz_t @var{op}) |
-@deftypefnx Macro int mpz_even_p (mpz_t @var{op}) |
-Determine whether @var{op} is odd or even, respectively. Return non-zero if |
-yes, zero if no. These macros evaluate their argument more than once. |
-@end deftypefn |
- |
-@deftypefun size_t mpz_sizeinbase (mpz_t @var{op}, int @var{base}) |
-@cindex Size in digits |
-@cindex Digits in an integer |
-Return the size of @var{op} measured in number of digits in the given |
-@var{base}. @var{base} can vary from 2 to 62. The sign of @var{op} is |
-ignored, just the absolute value is used. The result will be either exact or |
-1 too big. If @var{base} is a power of 2, the result is always exact. If |
-@var{op} is zero the return value is always 1. |
- |
-This function can be used to determine the space required when converting |
-@var{op} to a string. The right amount of allocation is normally two more |
-than the value returned by @code{mpz_sizeinbase}, one extra for a minus sign |
-and one for the null-terminator. |
- |
-@cindex Most significant bit |
-It will be noted that @code{mpz_sizeinbase(@var{op},2)} can be used to locate |
-the most significant 1 bit in @var{op}, counting from 1. (Unlike the bitwise |
-functions which start from 0, @xref{Integer Logic and Bit Fiddling,, Logical |
-and Bit Manipulation Functions}.) |
-@end deftypefun |
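The sizing rule for string conversion (digit count, plus one byte for a minus sign, plus one for the null-terminator) can be illustrated with a machine-word model. The sketch below is plain C, not GMP: digits_in_base and string_buffer_size are hypothetical names, and unlike mpz_sizeinbase the model always returns the exact digit count, never one too many.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a digit count: the number of digits of `value`
   in `base`, with zero counted as one digit. */
static size_t
digits_in_base (unsigned long long value, int base)
{
  size_t digits = 1;

  assert (base >= 2 && base <= 62);

  while (value >= (unsigned long long) base)
    {
      value /= (unsigned long long) base;
      digits++;
    }
  return digits;
}

/* Bytes to allocate for a string: digits, a possible minus sign, and the
   null-terminator. */
static size_t
string_buffer_size (unsigned long long value, int base)
{
  return digits_in_base (value, base) + 2;
}
```

For a power-of-2 base the digit count in base 2 is also the 1-based position of the most significant 1 bit, for example 8 for the value 255.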
- |
- |
-@node Integer Special Functions, , Miscellaneous Integer Functions, Integer Functions |
-@section Special Functions |
-@cindex Special integer functions |
-@cindex Integer special functions |
- |
-The functions in this section are for various special purposes. Most |
-applications will not need them. |
- |
-@deftypefun void mpz_array_init (mpz_t @var{integer_array}, mp_size_t @var{array_size}, @w{mp_size_t @var{fixed_num_bits}}) |
-This is a special type of initialization. @strong{Fixed} space of |
-@var{fixed_num_bits} is allocated to each of the @var{array_size} integers in |
-@var{integer_array}. There is no way to free the storage allocated by this |
-function. Don't call @code{mpz_clear}! |
- |
-The @var{integer_array} parameter is the first @code{mpz_t} in the array. For |
-example, |
- |
-@example |
-mpz_t arr[20000]; |
-mpz_array_init (arr[0], 20000, 512); |
-@end example |
- |
-@c In case anyone's wondering, yes this parameter style is a bit anomalous, |
-@c it'd probably be nicer if it was "arr" instead of "arr[0]". Obviously the |
-@c two differ only in the declaration, not the pointer value, but changing is |
-@c not possible since it'd provoke warnings or errors in existing sources. |
- |
-This function is only intended for programs that create a large number |
-of integers and need to reduce memory usage by avoiding the overheads of |
-allocating and reallocating lots of small blocks. In normal programs this |
-function is not recommended. |
- |
-The space allocated to each integer by this function will not be automatically |
-increased, unlike the normal @code{mpz_init}, so an application must ensure it |
-is sufficient for any value stored. The following space requirements apply to |
-various routines, |
- |
-@itemize @bullet |
-@item |
-@code{mpz_abs}, @code{mpz_neg}, @code{mpz_set}, @code{mpz_set_si} and |
-@code{mpz_set_ui} need room for the value they store. |
- |
-@item |
-@code{mpz_add}, @code{mpz_add_ui}, @code{mpz_sub} and @code{mpz_sub_ui} need |
-room for the larger of the two operands, plus an extra |
-@code{mp_bits_per_limb}. |
- |
-@item |
-@code{mpz_mul}, @code{mpz_mul_ui} and @code{mpz_mul_si} need room for the sum |
-of the number of bits in their operands, but each rounded up to a multiple of |
-@code{mp_bits_per_limb}. |
- |
-@item |
-@code{mpz_swap} can be used between two array variables, but not between an |
-array and a normal variable. |
-@end itemize |
- |
-For other functions, or if in doubt, the suggestion is to calculate in a |
-regular @code{mpz_init} variable and copy the result to an array variable with |
-@code{mpz_set}. |
-@end deftypefun |
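The space requirements listed for mpz_array_init variables amount to simple arithmetic on bit counts. The sketch below is plain C and does not call GMP. The function names are hypothetical, and limb_bits stands in for the value of mp_bits_per_limb on the target system.

```c
#include <assert.h>
#include <stddef.h>

/* Round a bit count up to a whole number of limbs. */
static size_t
round_up_to_limb (size_t bits, size_t limb_bits)
{
  assert (limb_bits > 0);
  return (bits + limb_bits - 1) / limb_bits * limb_bits;
}

/* Room needed by an addition or subtraction: the larger operand plus one
   extra limb. */
static size_t
add_result_bits (size_t a_bits, size_t b_bits, size_t limb_bits)
{
  return (a_bits > b_bits ? a_bits : b_bits) + limb_bits;
}

/* Room needed by a multiplication: the sum of the operand sizes, each
   rounded up to a multiple of the limb size. */
static size_t
mul_result_bits (size_t a_bits, size_t b_bits, size_t limb_bits)
{
  return round_up_to_limb (a_bits, limb_bits)
         + round_up_to_limb (b_bits, limb_bits);
}
```

For instance, with 64-bit limbs, multiplying a 100-bit value by a 200-bit value would need 128 + 256 = 384 bits in the destination under these rules.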
- |
-@deftypefun {void *} _mpz_realloc (mpz_t @var{integer}, mp_size_t @var{new_alloc}) |
-Change the space for @var{integer} to @var{new_alloc} limbs. The value in |
-@var{integer} is preserved if it fits, or is set to 0 if not. The return |
-value is not useful to applications and should be ignored. |
- |
-@code{mpz_realloc2} is the preferred way to accomplish allocation changes like |
-this. @code{mpz_realloc2} and @code{_mpz_realloc} are the same except that |
-@code{_mpz_realloc} takes its size in limbs. |
-@end deftypefun |
- |
-@deftypefun mp_limb_t mpz_getlimbn (mpz_t @var{op}, mp_size_t @var{n}) |
-Return limb number @var{n} from @var{op}. The sign of @var{op} is ignored, |
-just the absolute value is used. The least significant limb is number 0. |
- |
-@code{mpz_size} can be used to find how many limbs make up @var{op}. |
-@code{mpz_getlimbn} returns zero if @var{n} is outside the range 0 to |
-@code{mpz_size(@var{op})-1}. |
-@end deftypefun |
- |
-@deftypefun size_t mpz_size (mpz_t @var{op}) |
-Return the size of @var{op} measured in number of limbs. If @var{op} is zero, |
-the returned value will be zero. |
-@c (@xref{Nomenclature}, for an explanation of the concept @dfn{limb}.) |
-@end deftypefun |
- |
- |
- |
-@node Rational Number Functions, Floating-point Functions, Integer Functions, Top |
-@comment node-name, next, previous, up |
-@chapter Rational Number Functions |
-@cindex Rational number functions |
- |
-This chapter describes the GMP functions for performing arithmetic on rational |
-numbers. These functions start with the prefix @code{mpq_}. |
- |
-Rational numbers are stored in objects of type @code{mpq_t}. |
- |
-All rational arithmetic functions assume operands have a canonical form, and |
-canonicalize their result. The canonical form means that the denominator and |
-the numerator have no common factors, and that the denominator is positive. |
-Zero has the unique representation 0/1. |
- |
-Pure assignment functions do not canonicalize the assigned variable. It is |
-the responsibility of the user to canonicalize the assigned variable before |
-any arithmetic operations are performed on that variable. |
- |
-@deftypefun void mpq_canonicalize (mpq_t @var{op}) |
-Remove any factors that are common to the numerator and denominator of |
-@var{op}, and make the denominator positive. |
-@end deftypefun |
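The canonical form described above (no common factors, positive denominator, zero stored as 0/1) can be modelled with machine integers. This is a sketch in plain C, not GMP's implementation: struct rational_model, gcd_model and canonicalize_model are hypothetical names, and the model ignores overflow at LONG_MIN.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical fixed-size stand-in for a rational number. */
struct rational_model
{
  long num;
  long den;
};

/* Greatest common divisor of |a| and |b| by Euclid's algorithm. */
static long
gcd_model (long a, long b)
{
  a = labs (a);
  b = labs (b);
  while (b != 0)
    {
      long t = a % b;
      a = b;
      b = t;
    }
  return a;
}

/* Remove common factors and make the denominator positive. */
static struct rational_model
canonicalize_model (long num, long den)
{
  struct rational_model r;
  long g;

  assert (den != 0);

  g = gcd_model (num, den);   /* at least 1, because den is non-zero */
  if (den < 0)
    g = -g;                   /* dividing by a negative g flips both signs */

  r.num = num / g;
  r.den = den / g;
  return r;
}
```

Under this model 6/-4 becomes -3/2, and any zero numerator, such as 0/5, becomes 0/1.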
- |
-@menu |
-* Initializing Rationals:: |
-* Rational Conversions:: |
-* Rational Arithmetic:: |
-* Comparing Rationals:: |
-* Applying Integer Functions:: |
-* I/O of Rationals:: |
-@end menu |
- |
-@node Initializing Rationals, Rational Conversions, Rational Number Functions, Rational Number Functions |
-@comment node-name, next, previous, up |
-@section Initialization and Assignment Functions |
-@cindex Rational assignment functions |
-@cindex Assignment functions |
-@cindex Rational initialization functions |
-@cindex Initialization functions |
- |
-@deftypefun void mpq_init (mpq_t @var{dest_rational}) |
-Initialize @var{dest_rational} and set it to 0/1. Each variable should |
-normally only be initialized once, or at least cleared out (using the function |
-@code{mpq_clear}) between each initialization. |
-@end deftypefun |
- |
-@deftypefun void mpq_clear (mpq_t @var{rational_number}) |
-Free the space occupied by @var{rational_number}. Make sure to call this |
-function for all @code{mpq_t} variables when you are done with them. |
-@end deftypefun |
- |
-@deftypefun void mpq_set (mpq_t @var{rop}, mpq_t @var{op}) |
-@deftypefunx void mpq_set_z (mpq_t @var{rop}, mpz_t @var{op}) |
-Assign @var{rop} from @var{op}. |
-@end deftypefun |
- |
-@deftypefun void mpq_set_ui (mpq_t @var{rop}, unsigned long int @var{op1}, unsigned long int @var{op2}) |
-@deftypefunx void mpq_set_si (mpq_t @var{rop}, signed long int @var{op1}, unsigned long int @var{op2}) |
-Set the value of @var{rop} to @var{op1}/@var{op2}. Note that if @var{op1} and |
-@var{op2} have common factors, @var{rop} has to be passed to |
-@code{mpq_canonicalize} before any operations are performed on @var{rop}. |
-@end deftypefun |
- |
-@deftypefun int mpq_set_str (mpq_t @var{rop}, char *@var{str}, int @var{base}) |
-Set @var{rop} from a null-terminated string @var{str} in the given @var{base}. |
- |
-The string can be an integer like ``41'' or a fraction like ``41/152''. The |
-fraction must be in canonical form (@pxref{Rational Number Functions}), or if |
-not then @code{mpq_canonicalize} must be called. |
- |
-The numerator and optional denominator are parsed the same as in |
-@code{mpz_set_str} (@pxref{Assigning Integers}). White space is allowed in |
-the string, and is simply ignored. The @var{base} can vary from 2 to 62, or |
-if @var{base} is 0 then the leading characters are used: @code{0x} or |
-@code{0X} for hex, @code{0b} or @code{0B} for binary, @code{0} for octal, or |
-decimal otherwise. Note that this is done separately |
-for the numerator and denominator, so for instance @code{0xEF/100} is 239/100, |
-whereas @code{0xEF/0x100} is 239/256. |
- |
-The return value is 0 if the entire string is a valid number, or @minus{}1 if |
-not. |
-@end deftypefun |
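The rule that a base of 0 is resolved separately for the numerator and the denominator can be illustrated with the standard C function strtol, which also picks a base from a leading 0x or 0 when given base 0. The sketch below is a model only, not how GMP parses: struct frac_model and parse_fraction_model are hypothetical names, values are limited to the range of long, embedded white space is not accepted, and strtol's base 0 mode is not assumed to understand a 0b prefix.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type: ok is 1 if the whole string was consumed. */
struct frac_model
{
  long num;
  long den;
  int ok;
};

/* Parse "num" or "num/den", choosing the base of each part independently
   from its own leading characters. */
static struct frac_model
parse_fraction_model (const char *str)
{
  struct frac_model r = { 0, 1, 0 };
  const char *slash;
  char *end;

  assert (str != NULL);

  slash = strchr (str, '/');
  r.num = strtol (str, &end, 0);
  if (end == str)
    return r;                     /* no digits at all */

  if (slash == NULL)
    {
      r.ok = (*end == '\0');      /* plain integer */
      return r;
    }
  if (end != slash)
    return r;                     /* junk before the slash */

  r.den = strtol (slash + 1, &end, 0);
  r.ok = (end != slash + 1 && *end == '\0');
  return r;
}
```

In this model "0xEF/100" yields 239/100 while "0xEF/0x100" yields 239/256, matching the behaviour described for a base of 0. As with the real function, the result is not canonicalized.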
- |
-@deftypefun void mpq_swap (mpq_t @var{rop1}, mpq_t @var{rop2}) |
-Swap the values @var{rop1} and @var{rop2} efficiently. |
-@end deftypefun |
- |
- |
-@need 2000 |
-@node Rational Conversions, Rational Arithmetic, Initializing Rationals, Rational Number Functions |
-@comment node-name, next, previous, up |
-@section Conversion Functions |
-@cindex Rational conversion functions |
-@cindex Conversion functions |
- |
-@deftypefun double mpq_get_d (mpq_t @var{op}) |
-Convert @var{op} to a @code{double}, truncating if necessary (ie.@: rounding |
-towards zero). |
- |
-If the exponent from the conversion is too big or too small to fit a |
-@code{double} then the result is system dependent. If it is too big, an |
-infinity is returned when available. If it is too small, @math{0.0} is |
-normally returned. |
-Hardware overflow, underflow and denorm traps may or may not occur. |
-@end deftypefun |
- |
-@deftypefun void mpq_set_d (mpq_t @var{rop}, double @var{op}) |
-@deftypefunx void mpq_set_f (mpq_t @var{rop}, mpf_t @var{op}) |
-Set @var{rop} to the value of @var{op}. There is no rounding, this conversion |
-is exact. |
-@end deftypefun |
- |
-@deftypefun {char *} mpq_get_str (char *@var{str}, int @var{base}, mpq_t @var{op}) |
-Convert @var{op} to a string of digits in base @var{base}. The base may vary |
-from 2 to 36. The string will be of the form @samp{num/den}, or if the |
-denominator is 1 then just @samp{num}. |
- |
-If @var{str} is @code{NULL}, the result string is allocated using the current |
-allocation function (@pxref{Custom Allocation}). The block will be |
-@code{strlen(str)+1} bytes, that being exactly enough for the string and |
-null-terminator. |
- |
-If @var{str} is not @code{NULL}, it should point to a block of storage large |
-enough for the result, that being |
- |
-@example |
-mpz_sizeinbase (mpq_numref(@var{op}), @var{base}) |
-+ mpz_sizeinbase (mpq_denref(@var{op}), @var{base}) + 3 |
-@end example |
- |
-The three extra bytes are for a possible minus sign, possible slash, and the |
-null-terminator. |
- |
-A pointer to the result string is returned, being either the allocated block, |
-or the given @var{str}. |
-@end deftypefun |
- |
- |
-@node Rational Arithmetic, Comparing Rationals, Rational Conversions, Rational Number Functions |
-@comment node-name, next, previous, up |
-@section Arithmetic Functions |
-@cindex Rational arithmetic functions |
-@cindex Arithmetic functions |
- |
-@deftypefun void mpq_add (mpq_t @var{sum}, mpq_t @var{addend1}, mpq_t @var{addend2}) |
-Set @var{sum} to @var{addend1} + @var{addend2}. |
-@end deftypefun |
- |
-@deftypefun void mpq_sub (mpq_t @var{difference}, mpq_t @var{minuend}, mpq_t @var{subtrahend}) |
-Set @var{difference} to @var{minuend} @minus{} @var{subtrahend}. |
-@end deftypefun |
- |
-@deftypefun void mpq_mul (mpq_t @var{product}, mpq_t @var{multiplier}, mpq_t @var{multiplicand}) |
-Set @var{product} to @math{@var{multiplier} @GMPtimes{} @var{multiplicand}}. |
-@end deftypefun |
- |
-@deftypefun void mpq_mul_2exp (mpq_t @var{rop}, mpq_t @var{op1}, unsigned long int @var{op2}) |
-Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to |
-@var{op2}}. |
-@end deftypefun |
- |
-@deftypefun void mpq_div (mpq_t @var{quotient}, mpq_t @var{dividend}, mpq_t @var{divisor}) |
-@cindex Division functions |
-Set @var{quotient} to @var{dividend}/@var{divisor}. |
-@end deftypefun |
- |
-@deftypefun void mpq_div_2exp (mpq_t @var{rop}, mpq_t @var{op1}, unsigned long int @var{op2}) |
-Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to |
-@var{op2}}. |
-@end deftypefun |
- |
-@deftypefun void mpq_neg (mpq_t @var{negated_operand}, mpq_t @var{operand}) |
-Set @var{negated_operand} to @minus{}@var{operand}. |
-@end deftypefun |
- |
-@deftypefun void mpq_abs (mpq_t @var{rop}, mpq_t @var{op}) |
-Set @var{rop} to the absolute value of @var{op}. |
-@end deftypefun |
- |
-@deftypefun void mpq_inv (mpq_t @var{inverted_number}, mpq_t @var{number}) |
-Set @var{inverted_number} to 1/@var{number}. If the new denominator is |
-zero, this routine will divide by zero. |
-@end deftypefun |
- |
-@node Comparing Rationals, Applying Integer Functions, Rational Arithmetic, Rational Number Functions |
-@comment node-name, next, previous, up |
-@section Comparison Functions |
-@cindex Rational comparison functions |
-@cindex Comparison functions |
- |
-@deftypefun int mpq_cmp (mpq_t @var{op1}, mpq_t @var{op2}) |
-Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} > |
-@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if |
-@math{@var{op1} < @var{op2}}. |
- |
-To determine if two rationals are equal, @code{mpq_equal} is faster than |
-@code{mpq_cmp}. |
-@end deftypefun |
- |
-@deftypefn Macro int mpq_cmp_ui (mpq_t @var{op1}, unsigned long int @var{num2}, unsigned long int @var{den2}) |
-@deftypefnx Macro int mpq_cmp_si (mpq_t @var{op1}, long int @var{num2}, unsigned long int @var{den2}) |
-Compare @var{op1} and @var{num2}/@var{den2}. Return a positive value if |
-@math{@var{op1} > @var{num2}/@var{den2}}, zero if @math{@var{op1} = |
-@var{num2}/@var{den2}}, and a negative value if @math{@var{op1} < |
-@var{num2}/@var{den2}}. |
- |
-@var{num2} and @var{den2} are allowed to have common factors. |
- |
-These functions are implemented as macros and evaluate their arguments |
-multiple times. |
-@end deftypefn |
- |
-@deftypefn Macro int mpq_sgn (mpq_t @var{op}) |
-@cindex Sign tests |
-@cindex Rational sign tests |
-Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and |
-@math{-1} if @math{@var{op} < 0}. |
- |
-This function is actually implemented as a macro. It evaluates its |
-argument multiple times. |
-@end deftypefn |
- |
-@deftypefun int mpq_equal (mpq_t @var{op1}, mpq_t @var{op2}) |
-Return non-zero if @var{op1} and @var{op2} are equal, zero if they are |
-non-equal. Although @code{mpq_cmp} can be used for the same purpose, this |
-function is much faster. |
-@end deftypefun |
- |
-@node Applying Integer Functions, I/O of Rationals, Comparing Rationals, Rational Number Functions |
-@comment node-name, next, previous, up |
-@section Applying Integer Functions to Rationals |
-@cindex Rational numerator and denominator |
-@cindex Numerator and denominator |
- |
-The set of @code{mpq} functions is quite small. In particular, there are few |
-functions for either input or output. The following functions give direct |
-access to the numerator and denominator of an @code{mpq_t}. |
- |
-Note that if an assignment to the numerator and/or denominator could take an |
-@code{mpq_t} out of the canonical form described at the start of this chapter |
-(@pxref{Rational Number Functions}) then @code{mpq_canonicalize} must be |
-called before any other @code{mpq} functions are applied to that @code{mpq_t}. |
- |
-@deftypefn Macro mpz_t mpq_numref (mpq_t @var{op}) |
-@deftypefnx Macro mpz_t mpq_denref (mpq_t @var{op}) |
-Return a reference to the numerator and denominator of @var{op}, respectively. |
-The @code{mpz} functions can be used on the result of these macros. |
-@end deftypefn |
- |
-@deftypefun void mpq_get_num (mpz_t @var{numerator}, mpq_t @var{rational}) |
-@deftypefunx void mpq_get_den (mpz_t @var{denominator}, mpq_t @var{rational}) |
-@deftypefunx void mpq_set_num (mpq_t @var{rational}, mpz_t @var{numerator}) |
-@deftypefunx void mpq_set_den (mpq_t @var{rational}, mpz_t @var{denominator}) |
-Get or set the numerator or denominator of a rational. These functions are |
-equivalent to calling @code{mpz_set} with an appropriate @code{mpq_numref} or |
-@code{mpq_denref}. Direct use of @code{mpq_numref} or @code{mpq_denref} is |
-recommended instead of these functions. |
-@end deftypefun |
- |
- |
-@need 2000 |
-@node I/O of Rationals, , Applying Integer Functions, Rational Number Functions |
-@comment node-name, next, previous, up |
-@section Input and Output Functions |
-@cindex Rational input and output functions |
-@cindex Input functions |
-@cindex Output functions |
-@cindex I/O functions |
- |
-When using any of these functions, it's a good idea to include @file{stdio.h} |
-before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes |
-for these functions. |
- |
-Passing a @code{NULL} pointer for a @var{stream} argument to any of these |
-functions will make them read from @code{stdin} and write to @code{stdout}, |
-respectively. |
- |
-@deftypefun size_t mpq_out_str (FILE *@var{stream}, int @var{base}, mpq_t @var{op}) |
-Output @var{op} on stdio stream @var{stream}, as a string of digits in base |
-@var{base}. The base may vary from 2 to 36. Output is in the form |
-@samp{num/den} or if the denominator is 1 then just @samp{num}. |
- |
-Return the number of bytes written, or if an error occurred, return 0. |
-@end deftypefun |
- |
-@deftypefun size_t mpq_inp_str (mpq_t @var{rop}, FILE *@var{stream}, int @var{base}) |
-Read a string of digits from @var{stream} and convert them to a rational in |
-@var{rop}. Any initial white-space characters are read and discarded. Return |
-the number of characters read (including white space), or 0 if a rational |
-could not be read. |
- |
-The input can be a fraction like @samp{17/63} or just an integer like |
-@samp{123}. Reading stops at the first character not in this form, and white |
-space is not permitted within the string. If the input might not be in |
-canonical form, then @code{mpq_canonicalize} must be called (@pxref{Rational |
-Number Functions}). |
- |
-The @var{base} can be between 2 and 36, or can be 0 in which case the leading |
-characters of the string determine the base, @samp{0x} or @samp{0X} for |
-hexadecimal, @samp{0} for octal, or decimal otherwise. The leading characters |
-are examined separately for the numerator and denominator of a fraction, so |
-for instance @samp{0x10/11} is @math{16/11}, whereas @samp{0x10/0x11} is |
-@math{16/17}. |
-@end deftypefun |
- |
- |
-@node Floating-point Functions, Low-level Functions, Rational Number Functions, Top |
-@comment node-name, next, previous, up |
-@chapter Floating-point Functions |
-@cindex Floating-point functions |
-@cindex Float functions |
-@cindex User-defined precision |
-@cindex Precision of floats |
- |
-GMP floating point numbers are stored in objects of type @code{mpf_t} and |
-functions operating on them have an @code{mpf_} prefix. |
- |
-The mantissa of each float has a user-selectable precision, limited only by |
-available memory. Each variable has its own precision, and that can be |
-increased or decreased at any time. |
- |
-The exponent of each float is a fixed precision, one machine word on most |
-systems. In the current implementation the exponent is a count of limbs, so |
-for example on a 32-bit system this means a range of roughly |
-@math{2^@W{-68719476768}} to @math{2^@W{68719476736}}, or on a 64-bit system |
-this will be greater. Note however @code{mpf_get_str} can only return an |
-exponent which fits an @code{mp_exp_t} and currently @code{mpf_set_str} |
-doesn't accept exponents bigger than a @code{long}. |
- |
-Each variable keeps a size for the mantissa data actually in use. This means |
-that if a float is exactly represented in only a few bits then only those bits |
-will be used in a calculation, even if the selected precision is high. |
- |
-All calculations are performed to the precision of the destination variable. |
-Each function is defined to calculate with ``infinite precision'' followed by |
-a truncation to the destination precision, but of course the work done is only |
-what's needed to determine a result under that definition. |
- |
-The precision selected for a variable is a minimum value; GMP may increase it |
-a little to facilitate efficient calculation. Currently this means rounding |
-up to a whole limb, and then sometimes having a further partial limb, |
-depending on the high limb of the mantissa. But applications shouldn't be |
-concerned by such details. |
- |
-The mantissa is stored in binary, as might be imagined from the fact |
-precisions are expressed in bits. One consequence of this is that decimal |
-fractions like @math{0.1} cannot be represented exactly. The same is true of |
-plain IEEE @code{double} floats. This makes both highly unsuitable for |
-calculations involving money or other values that should be exact decimal |
-fractions. (Suitably scaled integers, or perhaps rationals, are better |
-choices.) |
- |
-@code{mpf} functions and variables have no special notion of infinity or |
-not-a-number, and applications must take care not to overflow the exponent or |
-results will be unpredictable. This might change in a future release. |
- |
-Note that the @code{mpf} functions are @emph{not} intended as a smooth |
-extension to IEEE P754 arithmetic. In particular results obtained on one |
-computer often differ from the results on a computer with a different word |
-size. |
- |
-@menu |
-* Initializing Floats:: |
-* Assigning Floats:: |
-* Simultaneous Float Init & Assign:: |
-* Converting Floats:: |
-* Float Arithmetic:: |
-* Float Comparison:: |
-* I/O of Floats:: |
-* Miscellaneous Float Functions:: |
-@end menu |
- |
-@node Initializing Floats, Assigning Floats, Floating-point Functions, Floating-point Functions |
-@comment node-name, next, previous, up |
-@section Initialization Functions |
-@cindex Float initialization functions |
-@cindex Initialization functions |
- |
-@deftypefun void mpf_set_default_prec (unsigned long int @var{prec}) |
-Set the default precision to be @strong{at least} @var{prec} bits. All |
-subsequent calls to @code{mpf_init} will use this precision, but previously |
-initialized variables are unaffected. |
-@end deftypefun |
- |
-@deftypefun {unsigned long int} mpf_get_default_prec (void) |
-Return the default precision actually used. |
-@end deftypefun |
- |
-An @code{mpf_t} object must be initialized before storing the first value in |
-it. The functions @code{mpf_init} and @code{mpf_init2} are used for that |
-purpose. |
- |
-@deftypefun void mpf_init (mpf_t @var{x}) |
-Initialize @var{x} to 0. Normally, a variable should be initialized once only |
-or at least be cleared, using @code{mpf_clear}, between initializations. The |
-precision of @var{x} is undefined unless a default precision has already been |
-established by a call to @code{mpf_set_default_prec}. |
-@end deftypefun |
- |
-@deftypefun void mpf_init2 (mpf_t @var{x}, unsigned long int @var{prec}) |
-Initialize @var{x} to 0 and set its precision to be @strong{at least} |
-@var{prec} bits. Normally, a variable should be initialized once only or at |
-least be cleared, using @code{mpf_clear}, between initializations. |
-@end deftypefun |
- |
-@deftypefun void mpf_clear (mpf_t @var{x}) |
-Free the space occupied by @var{x}. Make sure to call this function for all |
-@code{mpf_t} variables when you are done with them. |
-@end deftypefun |
- |
-@need 2000 |
-Here is an example of how to initialize floating-point variables: |
-@example |
-@{ |
- mpf_t x, y; |
- mpf_init (x); /* use default precision */ |
- mpf_init2 (y, 256); /* precision @emph{at least} 256 bits */ |
- @dots{} |
- /* Unless the program is about to exit, do ... */ |
- mpf_clear (x); |
- mpf_clear (y); |
-@} |
-@end example |
- |
-The following three functions are useful for changing the precision during a |
-calculation. A typical use would be for adjusting the precision gradually in |
-iterative algorithms like Newton-Raphson, making the computation precision |
-closely match the actual accurate part of the numbers. |
- |
-@deftypefun {unsigned long int} mpf_get_prec (mpf_t @var{op}) |
-Return the current precision of @var{op}, in bits. |
-@end deftypefun |
- |
-@deftypefun void mpf_set_prec (mpf_t @var{rop}, unsigned long int @var{prec}) |
-Set the precision of @var{rop} to be @strong{at least} @var{prec} bits. The |
-value in @var{rop} will be truncated to the new precision. |
- |
-This function requires a call to @code{realloc}, and so should not be used in |
-a tight loop. |
-@end deftypefun |
- |
-@deftypefun void mpf_set_prec_raw (mpf_t @var{rop}, unsigned long int @var{prec}) |
-Set the precision of @var{rop} to be @strong{at least} @var{prec} bits, |
-without changing the memory allocated. |
- |
-@var{prec} must be no more than the allocated precision for @var{rop}, that |
-being the precision when @var{rop} was initialized, or in the most recent |
-@code{mpf_set_prec}. |
- |
-The value in @var{rop} is unchanged, and in particular if it had a higher |
-precision than @var{prec} it will retain that higher precision. New values |
-written to @var{rop} will use the new @var{prec}. |
- |
-Before calling @code{mpf_clear} or the full @code{mpf_set_prec}, another |
-@code{mpf_set_prec_raw} call must be made to restore @var{rop} to its original |
-allocated precision. Failing to do so will have unpredictable results. |
- |
-@code{mpf_get_prec} can be used before @code{mpf_set_prec_raw} to get the |
-original allocated precision. After @code{mpf_set_prec_raw} it reflects the |
-@var{prec} value set. |
- |
-@code{mpf_set_prec_raw} is an efficient way to use an @code{mpf_t} variable at |
-different precisions during a calculation, perhaps to gradually increase |
-precision in an iteration, or just to use various different precisions for |
-different purposes during a calculation. |
-@end deftypefun |
- |
- |
-@need 2000 |
-@node Assigning Floats, Simultaneous Float Init & Assign, Initializing Floats, Floating-point Functions |
-@comment node-name, next, previous, up |
-@section Assignment Functions |
-@cindex Float assignment functions |
-@cindex Assignment functions |
- |
-These functions assign new values to already initialized floats |
-(@pxref{Initializing Floats}). |
- |
-@deftypefun void mpf_set (mpf_t @var{rop}, mpf_t @var{op}) |
-@deftypefunx void mpf_set_ui (mpf_t @var{rop}, unsigned long int @var{op}) |
-@deftypefunx void mpf_set_si (mpf_t @var{rop}, signed long int @var{op}) |
-@deftypefunx void mpf_set_d (mpf_t @var{rop}, double @var{op}) |
-@deftypefunx void mpf_set_z (mpf_t @var{rop}, mpz_t @var{op}) |
-@deftypefunx void mpf_set_q (mpf_t @var{rop}, mpq_t @var{op}) |
-Set the value of @var{rop} from @var{op}. |
-@end deftypefun |
- |
-@deftypefun int mpf_set_str (mpf_t @var{rop}, char *@var{str}, int @var{base}) |
-Set the value of @var{rop} from the string in @var{str}. The string is of the |
-form @samp{M@@N} or, if the base is 10 or less, alternatively @samp{MeN}. |
-@samp{M} is the mantissa and @samp{N} is the exponent. The mantissa is always |
-in the specified base. The exponent is either in the specified base or, if |
-@var{base} is negative, in decimal. The decimal point expected is taken from |
-the current locale, on systems providing @code{localeconv}. |
- |
-The argument @var{base} may be in the ranges 2 to 62, or @minus{}62 to |
-@minus{}2. Negative values are used to specify that the exponent is in |
-decimal. |
- |
-For bases up to 36, case is ignored; upper-case and lower-case letters have |
-the same value. For bases 37 to 62, upper-case letters represent the usual |
-10..35 while lower-case letters represent 36..61. |
- |
-Unlike the corresponding @code{mpz} function, the base will not be determined |
-from the leading characters of the string if @var{base} is 0. This is so that |
-numbers like @samp{0.23} are not interpreted as octal. |
- |
-White space is allowed in the string, and is simply ignored. [This is not |
-really true; white-space is ignored in the beginning of the string and within |
-the mantissa, but not in other places, such as after a minus sign or in the |
-exponent. We are considering changing the definition of this function, making |
-it fail when there is any white-space in the input, since that makes a lot of |
-sense. Please tell us your opinion about this change. Do you really want it |
-to accept @nicode{"3 14"} as meaning 314 as it does now?] |
- |
-This function returns 0 if the entire string is a valid number in base |
-@var{base}. Otherwise it returns @minus{}1. |
-@end deftypefun |
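- |
-The letter-to-digit rule described for @code{mpf_set_str} can be sketched in |
-plain C. This is only an illustrative model, not GMP's parser: the name |
-@code{digit_value} is made up for this sketch, and an ASCII-compatible |
-character set is assumed. |

```c
#include <assert.h>

/* Hypothetical helper: value of digit character c in the given base
   (2..62), or -1 if c is not a valid digit in that base.  Assumes an
   ASCII-compatible character set.  */
static int
digit_value (char c, int base)
{
  int v;

  assert (base >= 2 && base <= 62);

  if (c >= '0' && c <= '9')
    v = c - '0';
  else if (c >= 'A' && c <= 'Z')
    v = c - 'A' + 10;                      /* upper case: 10..35 */
  else if (c >= 'a' && c <= 'z')
    v = c - 'a' + (base <= 36 ? 10 : 36);  /* case ignored up to base 36 */
  else
    return -1;

  return v < base ? v : -1;
}
```

-A caller modelling a negative @var{base} would pass its absolute value, since |
-the sign only selects a decimal exponent. |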
- |
-@deftypefun void mpf_swap (mpf_t @var{rop1}, mpf_t @var{rop2}) |
-Swap @var{rop1} and @var{rop2} efficiently. Both the values and the |
-precisions of the two variables are swapped. |
-@end deftypefun |
- |
- |
-@node Simultaneous Float Init & Assign, Converting Floats, Assigning Floats, Floating-point Functions |
-@comment node-name, next, previous, up |
-@section Combined Initialization and Assignment Functions |
-@cindex Float assignment functions |
-@cindex Assignment functions |
-@cindex Float initialization functions |
-@cindex Initialization functions |
- |
-For convenience, GMP provides a parallel series of initialize-and-set functions |
-which initialize the output and then store the value there. These functions' |
-names have the form @code{mpf_init_set@dots{}}. |
- |
-Once the float has been initialized by any of the @code{mpf_init_set@dots{}} |
-functions, it can be used as the source or destination operand for the ordinary |
-float functions. Don't use an initialize-and-set function on a variable |
-already initialized! |
- |
-@deftypefun void mpf_init_set (mpf_t @var{rop}, mpf_t @var{op}) |
-@deftypefunx void mpf_init_set_ui (mpf_t @var{rop}, unsigned long int @var{op}) |
-@deftypefunx void mpf_init_set_si (mpf_t @var{rop}, signed long int @var{op}) |
-@deftypefunx void mpf_init_set_d (mpf_t @var{rop}, double @var{op}) |
-Initialize @var{rop} and set its value from @var{op}. |
- |
-The precision of @var{rop} will be taken from the active default precision, as |
-set by @code{mpf_set_default_prec}. |
-@end deftypefun |
- |
-@deftypefun int mpf_init_set_str (mpf_t @var{rop}, char *@var{str}, int @var{base}) |
-Initialize @var{rop} and set its value from the string in @var{str}. See |
-@code{mpf_set_str} above for details on the assignment operation. |
- |
-Note that @var{rop} is initialized even if an error occurs. (I.e., you have to |
-call @code{mpf_clear} for it.) |
- |
-The precision of @var{rop} will be taken from the active default precision, as |
-set by @code{mpf_set_default_prec}. |
-@end deftypefun |
- |
- |
-@node Converting Floats, Float Arithmetic, Simultaneous Float Init & Assign, Floating-point Functions |
-@comment node-name, next, previous, up |
-@section Conversion Functions |
-@cindex Float conversion functions |
-@cindex Conversion functions |
- |
-@deftypefun double mpf_get_d (mpf_t @var{op}) |
-Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding |
-towards zero). |
- |
-If the exponent in @var{op} is too big or too small to fit a @code{double} |
-then the result is system dependent. If it is too big, an infinity is |
-returned when available. If it is too small, @math{0.0} is normally returned. |
-Hardware overflow, underflow and denorm traps may or may not occur. |
-@end deftypefun |
- |
-@deftypefun double mpf_get_d_2exp (signed long int *@var{exp}, mpf_t @var{op}) |
-Convert @var{op} to a @code{double}, truncating if necessary (i.e.@: rounding |
-towards zero), and with an exponent returned separately. |
- |
-The return value is in the range @math{0.5@le{}@GMPabs{@var{d}}<1} and the |
-exponent is stored to @code{*@var{exp}}. @m{@var{d} * 2^{exp}, @var{d} * |
-2^@var{exp}} is the (truncated) @var{op} value. If @var{op} is zero, the |
-return is @math{0.0} and 0 is stored to @code{*@var{exp}}. |
- |
-@cindex @code{frexp} |
-This is similar to the standard C @code{frexp} function (@pxref{Normalization |
-Functions,,, libc, The GNU C Library Reference Manual}). |
-@end deftypefun |
- |
-@deftypefun long mpf_get_si (mpf_t @var{op}) |
-@deftypefunx {unsigned long} mpf_get_ui (mpf_t @var{op}) |
-Convert @var{op} to a @code{long} or @code{unsigned long}, truncating any |
-fraction part. If @var{op} is too big for the return type, the result is |
-undefined. |
- |
-See also @code{mpf_fits_slong_p} and @code{mpf_fits_ulong_p} |
-(@pxref{Miscellaneous Float Functions}). |
-@end deftypefun |
- |
-@deftypefun {char *} mpf_get_str (char *@var{str}, mp_exp_t *@var{expptr}, int @var{base}, size_t @var{n_digits}, mpf_t @var{op}) |
-Convert @var{op} to a string of digits in base @var{base}. The base argument |
-may vary from 2 to 62 or from @minus{}2 to @minus{}36. Up to @var{n_digits} |
-digits will be generated. Trailing zeros are not returned. No more digits |
-than can be accurately represented by @var{op} are ever generated. If |
-@var{n_digits} is 0 then that accurate maximum number of digits are generated. |
- |
-For @var{base} in the range 2..36, digits and lower-case letters are used; for |
-@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62, |
-digits, upper-case letters, and lower-case letters (in that significance order) |
-are used. |
- |
-If @var{str} is @code{NULL}, the result string is allocated using the current |
-allocation function (@pxref{Custom Allocation}). The block will be |
-@code{strlen(@var{result})+1} bytes, where @var{result} is the returned |
-string, that being exactly enough for the string and null-terminator. |
- |
-If @var{str} is not @code{NULL}, it should point to a block of |
-@math{@var{n_digits} + 2} bytes, that being enough for the mantissa, a |
-possible minus sign, and a null-terminator. When @var{n_digits} is 0 to get |
-all significant digits, an application won't be able to know the space |
-required, and @var{str} should be @code{NULL} in that case. |
- |
-The generated string is a fraction, with an implicit radix point immediately |
-to the left of the first digit. The applicable exponent is written through |
-the @var{expptr} pointer. For example, the number 3.1416 would be returned as |
-string @nicode{"31416"} and exponent 1. |
- |
-When @var{op} is zero, an empty string is produced and the exponent returned |
-is 0. |
- |
-A pointer to the result string is returned, being either the allocated block |
-or the given @var{str}. |
-@end deftypefun |
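- |
-Since @code{mpf_get_str} returns bare digits plus an exponent, a caller |
-normally has to place the radix point itself. The following is a minimal |
-sketch of that step in plain C. The names @code{put_char} and |
-@code{place_radix_point} are hypothetical, the digit string and exponent are |
-assumed to come from a call such as the one described above, and a @samp{.} is |
-always used as the radix character regardless of locale. |

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append one character, always keeping the
   buffer null-terminated.  Returns 0 on success, -1 if full.  */
static int
put_char (char *out, size_t outsize, size_t *pos, char c)
{
  if (*pos + 1 >= outsize)
    return -1;
  out[(*pos)++] = c;
  out[*pos] = '\0';
  return 0;
}

/* Hypothetical helper: write 0.DIGITS * base^exp in positional
   notation.  digits may start with '-'; an empty string means zero.
   Returns 0 on success, -1 if out is too small.  */
static int
place_radix_point (char *out, size_t outsize, const char *digits, long exp)
{
  size_t pos = 0, len, i;
  long k;

  assert (out != NULL && digits != NULL && outsize > 0);
  out[0] = '\0';

  if (*digits == '-')
    {
      if (put_char (out, outsize, &pos, '-') != 0)
        return -1;
      digits++;
    }

  len = strlen (digits);
  if (len == 0)
    return put_char (out, outsize, &pos, '0');

  if (exp <= 0)
    {
      /* Magnitude below 1: "0." followed by -exp zeros.  */
      if (put_char (out, outsize, &pos, '0') != 0
          || put_char (out, outsize, &pos, '.') != 0)
        return -1;
      for (k = exp; k < 0; k++)
        if (put_char (out, outsize, &pos, '0') != 0)
          return -1;
    }

  for (i = 0; i < len; i++)
    {
      /* The radix point goes in front of digit number exp.  */
      if (exp > 0 && i == (size_t) exp
          && put_char (out, outsize, &pos, '.') != 0)
        return -1;
      if (put_char (out, outsize, &pos, digits[i]) != 0)
        return -1;
    }

  /* Exponent beyond the digits: pad with zeros, no radix point.  */
  for (i = len; exp > 0 && i < (size_t) exp; i++)
    if (put_char (out, outsize, &pos, '0') != 0)
      return -1;

  return 0;
}
```

-With the digits @nicode{"31416"} and exponent 1 from the example above, this |
-sketch produces @nicode{"3.1416"}. |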
- |
- |
-@node Float Arithmetic, Float Comparison, Converting Floats, Floating-point Functions |
-@comment node-name, next, previous, up |
-@section Arithmetic Functions |
-@cindex Float arithmetic functions |
-@cindex Arithmetic functions |
- |
-@deftypefun void mpf_add (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2}) |
-@deftypefunx void mpf_add_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2}) |
-Set @var{rop} to @math{@var{op1} + @var{op2}}. |
-@end deftypefun |
- |
-@deftypefun void mpf_sub (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2}) |
-@deftypefunx void mpf_ui_sub (mpf_t @var{rop}, unsigned long int @var{op1}, mpf_t @var{op2}) |
-@deftypefunx void mpf_sub_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2}) |
-Set @var{rop} to @var{op1} @minus{} @var{op2}. |
-@end deftypefun |
- |
-@deftypefun void mpf_mul (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2}) |
-@deftypefunx void mpf_mul_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2}) |
-Set @var{rop} to @math{@var{op1} @GMPtimes{} @var{op2}}. |
-@end deftypefun |
- |
-Division is undefined if the divisor is zero, and passing a zero divisor to the |
-divide functions will make these functions intentionally divide by zero. This |
-lets the user handle arithmetic exceptions in these functions in the same |
-manner as other arithmetic exceptions. |
- |
-@deftypefun void mpf_div (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2}) |
-@deftypefunx void mpf_ui_div (mpf_t @var{rop}, unsigned long int @var{op1}, mpf_t @var{op2}) |
-@deftypefunx void mpf_div_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2}) |
-@cindex Division functions |
-Set @var{rop} to @var{op1}/@var{op2}. |
-@end deftypefun |
- |
-@deftypefun void mpf_sqrt (mpf_t @var{rop}, mpf_t @var{op}) |
-@deftypefunx void mpf_sqrt_ui (mpf_t @var{rop}, unsigned long int @var{op}) |
-@cindex Root extraction functions |
-Set @var{rop} to @m{\sqrt{@var{op}}, the square root of @var{op}}. |
-@end deftypefun |
- |
-@deftypefun void mpf_pow_ui (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2}) |
-@cindex Exponentiation functions |
-@cindex Powering functions |
-Set @var{rop} to @m{@var{op1}^{op2}, @var{op1} raised to the power @var{op2}}. |
-@end deftypefun |
- |
-@deftypefun void mpf_neg (mpf_t @var{rop}, mpf_t @var{op}) |
-Set @var{rop} to @minus{}@var{op}. |
-@end deftypefun |
- |
-@deftypefun void mpf_abs (mpf_t @var{rop}, mpf_t @var{op}) |
-Set @var{rop} to the absolute value of @var{op}. |
-@end deftypefun |
- |
-@deftypefun void mpf_mul_2exp (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2}) |
-Set @var{rop} to @m{@var{op1} \times 2^{op2}, @var{op1} times 2 raised to |
-@var{op2}}. |
-@end deftypefun |
- |
-@deftypefun void mpf_div_2exp (mpf_t @var{rop}, mpf_t @var{op1}, unsigned long int @var{op2}) |
-Set @var{rop} to @m{@var{op1}/2^{op2}, @var{op1} divided by 2 raised to |
-@var{op2}}. |
-@end deftypefun |
- |
-@node Float Comparison, I/O of Floats, Float Arithmetic, Floating-point Functions |
-@comment node-name, next, previous, up |
-@section Comparison Functions |
-@cindex Float comparison functions |
-@cindex Comparison functions |
- |
-@deftypefun int mpf_cmp (mpf_t @var{op1}, mpf_t @var{op2}) |
-@deftypefunx int mpf_cmp_d (mpf_t @var{op1}, double @var{op2}) |
-@deftypefunx int mpf_cmp_ui (mpf_t @var{op1}, unsigned long int @var{op2}) |
-@deftypefunx int mpf_cmp_si (mpf_t @var{op1}, signed long int @var{op2}) |
-Compare @var{op1} and @var{op2}. Return a positive value if @math{@var{op1} > |
-@var{op2}}, zero if @math{@var{op1} = @var{op2}}, and a negative value if |
-@math{@var{op1} < @var{op2}}. |
- |
-@code{mpf_cmp_d} can be called with an infinity, but results are undefined for |
-a NaN. |
-@end deftypefun |
- |
-@deftypefun int mpf_eq (mpf_t @var{op1}, mpf_t @var{op2}, unsigned long int @var{op3}) |
-Return non-zero if the first @var{op3} bits of @var{op1} and @var{op2} are |
-equal, zero otherwise. I.e., test if @var{op1} and @var{op2} are approximately |
-equal. |
- |
-Caution 1: All versions of GMP up to version 4.2.4 compared just whole limbs, |
-meaning sometimes more than @var{op3} bits, sometimes fewer. |
- |
-Caution 2: This function will consider XXX11...111 and XX100...000 different, |
-even if ... is replaced by a semi-infinite number of bits. Such numbers are |
-really just one ulp off, and should be considered equal. |
-@end deftypefun |
- |
-@deftypefun void mpf_reldiff (mpf_t @var{rop}, mpf_t @var{op1}, mpf_t @var{op2}) |
-Compute the relative difference between @var{op1} and @var{op2} and store the |
-result in @var{rop}. This is @math{@GMPabs{@var{op1}-@var{op2}}/@var{op1}}. |
-@end deftypefun |
- |
-@deftypefn Macro int mpf_sgn (mpf_t @var{op}) |
-@cindex Sign tests |
-@cindex Float sign tests |
-Return @math{+1} if @math{@var{op} > 0}, 0 if @math{@var{op} = 0}, and |
-@math{-1} if @math{@var{op} < 0}. |
- |
-This function is actually implemented as a macro. It evaluates its argument |
-multiple times. |
-@end deftypefn |
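- |
-The practical consequence of a macro evaluating its argument more than once |
-can be shown with a stand-in. @code{MODEL_SGN} below is a hypothetical macro |
-on plain integers, not GMP's definition of @code{mpf_sgn}, and |
-@code{next_value} is an invented function with a visible side effect. |

```c
#include <assert.h>

/* Hypothetical stand-in for a sign macro; not GMP's definition of
   mpf_sgn.  A macro of this shape expands its argument twice.  */
#define MODEL_SGN(x) ((x) < 0 ? -1 : (x) > 0)

static int calls;   /* counts how often next_value runs */

/* Hypothetical function with a visible side effect.  */
static int
next_value (void)
{
  calls++;
  return 5;
}

/* Hazard: the argument expression is pasted into the macro body, so
   next_value can run more than once.  Returns the number of calls.  */
static int
calls_when_passed_directly (void)
{
  calls = 0;
  (void) MODEL_SGN (next_value ());
  return calls;
}

/* Safe pattern: evaluate once into a variable, then apply the macro.  */
static int
calls_when_stored_first (void)
{
  int v;

  calls = 0;
  v = next_value ();
  assert (MODEL_SGN (v) == 1);
  return calls;
}
```

-The same care applies to @code{mpf_sgn}: an argument expression with side |
-effects, such as an array element indexed with @code{i++}, is best evaluated |
-into a variable first. |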
- |
-@node I/O of Floats, Miscellaneous Float Functions, Float Comparison, Floating-point Functions |
-@comment node-name, next, previous, up |
-@section Input and Output Functions |
-@cindex Float input and output functions |
-@cindex Input functions |
-@cindex Output functions |
-@cindex I/O functions |
- |
-The functions in this section read floats from a stdio stream or write them to |
-one. Passing a @code{NULL} pointer for a @var{stream} argument makes the input |
-functions read from @code{stdin} and the output functions write to |
-@code{stdout}. |
- |
-When using any of these functions, it is a good idea to include @file{stdio.h} |
-before @file{gmp.h}, since that will allow @file{gmp.h} to define prototypes |
-for these functions. |
- |
-@deftypefun size_t mpf_out_str (FILE *@var{stream}, int @var{base}, size_t @var{n_digits}, mpf_t @var{op}) |
-Print @var{op} to @var{stream}, as a string of digits. Return the number of |
-bytes written, or if an error occurred, return 0. |
- |
-The mantissa is prefixed with a @samp{0.} and is in the given @var{base}, |
-which may vary from 2 to 62 or from @minus{}2 to @minus{}36. An exponent is |
-then printed, separated by an @samp{e}, or if the base is greater than 10 then |
-by an @samp{@@}. The exponent is always in decimal. The decimal point follows |
-the current locale, on systems providing @code{localeconv}. |
- |
-For @var{base} in the range 2..36, digits and lower-case letters are used; for |
-@minus{}2..@minus{}36, digits and upper-case letters are used; for 37..62, |
-digits, upper-case letters, and lower-case letters (in that significance order) |
-are used. |
- |
-Up to @var{n_digits} digits will be printed from the mantissa, except that no more |
-digits than are accurately representable by @var{op} will be printed. |
-@var{n_digits} can be 0 to select that accurate maximum. |
-@end deftypefun |
- |
-@deftypefun size_t mpf_inp_str (mpf_t @var{rop}, FILE *@var{stream}, int @var{base}) |
-Read a string in base @var{base} from @var{stream}, and put the read float in |
-@var{rop}. The string is of the form @samp{M@@N} or, if the base is 10 or |
-less, alternatively @samp{MeN}. @samp{M} is the mantissa and @samp{N} is the |
-exponent. The mantissa is always in the specified base. The exponent is |
-either in the specified base or, if @var{base} is negative, in decimal. The |
-decimal point expected is taken from the current locale, on systems providing |
-@code{localeconv}. |
- |
-The argument @var{base} may be in the ranges 2 to 36, or @minus{}36 to |
-@minus{}2. Negative values are used to specify that the exponent is in |
-decimal. |
- |
-Unlike the corresponding @code{mpz} function, the base will not be determined |
-from the leading characters of the string if @var{base} is 0. This is so that |
-numbers like @samp{0.23} are not interpreted as octal. |
- |
-Return the number of bytes read, or if an error occurred, return 0. |
-@end deftypefun |
- |
-@c @deftypefun void mpf_out_raw (FILE *@var{stream}, mpf_t @var{float}) |
-@c Output @var{float} on stdio stream @var{stream}, in raw binary |
-@c format. The float is written in a portable format, with 4 bytes of |
-@c size information, and that many bytes of limbs. Both the size and the |
-@c limbs are written in decreasing significance order. |
-@c @end deftypefun |
- |
-@c @deftypefun void mpf_inp_raw (mpf_t @var{float}, FILE *@var{stream}) |
-@c Input from stdio stream @var{stream} in the format written by |
-@c @code{mpf_out_raw}, and put the result in @var{float}. |
-@c @end deftypefun |
- |
- |
-@node Miscellaneous Float Functions, , I/O of Floats, Floating-point Functions |
-@comment node-name, next, previous, up |
-@section Miscellaneous Functions |
-@cindex Miscellaneous float functions |
-@cindex Float miscellaneous functions |
- |
-@deftypefun void mpf_ceil (mpf_t @var{rop}, mpf_t @var{op}) |
-@deftypefunx void mpf_floor (mpf_t @var{rop}, mpf_t @var{op}) |
-@deftypefunx void mpf_trunc (mpf_t @var{rop}, mpf_t @var{op}) |
-@cindex Rounding functions |
-@cindex Float rounding functions |
-Set @var{rop} to @var{op} rounded to an integer. @code{mpf_ceil} rounds to the |
-next higher integer, @code{mpf_floor} to the next lower, and @code{mpf_trunc} |
-to the integer towards zero. |
-@end deftypefun |
- |
-@deftypefun int mpf_integer_p (mpf_t @var{op}) |
-Return non-zero if @var{op} is an integer. |
-@end deftypefun |
- |
-@deftypefun int mpf_fits_ulong_p (mpf_t @var{op}) |
-@deftypefunx int mpf_fits_slong_p (mpf_t @var{op}) |
-@deftypefunx int mpf_fits_uint_p (mpf_t @var{op}) |
-@deftypefunx int mpf_fits_sint_p (mpf_t @var{op}) |
-@deftypefunx int mpf_fits_ushort_p (mpf_t @var{op}) |
-@deftypefunx int mpf_fits_sshort_p (mpf_t @var{op}) |
-Return non-zero if @var{op} would fit in the respective C data type, when |
-truncated to an integer. |
-@end deftypefun |
- |
-@deftypefun void mpf_urandomb (mpf_t @var{rop}, gmp_randstate_t @var{state}, unsigned long int @var{nbits}) |
-@cindex Random number functions |
-@cindex Float random number functions |
-Generate a uniformly distributed random float in @var{rop}, such that @math{0 |
-@le{} @var{rop} < 1}, with @var{nbits} significant bits in the mantissa. |
- |
-The variable @var{state} must be initialized by calling one of the |
-@code{gmp_randinit} functions (@ref{Random State Initialization}) before |
-invoking this function. |
-@end deftypefun |
- |
-@deftypefun void mpf_random2 (mpf_t @var{rop}, mp_size_t @var{max_size}, mp_exp_t @var{exp}) |
-Generate a random float of at most @var{max_size} limbs, with long strings of |
-zeros and ones in the binary representation. The exponent of the number is in |
-the interval @minus{}@var{exp} to @var{exp} (in limbs). This function is |
-useful for testing functions and algorithms, since this kind of random |
-number has proven to be more likely to trigger corner-case bugs. Negative |
-random numbers are generated when @var{max_size} is negative. |
-@end deftypefun |
- |
-@c @deftypefun size_t mpf_size (mpf_t @var{op}) |
-@c Return the size of @var{op} measured in number of limbs. If @var{op} is |
-@c zero, the returned value will be zero. (@xref{Nomenclature}, for an |
-@c explanation of the concept @dfn{limb}.) |
-@c |
-@c @strong{This function is obsolete. It will disappear from future GMP |
-@c releases.} |
-@c @end deftypefun |
- |
- |
-@node Low-level Functions, Random Number Functions, Floating-point Functions, Top |
-@comment node-name, next, previous, up |
-@chapter Low-level Functions |
-@cindex Low-level functions |
- |
-This chapter describes low-level GMP functions, used to implement the |
-high-level GMP functions, but also intended for time-critical user code. |
- |
-These functions start with the prefix @code{mpn_}. |
- |
-@c 1. Some of these functions clobber input operands. |
-@c |
- |
-The @code{mpn} functions are designed to be as fast as possible, @strong{not} |
-to provide a coherent calling interface. The different functions have somewhat |
-similar interfaces, but there are variations that make them hard to use. These |
-functions do as little as possible apart from the real multiple precision |
-computation, so that no time is spent on things that not all callers need. |
- |
-A source operand is specified by a pointer to the least significant limb and a |
-limb count. A destination operand is specified by just a pointer. It is the |
-responsibility of the caller to ensure that the destination has enough space |
-for storing the result. |
- |
-With this way of specifying operands, it is possible to perform computations on |
-subranges of an argument, and store the result into a subrange of a |
-destination. |
- |
-A common requirement for all functions is that each source area needs at least |
-one limb. No size argument may be zero. Unless otherwise stated, in-place |
-operations are allowed where source and destination are the same, but not where |
-they only partly overlap. |
- |
-The @code{mpn} functions are the base for the implementation of the |
-@code{mpz_}, @code{mpf_}, and @code{mpq_} functions. |
- |
-This example adds the number beginning at @var{s1p} and the number beginning at |
-@var{s2p} and writes the sum at @var{destp}. All areas have @var{n} limbs. |
- |
-@example |
-cy = mpn_add_n (destp, s1p, s2p, n); |
-@end example |
- |
-It should be noted that the @code{mpn} functions make no attempt to identify |
-high or low zero limbs on their operands, or other special forms. On random |
-data such cases will be unlikely and it'd be wasteful for every function to |
-check every time. An application knowing something about its data can take |
-steps to trim or perhaps split its calculations. |
-@c |
-@c For reference, within gmp mpz_t operands never have high zero limbs, and |
-@c we rate low zero limbs as unlikely too (or something an application should |
-@c handle). This is a prime motivation for not stripping zero limbs in say |
-@c mpn_mul_n etc. |
-@c |
-@c Other applications doing variable-length calculations will quite likely do |
-@c something similar to mpz. And even if not then it's highly likely zero |
-@c limb stripping can be done at just a few judicious points, which will be |
-@c more efficient than having lots of mpn functions checking every time. |
- |
-@sp 1 |
-@noindent |
-In the notation used below, a source operand is identified by the pointer to |
-the least significant limb, and the limb count in braces. For example, |
-@{@var{s1p}, @var{s1n}@}. |
- |
-@deftypefun mp_limb_t mpn_add_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) |
-Add @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the @var{n} |
-least significant limbs of the result to @var{rp}. Return carry, either 0 or |
-1. |
- |
-This is the lowest-level function for addition. It is the preferred function |
-for addition, since it is written in assembly for most CPUs. For addition of |
-a variable to itself (i.e., @var{s1p} equals @var{s2p}) use @code{mpn_lshift} |
-with a count of 1 for optimal speed. |
-@end deftypefun |
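- |
-What @code{mpn_add_n} computes can be modelled in portable C. The sketch |
-below is only a model of the semantics, not GMP's implementation: the name |
-@code{model_add_n} is made up, and limbs are assumed to be 64 bits wide, |
-whereas the real @code{mp_limb_t} width depends on the build. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model only.  Adds two n-limb numbers stored least
   significant limb first and returns the carry out of the top limb,
   either 0 or 1.  */
static uint64_t
model_add_n (uint64_t *rp, const uint64_t *s1p, const uint64_t *s2p, size_t n)
{
  uint64_t carry = 0;
  size_t i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      uint64_t a = s1p[i], b = s2p[i];   /* read before any store */
      uint64_t sum = a + b;              /* wraps modulo 2^64 */
      uint64_t c1 = sum < a;             /* carry from a + b */
      uint64_t res = sum + carry;
      uint64_t c2 = res < sum;           /* carry from adding the carry-in */
      rp[i] = res;
      carry = c1 | c2;                   /* at most one of them is set */
    }
  return carry;
}
```

-Because each pair of source limbs is read before the result limb is stored, |
-the model also works in place, matching the rule that source and destination |
-may be identical. |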
- |
-@deftypefun mp_limb_t mpn_add_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb}) |
-Add @{@var{s1p}, @var{n}@} and @var{s2limb}, and write the @var{n} least |
-significant limbs of the result to @var{rp}. Return carry, either 0 or 1. |
-@end deftypefun |
- |
-@deftypefun mp_limb_t mpn_add (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n}) |
-Add @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the |
-@var{s1n} least significant limbs of the result to @var{rp}. Return carry, |
-either 0 or 1. |
- |
-This function requires that @var{s1n} is greater than or equal to @var{s2n}. |
-@end deftypefun |
- |
-@deftypefun mp_limb_t mpn_sub_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) |
-Subtract @{@var{s2p}, @var{n}@} from @{@var{s1p}, @var{n}@}, and write the |
-@var{n} least significant limbs of the result to @var{rp}. Return borrow, |
-either 0 or 1. |
- |
-This is the lowest-level function for subtraction. It is the preferred |
-function for subtraction, since it is written in assembly for most CPUs. |
-@end deftypefun |
- |
-@deftypefun mp_limb_t mpn_sub_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb}) |
-Subtract @var{s2limb} from @{@var{s1p}, @var{n}@}, and write the @var{n} least |
-significant limbs of the result to @var{rp}. Return borrow, either 0 or 1. |
-@end deftypefun |
- |
-@deftypefun mp_limb_t mpn_sub (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n}) |
-Subtract @{@var{s2p}, @var{s2n}@} from @{@var{s1p}, @var{s1n}@}, and write the |
-@var{s1n} least significant limbs of the result to @var{rp}. Return borrow, |
-either 0 or 1. |
- |
-This function requires that @var{s1n} is greater than or equal to |
-@var{s2n}. |
-@end deftypefun |
- |
-@deftypefun void mpn_mul_n (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) |
-Multiply @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@}, and write the |
-2*@var{n}-limb result to @var{rp}. |
- |
-The destination has to have space for 2*@var{n} limbs, even if the product's |
-most significant limb is zero. No overlap is permitted between the |
-destination and either source. |
-@end deftypefun |
- |
-@deftypefun mp_limb_t mpn_mul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb}) |
-Multiply @{@var{s1p}, @var{n}@} by @var{s2limb}, and write the @var{n} least |
-significant limbs of the product to @var{rp}. Return the most significant |
-limb of the product. @{@var{s1p}, @var{n}@} and @{@var{rp}, @var{n}@} are |
-allowed to overlap provided @math{@var{rp} @le{} @var{s1p}}. |
- |
-This is a low-level function that is a building block for general |
-multiplication as well as other operations in GMP@. It is written in assembly |
-for most CPUs. |
- |
-Don't call this function if @var{s2limb} is a power of 2; use @code{mpn_lshift} |
-with a count equal to the logarithm of @var{s2limb} instead, for optimal speed. |
-@end deftypefun |
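- |
-The way the high limb is carried through @code{mpn_mul_1} can be modelled as |
-follows. This is a sketch under stated assumptions, not GMP's code: the name |
-@code{model_mul_1} is invented, and 32-bit limbs are assumed so that a |
-double-limb product fits in a standard 64-bit integer. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model only, with 32-bit limbs.  Multiplies the n-limb
   number at s1p by s2limb, stores the n low limbs of the product at
   rp and returns the most significant limb.  */
static uint32_t
model_mul_1 (uint32_t *rp, const uint32_t *s1p, size_t n, uint32_t s2limb)
{
  uint32_t high = 0;
  size_t i;

  assert (n >= 1);
  for (i = 0; i < n; i++)
    {
      /* (2^32-1)^2 + (2^32-1) is below 2^64, so this cannot overflow.  */
      uint64_t prod = (uint64_t) s1p[i] * s2limb + high;
      rp[i] = (uint32_t) prod;            /* low limb of the partial product */
      high = (uint32_t) (prod >> 32);     /* carried into the next limb */
    }
  return high;
}
```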
- |
-@deftypefun mp_limb_t mpn_addmul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb}) |
-Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and add the @var{n} least |
-significant limbs of the product to @{@var{rp}, @var{n}@} and write the result |
-to @var{rp}. Return the most significant limb of the product, plus carry-out |
-from the addition. |
- |
-This is a low-level function that is a building block for general |
-multiplication as well as other operations in GMP@. It is written in assembly |
-for most CPUs. |
-@end deftypefun |
- |
-@deftypefun mp_limb_t mpn_submul_1 (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{n}, mp_limb_t @var{s2limb}) |
-Multiply @{@var{s1p}, @var{n}@} and @var{s2limb}, and subtract the @var{n} |
-least significant limbs of the product from @{@var{rp}, @var{n}@} and write the |
-result to @var{rp}. Return the most significant limb of the product, plus |
-borrow-out from the subtraction. |
- |
-This is a low-level function that is a building block for general |
-multiplication and division as well as other operations in GMP@. It is written |
-in assembly for most CPUs. |
-@end deftypefun |
- |
-@deftypefun mp_limb_t mpn_mul (mp_limb_t *@var{rp}, const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n}) |
-Multiply @{@var{s1p}, @var{s1n}@} and @{@var{s2p}, @var{s2n}@}, and write the |
-result to @var{rp}. Return the most significant limb of the result. |
- |
-The destination has to have space for @var{s1n} + @var{s2n} limbs, even if the |
-result might be one limb smaller. |
- |
-This function requires that @var{s1n} is greater than or equal to |
-@var{s2n}. The destination must be distinct from both input operands. |
-@end deftypefun |
- |
-@deftypefun void mpn_tdiv_qr (mp_limb_t *@var{qp}, mp_limb_t *@var{rp}, mp_size_t @var{qxn}, const mp_limb_t *@var{np}, mp_size_t @var{nn}, const mp_limb_t *@var{dp}, mp_size_t @var{dn}) |
-Divide @{@var{np}, @var{nn}@} by @{@var{dp}, @var{dn}@} and put the quotient |
-at @{@var{qp}, @var{nn}@minus{}@var{dn}+1@} and the remainder at @{@var{rp}, |
-@var{dn}@}. The quotient is rounded towards 0. |
- |
-No overlap is permitted between arguments. @var{nn} must be greater than or |
-equal to @var{dn}. The most significant limb of @var{dp} must be non-zero. |
-The @var{qxn} operand must be zero. |
-@comment FIXME: Relax overlap requirements! |
-@end deftypefun |
- |
-@deftypefun mp_limb_t mpn_divrem (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n}) |
-[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best |
-performance.] |
- |
-Divide @{@var{rs2p}, @var{rs2n}@} by @{@var{s3p}, @var{s3n}@}, and write the |
-quotient at @var{r1p}, with the exception of the most significant limb, which |
-is returned. The remainder replaces the dividend at @var{rs2p}; it will be |
-@var{s3n} limbs long (i.e., as many limbs as the divisor). |
- |
-In addition to an integer quotient, @var{qxn} fraction limbs are developed, and |
-stored after the integral limbs. For most usages, @var{qxn} will be zero. |
- |
-It is required that @var{rs2n} is greater than or equal to @var{s3n}. It is |
-required that the most significant bit of the divisor is set. |
- |
-If the quotient is not needed, pass @var{rs2p} + @var{s3n} as @var{r1p}. Aside |
-from that special case, no overlap between arguments is permitted. |
- |
-Return the most significant limb of the quotient, either 0 or 1. |
- |
-The area at @var{r1p} needs to be @var{rs2n} @minus{} @var{s3n} + @var{qxn} |
-limbs large. |
-@end deftypefun |
- |
-@deftypefn Function mp_limb_t mpn_divrem_1 (mp_limb_t *@var{r1p}, mp_size_t @var{qxn}, @w{mp_limb_t *@var{s2p}}, mp_size_t @var{s2n}, mp_limb_t @var{s3limb}) |
-@deftypefnx Macro mp_limb_t mpn_divmod_1 (mp_limb_t *@var{r1p}, mp_limb_t *@var{s2p}, @w{mp_size_t @var{s2n}}, @w{mp_limb_t @var{s3limb}}) |
-Divide @{@var{s2p}, @var{s2n}@} by @var{s3limb}, and write the quotient at |
-@var{r1p}. Return the remainder. |
- |
-The integer quotient is written to @{@var{r1p}+@var{qxn}, @var{s2n}@} and in |
-addition @var{qxn} fraction limbs are developed and written to @{@var{r1p}, |
-@var{qxn}@}. Either or both @var{s2n} and @var{qxn} can be zero. For most |
-usages, @var{qxn} will be zero. |
- |
-@code{mpn_divmod_1} exists for upward source compatibility and is simply a |
-macro calling @code{mpn_divrem_1} with a @var{qxn} of 0. |
- |
-The areas at @var{r1p} and @var{s2p} have to be identical or completely |
-separate, not partially overlapping. |
-@end deftypefn |
- |
-@deftypefun mp_limb_t mpn_divmod (mp_limb_t *@var{r1p}, mp_limb_t *@var{rs2p}, mp_size_t @var{rs2n}, const mp_limb_t *@var{s3p}, mp_size_t @var{s3n}) |
-[This function is obsolete. Please call @code{mpn_tdiv_qr} instead for best |
-performance.] |
-@end deftypefun |
- |
-@deftypefn Macro mp_limb_t mpn_divexact_by3 (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}) |
-@deftypefnx Function mp_limb_t mpn_divexact_by3c (mp_limb_t *@var{rp}, mp_limb_t *@var{sp}, @w{mp_size_t @var{n}}, mp_limb_t @var{carry}) |
-Divide @{@var{sp}, @var{n}@} by 3, expecting it to divide exactly, and writing |
-the result to @{@var{rp}, @var{n}@}. If 3 divides exactly, the return value is |
-zero and the result is the quotient. If not, the return value is non-zero and |
-the result won't be anything useful. |
- |
-@code{mpn_divexact_by3c} takes an initial carry parameter, which can be the |
-return value from a previous call, so a large calculation can be done piece by |
-piece from low to high. @code{mpn_divexact_by3} is simply a macro calling |
-@code{mpn_divexact_by3c} with a 0 carry parameter. |
- |
-These routines use a multiply-by-inverse and will be faster than |
-@code{mpn_divrem_1} on CPUs with fast multiplication but slow division. |
- |
-The source @math{a}, result @math{q}, size @math{n}, initial carry @math{i}, |
-and return value @math{c} satisfy @m{cb^n+a-i=3q, c*b^n + a-i = 3*q}, where |
-@m{b=2\GMPraise{@code{GMP\_NUMB\_BITS}}, b=2^GMP_NUMB_BITS}. The |
-return @math{c} is always 0, 1 or 2, and the initial carry @math{i} must also |
-be 0, 1 or 2 (these are both borrows really). When @math{c=0} clearly |
-@math{q=(a-i)/3}. When @m{c \neq 0, c!=0}, the remainder @math{(a-i) @bmod{} |
-3} is given by @math{3-c}, because @math{b @equiv{} 1 @bmod{} 3} (when |
-@code{mp_bits_per_limb} is even, which is always so currently). |
-@end deftypefn |
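- |
-The relation between source, carry, quotient and return value can be checked |
-on a small model. The sketch below assumes 8-bit limbs, so @math{b=256} and |
-the inverse of 3 modulo @math{b} is 171. It illustrates a |
-multiply-by-inverse division under those assumptions and is not taken from |
-GMP's source; @code{model_divexact_by3c} is a made-up name. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model with 8-bit limbs (b = 256).  */
#define MODEL_INVERSE_3  171u   /* 3 * 171 = 513 = 1 mod 256 */

/* Divides the n-limb number at sp by 3 with initial carry 0, 1 or 2,
   stores the result at rp and returns c such that
   c*b^n + a - carry = 3*q.  */
static unsigned
model_divexact_by3c (uint8_t *rp, const uint8_t *sp, size_t n, unsigned carry)
{
  unsigned c = carry;
  size_t j;

  assert (n >= 1 && carry <= 2);
  for (j = 0; j < n; j++)
    {
      unsigned s = sp[j];
      unsigned l = (s - c) & 0xffu;                 /* subtract borrow-in mod b */
      unsigned q = (l * MODEL_INVERSE_3) & 0xffu;   /* 3*q = l mod b */

      c = (s < c);         /* borrow out of the subtraction */
      rp[j] = (uint8_t) q;
      c += (q >= 86);      /* 3*q >= b: one more to borrow from the next limb */
      c += (q >= 171);     /* 3*q >= 2*b: and another */
    }
  return c;
}
```

-For the one-limb source 7 with a zero initial carry, the model returns |
-@math{c=2}, and @math{3-c=1} is indeed the remainder of 7 divided by 3. |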
- |
-@deftypefun mp_limb_t mpn_mod_1 (mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb}) |
-Divide @{@var{s1p}, @var{s1n}@} by @var{s2limb}, and return the remainder. |
-@var{s1n} can be zero. |
-@end deftypefun |
- |
-@deftypefun mp_limb_t mpn_bdivmod (mp_limb_t *@var{rp}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, const mp_limb_t *@var{s2p}, mp_size_t @var{s2n}, unsigned long int @var{d}) |
-This function puts the low |
-@math{@GMPfloor{@var{d}/@nicode{mp\_bits\_per\_limb}}} limbs of @var{q} = |
-@{@var{s1p}, @var{s1n}@}/@{@var{s2p}, @var{s2n}@} mod @m{2^d,2^@var{d}} at |
-@var{rp}, and returns the high @var{d} mod @code{mp_bits_per_limb} bits of |
-@var{q}. |
- |
-@{@var{s1p}, @var{s1n}@} - @var{q} * @{@var{s2p}, @var{s2n}@} mod @m{2 |
-\GMPraise{@var{s1n}*@code{mp\_bits\_per\_limb}}, |
-2^(@var{s1n}*@nicode{mp\_bits\_per\_limb})} is placed at @var{s1p}. Since the |
-low @math{@GMPfloor{@var{d}/@nicode{mp\_bits\_per\_limb}}} limbs of this |
-difference are zero, it is possible to overwrite the low limbs at @var{s1p} |
-with this difference, provided @math{@var{rp} @le{} @var{s1p}}. |
- |
-This function requires that @math{@var{s1n} * @nicode{mp\_bits\_per\_limb} |
-@ge{} @var{d}}, and that @{@var{s2p}, @var{s2n}@} is odd. |
- |
-@strong{This interface is preliminary. It might change incompatibly in future |
-revisions.} |
-@end deftypefun |
- |
-@deftypefun mp_limb_t mpn_lshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count}) |
-Shift @{@var{sp}, @var{n}@} left by @var{count} bits, and write the result to |
-@{@var{rp}, @var{n}@}. The bits shifted out at the left are returned in the |
-least significant @var{count} bits of the return value (the rest of the return |
-value is zero). |
- |
-@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The |
-regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided |
-@math{@var{rp} @ge{} @var{sp}}. |
- |
-This function is written in assembly for most CPUs. |
-@end deftypefun |
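The return value and overlap rule of @code{mpn_lshift} can be illustrated with a plain C model. This is a sketch under stated assumptions: 32-bit limbs, no nails, and @math{@var{n} @ge{} 1}; the name @code{ex_lshift} is hypothetical and the code is not GMP's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit-limb model of a multi-limb left shift. */
uint32_t
ex_lshift (uint32_t *rp, const uint32_t *sp, size_t n, unsigned count)
{
  uint32_t out;
  size_t i;
  assert (n >= 1 && count >= 1 && count <= 31);
  out = sp[n - 1] >> (32 - count);  /* bits shifted out at the top */
  /* Work from the high limb down, so overlap with rp >= sp is safe:
     every source limb is read before it can be overwritten. */
  for (i = n - 1; i > 0; i--)
    rp[i] = (sp[i] << count) | (sp[i - 1] >> (32 - count));
  rp[0] = sp[0] << count;
  return out;
}
```

A right shift would mirror this, walking from the low limb upwards, which is why its overlap condition is reversed.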
- |
-@deftypefun mp_limb_t mpn_rshift (mp_limb_t *@var{rp}, const mp_limb_t *@var{sp}, mp_size_t @var{n}, unsigned int @var{count}) |
-Shift @{@var{sp}, @var{n}@} right by @var{count} bits, and write the result to |
-@{@var{rp}, @var{n}@}. The bits shifted out at the right are returned in the |
-most significant @var{count} bits of the return value (the rest of the return |
-value is zero). |
- |
-@var{count} must be in the range 1 to @nicode{mp_bits_per_limb}@minus{}1. The |
-regions @{@var{sp}, @var{n}@} and @{@var{rp}, @var{n}@} may overlap, provided |
-@math{@var{rp} @le{} @var{sp}}. |
- |
-This function is written in assembly for most CPUs. |
-@end deftypefun |
- |
-@deftypefun int mpn_cmp (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) |
-Compare @{@var{s1p}, @var{n}@} and @{@var{s2p}, @var{n}@} and return a |
-positive value if @math{@var{s1} > @var{s2}}, 0 if they are equal, or a |
-negative value if @math{@var{s1} < @var{s2}}. |
-@end deftypefun |
- |
-@deftypefun mp_size_t mpn_gcd (mp_limb_t *@var{rp}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t *@var{s2p}, mp_size_t @var{s2n}) |
-Set @{@var{rp}, @var{retval}@} to the greatest common divisor of @{@var{s1p}, |
-@var{s1n}@} and @{@var{s2p}, @var{s2n}@}. The result can be up to @var{s2n} |
-limbs; the return value is the actual number produced. Both source operands |
-are destroyed. |
- |
-@{@var{s1p}, @var{s1n}@} must have at least as many bits as @{@var{s2p}, |
-@var{s2n}@}. @{@var{s2p}, @var{s2n}@} must be odd. Both operands must have |
-non-zero most significant limbs. No overlap is permitted between @{@var{s1p}, |
-@var{s1n}@} and @{@var{s2p}, @var{s2n}@}. |
-@end deftypefun |
- |
-@deftypefun mp_limb_t mpn_gcd_1 (const mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t @var{s2limb}) |
-Return the greatest common divisor of @{@var{s1p}, @var{s1n}@} and |
-@var{s2limb}. Both operands must be non-zero. |
-@end deftypefun |
- |
-@deftypefun mp_size_t mpn_gcdext (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, mp_size_t *@var{r2n}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}, mp_limb_t *@var{s2p}, mp_size_t @var{s2n}) |
-Calculate the greatest common divisor of @{@var{s1p}, @var{s1n}@} and |
-@{@var{s2p}, @var{s2n}@}. Store the gcd at @{@var{r1p}, @var{retval}@} and |
-the first cofactor at @{@var{r2p}, *@var{r2n}@}, with *@var{r2n} negative if |
-the cofactor is negative. @var{r1p} and @var{r2p} should each have room for |
-@math{@var{s1n}+1} limbs, but the return value and value stored through |
-@var{r2n} indicate the actual number produced. |
- |
-@math{@{@var{s1p}, @var{s1n}@} @ge{} @{@var{s2p}, @var{s2n}@}} is required, |
-and both must be non-zero. The regions @{@var{s1p}, @math{@var{s1n}+1}@} and |
-@{@var{s2p}, @math{@var{s2n}+1}@} are destroyed (i.e.@: the operands plus an |
-extra limb past the end of each). |
- |
-The cofactor @var{r2} will satisfy @m{r_2 s_1 + k s_2 = r_1, @var{r2}*@var{s1} |
-+ @var{k}*@var{s2} = @var{r1}}. The second cofactor @var{k} is not calculated |
-but can easily be obtained from @m{(r_1 - r_2 s_1) / s_2, (@var{r1} - |
-@var{r2}*@var{s1}) / @var{s2}} (this division will be exact). |
-@end deftypefun |
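The recovery of the second cofactor can be tried out on small numbers. The sketch below uses plain @code{long} values and a textbook extended Euclid in place of @code{mpn_gcdext}; the names @code{ex_gcdext} and @code{ex_second_cofactor} are hypothetical, and the particular first cofactor GMP returns may differ from the one computed here.

```c
#include <assert.h>

/* Extended Euclid on plain longs, for illustration only.  Returns
   g = gcd(s1, s2) and stores x in *cofactor, where x*s1 + k*s2 == g
   for some integer k that is not computed here. */
long
ex_gcdext (long s1, long s2, long *cofactor)
{
  long old_r = s1, r = s2, old_x = 1, x = 0;
  assert (s1 >= s2 && s2 > 0);
  while (r != 0)
    {
      long q = old_r / r, t;
      t = old_r - q * r;  old_r = r;  r = t;
      t = old_x - q * x;  old_x = x;  x = t;
    }
  *cofactor = old_x;
  return old_r;
}

/* Recover the second cofactor k from g, x, s1 and s2; the division is exact. */
long
ex_second_cofactor (long s1, long s2, long g, long x)
{
  return (g - x * s1) / s2;
}
```

For 240 and 46 the sketch gives a gcd of 2 with first cofactor @minus{}9, and the recovered second cofactor is 47, since @math{-9*240 + 47*46 = 2}.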
- |
-@deftypefun mp_size_t mpn_sqrtrem (mp_limb_t *@var{r1p}, mp_limb_t *@var{r2p}, const mp_limb_t *@var{sp}, mp_size_t @var{n}) |
-Compute the square root of @{@var{sp}, @var{n}@} and put the result at |
-@{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and the remainder at @{@var{r2p}, |
-@var{retval}@}. @var{r2p} needs space for @var{n} limbs, but the return value |
-indicates how many are produced. |
- |
-The most significant limb of @{@var{sp}, @var{n}@} must be non-zero. The |
-areas @{@var{r1p}, @math{@GMPceil{@var{n}/2}}@} and @{@var{sp}, @var{n}@} must |
-be completely separate. The areas @{@var{r2p}, @var{n}@} and @{@var{sp}, |
-@var{n}@} must be either identical or completely separate. |
- |
-If the remainder is not wanted then @var{r2p} can be @code{NULL}, and in this |
-case the return value is zero or non-zero according to whether the remainder |
-would have been zero or non-zero. |
- |
-A return value of zero indicates a perfect square. See also |
-@code{mpz_perfect_square_p}. |
-@end deftypefun |
- |
-@deftypefun mp_size_t mpn_get_str (unsigned char *@var{str}, int @var{base}, mp_limb_t *@var{s1p}, mp_size_t @var{s1n}) |
-Convert @{@var{s1p}, @var{s1n}@} to a raw unsigned char array at @var{str} in |
-base @var{base}, and return the number of characters produced. There may be |
-leading zeros in the string. The string is not in ASCII; to convert it to |
-printable format, add the ASCII codes for @samp{0} or @samp{A}, depending on |
-the base and range. @var{base} can vary from 2 to 256. |
- |
-The most significant limb of the input @{@var{s1p}, @var{s1n}@} must be |
-non-zero. The input @{@var{s1p}, @var{s1n}@} is clobbered, except when |
-@var{base} is a power of 2, in which case it's unchanged. |
- |
-The area at @var{str} has to have space for the largest possible number |
-represented by a @var{s1n} long limb array, plus one extra character. |
-@end deftypefun |
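Converting the raw digit values to printable characters, as described above, might look like the following. It is a sketch that assumes an ASCII-compatible character set and a base of at most 36 (larger bases have no conventional digit characters); @code{ex_digits_to_ascii} is a hypothetical helper, not a GMP function.

```c
#include <assert.h>
#include <stddef.h>

/* Turn raw digit values (0 .. base-1) into printable characters in place.
   Digits 0-9 map to '0'-'9', digits 10-35 map to 'A'-'Z'. */
void
ex_digits_to_ascii (unsigned char *str, size_t len, int base)
{
  size_t i;
  assert (base >= 2 && base <= 36);
  for (i = 0; i < len; i++)
    {
      unsigned char d = str[i];
      assert (d < base);
      if (d < 10)
        str[i] = (unsigned char) ('0' + d);
      else
        str[i] = (unsigned char) ('A' + (d - 10));
    }
}
```

The helper does not add a terminating null, so the caller must append one before treating the buffer as a C string.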
- |
-@deftypefun mp_size_t mpn_set_str (mp_limb_t *@var{rp}, const unsigned char *@var{str}, size_t @var{strsize}, int @var{base}) |
-Convert bytes @{@var{str},@var{strsize}@} in the given @var{base} to limbs at |
-@var{rp}. |
- |
-@math{@var{str}[0]} is the most significant byte and |
-@math{@var{str}[@var{strsize}-1]} is the least significant. Each byte should |
-be a value in the range 0 to @math{@var{base}-1}, not an ASCII character. |
-@var{base} can vary from 2 to 256. |
- |
-The return value is the number of limbs written to @var{rp}. If the most |
-significant input byte is non-zero then the high limb at @var{rp} will be |
-non-zero, and only that exact number of limbs will be required there. |
- |
-If the most significant input byte is zero then there may be high zero limbs |
-written to @var{rp} and included in the return value. |
- |
-@var{strsize} must be at least 1, and no overlap is permitted between |
-@{@var{str},@var{strsize}@} and the result at @var{rp}. |
-@end deftypefun |
- |
-@deftypefun {unsigned long int} mpn_scan0 (const mp_limb_t *@var{s1p}, unsigned long int @var{bit}) |
-Scan @var{s1p} from bit position @var{bit} for the next clear bit. |
- |
-It is required that there be a clear bit within the area at @var{s1p} at or |
-beyond bit position @var{bit}, so that the function has something to return. |
-@end deftypefun |
- |
-@deftypefun {unsigned long int} mpn_scan1 (const mp_limb_t *@var{s1p}, unsigned long int @var{bit}) |
-Scan @var{s1p} from bit position @var{bit} for the next set bit. |
- |
-It is required that there be a set bit within the area at @var{s1p} at or |
-beyond bit position @var{bit}, so that the function has something to return. |
-@end deftypefun |
- |
-@deftypefun void mpn_random (mp_limb_t *@var{r1p}, mp_size_t @var{r1n}) |
-@deftypefunx void mpn_random2 (mp_limb_t *@var{r1p}, mp_size_t @var{r1n}) |
-Generate a random number of length @var{r1n} and store it at @var{r1p}. The |
-most significant limb is always non-zero. @code{mpn_random} generates |
-uniformly distributed limb data; @code{mpn_random2} generates long strings of |
-zeros and ones in the binary representation. |
- |
-@code{mpn_random2} is intended for testing the correctness of the @code{mpn} |
-routines. |
-@end deftypefun |
- |
-@deftypefun {unsigned long int} mpn_popcount (const mp_limb_t *@var{s1p}, mp_size_t @var{n}) |
-Count the number of set bits in @{@var{s1p}, @var{n}@}. |
-@end deftypefun |
- |
-@deftypefun {unsigned long int} mpn_hamdist (const mp_limb_t *@var{s1p}, const mp_limb_t *@var{s2p}, mp_size_t @var{n}) |
-Compute the Hamming distance between @{@var{s1p}, @var{n}@} and @{@var{s2p}, |
-@var{n}@}, which is the number of bit positions where the two operands have |
-different bit values. |
-@end deftypefun |
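The Hamming distance is the population count of the exclusive-or of the two operands. A minimal sketch of that relationship, assuming 32-bit limbs without nails and using hypothetical @code{ex_} names rather than GMP's routines:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the set bits in one limb; each pass clears the lowest set bit. */
static unsigned long
ex_popcount_limb (uint32_t x)
{
  unsigned long c = 0;
  while (x != 0)
    {
      x &= x - 1;
      c++;
    }
  return c;
}

/* Hamming distance: number of bit positions where the operands differ. */
unsigned long
ex_hamdist (const uint32_t *s1p, const uint32_t *s2p, size_t n)
{
  unsigned long total = 0;
  size_t i;
  assert (s1p != NULL && s2p != NULL);
  for (i = 0; i < n; i++)
    total += ex_popcount_limb (s1p[i] ^ s2p[i]);
  return total;
}
```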
- |
-@deftypefun int mpn_perfect_square_p (const mp_limb_t *@var{s1p}, mp_size_t @var{n}) |
-Return non-zero iff @{@var{s1p}, @var{n}@} is a perfect square. |
-@end deftypefun |
- |
- |
-@sp 1 |
-@section Nails |
-@cindex Nails |
- |
-@strong{Everything in this section is highly experimental and may disappear or |
-be subject to incompatible changes in a future version of GMP.} |
- |
-Nails are an experimental feature whereby a few bits are left unused at the |
-top of each @code{mp_limb_t}. This can significantly improve carry handling |
-on some processors. |
- |
-All the @code{mpn} functions accepting limb data will expect the nail bits to |
-be zero on entry, and will return data with the nails similarly all zero. |
-This applies both to limb vectors and to single limb arguments. |
- |
-Nails can be enabled by configuring with @samp{--enable-nails}. By default |
-the number of bits will be chosen according to what suits the host processor, |
-but a particular number can be selected with @samp{--enable-nails=N}. |
- |
-At the mpn level, a nail build is neither source nor binary compatible with a |
-non-nail build, strictly speaking. But programs acting on limbs only through |
-the mpn functions are likely to work equally well with either build, and |
-judicious use of the definitions below should make any program compatible with |
-either build, at the source level. |
- |
-For the higher level routines, meaning @code{mpz} etc, a nail build should be |
-fully source and binary compatible with a non-nail build. |
- |
-@defmac GMP_NAIL_BITS |
-@defmacx GMP_NUMB_BITS |
-@defmacx GMP_LIMB_BITS |
-@code{GMP_NAIL_BITS} is the number of nail bits, or 0 when nails are not in |
-use. @code{GMP_NUMB_BITS} is the number of data bits in a limb. |
-@code{GMP_LIMB_BITS} is the total number of bits in an @code{mp_limb_t}. In |
-all cases |
- |
-@example |
-GMP_LIMB_BITS == GMP_NAIL_BITS + GMP_NUMB_BITS |
-@end example |
-@end defmac |
- |
-@defmac GMP_NAIL_MASK |
-@defmacx GMP_NUMB_MASK |
-Bit masks for the nail and number parts of a limb. @code{GMP_NAIL_MASK} is 0 |
-when nails are not in use. |
- |
-@code{GMP_NAIL_MASK} is not often needed, since the nail part can be obtained |
-with @code{x >> GMP_NUMB_BITS}, and that means one less large constant, which |
-can help various RISC chips. |
-@end defmac |
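To show how the number mask and the shift for the nail part fit together, here is a sketch of a single-limb addition in a nail build. The configuration is assumed purely for illustration (a 32-bit limb with 4 nail bits), and the @code{EX_} macros only mirror the GMP names; they are not GMP's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical nail configuration: 32-bit limb, 4 nail bits, 28 data bits. */
#define EX_LIMB_BITS 32
#define EX_NAIL_BITS 4
#define EX_NUMB_BITS (EX_LIMB_BITS - EX_NAIL_BITS)
#define EX_NUMB_MASK ((UINT32_C(1) << EX_NUMB_BITS) - 1)

/* Add two limbs whose nails are zero.  The data bits of the sum go to *sum
   and the carry is returned. */
uint32_t
ex_add_limb (uint32_t *sum, uint32_t x, uint32_t y)
{
  uint32_t t;
  assert ((x >> EX_NUMB_BITS) == 0 && (y >> EX_NUMB_BITS) == 0);
  t = x + y;                 /* cannot wrap: the nail absorbs the carry */
  *sum = t & EX_NUMB_MASK;   /* keep the number part, nails back to zero */
  return t >> EX_NUMB_BITS;  /* nail part obtained by a shift, no nail mask */
}
```

Because the carry lands in the nail rather than being lost, no separate overflow test is needed, which is the carry-handling benefit the section describes.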
- |
-@defmac GMP_NUMB_MAX |
-The maximum value that can be stored in the number part of a limb. This is |
-the same as @code{GMP_NUMB_MASK}, but can be used for clarity when doing |
-comparisons rather than bit-wise operations. |
-@end defmac |
- |
-The term ``nails'' comes from finger or toe nails, which are at the ends of a |
-limb (arm or leg). ``numb'' is short for number, but is also how the |
-developers felt after trying for a long time to come up with sensible names |
-for these things. |
- |
-In the future (the distant future most likely) a non-zero nail might be |
-permitted, giving non-unique representations for numbers in a limb vector. |
-This would help vector processors since carries would only ever need to |
-propagate one or two limbs. |
- |
- |
-@node Random Number Functions, Formatted Output, Low-level Functions, Top |
-@chapter Random Number Functions |
-@cindex Random number functions |
- |
-Sequences of pseudo-random numbers in GMP are generated using a variable of |
-type @code{gmp_randstate_t}, which holds an algorithm selection and a current |
-state. Such a variable must be initialized by a call to one of the |
-@code{gmp_randinit} functions, and can be seeded with one of the |
-@code{gmp_randseed} functions. |
- |
-The functions actually generating random numbers are described in @ref{Integer |
-Random Numbers}, and @ref{Miscellaneous Float Functions}. |
- |
-The older style random number functions don't accept a @code{gmp_randstate_t} |
-parameter but instead share a global variable of that type. They use a |
-default algorithm and are currently not seeded (though perhaps that will |
-change in the future). The new functions accepting a @code{gmp_randstate_t} |
-are recommended for applications that care about randomness. |
- |
-@menu |
-* Random State Initialization:: |
-* Random State Seeding:: |
-* Random State Miscellaneous:: |
-@end menu |
- |
-@node Random State Initialization, Random State Seeding, Random Number Functions, Random Number Functions |
-@section Random State Initialization |
-@cindex Random number state |
-@cindex Initialization functions |
- |
-@deftypefun void gmp_randinit_default (gmp_randstate_t @var{state}) |
-Initialize @var{state} with a default algorithm. This will be a compromise |
-between speed and randomness, and is recommended for applications with no |
-special requirements. Currently this is @code{gmp_randinit_mt}. |
-@end deftypefun |
- |
-@deftypefun void gmp_randinit_mt (gmp_randstate_t @var{state}) |
-@cindex Mersenne twister random numbers |
-Initialize @var{state} for a Mersenne Twister algorithm. This algorithm is |
-fast and has good randomness properties. |
-@end deftypefun |
- |
-@deftypefun void gmp_randinit_lc_2exp (gmp_randstate_t @var{state}, mpz_t @var{a}, @w{unsigned long @var{c}}, @w{unsigned long @var{m2exp}}) |
-@cindex Linear congruential random numbers |
-Initialize @var{state} with a linear congruential algorithm @m{X = (@var{a}X + |
-@var{c}) @bmod 2^{m2exp}, X = (@var{a}*X + @var{c}) mod 2^@var{m2exp}}. |
- |
-The low bits of @math{X} in this algorithm are not very random. The least |
-significant bit will have a period no more than 2, and the second bit no more |
-than 4, etc. For this reason only the high half of each @math{X} is actually |
-used. |
- |
-When a random number of more than @math{@var{m2exp}/2} bits is to be |
-generated, multiple iterations of the recurrence are used and the results |
-concatenated. |
-@end deftypefun |
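The weakness of the low bits is easy to observe in a small model of the recurrence. The sketch below assumes @math{@var{m2exp}=32}, so that @code{uint32_t} arithmetic performs the reduction mod @math{2^32} automatically, and uses a multiplier and increment from a well-known textbook generator. These are illustrative values, not the ones in GMP's table, and the @code{ex_} names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative parameters only; both are odd. */
#define EX_A UINT32_C(1664525)
#define EX_C UINT32_C(1013904223)

/* One step of X = (a*X + c) mod 2^32. */
uint32_t
ex_lc_step (uint32_t x)
{
  return EX_A * x + EX_C;
}

/* Advance the state and hand out only its high half (16 bits here). */
uint32_t
ex_lc_next16 (uint32_t *state)
{
  assert (state != NULL);
  *state = ex_lc_step (*state);
  return *state >> 16;
}
```

With these parameters the least significant bit of the state simply alternates from one step to the next, which is why only the high half is returned.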
- |
-@deftypefun int gmp_randinit_lc_2exp_size (gmp_randstate_t @var{state}, unsigned long @var{size}) |
-@cindex Linear congruential random numbers |
-Initialize @var{state} for a linear congruential algorithm as per |
-@code{gmp_randinit_lc_2exp}. @var{a}, @var{c} and @var{m2exp} are selected |
-from a table, chosen so that @var{size} bits (or more) of each @math{X} will |
-be used, i.e.@: @math{@var{m2exp}/2 @ge{} @var{size}}. |
- |
-If successful the return value is non-zero. If @var{size} is bigger than the |
-table data provides then the return value is zero. The maximum @var{size} |
-currently supported is 128. |
-@end deftypefun |
- |
-@deftypefun void gmp_randinit_set (gmp_randstate_t @var{rop}, gmp_randstate_t @var{op}) |
-Initialize @var{rop} with a copy of the algorithm and state from @var{op}. |
-@end deftypefun |
- |
-@c Although gmp_randinit, gmp_errno and related constants are obsolete, we |
-@c still put @findex entries for them, since they're still documented and |
-@c someone might be looking them up when perusing old application code. |
- |
-@deftypefun void gmp_randinit (gmp_randstate_t @var{state}, @w{gmp_randalg_t @var{alg}}, @dots{}) |
-@strong{This function is obsolete.} |
- |
-@findex GMP_RAND_ALG_LC |
-@findex GMP_RAND_ALG_DEFAULT |
-Initialize @var{state} with an algorithm selected by @var{alg}. The only |
-choice is @code{GMP_RAND_ALG_LC}, which is @code{gmp_randinit_lc_2exp_size} |
-described above. A third parameter of type @code{unsigned long} is required; |
-this is the @var{size} for that function. @code{GMP_RAND_ALG_DEFAULT} or 0 |
-are the same as @code{GMP_RAND_ALG_LC}. |
- |
-@c For reference, this is the only place gmp_errno has been documented, and |
-@c due to being non thread safe we won't be adding to its uses. |
-@findex gmp_errno |
-@findex GMP_ERROR_UNSUPPORTED_ARGUMENT |
-@findex GMP_ERROR_INVALID_ARGUMENT |
-@code{gmp_randinit} sets bits in the global variable @code{gmp_errno} to |
-indicate an error: @code{GMP_ERROR_UNSUPPORTED_ARGUMENT} if @var{alg} is |
-unsupported, or @code{GMP_ERROR_INVALID_ARGUMENT} if the @var{size} parameter |
-is too big. Note that this error reporting is not thread safe (a good |
-reason to use @code{gmp_randinit_lc_2exp_size} instead). |
-@end deftypefun |
- |
-@deftypefun void gmp_randclear (gmp_randstate_t @var{state}) |
-Free all memory occupied by @var{state}. |
-@end deftypefun |
- |
- |
-@node Random State Seeding, Random State Miscellaneous, Random State Initialization, Random Number Functions |
-@section Random State Seeding |
-@cindex Random number seeding |
-@cindex Seeding random numbers |
- |
-@deftypefun void gmp_randseed (gmp_randstate_t @var{state}, mpz_t @var{seed}) |
-@deftypefunx void gmp_randseed_ui (gmp_randstate_t @var{state}, @w{unsigned long int @var{seed}}) |
-Set an initial seed value into @var{state}. |
- |
-The size of a seed determines how many different sequences of random numbers |
-it's possible to generate. The ``quality'' of the seed is the randomness |
-of a given seed compared to the previous seed used, and this affects the |
-randomness of separate number sequences. The method for choosing a seed is |
-critical if the generated numbers are to be used for important applications, |
-such as generating cryptographic keys. |
- |
-Traditionally the system time has been used to seed, but care needs to be |
-taken with this. If an application seeds often and the resolution of the |
-system clock is low, then the same sequence of numbers might be repeated. |
-Also, the system time is quite easy to guess, so if unpredictability is |
-required then it should definitely not be the only source for the seed value. |
-On some systems there's a special device @file{/dev/random} which provides |
-random data better suited for use as a seed. |
-@end deftypefun |
- |
- |
-@node Random State Miscellaneous, , Random State Seeding, Random Number Functions |
-@section Random State Miscellaneous |
- |
-@deftypefun {unsigned long} gmp_urandomb_ui (gmp_randstate_t @var{state}, unsigned long @var{n}) |
-Return a uniformly distributed random number of @var{n} bits, i.e.@: in the |
-range 0 to @m{2^n-1,2^@var{n}-1} inclusive. @var{n} must be less than or |
-equal to the number of bits in an @code{unsigned long}. |
-@end deftypefun |
- |
-@deftypefun {unsigned long} gmp_urandomm_ui (gmp_randstate_t @var{state}, unsigned long @var{n}) |
-Return a uniformly distributed random number in the range 0 to |
-@math{@var{n}-1}, inclusive. |
-@end deftypefun |
- |
- |
-@node Formatted Output, Formatted Input, Random Number Functions, Top |
-@chapter Formatted Output |
-@cindex Formatted output |
-@cindex @code{printf} formatted output |
- |
-@menu |
-* Formatted Output Strings:: |
-* Formatted Output Functions:: |
-* C++ Formatted Output:: |
-@end menu |
- |
-@node Formatted Output Strings, Formatted Output Functions, Formatted Output, Formatted Output |
-@section Format Strings |
- |
-@code{gmp_printf} and friends accept format strings similar to the standard C |
-@code{printf} (@pxref{Formatted Output,, Formatted Output, libc, The GNU C |
-Library Reference Manual}). A format specification is of the form |
- |
-@example |
-% [flags] [width] [.[precision]] [type] conv |
-@end example |
- |
-GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t} |
-and @code{mpf_t} respectively, @samp{M} for @code{mp_limb_t}, and @samp{N} for |
-an @code{mp_limb_t} array. @samp{Z}, @samp{Q}, @samp{M} and @samp{N} behave |
-like integers. @samp{Q} will print a @samp{/} and a denominator, if needed. |
-@samp{F} behaves like a float. For example, |
- |
-@example |
-mpz_t z; |
-gmp_printf ("%s is an mpz %Zd\n", "here", z); |
- |
-mpq_t q; |
-gmp_printf ("a hex rational: %#40Qx\n", q); |
- |
-mpf_t f; |
-int n; |
-gmp_printf ("fixed point mpf %.*Ff with %d digits\n", n, f, n); |
- |
-mp_limb_t l; |
-gmp_printf ("limb %Mu\n", l); |
- |
-const mp_limb_t *ptr; |
-mp_size_t size; |
-gmp_printf ("limb array %Nx\n", ptr, size); |
-@end example |
- |
-For @samp{N} the limbs are expected least significant first, as per the |
-@code{mpn} functions (@pxref{Low-level Functions}). A negative size can be |
-given to print the value as a negative. |
- |
-All the standard C @code{printf} types behave the same as the C library |
-@code{printf}, and can be freely intermixed with the GMP extensions. In the |
-current implementation the standard parts of the format string are simply |
-handed to @code{printf} and only the GMP extensions handled directly. |
- |
-The flags accepted are as follows. GLIBC style @nisamp{'} is only for the |
-standard C types (not the GMP types), and only if the C library supports it. |
- |
-@quotation |
-@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} |
-@item @nicode{0} @tab pad with zeros (rather than spaces) |
-@item @nicode{#} @tab show the base with @samp{0x}, @samp{0X} or @samp{0} |
-@item @nicode{+} @tab always show a sign |
-@item (space) @tab show a space or a @samp{-} sign |
-@item @nicode{'} @tab group digits, GLIBC style (not GMP types) |
-@end multitable |
-@end quotation |
- |
-The optional width and precision can be given as a number within the format |
-string, or as a @samp{*} to take an extra parameter of type @code{int}, the |
-same as the standard @code{printf}. |
- |
-The standard types accepted are as follows. @samp{h} and @samp{l} are |
-portable; the rest will depend on the compiler (or include files) for the type |
-and the C library for the output. |
- |
-@quotation |
-@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} |
-@item @nicode{h} @tab @nicode{short} |
-@item @nicode{hh} @tab @nicode{char} |
-@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t} |
-@item @nicode{l} @tab @nicode{long} or @nicode{wchar_t} |
-@item @nicode{ll} @tab @nicode{long long} |
-@item @nicode{L} @tab @nicode{long double} |
-@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t} |
-@item @nicode{t} @tab @nicode{ptrdiff_t} |
-@item @nicode{z} @tab @nicode{size_t} |
-@end multitable |
-@end quotation |
- |
-@noindent |
-The GMP types are |
- |
-@quotation |
-@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} |
-@item @nicode{F} @tab @nicode{mpf_t}, float conversions |
-@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions |
-@item @nicode{M} @tab @nicode{mp_limb_t}, integer conversions |
-@item @nicode{N} @tab @nicode{mp_limb_t} array, integer conversions |
-@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions |
-@end multitable |
-@end quotation |
- |
-The conversions accepted are as follows. @samp{a} and @samp{A} are always |
-supported for @code{mpf_t} but depend on the C library for standard C float |
-types. @samp{m} and @samp{p} depend on the C library. |
- |
-@quotation |
-@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} |
-@item @nicode{a} @nicode{A} @tab hex floats, C99 style |
-@item @nicode{c} @tab character |
-@item @nicode{d} @tab decimal integer |
-@item @nicode{e} @nicode{E} @tab scientific format float |
-@item @nicode{f} @tab fixed point float |
-@item @nicode{i} @tab same as @nicode{d} |
-@item @nicode{g} @nicode{G} @tab fixed or scientific float |
-@item @nicode{m} @tab @code{strerror} string, GLIBC style |
-@item @nicode{n} @tab store characters written so far |
-@item @nicode{o} @tab octal integer |
-@item @nicode{p} @tab pointer |
-@item @nicode{s} @tab string |
-@item @nicode{u} @tab unsigned integer |
-@item @nicode{x} @nicode{X} @tab hex integer |
-@end multitable |
-@end quotation |
- |
-@samp{o}, @samp{x} and @samp{X} are unsigned for the standard C types, but for |
-types @samp{Z}, @samp{Q} and @samp{N} they are signed. @samp{u} is not |
-meaningful for @samp{Z}, @samp{Q} and @samp{N}. |
- |
-@samp{M} is a proxy for the C library @samp{l} or @samp{L}, according to the |
-size of @code{mp_limb_t}. Unsigned conversions will be usual, but a signed |
-conversion can be used and will interpret the value as a two's complement |
-negative. |
- |
-@samp{n} can be used with any type, even the GMP types. |
- |
-Other types or conversions that might be accepted by the C library |
-@code{printf} cannot be used through @code{gmp_printf}; this includes for |
-instance extensions registered with GLIBC @code{register_printf_function}. |
-Also currently there's no support for POSIX @samp{$} style numbered arguments |
-(perhaps this will be added in the future). |
- |
-The precision field has its usual meaning for integer @samp{Z} and float |
-@samp{F} types, but is currently undefined for @samp{Q} and should not be used |
-with that. |
- |
-@code{mpf_t} conversions only ever generate as many digits as can be |
-accurately represented by the operand, the same as @code{mpf_get_str} does. |
-Zeros will be used if necessary to pad to the requested precision. This |
-happens even for an @samp{f} conversion of an @code{mpf_t} which is an |
-integer, for instance @math{2^@W{1024}} in an @code{mpf_t} of 128 bits |
-precision will only produce about 40 digits, then pad with zeros to the |
-decimal point. An empty precision field like @samp{%.Fe} or @samp{%.Ff} can |
-be used to specifically request just the significant digits. |
- |
-The decimal point character (or string) is taken from the current locale |
-settings on systems which provide @code{localeconv} (@pxref{Locales,, Locales |
-and Internationalization, libc, The GNU C Library Reference Manual}). The C |
-library will normally do the same for standard float output. |
- |
-The format string is only interpreted as plain @code{char}s; multibyte |
-characters are not recognised. Perhaps this will change in the future. |
- |
- |
-@node Formatted Output Functions, C++ Formatted Output, Formatted Output Strings, Formatted Output |
-@section Functions |
-@cindex Output functions |
- |
-Each of the following functions is similar to the corresponding C library |
-function. The basic @code{printf} forms take a variable argument list. The |
-@code{vprintf} forms take an argument pointer, see @ref{Variadic Functions,, |
-Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3 |
-va_start}. |
- |
-It should be emphasised that if a format string is invalid, or the arguments |
-don't match what the format specifies, then the behaviour of any of these |
-functions will be unpredictable. GCC format string checking is not available, |
-since it doesn't recognise the GMP extensions. |
- |
-The file based functions @code{gmp_printf} and @code{gmp_fprintf} will return |
-@math{-1} to indicate a write error. Output is not ``atomic'', so partial |
-output may be produced if a write error occurs. All the functions can return |
-@math{-1} if the C library @code{printf} variant in use returns @math{-1}, but |
-this shouldn't normally occur. |
- |
-@deftypefun int gmp_printf (const char *@var{fmt}, @dots{}) |
-@deftypefunx int gmp_vprintf (const char *@var{fmt}, va_list @var{ap}) |
-Print to the standard output @code{stdout}. Return the number of characters |
-written, or @math{-1} if an error occurred. |
-@end deftypefun |
- |
-@deftypefun int gmp_fprintf (FILE *@var{fp}, const char *@var{fmt}, @dots{}) |
-@deftypefunx int gmp_vfprintf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap}) |
-Print to the stream @var{fp}. Return the number of characters written, or |
-@math{-1} if an error occurred. |
-@end deftypefun |
- |
-@deftypefun int gmp_sprintf (char *@var{buf}, const char *@var{fmt}, @dots{}) |
-@deftypefunx int gmp_vsprintf (char *@var{buf}, const char *@var{fmt}, va_list @var{ap}) |
-Form a null-terminated string in @var{buf}. Return the number of characters |
-written, excluding the terminating null. |
- |
-No overlap is permitted between the space at @var{buf} and the string |
-@var{fmt}. |
- |
-These functions are not recommended, since there's no protection against |
-exceeding the space available at @var{buf}. |
-@end deftypefun |
- |
-@deftypefun int gmp_snprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, @dots{}) |
-@deftypefunx int gmp_vsnprintf (char *@var{buf}, size_t @var{size}, const char *@var{fmt}, va_list @var{ap}) |
-Form a null-terminated string in @var{buf}. No more than @var{size} bytes |
-will be written. To get the full output, @var{size} must be enough for the |
-string and null-terminator. |
- |
-The return value is the total number of characters which ought to have been |
-produced, excluding the terminating null. If @math{@var{retval} @ge{} |
-@var{size}} then the actual output has been truncated to the first |
-@math{@var{size}-1} characters, and a null appended. |
- |
-No overlap is permitted between the region @{@var{buf},@var{size}@} and the |
-@var{fmt} string. |
- |
-Notice the return value is in ISO C99 @code{snprintf} style. This is so even |
-if the C library @code{vsnprintf} is the older GLIBC 2.0.x style. |
-@end deftypefun |
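The truncation test implied by the C99-style return value can be sketched as follows. The C library @code{snprintf} stands in for @code{gmp_snprintf} here so that the example needs no GMP types; per the text above the return convention is the same. @code{ex_format_truncated} is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format value into buf.  Returns 1 if the output did not fit (or an error
   occurred), 0 if the whole string and its null terminator were stored. */
int
ex_format_truncated (char *buf, size_t size, long value)
{
  int ret;
  assert (buf != NULL && size >= 1);
  ret = snprintf (buf, size, "%ld", value);
  /* ret is the length the full output would have, excluding the null,
     so ret >= size means the stored string was cut short. */
  return ret < 0 || (size_t) ret >= size;
}
```

When truncation is detected, a caller can allocate @math{@var{retval}+1} bytes and format again to obtain the full output.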
- |
-@deftypefun int gmp_asprintf (char **@var{pp}, const char *@var{fmt}, @dots{}) |
-@deftypefunx int gmp_vasprintf (char **@var{pp}, const char *@var{fmt}, va_list @var{ap}) |
-Form a null-terminated string in a block of memory obtained from the current |
-memory allocation function (@pxref{Custom Allocation}). The block will be the |
-size of the string and null-terminator. The address of the block is stored to |
-*@var{pp}. The return value is the number of characters produced, excluding |
-the null-terminator. |
- |
-Unlike the C library @code{asprintf}, @code{gmp_asprintf} doesn't return |
-@math{-1} if there's no more memory available, it lets the current allocation |
-function handle that. |
-@end deftypefun |
- |
-@deftypefun int gmp_obstack_printf (struct obstack *@var{ob}, const char *@var{fmt}, @dots{}) |
-@deftypefunx int gmp_obstack_vprintf (struct obstack *@var{ob}, const char *@var{fmt}, va_list @var{ap}) |
-@cindex @code{obstack} output |
-Append to the current object in @var{ob}. The return value is the number of |
-characters written. A null-terminator is not written. |
- |
-@var{fmt} cannot be within the current object in @var{ob}, since that object |
-might move as it grows. |
- |
-These functions are available only when the C library provides the obstack |
-feature, which probably means only on GNU systems, see @ref{Obstacks,, |
-Obstacks, libc, The GNU C Library Reference Manual}. |
-@end deftypefun |
- |
- |
-@node C++ Formatted Output, , Formatted Output Functions, Formatted Output |
-@section C++ Formatted Output |
-@cindex C++ @code{ostream} output |
-@cindex @code{ostream} output |
- |
-The following functions are provided in @file{libgmpxx} (@pxref{Headers and |
-Libraries}), which is built if C++ support is enabled (@pxref{Build Options}). |
-Prototypes are available from @code{<gmp.h>}. |
- |
-@deftypefun ostream& operator<< (ostream& @var{stream}, mpz_t @var{op}) |
-Print @var{op} to @var{stream}, using its @code{ios} formatting settings. |
-@code{ios::width} is reset to 0 after output, the same as the standard |
-@code{ostream operator<<} routines do. |
- |
-In hex or octal, @var{op} is printed as a signed number, the same as for |
-decimal. This is unlike the standard @code{operator<<} routines on @code{int} |
-etc, which instead give two's complement. |
-@end deftypefun |
- |
-@deftypefun ostream& operator<< (ostream& @var{stream}, mpq_t @var{op}) |
-Print @var{op} to @var{stream}, using its @code{ios} formatting settings. |
-@code{ios::width} is reset to 0 after output, the same as the standard |
-@code{ostream operator<<} routines do. |
- |
-Output will be a fraction like @samp{5/9}, or if the denominator is 1 then |
-just a plain integer like @samp{123}. |
- |
-In hex or octal, @var{op} is printed as a signed value, the same as for |
-decimal. If @code{ios::showbase} is set then a base indicator is shown on |
-both the numerator and denominator (if the denominator is required). |
-@end deftypefun |
- |
-@deftypefun ostream& operator<< (ostream& @var{stream}, mpf_t @var{op}) |
-Print @var{op} to @var{stream}, using its @code{ios} formatting settings. |
-@code{ios::width} is reset to 0 after output, the same as the standard |
-@code{ostream operator<<} routines do. |
- |
-The decimal point follows the standard library float @code{operator<<}, which |
-on recent systems means the @code{std::locale} imbued on @var{stream}. |
- |
-Hex and octal are supported, unlike the standard @code{operator<<} on |
-@code{double}. The mantissa will be in hex or octal, the exponent will be in |
-decimal. For hex the exponent delimiter is an @samp{@@}. This is as per |
-@code{mpf_out_str}. |
- |
-@code{ios::showbase} is supported, and will put a base on the mantissa, for |
-example hex @samp{0x1.8} or @samp{0x0.8}, or octal @samp{01.4} or @samp{00.4}. |
-This last form is slightly strange, but at least differentiates itself from |
-decimal. |
-@end deftypefun |
- |
-These operators mean that GMP types can be printed in the usual C++ way, for |
-example, |
- |
-@example |
-mpz_t z; |
-int n; |
-... |
-cout << "iteration " << n << " value " << z << "\n"; |
-@end example |
- |
-But note that @code{ostream} output (and @code{istream} input, @pxref{C++ |
-Formatted Input}) is the only overloading available for the GMP types and that |
-for instance using @code{+} with an @code{mpz_t} will have unpredictable |
-results. For classes with overloading, see @ref{C++ Class Interface}. |
- |
- |
-@node Formatted Input, C++ Class Interface, Formatted Output, Top |
-@chapter Formatted Input |
-@cindex Formatted input |
-@cindex @code{scanf} formatted input |
- |
-@menu |
-* Formatted Input Strings:: |
-* Formatted Input Functions:: |
-* C++ Formatted Input:: |
-@end menu |
- |
- |
-@node Formatted Input Strings, Formatted Input Functions, Formatted Input, Formatted Input |
-@section Formatted Input Strings |
- |
-@code{gmp_scanf} and friends accept format strings similar to the standard C |
-@code{scanf} (@pxref{Formatted Input,, Formatted Input, libc, The GNU C |
-Library Reference Manual}). A format specification is of the form |
- |
-@example |
-% [flags] [width] [type] conv |
-@end example |
- |
-GMP adds types @samp{Z}, @samp{Q} and @samp{F} for @code{mpz_t}, @code{mpq_t} |
-and @code{mpf_t} respectively. @samp{Z} and @samp{Q} behave like integers. |
-@samp{Q} will read a @samp{/} and a denominator, if present. @samp{F} behaves |
-like a float. |
- |
-GMP variables don't require an @code{&} when passed to @code{gmp_scanf}, since |
-they're already ``call-by-reference''. For example, |
- |
-@example |
-/* to read say "a(5) = 1234" */ |
-int n; |
-mpz_t z; |
-gmp_scanf ("a(%d) = %Zd\n", &n, z); |
- |
-mpq_t q1, q2; |
-gmp_sscanf ("0377 + 0x10/0x11", "%Qi + %Qi", q1, q2); |
- |
-/* to read say "topleft (1.55,-2.66)" */ |
-mpf_t x, y; |
-char buf[32]; |
-gmp_scanf ("%31s (%Ff,%Ff)", buf, x, y); |
-@end example |
- |
-All the standard C @code{scanf} types behave the same as in the C library |
-@code{scanf}, and can be freely intermixed with the GMP extensions. In the |
-current implementation the standard parts of the format string are simply |
-handed to @code{scanf} and only the GMP extensions handled directly. |
- |
-The flags accepted are as follows. @samp{a} and @samp{'} will depend on |
-support from the C library, and @samp{'} cannot be used with GMP types. |
- |
-@quotation |
-@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} |
-@item @nicode{*} @tab read but don't store |
-@item @nicode{a} @tab allocate a buffer (string conversions) |
-@item @nicode{'} @tab grouped digits, GLIBC style (not GMP types) |
-@end multitable |
-@end quotation |
- |
-The standard types accepted are as follows. @samp{h} and @samp{l} are |
-portable, the rest will depend on the compiler (or include files) for the type |
-and the C library for the input. |
- |
-@quotation |
-@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} |
-@item @nicode{h} @tab @nicode{short} |
-@item @nicode{hh} @tab @nicode{char} |
-@item @nicode{j} @tab @nicode{intmax_t} or @nicode{uintmax_t} |
-@item @nicode{l} @tab @nicode{long int}, @nicode{double} or @nicode{wchar_t} |
-@item @nicode{ll} @tab @nicode{long long} |
-@item @nicode{L} @tab @nicode{long double} |
-@item @nicode{q} @tab @nicode{quad_t} or @nicode{u_quad_t} |
-@item @nicode{t} @tab @nicode{ptrdiff_t} |
-@item @nicode{z} @tab @nicode{size_t} |
-@end multitable |
-@end quotation |
- |
-@noindent |
-The GMP types are |
- |
-@quotation |
-@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} |
-@item @nicode{F} @tab @nicode{mpf_t}, float conversions |
-@item @nicode{Q} @tab @nicode{mpq_t}, integer conversions |
-@item @nicode{Z} @tab @nicode{mpz_t}, integer conversions |
-@end multitable |
-@end quotation |
- |
-The conversions accepted are as follows. @samp{p} and @samp{[} will depend on |
-support from the C library, the rest are standard. |
- |
-@quotation |
-@multitable {(space)} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} |
-@item @nicode{c} @tab character or characters |
-@item @nicode{d} @tab decimal integer |
-@item @nicode{e} @nicode{E} @nicode{f} @nicode{g} @nicode{G} |
- @tab float |
-@item @nicode{i} @tab integer with base indicator |
-@item @nicode{n} @tab characters read so far |
-@item @nicode{o} @tab octal integer |
-@item @nicode{p} @tab pointer |
-@item @nicode{s} @tab string of non-whitespace characters |
-@item @nicode{u} @tab decimal integer |
-@item @nicode{x} @nicode{X} @tab hex integer |
-@item @nicode{[} @tab string of characters in a set |
-@end multitable |
-@end quotation |
- |
-@samp{e}, @samp{E}, @samp{f}, @samp{g} and @samp{G} are identical, they all |
-read either fixed point or scientific format, and either upper or lower case |
-@samp{e} for the exponent in scientific format. |
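- |
-The same equivalence holds for the standard C float conversions, which can be |
-checked without GMP.  A minimal sketch, assuming only the C library |
-@code{sscanf}; @code{read_float} is a hypothetical helper. |
- |
```cpp
#include <cstdio>

// Parse one float using the given standard conversion ("%f", "%e", "%G", ...).
// Returns true and stores the value if sscanf stored exactly one field.
bool read_float(const char *input, const char *conv, float &out) {
    return std::sscanf(input, conv, &out) == 1;
}
```
- |
-Reading @samp{1.5e2} with @samp{%f} and @samp{150.0} with @samp{%e} should give |
-the same value, since each conversion accepts both notations. |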
- |
-C99 style hex float format (@code{printf %a}, @pxref{Formatted Output |
-Strings}) is always accepted for @code{mpf_t}, but for the standard float |
-types it will depend on the C library. |
- |
-@samp{x} and @samp{X} are identical, both accept both upper and lower case |
-hexadecimal. |
- |
-@samp{o}, @samp{u}, @samp{x} and @samp{X} all read positive or negative |
-values. For the standard C types these are described as ``unsigned'' |
-conversions, but that merely affects certain overflow handling, negatives are |
-still allowed (per @code{strtoul}, @pxref{Parsing of Integers,, Parsing of |
-Integers, libc, The GNU C Library Reference Manual}). For GMP types there are |
-no overflows, so @samp{d} and @samp{u} are identical. |
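- |
-The @code{strtoul} behaviour mentioned above can be seen directly.  This is a |
-small sketch using only the C library; @code{parse_unsigned} is a hypothetical |
-wrapper. |
- |
```cpp
#include <cstdlib>

// Parse text as an "unsigned" number in the given base.  A leading minus sign
// is accepted, and the result is negated within unsigned long arithmetic.
unsigned long parse_unsigned(const char *s, int base) {
    return std::strtoul(s, nullptr, base);
}
```
- |
-So @samp{-1} is not rejected, it wraps around to the largest |
-@code{unsigned long}.  With a GMP type there is nothing to wrap around to, and |
-the negative value is simply stored. |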
- |
-@samp{Q} type reads the numerator and (optional) denominator as given. If the |
-value might not be in canonical form then @code{mpq_canonicalize} must be |
-called before using it in any calculations (@pxref{Rational Number |
-Functions}). |
- |
-@samp{Qi} will read a base specification separately for the numerator and |
-denominator. For example @samp{0x10/11} would be 16/11, whereas |
-@samp{0x10/0x11} would be 16/17. |
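- |
-The per-part base detection can be sketched with @code{strtol} and base 0, |
-working on @code{long} rather than @code{mpq_t}.  @code{parse_fraction} is a |
-hypothetical helper written for this illustration and is not how GMP |
-implements @samp{Qi}. |
- |
```cpp
#include <cstdlib>

// Parse "NUM" or "NUM/DEN" into longs.  Base 0 makes strtol detect the base of
// each part on its own: leading 0x is hex, leading 0 is octal, else decimal.
bool parse_fraction(const char *s, long &num, long &den) {
    char *end = nullptr;
    num = std::strtol(s, &end, 0);
    if (end == s)
        return false;                 // no numerator digits
    den = 1;
    if (*end != '/')
        return *end == '\0';          // plain integer, nothing may follow
    const char *d = end + 1;
    den = std::strtol(d, &end, 0);
    return end != d && *end == '\0';  // denominator present, nothing follows
}
```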
- |
-@samp{n} can be used with any of the types above, even the GMP types. |
-@samp{*} to suppress assignment is allowed, though in that case it would do |
-nothing at all. |
- |
-Other conversions or types that might be accepted by the C library |
-@code{scanf} cannot be used through @code{gmp_scanf}. |
- |
-Whitespace is read and discarded before a field, except for @samp{c} and |
-@samp{[} conversions. |
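- |
-The standard C @code{sscanf} follows the same whitespace rule, which gives a |
-quick way to see the difference.  A minimal sketch with hypothetical helper |
-names: |
- |
```cpp
#include <cstdio>

// %d skips leading whitespace before the number.
int first_int(const char *input) {
    int v = 0;
    std::sscanf(input, "%d", &v);
    return v;
}

// %c skips nothing: it stores the very next character, even a space.
char first_char(const char *input) {
    char c = '\0';
    std::sscanf(input, "%c", &c);
    return c;
}
```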
- |
-For float conversions, the decimal point character (or string) expected is |
-taken from the current locale settings on systems which provide |
-@code{localeconv} (@pxref{Locales,, Locales and Internationalization, libc, |
-The GNU C Library Reference Manual}). The C library will normally do the same |
-for standard float input. |
- |
-The format string is only interpreted as plain @code{char}s, multibyte |
-characters are not recognised. Perhaps this will change in the future. |
- |
- |
-@node Formatted Input Functions, C++ Formatted Input, Formatted Input Strings, Formatted Input |
-@section Formatted Input Functions |
-@cindex Input functions |
- |
-Each of the following functions is similar to the corresponding C library |
-function. The plain @code{scanf} forms take a variable argument list. The |
-@code{vscanf} forms take an argument pointer, see @ref{Variadic Functions,, |
-Variadic Functions, libc, The GNU C Library Reference Manual}, or @samp{man 3 |
-va_start}. |
- |
-It should be emphasised that if a format string is invalid, or the arguments |
-don't match what the format specifies, then the behaviour of any of these |
-functions will be unpredictable. GCC format string checking is not available, |
-since it doesn't recognise the GMP extensions. |
- |
-No overlap is permitted between the @var{fmt} string and any of the results |
-produced. |
- |
-@deftypefun int gmp_scanf (const char *@var{fmt}, @dots{}) |
-@deftypefunx int gmp_vscanf (const char *@var{fmt}, va_list @var{ap}) |
-Read from the standard input @code{stdin}. |
-@end deftypefun |
- |
-@deftypefun int gmp_fscanf (FILE *@var{fp}, const char *@var{fmt}, @dots{}) |
-@deftypefunx int gmp_vfscanf (FILE *@var{fp}, const char *@var{fmt}, va_list @var{ap}) |
-Read from the stream @var{fp}. |
-@end deftypefun |
- |
-@deftypefun int gmp_sscanf (const char *@var{s}, const char *@var{fmt}, @dots{}) |
-@deftypefunx int gmp_vsscanf (const char *@var{s}, const char *@var{fmt}, va_list @var{ap}) |
-Read from a null-terminated string @var{s}. |
-@end deftypefun |
- |
-The return value from each of these functions is the same as the standard C99 |
-@code{scanf}, namely the number of fields successfully parsed and stored. |
-@samp{%n} fields and fields read but suppressed by @samp{*} don't count |
-towards the return value. |
- |
-If end of input (or a file error) is reached before a character for a field or |
-a literal, and if no previous non-suppressed fields have matched, then the |
-return value is @code{EOF} instead of 0. A whitespace character in the format |
-string is only an optional match and doesn't induce an @code{EOF} in this |
-fashion. Leading whitespace read and discarded for a field doesn't count as |
-characters for that field. |
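- |
-Since the return value is stated to be the same as the standard |
-@code{scanf}, the counting rules can be tried out with the C library alone. |
-The sketch below uses a hypothetical helper, @code{stored_fields}, whose format |
-has one stored field, one suppressed field and one @samp{%n}. |
- |
```cpp
#include <cstdio>

// Returns what sscanf reports for a format with one stored int (%d), one
// suppressed int (%*d) and a character count (%n).
int stored_fields(const char *input) {
    int first = 0, consumed = 0;
    return std::sscanf(input, "%d %*d%n", &first, &consumed);
}
```
- |
-Only the @samp{%d} counts, so a fully matching input gives 1.  Empty input |
-gives @code{EOF}, while input that is present but doesn't match gives 0. |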
- |
-For the GMP types, input parsing follows C99 rules, namely one character of |
-lookahead is used and characters are read while they continue to meet the |
-format requirements. If this doesn't provide a complete number then the |
-function terminates, with that field not stored nor counted towards the return |
-value. For instance with @code{mpf_t} an input @samp{1.23e-XYZ} would be read |
-up to the @samp{X} and that character pushed back since it's not a digit. The |
-string @samp{1.23e-} would then be considered invalid since an @samp{e} must |
-be followed by at least one digit. |
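- |
-For contrast, the C library @code{strtod} is a different interface which |
-returns the longest valid prefix rather than working with one character of |
-lookahead.  The sketch below shows only that @code{strtod} behaviour, it says |
-nothing about how @code{gmp_sscanf} is implemented; @code{strtod_tail} is a |
-hypothetical helper. |
- |
```cpp
#include <cstdlib>
#include <string>

// Parse a leading double with strtod; store it and return the unparsed tail.
std::string strtod_tail(const char *s, double &value) {
    char *end = nullptr;
    value = std::strtod(s, &end);
    return std::string(end);
}
```
- |
-On @samp{1.23e-XYZ} this yields the value of @samp{1.23} and leaves |
-@samp{e-XYZ} unread, where the @code{scanf} style rules above treat the field |
-as invalid. |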
- |
-For the standard C types, in the current implementation GMP calls the C |
-library @code{scanf} functions, which might have looser rules about what |
-constitutes a valid input. |
- |
-Note that @code{gmp_sscanf} is the same as @code{gmp_fscanf} and only does one |
-character of lookahead when parsing. Although clearly it could look at its |
-entire input, it is deliberately made identical to @code{gmp_fscanf}, the same |
-way C99 @code{sscanf} is the same as @code{fscanf}. |
- |
- |
-@node C++ Formatted Input, , Formatted Input Functions, Formatted Input |
-@section C++ Formatted Input |
-@cindex C++ @code{istream} input |
-@cindex @code{istream} input |
- |
-The following functions are provided in @file{libgmpxx} (@pxref{Headers and |
-Libraries}), which is built only if C++ support is enabled (@pxref{Build |
-Options}). Prototypes are available from @code{<gmp.h>}. |
- |
-@deftypefun istream& operator>> (istream& @var{stream}, mpz_t @var{rop}) |
-Read @var{rop} from @var{stream}, using its @code{ios} formatting settings. |
-@end deftypefun |
- |
-@deftypefun istream& operator>> (istream& @var{stream}, mpq_t @var{rop}) |
-An integer like @samp{123} will be read, or a fraction like @samp{5/9}. No |
-whitespace is allowed around the @samp{/}. If the fraction is not in |
-canonical form then @code{mpq_canonicalize} must be called (@pxref{Rational |
-Number Functions}) before operating on it. |
- |
-As per integer input, a @samp{0} or @samp{0x} base indicator is read when |
-none of @code{ios::dec}, @code{ios::oct} or @code{ios::hex} are set. This is |
-done separately for numerator and denominator, so that for instance |
-@samp{0x10/11} is @math{16/11} and @samp{0x10/0x11} is @math{16/17}. |
-@end deftypefun |
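- |
-The ``as per integer input'' rule is the one the standard @code{istream} |
-applies to builtin integers, and can be sketched without GMP.  The helper |
-@code{read_auto_base} is hypothetical; it clears the base flags so that none |
-of @code{ios::dec}, @code{ios::oct} or @code{ios::hex} is set. |
- |
```cpp
#include <ios>
#include <sstream>
#include <string>

// Read a long with no base flag set, so the stream detects the base from a
// leading 0x (hex) or 0 (octal), and otherwise reads decimal.
long read_auto_base(const std::string &text) {
    std::istringstream is(text);
    is.unsetf(std::ios_base::basefield);
    long v = 0;
    is >> v;
    return v;
}
```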
- |
-@deftypefun istream& operator>> (istream& @var{stream}, mpf_t @var{rop}) |
-Read @var{rop} from @var{stream}, using its @code{ios} formatting settings. |
- |
-Hex or octal floats are not supported, but might be in the future, or perhaps |
-it's best to accept only what the standard float @code{operator>>} does. |
-@end deftypefun |
- |
-Note that digit grouping specified by the @code{istream} locale is currently |
-not accepted. Perhaps this will change in the future. |
- |
-@sp 1 |
-These operators mean that GMP types can be read in the usual C++ way, for |
-example, |
- |
-@example |
-mpz_t z; |
-... |
-cin >> z; |
-@end example |
- |
-But note that @code{istream} input (and @code{ostream} output, @pxref{C++ |
-Formatted Output}) is the only overloading available for the GMP types and |
-that for instance using @code{+} with an @code{mpz_t} will have unpredictable |
-results. For classes with overloading, see @ref{C++ Class Interface}. |
- |
- |
- |
-@node C++ Class Interface, BSD Compatible Functions, Formatted Input, Top |
-@chapter C++ Class Interface |
-@cindex C++ interface |
- |
-This chapter describes the C++ class based interface to GMP. |
- |
-All GMP C language types and functions can be used in C++ programs, since |
-@file{gmp.h} has @code{extern "C"} qualifiers, but the class interface offers |
-overloaded functions and operators which may be more convenient. |
- |
-Due to the implementation of this interface, a reasonably recent C++ compiler |
-is required, one supporting namespaces, partial specialization of templates |
-and member templates. For GCC this means version 2.91 or later. |
- |
-@strong{Everything described in this chapter is to be considered preliminary |
-and might be subject to incompatible changes if some unforeseen difficulty |
-reveals itself.} |
- |
-@menu |
-* C++ Interface General:: |
-* C++ Interface Integers:: |
-* C++ Interface Rationals:: |
-* C++ Interface Floats:: |
-* C++ Interface Random Numbers:: |
-* C++ Interface Limitations:: |
-@end menu |
- |
- |
-@node C++ Interface General, C++ Interface Integers, C++ Class Interface, C++ Class Interface |
-@section C++ Interface General |
- |
-@noindent |
-All the C++ classes and functions are available with |
- |
-@cindex @code{gmpxx.h} |
-@example |
-#include <gmpxx.h> |
-@end example |
- |
-Programs should be linked with the @file{libgmpxx} and @file{libgmp} |
-libraries. For example, |
- |
-@example |
-g++ mycxxprog.cc -lgmpxx -lgmp |
-@end example |
- |
-@noindent |
-The classes defined are |
- |
-@deftp Class mpz_class |
-@deftpx Class mpq_class |
-@deftpx Class mpf_class |
-@end deftp |
- |
-The standard operators and various standard functions are overloaded to allow |
-arithmetic with these classes. For example, |
- |
-@example |
-#include <iostream> |
-#include <gmpxx.h> |
-using namespace std; |
- |
-int |
-main (void) |
-@{ |
- mpz_class a, b, c; |
- |
- a = 1234; |
- b = "-5678"; |
- c = a+b; |
- cout << "sum is " << c << "\n"; |
- cout << "absolute value is " << abs(c) << "\n"; |
- |
- return 0; |
-@} |
-@end example |
- |
-An important feature of the implementation is that an expression like |
-@code{a=b+c} results in a single call to the corresponding @code{mpz_add}, |
-without using a temporary for the @code{b+c} part. Expressions which by their |
-nature imply intermediate values, like @code{a=b*c+d*e}, still use temporaries |
-though. |
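- |
-One general way to get this effect is to have @code{operator+} return a small |
-unevaluated expression object and do the work in the assignment.  The sketch |
-below is a toy model of that idea and is not taken from @file{gmpxx.h}; |
-@code{Big}, @code{AddExpr} and @code{bigs_constructed_by_sum} are hypothetical |
-names, with @code{Big} standing in for @code{mpz_class} and a plain |
-@code{long} addition standing in for @code{mpz_add}. |
- |
```cpp
struct Big;                       // toy stand-in for mpz_class

// Unevaluated "l + r": holds references only, computes nothing.
struct AddExpr {
    const Big &l;
    const Big &r;
};

struct Big {
    long value;
    static int constructed;       // counts every Big ever constructed

    explicit Big(long v = 0) : value(v) { ++constructed; }
    Big(const Big &o) : value(o.value) { ++constructed; }
    Big &operator=(const Big &) = default;

    // Assigning an AddExpr performs the addition straight into *this.
    Big &operator=(const AddExpr &e) {
        value = e.l.value + e.r.value;
        return *this;
    }
};
int Big::constructed = 0;

inline AddExpr operator+(const Big &l, const Big &r) { return AddExpr{l, r}; }

// Number of Big objects constructed while evaluating a = b + c.
int bigs_constructed_by_sum(long &result) {
    Big a, b(2), c(3);
    int before = Big::constructed;
    a = b + c;                    // no temporary Big is created here
    result = a.value;
    return Big::constructed - before;
}
```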
- |
-The classes can be freely intermixed in expressions, as can the classes and |
-the standard types @code{long}, @code{unsigned long} and @code{double}. |
-Smaller types like @code{int} or @code{float} can also be intermixed, since |
-C++ will promote them. |
- |
-Note that @code{bool} is not accepted directly, but must be explicitly cast to |
-an @code{int} first. This is because C++ will automatically convert any |
-pointer to a @code{bool}, so if GMP accepted @code{bool} it would make all |
-sorts of invalid class and pointer combinations compile but almost certainly |
-not do anything sensible. |
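- |
-The pointer conversion in question is easy to trip over in ordinary C++.  In |
-the following sketch, which uses a hypothetical overloaded function and no GMP |
-code, a string literal selects the @code{bool} overload rather than the |
-@code{std::string} one, because pointer to @code{bool} is a builtin |
-conversion. |
- |
```cpp
#include <string>

// Two overloads of a function named only for this illustration.
std::string chosen_overload(bool) { return "bool"; }
std::string chosen_overload(const std::string &) { return "string"; }

// The literal is a const char*, which converts to bool without any cast.
std::string call_with_literal() { return chosen_overload("hello"); }
```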
- |
-Conversions back from the classes to standard C++ types aren't done |
-automatically, instead member functions like @code{get_si} are provided (see |
-the following sections for details). |
- |
-Also there are no automatic conversions from the classes to the corresponding |
-GMP C types, instead a reference to the underlying C object can be obtained |
-with the following functions, |
- |
-@deftypefun mpz_t mpz_class::get_mpz_t () |
-@deftypefunx mpq_t mpq_class::get_mpq_t () |
-@deftypefunx mpf_t mpf_class::get_mpf_t () |
-@end deftypefun |
- |
-These can be used to call a C function which doesn't have a C++ class |
-interface. For example to set @code{a} to the GCD of @code{b} and @code{c}, |
- |
-@example |
-mpz_class a, b, c; |
-... |
-mpz_gcd (a.get_mpz_t(), b.get_mpz_t(), c.get_mpz_t()); |
-@end example |
- |
-In the other direction, a class can be initialized from the corresponding GMP |
-C type, or assigned to if an explicit constructor is used. In both cases this |
-makes a copy of the value, it doesn't create any sort of association. For |
-example, |
- |
-@example |
-mpz_t z; |
-// ... init and calculate z ... |
-mpz_class x(z); |
-mpz_class y; |
-y = mpz_class (z); |
-@end example |
- |
-There are no namespace setups in @file{gmpxx.h}, all types and functions are |
-simply put into the global namespace. This is what @file{gmp.h} has done in |
-the past, and continues to do for compatibility. The extras provided by |
-@file{gmpxx.h} follow GMP naming conventions and are unlikely to clash with |
-anything. |
- |
- |
-@node C++ Interface Integers, C++ Interface Rationals, C++ Interface General, C++ Class Interface |
-@section C++ Interface Integers |
- |
-@deftypefun void mpz_class::mpz_class (type @var{n}) |
-Construct an @code{mpz_class}. All the standard C++ types may be used, except |
-@code{long long} and @code{long double}, and all the GMP C++ classes can be |
-used. Any necessary conversion follows the corresponding C function, for |
-example @code{double} follows @code{mpz_set_d} (@pxref{Assigning Integers}). |
-@end deftypefun |
- |
-@deftypefun void mpz_class::mpz_class (mpz_t @var{z}) |
-Construct an @code{mpz_class} from an @code{mpz_t}. The value in @var{z} is |
-copied into the new @code{mpz_class}, there won't be any permanent association |
-between it and @var{z}. |
-@end deftypefun |
- |
-@deftypefun void mpz_class::mpz_class (const char *@var{s}) |
-@deftypefunx void mpz_class::mpz_class (const char *@var{s}, int @var{base} = 0) |
-@deftypefunx void mpz_class::mpz_class (const string& @var{s}) |
-@deftypefunx void mpz_class::mpz_class (const string& @var{s}, int @var{base} = 0) |
-Construct an @code{mpz_class} converted from a string using @code{mpz_set_str} |
-(@pxref{Assigning Integers}). |
- |
-If the string is not a valid integer, an @code{std::invalid_argument} |
-exception is thrown. The same applies to @code{operator=}. |
-@end deftypefun |
- |
-@deftypefun mpz_class operator/ (mpz_class @var{a}, mpz_class @var{d}) |
-@deftypefunx mpz_class operator% (mpz_class @var{a}, mpz_class @var{d}) |
-Divisions involving @code{mpz_class} round towards zero, as per the |
-@code{mpz_tdiv_q} and @code{mpz_tdiv_r} functions (@pxref{Integer Division}). |
-This is the same as the C99 @code{/} and @code{%} operators. |
- |
-The @code{mpz_fdiv@dots{}} or @code{mpz_cdiv@dots{}} functions can always be called |
-directly if desired. For example, |
- |
-@example |
-mpz_class q, a, d; |
-... |
-mpz_fdiv_q (q.get_mpz_t(), a.get_mpz_t(), d.get_mpz_t()); |
-@end example |
-@end deftypefun |
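- |
-The difference between the two rounding styles can be sketched on builtin |
-@code{long} values, where @code{/} and @code{%} truncate in the same way. |
-@code{floor_div} is a hypothetical helper giving the floor style quotient, the |
-rounding direction of the @code{mpz_fdiv@dots{}} functions. |
- |
```cpp
// Floor division built on the truncating operators (d must be non-zero).
long floor_div(long a, long d) {
    long q = a / d;                          // rounds toward zero
    if (a % d != 0 && ((a < 0) != (d < 0)))
        --q;                                 // inexact and signs differ
    return q;
}
```
- |
-The two styles differ only when the division is inexact and the operands have |
-opposite signs. |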
- |
-@deftypefun mpz_class abs (mpz_class @var{op1}) |
-@deftypefunx int cmp (mpz_class @var{op1}, type @var{op2}) |
-@deftypefunx int cmp (type @var{op1}, mpz_class @var{op2}) |
-@maybepagebreak |
-@deftypefunx bool mpz_class::fits_sint_p (void) |
-@deftypefunx bool mpz_class::fits_slong_p (void) |
-@deftypefunx bool mpz_class::fits_sshort_p (void) |
-@maybepagebreak |
-@deftypefunx bool mpz_class::fits_uint_p (void) |
-@deftypefunx bool mpz_class::fits_ulong_p (void) |
-@deftypefunx bool mpz_class::fits_ushort_p (void) |
-@maybepagebreak |
-@deftypefunx double mpz_class::get_d (void) |
-@deftypefunx long mpz_class::get_si (void) |
-@deftypefunx string mpz_class::get_str (int @var{base} = 10) |
-@deftypefunx {unsigned long} mpz_class::get_ui (void) |
-@maybepagebreak |
-@deftypefunx int mpz_class::set_str (const char *@var{str}, int @var{base}) |
-@deftypefunx int mpz_class::set_str (const string& @var{str}, int @var{base}) |
-@deftypefunx int sgn (mpz_class @var{op}) |
-@deftypefunx mpz_class sqrt (mpz_class @var{op}) |
-These functions provide a C++ class interface to the corresponding GMP C |
-routines. |
- |
-@code{cmp} can be used with any of the classes or the standard C++ types, |
-except @code{long long} and @code{long double}. |
-@end deftypefun |
- |
-@sp 1 |
-Overloaded operators for combinations of @code{mpz_class} and @code{double} |
-are provided for completeness, but it should be noted that if the given |
-@code{double} is not an integer then the way any rounding is done is currently |
-unspecified. The rounding might take place at the start, in the middle, or at |
-the end of the operation, and it might change in the future. |
- |
-Conversions between @code{mpz_class} and @code{double}, however, are defined |
-to follow the corresponding C functions @code{mpz_get_d} and @code{mpz_set_d}. |
-And comparisons are always made exactly, as per @code{mpz_cmp_d}. |
- |
- |
-@node C++ Interface Rationals, C++ Interface Floats, C++ Interface Integers, C++ Class Interface |
-@section C++ Interface Rationals |
- |
-In all the following constructors, if a fraction is given then it should be in |
-canonical form, or if not then @code{mpq_class::canonicalize} must be called |
-before the value is used. |
- |
-@deftypefun void mpq_class::mpq_class (type @var{op}) |
-@deftypefunx void mpq_class::mpq_class (integer @var{num}, integer @var{den}) |
-Construct an @code{mpq_class}. The initial value can be a single value of any |
-type, or a pair of integers (@code{mpz_class} or standard C++ integer types) |
-representing a fraction, except that @code{long long} and @code{long double} |
-are not supported. For example, |
- |
-@example |
-mpq_class q (99); |
-mpq_class q (1.75); |
-mpq_class q (1, 3); |
-@end example |
-@end deftypefun |
- |
-@deftypefun void mpq_class::mpq_class (mpq_t @var{q}) |
-Construct an @code{mpq_class} from an @code{mpq_t}. The value in @var{q} is |
-copied into the new @code{mpq_class}, there won't be any permanent association |
-between it and @var{q}. |
-@end deftypefun |
- |
-@deftypefun void mpq_class::mpq_class (const char *@var{s}) |
-@deftypefunx void mpq_class::mpq_class (const char *@var{s}, int @var{base} = 0) |
-@deftypefunx void mpq_class::mpq_class (const string& @var{s}) |
-@deftypefunx void mpq_class::mpq_class (const string& @var{s}, int @var{base} = 0) |
-Construct an @code{mpq_class} converted from a string using @code{mpq_set_str} |
-(@pxref{Initializing Rationals}). |
- |
-If the string is not a valid rational, an @code{std::invalid_argument} |
-exception is thrown. The same applies to @code{operator=}. |
-@end deftypefun |
- |
-@deftypefun void mpq_class::canonicalize () |
-Put an @code{mpq_class} into canonical form, as per @ref{Rational Number |
-Functions}. All arithmetic operators require their operands in canonical |
-form, and will return results in canonical form. |
-@end deftypefun |
- |
-@deftypefun mpq_class abs (mpq_class @var{op}) |
-@deftypefunx int cmp (mpq_class @var{op1}, type @var{op2}) |
-@deftypefunx int cmp (type @var{op1}, mpq_class @var{op2}) |
-@maybepagebreak |
-@deftypefunx double mpq_class::get_d (void) |
-@deftypefunx string mpq_class::get_str (int @var{base} = 10) |
-@maybepagebreak |
-@deftypefunx int mpq_class::set_str (const char *@var{str}, int @var{base}) |
-@deftypefunx int mpq_class::set_str (const string& @var{str}, int @var{base}) |
-@deftypefunx int sgn (mpq_class @var{op}) |
-These functions provide a C++ class interface to the corresponding GMP C |
-routines. |
- |
-@code{cmp} can be used with any of the classes or the standard C++ types, |
-except @code{long long} and @code{long double}. |
-@end deftypefun |
- |
-@deftypefun {mpz_class&} mpq_class::get_num () |
-@deftypefunx {mpz_class&} mpq_class::get_den () |
-Get a reference to an @code{mpz_class} which is the numerator or denominator |
-of an @code{mpq_class}. This can be used both for read and write access. If |
-the object returned is modified, it modifies the original @code{mpq_class}. |
- |
-If direct manipulation might produce a non-canonical value, then |
-@code{mpq_class::canonicalize} must be called before further operations. |
-@end deftypefun |
- |
-@deftypefun mpz_t mpq_class::get_num_mpz_t () |
-@deftypefunx mpz_t mpq_class::get_den_mpz_t () |
-Get a reference to the underlying @code{mpz_t} numerator or denominator of an |
-@code{mpq_class}. This can be passed to C functions expecting an |
-@code{mpz_t}. Any modifications made to the @code{mpz_t} will modify the |
-original @code{mpq_class}. |
- |
-If direct manipulation might produce a non-canonical value, then |
-@code{mpq_class::canonicalize} must be called before further operations. |
-@end deftypefun |
- |
-@deftypefun istream& operator>> (istream& @var{stream}, mpq_class& @var{rop}) |
-Read @var{rop} from @var{stream}, using its @code{ios} formatting settings, |
-the same as @code{mpq_t operator>>} (@pxref{C++ Formatted Input}). |
- |
-If the @var{rop} read might not be in canonical form then |
-@code{mpq_class::canonicalize} must be called. |
-@end deftypefun |
- |
- |
-@node C++ Interface Floats, C++ Interface Random Numbers, C++ Interface Rationals, C++ Class Interface |
-@section C++ Interface Floats |
- |
-When an expression requires the use of temporary intermediate @code{mpf_class} |
-values, like @code{f=g*h+x*y}, those temporaries will have the same precision |
-as the destination @code{f}. Explicit constructors can be used if this |
-doesn't suit. |
- |
-@deftypefun {} mpf_class::mpf_class (type @var{op}) |
-@deftypefunx {} mpf_class::mpf_class (type @var{op}, unsigned long @var{prec}) |
-Construct an @code{mpf_class}. Any standard C++ type can be used, except |
-@code{long long} and @code{long double}, and any of the GMP C++ classes can be |
-used. |
- |
-If @var{prec} is given, the initial precision is that value, in bits. If |
-@var{prec} is not given, then the initial precision is determined by the type |
-of @var{op} given. An @code{mpz_class}, @code{mpq_class}, or C++ |
-builtin type will give the default @code{mpf} precision (@pxref{Initializing |
-Floats}). An @code{mpf_class} or expression will give the precision of that |
-value. The precision of a binary expression is the higher of the two |
-operands. |
- |
-@example |
-mpf_class f(1.5); // default precision |
-mpf_class f(1.5, 500); // 500 bits (at least) |
-mpf_class f(x); // precision of x |
-mpf_class f(abs(x)); // precision of x |
-mpf_class f(-g, 1000); // 1000 bits (at least) |
-mpf_class f(x+y); // greater of precisions of x and y |
-@end example |
-@end deftypefun |
- |
-@deftypefun void mpf_class::mpf_class (const char *@var{s}) |
-@deftypefunx void mpf_class::mpf_class (const char *@var{s}, unsigned long @var{prec}, int @var{base} = 0) |
-@deftypefunx void mpf_class::mpf_class (const string& @var{s}) |
-@deftypefunx void mpf_class::mpf_class (const string& @var{s}, unsigned long @var{prec}, int @var{base} = 0) |
-Construct an @code{mpf_class} converted from a string using @code{mpf_set_str} |
-(@pxref{Assigning Floats}). If @var{prec} is given, the initial precision is |
-that value, in bits. If not, the default @code{mpf} precision |
-(@pxref{Initializing Floats}) is used. |
- |
-If the string is not a valid float, an @code{std::invalid_argument} exception |
-is thrown. The same applies to @code{operator=}. |
-@end deftypefun |
- |
-@deftypefun {mpf_class&} mpf_class::operator= (type @var{op}) |
-Convert and store the given @var{op} value to an @code{mpf_class} object. The |
-same types are accepted as for the constructors above. |
- |
-Note that @code{operator=} only stores a new value, it doesn't copy or change |
-the precision of the destination, instead the value is truncated if necessary. |
-This is the same as @code{mpf_set} etc. Note in particular this means for |
-@code{mpf_class} a copy constructor is not the same as a default constructor |
-plus assignment. |
- |
-@example |
-mpf_class x (y); // x created with precision of y |
- |
-mpf_class x; // x created with default precision |
-x = y; // value truncated to that precision |
-@end example |
- |
-Applications using templated code may need to be careful about the assumptions |
-the code makes in this area, when working with @code{mpf_class} values of |
-various different or non-default precisions. For instance implementations of |
-the standard @code{complex} template have been seen in both styles above, |
-though of course @code{complex} is normally only actually specified for use |
-with the builtin float types. |
-@end deftypefun |
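- |
-The difference between copy construction and assignment can be modelled with |
-a toy type.  The sketch below is hypothetical and is not GMP code: |
-@code{ToyFloat} keeps an unsigned value to at most @code{prec} significant |
-bits, standing in for an @code{mpf_class} and its precision. |
- |
```cpp
// Toy stand-in for mpf_class: an unsigned value kept to at most `prec`
// significant bits (1 <= prec < number of bits in unsigned long).
struct ToyFloat {
    unsigned prec;
    unsigned long value;

    explicit ToyFloat(unsigned p = 4, unsigned long v = 0)
        : prec(p), value(truncate(v, p)) {}

    // Copy construction takes the precision of the source.
    ToyFloat(const ToyFloat &) = default;

    // Assignment keeps the destination precision and truncates the value.
    ToyFloat &operator=(const ToyFloat &o) {
        value = truncate(o.value, prec);
        return *this;
    }

    // Clear low bits until at most p significant bits remain.
    static unsigned long truncate(unsigned long v, unsigned p) {
        unsigned shift = 0;
        while (((v >> shift) >> p) != 0)
            ++shift;
        return (v >> shift) << shift;
    }
};
```
- |
-Copying an 8-bit source keeps all 8 bits, while assigning it to a default |
-4-bit object keeps only the top 4.  Templated code which assumes the two forms |
-are interchangeable would see different values. |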
- |
-@deftypefun mpf_class abs (mpf_class @var{op}) |
-@deftypefunx mpf_class ceil (mpf_class @var{op}) |
-@deftypefunx int cmp (mpf_class @var{op1}, type @var{op2}) |
-@deftypefunx int cmp (type @var{op1}, mpf_class @var{op2}) |
-@maybepagebreak |
-@deftypefunx bool mpf_class::fits_sint_p (void) |
-@deftypefunx bool mpf_class::fits_slong_p (void) |
-@deftypefunx bool mpf_class::fits_sshort_p (void) |
-@maybepagebreak |
-@deftypefunx bool mpf_class::fits_uint_p (void) |
-@deftypefunx bool mpf_class::fits_ulong_p (void) |
-@deftypefunx bool mpf_class::fits_ushort_p (void) |
-@maybepagebreak |
-@deftypefunx mpf_class floor (mpf_class @var{op}) |
-@deftypefunx mpf_class hypot (mpf_class @var{op1}, mpf_class @var{op2}) |
-@maybepagebreak |
-@deftypefunx double mpf_class::get_d (void) |
-@deftypefunx long mpf_class::get_si (void) |
-@deftypefunx string mpf_class::get_str (mp_exp_t& @var{exp}, int @var{base} = 10, size_t @var{digits} = 0) |
-@deftypefunx {unsigned long} mpf_class::get_ui (void) |
-@maybepagebreak |
-@deftypefunx int mpf_class::set_str (const char *@var{str}, int @var{base}) |
-@deftypefunx int mpf_class::set_str (const string& @var{str}, int @var{base}) |
-@deftypefunx int sgn (mpf_class @var{op}) |
-@deftypefunx mpf_class sqrt (mpf_class @var{op}) |
-@deftypefunx mpf_class trunc (mpf_class @var{op}) |
-These functions provide a C++ class interface to the corresponding GMP C |
-routines. |
- |
-@code{cmp} can be used with any of the classes or the standard C++ types, |
-except @code{long long} and @code{long double}. |
- |
-The accuracy provided by @code{hypot} is not currently guaranteed. |
-@end deftypefun |
- |
-@deftypefun {unsigned long int} mpf_class::get_prec () |
-@deftypefunx void mpf_class::set_prec (unsigned long @var{prec}) |
-@deftypefunx void mpf_class::set_prec_raw (unsigned long @var{prec}) |
-Get or set the current precision of an @code{mpf_class}. |
- |
-The restrictions described for @code{mpf_set_prec_raw} (@pxref{Initializing |
-Floats}) apply to @code{mpf_class::set_prec_raw}. Note in particular that the |
-@code{mpf_class} must be restored to its allocated precision before being |
-destroyed. This must be done by application code; there's no automatic |
-mechanism for it. |
-@end deftypefun |
- |
- |
-@node C++ Interface Random Numbers, C++ Interface Limitations, C++ Interface Floats, C++ Class Interface |
-@section C++ Interface Random Numbers |
- |
-@deftp Class gmp_randclass |
-The C++ class interface to the GMP random number functions uses |
-@code{gmp_randclass} to hold an algorithm selection and current state, as per |
-@code{gmp_randstate_t}. |
-@end deftp |
- |
-@deftypefun {} gmp_randclass::gmp_randclass (void (*@var{randinit}) (gmp_randstate_t, @dots{}), @dots{}) |
-Construct a @code{gmp_randclass}, using a call to the given @var{randinit} |
-function (@pxref{Random State Initialization}). The arguments expected are |
-the same as @var{randinit}, but with @code{mpz_class} instead of @code{mpz_t}. |
-For example, |
- |
-@example |
-gmp_randclass r1 (gmp_randinit_default); |
-gmp_randclass r2 (gmp_randinit_lc_2exp_size, 32); |
-gmp_randclass r3 (gmp_randinit_lc_2exp, a, c, m2exp); |
-gmp_randclass r4 (gmp_randinit_mt); |
-@end example |
- |
-@code{gmp_randinit_lc_2exp_size} will fail if the size requested is too big; |
-an @code{std::length_error} exception is thrown in that case. |
-@end deftypefun |
- |
-@deftypefun {} gmp_randclass::gmp_randclass (gmp_randalg_t @var{alg}, @dots{}) |
-Construct a @code{gmp_randclass} using the same parameters as |
-@code{gmp_randinit} (@pxref{Random State Initialization}). This function is |
-obsolete and the above @var{randinit} style should be preferred. |
-@end deftypefun |
- |
-@deftypefun void gmp_randclass::seed (unsigned long int @var{s}) |
-@deftypefunx void gmp_randclass::seed (mpz_class @var{s}) |
-Seed a random number generator. @xref{Random Number Functions}, for how |
-to choose a good seed. |
-@end deftypefun |
- |
-@deftypefun mpz_class gmp_randclass::get_z_bits (unsigned long @var{bits}) |
-@deftypefunx mpz_class gmp_randclass::get_z_bits (mpz_class @var{bits}) |
-Generate a random integer with a specified number of bits. |
-@end deftypefun |
- |
-@deftypefun mpz_class gmp_randclass::get_z_range (mpz_class @var{n}) |
-Generate a random integer in the range 0 to @math{@var{n}-1} inclusive. |
-@end deftypefun |
- |
-@deftypefun mpf_class gmp_randclass::get_f () |
-@deftypefunx mpf_class gmp_randclass::get_f (unsigned long @var{prec}) |
-Generate a random float @var{f} in the range @math{0 <= @var{f} < 1}. @var{f} |
-will be to @var{prec} bits precision, or if @var{prec} is not given then to |
-the precision of the destination. For example, |
- |
-@example |
-gmp_randclass r; |
-... |
-mpf_class f (0, 512); // 512 bits precision |
-f = r.get_f(); // random number, 512 bits |
-@end example |
-@end deftypefun |
- |
- |
- |
-@node C++ Interface Limitations, , C++ Interface Random Numbers, C++ Class Interface |
-@section C++ Interface Limitations |
- |
-@table @asis |
-@item @code{mpq_class} and Templated Reading |
-A generic piece of template code probably won't know that @code{mpq_class} |
-requires a @code{canonicalize} call if inputs read with @code{operator>>} |
-might be non-canonical. This can lead to incorrect results. |
- |
-@code{operator>>} behaves as it does for reasons of efficiency. A |
-canonicalize can be quite time consuming on large operands, and is best |
-avoided if it's not necessary. |
- |
-But this potential difficulty reduces the usefulness of @code{mpq_class}. |
-Perhaps a mechanism to tell @code{operator>>} what to do will be adopted in |
-the future, maybe a preprocessor define, a global flag, or an @code{ios} flag |
-pressed into service. Or maybe, at the risk of inconsistency, the |
-@code{mpq_class} @code{operator>>} could canonicalize and leave @code{mpq_t} |
-@code{operator>>} not doing so, for use on those occasions when that's |
-acceptable. Send feedback or alternate ideas to @email{gmp-bugs@@gmplib.org}. |
- |
-@item Subclassing |
-Subclassing the GMP C++ classes works, but is not currently recommended. |
- |
-Expressions involving subclasses resolve correctly (or seem to), but in normal |
-C++ fashion the subclass doesn't inherit constructors and assignments. |
-There are many of those in the GMP classes, and a good way to reestablish them |
-in a subclass is not yet provided. |
- |
-@item Templated Expressions |
-A subtle difficulty exists when using expressions together with |
-application-defined template functions. Consider the following, with @code{T} |
-intended to be some numeric type, |
- |
-@example |
-template <class T> |
-T fun (const T &, const T &); |
-@end example |
- |
-@noindent |
-When used with, say, plain @code{mpz_class} variables, it works fine: @code{T} |
-is resolved as @code{mpz_class}. |
- |
-@example |
-mpz_class f(1), g(2); |
-fun (f, g); // Good |
-@end example |
- |
-@noindent |
-But when one of the arguments is an expression, it doesn't work. |
- |
-@example |
-mpz_class f(1), g(2), h(3); |
-fun (f, g+h); // Bad |
-@end example |
- |
-This is because @code{g+h} ends up being a certain expression template type |
-internal to @code{gmpxx.h}, which the C++ template resolution rules are unable |
-to automatically convert to @code{mpz_class}. The workaround is simply to add |
-an explicit cast. |
- |
-@example |
-mpz_class f(1), g(2), h(3); |
-fun (f, mpz_class(g+h)); // Good |
-@end example |
- |
-Similarly, within @code{fun} it may be necessary to cast an expression to type |
-@code{T} when calling a templated @code{fun2}. |
- |
-@example |
-template <class T> |
-void fun (T f, T g) |
-@{ |
- fun2 (f, f+g); // Bad |
-@} |
- |
-template <class T> |
-void fun (T f, T g) |
-@{ |
- fun2 (f, T(f+g)); // Good |
-@} |
-@end example |
-@end table |
- |
- |
-@node BSD Compatible Functions, Custom Allocation, C++ Class Interface, Top |
-@comment node-name, next, previous, up |
-@chapter Berkeley MP Compatible Functions |
-@cindex Berkeley MP compatible functions |
-@cindex BSD MP compatible functions |
- |
-These functions are intended to be fully compatible with the Berkeley MP |
-library which is available on many BSD derived U*ix systems. The |
-@samp{--enable-mpbsd} option must be used when building GNU MP to make these |
-available (@pxref{Installing GMP}). |
- |
-The original Berkeley MP library has a usage restriction: you cannot use the |
-same variable as both source and destination in a single function call. The |
-compatible functions in GNU MP do not share this restriction---inputs and |
-outputs may overlap. |
- |
-It is not recommended that new programs be written using these functions. |
-Apart from the incomplete set of functions, the interface for initializing |
-@code{MINT} objects is more error prone, and the @code{pow} function collides |
-with @code{pow} in @file{libm.a}. |
- |
-@cindex @code{mp.h} |
-@tindex MINT |
-Include the header @file{mp.h} to get the definition of the necessary types and |
-functions. If you are on a BSD derived system, make sure to include GNU |
-@file{mp.h} if you are going to link the GNU @file{libmp.a} to your program. |
-This means that you probably need to give the @samp{-I<dir>} option to the |
-compiler, where @samp{<dir>} is the directory where you have GNU @file{mp.h}. |
- |
-@deftypefun {MINT *} itom (signed short int @var{initial_value}) |
-Allocate an integer consisting of a @code{MINT} object and dynamic limb space. |
-Initialize the integer to @var{initial_value}. Return a pointer to the |
-@code{MINT} object. |
-@end deftypefun |
- |
-@deftypefun {MINT *} xtom (char *@var{initial_value}) |
-Allocate an integer consisting of a @code{MINT} object and dynamic limb space. |
-Initialize the integer from @var{initial_value}, a hexadecimal, |
-null-terminated C string. Return a pointer to the @code{MINT} object. |
-@end deftypefun |
- |
-@deftypefun void move (MINT *@var{src}, MINT *@var{dest}) |
-Set @var{dest} to @var{src} by copying. Both variables must be previously |
-initialized. |
-@end deftypefun |
- |
-@deftypefun void madd (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination}) |
-Add @var{src_1} and @var{src_2} and put the sum in @var{destination}. |
-@end deftypefun |
- |
-@deftypefun void msub (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination}) |
-Subtract @var{src_2} from @var{src_1} and put the difference in |
-@var{destination}. |
-@end deftypefun |
- |
-@deftypefun void mult (MINT *@var{src_1}, MINT *@var{src_2}, MINT *@var{destination}) |
-Multiply @var{src_1} and @var{src_2} and put the product in @var{destination}. |
-@end deftypefun |
- |
-@deftypefun void mdiv (MINT *@var{dividend}, MINT *@var{divisor}, MINT *@var{quotient}, MINT *@var{remainder}) |
-@deftypefunx void sdiv (MINT *@var{dividend}, signed short int @var{divisor}, MINT *@var{quotient}, signed short int *@var{remainder}) |
-Set @var{quotient} to @var{dividend}/@var{divisor}, and @var{remainder} to |
-@var{dividend} mod @var{divisor}. The quotient is rounded towards zero; the |
-remainder has the same sign as the dividend unless it is zero. |
- |
-Some implementations of these functions work differently---or not at all---for |
-negative arguments. |
-@end deftypefun |
- |
-@deftypefun void msqrt (MINT *@var{op}, MINT *@var{root}, MINT *@var{remainder}) |
-Set @var{root} to @m{\lfloor\sqrt{@var{op}}\rfloor, the truncated integer part |
-of the square root of @var{op}}, like @code{mpz_sqrt}. Set @var{remainder} to |
-@m{(@var{op} - @var{root}^2), @var{op}@minus{}@var{root}*@var{root}}, i.e. |
-zero if @var{op} is a perfect square. |
- |
-If @var{root} and @var{remainder} are the same variable, the results are |
-undefined. |
-@end deftypefun |
- |
-@deftypefun void pow (MINT *@var{base}, MINT *@var{exp}, MINT *@var{mod}, MINT *@var{dest}) |
-Set @var{dest} to (@var{base} raised to @var{exp}) modulo @var{mod}. |
- |
-Note that the name @code{pow} clashes with @code{pow} from the standard C math |
-library (@pxref{Exponents and Logarithms,, Exponentiation and Logarithms, |
-libc, The GNU C Library Reference Manual}). An application will only be able |
-to use one or the other. |
-@end deftypefun |
- |
-@deftypefun void rpow (MINT *@var{base}, signed short int @var{exp}, MINT *@var{dest}) |
-Set @var{dest} to @var{base} raised to @var{exp}. |
-@end deftypefun |
- |
-@deftypefun void gcd (MINT *@var{op1}, MINT *@var{op2}, MINT *@var{res}) |
-Set @var{res} to the greatest common divisor of @var{op1} and @var{op2}. |
-@end deftypefun |
- |
-@deftypefun int mcmp (MINT *@var{op1}, MINT *@var{op2}) |
-Compare @var{op1} and @var{op2}. Return a positive value if @var{op1} > |
-@var{op2}, zero if @var{op1} = @var{op2}, and a negative value if @var{op1} < |
-@var{op2}. |
-@end deftypefun |
- |
-@deftypefun void min (MINT *@var{dest}) |
-Input a decimal string from @code{stdin}, and put the read integer in |
-@var{dest}. SPC and TAB are allowed in the number string, and are ignored. |
-@end deftypefun |
- |
-@deftypefun void mout (MINT *@var{src}) |
-Output @var{src} to @code{stdout}, as a decimal string. Also output a newline. |
-@end deftypefun |
- |
-@deftypefun {char *} mtox (MINT *@var{op}) |
-Convert @var{op} to a hexadecimal string, and return a pointer to the string. |
-The returned string is allocated using the default memory allocation function, |
-@code{malloc} by default. It will be @code{strlen(str)+1} bytes, that being |
-exactly enough for the string and null-terminator. |
-@end deftypefun |
- |
-@deftypefun void mfree (MINT *@var{op}) |
-De-allocate the space used by @var{op}. @strong{This function should only be |
-passed a value returned by @code{itom} or @code{xtom}.} |
-@end deftypefun |
- |
- |
-@node Custom Allocation, Language Bindings, BSD Compatible Functions, Top |
-@comment node-name, next, previous, up |
-@chapter Custom Allocation |
-@cindex Custom allocation |
-@cindex Memory allocation |
-@cindex Allocation of memory |
- |
-By default GMP uses @code{malloc}, @code{realloc} and @code{free} for memory |
-allocation, and if they fail GMP prints a message to the standard error output |
-and terminates the program. |
- |
-Alternate functions can be specified, to allocate memory in a different way or |
-to have a different error action on running out of memory. |
- |
-This feature is available in the Berkeley compatibility library (@pxref{BSD |
-Compatible Functions}) as well as the main GMP library. |
- |
-@deftypefun void mp_set_memory_functions (@* void *(*@var{alloc_func_ptr}) (size_t), @* void *(*@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (*@var{free_func_ptr}) (void *, size_t)) |
-Replace the current allocation functions from the arguments. If an argument |
-is @code{NULL}, the corresponding default function is used. |
- |
-These functions will be used for all memory allocation done by GMP, apart from |
-temporary space from @code{alloca} if that function is available and GMP is |
-configured to use it (@pxref{Build Options}). |
- |
-@strong{Be sure to call @code{mp_set_memory_functions} only when there are no |
-active GMP objects allocated using the previous memory functions! Usually |
-that means calling it before any other GMP function.} |
-@end deftypefun |
- |
-The functions supplied should fit the following declarations: |
- |
-@deftypevr Function {void *} allocate_function (size_t @var{alloc_size}) |
-Return a pointer to newly allocated space with at least @var{alloc_size} |
-bytes. |
-@end deftypevr |
- |
-@deftypevr Function {void *} reallocate_function (void *@var{ptr}, size_t @var{old_size}, size_t @var{new_size}) |
-Resize a previously allocated block @var{ptr} of @var{old_size} bytes to be |
-@var{new_size} bytes. |
- |
-The block may be moved if necessary or if desired, and in that case the |
-smaller of @var{old_size} and @var{new_size} bytes must be copied to the new |
-location. The return value is a pointer to the resized block, that being the |
-new location if moved or just @var{ptr} if not. |
- |
-@var{ptr} is never @code{NULL}, it's always a previously allocated block. |
-@var{new_size} may be bigger or smaller than @var{old_size}. |
-@end deftypevr |
- |
-@deftypevr Function void free_function (void *@var{ptr}, size_t @var{size}) |
-De-allocate the space pointed to by @var{ptr}. |
- |
-@var{ptr} is never @code{NULL}, it's always a previously allocated block of |
-@var{size} bytes. |
-@end deftypevr |
- |
-A @dfn{byte} here means the unit used by the @code{sizeof} operator. |
- |
-The size parameters (@var{old_size} for @var{reallocate_function} and |
-@var{size} for @var{free_function}) are passed for convenience, but of course |
-can be ignored if not needed. The default functions using @code{malloc} and |
-friends for instance don't use them. |
- |
-No error return is allowed from any of these functions; if they return then |
-they must have performed the specified operation. In particular note that |
-@var{allocate_function} or @var{reallocate_function} mustn't return |
-@code{NULL}. |
- |
-Getting a different fatal error action is a good use for custom allocation |
-functions, for example giving a graphical dialog rather than the default print |
-to @code{stderr}. How much is possible when genuinely out of memory is |
-another question though. |
- |
-There's currently no defined way for the allocation functions to recover from |
-an error such as out of memory; they must terminate program execution. A |
-@code{longjmp} or throwing a C++ exception will have undefined results. This |
-may change in the future. |
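- |
-As a minimal sketch of functions fitting the declarations above, something |
-like the following could be passed to @code{mp_set_memory_functions}. The |
-names @code{my_allocate}, @code{my_reallocate}, @code{my_free} and |
-@code{die_out_of_memory} are hypothetical and not part of GMP, and the sketch |
-assumes that printing a message and calling @code{abort} is an acceptable |
-fatal error action. |
- |
```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical fatal error action: report and terminate, never return. */
static void die_out_of_memory(size_t size)
{
    fprintf(stderr, "out of memory requesting %lu bytes\n",
            (unsigned long) size);
    abort();
}

/* Fits the allocate_function declaration: never returns NULL. */
void *my_allocate(size_t alloc_size)
{
    void *p = malloc(alloc_size ? alloc_size : 1);
    if (p == NULL)
        die_out_of_memory(alloc_size);
    return p;
}

/* Fits the reallocate_function declaration.  old_size is ignored because
   realloc already copies the smaller of the old and new sizes. */
void *my_reallocate(void *ptr, size_t old_size, size_t new_size)
{
    void *p;
    (void) old_size;
    p = realloc(ptr, new_size ? new_size : 1);
    if (p == NULL)
        die_out_of_memory(new_size);
    return p;
}

/* Fits the free_function declaration.  size is ignored. */
void my_free(void *ptr, size_t size)
{
    (void) size;
    free(ptr);
}

/* In a program that includes <gmp.h>, before any other GMP call:
       mp_set_memory_functions (my_allocate, my_reallocate, my_free);  */
```
- |
-The registration call is left in a comment so the sketch compiles without |
-@file{gmp.h}. Rounding a zero size up to one byte is a choice made for this |
-sketch only, so that a @code{NULL} return always means failure. |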
- |
-GMP may use allocated blocks to hold pointers to other allocated blocks. This |
-will limit the assumptions a conservative garbage collection scheme can make. |
- |
-Since the default GMP allocation uses @code{malloc} and friends, those |
-functions will be linked in even if the first thing a program does is an |
-@code{mp_set_memory_functions}. It's necessary to change the GMP sources if |
-this is a problem. |
- |
-@sp 1 |
-@deftypefun void mp_get_memory_functions (@* void *(**@var{alloc_func_ptr}) (size_t), @* void *(**@var{realloc_func_ptr}) (void *, size_t, size_t), @* void (**@var{free_func_ptr}) (void *, size_t)) |
-Get the current allocation functions, storing function pointers to the |
-locations given by the arguments. If an argument is @code{NULL}, that |
-function pointer is not stored. |
- |
-@need 1000 |
-For example, to get just the current free function, |
- |
-@example |
-void (*freefunc) (void *, size_t); |
- |
-mp_get_memory_functions (NULL, NULL, &freefunc); |
-@end example |
-@end deftypefun |
- |
-@node Language Bindings, Algorithms, Custom Allocation, Top |
-@chapter Language Bindings |
-@cindex Language bindings |
-@cindex Other languages |
- |
-The following packages and projects offer access to GMP from languages other |
-than C, though perhaps with varying levels of functionality and efficiency. |
- |
-@c @spaceuref{U} is the same as @uref{U}, but with a couple of extra spaces |
-@c in tex, just to separate the URL from the preceding text a bit. |
-@iftex |
-@macro spaceuref {U} |
-@ @ @uref{\U\} |
-@end macro |
-@end iftex |
-@ifnottex |
-@macro spaceuref {U} |
-@uref{\U\} |
-@end macro |
-@end ifnottex |
- |
-@sp 1 |
-@table @asis |
-@item C++ |
-@itemize @bullet |
-@item |
-GMP C++ class interface, @pxref{C++ Class Interface} @* Straightforward |
-interface, expression templates to eliminate temporaries. |
-@item |
-ALP @spaceuref{http://www-sop.inria.fr/saga/logiciels/ALP/} @* Linear algebra and |
-polynomials using templates. |
-@item |
-Arithmos @spaceuref{http://www.win.ua.ac.be/~cant/arithmos/} @* Rationals |
-with infinities and square roots. |
-@item |
-CLN @spaceuref{http://www.ginac.de/CLN/} @* High level classes for arithmetic. |
-@item |
-LiDIA @spaceuref{http://www.cdc.informatik.tu-darmstadt.de/TI/LiDIA/} @* A C++ |
-library for computational number theory. |
-@item |
-Linbox @spaceuref{http://www.linalg.org/} @* Sparse vectors and matrices. |
-@item |
-NTL @spaceuref{http://www.shoup.net/ntl/} @* A C++ number theory library. |
-@end itemize |
- |
-@c @item D |
-@c @itemize @bullet |
-@c @item |
-@c gmp-d @spaceuref{http://home.comcast.net/~benhinkle/gmp-d/} |
-@c @end itemize |
- |
-@item Fortran |
-@itemize @bullet |
-@item |
-Omni F77 @spaceuref{http://phase.hpcc.jp/Omni/home.html} @* Arbitrary |
-precision floats. |
-@end itemize |
- |
-@item Haskell |
-@itemize @bullet |
-@item |
-Glasgow Haskell Compiler @spaceuref{http://www.haskell.org/ghc/} |
-@end itemize |
- |
-@item Java |
-@itemize @bullet |
-@item |
-Kaffe @spaceuref{http://www.kaffe.org/} |
-@item |
-Kissme @spaceuref{http://kissme.sourceforge.net/} |
-@end itemize |
- |
-@item Lisp |
-@itemize @bullet |
-@item |
-GNU Common Lisp @spaceuref{http://www.gnu.org/software/gcl/gcl.html} |
-@item |
-Librep @spaceuref{http://librep.sourceforge.net/} |
-@item |
-@c FIXME: When there's a stable release with gmp support, just refer to it |
-@c rather than bothering to talk about betas. |
-XEmacs (21.5.18 beta and up) @spaceuref{http://www.xemacs.org} @* Optional |
-big integers, rationals and floats using GMP. |
-@end itemize |
- |
-@item M4 |
-@itemize @bullet |
-@item |
-@c FIXME: When there's a stable release with gmp support, just refer to it |
-@c rather than bothering to talk about betas. |
-GNU m4 betas @spaceuref{http://www.seindal.dk/rene/gnu/} @* Optionally provides |
-an arbitrary precision @code{mpeval}. |
-@end itemize |
- |
-@item ML |
-@itemize @bullet |
-@item |
-MLton compiler @spaceuref{http://mlton.org/} |
-@end itemize |
- |
-@item Objective Caml |
-@itemize @bullet |
-@item |
-MLGMP @spaceuref{http://www.di.ens.fr/~monniaux/programmes.html.en} |
-@item |
-Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* Optionally using |
-GMP. |
-@end itemize |
- |
-@item Oz |
-@itemize @bullet |
-@item |
-Mozart @spaceuref{http://www.mozart-oz.org/} |
-@end itemize |
- |
-@item Pascal |
-@itemize @bullet |
-@item |
-GNU Pascal Compiler @spaceuref{http://www.gnu-pascal.de/} @* GMP unit. |
-@item |
-Numerix @spaceuref{http://pauillac.inria.fr/~quercia/} @* For Free Pascal, |
-optionally using GMP. |
-@end itemize |
- |
-@item Perl |
-@itemize @bullet |
-@item |
-GMP module, see @file{demos/perl} in the GMP sources (@pxref{Demonstration |
-Programs}). |
-@item |
-Math::GMP @spaceuref{http://www.cpan.org/} @* Compatible with Math::BigInt, but |
-not as many functions as the GMP module above. |
-@item |
-Math::BigInt::GMP @spaceuref{http://www.cpan.org/} @* Plug Math::GMP into |
-normal Math::BigInt operations. |
-@end itemize |
- |
-@need 1000 |
-@item Pike |
-@itemize @bullet |
-@item |
-mpz module in the standard distribution, @uref{http://pike.ida.liu.se/} |
-@end itemize |
- |
-@need 500 |
-@item Prolog |
-@itemize @bullet |
-@item |
-SWI Prolog @spaceuref{http://www.swi-prolog.org/} @* |
-Arbitrary precision floats. |
-@end itemize |
- |
-@item Python |
-@itemize @bullet |
-@item |
-mpz module in the standard distribution, @uref{http://www.python.org/} |
-@item |
-GMPY @uref{http://gmpy.sourceforge.net/} |
-@end itemize |
- |
-@item Scheme |
-@itemize @bullet |
-@item |
-GNU Guile (upcoming 1.8) @spaceuref{http://www.gnu.org/software/guile/guile.html} |
-@item |
-RScheme @spaceuref{http://www.rscheme.org/} |
-@item |
-STklos @spaceuref{http://www.stklos.org/} |
-@c |
-@c For reference, MzScheme uses some of gmp, but (as of version 205) it only |
-@c has copies of some of the generic C code, and we don't consider that a |
-@c language binding to gmp. |
-@c |
-@end itemize |
- |
-@item Smalltalk |
-@itemize @bullet |
-@item |
-GNU Smalltalk @spaceuref{http://www.smalltalk.org/versions/GNUSmalltalk.html} |
-@end itemize |
- |
-@item Other |
-@itemize @bullet |
-@item |
-Axiom @uref{http://savannah.nongnu.org/projects/axiom} @* Computer algebra |
-using GCL. |
-@item |
-DrGenius @spaceuref{http://drgenius.seul.org/} @* Geometry system and |
-mathematical programming language. |
-@item |
-GiNaC @spaceuref{http://www.ginac.de/} @* C++ computer algebra using CLN. |
-@item |
-GOO @spaceuref{http://www.googoogaga.org/} @* Dynamic object oriented |
-language. |
-@item |
-Maxima @uref{http://www.ma.utexas.edu/users/wfs/maxima.html} @* Macsyma |
-computer algebra using GCL. |
-@item |
-Q @spaceuref{http://q-lang.sourceforge.net/} @* Equational programming system. |
-@item |
-Regina @spaceuref{http://regina.sourceforge.net/} @* Topological calculator. |
-@item |
-Yacas @spaceuref{http://www.xs4all.nl/~apinkus/yacas.html} @* Yet another |
-computer algebra system. |
-@end itemize |
- |
-@end table |
- |
- |
-@node Algorithms, Internals, Language Bindings, Top |
-@chapter Algorithms |
-@cindex Algorithms |
- |
-This chapter is an introduction to some of the algorithms used for various GMP |
-operations. The code is likely to be hard to understand without knowing |
-something about the algorithms. |
- |
-Some GMP internals are mentioned, but applications that expect to be |
-compatible with future GMP releases should take care to use only the |
-documented functions. |
- |
-@menu |
-* Multiplication Algorithms:: |
-* Division Algorithms:: |
-* Greatest Common Divisor Algorithms:: |
-* Powering Algorithms:: |
-* Root Extraction Algorithms:: |
-* Radix Conversion Algorithms:: |
-* Other Algorithms:: |
-* Assembly Coding:: |
-@end menu |
- |
- |
-@node Multiplication Algorithms, Division Algorithms, Algorithms, Algorithms |
-@section Multiplication |
-@cindex Multiplication algorithms |
- |
-N@cross{}N limb multiplications and squares are done using one of five |
-algorithms, as the size N increases. |
- |
-@quotation |
-@multitable {KaratsubaMMM} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} |
-@item Algorithm @tab Threshold |
-@item Basecase @tab (none) |
-@item Karatsuba @tab @code{MUL_KARATSUBA_THRESHOLD} |
-@item Toom-3 @tab @code{MUL_TOOM33_THRESHOLD} |
-@item Toom-4 @tab @code{MUL_TOOM44_THRESHOLD} |
-@item FFT @tab @code{MUL_FFT_THRESHOLD} |
-@end multitable |
-@end quotation |
- |
-Similarly for squaring, with the @code{SQR} thresholds. |
- |
-N@cross{}M multiplications of operands with different sizes above |
-@code{MUL_KARATSUBA_THRESHOLD} are currently done by special Toom-inspired |
-algorithms or directly with FFT, depending on operand size (@pxref{Unbalanced |
-Multiplication}). |
- |
-@menu |
-* Basecase Multiplication:: |
-* Karatsuba Multiplication:: |
-* Toom 3-Way Multiplication:: |
-* Toom 4-Way Multiplication:: |
-* FFT Multiplication:: |
-* Other Multiplication:: |
-* Unbalanced Multiplication:: |
-@end menu |
- |
- |
-@node Basecase Multiplication, Karatsuba Multiplication, Multiplication Algorithms, Multiplication Algorithms |
-@subsection Basecase Multiplication |
- |
-Basecase N@cross{}M multiplication is a straightforward rectangular set of |
-cross-products, the same as long multiplication done by hand and for that |
-reason sometimes known as the schoolbook or grammar school method. This is an |
-@m{O(NM),O(N*M)} algorithm. See Knuth section 4.3.1 algorithm M |
-(@pxref{References}), and the @file{mpn/generic/mul_basecase.c} code. |
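- |
-The rectangular set of cross-products can be sketched as follows. This is an |
-illustration only, not the code in @file{mpn/generic/mul_basecase.c}: the |
-name @code{mul_basecase_sketch} and the 32-bit @code{limb_t} are assumptions |
-made so that a 64-bit accumulator can hold a product plus carries. |
- |
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limb type for this sketch; least significant limb first. */
typedef uint32_t limb_t;

/* Set r[0..n+m) to u[0..n) * v[0..m).  r must not overlap u or v. */
void mul_basecase_sketch(limb_t *r, const limb_t *u, size_t n,
                         const limb_t *v, size_t m)
{
    size_t i, j;

    for (i = 0; i < n + m; i++)
        r[i] = 0;

    /* One row of cross-products per limb of v, as in long multiplication. */
    for (j = 0; j < m; j++) {
        uint64_t carry = 0;
        for (i = 0; i < n; i++) {
            /* product + existing limb + carry cannot exceed 2^64 - 1 */
            uint64_t t = (uint64_t) u[i] * v[j] + r[i + j] + carry;
            r[i + j] = (limb_t) t;
            carry = t >> 32;
        }
        r[j + n] = (limb_t) carry;  /* this limb is still zero at this point */
    }
}
```
- |
-The two nested loops make the @m{O(NM),O(N*M)} cost directly visible. |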
- |
-Assembly implementations of @code{mpn_mul_basecase} are essentially the same |
-as the generic C code, but have all the usual assembly tricks and |
-obscurities introduced for speed. |
- |
-A square can be done in roughly half the time of a multiply, by using the fact |
-that the cross products above and below the diagonal are the same. A triangle |
-of products below the diagonal is formed, doubled (left shift by one bit), and |
-then the products on the diagonal added. This can be seen in |
-@file{mpn/generic/sqr_basecase.c}. Again the assembly implementations take |
-essentially the same approach. |
- |
-@tex |
-\def\GMPline#1#2#3#4#5#6{% |
- \hbox {% |
- \vrule height 2.5ex depth 1ex |
- \hbox to 2em {\hfil{#2}\hfil}% |
- \vrule \hbox to 2em {\hfil{#3}\hfil}% |
- \vrule \hbox to 2em {\hfil{#4}\hfil}% |
- \vrule \hbox to 2em {\hfil{#5}\hfil}% |
- \vrule \hbox to 2em {\hfil{#6}\hfil}% |
- \vrule}} |
-\GMPdisplay{ |
- \hbox{% |
- \vbox{% |
- \hbox to 1.5em {\vrule height 2.5ex depth 1ex width 0pt}% |
- \hbox {\vrule height 2.5ex depth 1ex width 0pt u0\hfil}% |
- \hbox {\vrule height 2.5ex depth 1ex width 0pt u1\hfil}% |
- \hbox {\vrule height 2.5ex depth 1ex width 0pt u2\hfil}% |
- \hbox {\vrule height 2.5ex depth 1ex width 0pt u3\hfil}% |
- \hbox {\vrule height 2.5ex depth 1ex width 0pt u4\hfil}% |
- \vfill}% |
- \vbox{% |
- \hbox{% |
- \hbox to 2em {\hfil u0\hfil}% |
- \hbox to 2em {\hfil u1\hfil}% |
- \hbox to 2em {\hfil u2\hfil}% |
- \hbox to 2em {\hfil u3\hfil}% |
- \hbox to 2em {\hfil u4\hfil}}% |
- \vskip 0.7ex |
- \hrule |
- \GMPline{u0}{d}{}{}{}{}% |
- \hrule |
- \GMPline{u1}{}{d}{}{}{}% |
- \hrule |
- \GMPline{u2}{}{}{d}{}{}% |
- \hrule |
- \GMPline{u3}{}{}{}{d}{}% |
- \hrule |
- \GMPline{u4}{}{}{}{}{d}% |
- \hrule}}} |
-@end tex |
-@ifnottex |
-@example |
-@group |
- u0 u1 u2 u3 u4 |
- +---+---+---+---+---+ |
-u0 | d | | | | | |
- +---+---+---+---+---+ |
-u1 | | d | | | | |
- +---+---+---+---+---+ |
-u2 | | | d | | | |
- +---+---+---+---+---+ |
-u3 | | | | d | | |
- +---+---+---+---+---+ |
-u4 | | | | | d | |
- +---+---+---+---+---+ |
-@end group |
-@end example |
-@end ifnottex |
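- |
-The triangle, doubling and diagonal steps described above can be sketched as |
-follows. Again this is an illustration under assumptions, not the code in |
-@file{mpn/generic/sqr_basecase.c}: @code{sqr_basecase_sketch} and the 32-bit |
-@code{limb_t} are hypothetical names chosen for the example. |
- |
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limb type for this sketch; least significant limb first. */
typedef uint32_t limb_t;

/* Set r[0..2n) to the square of u[0..n).  r must not overlap u. */
void sqr_basecase_sketch(limb_t *r, const limb_t *u, size_t n)
{
    size_t i, j;
    limb_t top = 0;
    uint64_t carry;

    for (i = 0; i < 2 * n; i++)
        r[i] = 0;

    /* Step 1: products u[i]*u[j] with i < j, one side of the diagonal. */
    for (i = 0; i < n; i++) {
        carry = 0;
        for (j = i + 1; j < n; j++) {
            uint64_t t = (uint64_t) u[i] * u[j] + r[i + j] + carry;
            r[i + j] = (limb_t) t;
            carry = t >> 32;
        }
        r[i + n] = (limb_t) carry;
    }

    /* Step 2: double the triangle with a one-bit left shift. */
    for (i = 0; i < 2 * n; i++) {
        limb_t next_top = r[i] >> 31;
        r[i] = (r[i] << 1) | top;
        top = next_top;
    }

    /* Step 3: add the diagonal products u[i]^2 at limb positions 2i, 2i+1. */
    carry = 0;
    for (i = 0; i < n; i++) {
        uint64_t sq = (uint64_t) u[i] * u[i];
        uint64_t t = (uint64_t) r[2 * i] + (limb_t) sq + carry;
        r[2 * i] = (limb_t) t;
        t = (uint64_t) r[2 * i + 1] + (sq >> 32) + (t >> 32);
        r[2 * i + 1] = (limb_t) t;
        carry = t >> 32;
    }
}
```
- |
-Step 1 performs about half the limb multiplications of a full N@cross{}N |
-product, which is where the saving comes from. |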
- |
-In practice squaring isn't a full 2@cross{} faster than multiplying; it's |
-usually around 1.5@cross{}. A speedup below 1.5@cross{} probably indicates |
-that @code{mpn_sqr_basecase} needs improving on that CPU. |
- |
-On some CPUs @code{mpn_mul_basecase} can be faster than the generic C |
-@code{mpn_sqr_basecase} on some small sizes. @code{SQR_BASECASE_THRESHOLD} is |
-the size at which to use @code{mpn_sqr_basecase}; this will be zero if that |
-routine should always be used. |
- |
- |
-@node Karatsuba Multiplication, Toom 3-Way Multiplication, Basecase Multiplication, Multiplication Algorithms |
-@subsection Karatsuba Multiplication |
-@cindex Karatsuba multiplication |
- |
-The Karatsuba multiplication algorithm is described in Knuth section 4.3.3 |
-part A, and various other textbooks. A brief description is given here. |
- |
-The inputs @math{x} and @math{y} are treated as each split into two parts of |
-equal length (or the most significant part one limb shorter if N is odd). |
- |
-@tex |
-% GMPboxwidth used for all the multiplication pictures |
-\global\newdimen\GMPboxwidth \global\GMPboxwidth=5em |
-% GMPboxdepth and GMPboxheight are also used for the float pictures |
-\global\newdimen\GMPboxdepth \global\GMPboxdepth=1ex |
-\global\newdimen\GMPboxheight \global\GMPboxheight=2ex |
-\gdef\GMPvrule{\vrule height \GMPboxheight depth \GMPboxdepth} |
-\def\GMPbox#1#2{% |
- \vbox {% |
- \hrule |
- \hbox to 2\GMPboxwidth{% |
- \GMPvrule \hfil $#1$\hfil \vrule \hfil $#2$\hfil \vrule}% |
- \hrule}} |
-\GMPdisplay{% |
-\vbox{% |
- \hbox to 2\GMPboxwidth {high \hfil low} |
- \vskip 0.7ex |
- \GMPbox{x_1}{x_0} |
- \vskip 0.5ex |
- \GMPbox{y_1}{y_0} |
-}} |
-@end tex |
-@ifnottex |
-@example |
-@group |
- high low |
-+----------+----------+ |
-| x1 | x0 | |
-+----------+----------+ |
- |
-+----------+----------+ |
-| y1 | y0 | |
-+----------+----------+ |
-@end group |
-@end example |
-@end ifnottex |
- |
-Let @math{b} be the power of 2 where the split occurs, i.e.@: if @ms{x,0} is |
-@math{k} limbs (@ms{y,0} the same) then |
-@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}. |
-With that @m{x=x_1b+x_0,x=x1*b+x0} and @m{y=y_1b+y_0,y=y1*b+y0}, and the |
-following holds, |
- |
-@display |
-@m{xy = (b^2+b)x_1y_1 - b(x_1-x_0)(y_1-y_0) + (b+1)x_0y_0, |
- x*y = (b^2+b)*x1*y1 - b*(x1-x0)*(y1-y0) + (b+1)*x0*y0} |
-@end display |
- |
-This formula means doing only three multiplies of (N/2)@cross{}(N/2) limbs, |
-whereas a basecase multiply of N@cross{}N limbs is equivalent to four |
-multiplies of (N/2)@cross{}(N/2). The factors @math{(b^2+b)} etc represent |
-the positions where the three products must be added. |
- |
-@tex |
-\def\GMPboxA#1#2{% |
- \vbox{% |
- \hrule |
- \hbox{% |
- \GMPvrule |
- \hbox to 2\GMPboxwidth {\hfil\hbox{$#1$}\hfil}% |
- \vrule |
- \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}% |
- \vrule} |
- \hrule}} |
-\def\GMPboxB#1#2{% |
- \hbox{% |
- \raise \GMPboxdepth \hbox to \GMPboxwidth {\hfil #1\hskip 0.5em}% |
- \vbox{% |
- \hrule |
- \hbox{% |
- \GMPvrule |
- \hbox to 2\GMPboxwidth {\hfil\hbox{$#2$}\hfil}% |
- \vrule}% |
- \hrule}}} |
-\GMPdisplay{% |
-\vbox{% |
- \hbox to 4\GMPboxwidth {high \hfil low} |
- \vskip 0.7ex |
- \GMPboxA{x_1y_1}{x_0y_0} |
- \vskip 0.5ex |
- \GMPboxB{$+$}{x_1y_1} |
- \vskip 0.5ex |
- \GMPboxB{$+$}{x_0y_0} |
- \vskip 0.5ex |
- \GMPboxB{$-$}{(x_1-x_0)(y_1-y_0)} |
-}} |
-@end tex |
-@ifnottex |
-@example |
-@group |
- high low |
-+--------+--------+ +--------+--------+ |
-| x1*y1 | | x0*y0 | |
-+--------+--------+ +--------+--------+ |
- +--------+--------+ |
- add | x1*y1 | |
- +--------+--------+ |
- +--------+--------+ |
- add | x0*y0 | |
- +--------+--------+ |
- +--------+--------+ |
- sub | (x1-x0)*(y1-y0) | |
- +--------+--------+ |
-@end group |
-@end example |
-@end ifnottex |
- |
-The term @m{(x_1-x_0)(y_1-y_0),(x1-x0)*(y1-y0)} is best calculated as an |
-absolute value, and the sign used to choose to add or subtract. Notice the |
-sum @m{\mathop{\rm high}(x_0y_0)+\mathop{\rm low}(x_1y_1), |
-high(x0*y0)+low(x1*y1)} occurs twice, so it's possible to do @m{5k,5*k} limb |
-additions, rather than @m{6k,6*k}, but in GMP extra function call overheads |
-outweigh the saving. |
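- |
-A single level of the formula can be sketched on a 64@cross{}64 bit product |
-split into 32-bit halves, so that @math{b=2^32}. The middle term is computed |
-as an absolute value with a separate sign, as described above. The names |
-@code{karatsuba_64x64} and @code{u128_sketch} are hypothetical, and this is |
-not how GMP's limb-vector code is organised. |
- |
```c
#include <stdint.h>

/* Hypothetical 128-bit result type for this sketch. */
typedef struct { uint64_t hi, lo; } u128_sketch;

/* x*y using three 32x32 multiplies instead of four. */
u128_sketch karatsuba_64x64(uint64_t x, uint64_t y)
{
    uint32_t x1 = (uint32_t) (x >> 32), x0 = (uint32_t) x;
    uint32_t y1 = (uint32_t) (y >> 32), y0 = (uint32_t) y;

    uint64_t z2 = (uint64_t) x1 * y1;          /* x1*y1 */
    uint64_t z0 = (uint64_t) x0 * y0;          /* x0*y0 */

    /* |x1-x0| * |y1-y0|, with the sign of the true product kept apart. */
    uint32_t dx = x1 >= x0 ? x1 - x0 : x0 - x1;
    uint32_t dy = y1 >= y0 ? y1 - y0 : y0 - y1;
    uint64_t m = (uint64_t) dx * dy;
    int negative = (x1 < x0) != (y1 < y0);

    /* mid = x1*y1 + x0*y0 - (x1-x0)*(y1-y0), a value of up to 65 bits
       held as mid_hi*2^64 + mid. */
    uint64_t mid = z2 + z0;
    uint64_t mid_hi = mid < z2;
    if (negative) {
        mid += m;
        mid_hi += mid < m;
    } else {
        mid_hi -= mid < m;
        mid -= m;
    }

    /* result = z2*b^2 + mid*b + z0 */
    {
        u128_sketch r;
        r.lo = z0 + (mid << 32);
        r.hi = z2 + (mid >> 32) + (mid_hi << 32) + (r.lo < z0);
        return r;
    }
}
```
- |
-The three products @code{z2}, @code{z0} and @code{m} correspond to the three |
-boxes in the picture above, with @code{m} added or subtracted according to |
-its sign. |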
- |
-Squaring is similar to multiplying, but with @math{x=y} the formula reduces to |
-an equivalent with three squares, |
- |
-@display |
-@m{x^2 = (b^2+b)x_1^2 - b(x_1-x_0)^2 + (b+1)x_0^2, |
- x^2 = (b^2+b)*x1^2 - b*(x1-x0)^2 + (b+1)*x0^2} |
-@end display |
- |
-The final result is accumulated from those three squares the same way as for |
-the three multiplies above. The middle term @m{(x_1-x_0)^2,(x1-x0)^2} is now |
-always positive. |
- |
-A similar formula for both multiplying and squaring can be constructed with a |
-middle term @m{(x_1+x_0)(y_1+y_0),(x1+x0)*(y1+y0)}. But those sums can exceed |
-@math{k} limbs, leading to more carry handling and additions than the form |
-above. |
- |
-Karatsuba multiplication is asymptotically an @math{O(N^@W{1.585})} algorithm, |
-the exponent being @m{\log3/\log2,log(3)/log(2)}, representing 3 multiplies |
-each @math{1/2} the size of the inputs. This is a big improvement over the |
-basecase multiply at @math{O(N^2)} and the advantage soon overcomes the extra |
-additions Karatsuba performs. @code{MUL_KARATSUBA_THRESHOLD} can be as little |
-as 10 limbs. The @code{SQR} threshold is usually about twice the @code{MUL}. |
- |
-The basecase algorithm will take a time of the form @m{M(N) = aN^2 + bN + c, |
-M(N) = a*N^2 + b*N + c} and the Karatsuba algorithm @m{K(N) = 3M(N/2) + dN + |
-e, K(N) = 3*M(N/2) + d*N + e}, which expands to @m{K(N) = {3\over4} aN^2 + |
-{3\over2} bN + 3c + dN + e, K(N) = 3/4*a*N^2 + 3/2*b*N + 3*c + d*N + e}. The |
-factor @m{3\over4, 3/4} for @math{a} means per-crossproduct speedups in the |
-basecase code will increase the threshold since they benefit @math{M(N)} more |
-than @math{K(N)}. And conversely the @m{3\over2, 3/2} for @math{b} means |
-linear style speedups of @math{b} will decrease the threshold since they |
-benefit @math{K(N)} more than @math{M(N)}. The latter can be seen for |
-instance when adding an optimized @code{mpn_sqr_diagonal} to |
-@code{mpn_sqr_basecase}. Of course all speedups reduce total time, and in |
-that sense the algorithm thresholds are merely of academic interest. |
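- |
-The cost model can be explored numerically.  The sketch below is a minimal C |
-rendering of @math{M(N)} and @math{K(N)} as given above, with a hypothetical |
-helper @code{model_threshold} that searches for the smallest @math{N} at which |
-the Karatsuba estimate drops below the basecase estimate.  The coefficient |
-values passed in are whatever the caller assumes; none are measured GMP |
-figures. |
- |
```c
/* Model costs from the text: M(N) = a*N^2 + b*N + c for the basecase and
   K(N) = 3*M(N/2) + d*N + e for one Karatsuba level. */
typedef struct { double a, b, c, d, e; } cost_model;

static double basecase_time(const cost_model *m, double n)
{
    return m->a * n * n + m->b * n + m->c;
}

static double karatsuba_time(const cost_model *m, double n)
{
    return 3.0 * basecase_time(m, n / 2.0) + m->d * n + m->e;
}

/* Smallest N in [2, limit] with K(N) < M(N), or -1 if there is none. */
int model_threshold(const cost_model *m, int limit)
{
    for (int n = 2; n <= limit; n++)
        if (karatsuba_time(m, n) < basecase_time(m, n))
            return n;
    return -1;
}
```
- |
-Calling @code{model_threshold} with one coefficient changed at a time shows |
-how the crossover moves in response to a speedup in @math{a} or in @math{b}. |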
- |
- |
-@node Toom 3-Way Multiplication, Toom 4-Way Multiplication, Karatsuba Multiplication, Multiplication Algorithms |
-@subsection Toom 3-Way Multiplication |
-@cindex Toom multiplication |
- |
-The Karatsuba formula is the simplest case of a general approach to splitting |
-inputs that leads to both Toom and FFT algorithms. A description of |
-Toom can be found in Knuth section 4.3.3, with an example 3-way |
-calculation after Theorem A@. The 3-way form used in GMP is described here. |
- |
-The operands are each considered split into 3 pieces of equal length (or the |
-most significant part 1 or 2 limbs shorter than the other two). |
- |
-@tex |
-\def\GMPbox#1#2#3{% |
- \vbox{% |
- \hrule \vfil |
- \hbox to 3\GMPboxwidth {% |
- \GMPvrule |
- \hfil$#1$\hfil |
- \vrule |
- \hfil$#2$\hfil |
- \vrule |
- \hfil$#3$\hfil |
- \vrule}% |
- \vfil \hrule |
-}} |
-\GMPdisplay{% |
-\vbox{% |
- \hbox to 3\GMPboxwidth {high \hfil low} |
- \vskip 0.7ex |
- \GMPbox{x_2}{x_1}{x_0} |
- \vskip 0.5ex |
- \GMPbox{y_2}{y_1}{y_0} |
- \vskip 0.5ex |
-}} |
-@end tex |
-@ifnottex |
-@example |
-@group |
- high low |
-+----------+----------+----------+ |
-| x2 | x1 | x0 | |
-+----------+----------+----------+ |
- |
-+----------+----------+----------+ |
-| y2 | y1 | y0 | |
-+----------+----------+----------+ |
-@end group |
-@end example |
-@end ifnottex |
- |
-@noindent |
-These parts are treated as the coefficients of two polynomials |
- |
-@display |
-@group |
-@m{X(t) = x_2t^2 + x_1t + x_0, |
- X(t) = x2*t^2 + x1*t + x0} |
-@m{Y(t) = y_2t^2 + y_1t + y_0, |
- Y(t) = y2*t^2 + y1*t + y0} |
-@end group |
-@end display |
- |
-Let @math{b} equal the power of 2 which is the size of the @ms{x,0}, @ms{x,1}, |
-@ms{y,0} and @ms{y,1} pieces, ie.@: if they're @math{k} limbs each then |
-@m{b=2\GMPraise{$k*$@code{mp\_bits\_per\_limb}}, b=2^(k*mp_bits_per_limb)}. |
-With this @math{x=X(b)} and @math{y=Y(b)}. |
- |
-Let a polynomial @m{W(t)=X(t)Y(t),W(t)=X(t)*Y(t)} and suppose its coefficients |
-are |
- |
-@display |
-@m{W(t) = w_4t^4 + w_3t^3 + w_2t^2 + w_1t + w_0, |
- W(t) = w4*t^4 + w3*t^3 + w2*t^2 + w1*t + w0} |
-@end display |
- |
-The @m{w_i,w[i]} are going to be determined, and when they are they'll give |
-the final result using @math{w=W(b)}, since |
-@m{xy=X(b)Y(b)=W(b),x*y=X(b)*Y(b)=W(b)}. The coefficients will be roughly |
-@math{b^2} each, and the final @math{W(b)} will be an addition like, |
- |
-@tex |
-\def\GMPbox#1#2{% |
- \moveright #1\GMPboxwidth |
- \vbox{% |
- \hrule |
- \hbox{% |
- \GMPvrule |
- \hbox to 2\GMPboxwidth {\hfil$#2$\hfil}% |
- \vrule}% |
- \hrule |
-}} |
-\GMPdisplay{% |
-\vbox{% |
- \hbox to 6\GMPboxwidth {high \hfil low}% |
- \vskip 0.7ex |
- \GMPbox{0}{w_4} |
- \vskip 0.5ex |
- \GMPbox{1}{w_3} |
- \vskip 0.5ex |
- \GMPbox{2}{w_2} |
- \vskip 0.5ex |
- \GMPbox{3}{w_1} |
- \vskip 0.5ex |
- \GMPbox{4}{w_0} |
-}} |
-@end tex |
-@ifnottex |
-@example |
-@group |
- high low |
-+-------+-------+ |
-| w4 | |
-+-------+-------+ |
- +--------+-------+ |
- | w3 | |
- +--------+-------+ |
- +--------+-------+ |
- | w2 | |
- +--------+-------+ |
- +--------+-------+ |
- | w1 | |
- +--------+-------+ |
- +-------+-------+ |
- | w0 | |
- +-------+-------+ |
-@end group |
-@end example |
-@end ifnottex |
- |
-The @m{w_i,w[i]} coefficients could be formed by a simple set of cross |
-products, like @m{w_4=x_2y_2,w4=x2*y2}, @m{w_3=x_2y_1+x_1y_2,w3=x2*y1+x1*y2}, |
-@m{w_2=x_2y_0+x_1y_1+x_0y_2,w2=x2*y0+x1*y1+x0*y2} etc, but this would need all |
-nine @m{x_iy_j,x[i]*y[j]} for @math{i,j=0,1,2}, and would be equivalent merely |
-to a basecase multiply. Instead the following approach is used. |
- |
-@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 5 points, giving |
-values of @math{W(t)} at those points. In GMP the following points are used, |
- |
-@quotation |
-@multitable {@m{t=\infty,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} |
-@item Point @tab Value |
-@item @math{t=0} @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately |
-@item @math{t=1} @tab @m{(x_2+x_1+x_0)(y_2+y_1+y_0),(x2+x1+x0) * (y2+y1+y0)} |
-@item @math{t=-1} @tab @m{(x_2-x_1+x_0)(y_2-y_1+y_0),(x2-x1+x0) * (y2-y1+y0)} |
-@item @math{t=2} @tab @m{(4x_2+2x_1+x_0)(4y_2+2y_1+y_0),(4*x2+2*x1+x0) * (4*y2+2*y1+y0)} |
-@item @m{t=\infty,t=inf} @tab @m{x_2y_2,x2 * y2}, which gives @ms{w,4} immediately |
-@end multitable |
-@end quotation |
- |
-At @math{t=-1} the values can be negative and that's handled using the |
-absolute values and tracking the sign separately. At @m{t=\infty,t=inf} the |
-value is actually @m{\lim_{t\to\infty} {X(t)Y(t)\over t^4}, X(t)*Y(t)/t^4 in |
-the limit as t approaches infinity}, but it's much easier to think of as |
-simply @m{x_2y_2,x2*y2} giving @ms{w,4} immediately (much like |
-@m{x_0y_0,x0*y0} at @math{t=0} gives @ms{w,0} immediately). |
- |
-Each of the points substituted into |
-@m{W(t)=w_4t^4+\cdots+w_0,W(t)=w4*t^4+@dots{}+w0} gives a linear combination |
-of the @m{w_i,w[i]} coefficients, and the value of those combinations has just |
-been calculated. |
- |
-@tex |
-\GMPdisplay{% |
-$\matrix{% |
-W(0) & = & & & & & & & & & w_0 \cr |
-W(1) & = & w_4 & + & w_3 & + & w_2 & + & w_1 & + & w_0 \cr |
-W(-1) & = & w_4 & - & w_3 & + & w_2 & - & w_1 & + & w_0 \cr |
-W(2) & = & 16w_4 & + & 8w_3 & + & 4w_2 & + & 2w_1 & + & w_0 \cr |
-W(\infty) & = & w_4 \cr |
-}$} |
-@end tex |
-@ifnottex |
-@example |
-@group |
-W(0) = w0 |
-W(1) = w4 + w3 + w2 + w1 + w0 |
-W(-1) = w4 - w3 + w2 - w1 + w0 |
-W(2) = 16*w4 + 8*w3 + 4*w2 + 2*w1 + w0 |
-W(inf) = w4 |
-@end group |
-@end example |
-@end ifnottex |
- |
-This is a set of five equations in five unknowns, and some elementary linear |
-algebra quickly isolates each @m{w_i,w[i]}. This involves adding or |
-subtracting one @math{W(t)} value from another, and a couple of divisions by |
-powers of 2 and one division by 3, the latter using the special |
-@code{mpn_divexact_by3} (@pxref{Exact Division}). |
- |
-The conversion of @math{W(t)} values to the coefficients is interpolation. A |
-polynomial of degree 4 like @math{W(t)} is uniquely determined by values known |
-at 5 different points. The points are arbitrary and can be chosen to make the |
-linear equations come out with a convenient set of steps for quickly isolating |
-the @m{w_i,w[i]}. |
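- |
-The evaluate, multiply and interpolate steps can be sketched in C on small |
-numbers.  The code below assumes operands under @math{2^30} split into three |
-10-bit pieces (so @math{b=2^10}), uses signed 64-bit arithmetic in place of |
-the separate sign tracking, and follows one possible interpolation sequence |
-with three halvings and one division by 3.  It is not the sequence used by |
-@code{toom3_interpolate}, and the name @code{toom3_mul30} is hypothetical. |
- |
```c
#include <stdint.h>

/* One level of Toom-3 for x, y < 2^30 using 10-bit pieces (b = 2^10).
   w[0..4] receive the coefficients of W(t); the return value is W(b). */
uint64_t toom3_mul30(uint32_t x, uint32_t y, int64_t w[5])
{
    int64_t x0 = x & 0x3FF, x1 = (x >> 10) & 0x3FF, x2 = (x >> 20) & 0x3FF;
    int64_t y0 = y & 0x3FF, y1 = (y >> 10) & 0x3FF, y2 = (y >> 20) & 0x3FF;

    /* Evaluate X and Y and multiply pointwise at t = 0, 1, -1, 2, inf. */
    int64_t W0   = x0 * y0;
    int64_t W1   = (x2 + x1 + x0) * (y2 + y1 + y0);
    int64_t Wm1  = (x2 - x1 + x0) * (y2 - y1 + y0);
    int64_t W2   = (4*x2 + 2*x1 + x0) * (4*y2 + 2*y1 + y0);
    int64_t Winf = x2 * y2;

    /* Interpolate.  Every division below is exact. */
    w[0] = W0;
    w[4] = Winf;
    int64_t even = (W1 + Wm1) / 2;                    /* w4 + w2 + w0 */
    int64_t odd  = (W1 - Wm1) / 2;                    /* w3 + w1      */
    w[2] = even - w[4] - w[0];
    int64_t t = (W2 - w[0] - 4*w[2] - 16*w[4]) / 2;   /* 4*w3 + w1    */
    w[3] = (t - odd) / 3;
    w[1] = odd - w[3];

    /* Recombine by Horner's rule: W(b) with b = 2^10. */
    uint64_t r = 0;
    for (int i = 4; i >= 0; i--)
        r = (r << 10) + (uint64_t)w[i];
    return r;
}
```
- |
-With pieces @math{(1,2,3)} and @math{(4,5,6)}, high to low, the coefficients |
-come out as 4, 13, 28, 27, 18 for @math{w_4} down to @math{w_0}, matching the |
-direct cross products. |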
- |
-Squaring follows the same procedure as multiplication, but there's only one |
-@math{X(t)} and it's evaluated at the 5 points, and those values squared to |
-give values of @math{W(t)}. The interpolation is then identical, and in fact |
-the same @code{toom3_interpolate} subroutine is used for both squaring and |
-multiplying. |
- |
-Toom-3 is asymptotically @math{O(N^@W{1.465})}, the exponent being |
-@m{\log5/\log3,log(5)/log(3)}, representing 5 recursive multiplies of 1/3 the |
-original size each. This is an improvement over Karatsuba at |
-@math{O(N^@W{1.585})}, though Toom does more work in the evaluation and |
-interpolation and so it only realizes its advantage above a certain size. |
- |
-Near the crossover between Toom-3 and Karatsuba there's generally a range of |
-sizes where the difference between the two is small. |
-@code{MUL_TOOM33_THRESHOLD} is a somewhat arbitrary point in that range and |
-successive runs of the tune program can give different values due to small |
-variations in measuring. A graph of time versus size for the two shows the |
-effect, see @file{tune/README}. |
- |
-At the fairly small sizes where the Toom-3 thresholds occur it's worth |
-remembering that the asymptotic behaviour for Karatsuba and Toom-3 can't be |
-expected to make accurate predictions, due of course to the big influence of |
-all sorts of overheads, and the fact that only a few recursions of each are |
-being performed. Even at large sizes there's a good chance machine dependent |
-effects like cache architecture will mean actual performance deviates from |
-what might be predicted. |
- |
-The formula given for the Karatsuba algorithm (@pxref{Karatsuba |
-Multiplication}) has an equivalent for Toom-3 involving only five multiplies, |
-but this would be complicated and unenlightening. |
- |
-An alternate view of Toom-3 can be found in Zuras (@pxref{References}), using |
-a vector to represent the @math{x} and @math{y} splits and a matrix |
-multiplication for the evaluation and interpolation stages. The matrix |
-inverses are not meant to be actually used, and they have elements with values |
-much greater than in fact arise in the interpolation steps. The diagram shown |
-for the 3-way is attractive, but again doesn't have to be implemented that way |
-and for example with a bit of rearrangement just one division by 6 can be |
-done. |
- |
- |
-@node Toom 4-Way Multiplication, FFT Multiplication, Toom 3-Way Multiplication, Multiplication Algorithms |
-@subsection Toom 4-Way Multiplication |
-@cindex Toom multiplication |
- |
-Karatsuba and Toom-3 split the operands into 2 and 3 coefficients, |
-respectively. Toom-4 analogously splits the operands into 4 coefficients. |
-Using the notation from the section on Toom-3 multiplication, we form two |
-polynomials: |
- |
-@display |
-@group |
-@m{X(t) = x_3t^3 + x_2t^2 + x_1t + x_0, |
- X(t) = x3*t^3 + x2*t^2 + x1*t + x0} |
-@m{Y(t) = y_3t^3 + y_2t^2 + y_1t + y_0, |
- Y(t) = y3*t^3 + y2*t^2 + y1*t + y0} |
-@end group |
-@end display |
- |
-@math{X(t)} and @math{Y(t)} are evaluated and multiplied at 7 points, giving |
-values of @math{W(t)} at those points. In GMP the following points are used, |
- |
-@quotation |
-@multitable {@m{t=-1/2,t=inf}M} {MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM} |
-@item Point @tab Value |
-@item @math{t=0} @tab @m{x_0y_0,x0 * y0}, which gives @ms{w,0} immediately |
-@item @math{t=1/2} @tab @m{(x_3+2x_2+4x_1+8x_0)(y_3+2y_2+4y_1+8y_0),(x3+2*x2+4*x1+8*x0) * (y3+2*y2+4*y1+8*y0)} |
-@item @math{t=-1/2} @tab @m{(-x_3+2x_2-4x_1+8x_0)(-y_3+2y_2-4y_1+8y_0),(-x3+2*x2-4*x1+8*x0) * (-y3+2*y2-4*y1+8*y0)} |
-@item @math{t=1} @tab @m{(x_3+x_2+x_1+x_0)(y_3+y_2+y_1+y_0),(x3+x2+x1+x0) * (y3+y2+y1+y0)} |
-@item @math{t=-1} @tab @m{(-x_3+x_2-x_1+x_0)(-y_3+y_2-y_1+y_0),(-x3+x2-x1+x0) * (-y3+y2-y1+y0)} |
-@item @math{t=2} @tab @m{(8x_3+4x_2+2x_1+x_0)(8y_3+4y_2+2y_1+y_0),(8*x3+4*x2+2*x1+x0) * (8*y3+4*y2+2*y1+y0)} |
-@item @m{t=\infty,t=inf} @tab @m{x_3y_3,x3 * y3}, which gives @ms{w,6} immediately |
-@end multitable |
-@end quotation |
- |
-The number of additions and subtractions for Toom-4 is much larger than for |
-Toom-3.  But several subexpressions occur multiple times; for example, |
-@m{x_2+x_0,x2+x0} occurs for both @math{t=1} and @math{t=-1}. |
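- |
-A small C sketch of this sharing follows, under the assumption that the |
-pieces fit comfortably in signed 64-bit integers.  The function names are |
-hypothetical; the code only shows how the even and odd partial sums serve a |
-pair of points at once, and is not the evaluation code used in GMP. |
- |
```c
#include <stdint.h>

/* Evaluate X(t) = x3*t^3 + x2*t^2 + x1*t + x0 at t = 1 and t = -1,
   sharing the even and odd partial sums.  x[i] holds x_i. */
void toom4_eval_pm1(const int64_t x[4], int64_t *at_plus1, int64_t *at_minus1)
{
    int64_t even = x[2] + x[0];
    int64_t odd  = x[3] + x[1];
    *at_plus1  = even + odd;    /*  x3 + x2 + x1 + x0 */
    *at_minus1 = even - odd;    /* -x3 + x2 - x1 + x0 */
}

/* The same idea for t = 1/2 and t = -1/2, scaled by 8 as in the table. */
void toom4_eval_pmhalf(const int64_t x[4], int64_t *at_plus, int64_t *at_minus)
{
    int64_t even = 8 * x[0] + 2 * x[2];
    int64_t odd  = 4 * x[1] + x[3];
    *at_plus  = even + odd;     /*  x3 + 2*x2 + 4*x1 + 8*x0 */
    *at_minus = even - odd;     /* -x3 + 2*x2 - 4*x1 + 8*x0 */
}
```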
- |
-Toom-4 is asymptotically @math{O(N^@W{1.404})}, the exponent being |
-@m{\log7/\log4,log(7)/log(4)}, representing 7 recursive multiplies of 1/4 the |
-original size each. |
- |
- |
-@node FFT Multiplication, Other Multiplication, Toom 4-Way Multiplication, Multiplication Algorithms |
-@subsection FFT Multiplication |
-@cindex FFT multiplication |
-@cindex Fast Fourier Transform |
- |
-At large to very large sizes a Fermat style FFT multiplication is used, |
-following Sch@"onhage and Strassen (@pxref{References}). Descriptions of FFTs |
-in various forms can be found in many textbooks, for instance Knuth section |
-4.3.3 part C or Lipson chapter IX@. A brief description of the form used in |
-GMP is given here. |
- |
-The multiplication done is @m{xy \bmod 2^N+1, x*y mod 2^N+1}, for a given |
-@math{N}. A full product @m{xy,x*y} is obtained by choosing @m{N \ge |
-\mathop{\rm bits}(x)+\mathop{\rm bits}(y), N>=bits(x)+bits(y)} and padding |
-@math{x} and @math{y} with high zero limbs. The modular product is the native |
-form for the algorithm, so padding to get a full product is unavoidable. |
- |
-The algorithm follows a split, evaluate, pointwise multiply, interpolate and |
-combine similar to that described above for Karatsuba and Toom-3. A @math{k} |
-parameter controls the split, with an FFT-@math{k} splitting into @math{2^k} |
-pieces of @math{M=N/2^k} bits each. @math{N} must be a multiple of |
-@m{2^k\times@code{mp\_bits\_per\_limb}, (2^k)*@nicode{mp_bits_per_limb}} so |
-the split falls on limb boundaries, avoiding bit shifts in the split and |
-combine stages. |
- |
-The evaluations, pointwise multiplications, and interpolation, are all done |
-modulo @m{2^{N'}+1, 2^N'+1} where @math{N'} is @math{2M+k+3} rounded up to a |
-multiple of @math{2^k} and of @code{mp_bits_per_limb}. The results of |
-interpolation will be the following negacyclic convolution of the input |
-pieces, and the choice of @math{N'} ensures these sums aren't truncated. |
-@tex |
-$$ w_n = \sum_{{i+j = b2^k+n}\atop{b=0,1}} (-1)^b x_i y_j $$ |
-@end tex |
-@ifnottex |
- |
-@example |
- --- |
- \ b |
-w[n] = / (-1) * x[i] * y[j] |
- --- |
- i+j==b*2^k+n |
- b=0,1 |
-@end example |
- |
-@end ifnottex |
-The points used for the evaluation are @math{g^i} for @math{i=0} to |
-@math{2^k-1} where @m{g=2^{2N'/2^k}, g=2^(2N'/2^k)}. @math{g} is a |
-@m{2^k,2^k'}th root of unity mod @m{2^{N'}+1,2^N'+1}, which produces necessary |
-cancellations at the interpolation stage, and it's also a power of 2 so the |
-fast Fourier transforms used for the evaluation and interpolation do only |
-shifts, adds and negations. |
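- |
-The role of the negacyclic convolution can be checked on a toy case.  The C |
-sketch below assumes @math{k=2} and 4-bit pieces, so @math{N=16}, and forms |
-the convolution directly with a double loop rather than with any transform. |
-It then combines the sums at 4-bit offsets and reduces modulo @math{2^16+1}. |
-The macro and function names are hypothetical. |
- |
```c
#include <stdint.h>

#define FFT_K   2                   /* split into 2^k = 4 pieces */
#define PIECES  (1 << FFT_K)
#define M_BITS  4                   /* bits per piece, so N = 16 */

/* x*y mod 2^16+1 for x, y < 2^16, via the negacyclic convolution of the
   4-bit pieces.  The convolution is formed directly, not by an FFT. */
uint32_t negacyclic_mulmod(uint16_t x, uint16_t y)
{
    int64_t w[PIECES] = {0};
    for (int i = 0; i < PIECES; i++) {
        for (int j = 0; j < PIECES; j++) {
            int64_t xi = (x >> (M_BITS * i)) & 0xF;
            int64_t yj = (y >> (M_BITS * j)) & 0xF;
            if (i + j < PIECES)
                w[i + j] += xi * yj;            /* b = 0 term          */
            else
                w[i + j - PIECES] -= xi * yj;   /* b = 1 term, sign -1 */
        }
    }

    /* Combine at offsets of M bits, reducing modulo 2^N+1 as we go. */
    const int64_t mod = (1 << (M_BITS * PIECES)) + 1;   /* 65537 */
    int64_t r = 0;
    for (int n = PIECES - 1; n >= 0; n--)
        r = (r * (1 << M_BITS) + w[n]) % mod;
    return (uint32_t)((r + mod) % mod);         /* make non-negative */
}
```
- |
-The wrapped terms pick up a minus sign because @math{2^N} is congruent to |
-@math{-1} modulo @math{2^N+1}. |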
- |
-The pointwise multiplications are done modulo @m{2^{N'}+1, 2^N'+1} and either |
-recurse into a further FFT or use a plain multiplication (Toom-3, Karatsuba or |
-basecase), whichever is optimal at the size @math{N'}. The interpolation is |
-an inverse fast Fourier transform. The resulting set of sums of @m{x_iy_j, |
-x[i]*y[j]} are added at appropriate offsets to give the final result. |
- |
-Squaring is the same, but @math{x} is the only input so it's one transform at |
-the evaluate stage and the pointwise multiplies are squares. The |
-interpolation is the same. |
- |
-For a mod @math{2^N+1} product, an FFT-@math{k} is an @m{O(N^{k/(k-1)}), |
-O(N^(k/(k-1)))} algorithm, the exponent representing @math{2^k} recursed |
-modular multiplies each @m{1/2^{k-1},1/2^(k-1)} the size of the original. |
-Each successive @math{k} is an asymptotic improvement, but overheads mean each |
-is only faster at bigger and bigger sizes. In the code, @code{MUL_FFT_TABLE} |
-and @code{SQR_FFT_TABLE} are the thresholds where each @math{k} is used. Each |
-new @math{k} effectively swaps some multiplying for some shifts, adds and |
-overheads. |
- |
-A mod @math{2^N+1} product can be formed with a normal |
-@math{N@cross{}N@rightarrow{}2N} bit multiply plus a subtraction, so an FFT |
-and Toom-3 etc can be compared directly. A @math{k=4} FFT at |
-@math{O(N^@W{1.333})} can be expected to be the first faster than Toom-3 at |
-@math{O(N^@W{1.465})}. In practice this is what's found, with |
-@code{MUL_FFT_MODF_THRESHOLD} and @code{SQR_FFT_MODF_THRESHOLD} being between |
-300 and 1000 limbs, depending on the CPU@. So far it's been found that only |
-very large FFTs recurse into pointwise multiplies above these sizes. |
- |
-When an FFT is to give a full product, the change of @math{N} to @math{2N} |
-doesn't alter the theoretical complexity for a given @math{k}, but for the |
-purposes of considering where an FFT might be first used it can be assumed |
-that the FFT is recursing into a normal multiply and that on that basis it's |
-doing @math{2^k} recursed multiplies each @m{1/2^{k-2},1/2^(k-2)} the size of |
-the inputs, making it @m{O(N^{k/(k-2)}), O(N^(k/(k-2)))}. This would mean |
-@math{k=7} at @math{O(N^@W{1.4})} would be the first FFT faster than Toom-3. |
-In practice @code{MUL_FFT_THRESHOLD} and @code{SQR_FFT_THRESHOLD} have been |
-found to be in the @math{k=8} range, somewhere between 3000 and 10000 limbs. |
- |
-The way @math{N} is split into @math{2^k} pieces and then @math{2M+k+3} is |
-rounded up to a multiple of @math{2^k} and @code{mp_bits_per_limb} means that |
-when @math{2^k@ge{}@nicode{mp\_bits\_per\_limb}} the effective @math{N} is a |
-multiple of @m{2^{2k-1},2^(2k-1)} bits. The @math{+k+3} means some values of |
-@math{N} just under such a multiple will be rounded to the next. The |
-complexity calculations above assume that a favourable size is used, meaning |
-one which isn't padded through rounding, and it's also assumed that the extra |
-@math{+k+3} bits are negligible at typical FFT sizes. |
- |
-The practical effect of the @m{2^{2k-1},2^(2k-1)} constraint is to introduce a |
-step-effect into measured speeds. For example @math{k=8} will round @math{N} |
-up to a multiple of 32768 bits, so with a 32-bit limb and @math{N} twice the |
-operand size there'll be groups of 512 limb sizes for which @code{mpn_mul_n} |
-runs at the same speed. Or for @math{k=9} groups of 2048 limbs, @math{k=10} |
-groups of 8192 limbs, etc. In |
-practice it's been found each @math{k} is used at quite small multiples of its |
-size constraint and so the step effect is quite noticeable in a time versus |
-size graph. |
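- |
-The group sizes quoted above can be reproduced with a few lines of C.  The |
-sketch assumes a full product, so that @math{N} is twice the operand size in |
-bits; that assumption is what turns 32768 bits into 512 limbs of operand. |
-The function name is hypothetical. |
- |
```c
/* Operand-size step, in limbs, implied by N being a multiple of 2^(2k-1)
   bits.  Assumes a full product, so N is twice the operand size in bits. */
unsigned long fft_step_limbs(int k, int bits_per_limb)
{
    unsigned long n_multiple_bits = 1UL << (2 * k - 1);
    return n_multiple_bits / (2UL * (unsigned long)bits_per_limb);
}
```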
- |
-The threshold determinations currently measure at the mid-points of size |
-steps, but this is sub-optimal since at the start of a new step it can happen |
-that it's better to go back to the previous @math{k} for a while. Something |
-more sophisticated for @code{MUL_FFT_TABLE} and @code{SQR_FFT_TABLE} will be |
-needed. |
- |
- |
-@node Other Multiplication, Unbalanced Multiplication, FFT Multiplication, Multiplication Algorithms |
-@subsection Other Multiplication |
-@cindex Toom multiplication |
- |
-The Toom algorithms described above (@pxref{Toom 3-Way Multiplication}, |
-@pxref{Toom 4-Way Multiplication}) generalize to split into an arbitrary |
-number of pieces, as per Knuth section 4.3.3 algorithm C@. This is not |
-currently used. The notes here are merely for interest. |
- |
-In general a split into @math{r+1} pieces is made, and evaluations and |
-pointwise multiplications done at @m{2r+1,2*r+1} points. A 4-way split does 7 |
-pointwise multiplies, 5-way does 9, etc. Asymptotically an @math{(r+1)}-way |
-algorithm is @m{O(N^{\log(2r+1)/\log(r+1)}), O(N^(log(2*r+1)/log(r+1)))}. Only |
-the pointwise multiplications count towards big-@math{O} complexity, but the |
-time spent in the evaluate and interpolate stages grows with @math{r} and has |
-a significant practical impact, with the asymptotic advantage of each @math{r} |
-realized only at bigger and bigger sizes. The overheads grow as |
-@m{O(Nr),O(N*r)}, whereas in an @math{r=2^k} FFT they grow only as @m{O(N \log |
-r), O(N*log(r))}. |
- |
-Knuth algorithm C evaluates at points 0,1,2,@dots{},@m{2r,2*r}, but exercise 4 |
-uses @math{-r},@dots{},0,@dots{},@math{r} and the latter saves some small |
-multiplies in the evaluate stage (or rather trades them for additions), and |
-has a further saving of nearly half the interpolate steps. The idea is to |
-separate odd and even final coefficients and then perform algorithm C steps C7 |
-and C8 on them separately. The divisors at step C7 become @math{j^2} and the |
-multipliers at C8 become @m{2tj-j^2,2*t*j-j^2}. |
- |
-Splitting odd and even parts through positive and negative points can be |
-thought of as using @math{-1} as a square root of unity. If a 4th root of |
-unity was available then a further split and speedup would be possible, but no |
-such root exists for plain integers. Going to complex integers with |
-@m{i=\sqrt{-1}, i=sqrt(-1)} doesn't help, essentially because in cartesian |
-form it takes three real multiplies to do a complex multiply. The existence |
-of @m{2^k,2^k'}th roots of unity in a suitable ring or field lets the fast |
-Fourier transform keep splitting and get to @m{O(N \log r), O(N*log(r))}. |
- |
-Floating point FFTs use complex numbers approximating Nth roots of unity. |
-Some processors have special support for such FFTs. But these are not used in |
-GMP since it's very difficult to guarantee an exact result (to some number of |
-bits). An occasional difference of 1 in the last bit might not matter to a |
-typical signal processing algorithm, but is of course of vital importance to |
-GMP. |
- |
- |
-@node Unbalanced Multiplication, , Other Multiplication, Multiplication Algorithms |
-@subsection Unbalanced Multiplication |
-@cindex Unbalanced multiplication |
- |
-Multiplication of operands with different sizes, both below |
-@code{MUL_KARATSUBA_THRESHOLD}, is done with plain schoolbook multiplication |
-(@pxref{Basecase Multiplication}). |
- |
-For really large operands, we invoke FFT directly. |
- |
-For operands between these sizes, we use Toom inspired algorithms suggested by |
-Alberto Zanoni and Marco Bodrato. The idea is to split the operands into |
-polynomials of different degree. GMP currently splits the smaller operand |
-into 2 coefficients, i.e., a polynomial of degree 1, but the larger operand |
-can be split into 2, 3, or 4 coefficients, i.e., a polynomial of degree 1 to |
-3. |
- |
-@c FIXME: This is mighty ugly, but a cleaner @need triggers texinfo bugs that |
-@c screws up layout here and there in the rest of the manual. |
-@c @tex |
-@c \goodbreak |
-@c @end tex |
-@node Division Algorithms, Greatest Common Divisor Algorithms, Multiplication Algorithms, Algorithms |
-@section Division Algorithms |
-@cindex Division algorithms |
- |
-@menu |
-* Single Limb Division:: |
-* Basecase Division:: |
-* Divide and Conquer Division:: |
-* Exact Division:: |
-* Exact Remainder:: |
-* Small Quotient Division:: |
-@end menu |
- |
- |
-@node Single Limb Division, Basecase Division, Division Algorithms, Division Algorithms |
-@subsection Single Limb Division |
- |
-N@cross{}1 division is implemented using repeated 2@cross{}1 divisions from |
-high to low, either with a hardware divide instruction or a multiplication by |
-inverse, whichever is best on a given CPU. |
- |
-The multiply by inverse follows section 8 of ``Division by Invariant Integers |
-using Multiplication'' by Granlund and Montgomery (@pxref{References}) and is |
-implemented as @code{udiv_qrnnd_preinv} in @file{gmp-impl.h}. The idea is to |
-have a fixed-point approximation to @math{1/d} (see @code{invert_limb}) and |
-then multiply by the high limb (plus one bit) of the dividend to get a |
-quotient @math{q}. With @math{d} normalized (high bit set), @math{q} is no |
-more than 1 too small. Subtracting @m{qd,q*d} from the dividend gives a |
-remainder, and reveals whether @math{q} or @math{q-1} is correct. |
- |
-The result is a division done with two multiplications and four or five |
-arithmetic operations. On CPUs with low latency multipliers this can be much |
-faster than a hardware divide, though the cost of calculating the inverse at |
-the start may mean it's only better on inputs bigger than say 4 or 5 limbs. |
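- |
-The general idea of dividing by a precomputed inverse can be sketched in C |
-with a 32-bit limb.  This is a simplified variant and not the |
-@code{udiv_qrnnd_preinv} macro: it estimates the quotient from the high limb |
-alone, which is never too large but may be a few units too small, and then |
-corrects with a short loop.  The names @code{invert_limb32} and |
-@code{div_2by1_preinv} are hypothetical. |
- |
```c
#include <assert.h>
#include <stdint.h>

/* Fixed-point approximation to 1/d for a normalized 32-bit d (high bit
   set): returns v such that 2^32 + v = floor((2^64 - 1) / d). */
uint32_t invert_limb32(uint32_t d)
{
    assert(d & 0x80000000u);
    return (uint32_t)(UINT64_MAX / d - ((uint64_t)1 << 32));
}

/* Divide the two-limb value n1*2^32 + n0 by d, given n1 < d and
   v = invert_limb32(d).  Stores the quotient and remainder. */
void div_2by1_preinv(uint32_t *q, uint32_t *r,
                     uint32_t n1, uint32_t n0, uint32_t d, uint32_t v)
{
    assert(n1 < d);
    uint64_t n = ((uint64_t)n1 << 32) | n0;

    /* floor(n1 * (2^32 + v) / 2^32): an estimate that is never too big. */
    uint64_t qe = n1 + (((uint64_t)n1 * v) >> 32);
    uint64_t rem = n - qe * d;

    /* The estimate ignores n0, so a few corrections may be needed. */
    while (rem >= d) {
        qe++;
        rem -= d;
    }
    *q = (uint32_t)qe;
    *r = (uint32_t)rem;
}
```
- |
-The inverse is computed once per divisor and reused for every limb of the |
-dividend, which is where the saving over a hardware divide would come from. |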
- |
-When a divisor must be normalized, either for the generic C |
-@code{__udiv_qrnnd_c} or the multiply by inverse, the division performed is |
-actually @m{a2^k,a*2^k} by @m{d2^k,d*2^k} where @math{a} is the dividend and |
-@math{k} is the power necessary to have the high bit of @m{d2^k,d*2^k} set. |
-The bit shifts for the dividend are usually accomplished ``on the fly'' |
-meaning by extracting the appropriate bits at each step. Done this way the |
-quotient limbs come out aligned ready to store. When only the remainder is |
-wanted, an alternative is to take the dividend limbs unshifted and calculate |
-@m{r = a \bmod d2^k, r = a mod d*2^k} followed by an extra final step @m{r2^k |
-\bmod d2^k, r*2^k mod d*2^k}. This can help on CPUs with poor bit shifts or |
-few registers. |
- |
-The multiply by inverse can be done two limbs at a time. The calculation is |
-basically the same, but the inverse is two limbs and the divisor treated as if |
-padded with a low zero limb. This means more work, since the inverse will |
-need a 2@cross{}2 multiply, but the four 1@cross{}1s to do that are |
-independent and can therefore be done partly or wholly in parallel. Likewise |
-for a 2@cross{}1 calculating @m{qd,q*d}. The net effect is to process two |
-limbs with roughly the same two multiplies worth of latency that one limb at a |
-time gives. This extends to 3 or 4 limbs at a time, though the extra work to |
-apply the inverse will almost certainly soon reach the limits of multiplier |
-throughput. |
- |
-A similar approach in reverse can be taken to process just half a limb at a |
-time if the divisor is only a half limb. In this case the 1@cross{}1 multiply |
-for the inverse effectively becomes two @m{{1\over2}\times1, (1/2)x1} for each |
-limb, which can be a saving on CPUs with a fast half limb multiply, or in fact |
-if the only multiply is a half limb, and especially if it's not pipelined. |
- |
- |
-@node Basecase Division, Divide and Conquer Division, Single Limb Division, Division Algorithms |
-@subsection Basecase Division |
- |
-Basecase N@cross{}M division is like long division done by hand, but in base |
-@m{2\GMPraise{@code{mp\_bits\_per\_limb}}, 2^mp_bits_per_limb}. See Knuth |
-section 4.3.1 algorithm D, and @file{mpn/generic/sb_divrem_mn.c}. |
- |
-Briefly stated, while the dividend remains larger than the divisor, a high |
-quotient limb is formed and the M@cross{}1 product @m{qd,q*d} subtracted at |
-the top end of the dividend. With a normalized divisor (most significant bit |
-set), each quotient limb can be formed with a 2@cross{}1 division and a |
-1@cross{}1 multiplication plus some subtractions. The 2@cross{}1 division is |
-by the high limb of the divisor and is done either with a hardware divide or a |
-multiply by inverse (the same as in @ref{Single Limb Division}) whichever is |
-faster. Such a quotient is sometimes one too big, requiring an addback of the |
-divisor, but that happens rarely. |
- |
-With Q=N@minus{}M being the number of quotient limbs, this is an |
-@m{O(QM),O(Q*M)} algorithm and will run at a speed similar to a basecase |
-Q@cross{}M multiplication, differing in fact only in the extra multiply and |
-divide for each of the Q quotient limbs. |
- |
- |
-@node Divide and Conquer Division, Exact Division, Basecase Division, Division Algorithms |
-@subsection Divide and Conquer Division |
- |
-For divisors larger than @code{DIV_DC_THRESHOLD}, division is done by dividing. |
-Or to be precise by a recursive divide and conquer algorithm based on work by |
-Moenck and Borodin, Jebelean, and Burnikel and Ziegler (@pxref{References}). |
- |
-The algorithm consists essentially of recognising that a 2N@cross{}N division |
-can be done with the basecase division algorithm (@pxref{Basecase Division}), |
-but using N/2 limbs as a base, not just a single limb. This way the |
-multiplications that arise are (N/2)@cross{}(N/2) and can take advantage of |
-Karatsuba and higher multiplication algorithms (@pxref{Multiplication |
-Algorithms}). The two ``digits'' of the quotient are formed by recursive |
-N@cross{}(N/2) divisions. |
- |
-If the (N/2)@cross{}(N/2) multiplies are done with a basecase multiplication |
-then the work is about the same as a basecase division, but with more function |
-call overheads and with some subtractions separated from the multiplies. |
-These overheads mean that it's only when N/2 is above |
-@code{MUL_KARATSUBA_THRESHOLD} that divide and conquer is of use. |
- |
-@code{DIV_DC_THRESHOLD} is based on the divisor size N, so it will be somewhere |
-above twice @code{MUL_KARATSUBA_THRESHOLD}, but how much above depends on the |
-CPU@. An optimized @code{mpn_mul_basecase} can lower @code{DIV_DC_THRESHOLD} a |
-little by offering a ready-made advantage over repeated @code{mpn_submul_1} |
-calls. |
- |
-Divide and conquer is asymptotically @m{O(M(N)\log N),O(M(N)*log(N))} where |
-@math{M(N)} is the time for an N@cross{}N multiplication done with FFTs. The |
-actual time is a sum over multiplications of the recursed sizes, as can be |
-seen near the end of section 2.2 of Burnikel and Ziegler. For example, within |
-the Toom-3 range, divide and conquer is @m{2.63M(N), 2.63*M(N)}. With higher |
-algorithms the @math{M(N)} term improves and the multiplier tends to @m{\log |
-N, log(N)}. In practice, at moderate to large sizes, a 2N@cross{}N division |
-is about 2 to 4 times slower than an N@cross{}N multiplication. |
- |
-Newton's method used for division is asymptotically @math{O(M(N))} and should |
-therefore be superior to divide and conquer, but it's believed this would only |
-be for large to very large N. |
- |
- |
-@node Exact Division, Exact Remainder, Divide and Conquer Division, Division Algorithms |
-@subsection Exact Division |
- |
-A so-called exact division is when the dividend is known to be an exact |
-multiple of the divisor. Jebelean's exact division algorithm uses this |
-knowledge to make some significant optimizations (@pxref{References}). |
- |
-The idea can be illustrated in decimal for example with 368154 divided by |
-543. Because the low digit of the dividend is 4, the low digit of the |
-quotient must be 8. This is arrived at from @m{4 \mathord{\times} 7 \bmod 10, |
-4*7 mod 10}, using the fact 7 is the modular inverse of 3 (the low digit of |
-the divisor), since @m{3 \mathord{\times} 7 \mathop{\equiv} 1 \bmod 10, 3*7 |
-@equiv{} 1 mod 10}. So @m{8\mathord{\times}543 = 4344,8*543=4344} can be |
-subtracted from the dividend leaving 363810. Notice the low digit has become |
-zero. |
- |
-The procedure is repeated at the second digit, with the next quotient digit 7 |
-(@m{1 \mathord{\times} 7 \bmod 10, 1*7 mod 10}), subtracting |
-@m{7\mathord{\times}543 = 3801,7*543=3801}, leaving 325800. And finally at |
-the third digit with quotient digit 6 (@m{8 \mathord{\times} 7 \bmod 10, 8*7 |
-mod 10}), subtracting @m{6\mathord{\times}543 = 3258,6*543=3258} leaving 0. |
-So the quotient is 678. |
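- |
-The decimal walk-through above can be written as a short C sketch.  It works |
-in base 10 on a 64-bit integer, finds the modular inverse of the low divisor |
-digit by search, and does a full subtraction at every step.  The function |
-name is hypothetical, and the code assumes the dividend really is an exact |
-multiple of a divisor coprime to 10. |
- |
```c
#include <assert.h>
#include <stdint.h>

/* Exact division in base 10, producing quotient digits low to high.
   a must be a multiple of d, and d must be coprime to 10 so that its
   low digit has an inverse modulo 10. */
uint64_t exact_div_base10(uint64_t a, uint64_t d)
{
    /* Modular inverse of the low digit of d, found by search. */
    unsigned inv = 0;
    for (unsigned i = 1; i < 10; i++)
        if ((d % 10) * i % 10 == 1)
            inv = i;
    assert(inv != 0);

    uint64_t q = 0, place = 1;
    while (a != 0) {
        uint64_t digit = (a % 10) * inv % 10;   /* next quotient digit     */
        a -= digit * d;                          /* low digit becomes zero  */
        a /= 10;                                 /* move to the next digit  */
        q += digit * place;
        place *= 10;
    }
    return q;
}
```
- |
-For 368154 and 543 the loop produces the digits 8, 7 and 6 in that order. |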
- |
-Notice however that the multiplies and subtractions don't need to extend past |
-the low three digits of the dividend, since that's enough to determine the |
-three quotient digits. For the last quotient digit no subtraction is needed |
-at all. On a 2N@cross{}N division like this one, only about half the work of |
-a normal basecase division is necessary. |
- |
-For an N@cross{}M exact division producing Q=N@minus{}M quotient limbs, the |
-saving over a normal basecase division is in two parts. Firstly, each of the |
-Q quotient limbs needs only one multiply, not a 2@cross{}1 divide and |
-multiply. Secondly, the crossproducts are reduced when @math{Q>M} to |
-@m{QM-M(M+1)/2,Q*M-M*(M+1)/2}, or when @math{Q@le{}M} to @m{Q(Q-1)/2, |
-Q*(Q-1)/2}. Notice the savings are complementary. If Q is big then many |
-divisions are saved, or if Q is small then the crossproducts reduce to a small |
-number. |
- |
-The modular inverse used is calculated efficiently by @code{modlimb_invert} in |
-@file{gmp-impl.h}. This does four multiplies for a 32-bit limb, or six for a |
-64-bit limb. @file{tune/modlinv.c} has some alternate implementations that |
-might suit processors better at bit twiddling than multiplying. |
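- |
-One common way to obtain such an inverse is Newton's iteration, which doubles |
-the number of correct low bits at a cost of two multiplies per step. The |
-following Python sketch assumes that structure (an 8-bit starting value, then |
-doubling); the name @code{invert_mod_limb} is hypothetical and the starting |
-value is computed with @code{pow} where real code would use a small table. |
- |
```python
def invert_mod_limb(d, bits=64):
    """Inverse of odd d modulo 2**bits by Newton iteration."""
    assert d % 2 == 1, "only odd values are invertible modulo a power of 2"
    mask = (1 << bits) - 1
    inv = pow(d, -1, 256)   # stands in for an 8-bit table lookup
    good_bits = 8
    while good_bits < bits:
        # If d*inv == 1 mod 2^k then d*(2*inv - inv*inv*d) == 1 mod 2^(2k).
        inv = (2 * inv - inv * inv * d) & mask  # two multiplies per step
        good_bits *= 2
    return inv
```
- |
-With these assumptions a 32-bit limb needs two steps (four multiplies) and a |
-64-bit limb three steps (six multiplies), matching the counts given above. |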
- |
-The sub-quadratic exact division described by Jebelean in ``Exact Division |
-with Karatsuba Complexity'' is not currently implemented. It uses a |
-rearrangement similar to the divide and conquer for normal division |
-(@pxref{Divide and Conquer Division}), but operating from low to high. A |
-further possibility not currently implemented is ``Bidirectional Exact Integer |
-Division'' by Krandick and Jebelean which forms quotient limbs from both the |
-high and low ends of the dividend, and can halve once more the number of |
-crossproducts needed in a 2N@cross{}N division. |
- |
-A special case exact division by 3 exists in @code{mpn_divexact_by3}, |
-supporting Toom-3 multiplication and @code{mpq} canonicalizations. It forms |
-quotient digits with a multiply by the modular inverse of 3 (which is |
-@code{0xAA..AAB}) and uses two comparisons to determine a borrow for the next |
-limb. The multiplications don't need to be on the dependent chain, as long as |
-the effect of the borrows is applied, which can help chips with pipelined |
-multipliers. |
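- |
-The multiply-and-compare scheme can be modelled in Python as below. This is |
-a sketch under assumptions rather than the actual @code{mpn_divexact_by3} |
-code: the helper names are hypothetical, limbs are held least significant |
-first in a list, and the limb size is assumed to be an even number of bits. |
-The returned carry @math{c} satisfies @math{3*Q = A + c*b^n} for an |
-@math{n}-limb input @math{A}. |
- |
```python
def divexact_by3(limbs, bits=32):
    """Divide a little-endian limb list by 3; return (quotient limbs, carry)."""
    b = 1 << bits
    inv3 = (2 * b + 1) // 3          # 0xAA..AAB, since 3 * inv3 == 2*b + 1
    c = 0
    quotient = []
    for s in limbs:
        borrow = 1 if s < c else 0   # borrow from subtracting the carry-in
        l = (s - c) % b
        q = (l * inv3) % b           # quotient limb: 3*q == l mod b
        quotient.append(q)
        # 3*q == l + k*b with k in {0, 1, 2}; two comparisons recover k.
        c = borrow + (q > b // 3) + (q > (2 * b) // 3)
    return quotient, c


def int_to_limbs(n, count, bits=32):
    return [(n >> (bits * i)) & ((1 << bits) - 1) for i in range(count)]


def limbs_to_int(limbs, bits=32):
    return sum(x << (bits * i) for i, x in enumerate(limbs))
```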
- |
- |
-@node Exact Remainder, Small Quotient Division, Exact Division, Division Algorithms |
-@subsection Exact Remainder |
-@cindex Exact remainder |
- |
-If the exact division algorithm is done with a full subtraction at each stage |
-and the dividend isn't a multiple of the divisor, then low zero limbs are |
-produced but with a remainder in the high limbs. For dividend @math{a}, |
-divisor @math{d}, quotient @math{q}, and @m{b = 2 |
-\GMPraise{@code{mp\_bits\_per\_limb}}, b = 2^mp_bits_per_limb}, this remainder |
-@math{r} is of the form |
-@tex |
-$$ a = qd + r b^n $$ |
-@end tex |
-@ifnottex |
- |
-@example |
-a = q*d + r*b^n |
-@end example |
- |
-@end ifnottex |
-@math{n} represents the number of zero limbs produced by the subtractions, |
-that being the number of limbs produced for @math{q}. @math{r} will be in the |
-range @math{0@le{}r<d} and can be viewed as a remainder, but one shifted up by |
-a factor of @math{b^n}. |
- |
-Carrying out full subtractions at each stage means the same number of cross |
-products must be done as a normal division, but there's still some single limb |
-divisions saved. When @math{d} is a single limb some simplifications arise, |
-providing good speedups on a number of processors. |
- |
-@code{mpn_bdivmod}, @code{mpn_divexact_by3}, @code{mpn_modexact_1_odd} and the |
-@code{redc} function in @code{mpz_powm} differ subtly in how they return |
-@math{r}, leading to some negations in the above formula, but all are |
-essentially the same. |
- |
-@cindex Divisibility algorithm |
-@cindex Congruence algorithm |
-Clearly @math{r} is zero when @math{a} is a multiple of @math{d}, and this |
-leads to divisibility or congruence tests which are potentially more efficient |
-than a normal division. |
- |
-The factor of @math{b^n} on @math{r} can be ignored in a GCD when @math{d} is |
-odd, hence the use of @code{mpn_bdivmod} in @code{mpn_gcd}, and the use of |
-@code{mpn_modexact_1_odd} by @code{mpn_gcd_1} and @code{mpz_kronecker_ui} etc |
-(@pxref{Greatest Common Divisor Algorithms}). |
- |
-Montgomery's REDC method for modular multiplications uses operands of the |
-form @m{xb^{-n}, x*b^-n} and @m{yb^{-n}, y*b^-n}, and on calculating |
-@m{(xb^{-n}) (yb^{-n}), (x*b^-n)*(y*b^-n)} uses the factor of @math{b^n} in |
-the exact remainder to reach a product in the same form @m{(xy)b^{-n}, |
-(x*y)*b^-n} (@pxref{Modular Powering Algorithm}). |
- |
-Notice that @math{r} generally gives no useful information about the ordinary |
-remainder @math{a @bmod d} since @math{b^n @bmod d} could be anything. If |
-however @math{b^n @equiv{} 1 @bmod d}, then @math{r} is the negative of the |
-ordinary remainder. This occurs whenever @math{d} is a factor of |
-@math{b^n-1}, as for example with 3 in @code{mpn_divexact_by3}. For a 32 or |
-64 bit limb other such factors include 5, 17 and 257, but no particular use |
-has been found for this. |
- |
- |
-@node Small Quotient Division, , Exact Remainder, Division Algorithms |
-@subsection Small Quotient Division |
- |
-An N@cross{}M division where the number of quotient limbs Q=N@minus{}M is |
-small can be optimized somewhat. |
- |
-An ordinary basecase division normalizes the divisor by shifting it to make |
-the high bit set, shifting the dividend accordingly, and shifting the |
-remainder back down at the end of the calculation. This is wasteful if only a |
-few quotient limbs are to be formed. Instead a division of just the top |
-@m{\rm2Q,2*Q} limbs of the dividend by the top Q limbs of the divisor can be |
-used to form a trial quotient. This requires only those limbs normalized, not |
-the whole of the divisor and dividend. |
- |
-A multiply and subtract then applies the trial quotient to the M@minus{}Q |
-unused limbs of the divisor and N@minus{}Q dividend limbs (which includes Q |
-limbs remaining from the trial quotient division). The starting trial |
-quotient can be 1 or 2 too big, but all cases of 2 too big and most cases of 1 |
-too big are detected by first comparing the most significant limbs that will |
-arise from the subtraction. An addback is done if the quotient still turns |
-out to be 1 too big. |
- |
-This whole procedure is essentially the same as one step of the basecase |
-algorithm done in a Q limb base, though with the trial quotient test done only |
-with the high limbs, not an entire Q limb ``digit'' product. The correctness |
-of this weaker test can be established by following the argument of Knuth |
-section 4.3.1 exercise 20 but with the @m{v_2 \GMPhat q > b \GMPhat r |
-+ u_2, v2*q>b*r+u2} condition appropriately relaxed. |
- |
- |
-@need 1000 |
-@node Greatest Common Divisor Algorithms, Powering Algorithms, Division Algorithms, Algorithms |
-@section Greatest Common Divisor |
-@cindex Greatest common divisor algorithms |
-@cindex GCD algorithms |
- |
-@menu |
-* Binary GCD:: |
-* Lehmer's Algorithm:: |
-* Subquadratic GCD:: |
-* Extended GCD:: |
-* Jacobi Symbol:: |
-@end menu |
- |
- |
-@node Binary GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms, Greatest Common Divisor Algorithms |
-@subsection Binary GCD |
- |
-At small sizes GMP uses an @math{O(N^2)} binary style GCD@. This is described |
-in many textbooks, for example Knuth section 4.5.2 algorithm B@. It simply |
-consists of successively reducing odd operands @math{a} and @math{b} using |
- |
-@quotation |
-@math{a,b = @abs{}(a-b),@min{}(a,b)} @* |
-strip factors of 2 from @math{a} |
-@end quotation |
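- |
-A direct Python transcription of this reduction step is given below as a |
-sketch. The name @code{binary_gcd_odd} is hypothetical, and both inputs are |
-assumed to be odd and positive, as in the description above. |
- |
```python
def binary_gcd_odd(a, b):
    """GCD of two odd positive integers by the subtract-and-shift step."""
    assert a > 0 and b > 0 and a % 2 == 1 and b % 2 == 1
    while a != b:
        a, b = abs(a - b), min(a, b)   # a becomes even and non-zero here
        while a % 2 == 0:              # strip factors of 2 from a
            a //= 2
    return a
```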
- |
-The Euclidean GCD algorithm, as per Knuth algorithms E and A, repeatedly |
-computes the quotient @m{q = \lfloor a/b \rfloor, q = floor(a/b)} and replaces |
-@math{a,b} by @math{b, a - q b}. The binary algorithm has so far been found to |
-be faster than the Euclidean algorithm everywhere. One reason the binary |
-method does well is that the implied quotient at each step is usually small, |
-so often only one or two subtractions are needed to get the same effect as a |
-division. Quotients 1, 2 and 3 for example occur 67.7% of the time, see Knuth |
-section 4.5.3 Theorem E. |
- |
-When the implied quotient is large, meaning @math{b} is much smaller than |
-@math{a}, then a division is worthwhile. This is the basis for the initial |
-@math{a @bmod b} reductions in @code{mpn_gcd} and @code{mpn_gcd_1} (the latter |
-for both N@cross{}1 and 1@cross{}1 cases). But after that initial reduction, |
-big quotients occur too rarely to make it worth checking for them. |
- |
-@sp 1 |
-The final @math{1@cross{}1} GCD in @code{mpn_gcd_1} is done in the generic C |
-code as described above. For two N-bit operands, the algorithm takes about |
-0.68 iterations per bit. For optimum performance some attention needs to be |
-paid to the way the factors of 2 are stripped from @math{a}. |
- |
-Firstly it may be noted that in twos complement the number of low zero bits on |
-@math{a-b} is the same as @math{b-a}, so counting or testing can begin on |
-@math{a-b} without waiting for @math{@abs{}(a-b)} to be determined. |
- |
-A loop stripping low zero bits tends not to branch predict well, since the |
-condition is data dependent. But on average there are only a few low zeros, so |
-an option is to strip one or two bits arithmetically then loop for more (as |
-done for AMD K6). Or use a lookup table to get a count for several bits then |
-loop for more (as done for AMD K7). An alternative approach is to keep just |
-one of @math{a} or @math{b} odd and iterate |
- |
-@quotation |
-@math{a,b = @abs{}(a-b), @min{}(a,b)} @* |
-@math{a = a/2} if even @* |
-@math{b = b/2} if even |
-@end quotation |
- |
-This requires about 1.25 iterations per bit, but stripping of a single bit at |
-each step avoids any branching. Repeating the bit strip reduces to about 0.9 |
-iterations per bit, which may be a worthwhile tradeoff. |
- |
-Generally with the above approaches a speed of perhaps 6 cycles per bit can be |
-achieved, which is still not terribly fast with for instance a 64-bit GCD |
-taking nearly 400 cycles. It's this sort of time which means it's not usually |
-advantageous to combine a set of divisibility tests into a GCD. |
- |
-Currently, the binary algorithm is used for GCD only when @math{N < 3}. |
- |
-@node Lehmer's Algorithm, Subquadratic GCD, Binary GCD, Greatest Common Divisor Algorithms |
-@comment node-name, next, previous, up |
-@subsection Lehmer's Algorithm |
- |
-Lehmer's improvement of the Euclidean algorithm is based on the observation |
-that the initial part of the quotient sequence depends only on the most |
-significant parts of the inputs. The variant of Lehmer's algorithm used in GMP |
-splits off the most significant two limbs, as suggested, e.g., in ``A |
-Double-Digit Lehmer-Euclid Algorithm'' by Jebelean (@pxref{References}). The |
-quotients of two double-limb inputs are collected as a 2 by 2 matrix with |
-single-limb elements. This is done by the function @code{mpn_hgcd2}. The |
-resulting matrix is applied to the inputs using @code{mpn_mul_1} and |
-@code{mpn_submul_1}. Each iteration usually reduces the inputs by almost one |
-limb. In the rare case of a large quotient, no progress can be made by |
-examining just the most significant two limbs, and the quotient is computed |
-using plain division. |
- |
-The resulting algorithm is asymptotically @math{O(N^2)}, just as the Euclidean |
-algorithm and the binary algorithm. The quadratic part of the work is |
-the calls to @code{mpn_mul_1} and @code{mpn_submul_1}. For small sizes, the |
-linear work is also significant. There are roughly @math{N} calls to the |
-@code{mpn_hgcd2} function. This function uses a couple of important |
-optimizations: |
- |
-@itemize |
-@item |
-It uses the same relaxed notion of correctness as @code{mpn_hgcd} (see next |
-section). This means that when called with the most significant two limbs of |
-two large numbers, the returned matrix does not always correspond exactly to |
-the initial quotient sequence for the two large numbers; the final quotient |
-may sometimes be one off. |
- |
-@item |
-It takes advantage of the fact the quotients are usually small. The division |
-operator is not used, since the corresponding assembler instruction is very |
-slow on most architectures. (This code could probably be improved further; it |
-uses many branches that are unfriendly to prediction.) |
- |
-@item |
-It switches from double-limb calculations to single-limb calculations half-way |
-through, when the input numbers have been reduced in size from two limbs to |
-one and a half. |
- |
-@end itemize |
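- |
-For orientation, the classical single-precision form of Lehmer's method |
-(Knuth section 4.5.2 algorithm L) can be sketched in Python as follows. This |
-is not GMP's variant: it uses exact quotient agreement rather than the |
-relaxed correctness of @code{mpn_hgcd2}, it takes the top @code{word_bits} |
-bits rather than two limbs, and the name @code{lehmer_gcd} is hypothetical. |
- |
```python
def lehmer_gcd(a, b, word_bits=64):
    """GCD of non-negative integers using Lehmer's leading-bits trick."""
    if a < b:
        a, b = b, a
    while b >> word_bits:                      # b still needs multi-precision
        shift = a.bit_length() - word_bits
        x, y = a >> shift, b >> shift          # leading parts of a and b
        A, B, C, D = 1, 0, 0, 1                # accumulated 2x2 matrix
        while y + C != 0 and y + D != 0:
            w = (x + A) // (y + C)
            if w != (x + B) // (y + D):        # quotient no longer certain
                break
            A, B, C, D = C, D, A - w * C, B - w * D
            x, y = y, x - w * y
        if B == 0:
            a, b = b, a % b                    # large quotient: plain division
        else:
            a, b = A * a + B * b, C * a + D * b  # apply matrix to full numbers
    while b:                                   # finish with plain Euclid
        a, b = b, a % b
    return a
```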
- |
-@node Subquadratic GCD, Extended GCD, Lehmer's Algorithm, Greatest Common Divisor Algorithms |
-@subsection Subquadratic GCD |
- |
-For inputs larger than @code{GCD_DC_THRESHOLD}, GCD is computed via the HGCD |
-(Half GCD) function, as a generalization of Lehmer's algorithm. |
- |
-Let the inputs @math{a,b} be of size @math{N} limbs each. Put @m{S=\lfloor N/2 |
-\rfloor + 1, S = floor(N/2) + 1}. Then HGCD(a,b) returns a transformation |
-matrix @math{T} with non-negative elements, and reduced numbers @math{(c;d) = |
-T^{-1} (a;b)}. The reduced numbers @math{c,d} must be larger than @math{S} |
-limbs, while their difference @math{abs(c-d)} must fit in @math{S} limbs. The |
-matrix elements will also be of size roughly @math{N/2}. |
- |
-The HGCD base case uses Lehmer's algorithm, but with the above stop condition |
-that returns reduced numbers and the corresponding transformation matrix |
-half-way through. For inputs larger than @code{HGCD_THRESHOLD}, HGCD is |
-computed recursively, using the divide and conquer algorithm in ``On |
-Sch@"onhage's algorithm and subquadratic integer GCD computation'' by M@"oller |
-(@pxref{References}). The recursive algorithm consists of these main |
-steps. |
- |
-@itemize |
- |
-@item |
-Call HGCD recursively, on the most significant @math{N/2} limbs. Apply the |
-resulting matrix @math{T_1} to the full numbers, reducing them to a size just |
-above @math{3N/4}. |
- |
-@item |
-Perform a small number of division or subtraction steps to reduce the numbers |
-to size below @math{3N/4}. This is essential mainly for the unlikely case of |
-large quotients. |
- |
-@item |
-Call HGCD recursively, on the most significant @math{N/2} limbs of the reduced |
-numbers. Apply the resulting matrix @math{T_2} to the full numbers, reducing |
-them to a size just above @math{N/2}. |
- |
-@item |
-Compute @math{T = T_1 T_2}. |
- |
-@item |
-Perform a small number of division and subtraction steps to satisfy the |
-requirements, and return. |
-@end itemize |
- |
-GCD is then implemented as a loop around HGCD, similarly to Lehmer's |
-algorithm. Where Lehmer repeatedly chops off the top two limbs, calls |
-@code{mpn_hgcd2}, and applies the resulting matrix to the full numbers, the |
-subquadratic GCD chops off the most significant third of the limbs (the |
-proportion is a tuning parameter, and @math{1/3} seems to be more efficient |
-than, e.g., @math{1/2}), calls @code{mpn_hgcd}, and applies the resulting |
-matrix. Once the input numbers are reduced to size below |
-@code{GCD_DC_THRESHOLD}, Lehmer's algorithm is used for the rest of the work. |
- |
-The asymptotic running time of both HGCD and GCD is @m{O(M(N)\log N),O(M(N)*log(N))}, |
-where @math{M(N)} is the time for multiplying two @math{N}-limb numbers. |
- |
-@comment node-name, next, previous, up |
- |
-@node Extended GCD, Jacobi Symbol, Subquadratic GCD, Greatest Common Divisor Algorithms |
-@subsection Extended GCD |
- |
-The extended GCD function, or GCDEXT, calculates @math{@gcd{}(a,b)} and also |
-cofactors @math{x} and @math{y} satisfying @m{ax+by=\gcd(a@C{}b), |
-a*x+b*y=gcd(a@C{}b)}. All the algorithms used for plain GCD are extended to |
-handle this case. The binary algorithm is used only for single-limb GCDEXT. |
-Lehmer's algorithm is used for sizes up to @code{GCDEXT_DC_THRESHOLD}. Above |
-this threshold, GCDEXT is implemented as a loop around HGCD, but with more |
-book-keeping to keep track of the cofactors. This gives the same asymptotic |
-running time as for GCD and HGCD, @m{O(M(N)\log N),O(M(N)*log(N))}. |
- |
-One difference from plain GCD is that while the inputs @math{a} and @math{b} are |
-reduced as the algorithm proceeds, the cofactors @math{x} and @math{y} grow in |
-size. This makes the tuning of the chopping-point more difficult. The current |
-code chops off the most significant half of the inputs for the call to HGCD in |
-the first iteration, and the most significant two thirds for the remaining |
-calls. This strategy could surely be improved. Also the stop condition for the |
-loop, where Lehmer's algorithm is invoked once the inputs are reduced below |
-@code{GCDEXT_DC_THRESHOLD}, could maybe be improved by taking into account the |
-current size of the cofactors. |
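- |
-The way the cofactors are carried along, and grow while the inputs shrink, |
-can be seen in the plain Euclidean form of the extended algorithm. The Python |
-sketch below is for illustration only; the name @code{gcdext} is hypothetical |
-and no Lehmer or HGCD step is used. |
- |
```python
def gcdext(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b), for a, b >= 0."""
    x0, y0, x1, y1 = 1, 0, 0, 1          # cofactors for the current a and b
    while b:
        q, r = divmod(a, b)
        a, b = b, r                      # inputs shrink
        x0, x1 = x1, x0 - q * x1         # cofactors grow
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0
```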
- |
-@node Jacobi Symbol, , Extended GCD, Greatest Common Divisor Algorithms |
-@subsection Jacobi Symbol |
-@cindex Jacobi symbol algorithm |
- |
-@code{mpz_jacobi} and @code{mpz_kronecker} are currently implemented with a |
-simple binary algorithm similar to that described for the GCDs (@pxref{Binary |
-GCD}). They're not very fast when both inputs are large. Lehmer's multi-step |
-improvement or a binary based multi-step algorithm is likely to be better. |
- |
-When one operand fits a single limb, and that includes @code{mpz_kronecker_ui} |
-and friends, an initial reduction is done with either @code{mpn_mod_1} or |
-@code{mpn_modexact_1_odd}, followed by the binary algorithm on a single limb. |
-The binary algorithm is well suited to a single limb, and the whole |
-calculation in this case is quite efficient. |
- |
-In all the routines sign changes for the result are accumulated using some bit |
-twiddling, avoiding table lookups or conditional jumps. |
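- |
-A textbook Jacobi symbol routine is sketched below in Python to show where |
-the sign changes come from. It is an assumption-laden illustration: the name |
-@code{jacobi} is hypothetical, a remainder is used at each step instead of |
-the subtraction of the binary algorithm, and signs are tracked with ordinary |
-conditionals rather than the bit twiddling mentioned above. |
- |
```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n."""
    assert n > 0 and n % 2 == 1
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):          # (2/n) = -1 when n = 3 or 5 mod 8
                result = -result
        a, n = n, a                      # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0
```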
- |
- |
-@need 1000 |
-@node Powering Algorithms, Root Extraction Algorithms, Greatest Common Divisor Algorithms, Algorithms |
-@section Powering Algorithms |
-@cindex Powering algorithms |
- |
-@menu |
-* Normal Powering Algorithm:: |
-* Modular Powering Algorithm:: |
-@end menu |
- |
- |
-@node Normal Powering Algorithm, Modular Powering Algorithm, Powering Algorithms, Powering Algorithms |
-@subsection Normal Powering |
- |
-Normal @code{mpz} or @code{mpf} powering uses a simple binary algorithm, |
-successively squaring and then multiplying by the base when a 1 bit is seen in |
-the exponent, as per Knuth section 4.6.3. The ``left to right'' |
-variant described there is used rather than algorithm A, since it's just as |
-easy and can be done with somewhat less temporary memory. |
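- |
-The left to right scheme is short enough to show in full. The following |
-Python sketch uses the hypothetical name @code{pow_left_to_right} and plain |
-Python integers in place of @code{mpz} or @code{mpf} values. |
- |
```python
def pow_left_to_right(base, exp):
    """Compute base**exp for exp >= 0, scanning exponent bits high to low."""
    result = 1
    for bit in bin(exp)[2:]:
        result *= result          # square for every exponent bit
        if bit == "1":
            result *= base        # multiply by the base when a 1 bit is seen
    return result
```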
- |
- |
-@node Modular Powering Algorithm, , Normal Powering Algorithm, Powering Algorithms |
-@subsection Modular Powering |
- |
-Modular powering is implemented using a @math{2^k}-ary sliding window |
-algorithm, as per ``Handbook of Applied Cryptography'' algorithm 14.85 |
-(@pxref{References}). @math{k} is chosen according to the size of the |
-exponent. Larger exponents use larger values of @math{k}, the choice being |
-made to minimize the average number of multiplications that must supplement |
-the squaring. |
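- |
-A sliding window exponentiation along these lines can be sketched in Python |
-as below. The name @code{powm_sliding} and the fixed default @math{k=4} are |
-assumptions made for the example; the reductions use the @code{%} operator |
-rather than REDC. |
- |
```python
def powm_sliding(base, exp, mod, k=4):
    """Compute base**exp % mod (exp >= 0) with a 2^k-ary sliding window."""
    if mod == 1:
        return 0
    base %= mod
    square = base * base % mod
    odd = {1: base}                          # odd powers base^1, base^3, ...
    for i in range(3, 1 << k, 2):
        odd[i] = odd[i - 2] * square % mod
    result = 1
    bits = bin(exp)[2:] if exp else ""
    i, n = 0, len(bits)
    while i < n:
        if bits[i] == "0":
            result = result * result % mod
            i += 1
        else:
            j = min(i + k, n)                # window of at most k bits ...
            while bits[j - 1] == "0":        # ... ending in a 1 bit
                j -= 1
            for _ in range(j - i):
                result = result * result % mod
            result = result * odd[int(bits[i:j], 2)] % mod
            i = j
    return result
```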
- |
-The modular multiplies and squares use either a simple division or the REDC |
-method by Montgomery (@pxref{References}). REDC is a little faster, |
-essentially saving N single limb divisions in a fashion similar to an exact |
-remainder (@pxref{Exact Remainder}). The current REDC has some limitations. |
-It's only @math{O(N^2)} so above @code{POWM_THRESHOLD} division becomes faster |
-and is used. It doesn't attempt to detect small bases, but rather always uses |
-a REDC form, which is usually a full size operand. And lastly it's only |
-applied to odd moduli. |
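- |
-The REDC step itself can be illustrated with Python integers. This sketch |
-uses the common textbook convention of operands held as @math{x*R} mod |
-@math{n} with @math{R} a power of 2 larger than @math{n}; the function names |
-are hypothetical, and the whole reduction is done in one step rather than |
-limb by limb. |
- |
```python
def redc(t, n, n_neg_inv, r_bits):
    """Return t * R^-1 mod n for 0 <= t < n*R, where R = 2**r_bits, n odd."""
    mask = (1 << r_bits) - 1
    m = ((t & mask) * n_neg_inv) & mask   # makes t + m*n divisible by R
    u = (t + m * n) >> r_bits             # exact division, no remainder lost
    return u - n if u >= n else u


def mont_mul_mod(x, y, n, r_bits):
    """Compute x*y mod n via Montgomery form; requires odd n < 2**r_bits."""
    r = 1 << r_bits
    n_neg_inv = (-pow(n, -1, r)) % r      # -1/n mod R
    xm = (x << r_bits) % n                # convert to Montgomery form
    ym = (y << r_bits) % n
    zm = redc(xm * ym, n, n_neg_inv, r_bits)   # x*y*R mod n
    return redc(zm, n, n_neg_inv, r_bits)      # convert back
```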
- |
- |
-@node Root Extraction Algorithms, Radix Conversion Algorithms, Powering Algorithms, Algorithms |
-@section Root Extraction Algorithms |
-@cindex Root extraction algorithms |
- |
-@menu |
-* Square Root Algorithm:: |
-* Nth Root Algorithm:: |
-* Perfect Square Algorithm:: |
-* Perfect Power Algorithm:: |
-@end menu |
- |
- |
-@node Square Root Algorithm, Nth Root Algorithm, Root Extraction Algorithms, Root Extraction Algorithms |
-@subsection Square Root |
-@cindex Square root algorithm |
-@cindex Karatsuba square root algorithm |
- |
-Square roots are taken using the ``Karatsuba Square Root'' algorithm by Paul |
-Zimmermann (@pxref{References}). |
- |
-An input @math{n} is split into four parts of @math{k} bits each, so with |
-@math{b=2^k} we have @m{n = a_3b^3 + a_2b^2 + a_1b + a_0, n = a3*b^3 + a2*b^2 |
-+ a1*b + a0}. Part @ms{a,3} must be ``normalized'' so that either the high or |
-second highest bit is set. In GMP, @math{k} is kept on a limb boundary and |
-the input is left shifted (by an even number of bits) to normalize. |
- |
-The square root of the high two parts is taken, by recursive application of |
-the algorithm (bottoming out in a one-limb Newton's method), |
-@tex |
-$$ s',r' = \mathop{\rm sqrtrem} \> (a_3b + a_2) $$ |
-@end tex |
-@ifnottex |
- |
-@example |
-s1,r1 = sqrtrem (a3*b + a2) |
-@end example |
- |
-@end ifnottex |
-This is an approximation to the desired root and is extended by a division to |
-give @math{s},@math{r}, |
-@tex |
-$$\eqalign{ |
-q,u &= \mathop{\rm divrem} \> (r'b + a_1, 2s') \cr |
-s &= s'b + q \cr |
-r &= ub + a_0 - q^2 |
-}$$ |
-@end tex |
-@ifnottex |
- |
-@example |
-q,u = divrem (r1*b + a1, 2*s1) |
-s = s1*b + q |
-r = u*b + a0 - q^2 |
-@end example |
- |
-@end ifnottex |
-The normalization requirement on @ms{a,3} means at this point @math{s} is |
-either correct or 1 too big. @math{r} is negative in the latter case, so |
-@tex |
-$$\eqalign{ |
-\mathop{\rm if} \; r &< 0 \; \mathop{\rm then} \cr |
-r &\leftarrow r + 2s - 1 \cr |
-s &\leftarrow s - 1 |
-}$$ |
-@end tex |
-@ifnottex |
- |
-@example |
-if r < 0 then |
- r = r + 2*s - 1 |
- s = s - 1 |
-@end example |
- |
-@end ifnottex |
-The algorithm is expressed in a divide and conquer form, but as noted in the |
-paper it can also be viewed as a discrete variant of Newton's method, or as a |
-variation on the schoolboy method (no longer taught) for square roots two |
-digits at a time. |
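- |
-The recursion can be followed in the Python sketch below. It rests on |
-several assumptions: the name @code{sqrtrem} mirrors the notation above |
-rather than any GMP entry point, parts are split on bit rather than limb |
-boundaries, @code{math.isqrt} (Python 3.8 or later) stands in for the |
-one-limb base case, and after a normalizing shift the remainder is simply |
-recomputed. |
- |
```python
import math


def sqrtrem(n, base_bits=64):
    """Return (s, r) with s = floor(sqrt(n)) and r = n - s*s, for n >= 0."""
    if n.bit_length() <= base_bits:
        s = math.isqrt(n)                     # stand-in for the base case
        return s, n - s * s
    k = (n.bit_length() + 3) // 4             # size of each part in bits
    shift = (4 * k - n.bit_length()) & ~1     # even left shift (0 or 2)
    m = n << shift                            # now a3 >= b/4 (normalized)
    b = 1 << k
    a0 = m & (b - 1)
    a1 = (m >> k) & (b - 1)
    s1, r1 = sqrtrem(m >> (2 * k), base_bits)  # root of a3*b + a2
    q, u = divmod(r1 * b + a1, 2 * s1)
    s = s1 * b + q
    r = u * b + a0 - q * q
    if r < 0:                                 # s was 1 too big
        r += 2 * s - 1
        s -= 1
    if shift:                                 # undo the normalization
        s >>= shift // 2
        r = n - s * s
    return s, r
```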
- |
-If the remainder @math{r} is not required then usually only a few high limbs |
-of @math{r} and @math{u} need to be calculated to determine whether an |
-adjustment to @math{s} is required. This optimization is not currently |
-implemented. |
- |
-In the Karatsuba multiplication range this algorithm is @m{O({3\over2} |
-M(N/2)),O(1.5*M(N/2))}, where @math{M(n)} is the time to multiply two numbers |
-of @math{n} limbs. In the FFT multiplication range this grows to a bound of |
-@m{O(6 M(N/2)),O(6*M(N/2))}. In practice a factor of about 1.5 to 1.8 is |
-found in the Karatsuba and Toom-3 ranges, growing to 2 or 3 in the FFT range. |
- |
-The algorithm does all its calculations in integers and the resulting |
-@code{mpn_sqrtrem} is used for both @code{mpz_sqrt} and @code{mpf_sqrt}. |
-The extended precision given by @code{mpf_sqrt_ui} is obtained by |
-padding with zero limbs. |
- |
- |
-@node Nth Root Algorithm, Perfect Square Algorithm, Square Root Algorithm, Root Extraction Algorithms |
-@subsection Nth Root |
-@cindex Root extraction algorithm |
-@cindex Nth root algorithm |
- |
-Integer Nth roots are taken using Newton's method with the following |
-iteration, where @math{A} is the input and @math{n} is the root to be taken. |
-@tex |
-$$a_{i+1} = {1\over n} \left({A \over a_i^{n-1}} + (n-1)a_i \right)$$ |
-@end tex |
-@ifnottex |
- |
-@example |
- 1 A |
-a[i+1] = - * ( --------- + (n-1)*a[i] ) |
- n a[i]^(n-1) |
-@end example |
- |
-@end ifnottex |
-The initial approximation @m{a_1,a[1]} is generated bitwise by successively |
-powering a trial root with or without new 1 bits, aiming to be just above the |
-true root. The iteration converges quadratically when started from a good |
-approximation. When @math{n} is large more initial bits are needed to get |
-good convergence. The current implementation is not particularly well |
-optimized. |
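- |
-An integer version of this iteration is sketched below in Python. The name |
-@code{iroot} is hypothetical, and the starting value is taken as a power of 2 |
-known to be above the true root, which is simpler than the bitwise trial |
-powering described above but serves the same purpose. |
- |
```python
def iroot(A, n):
    """Return floor(A ** (1/n)) for integers A >= 0 and n >= 1."""
    if A < 2:
        return A
    a = 1 << -(-A.bit_length() // n)      # 2^ceil(bits/n), above the root
    while True:
        nxt = ((n - 1) * a + A // a ** (n - 1)) // n   # Newton step
        if nxt >= a:                      # no longer decreasing: a is the root
            return a
        a = nxt
```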
- |
- |
-@node Perfect Square Algorithm, Perfect Power Algorithm, Nth Root Algorithm, Root Extraction Algorithms |
-@subsection Perfect Square |
-@cindex Perfect square algorithm |
- |
-A significant fraction of non-squares can be quickly identified by checking |
-whether the input is a quadratic residue modulo small integers. |
- |
-@code{mpz_perfect_square_p} first tests the input mod 256, which means just |
-examining the low byte. Only 44 different values occur for squares mod 256, |
-so 82.8% of inputs can be immediately identified as non-squares. |
- |
-On a 32-bit system similar tests are done mod 9, 5, 7, 13 and 17, for a total |
-99.25% of inputs identified as non-squares. On a 64-bit system 97 is tested |
-too, for a total 99.62%. |
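- |
-These percentages can be checked by counting residues directly. The Python |
-sketch below uses hypothetical helper names and relies on the moduli being |
-pairwise coprime, so that the fractions of inputs passing each test multiply. |
- |
```python
def square_residues(m):
    """Set of values taken by x*x mod m."""
    return {x * x % m for x in range(m)}


def rejected_fraction(moduli):
    """Fraction of inputs shown to be non-squares by pairwise coprime moduli."""
    passing = 1.0
    for m in moduli:
        passing *= len(square_residues(m)) / m
    return 1.0 - passing


if __name__ == "__main__":
    print(len(square_residues(256)))
    print(rejected_fraction([256, 9, 5, 7, 13, 17]))
    print(rejected_fraction([256, 9, 5, 7, 13, 17, 97]))
```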
- |
-These moduli are chosen because they're factors of @math{2^@W{24}-1} (or |
-@math{2^@W{48}-1} for 64-bits), and such a remainder can be quickly taken just |
-using additions (see @code{mpn_mod_34lsub1}). |
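- |
-The reason additions suffice is that @math{2^24} is congruent to 1 modulo |
-@math{2^24-1}, so summing 24-bit pieces preserves the residue. A Python |
-sketch of the idea follows; the name @code{fold_24} is hypothetical and the |
-real routine works on limbs rather than arbitrary 24-bit slices. |
- |
```python
def fold_24(n):
    """Sum the 24-bit pieces of n; the result is congruent to n mod 2^24-1."""
    total = 0
    while n:
        total += n & 0xFFFFFF
        n >>= 24
    return total
```
- |
-Since 9, 5, 7, 13 and 17 all divide @math{2^24-1}, the small sum can be |
-reduced by each of them in place of the full input. |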
- |
-When nails are in use moduli are instead selected by the @file{gen-psqr.c} |
-program and applied with an @code{mpn_mod_1}. The same @math{2^@W{24}-1} or |
-@math{2^@W{48}-1} could be done with nails using some extra bit shifts, but |
-this is not currently implemented. |
- |
-In any case each modulus is applied to the @code{mpn_mod_34lsub1} or |
-@code{mpn_mod_1} remainder and a table lookup identifies non-squares. By |
-using a ``modexact'' style calculation, and suitably permuted tables, just one |
-multiply each is required, see the code for details. Moduli are also combined |
-to save operations, so long as the lookup tables don't become too big. |
-@file{gen-psqr.c} does all the pre-calculations. |
- |
-A square root must still be taken for any value that passes these tests, to |
-verify it's really a square and not one of the small fraction of non-squares |
-that get through (ie.@: a pseudo-square to all the tested bases). |
- |
-Clearly more residue tests could be done; @code{mpz_perfect_square_p} only |
-uses a compact and efficient set. Big inputs would probably benefit from more |
-residue testing, while small inputs might be better off with less. The assumed |
-distribution of squares versus non-squares in the input would affect such |
-considerations. |
- |
- |
-@node Perfect Power Algorithm, , Perfect Square Algorithm, Root Extraction Algorithms |
-@subsection Perfect Power |
-@cindex Perfect power algorithm |
- |
-Detecting perfect powers is required by some factorization algorithms. |
-Currently @code{mpz_perfect_power_p} is implemented using repeated Nth root |
-extractions, though naturally only prime roots need to be considered. |
-(@xref{Nth Root Algorithm}.) |
- |
-If a prime divisor @math{p} with multiplicity @math{e} can be found, then only |
-roots which are divisors of @math{e} need to be considered, much reducing the |
-work necessary. To this end divisibility by a set of small primes is checked. |
- |
- |
-@node Radix Conversion Algorithms, Other Algorithms, Root Extraction Algorithms, Algorithms |
-@section Radix Conversion |
-@cindex Radix conversion algorithms |
- |
-Radix conversions are less important than other algorithms. A program |
-dominated by conversions should probably use a different data representation. |
- |
-@menu |
-* Binary to Radix:: |
-* Radix to Binary:: |
-@end menu |
- |
- |
-@node Binary to Radix, Radix to Binary, Radix Conversion Algorithms, Radix Conversion Algorithms |
-@subsection Binary to Radix |
- |
-Conversions from binary to a power-of-2 radix use a simple and fast |
-@math{O(N)} bit extraction algorithm. |
- |
-Conversions from binary to other radices use one of two algorithms. Sizes |
-below @code{GET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method. |
-Repeated divisions by @math{b^n} are made, where @math{b} is the radix and |
-@math{n} is the biggest power that fits in a limb. But instead of simply |
-using the remainder @math{r} from such divisions, an extra divide step is done |
-to give a fractional limb representing @math{r/b^n}. The digits of @math{r} |
-can then be extracted using multiplications by @math{b} rather than divisions. |
-Special case code is provided for decimal, allowing multiplications by 10 to |
-optimize to shifts and adds. |
- |
-Above @code{GET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used. |
-For an input @math{t}, powers @m{b^{n2^i},b^(n*2^i)} of the radix are |
-calculated, until a power between @math{t} and @m{\sqrt{t},sqrt(t)} is |
-reached. @math{t} is then divided by that largest power, giving a quotient |
-which is the digits above that power, and a remainder which is those below. |
-These two parts are in turn divided by the second highest power, and so on |
-recursively. When a piece has been divided down to less than |
-@code{GET_STR_DC_THRESHOLD} limbs, the basecase algorithm described above is |
-used. |
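- |
-The recursive splitting can be sketched in Python as follows. The name |
-@code{to_radix}, the chunk size of 4 digits and the recursion down to a |
-single chunk are assumptions for the example; GMP's thresholds and limb-based |
-chunk sizes are not modelled. |
- |
```python
def to_radix(t, radix=10, chunk=4):
    """Convert t >= 0 to a digit string by divide and conquer."""
    digits = "0123456789abcdefghijklmnopqrstuvwxyz"
    if t == 0:
        return "0"
    powers = [radix ** chunk]               # powers[i] = radix^(chunk * 2^i)
    while powers[-1] * powers[-1] <= t:     # stop between sqrt(t) and t
        powers.append(powers[-1] * powers[-1])

    def basecase(x):
        out = []
        while x:                            # digits come out low end first
            x, r = divmod(x, radix)
            out.append(digits[r])
        return "".join(reversed(out)).rjust(chunk, "0")

    def convert(x, level):
        # Produces exactly chunk * 2^level digits for x < radix^(chunk * 2^level).
        if level == 0:
            return basecase(x)
        hi, lo = divmod(x, powers[level - 1])
        return convert(hi, level - 1) + convert(lo, level - 1)

    return convert(t, len(powers)).lstrip("0")
```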
- |
-The advantage of this algorithm is that big divisions can make use of the |
-sub-quadratic divide and conquer division (@pxref{Divide and Conquer |
-Division}), and big divisions tend to have lower overheads than lots of |
-separate single limb divisions anyway. But in any case the cost of |
-calculating the powers @m{b^{n2^i},b^(n*2^i)} must first be overcome. |
- |
-@code{GET_STR_PRECOMPUTE_THRESHOLD} and @code{GET_STR_DC_THRESHOLD} represent |
-the same basic thing, the point where it becomes worth doing a big division to |
-cut the input in half. @code{GET_STR_PRECOMPUTE_THRESHOLD} includes the cost |
-of calculating the radix power required, whereas @code{GET_STR_DC_THRESHOLD} |
-assumes that's already available, which is the case when recursing. |
- |
-Since the base case produces digits from least to most significant but they |
-need to be stored from most to least, it's necessary to calculate in advance |
-how many digits there will be, or at least be sure not to underestimate that. |
-For GMP the number of input bits is multiplied by @code{chars_per_bit_exactly} |
-from @code{mp_bases}, rounding up. The result is either correct or one too |
-big. |
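- |
-The estimate amounts to multiplying the bit count by the logarithm of 2 to |
-the given radix. The Python sketch below models this with floating point, |
-which is an assumption of the example (as is the name @code{digit_estimate}); |
-it is not how the @code{mp_bases} table is actually consulted. |
- |
```python
import math


def digit_estimate(n, radix=10):
    """Upper estimate of the number of radix digits in n > 0."""
    chars_per_bit = math.log(2) / math.log(radix)
    return math.ceil(n.bit_length() * chars_per_bit)
```
- |
-For instance 9 has four bits, giving an estimate of 2 digits where 1 is |
-needed, whereas 10 also has four bits and the estimate of 2 is exact. |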
- |
-Examining some of the high bits of the input could increase the chance of |
-getting the exact number of digits, but an exact result every time would not |
-be practical, since in general the difference between numbers 100@dots{} and |
-99@dots{} is only in the last few bits and the work to identify 99@dots{} |
-might well be almost as much as a full conversion. |
- |
-@code{mpf_get_str} doesn't currently use the algorithm described here. It |
-multiplies or divides by a power of @math{b} to move the radix point to |
-just above the highest non-zero digit (or at worst one above that location), |
-then multiplies by @math{b^n} to bring out digits. This is @math{O(N^2)} and |
-is certainly not optimal. |
- |
-The @math{r/b^n} scheme described above for using multiplications to bring out |
-digits might be useful for more than a single limb. Some brief experiments |
-with it on the base case when recursing didn't give a noticeable improvement, |
-but perhaps that was only due to the implementation. Something similar would |
-work for the sub-quadratic divisions too, though there would be the cost of |
-calculating a bigger radix power. |
- |
-Another possible improvement for the sub-quadratic part would be to arrange |
-for radix powers that balanced the sizes of quotient and remainder produced, |
-ie.@: the highest power would be an @m{b^{nk},b^(n*k)} approximately equal to |
-@m{\sqrt{t},sqrt(t)}, not restricted to a @math{2^i} factor. That ought to |
-smooth out a graph of times against sizes, but may or may not be a net |
-speedup. |
- |
- |
-@node Radix to Binary, , Binary to Radix, Radix Conversion Algorithms |
-@subsection Radix to Binary |
- |
-@strong{This section needs to be rewritten, it currently describes the |
-algorithms used before GMP 4.3.} |
- |
-Conversions from a power-of-2 radix into binary use a simple and fast |
-@math{O(N)} bitwise concatenation algorithm. |
- |
-Conversions from other radices use one of two algorithms. Sizes below |
-@code{SET_STR_PRECOMPUTE_THRESHOLD} use a basic @math{O(N^2)} method. Groups |
-of @math{n} digits are converted to limbs, where @math{n} is the biggest |
-power of the base @math{b} which will fit in a limb, then those groups are |
-accumulated into the result by multiplying by @math{b^n} and adding. This |
-saves multi-precision operations, as per Knuth section 4.4 part E |
-(@pxref{References}). Some special case code is provided for decimal, giving |
-the compiler a chance to optimize multiplications by 10. |
- |
-Above @code{SET_STR_PRECOMPUTE_THRESHOLD} a sub-quadratic algorithm is used. |
-First groups of @math{n} digits are converted into limbs. Then adjacent |
-limbs are combined into limb pairs with @m{xb^n+y,x*b^n+y}, where @math{x} |
-and @math{y} are the limbs. Adjacent limb pairs are combined into quads |
-similarly with @m{xb^{2n}+y,x*b^(2n)+y}. This continues until a single block |
-remains, that being the result. |
- |
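The pairwise combination can be sketched in Python as below. This is only an
illustration under simplifying assumptions: the name
@code{set_str_subquadratic} is hypothetical, the limbs are plain Python
integers, and the powers are obtained by repeated squaring rather than
precomputed.

```python
def set_str_subquadratic(limbs, big_base):
    """Combine limbs (most significant first), each holding one group of
    digits in radix big_base = b^n, by repeatedly joining adjacent blocks."""
    blocks = list(limbs)
    power = big_base                  # b^(n*2^i) at level i
    while len(blocks) > 1:
        if len(blocks) % 2:
            blocks.insert(0, 0)       # pad the high end so pairs align low
        blocks = [x * power + y
                  for x, y in zip(blocks[0::2], blocks[1::2])]
        power *= power                # next level needs the squared power
    return blocks[0] if blocks else 0
```

Each level halves the number of blocks while doubling their size, so the
multiplications near the top of the tree are between large operands.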
-The advantage of this method is that the multiplications for each @math{x} are |
-big blocks, allowing Karatsuba and higher algorithms to be used. But the cost |
-of calculating the powers @m{b^{n2^i},b^(n*2^i)} must be overcome. |
-@code{SET_STR_PRECOMPUTE_THRESHOLD} usually ends up quite big, around 5000 digits, and on |
-some processors much bigger still. |
- |
-@code{SET_STR_PRECOMPUTE_THRESHOLD} is based on the input digits (and tuned |
-for decimal), though it might be better based on a limb count, so as to be |
-independent of the base. But that sort of count isn't used by the base case |
-and so would need some sort of initial calculation or estimate. |
- |
-The main reason @code{SET_STR_PRECOMPUTE_THRESHOLD} is so much bigger than the |
-corresponding @code{GET_STR_PRECOMPUTE_THRESHOLD} is that @code{mpn_mul_1} is |
-much faster than @code{mpn_divrem_1} (often by a factor of 5, or more). |
- |
- |
-@need 1000 |
-@node Other Algorithms, Assembly Coding, Radix Conversion Algorithms, Algorithms |
-@section Other Algorithms |
- |
-@menu |
-* Prime Testing Algorithm:: |
-* Factorial Algorithm:: |
-* Binomial Coefficients Algorithm:: |
-* Fibonacci Numbers Algorithm:: |
-* Lucas Numbers Algorithm:: |
-* Random Number Algorithms:: |
-@end menu |
- |
- |
-@node Prime Testing Algorithm, Factorial Algorithm, Other Algorithms, Other Algorithms |
-@subsection Prime Testing |
-@cindex Prime testing algorithms |
- |
-The primality testing in @code{mpz_probab_prime_p} (@pxref{Number Theoretic |
-Functions}) first does some trial division by small factors and then uses the |
-Miller-Rabin probabilistic primality testing algorithm, as described in Knuth |
-section 4.5.4 algorithm P (@pxref{References}). |
- |
-For an odd input @math{n}, and with @math{n = q@GMPmultiply{}2^k+1} where |
-@math{q} is odd, this algorithm selects a random base @math{x} and tests |
-whether @math{x^q @bmod{} n} is 1 or @math{-1}, or an @m{x^{q2^j} \bmod n, |
-x^(q*2^j) mod n} is @math{-1}, for @math{1@le{}j<k}. If so then @math{n} |
-is probably prime; if not then @math{n} is definitely composite. |
- |
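A minimal Python sketch of one round of the standard strong probable-prime
test follows, for a caller-supplied base @math{x}. It accepts when
@math{x^q} is 1 or @math{-1} mod @math{n}, or when one of the later squarings
gives @math{-1}. The function name is an assumption for this example, and the
trial division step is omitted.

```python
def strong_probable_prime(n, x):
    """One Miller-Rabin round for odd n > 2 and base x."""
    q, k = n - 1, 0
    while q % 2 == 0:                 # write n - 1 = q * 2^k with q odd
        q //= 2
        k += 1
    y = pow(x, q, n)
    if y == 1 or y == n - 1:
        return True
    for _ in range(k - 1):            # x^(q*2^j) for j = 1 .. k-1
        y = y * y % n
        if y == n - 1:
            return True
    return False                      # definitely composite
```

For instance 2047 = 23*89 passes to base 2 but fails to base 3, which is why
several bases are normally tried.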
-Any prime @math{n} will pass the test, but some composites do too. Such |
-composites are known as strong pseudoprimes to base @math{x}. No composite |
-@math{n} is a strong pseudoprime to more than @math{1/4} of all bases (see Knuth exercise |
-22), hence with @math{x} chosen at random there's no more than a @math{1/4} |
-chance a ``probable prime'' will in fact be composite. |
- |
-In fact strong pseudoprimes are quite rare, making the test much more |
-powerful than this analysis would suggest, but @math{1/4} is all that's proven |
-for an arbitrary @math{n}. |
- |
- |
-@node Factorial Algorithm, Binomial Coefficients Algorithm, Prime Testing Algorithm, Other Algorithms |
-@subsection Factorial |
-@cindex Factorial algorithm |
- |
-Factorials are calculated by a combination of removal of twos, powering, and |
-binary splitting. The procedure can be best illustrated with an example, |
- |
-@quotation |
-@math{23! = 1.2.3.4.5.6.7.8.9.10.11.12.13.14.15.16.17.18.19.20.21.22.23} |
-@end quotation |
- |
-@noindent |
-has factors of two removed, |
- |
-@quotation |
-@math{23! = 2^{19}.1.1.3.1.5.3.7.1.9.5.11.3.13.7.15.1.17.9.19.5.21.11.23} |
-@end quotation |
- |
-@noindent |
-and the resulting terms collected up according to their multiplicity, |
- |
-@quotation |
-@math{23! = 2^{19}.(3.5)^3.(7.9.11)^2.(13.15.17.19.21.23)} |
-@end quotation |
- |
-Each sequence such as @math{13.15.17.19.21.23} is evaluated by splitting into |
-every second term, as for instance @math{(13.17.21).(15.19.23)}, and the same |
-recursively on each half. This is implemented iteratively using some bit |
-twiddling. |
- |
-Such splitting is more efficient than repeated N@cross{}1 multiplies since it |
-forms big multiplies, allowing Karatsuba and higher algorithms to be used. |
-And even below the Karatsuba threshold a big block of work can be more |
-efficient for the basecase algorithm. |
- |
-Splitting into subsequences of every second term keeps the resulting products |
-more nearly equal in size than would the simpler approach of say taking the |
-first half and second half of the sequence. Nearly equal products are more |
-efficient for the current multiply implementation. |
- |
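The three steps (removing twos, powering by multiplicity, and splitting into
every second term) might be sketched in Python as follows. This is an
illustration only: the helper names are hypothetical, and the splitting is
written recursively here rather than iteratively with bit twiddling.

```python
def product_split(terms):
    """Multiply a list by splitting it into every second term."""
    if len(terms) <= 1:
        return terms[0] if terms else 1
    return product_split(terms[0::2]) * product_split(terms[1::2])


def factorial(n):
    # Number of factors of two in n!: floor(n/2) + floor(n/4) + ...
    twos, m = 0, n
    while m:
        m //= 2
        twos += m
    # Odd numbers in (hi/2, hi] occur mult times once twos are removed.
    result = 1
    hi, mult = n, 1
    while hi >= 3:
        lo = hi // 2
        odds = [t for t in range(lo + 1, hi + 1) if t % 2]
        result *= product_split(odds) ** mult
        hi, mult = lo, mult + 1
    return result << twos
```

For @math{n=23} the loop visits the odd runs 13..23, 7..11 and 3..5 with
multiplicities 1, 2 and 3, matching the example above.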
- |
-@node Binomial Coefficients Algorithm, Fibonacci Numbers Algorithm, Factorial Algorithm, Other Algorithms |
-@subsection Binomial Coefficients |
-@cindex Binomial coefficient algorithm |
- |
-Binomial coefficients @m{\left({n}\atop{k}\right), C(n@C{}k)} are calculated |
-by first arranging @math{k @le{} n/2} using @m{\left({n}\atop{k}\right) = |
-\left({n}\atop{n-k}\right), C(n@C{}k) = C(n@C{}n-k)} if necessary, and then |
-evaluating the following product simply from @math{i=2} to @math{i=k}. |
-@tex |
-$$ \left({n}\atop{k}\right) = (n-k+1) \prod_{i=2}^{k} {{n-k+i} \over i} $$ |
-@end tex |
-@ifnottex |
- |
-@example |
-                     k   (n-k+i) |
-C(n,k) = (n-k+1) * prod  ------- |
-                    i=2     i |
-@end example |
- |
-@end ifnottex |
-It's easy to show that each denominator @math{i} will divide the product so |
-far, so the exact division algorithm is used (@pxref{Exact Division}). |
- |
-The numerators @math{n-k+i} and denominators @math{i} are first accumulated |
-in groups of as many as fit in a limb, to save multi-precision operations, though for |
-@code{mpz_bin_ui} this applies only to the divisors, since @math{n} is an |
-@code{mpz_t} and @math{n-k+i} in general won't fit in a limb at all. |
- |
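The product can be written out directly, as in this Python sketch. The
function name is an assumption, and the grouping of factors into limbs is
left out for clarity. After step @math{i} the running value is
@math{C(n-k+i,i)}, which is why each division is exact.

```python
def binomial(n, k):
    """C(n, k) for integers 0 <= k <= n, using exact divisions."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)                 # arrange k <= n/2
    if k == 0:
        return 1
    result = n - k + 1
    for i in range(2, k + 1):
        result *= n - k + i
        assert result % i == 0        # i divides the product so far
        result //= i
    return result
```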
- |
-@node Fibonacci Numbers Algorithm, Lucas Numbers Algorithm, Binomial Coefficients Algorithm, Other Algorithms |
-@subsection Fibonacci Numbers |
-@cindex Fibonacci number algorithm |
- |
-The Fibonacci functions @code{mpz_fib_ui} and @code{mpz_fib2_ui} are designed |
-for calculating isolated @m{F_n,F[n]} or @m{F_n,F[n]},@m{F_{n-1},F[n-1]} |
-values efficiently. |
- |
-For small @math{n}, a table of single limb values in @code{__gmp_fib_table} is |
-used. On a 32-bit limb this goes up to @m{F_{47},F[47]}, or on a 64-bit limb |
-up to @m{F_{93},F[93]}. For convenience the table starts at @m{F_{-1},F[-1]}. |
- |
-Beyond the table, values are generated with a binary powering algorithm, |
-calculating a pair @m{F_n,F[n]} and @m{F_{n-1},F[n-1]} working from high to |
-low across the bits of @math{n}. The formulas used are |
-@tex |
-$$\eqalign{ |
- F_{2k+1} &= 4F_k^2 - F_{k-1}^2 + 2(-1)^k \cr |
- F_{2k-1} &= F_k^2 + F_{k-1}^2 \cr |
- F_{2k} &= F_{2k+1} - F_{2k-1} |
-}$$ |
-@end tex |
-@ifnottex |
- |
-@example |
-F[2k+1] = 4*F[k]^2 - F[k-1]^2 + 2*(-1)^k |
-F[2k-1] = F[k]^2 + F[k-1]^2 |
- |
-F[2k] = F[2k+1] - F[2k-1] |
-@end example |
- |
-@end ifnottex |
-At each step, @math{k} is the high @math{b} bits of @math{n}. If the next bit |
-of @math{n} is 0 then @m{F_{2k},F[2k]},@m{F_{2k-1},F[2k-1]} is used, or if |
-it's a 1 then @m{F_{2k+1},F[2k+1]},@m{F_{2k},F[2k]} is used, and the process |
-repeated until all bits of @math{n} are incorporated. Notice these formulas |
-require just two squares per bit of @math{n}. |
- |
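The powering loop can be sketched in Python as below. This is an
illustration with a hypothetical function name, starting from
@math{F[1],F[0]} instead of a table lookup. Each pass through the loop
performs exactly two squarings.

```python
def fib2(n):
    """Return (F[n], F[n-1]) for n >= 1, scanning bits of n high to low."""
    fk, fk1, k = 1, 0, 1              # F[1], F[0]; k covers the top bit
    for bit in bin(n)[3:]:            # remaining bits after the leading 1
        sq_k, sq_k1 = fk * fk, fk1 * fk1
        f2k_plus1 = 4 * sq_k - sq_k1 + (2 if k % 2 == 0 else -2)
        f2k_minus1 = sq_k + sq_k1
        f2k = f2k_plus1 - f2k_minus1
        if bit == "1":
            fk, fk1, k = f2k_plus1, f2k, 2 * k + 1
        else:
            fk, fk1, k = f2k, f2k_minus1, 2 * k
    return fk, fk1
```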
-It'd be possible to handle the first few @math{n} above the single limb table |
-with simple additions, using the defining Fibonacci recurrence @m{F_{k+1} = |
-F_k + F_{k-1}, F[k+1]=F[k]+F[k-1]}, but this is not done since it usually |
-turns out to be faster for only about 10 or 20 values of @math{n}, and |
-including a block of code for just those doesn't seem worthwhile. If they |
-really mattered it'd be better to extend the data table. |
- |
-Using a table avoids lots of calculations on small numbers, and makes small |
-@math{n} go fast. A bigger table would make more small @math{n} go fast, it's |
-just a question of balancing size against desired speed. For GMP the code is |
-kept compact, with the emphasis primarily on a good powering algorithm. |
- |
-@code{mpz_fib2_ui} returns both @m{F_n,F[n]} and @m{F_{n-1},F[n-1]}, but |
-@code{mpz_fib_ui} is only interested in @m{F_n,F[n]}. In this case the last |
-step of the algorithm can become one multiply instead of two squares. One of |
-the following two formulas is used, according to whether @math{n} is even or odd. |
-@tex |
-$$\eqalign{ |
- F_{2k} &= F_k (F_k + 2F_{k-1}) \cr |
- F_{2k+1} &= (2F_k + F_{k-1}) (2F_k - F_{k-1}) + 2(-1)^k |
-}$$ |
-@end tex |
-@ifnottex |
- |
-@example |
-F[2k]   = F[k]*(F[k]+2*F[k-1]) |
- |
-F[2k+1] = (2*F[k]+F[k-1])*(2*F[k]-F[k-1]) + 2*(-1)^k |
-@end example |
- |
-@end ifnottex |
-@m{F_{2k+1},F[2k+1]} here is the same as above, just rearranged to be a |
-multiply. For interest, the @m{2(-1)^k, 2*(-1)^k} term both here and above |
-can be applied just to the low limb of the calculation, without a carry or |
-borrow into further limbs, which saves some code size. See comments with |
-@code{mpz_fib_ui} and the internal @code{mpn_fib2_ui} for how this is done. |
- |
- |
-@node Lucas Numbers Algorithm, Random Number Algorithms, Fibonacci Numbers Algorithm, Other Algorithms |
-@subsection Lucas Numbers |
-@cindex Lucas number algorithm |
- |
-@code{mpz_lucnum2_ui} derives a pair of Lucas numbers from a pair of Fibonacci |
-numbers with the following simple formulas. |
-@tex |
-$$\eqalign{ |
- L_k &= F_k + 2F_{k-1} \cr |
- L_{k-1} &= 2F_k - F_{k-1} |
-}$$ |
-@end tex |
-@ifnottex |
- |
-@example |
-L[k] = F[k] + 2*F[k-1] |
-L[k-1] = 2*F[k] - F[k-1] |
-@end example |
- |
-@end ifnottex |
-@code{mpz_lucnum_ui} is only interested in @m{L_n,L[n]}, and some work can be |
-saved. Trailing zero bits on @math{n} can be handled with a single square |
-each. |
-@tex |
-$$ L_{2k} = L_k^2 - 2(-1)^k $$ |
-@end tex |
-@ifnottex |
- |
-@example |
-L[2k] = L[k]^2 - 2*(-1)^k |
-@end example |
- |
-@end ifnottex |
-And the lowest 1 bit can be handled with one multiply of a pair of Fibonacci |
-numbers, similar to what @code{mpz_fib_ui} does. |
-@tex |
-$$ L_{2k+1} = 5F_{k-1} (2F_k + F_{k-1}) - 4(-1)^k $$ |
-@end tex |
-@ifnottex |
- |
-@example |
-L[2k+1] = 5*F[k-1]*(2*F[k]+F[k-1]) - 4*(-1)^k |
-@end example |
- |
-@end ifnottex |
- |
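The single-value case can be sketched in Python as follows: strip trailing
zero bits, get @math{L[2k+1]} from one Fibonacci pair with a single multiply,
then square once per stripped bit. The names are hypothetical, and
@code{fib_pair} is a simple iterative stand-in for the binary powering
described earlier.

```python
def fib_pair(k):
    """Stand-in: (F[k], F[k-1]) for k >= 0 by plain iteration."""
    a, b = 0, 1                       # F[0], F[-1]
    for _ in range(k):
        a, b = a + b, a
    return a, b


def lucnum(n):
    """L[n] for n >= 0."""
    if n == 0:
        return 2
    zeros = 0
    while n % 2 == 0:                 # strip trailing zero bits
        n //= 2
        zeros += 1
    k = n // 2                        # now n = 2k+1
    fk, fk1 = fib_pair(k)
    sign = 1 if k % 2 == 0 else -1
    value = 5 * fk1 * (2 * fk + fk1) - 4 * sign      # L[2k+1]
    m = n
    for _ in range(zeros):            # L[2m] = L[m]^2 - 2*(-1)^m
        value = value * value - (2 if m % 2 == 0 else -2)
        m *= 2
    return value
```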
- |
-@node Random Number Algorithms, , Lucas Numbers Algorithm, Other Algorithms |
-@subsection Random Numbers |
-@cindex Random number algorithms |
- |
-For the @code{urandomb} functions, random numbers are generated simply by |
-concatenating bits produced by the generator. As long as the generator has |
-good randomness properties this will produce well-distributed @math{N} bit |
-numbers. |
- |
-For the @code{urandomm} functions, random numbers in a range @math{0@le{}R<N} |
-are generated by taking values @math{R} of @m{\lceil \log_2 N \rceil, |
-ceil(log2(N))} bits each until one satisfies @math{R<N}. This will normally |
-require only one or two attempts, but the attempts are limited in case the |
-generator is somehow degenerate and produces only 1 bits or similar. |
- |
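A small Python sketch of this rejection loop is given below. The function
name, the @code{getrandbits} parameter and the attempt limit of 80 are
assumptions for the example. The fallback used once the limit is reached
(subtracting @math{N}) is likewise just one plausible choice.

```python
import random


def urandomm(n, getrandbits=random.getrandbits, max_attempts=80):
    """Return R with 0 <= R < n (n >= 1) by rejection sampling."""
    bits = (n - 1).bit_length()       # equals ceil(log2(n)) for n >= 1
    if bits == 0:
        return 0                      # n == 1 leaves only one value
    for _ in range(max_attempts):
        r = getrandbits(bits)
        if r < n:
            return r
    # Degenerate generator: every attempt gave r >= n.  Because
    # r < 2^bits < 2n, the value r - n is in range.  (Assumed fallback.)
    return r - n
```

Since @math{N} is more than half of 2^bits, each attempt succeeds with
probability above 1/2 when the generator is sound.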
-@cindex Mersenne twister algorithm |
-The Mersenne Twister generator is by Matsumoto and Nishimura |
-(@pxref{References}). It has a non-repeating period of @math{2^@W{19937}-1}, |
-which is a Mersenne prime, hence the name of the generator. The state is 624 |
-words of 32 bits each, which is iterated with one XOR and shift for each |
-32-bit word generated, making the algorithm very fast. Randomness properties |
-are also very good and this is the default algorithm used by GMP. |
- |
-@cindex Linear congruential algorithm |
-Linear congruential generators are described in many text books, for instance |
-Knuth volume 2 (@pxref{References}). With a modulus @math{M} and parameters |
-@math{A} and @math{C}, an integer state @math{S} is iterated by the formula |
-@math{S @leftarrow{} A@GMPmultiply{}S+C @bmod{} M}. At each step the new |
-state is a linear function of the previous, mod @math{M}, hence the name of |
-the generator. |
- |
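The iteration is short enough to show directly. The Python generator below is
a sketch with a hypothetical name and toy parameters. It uses a modulus of
the form @math{2^N}, so the reduction is a bit mask.

```python
def lcg(a, c, n_bits, seed):
    """Yield successive states S <- (A*S + C) mod 2^n_bits."""
    mask = (1 << n_bits) - 1
    s = seed & mask
    while True:
        s = (a * s + c) & mask        # linear in the previous state
        yield s
```

For example, with @math{A=5}, @math{C=3}, @math{N=4} and seed 1 the first
states are 8, 11, 10, 5.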
-In GMP only moduli of the form @math{2^N} are supported, and the current |
-implementation is not as well optimized as it could be. Overheads are |
-significant when @math{N} is small, and when @math{N} is large clearly the |
-multiply at each step will become slow. This is not a big concern, since the |
-Mersenne Twister generator is better in every respect and is therefore |
-recommended for all normal applications. |
- |
-For both generators the current state can be deduced by observing enough |
-output and applying some linear algebra (over GF(2) in the case of the |
-Mersenne Twister). This generally means raw output is unsuitable for |
-cryptographic applications without further hashing or the like. |
- |
- |
-@node Assembly Coding, , Other Algorithms, Algorithms |
-@section Assembly Coding |
-@cindex Assembly coding |
- |
-The assembly subroutines in GMP are the most significant source of speed at |
-small to moderate sizes. At larger sizes algorithm selection becomes more |
-important, but of course speedups in low level routines will still speed up |
-everything proportionally. |
- |
-Carry handling and widening multiplies that are important for GMP can't be |
-easily expressed in C@. GCC @code{asm} blocks help a lot and are provided in |
-@file{longlong.h}, but hand coding low level routines invariably offers a |
-speedup over generic C by a factor of anything from 2 to 10. |
- |
-@menu |
-* Assembly Code Organisation:: |
-* Assembly Basics:: |
-* Assembly Carry Propagation:: |
-* Assembly Cache Handling:: |
-* Assembly Functional Units:: |
-* Assembly Floating Point:: |
-* Assembly SIMD Instructions:: |
-* Assembly Software Pipelining:: |
-* Assembly Loop Unrolling:: |
-* Assembly Writing Guide:: |
-@end menu |
- |
- |
-@node Assembly Code Organisation, Assembly Basics, Assembly Coding, Assembly Coding |
-@subsection Code Organisation |
-@cindex Assembly code organisation |
-@cindex Code organisation |
- |
-The various @file{mpn} subdirectories contain machine-dependent code, written |
-in C or assembly. The @file{mpn/generic} subdirectory contains default code, |
-used when there's no machine-specific version of a particular file. |
- |
-Each @file{mpn} subdirectory is for an ISA family. Generally 32-bit and |
-64-bit variants in a family cannot share code and have separate directories. |
-Within a family further subdirectories may exist for CPU variants. |
- |
-In each directory a @file{nails} subdirectory may exist, holding code with |
-nails support for that CPU variant. A @code{NAILS_SUPPORT} directive in each |
-file indicates the nails values the code handles. Nails code only exists |
-where it's faster, or promises to be faster, than plain code. There's no |
-effort put into nails if they're not going to enhance a given CPU. |
- |
- |
-@node Assembly Basics, Assembly Carry Propagation, Assembly Code Organisation, Assembly Coding |
-@subsection Assembly Basics |
- |
-@code{mpn_addmul_1} and @code{mpn_submul_1} are the most important routines |
-for overall GMP performance. All multiplications and divisions come down to |
-repeated calls to these. @code{mpn_add_n}, @code{mpn_sub_n}, |
-@code{mpn_lshift} and @code{mpn_rshift} are next most important. |
- |
-On some CPUs assembly versions of the internal functions |
-@code{mpn_mul_basecase} and @code{mpn_sqr_basecase} give significant speedups, |
-mainly through avoiding function call overheads. They can also potentially |
-make better use of a wide superscalar processor, as can bigger primitives like |
-@code{mpn_addmul_2} or @code{mpn_addmul_4}. |
- |
-The restrictions on overlaps between sources and destinations |
-(@pxref{Low-level Functions}) are designed to facilitate a variety of |
-implementations. For example, knowing @code{mpn_add_n} won't have partly |
-overlapping sources and destination means reading can be done far ahead of |
-writing on superscalar processors, and loops can be vectorized on a vector |
-processor, depending on the carry handling. |
- |
- |
-@node Assembly Carry Propagation, Assembly Cache Handling, Assembly Basics, Assembly Coding |
-@subsection Carry Propagation |
-@cindex Assembly carry propagation |
- |
-The problem that presents most challenges in GMP is propagating carries from |
-one limb to the next. In functions like @code{mpn_addmul_1} and |
-@code{mpn_add_n}, carries are the only dependencies between limb operations. |
- |
-On processors with carry flags, a straightforward CISC style @code{adc} is |
-generally best. AMD K6 @code{mpn_addmul_1} however is an example of an |
-unusual set of circumstances where a branch works out better. |
- |
-On RISC processors generally an add and compare for overflow is used. This |
-sort of thing can be seen in @file{mpn/generic/aors_n.c}. Some carry |
-propagation schemes require 4 instructions, meaning at least 4 cycles per |
-limb, but other schemes may use just 1 or 2. On wide superscalar processors |
-performance may be completely determined by the number of dependent |
-instructions between carry-in and carry-out for each limb. |
- |
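The add and compare idiom can be modelled in Python by masking to the limb
width, as in the sketch below. The 64-bit limb size and the function name are
assumptions, and this is a model of the idea rather than the code in
@file{mpn/generic/aors_n.c}.

```python
MASK = (1 << 64) - 1                  # assumed 64-bit limbs


def add_n(xs, ys):
    """Add equal-length limb lists (least significant first).
    Returns (sum_limbs, carry_out)."""
    out, carry = [], 0
    for x, y in zip(xs, ys):
        s = (x + y) & MASK
        c1 = s < x                    # wrapped, so x + y overflowed
        r = (s + carry) & MASK
        c2 = r < s                    # wrapped when adding the carry-in
        out.append(r)
        carry = int(c1 or c2)         # the two cannot both occur
    return out, carry
```

The chain from carry-in to carry-out here is an add, a compare and an OR,
which is the dependent sequence that bounds the speed per limb.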
-On vector processors good use can be made of the fact that a carry bit only |
-very rarely propagates more than one limb. When adding a single bit to a |
-limb, there's only a carry out if that limb was @code{0xFF@dots{}FF} which on |
-random data will be only 1 in @m{2\GMPraise{@code{mp\_bits\_per\_limb}}, |
-2^mp_bits_per_limb}. @file{mpn/cray/add_n.c} is an example of this, it adds |
-all limbs in parallel, adds one set of carry bits in parallel and then only |
-rarely needs to fall through to a loop propagating further carries. |
- |
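The vector approach can be modelled with whole-list operations standing in
for vector instructions, as in this Python sketch. The name and the 64-bit
limb size are assumptions, and it is not a transcription of
@file{mpn/cray/add_n.c}.

```python
MASK = (1 << 64) - 1                  # assumed 64-bit limbs


def add_n_vector(xs, ys):
    """Add non-empty equal-length limb lists (least significant first)."""
    sums = [(x + y) & MASK for x, y in zip(xs, ys)]     # all limbs at once
    carries = [int(s < x) for s, x in zip(sums, xs)]    # carry out per limb
    carry_out = carries[-1]
    carry_in = [0] + carries[:-1]                       # shift up one limb
    while any(carry_in):              # after the first pass, rarely repeats
        new = [(s + c) & MASK for s, c in zip(sums, carry_in)]
        gen = [int(n < s) for n, s in zip(new, sums)]
        sums = new
        carry_out |= gen[-1]
        carry_in = [0] + gen[:-1]
    return sums, carry_out
```

On random data a second pass is needed only when a limb of all ones receives
a carry, so the loop body usually runs once.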
-On the x86s, GCC (as of version 2.95.2) doesn't generate particularly good code |
-for the RISC style idioms that are necessary to handle carry bits in |
-C@. Often conditional jumps are generated where @code{adc} or @code{sbb} forms |
-would be better. And so unfortunately almost any loop involving carry bits |
-needs to be coded in assembly for best results. |
- |
- |
-@node Assembly Cache Handling, Assembly Functional Units, Assembly Carry Propagation, Assembly Coding |
-@subsection Cache Handling |
-@cindex Assembly cache handling |
- |
-GMP aims to perform well both on operands that fit entirely in L1 cache and |
-those which don't. |
- |
-Basic routines like @code{mpn_add_n} or @code{mpn_lshift} are often used on |
-large operands, so L2 and main memory performance is important for them. |
-@code{mpn_mul_1} and @code{mpn_addmul_1} are mostly used for multiply and |
-square basecases, so L1 performance matters most for them, unless assembly |
-versions of @code{mpn_mul_basecase} and @code{mpn_sqr_basecase} exist, in |
-which case the remaining uses are mostly for larger operands. |
- |
-For L2 or main memory operands, memory access times will almost certainly be |
-more than the calculation time. The aim therefore is to maximize memory |
-throughput, by starting a load of the next cache line while processing the |
-contents of the previous one. Clearly this is only possible if the chip has a |
-lock-up free cache or some sort of prefetch instruction. Most current chips |
-have both these features. |
- |
-Prefetching sources combines well with loop unrolling, since a prefetch can be |
-initiated once per unrolled loop (or more than once if the loop covers more |
-than one cache line). |
- |
-On CPUs without write-allocate caches, prefetching destinations will ensure |
-individual stores don't go further down the cache hierarchy, which would limit |
-bandwidth. Of course for calculations which are slow anyway, like |
-@code{mpn_divrem_1}, write-throughs might be fine. |
- |
-The distance ahead to prefetch will be determined by memory latency versus |
-throughput. The aim of course is to have data arriving continuously, at peak |
-throughput. Some CPUs have limits on the number of fetches or prefetches in |
-progress. |
- |
-If a special prefetch instruction doesn't exist then a plain load can be used, |
-but in that case care must be taken not to attempt to read past the end of an |
-operand, since that might produce a segmentation violation. |
- |
-Some CPUs or systems have hardware that detects sequential memory accesses and |
-initiates suitable cache movements automatically, making life easy. |
- |
- |
-@node Assembly Functional Units, Assembly Floating Point, Assembly Cache Handling, Assembly Coding |
-@subsection Functional Units |
- |
-When choosing an approach for an assembly loop, consideration is given to |
-what operations can execute simultaneously and what throughput can thereby be |
-achieved. In some cases an algorithm can be tweaked to accommodate available |
-resources. |
- |
-Loop control will generally require a counter and pointer updates, costing as |
-much as 5 instructions, plus any delays a branch introduces. CPU addressing |
-modes might reduce pointer updates, perhaps by allowing just one updating |
-pointer and others expressed as offsets from it, or on CISC chips with all |
-addressing done with the loop counter as a scaled index. |
- |
-The final loop control cost can be amortised by processing several limbs in |
-each iteration (@pxref{Assembly Loop Unrolling}). This at least ensures loop |
-control isn't a big fraction of the work done. |
- |
-Memory throughput is always a limit. If perhaps only one load or one store |
-can be done per cycle then 3 cycles/limb will be the top speed for ``binary'' |
-operations like @code{mpn_add_n}, and any code achieving that is optimal. |
- |
-Integer resources can be freed up by having the loop counter in a float |
-register, or by pressing the float units into use for some multiplying, |
-perhaps doing every second limb on the float side (@pxref{Assembly Floating |
-Point}). |
- |
-Float resources can be freed up by doing carry propagation on the integer |
-side, or even by doing integer to float conversions in integers using bit |
-twiddling. |
- |
- |
-@node Assembly Floating Point, Assembly SIMD Instructions, Assembly Functional Units, Assembly Coding |
-@subsection Floating Point |
-@cindex Assembly floating point |
- |
-Floating point arithmetic is used in GMP for multiplications on CPUs with poor |
-integer multipliers. It's mostly useful for @code{mpn_mul_1}, |
-@code{mpn_addmul_1} and @code{mpn_submul_1} on 64-bit machines, and |
-@code{mpn_mul_basecase} on both 32-bit and 64-bit machines. |
- |
-With IEEE 53-bit double precision floats, integer multiplications producing up |
-to 53 bits will give exact results. Breaking a 64@cross{}64 multiplication |
-into eight 16@cross{}@math{32@rightarrow{}48} bit pieces is convenient. With |
-some care though six 21@cross{}@math{32@rightarrow{}53} bit products can be |
-used, if one of the lower two 21-bit pieces also uses the sign bit. |
- |
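Python floats are IEEE doubles, so the exactness of the 16@cross{}32 bit
pieces can be checked directly. The sketch below uses a hypothetical function
name and does the final summation with Python integers rather than limbs. It
builds a 64@cross{}64 product from eight floating point multiplies.

```python
def mul_64x64_via_doubles(u, v):
    """Multiply two 64-bit values using only double-precision products."""
    v_parts = [(v >> s) & 0xFFFF for s in (0, 16, 32, 48)]   # four 16-bit
    u_parts = [(u >> s) & 0xFFFFFFFF for s in (0, 32)]       # two 32-bit
    total = 0
    for i, up in enumerate(u_parts):
        for j, vp in enumerate(v_parts):
            p = float(up) * float(vp)     # below 2^48, hence exact
            total += int(p) << (32 * i + 16 * j)
    return total
```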
-For the @code{mpn_mul_1} family of functions on a 64-bit machine, the |
-invariant single limb is split at the start, into 3 or 4 pieces. Inside the |
-loop, the bignum operand is split into 32-bit pieces. Fast conversion of |
-these unsigned 32-bit pieces to floating point is highly machine-dependent. |
-In some cases, reading the data into the integer unit, zero-extending to |
-64 bits, then transferring to the floating point unit via memory is the |
-only option. |
- |
-Converting partial products back to 64-bit limbs is usually best done as a |
-signed conversion. Since all values are smaller than @m{2^{53},2^53}, signed |
-and unsigned are the same, but most processors lack unsigned conversions. |
- |
-@sp 2 |
- |
-Here is a diagram showing 16@cross{}32 bit products for an @code{mpn_mul_1} or |
-@code{mpn_addmul_1} with a 64-bit limb. The single limb operand V is split |
-into four 16-bit parts. The multi-limb operand U is split in the loop into |
-two 32-bit parts. |
- |
-@tex |
-\global\newdimen\GMPbits \global\GMPbits=0.18em |
-\def\GMPbox#1#2#3{% |
- \hbox{% |
- \hbox to 128\GMPbits{\hfil |
- \vbox{% |
- \hrule |
- \hbox to 48\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}% |
- \hrule}% |
- \hskip #1\GMPbits}% |
- \raise \GMPboxdepth \hbox{\hskip 2em #3}}} |
-% |
-\GMPdisplay{% |
- \vbox{% |
- \hbox{% |
- \hbox to 128\GMPbits {\hfil |
- \vbox{% |
- \hrule |
- \hbox to 64\GMPbits{% |
- \GMPvrule \hfil$v48$\hfil |
- \vrule \hfil$v32$\hfil |
- \vrule \hfil$v16$\hfil |
- \vrule \hfil$v00$\hfil |
- \vrule} |
- \hrule}}% |
- \raise \GMPboxdepth \hbox{\hskip 2em V Operand}} |
- \vskip 0.5ex |
- \hbox{% |
- \hbox to 128\GMPbits {\hfil |
- \raise \GMPboxdepth \hbox{$\times$\hskip 1.5em}% |
- \vbox{% |
- \hrule |
- \hbox to 64\GMPbits {% |
- \GMPvrule \hfil$u32$\hfil |
- \vrule \hfil$u00$\hfil |
- \vrule}% |
- \hrule}}% |
- \raise \GMPboxdepth \hbox{\hskip 2em U Operand (one limb)}}% |
- \vskip 0.5ex |
- \hbox{\vbox to 2ex{\hrule width 128\GMPbits}}% |
- \GMPbox{0}{u00 \times v00}{$p00$\hskip 1.5em 48-bit products}% |
- \vskip 0.5ex |
- \GMPbox{16}{u00 \times v16}{$p16$} |
- \vskip 0.5ex |
- \GMPbox{32}{u00 \times v32}{$p32$} |
- \vskip 0.5ex |
- \GMPbox{48}{u00 \times v48}{$p48$} |
- \vskip 0.5ex |
- \GMPbox{32}{u32 \times v00}{$r32$} |
- \vskip 0.5ex |
- \GMPbox{48}{u32 \times v16}{$r48$} |
- \vskip 0.5ex |
- \GMPbox{64}{u32 \times v32}{$r64$} |
- \vskip 0.5ex |
- \GMPbox{80}{u32 \times v48}{$r80$} |
-}} |
-@end tex |
-@ifnottex |
-@example |
-@group |
-                +---+---+---+---+ |
-                |v48|v32|v16|v00|    V operand |
-                +---+---+---+---+ |
- |
-                +-------+-------+ |
-             x  |  u32  |  u00  |    U operand (one limb) |
-                +-------+-------+ |
- |
---------------------------------- |
- |
-                    +-----------+ |
-                    | u00 x v00 |    p00    48-bit products |
-                    +-----------+ |
-                +-----------+ |
-                | u00 x v16 |        p16 |
-                +-----------+ |
-            +-----------+ |
-            | u00 x v32 |            p32 |
-            +-----------+ |
-        +-----------+ |
-        | u00 x v48 |                p48 |
-        +-----------+ |
-            +-----------+ |
-            | u32 x v00 |            r32 |
-            +-----------+ |
-        +-----------+ |
-        | u32 x v16 |                r48 |
-        +-----------+ |
-    +-----------+ |
-    | u32 x v32 |                    r64 |
-    +-----------+ |
-+-----------+ |
-| u32 x v48 |                        r80 |
-+-----------+ |
-@end group |
-@end example |
-@end ifnottex |
- |
-@math{p32} and @math{r32} can be summed using floating-point addition, and |
-likewise @math{p48} and @math{r48}. @math{p00} and @math{p16} can be summed |
-with @math{r64} and @math{r80} from the previous iteration. |
- |
-For each loop then, four 49-bit quantities are transferred to the integer unit, |
-aligned as follows, |
- |
-@tex |
-% GMPbox here should be 49 bits wide, but use 51 to better show p16+r80' |
-% crossing into the upper 64 bits. |
-\def\GMPbox#1#2#3{% |
- \hbox{% |
- \hbox to 128\GMPbits {% |
- \hfil |
- \vbox{% |
- \hrule |
- \hbox to 51\GMPbits {\GMPvrule \hfil$#2$\hfil \vrule}% |
- \hrule}% |
- \hskip #1\GMPbits}% |
- \raise \GMPboxdepth \hbox{\hskip 1.5em $#3$\hfil}% |
-}} |
-\newbox\b \setbox\b\hbox{64 bits}% |
-\newdimen\bw \bw=\wd\b \advance\bw by 2em |
-\newdimen\x \x=128\GMPbits |
-\advance\x by -2\bw |
-\divide\x by4 |
-\GMPdisplay{% |
- \vbox{% |
- \hbox to 128\GMPbits {% |
- \GMPvrule |
- \raise 0.5ex \vbox{\hrule \hbox to \x {}}% |
- \hfil 64 bits\hfil |
- \raise 0.5ex \vbox{\hrule \hbox to \x {}}% |
- \vrule |
- \raise 0.5ex \vbox{\hrule \hbox to \x {}}% |
- \hfil 64 bits\hfil |
- \raise 0.5ex \vbox{\hrule \hbox to \x {}}% |
- \vrule}% |
- \vskip 0.7ex |
- \GMPbox{0}{p00+r64'}{i00} |
- \vskip 0.5ex |
- \GMPbox{16}{p16+r80'}{i16} |
- \vskip 0.5ex |
- \GMPbox{32}{p32+r32}{i32} |
- \vskip 0.5ex |
- \GMPbox{48}{p48+r48}{i48} |
-}} |
-@end tex |
-@ifnottex |
-@example |
-@group |
-|-----64bits----|-----64bits----| |
-                   +------------+ |
-                   | p00 + r64' |    i00 |
-                   +------------+ |
-               +------------+ |
-               | p16 + r80' |        i16 |
-               +------------+ |
-           +------------+ |
-           | p32 + r32  |            i32 |
-           +------------+ |
-       +------------+ |
-       | p48 + r48  |                i48 |
-       +------------+ |
-@end group |
-@end example |
-@end ifnottex |
- |
-The challenge then is to sum these efficiently and add in a carry limb, |
-generating a low 64-bit result limb and a high 33-bit carry limb (@math{i48} |
-extends 33 bits into the high half). |
- |
- |
-@node Assembly SIMD Instructions, Assembly Software Pipelining, Assembly Floating Point, Assembly Coding |
-@subsection SIMD Instructions |
-@cindex Assembly SIMD |
- |
-The single-instruction multiple-data support in current microprocessors is |
-aimed at signal processing algorithms where each data point can be treated |
-more or less independently. There's generally not much support for |
-propagating the sort of carries that arise in GMP. |
- |
-SIMD multiplications of say four 16@cross{}16 bit multiplies only do as much |
-work as one 32@cross{}32 from GMP's point of view, and need some shifts and |
-adds besides. But of course if say the SIMD form is fully pipelined and uses |
-less instruction decoding then it may still be worthwhile. |
- |
-On the x86 chips, MMX has so far found a use in @code{mpn_rshift} and |
-@code{mpn_lshift}, and is used in a special case for 16-bit multipliers in the |
-P55 @code{mpn_mul_1}. SSE2 is used for Pentium 4 @code{mpn_mul_1}, |
-@code{mpn_addmul_1}, and @code{mpn_submul_1}. |
- |
- |
-@node Assembly Software Pipelining, Assembly Loop Unrolling, Assembly SIMD Instructions, Assembly Coding |
-@subsection Software Pipelining |
-@cindex Assembly software pipelining |
- |
-Software pipelining consists of scheduling instructions around the branch |
-point in a loop. For example a loop might issue a load not for use in the |
-present iteration but the next, thereby allowing extra cycles for the data to |
-arrive from memory. |
- |
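The shape of such a loop, with a prologue, a steady state and an epilogue,
can be shown even in a high-level language. In the Python sketch below the
name and the multiply-accumulate workload are hypothetical. Each iteration
issues the load for the next iteration and works on the value loaded in the
previous one.

```python
def scaled_sum_pipelined(xs, m):
    """Sum of x*m over xs, with loads issued one iteration ahead."""
    if not xs:
        return 0
    total = 0
    loaded = xs[0]                    # prologue: first load before the loop
    for i in range(1, len(xs)):
        nxt = xs[i]                   # load for use in the next iteration
        total += loaded * m           # work on the previously loaded value
        loaded = nxt
    total += loaded * m               # epilogue: drain the last value
    return total
```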
-Naturally this is wanted only when doing things like loads or multiplies that |
-take several cycles to complete, and only where a CPU has multiple functional |
-units so that other work can be done in the meantime. |
- |
-A pipeline with several stages will have a data value in progress at each |
-stage and each loop iteration moves them along one stage. This is like |
-juggling. |
- |
-If the latency of some instruction is greater than the loop time then it will |
-be necessary to unroll, so one register has a result ready to use while |
-another (or multiple others) are still in progress (@pxref{Assembly Loop |
-Unrolling}). |
- |
- |
-@node Assembly Loop Unrolling, Assembly Writing Guide, Assembly Software Pipelining, Assembly Coding |
-@subsection Loop Unrolling |
-@cindex Assembly loop unrolling |
- |
-Loop unrolling consists of replicating code so that several limbs are |
-processed in each loop. At a minimum this reduces loop overheads by a |
-corresponding factor, but it can also allow better register usage, for example |
-alternately using one register combination and then another. Judicious use of |
-@command{m4} macros can help avoid lots of duplication in the source code. |
- |
-Any amount of unrolling can be handled with a loop counter that's decremented |
-by @math{N} each time, stopping when the remaining count is less than the |
-further @math{N} the loop will process. Or by subtracting @math{N} at the |
-start, the termination condition becomes when the counter @math{C} is less |
-than 0 (and the count of remaining limbs is @math{C+N}). |
- |
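The pre-biased counter can be sketched as follows for a 4-way unrolling. The
function name and the summing workload are assumptions. The leftover limbs
are handled by a simple loop at the end.

```python
def sum_limbs_unrolled(limbs):
    """Sum a list four elements per iteration, then finish the excess."""
    N = 4
    total, i = 0, 0
    c = len(limbs) - N                # subtract N once at the start
    while c >= 0:                     # at least N limbs still remain
        total += limbs[i] + limbs[i + 1] + limbs[i + 2] + limbs[i + 3]
        i += N
        c -= N
    for j in range(i, i + c + N):     # c + N limbs (0 to N-1) are left
        total += limbs[j]
    return total
```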
-Alternatively, for a power of 2 unroll, the loop count and remainder can be |
-established with a shift and mask. This is convenient if also making a |
-computed jump into the middle of a large loop. |
- |
-The limbs not a multiple of the unrolling can be handled in various ways, for |
-example |
- |
-@itemize @bullet |
-@item |
-A simple loop at the end (or the start) to process the excess. Care will be |
-wanted that it isn't too much slower than the unrolled part. |
- |
-@item |
-A set of binary tests, for example after an 8-limb unrolling, test for 4 more |
-limbs to process, then a further 2 more or not, and finally 1 more or not. |
-This will probably take more code space than a simple loop. |
- |
-@item |
-A @code{switch} statement, providing separate code for each possible excess, |
-for example an 8-limb unrolling would have separate code for 0 remaining, 1 |
-remaining, etc, up to 7 remaining. This might take a lot of code, but may be |
-the best way to optimize all cases in combination with a deep pipelined loop. |
- |
-@item |
-A computed jump into the middle of the loop, thus making the first iteration |
-handle the excess. This should make times smoothly increase with size, which |
-is attractive, but setups for the jump and adjustments for pointers can be |
-tricky and could become quite difficult in combination with deep pipelining. |
-@end itemize |
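
The counter scheme and the binary-test option above can be sketched in C rather than assembly. This is only an illustration under stated assumptions: @code{limb_t} and @code{toy_com} are hypothetical names, not GMP's, the operation is a plain one's complement of each limb, and the unrolling factor is fixed at @math{N=4}.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit limb type for illustration; GMP itself uses mp_limb_t. */
typedef uint32_t limb_t;

/* Store the one's complement of src[0..n-1] into dst[0..n-1],
   using a 4-way unrolled loop. */
void toy_com(limb_t *dst, const limb_t *src, long n)
{
    assert(n >= 0);

    long c = n - 4;              /* subtract N = 4 at the start */
    while (c >= 0) {             /* stop once the counter C is below 0 */
        dst[0] = ~src[0];
        dst[1] = ~src[1];
        dst[2] = ~src[2];
        dst[3] = ~src[3];
        dst += 4;
        src += 4;
        c -= 4;
    }

    c += 4;                      /* remaining limbs = C + N, in 0..3 */

    /* Binary tests on the excess: 2 more or not, then 1 more or not. */
    if (c & 2) {
        dst[0] = ~src[0];
        dst[1] = ~src[1];
        dst += 2;
        src += 2;
    }
    if (c & 1)
        dst[0] = ~src[0];
}
```

With a size of 7 the unrolled body runs once and both binary tests fire; with a size of 3 the unrolled body is skipped entirely.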
- |
- |
-@node Assembly Writing Guide, , Assembly Loop Unrolling, Assembly Coding |
-@subsection Writing Guide |
-@cindex Assembly writing guide |
- |
-This is a guide to writing software pipelined loops for processing limb |
-vectors in assembly. |
- |
-First determine the algorithm and which instructions are needed. Code it |
-without unrolling or scheduling, to make sure it works. On a 3-operand CPU |
-try to write each new value to a new register; this will greatly simplify later |
-steps. |
- |
-Then note for each instruction the functional unit and/or issue port |
-requirements. If an instruction can use either of two units, like U0 or U1, |
-then make a category ``U0/U1''. Count the total using each unit (or combined |
-unit), and count all instructions. |
- |
-Figure out from those counts the best possible loop time. The goal will be to |
-find a perfect schedule where instruction latencies are completely hidden. |
-The total instruction count might be the limiting factor, or perhaps a |
-particular functional unit. It might be possible to tweak the instructions to |
-help the limiting factor. |
- |
-Suppose the loop time is @math{N}, then make @math{N} issue buckets, with the |
-final loop branch at the end of the last. Now fill the buckets with dummy |
-instructions using the functional units desired. Run this to make sure the |
-intended speed is reached. |
- |
-Now replace the dummy instructions with the real instructions from the slow |
-but correct loop you started with. The first will typically be a load |
-instruction. Then the instruction using that value is placed in a bucket an |
-appropriate distance down. Run the loop again, to check it still runs at |
-target speed. |
- |
-Keep placing instructions, frequently measuring the loop. After a few you |
-will need to wrap around from the last bucket back to the top of the loop. If |
-you used the new-register for new-value strategy above then there will be no |
-register conflicts. If not then take care not to clobber something already in |
-use. Changing registers at this time is very error prone. |
- |
-The loop will overlap two or more of the original loop iterations, and the |
-computation of one vector element result will be started in one iteration of |
-the new loop, and completed one or several iterations later. |
- |
-The final step is to create feed-in and wind-down code for the loop. A good |
-way to do this is to make a copy (or copies) of the loop at the start and |
-delete those instructions which don't have valid antecedents, and at the end |
-replicate and delete those whose results are unwanted (including any further |
-loads). |
- |
-The loop will have a minimum number of limbs loaded and processed, so the |
-feed-in code must test if the request size is smaller and skip either to a |
-suitable part of the wind-down or to special code for small sizes. |
- |
- |
-@node Internals, Contributors, Algorithms, Top |
-@chapter Internals |
-@cindex Internals |
- |
-@strong{This chapter is provided only for informational purposes and the |
-various internals described here may change in future GMP releases. |
-Applications expecting to be compatible with future releases should use only |
-the documented interfaces described in previous chapters.} |
- |
-@menu |
-* Integer Internals:: |
-* Rational Internals:: |
-* Float Internals:: |
-* Raw Output Internals:: |
-* C++ Interface Internals:: |
-@end menu |
- |
-@node Integer Internals, Rational Internals, Internals, Internals |
-@section Integer Internals |
-@cindex Integer internals |
- |
-@code{mpz_t} variables represent integers using sign and magnitude, in space |
-dynamically allocated and reallocated. The fields are as follows. |
- |
-@table @asis |
-@item @code{_mp_size} |
-The number of limbs, or the negative of that when representing a negative |
-integer. Zero is represented by @code{_mp_size} set to zero, in which case |
-the @code{_mp_d} data is unused. |
- |
-@item @code{_mp_d} |
-A pointer to an array of limbs which is the magnitude. These are stored |
-``little endian'' as per the @code{mpn} functions, so @code{_mp_d[0]} is the |
-least significant limb and @code{_mp_d[ABS(_mp_size)-1]} is the most |
-significant. Whenever @code{_mp_size} is non-zero, the most significant limb |
-is non-zero. |
- |
-Currently there's always at least one limb allocated, so for instance |
-@code{mpz_set_ui} never needs to reallocate, and @code{mpz_get_ui} can fetch |
-@code{_mp_d[0]} unconditionally (though its value is then only wanted if |
-@code{_mp_size} is non-zero). |
- |
-@item @code{_mp_alloc} |
-@code{_mp_alloc} is the number of limbs currently allocated at @code{_mp_d}, |
-and naturally @code{_mp_alloc >= ABS(_mp_size)}. When an @code{mpz} routine |
-is about to (or might be about to) increase @code{_mp_size}, it checks |
-@code{_mp_alloc} to see whether there's enough space, and reallocates if not. |
-@code{MPZ_REALLOC} is generally used for this. |
-@end table |
- |
-The various bitwise logical functions like @code{mpz_and} behave as if |
-negative values were twos complement. But sign and magnitude is always used |
-internally, and necessary adjustments are made during the calculations. |
-Sometimes this isn't pretty, but sign and magnitude are best for other |
-routines. |
- |
-Some internal temporary variables are set up with @code{MPZ_TMP_INIT} and these |
-have @code{_mp_d} space obtained from @code{TMP_ALLOC} rather than the memory |
-allocation functions. Care is taken to ensure that these are big enough that |
-no reallocation is necessary (since it would have unpredictable consequences). |
- |
-@code{_mp_size} and @code{_mp_alloc} are @code{int}, although @code{mp_size_t} |
-is usually a @code{long}. This is done to make the fields just 32 bits on |
-some 64-bit systems, thereby saving a few bytes of data space but still |
-providing plenty of range. |
- |
- |
-@node Rational Internals, Float Internals, Integer Internals, Internals |
-@section Rational Internals |
-@cindex Rational internals |
- |
-@code{mpq_t} variables represent rationals using an @code{mpz_t} numerator and |
-denominator (@pxref{Integer Internals}). |
- |
-The canonical form adopted is denominator positive (and non-zero), no common |
-factors between numerator and denominator, and zero uniquely represented as |
-0/1. |
- |
-It's believed that casting out common factors at each stage of a calculation |
-is best in general. A GCD is an @math{O(N^2)} operation so it's better to do |
-a few small ones immediately than to delay and have to do a big one later. |
-Knowing the numerator and denominator have no common factors can be used for |
-example in @code{mpq_mul} to make only two cross GCDs necessary, not four. |
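
The two cross GCDs can be illustrated with a small C sketch on machine integers. This is an assumption-laden toy, not GMP's code: @code{toy_q}, @code{toy_gcd} and @code{toy_q_mul} are hypothetical names, and overflow is ignored. Because each input is already in canonical form, only the numerator of one operand against the denominator of the other needs checking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy rational; assumed canonical: den > 0, gcd(num, den) == 1. */
typedef struct { int64_t num, den; } toy_q;

/* Euclidean GCD on magnitudes; toy_gcd(0, b) is |b|. */
static int64_t toy_gcd(int64_t a, int64_t b)
{
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Multiply two canonical rationals using only the two cross GCDs.
   The result is canonical without any further reduction. */
toy_q toy_q_mul(toy_q a, toy_q b)
{
    assert(a.den > 0 && b.den > 0);

    int64_t g1 = toy_gcd(a.num, b.den);   /* a's numerator vs b's denominator */
    int64_t g2 = toy_gcd(b.num, a.den);   /* b's numerator vs a's denominator */

    toy_q r;
    r.num = (a.num / g1) * (b.num / g2);
    r.den = (a.den / g2) * (b.den / g1);
    return r;
}
```

For example 2/3 times 9/4 gives cross GCDs of 2 and 3, and the product comes out directly as 3/2.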
- |
-This general approach to common factors is badly sub-optimal in the presence |
-of simple factorizations or little prospect for cancellation, but GMP has no |
-way to know when this will occur. As per @ref{Efficiency}, that's left to |
-applications. The @code{mpq_t} framework might still suit, with |
-@code{mpq_numref} and @code{mpq_denref} for direct access to the numerator and |
-denominator, or of course @code{mpz_t} variables can be used directly. |
- |
- |
-@node Float Internals, Raw Output Internals, Rational Internals, Internals |
-@section Float Internals |
-@cindex Float internals |
- |
-Efficient calculation is the primary aim of GMP floats and the use of whole |
-limbs and simple rounding facilitates this. |
- |
-@code{mpf_t} floats have a variable precision mantissa and a single machine |
-word signed exponent. The mantissa is represented using sign and magnitude. |
- |
-@c FIXME: The arrow heads don't join to the lines exactly. |
-@tex |
-\global\newdimen\GMPboxwidth \GMPboxwidth=5em |
-\global\newdimen\GMPboxheight \GMPboxheight=3ex |
-\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}} |
-\GMPdisplay{% |
-\vbox{% |
- \hbox to 5\GMPboxwidth {most significant limb \hfil least significant limb} |
- \vskip 0.7ex |
- \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}} |
- \hbox { |
- \hbox to 3\GMPboxwidth {% |
- \setbox 0 = \hbox{@code{\_mp\_exp}}% |
- \dimen0=3\GMPboxwidth |
- \advance\dimen0 by -\wd0 |
- \divide\dimen0 by 2 |
- \advance\dimen0 by -1em |
- \setbox1 = \hbox{$\rightarrow$}% |
- \dimen1=\dimen0 |
- \advance\dimen1 by -\wd1 |
- \GMPcentreline{\dimen0}% |
- \hfil |
- \box0% |
- \hfil |
- \GMPcentreline{\dimen1{}}% |
- \box1} |
- \hbox to 2\GMPboxwidth {\hfil @code{\_mp\_d}}} |
- \vskip 0.5ex |
- \vbox {% |
- \hrule |
- \hbox{% |
- \vrule height 2ex depth 1ex |
- \hbox to \GMPboxwidth {}% |
- \vrule |
- \hbox to \GMPboxwidth {}% |
- \vrule |
- \hbox to \GMPboxwidth {}% |
- \vrule |
- \hbox to \GMPboxwidth {}% |
- \vrule |
- \hbox to \GMPboxwidth {}% |
- \vrule} |
- \hrule |
- } |
- \hbox {% |
- \hbox to 0.8 pt {} |
- \hbox to 3\GMPboxwidth {% |
- \hfil $\cdot$} \hbox {$\leftarrow$ radix point\hfil}} |
- \hbox to 5\GMPboxwidth{% |
- \setbox 0 = \hbox{@code{\_mp\_size}}% |
- \dimen0 = 5\GMPboxwidth |
- \advance\dimen0 by -\wd0 |
- \divide\dimen0 by 2 |
- \advance\dimen0 by -1em |
- \dimen1 = \dimen0 |
- \setbox1 = \hbox{$\leftarrow$}% |
- \setbox2 = \hbox{$\rightarrow$}% |
- \advance\dimen0 by -\wd1 |
- \advance\dimen1 by -\wd2 |
- \hbox to 0.3 em {}% |
- \box1 |
- \GMPcentreline{\dimen0}% |
- \hfil |
- \box0 |
- \hfil |
- \GMPcentreline{\dimen1}% |
- \box2} |
-}} |
-@end tex |
-@ifnottex |
-@example |
- most least |
-significant significant |
- limb limb |
- |
- _mp_d |
- |---- _mp_exp ---> | |
- _____ _____ _____ _____ _____ |
- |_____|_____|_____|_____|_____| |
- . <------------ radix point |
- |
- <-------- _mp_size ---------> |
-@sp 1 |
-@end example |
-@end ifnottex |
- |
-@noindent |
-The fields are as follows. |
- |
-@table @asis |
-@item @code{_mp_size} |
-The number of limbs currently in use, or the negative of that when |
-representing a negative value. Zero is represented by @code{_mp_size} and |
-@code{_mp_exp} both set to zero, and in that case the @code{_mp_d} data is |
-unused. (In the future @code{_mp_exp} might be undefined when representing |
-zero.) |
- |
-@item @code{_mp_prec} |
-The precision of the mantissa, in limbs. In any calculation the aim is to |
-produce @code{_mp_prec} limbs of result (the most significant being non-zero). |
- |
-@item @code{_mp_d} |
-A pointer to the array of limbs which is the absolute value of the mantissa. |
-These are stored ``little endian'' as per the @code{mpn} functions, so |
-@code{_mp_d[0]} is the least significant limb and |
-@code{_mp_d[ABS(_mp_size)-1]} the most significant. |
- |
-The most significant limb is always non-zero, but there are no other |
-restrictions on its value, in particular the highest 1 bit can be anywhere |
-within the limb. |
- |
-@code{_mp_prec+1} limbs are allocated to @code{_mp_d}, the extra limb being |
-for convenience (see below). There are no reallocations during a calculation, |
-only in a change of precision with @code{mpf_set_prec}. |
- |
-@item @code{_mp_exp} |
-The exponent, in limbs, determining the location of the implied radix point. |
-Zero means the radix point is just above the most significant limb. Positive |
-values mean a radix point offset towards the lower limbs and hence a value |
-@math{@ge{} 1}, as for example in the diagram above. Negative exponents mean |
-a radix point further above the highest limb. |
- |
-Naturally the exponent can be any value, it doesn't have to fall within the |
-limbs as the diagram shows, it can be a long way above or a long way below. |
-Limbs other than those included in the @code{@{_mp_d,_mp_size@}} data |
-are treated as zero. |
-@end table |
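
How the fields combine into a value can be sketched in C as follows. This is a hypothetical illustration assuming 32-bit limbs; @code{toy_f_to_double} is not a GMP function, and the result is only as accurate as a @code{double} allows. Limb @math{i} carries weight @math{B^(exp-n+i)}, where @math{B} is @math{2^32} and @math{n} is the absolute value of the size.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply x by (2^32)^limbs; limbs may be negative. */
static double toy_scale(double x, long limbs)
{
    while (limbs > 0) { x *= 4294967296.0; limbs--; }
    while (limbs < 0) { x /= 4294967296.0; limbs++; }
    return x;
}

/* Approximate the value described by (d, size, exp), where d[0] is the
   least significant limb, size is signed, and exp counts limbs. */
double toy_f_to_double(const uint32_t *d, int size, long exp)
{
    int n = size < 0 ? -size : size;   /* number of limbs in use */
    double v = 0.0;

    for (int i = 0; i < n; i++)
        v += toy_scale((double)d[i], exp - n + i);

    return size < 0 ? -v : v;
}
```

With limbs @code{@{0x80000000, 3@}}, a size of 2 and an exponent of 1, the radix point sits one limb below the top, so the high limb contributes 3 and the low limb contributes one half.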
- |
-The @code{_mp_size} and @code{_mp_prec} fields are @code{int}, although the |
-@code{mp_size_t} type is usually a @code{long}. The @code{_mp_exp} field is |
-usually @code{long}. This is done to make some fields just 32 bits on some |
-64-bit systems, thereby saving a few bytes of data space but still providing |
-plenty of precision and a very large range. |
- |
- |
-@sp 1 |
-@noindent |
-The following various points should be noted. |
- |
-@table @asis |
-@item Low Zeros |
-The least significant limbs @code{_mp_d[0]} etc can be zero, though such low |
-zeros can always be ignored. Routines likely to produce low zeros check and |
-avoid them to save time in subsequent calculations, but for most routines |
-they're quite unlikely and aren't checked. |
- |
-@item Mantissa Size Range |
-The @code{_mp_size} count of limbs in use can be less than @code{_mp_prec} if |
-the value can be represented in less. This means low precision values or |
-small integers stored in a high precision @code{mpf_t} can still be operated |
-on efficiently. |
- |
-@code{_mp_size} can also be greater than @code{_mp_prec}. Firstly a value is |
-allowed to use all of the @code{_mp_prec+1} limbs available at @code{_mp_d}, |
-and secondly when @code{mpf_set_prec_raw} lowers @code{_mp_prec} it leaves |
-@code{_mp_size} unchanged and so the size can be arbitrarily bigger than |
-@code{_mp_prec}. |
- |
-@item Rounding |
-All rounding is done on limb boundaries. Calculating @code{_mp_prec} limbs |
-with the high non-zero will ensure the application requested minimum precision |
-is obtained. |
- |
-The use of simple ``trunc'' rounding towards zero is efficient, since there's |
-no need to examine extra limbs and increment or decrement. |
- |
-@item Bit Shifts |
-Since the exponent is in limbs, there are no bit shifts in basic operations |
-like @code{mpf_add} and @code{mpf_mul}. When differing exponents are |
-encountered all that's needed is to adjust pointers to line up the relevant |
-limbs. |
- |
-Of course @code{mpf_mul_2exp} and @code{mpf_div_2exp} will require bit shifts, |
-but the choice is between an exponent in limbs which requires shifts there, or |
-one in bits which requires them almost everywhere else. |
- |
-@item Use of @code{_mp_prec+1} Limbs |
-The extra limb on @code{_mp_d} (@code{_mp_prec+1} rather than just |
-@code{_mp_prec}) helps when an @code{mpf} routine might get a carry from its |
-operation. @code{mpf_add} for instance will do an @code{mpn_add} of |
-@code{_mp_prec} limbs. If there's no carry then that's the result, but if |
-there is a carry then it's stored in the extra limb of space and |
-@code{_mp_size} becomes @code{_mp_prec+1}. |
- |
-Whenever @code{_mp_prec+1} limbs are held in a variable, the low limb is not |
-needed for the intended precision, only the @code{_mp_prec} high limbs. But |
-zeroing it out or moving the rest down is unnecessary. Subsequent routines |
-reading the value will simply take the high limbs they need, and this will be |
-@code{_mp_prec} if their target has that same precision. This is no more than |
-a pointer adjustment, and must be checked anyway since the destination |
-precision can be different from the sources. |
- |
-Copy functions like @code{mpf_set} will retain a full @code{_mp_prec+1} limbs |
-if available. This ensures that a variable which has @code{_mp_size} equal to |
-@code{_mp_prec+1} will get its full exact value copied. Strictly speaking |
-this is unnecessary since only @code{_mp_prec} limbs are needed for the |
-application's requested precision, but it's considered that an @code{mpf_set} |
-from one variable into another of the same precision ought to produce an exact |
-copy. |
- |
-@item Application Precisions |
-@code{__GMPF_BITS_TO_PREC} converts an application requested precision to an |
-@code{_mp_prec}. The value in bits is rounded up to a whole limb, then an |
-extra limb is added since the most significant limb of @code{_mp_d} is only |
-guaranteed to be non-zero and therefore might contain only one bit. |
- |
-@code{__GMPF_PREC_TO_BITS} does the reverse conversion, and removes the extra |
-limb from @code{_mp_prec} before converting to bits. The net effect of |
-reading back with @code{mpf_get_prec} is simply the precision rounded up to a |
-multiple of @code{mp_bits_per_limb}. |
- |
-Note that the extra limb added here, because the high limb is only guaranteed |
-to be non-zero, is in addition to the extra limb allocated to @code{_mp_d}. |
-For example with a 32-bit limb, an application request for 250 bits will be |
-rounded up to 8 limbs, then an extra limb is added for the high limb, giving an |
-@code{_mp_prec} of 9. @code{_mp_d} then gets 10 limbs allocated. Reading |
-back with @code{mpf_get_prec} will take @code{_mp_prec} subtract 1 limb and |
-multiply by 32, giving 256 bits. |
- |
-Strictly speaking, the fact the high limb has at least one bit means that a |
-float with, say, 3 limbs of 32 bits each will be holding at least 65 bits, but |
-for the purposes of @code{mpf_t} it's considered simply to be 64 bits, a nice |
-multiple of the limb size. |
-@end table |
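
The precision arithmetic described under ``Application Precisions'' can be written out as a short C sketch. The helper names are hypothetical and the limb width is passed as a parameter; the real @code{__GMPF_BITS_TO_PREC} and @code{__GMPF_PREC_TO_BITS} macros may differ in detail, so treat this only as a restatement of the rounding rule given in the text.

```c
#include <assert.h>

/* Requested bits -> precision in limbs: round up to whole limbs, then add
   one limb because the high limb might hold only a single bit. */
long toy_bits_to_prec(long bits, long limb_bits)
{
    assert(bits > 0 && limb_bits > 0);
    long whole = (bits + limb_bits - 1) / limb_bits;
    return whole + 1;
}

/* Precision in limbs -> bits reported back: drop the extra limb first. */
long toy_prec_to_bits(long prec, long limb_bits)
{
    assert(prec >= 1 && limb_bits > 0);
    return (prec - 1) * limb_bits;
}

/* Limbs allocated for the data: one more than the precision. */
long toy_alloc_limbs(long prec)
{
    return prec + 1;
}
```

For the 250-bit request with 32-bit limbs used in the text, this gives a precision of 9 limbs, an allocation of 10 limbs, and 256 bits when read back.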
- |
- |
-@node Raw Output Internals, C++ Interface Internals, Float Internals, Internals |
-@section Raw Output Internals |
-@cindex Raw output internals |
- |
-@noindent |
-@code{mpz_out_raw} uses the following format. |
- |
-@tex |
-\global\newdimen\GMPboxwidth \GMPboxwidth=5em |
-\global\newdimen\GMPboxheight \GMPboxheight=3ex |
-\def\centreline{\hbox{\raise 0.8ex \vbox{\hrule \hbox{\hfil}}}} |
-\GMPdisplay{% |
-\vbox{% |
- \def\GMPcentreline#1{\hbox{\raise 0.5 ex \vbox{\hrule \hbox to #1 {}}}} |
- \vbox {% |
- \hrule |
- \hbox{% |
- \vrule height 2.5ex depth 1.5ex |
- \hbox to \GMPboxwidth {\hfil size\hfil}% |
- \vrule |
- \hbox to 3\GMPboxwidth {\hfil data bytes\hfil}% |
- \vrule} |
- \hrule} |
-}} |
-@end tex |
-@ifnottex |
-@example |
-+------+------------------------+ |
-| size | data bytes | |
-+------+------------------------+ |
-@end example |
-@end ifnottex |
- |
-The size is 4 bytes written most significant byte first, being the number of |
-subsequent data bytes, or the twos complement negative of that when a negative |
-integer is represented. The data bytes are the absolute value of the integer, |
-written most significant byte first. |
- |
-The most significant data byte is always non-zero, so the output is the same |
-on all systems, irrespective of limb size. |
- |
-In GMP 1, leading zero bytes were written to pad the data bytes to a multiple |
-of the limb size. @code{mpz_inp_raw} will still accept this, for |
-compatibility. |
- |
-The use of ``big endian'' for both the size and data fields is deliberate, it |
-makes the data easy to read in a hex dump of a file. Unfortunately it also |
-means that the limb data must be reversed when reading or writing, so neither |
-a big endian nor little endian system can just read and write @code{_mp_d}. |
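
The byte layout can be illustrated with a small C encoder for values that fit in a 64-bit integer. This is a sketch of the format as described above, not GMP's implementation: @code{toy_out_raw} is a hypothetical function writing to a caller-supplied buffer, and writing zero as a size of 0 with no data bytes is an assumption drawn from the rule that the top data byte is always non-zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode v as a 4-byte big-endian size (negated, in twos complement, for
   negative v) followed by the magnitude, most significant byte first.
   buf must have room for 12 bytes.  Returns the number of bytes written. */
size_t toy_out_raw(unsigned char *buf, int64_t v)
{
    assert(buf != NULL);

    /* Magnitude, computed in unsigned arithmetic so INT64_MIN is safe. */
    uint64_t mag = v < 0 ? (uint64_t)0 - (uint64_t)v : (uint64_t)v;

    /* Count data bytes, so the top data byte is always non-zero. */
    int nbytes = 0;
    for (uint64_t t = mag; t != 0; t >>= 8)
        nbytes++;

    uint32_t size = v < 0 ? (uint32_t)0 - (uint32_t)nbytes : (uint32_t)nbytes;
    buf[0] = (unsigned char)(size >> 24);
    buf[1] = (unsigned char)(size >> 16);
    buf[2] = (unsigned char)(size >> 8);
    buf[3] = (unsigned char)size;

    for (int i = 0; i < nbytes; i++)
        buf[4 + i] = (unsigned char)(mag >> (8 * (nbytes - 1 - i)));

    return 4 + (size_t)nbytes;
}
```

Encoding -258 this way yields the six bytes FF FF FF FE 01 02: a size field of -2 followed by the magnitude 0x0102.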
- |
- |
-@node C++ Interface Internals, , Raw Output Internals, Internals |
-@section C++ Interface Internals |
-@cindex C++ interface internals |
- |
-A system of expression templates is used to ensure something like @code{a=b+c} |
-turns into a simple call to @code{mpz_add} etc. For @code{mpf_class} |
-the scheme also ensures the precision of the final |
-destination is used for any temporaries within a statement like |
-@code{f=w*x+y*z}. These are important features which a naive implementation |
-cannot provide. |
- |
-A simplified description of the scheme follows. The true scheme is |
-complicated by the fact that expressions have different return types. For |
-detailed information, refer to the source code. |
- |
-To perform an operation, say, addition, we first define a ``function object'' |
-evaluating it, |
- |
-@example |
-struct __gmp_binary_plus |
-@{ |
- static void eval(mpf_t f, mpf_t g, mpf_t h) @{ mpf_add(f, g, h); @} |
-@}; |
-@end example |
- |
-@noindent |
-And an ``additive expression'' object, |
- |
-@example |
-__gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> > |
-operator+(const mpf_class &f, const mpf_class &g) |
-@{ |
- return __gmp_expr |
- <__gmp_binary_expr<mpf_class, mpf_class, __gmp_binary_plus> >(f, g); |
-@} |
-@end example |
- |
-The seemingly redundant @code{__gmp_expr<__gmp_binary_expr<@dots{}>>} is used to |
-encapsulate any possible kind of expression into a single template type. In |
-fact even @code{mpf_class} etc are @code{typedef} specializations of |
-@code{__gmp_expr}. |
- |
-Next we define assignment of @code{__gmp_expr} to @code{mpf_class}. |
- |
-@example |
-template <class T> |
-mpf_class & mpf_class::operator=(const __gmp_expr<T> &expr) |
-@{ |
- expr.eval(this->get_mpf_t(), this->precision()); |
- return *this; |
-@} |
- |
-template <class Op> |
-void __gmp_expr<__gmp_binary_expr<mpf_class, mpf_class, Op> >::eval |
-(mpf_t f, unsigned long int precision) |
-@{ |
- Op::eval(f, expr.val1.get_mpf_t(), expr.val2.get_mpf_t()); |
-@} |
-@end example |
- |
-where @code{expr.val1} and @code{expr.val2} are references to the expression's |
-operands (here @code{expr} is the @code{__gmp_binary_expr} stored within the |
-@code{__gmp_expr}). |
- |
-This way, the expression is actually evaluated only at the time of assignment, |
-when the required precision (that of @code{f}) is known. Furthermore the |
-target @code{mpf_t} is now available, thus we can call @code{mpf_add} directly |
-with @code{f} as the output argument. |
- |
-Compound expressions are handled by defining operators taking subexpressions |
-as their arguments, like this: |
- |
-@example |
-template <class T, class U> |
-__gmp_expr |
-<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> > |
-operator+(const __gmp_expr<T> &expr1, const __gmp_expr<U> &expr2) |
-@{ |
- return __gmp_expr |
- <__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, __gmp_binary_plus> > |
- (expr1, expr2); |
-@} |
-@end example |
- |
-And the corresponding specializations of @code{__gmp_expr::eval}: |
- |
-@example |
-template <class T, class U, class Op> |
-void __gmp_expr |
-<__gmp_binary_expr<__gmp_expr<T>, __gmp_expr<U>, Op> >::eval |
-(mpf_t f, unsigned long int precision) |
-@{ |
- // declare two temporaries |
- mpf_class temp1(expr.val1, precision), temp2(expr.val2, precision); |
- Op::eval(f, temp1.get_mpf_t(), temp2.get_mpf_t()); |
-@} |
-@end example |
- |
-The expression is thus recursively evaluated to any level of complexity and |
-all subexpressions are evaluated to the precision of @code{f}. |
- |
- |
-@node Contributors, References, Internals, Top |
-@comment node-name, next, previous, up |
-@appendix Contributors |
-@cindex Contributors |
- |
-Torbj@"orn Granlund wrote the original GMP library and is still the main |
-developer. Code not explicitly attributed to others was contributed by |
-Torbj@"orn. Several other individuals and organizations have contributed to |
-GMP. Here is a list in chronological order of first contribution: |
- |
-Gunnar Sj@"odin and Hans Riesel helped with mathematical problems in early |
-versions of the library. |
- |
-Richard Stallman helped with the interface design and revised the first |
-version of this manual. |
- |
-Brian Beuning and Doug Lea helped with testing of early versions of the |
-library and made creative suggestions. |
- |
-John Amanatides of York University in Canada contributed the function |
-@code{mpz_probab_prime_p}. |
- |
-Paul Zimmermann wrote the REDC-based @code{mpz_powm} code, the Sch@"onhage-Strassen |
-FFT multiply code, and the Karatsuba square root code. He also improved the |
-Toom3 code for GMP 4.2. Paul sparked the development of GMP 2, with his |
-comparisons between bignum packages. The ECMNET project Paul is organizing |
-was a driving force behind many of the optimizations in GMP 3. Paul also |
-wrote the new GMP 4.3 nth root code (with Torbj@"orn). |
- |
-Ken Weber (Kent State University, Universidade Federal do Rio Grande do Sul) |
-contributed @code{mpz_gcd}, @code{mpz_divexact}, @code{mpn_gcd}, and |
-@code{mpn_bdivmod}, partially supported by CNPq (Brazil) grant 301314194-2. |
- |
-Per Bothner of Cygnus Support helped to set up GMP to use Cygnus' configure. |
-He has also made valuable suggestions and tested numerous intermediary |
-releases. |
- |
-Joachim Hollman was involved in the design of the @code{mpf} interface, and in |
-the @code{mpz} design revisions for version 2. |
- |
-Bennet Yee contributed the initial versions of @code{mpz_jacobi} and |
-@code{mpz_legendre}. |
- |
-Andreas Schwab contributed the files @file{mpn/m68k/lshift.S} and |
-@file{mpn/m68k/rshift.S} (now in @file{.asm} form). |
- |
-Robert Harley of Inria, France and David Seal of ARM, England, suggested clever |
-improvements for population count. Robert also wrote highly optimized |
-Karatsuba and 3-way Toom multiplication functions for GMP 3, and contributed |
-the ARM assembly code. |
- |
-Torsten Ekedahl of the Mathematical department of Stockholm University provided |
-significant inspiration during several phases of the GMP development. His |
-mathematical expertise helped improve several algorithms. |
- |
-Linus Nordberg wrote the new configure system based on autoconf and |
-implemented the new random functions. |
- |
-Kevin Ryde worked on a large number of things: optimized x86 code, m4 asm |
-macros, parameter tuning, speed measuring, the configure system, function |
-inlining, divisibility tests, bit scanning, Jacobi symbols, Fibonacci and Lucas |
-number functions, printf and scanf functions, perl interface, demo expression |
-parser, the algorithms chapter in the manual, @file{gmpasm-mode.el}, and |
-various miscellaneous improvements elsewhere. |
- |
-Kent Boortz made the Mac OS 9 port. |
- |
-Steve Root helped write the optimized alpha 21264 assembly code. |
- |
-Gerardo Ballabio wrote the @file{gmpxx.h} C++ class interface and the C++ |
-@code{istream} input routines. |
- |
-Jason Moxham rewrote @code{mpz_fac_ui}. |
- |
-Pedro Gimeno implemented the Mersenne Twister and made other random number |
-improvements. |
- |
-Niels M@"oller wrote the sub-quadratic GCD and extended GCD code, the |
-quadratic Hensel division code, and (with Torbj@"orn) the new divide and |
-conquer division code for GMP 4.3. Niels also helped implement the new Toom |
-multiply code for GMP 4.3. |
- |
-Alberto Zanoni and Marco Bodrato suggested the unbalanced multiply strategy, |
-and found the optimal strategies for evaluation and interpolation in Toom |
-multiplication. Marco also helped implement the new Toom multiply code for |
-GMP 4.3. |
- |
-David Harvey suggested the internal function @code{mpn_bdiv_dbm1}, |
-implementing division relevant to Toom multiplication. He also worked on |
-fast assembly sequences, in particular on a fast AMD64 |
-@code{mpn_mul_basecase}. |
- |
-(This list is chronological, not ordered by significance. If you have |
-contributed to GMP but are not listed above, please tell @email{tege@@gmplib.org} |
-about the omission!) |
- |
-The development of floating point functions of GNU MP 2 was supported in part |
-by the ESPRIT-BRA (Basic Research Activities) 6846 project POSSO (POlynomial |
-System SOlving). |
- |
-The development of GMP 2, 3, and 4 was supported in part by the IDA Center for |
-Computing Sciences. |
- |
-Thanks go to Hans Thorsen for donating an SGI system for the GMP test system |
-environment. |
- |
-@node References, GNU Free Documentation License, Contributors, Top |
-@comment node-name, next, previous, up |
-@appendix References |
-@cindex References |
- |
-@c FIXME: In tex, the @uref's are unhyphenated, which is good for clarity, |
-@c but being long words they upset paragraph formatting (the preceding line |
-@c can get badly stretched). Would like a conditional @* style line break |
-@c if the uref is too long to fit on the last line of the paragraph, but it's |
-@c not clear how to do that. For now explicit @texlinebreak{}s are used on |
-@c paragraphs that come out bad. |
- |
-@section Books |
- |
-@itemize @bullet |
-@item |
-Jonathan M. Borwein and Peter B. Borwein, ``Pi and the AGM: A Study in |
-Analytic Number Theory and Computational Complexity'', Wiley, 1998. |
- |
-@item |
-Henri Cohen, ``A Course in Computational Algebraic Number Theory'', Graduate |
-Texts in Mathematics number 138, Springer-Verlag, 1993. |
-@texlinebreak{} @uref{http://www.math.u-bordeaux.fr/~cohen/} |
- |
-@item |
-Donald E. Knuth, ``The Art of Computer Programming'', volume 2, |
-``Seminumerical Algorithms'', 3rd edition, Addison-Wesley, 1998. |
-@texlinebreak{} @uref{http://www-cs-faculty.stanford.edu/~knuth/taocp.html} |
- |
-@item |
-John D. Lipson, ``Elements of Algebra and Algebraic Computing'', |
-The Benjamin Cummings Publishing Company Inc, 1981. |
- |
-@item |
-Alfred J. Menezes, Paul C. van Oorschot and Scott A. Vanstone, ``Handbook of |
-Applied Cryptography'', @uref{http://www.cacr.math.uwaterloo.ca/hac/} |
- |
-@item |
-Richard M. Stallman, ``Using and Porting GCC'', Free Software Foundation, 1999, |
-available online @uref{http://gcc.gnu.org/onlinedocs/}, and in |
-the GCC package @uref{ftp://ftp.gnu.org/gnu/gcc/} |
-@end itemize |
- |
-@section Papers |
- |
-@itemize @bullet |
-@item |
-Yves Bertot, Nicolas Magaud and Paul Zimmermann, ``A Proof of GMP Square |
-Root'', Journal of Automated Reasoning, volume 29, 2002, pp.@: 225-252. Also |
-available online as INRIA Research Report 4475, June 2001, |
-@uref{http://www.inria.fr/rrrt/rr-4475.html} |
- |
-@item |
-Christoph Burnikel and Joachim Ziegler, ``Fast Recursive Division'', |
-Max-Planck-Institut fuer Informatik Research Report MPI-I-98-1-022, |
-@texlinebreak{} @uref{http://data.mpi-sb.mpg.de/internet/reports.nsf/NumberView/1998-1-022} |
- |
-@item |
-Torbj@"orn Granlund and Peter L. Montgomery, ``Division by Invariant Integers |
-using Multiplication'', in Proceedings of the SIGPLAN PLDI'94 Conference, June |
-1994. Also available @uref{ftp://ftp.cwi.nl/pub/pmontgom/divcnst.psa4.gz} |
-(and .psl.gz). |
- |
-@item |
-Tudor Jebelean, |
-``An algorithm for exact division'', |
-Journal of Symbolic Computation, |
-volume 15, 1993, pp.@: 169-180. |
-Research report version available @texlinebreak{} |
-@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-35.ps.gz} |
- |
-@item |
-Tudor Jebelean, ``Exact Division with Karatsuba Complexity - Extended |
-Abstract'', RISC-Linz technical report 96-31, @texlinebreak{} |
-@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-31.ps.gz} |
- |
-@item |
-Tudor Jebelean, ``Practical Integer Division with Karatsuba Complexity'', |
-ISSAC 97, pp.@: 339-341. Technical report available @texlinebreak{} |
-@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1996/96-29.ps.gz} |
- |
-@item |
-Tudor Jebelean, ``A Generalization of the Binary GCD Algorithm'', ISSAC 93, |
-pp.@: 111-116. Technical report version available @texlinebreak{} |
-@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1993/93-01.ps.gz} |
- |
-@item |
-Tudor Jebelean, ``A Double-Digit Lehmer-Euclid Algorithm for Finding the GCD |
-of Long Integers'', Journal of Symbolic Computation, volume 19, 1995, |
-pp.@: 145-157. Technical report version also available @texlinebreak{} |
-@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1992/92-69.ps.gz} |
- |
-@item |
-Werner Krandick and Tudor Jebelean, ``Bidirectional Exact Integer Division'', |
-Journal of Symbolic Computation, volume 21, 1996, pp.@: 441-455. Early |
-technical report version also available |
-@uref{ftp://ftp.risc.uni-linz.ac.at/pub/techreports/1994/94-50.ps.gz} |
- |
-@item |
-Makoto Matsumoto and Takuji Nishimura, ``Mersenne Twister: A 623-dimensionally |
-equidistributed uniform pseudorandom number generator'', ACM Transactions on |
-Modelling and Computer Simulation, volume 8, January 1998, pp.@: 3-30. |
-Available online @texlinebreak{} |
-@uref{http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/mt.ps.gz} (or .pdf) |
- |
-@item |
-R. Moenck and A. Borodin, ``Fast Modular Transforms via Division'', |
-Proceedings of the 13th Annual IEEE Symposium on Switching and Automata |
-Theory, October 1972, pp.@: 90-96. Reprinted as ``Fast Modular Transforms'', |
-Journal of Computer and System Sciences, volume 8, number 3, June 1974, |
-pp.@: 366-386. |
- |
-@item |
-Niels M@"oller, ``On Sch@"onhage's algorithm and subquadratic integer GCD |
-computation'', in Mathematics of Computation, volume 77, January 2008, |
-pp.@: 589-607. |
- |
-@item |
-Peter L. Montgomery, ``Modular Multiplication Without Trial Division'', in |
-Mathematics of Computation, volume 44, number 170, April 1985, pp.@: 519-521. |
- |
-@item |
-Arnold Sch@"onhage and Volker Strassen, ``Schnelle Multiplikation grosser |
-Zahlen'', Computing 7, 1971, pp.@: 281-292. |
- |
-@item |
-Kenneth Weber, ``The accelerated integer GCD algorithm'', |
-ACM Transactions on Mathematical Software, |
-volume 21, number 1, March 1995, pp.@: 111-122. |
- |
-@item |
-Paul Zimmermann, ``Karatsuba Square Root'', INRIA Research Report 3805, |
-November 1999, @uref{http://www.inria.fr/rrrt/rr-3805.html} |
- |
-@item |
-Paul Zimmermann, ``A Proof of GMP Fast Division and Square Root |
-Implementations'', @texlinebreak{} |
-@uref{http://www.loria.fr/~zimmerma/papers/proof-div-sqrt.ps.gz} |
- |
-@item |
-Dan Zuras, ``On Squaring and Multiplying Large Integers'', ARITH-11: IEEE |
-Symposium on Computer Arithmetic, 1993, pp.@: 260-271. Reprinted as ``More |
-on Multiplying and Squaring Large Integers'', IEEE Transactions on Computers, |
-volume 43, number 8, August 1994, pp.@: 899-908. |
-@end itemize |
- |
- |
-@node GNU Free Documentation License, Concept Index, References, Top |
-@appendix GNU Free Documentation License |
-@cindex GNU Free Documentation License |
-@cindex Free Documentation License |
-@cindex Documentation license |
-@include fdl.texi |
- |
- |
-@node Concept Index, Function Index, GNU Free Documentation License, Top |
-@comment node-name, next, previous, up |
-@unnumbered Concept Index |
-@printindex cp |
- |
-@node Function Index, , Concept Index, Top |
-@comment node-name, next, previous, up |
-@unnumbered Function and Type Index |
-@printindex fn |
- |
-@bye |
- |
-@c Local variables: |
-@c fill-column: 78 |
-@c compile-command: "make gmp.info" |
-@c End: |