Note: This white paper was authored by FairCom Chief Architect Mike Bowers
If you are using SQL in database operations, it is all but inevitable that you will use one of these two data types, Approximate and Exact, when storing numbers in a database column. Therefore, it is important that you understand the differences between the two and the use cases for each. In this piece, we will go in depth about the principles of Approximate and Exact numeric data types.
Considerations for Using Approximate and Exact Numeric Data Types
The Exact Numeric Data Types are the best fit for storing exact values, performing precise and safe calculations, performing the fastest integer calculations and avoiding binary-to-decimal rounding errors. Values are stored as literal representations of a number’s value. Common numeric data types in SQL databases are BIT, TINYINT, SMALLINT, INTEGER, BIGINT, MONEY, and NUMERIC.
On the other hand, Approximate Numeric Data Types are the best fit for minimizing storage space and performing fast fractional calculations. Even though Approximate Numeric Data Types represent real numbers, they are an approximation due to the method in which computer systems represent them. Common Approximate Numeric Data Types in SQL are REAL and DOUBLE. It must be noted that REAL requires four bytes of storage. DOUBLE requires eight bytes. REAL has half the precision of DOUBLE and runs calculations faster.
SQL databases may use different names for these data types. NUMBER and DECIMAL are often synonyms for NUMERIC. INT is a synonym for INTEGER. Synonyms for REAL include FLOAT(24) and BINARY_FLOAT. Synonyms for DOUBLE include FLOAT(53), DOUBLE PRECISION and BINARY_DOUBLE.
What types to choose and when to use:
- Choose REAL when binary floating point numbers are desirable and you need safe integer precision between -16,777,216 to 16,777,216, or you need less than seven decimal digits of fractional precision.
- Choose DOUBLE when you need safe integer precision between 9,007,199,254,740,991 and -9,007,199,254,740,991, or you need between seven and 15 decimal digits of fractional precision.
- Choose NUMERIC(p,s) when you need 32 or more digits of base 10 precision with a fixed number of fractional digits. This is more than twice the decimal precision of DOUBLE. You set the maximum amount of precision (maximum number of digits to the left and right of the decimal point) and the scale (the number of digits to the right of the decimal point). NUMBER requires 19 or more bytes of storage compared to DOUBLE’s 8 bytes, and it runs calculations slower than DOUBLE. Unlike DOUBLE, NUMBER has no binary rounding and conversion errors. Lastly, NUMBER, NUMERIC, and DECIMAL are different labels for the same data type.
- Choose MONEY when you need exact numeric precision from -922,337,203,685,477.5808 to 922,337,203,685,477.5807. It has 18 decimal digits of precision with four fixed decimal places. It uses 8 bytes of storage. It runs calculations significantly faster than REAL and DOUBLE without binary rounding and conversion errors.
- Choose BIGINT when you need exact numeric integers from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807. It is ideal for auto-incrementing unique keys. It uses 8 bytes of storage and has 18 digits of decimal precision. It also runs integer calculations significantly faster than REAL and DOUBLE.
- Choose INTEGER when you need exact numeric integers from -2,147,483,648 to 2,147,483,647. It is ideal for auto-incrementing unique keys. It uses 4 bytes of storage and has 12 digits of decimal precision. It is fastest at running integer calculations.
- Choose SMALLINT when you need exact numeric integers from -32,768 to 32,767. It uses 2 bytes of storage and has four digits of decimal precision. It is fast at running integer calculations.
- Choose TINYINT when you need exact numeric integers from -128 to 127. It uses 1 byte of storage and has two digits of decimal precision. It is fast at running integer calculations.
- Choose BIT when you need exact numeric integers from 0 to 1. Depending on the database, it uses 1 bit to 4 bytes of storage. It is useful for Boolean values where 0 is false and 1 is true.
Challenges with Approximate Numeric Data Types
The use of numeric data types can be tricky, especially in regards to approximate numeric data types. Here are some examples:
- Binary floating-point numbers are approximate numbers because they trade precision for increased range. This allows them to represent numbers that are much larger than their precision. For example, a signed 32-bit integer is an exact numeric type and accurately represents every number from -2,147,483,648 and 2,147,483,647 without a loss of precision, but it cannot store any larger values and it cannot store fractions. In contrast, a 32-bit binary floating point number can store numbers as small as 1.4012984643E−45 and as large as 3.4028234664E+38, but the numbers in between vary greatly in their amount of precision.
- Floating point numbers lose precision as their integer part becomes larger and as their fractional part becomes smaller. For example, a 32-bit binary floating point number can safely store integers from -16,777,216 to 16,777,216 without losing precision. Integer numbers beyond this range are unsafe because they lose precision and automatically start rounding up to higher numbers. For example, 16,777,217 and 16,777,218 both round up to 16,777,218; thus, 16,777,217 = 16,777,218. The numbers 33,554,433 and 33,554,434 and 33,554,435 and 33,554,436 all round up to 33,554,436; thus, they are all equal to each other. The rounding problem increases until any number larger than 2.0E+128 rounds to infinity and any number less than -2.0E+128 rounds to -infinity.
- Another challenge with binary floating-point numbers is that base two and base 10 fractions round differently. For example, 1 ÷ 10 in base 10 produces a simple, non-repeating fraction of 0.1. In base two, the same value is 0.000110011001100110011001100110011… which repeats 0011 forever. Because a digital computer is finite, it must round up repeating fractions. Thus, when 0.1 (base 10) is stored as a binary floating point number, the computer rounds it up to the equivalent of 0.10000000000000000555 in base 10. Thus, calculations using binary numbers often produce non-intuitive results in base 10.
- Binary rounding can create mistrust in end users because software converts the results of binary calculations to base 10 before displaying them to the user. Software attempts to hide binary rounding errors from users by truncating decimal precision. Unfortunately, it cannot hide all rounding errors. Thus, numerically precise users, such as accountants and scientists, find inconsistencies that cannot be easily explained. Thus, software should use exact numeric types when it is built for numerically precise users.
If you are already a FairCom customer, here is some additional information and resources:
The FairCom Database Engine implements all these SQL types. It also provides a non-SQL API where you can specify additional numeric data types. Once created, these types can be read and written using SQL:
- Unsigned BIGINT provides exact numeric integers from 0 to 18,446,744,073,709,551,616. It is ideal for auto-incrementing unique keys. It uses 8 bytes of storage and has 19 digits of decimal precision. It also runs integer calculations significantly faster than REAL and DOUBLE.
- Unsigned INTEGER provides exact numeric integers from 0 to 4,294,967,296. It is ideal for auto-incrementing unique keys. It uses 4 bytes of storage and has 12 digits of decimal precision. It is fastest at running integer calculations.
- Unsigned SMALLINT provides exact numeric integers from 0 to 65,536. It uses 2 bytes of storage and has 4 digits of decimal precision. It is fast at running integer calculations.
- Unsigned TINYINT provides exact numeric integers from 0 to 255. It uses 1 byte of storage and has 2 digits of decimal precision. It is fast at running integer calculations.
- Small MONEY provides exact numeric precision from -21,474,836.48 to 21,474,836.47. It has 9 decimal digits of precision with two fixed decimal places. It uses 4 bytes of storage. It runs calculations significantly faster than REAL and DOUBLE without binary rounding and conversion errors.