Today we continue stealing float bits, but in an entirely different way this time. We’ll end up with the ability to switch between float and 21-bit integer modes and to know which mode we’re in. We can do all of this without using any more than four bytes just by exploiting a little knowledge of the float data format. Read on to learn how!

The float Data Format

Recall from last time that the bits of the float format look like this, from most-significant on the left to least-significant on the right:

Sign Exponent Fraction
1 8 23

NaN, short for “not a number,” values are represented by setting all of the exponent bits to 1 and at least one of the fraction bits to 1. On all relevant CPUs, the most-significant bit of the fraction bits means “is quiet.” This is because there are two categories of NaN values: quiet and signalling. Quiet NaN (a.k.a. qNaN) is the normal NaN and has no further meaning. Signalling NaN (a.k.a. sNaN) uses the remaining 22 bits of the fraction to encode a reason for the NaN.

So for a NaN value, the bits of a float look like this:

Sign Exponent Quiet Signal
1 8 (must be 11111111) 1 22

Here are some examples of NaN values:

Name Sign Exponent Quiet Signal
Positive qNaN 0 11111111 1 0000000000000000000000
Negative qNaN 1 11111111 1 0000000000000000000000
Positive sNaN with 1 “signal” 0 11111111 0 0000000000000000000001
Positive sNaN with 5 “signal” 0 11111111 0 0000000000000000000101
Negative sNaN 1 11111111 0 0000000000000000000001
Stealing Bits

Today we’ll make a discriminated union of a float and a 21-bit integer. Normally we’d make something like this:

[StructLayout(LayoutKind.Explicit)]
public struct FloatIntDiscriminatedUnion
{
    [FieldOffset(0)] public bool IsInt;
    [FieldOffset(4)] public int Int;
    [FieldOffset(4)] public float Float;
}

This takes up 8 bytes instead of just the four bytes that a float takes up, so we’ll strive to do better. To do so, we’ll exploit the float format.

Notice that “quiet NaN” conveys no information other than that the value is “not a number” to its user. This is encoded in 9 of the bits: the all-1 exponent and the 1 in the most-significant fraction bit. The remaining 22 fraction bits are essentially wasted in this case because they seem to always be all 0.

That opens up an opportunity for us to use 21 of the unused fraction bits to store whatever integer we want without changing the meaning of “quiet NaN” because the fraction bits are ignored. Here’s how we can treat the bits:

Sign Exponent Quiet Stored integer
1 8 (must be 11111111) 1 22

This just leaves the problem of determining whether a float is storing an integer or is just a normal float. One approach would be to check that the most-significant 10 bits are all 1 and that the least-significant 22 bits are not all 0. Unfortunately, this has two problems. First, it requires two tests when we’d rather just perform one. Second, it doesn’t allow us to store integers whose value is 0. This second problem is unacceptable to most use cases.

To remedy this, let’s reserve the next-most-significant bit after the “is quiet” bit to indicate whether or not we’ve stored an integer:

Sign Exponent Quiet Is Int Stored integer
1 8 (must be 11111111) 1 1 21

Now all we need to do to check whether a float is storing an integer is to see if the most-significant 11 bits are all 1. If that’s true, then the remaining 21 bits are the stored integer. Otherwise, there’s no stored integer and the float may be a normal value, NaN, or anything else.

There are a few of issues with the way we’re using the bits of the float. First, we’ll need to spend a couple CPU cycles performing bitwise operations rather than directly accessing the values. Second, C# doesn’t really understand what we’re doing here so language features like passing by ref for a uint parameter won’t work since the type is still float. Third, we only get 21 bits rather than the full 32 so we’ll be limited to the 0 to 2097151 range for unsigned values and -1048576 to 1048575 for signed values. These are often acceptable tradeoffs, but it’ll depend on the particular data being processed.

Code

With this plan in mind, let’s write some utility functions to make this kind of access easy. Here they are at the end of the FloatBitStealing static class from last time:

/// <summary>
/// Utility functions for working with <see cref="float"/> values that have some
/// of their bits "stolen" to store another value.
///
/// These function encompass two ways to "steal" bits:
/// 1) Writing to the least-significant fraction bits, which decreases accuracy
///    but allows for combined data storage.
/// 2) Storing a 21-bit integer in the least-significant fraction bits of a
///    "quiet NaN" value, which allows for treating the data as either a float
///    or an integer with the ability to detect which "mode" it is in.
/// </summary>
///
/// <author>
/// Jackson Dunstan, https://JacksonDunstan.com/articles/5377
/// </author>
///
/// <license>
/// MIT
/// </license>
public static class FloatBitStealing
{
    /// <summary>
    /// Utility to change the type of a <see cref="float"/> to a
    /// <see cref="uint"/> or visa versa.
    /// </summary>
    [System.Runtime.InteropServices.StructLayout(
        System.Runtime.InteropServices.LayoutKind.Explicit)]
    private struct UintFloatUnion
    {
        /// <summary>
        /// <see cref="uint"/> value of the memory
        /// </summary>
        [System.Runtime.InteropServices.FieldOffset(0)] public uint Uint;
 
        /// <summary>
        /// <see cref="float"/> value of the memory
        /// </summary>
        [System.Runtime.InteropServices.FieldOffset(0)] public float Float;
    }
 
    /// <summary>
    /// Read the value with the least-signficant bits cleared to zero
    /// </summary>
    /// 
    /// <param name="f">
    /// Value to read from
    /// </param>
    /// 
    /// <param name="numStolenBits">
    /// Number of least-signficant bits to clear to zero
    /// </param>
    /// 
    /// <returns>
    /// The given value with the given number of least-signficant bits cleared
    /// to zero
    /// </returns>
    public static float ReadFloatCleared(float f, int numStolenBits)
    {
        // 1s for all stolen bits
        uint stolenMask = (1u << numStolenBits) - 1;
 
        UintFloatUnion ufu;
        ufu.Uint = 0;
        ufu.Float = f;
        ufu.Uint = ufu.Uint & ~stolenMask;
        return ufu.Float;
    }
 
    /// <summary>
    /// Read the least-signficant bits
    /// </summary>
    /// 
    /// <param name="f">
    /// Value to read from
    /// </param>
    /// 
    /// <param name="numStolenBits">
    /// Number of least-signficant bits to read
    /// </param>
    /// 
    /// <returns>
    /// The given number of least-signficant bits of the given value
    /// </returns>
    public static uint ReadStolen(float f, int numStolenBits)
    {
        // 1s for all stolen bits
        uint stolenMask = (1u << numStolenBits) - 1;
 
        UintFloatUnion ufu;
        ufu.Uint = 0;
        ufu.Float = f;
        return ufu.Uint & stolenMask;
    }
 
    /// <summary>
    /// Write a value to an existing value without overwriting its
    /// least-signficant bits
    /// </summary>
    /// 
    /// <param name="src">
    /// Value to write
    /// </param>
    /// 
    /// <param name="dest">
    /// Value to write to
    /// </param>
    /// 
    /// <param name="numStolenBits">
    /// Number of least-signficant bits to not overwrite
    /// </param>
    /// 
    /// <returns>
    /// The most-signficant <paramref name="numStolenBits"/> bits of the given
    /// <paramref name="src"/> value combined with the least-signficant bits of
    /// the given <paramref name="dest"/> value.
    /// </returns>
    public static float Write(float src, float dest, int numStolenBits)
    {
        // 1s for all stolen bits
        uint stolenMask = (1u << numStolenBits) - 1;
 
        // Get the not-stolen bits of 'src', which are the most-signficant bits
        UintFloatUnion ufu;
        ufu.Uint = 0;
        ufu.Float = src;
        uint srcNotStolen = ufu.Uint & ~stolenMask;
 
        // Get the stolen bits of 'dest', which are the least-significant bits
        ufu.Uint = 0;
        ufu.Float = dest;
        uint destStolen = ufu.Uint & stolenMask;
 
        // Combine them
        ufu.Uint = srcNotStolen | destStolen;
        return ufu.Float;
    }
 
    /// <summary>
    /// Check if a <see cref="float"/> is being used to store a
    /// <see cref="uint"/>. This is true if the sign bit and all exponent bits
    /// are 1 and the two most-significant fraction bits are 01. Use
    /// <see cref="SetInt"/> to create such a value and <see cref="GetInt"/>
    /// to recover the <see cref="uint"/> it contains.
    /// </summary>
    /// 
    /// <param name="f">
    /// <see cref="float"/> value to check
    /// </param>
    /// 
    /// <returns>
    /// If <paramref name="f"/> is being used to store a
    /// <see cref="uint"/>.
    /// </returns>
    public static bool IsInt(float f)
    {
        UintFloatUnion ufu;
        ufu.Uint = 0;
        ufu.Float = f;
 
        // The sign bit and all exponent bits must be 1
        // The two most-significant fraction bits must be 11
        return (ufu.Uint & 0xffe00000) == 0xffe00000;
    }
 
    /// <summary>
    /// Get a stored <see cref="uint"/> from a <see cref="float"/>. This is the
    /// least-significant 21 bits. Use <see cref="SetInt"/> to store a value
    /// that this function can get.
    /// </summary>
    /// 
    /// <param name="f">
    /// <see cref="float"/> value to get the stored <see cref="uint"/> from
    /// </param>
    /// 
    /// <returns>
    /// The stored <see cref="uint"/> in <paramref name="f"/>.
    /// </returns>
    public static uint GetInt(float f)
    {
        UintFloatUnion ufu;
        ufu.Uint = 0;
        ufu.Float = f;
 
        // Least-significant 21 bits
        return ufu.Uint & 0x1fffff;
    }
 
    /// <summary>
    /// Store a <see cref="uint"/> in a <see cref="float"/>. The stored value
    /// should not contain any 1s in the most-significant 11 bits. The returned
    /// value is a "quiet NaN". Use <see cref="GetInt"/> to get the value that
    /// this function stores.
    /// </summary>
    /// 
    /// <param name="i">
    /// The value to store in the least-significant 21 bits of a
    /// <see cref="float"/>.
    /// </param>
    /// 
    /// <returns>
    /// A <see cref="float"/> with <paramref name="i"/> stored in its
    /// least-significant 21 bits.
    /// </returns>
    public static float SetInt(uint i)
    {
        UintFloatUnion ufu;
        ufu.Float = 0;
        ufu.Uint = 0xffe00000 | i;
        return ufu.Float;
    }
}
Usage

Using these utility functions is easy. Here are some examples:

// Make a float that stores an integer
float stored123 = FloatBitStealing.SetInt(123);
Debug.Log(stored123); // NaN
 
// Get the integer back out of the float
uint got123 = FloatBitStealing.GetInt(stored123);
Debug.Log(got123); // 123
 
// Check if a float contains an integer
Debug.Log(FloatBitStealing.IsInt(1.0f)); // false
Debug.Log(FloatBitStealing.IsInt(float.NaN)); // false
Debug.Log(FloatBitStealing.IsInt(stored123)); // true
Conclusion

By understanding the data format of float, we can further exploit it to great effect. Here we’ve reduced an 8 byte data structure down to 4 bytes with pretty minimal drawbacks. This shows, again, that even “primitive” data types like float are hardly atomic and that we can peer inside of them and use their inner workings to our advantage.