Brash & Plucky

Recklessly encoding int as float

Quiz: Does this loop end, and if it does, what is the value of k after the loop? (assuming IEEE-754 and usual 32-bit semantics for int/float)

Running this code in VS you get k=16777217 which equals 2^24+1.

In other words, and in line with the true intent of the quiz, float can encode exactly (without any precision loss) all natural numbers up to and including 2^24. Because float encodes the sign in a dedicated bit, the same property holds for negative values as well. So you could hypothetically encode any int in [-2^24..+2^24] in a float with a static/implicit cast.
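A minimal sketch to check this claim by brute force: round-trip every non-negative int through a float and stop at the first one that doesn't survive. This is not necessarily the quiz loop; the function name first_inexact_int is made up for illustration, and 32-bit IEEE-754 floats are assumed.

```c
#include <stdint.h>

/* Hypothetical helper (not from the quiz): returns the first non-negative
   int that does not survive an int -> float -> int round trip. */
int32_t first_inexact_int(void)
{
    int32_t k = 0;
    for (;;) {
        /* volatile forces the value through a real 32-bit float,
           so no extra intermediate precision can hide the rounding. */
        volatile float f = (float)k;
        if ((int32_t)f != k)
            return k;
        ++k;
    }
}
```

Under those assumptions the function returns 16777217, i.e. 2^24+1, so every int up to 2^24 made the trip intact.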

I said hypothetically because, generally speaking, I would never do such a thing. Carelessly casting between int and float (without an explicit floor, ceil, trunc, round…) is often a sign of lousy coding, unless performance is at stake and you know well what you are doing.

However, I came across this situation in Unity recently, where the Vertex ID node in their Shader Graph hands over SV_VertexID as a float (disappointed face :-P). I would’ve expected a uint/int output, or a float that you could reinterpret-cast with asuint() to retrieve the raw SV_VertexID bits. But nope. You are handed a float which seemingly carries (float)SV_VertexID.

One may recklessly use this float to index a StructuredBuffer ignoring the way static-casting works. This is one case where ignorance is bliss, because the float received exactly matches the original int as long as the original value is <=2^24. That is, as long as you are dealing with fewer than (roughly) 16.7 million vertices, which is usually the case.

I believe that Unity’s ShaderGraph doesn’t support int/uint as In/Out values between nodes, so I guess that the Vertex ID and Instance ID nodes just static-cast the corresponding SV_... value to a float. But it would be better (in the pedantic sense) to reinterpret the bit pattern of the raw values with asfloat and then let the user retrieve them with asint()/asuint().

reinterpret-cast between int/float in HLSL
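In HLSL the reinterpret-cast is done with the asfloat()/asuint()/asint() intrinsics. As a rough CPU-side analogue, here is a minimal C sketch of the same idea using memcpy. The helper names bits_to_float and float_to_bits are mine, and 32-bit IEEE-754 floats are assumed.

```c
#include <stdint.h>
#include <string.h>

/* Rough analogue of HLSL asfloat(): same 32 bits, viewed as a float. */
float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);   /* copies the bit pattern, no numeric conversion */
    return f;
}

/* Rough analogue of HLSL asuint(): same 32 bits, viewed as an unsigned int. */
uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Unlike a static cast, float_to_bits(bits_to_float(id)) gives back id even above 2^24, because no rounding is involved. The catch is that the intermediate float is numerically meaningless (small ids look like denormals), so it must only be carried around, never used in arithmetic.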




A (loose) explanation of the 2^24 limit

This is the IEEE-754 standard for floating-point numbers, as described in Wikipedia:

The value of a float-encoded number is reconstructed as (-1)^S * M * 2^E.

  • S is the 1-bit sign.
  • M is the 23-bit mantissa, interpreted as 1.xxxxx (in binary). The leading 1 is implicit and is not stored.
  • E is the 8-bit exponent. It is stored with a bias of 127, so the power of 2 actually applied is the stored value minus 127.

e.g., a power-of-2 integer number like 4 would be encoded as:

  • M=1.00, which is the binary representation of 4 (100) with the point placed right after the left-most 1.
  • E=2 (+bias).

The restored number is 1.00 shl 2=4.

We should be able to do the same for all power-of-2 integers until we max out E.

For non-power-of-2 integers the process is similar. e.g., number 5:

  • M=1.01...00.
  • E=2 (+bias).

The restored number is now 1.01 shl 2=5.
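The two examples can be checked by pulling S, E and M out of the raw bits. A minimal sketch, assuming 32-bit IEEE-754 floats and handling normal numbers only (no zero, denormals, infinities or NaNs); the float_fields struct and decode_float are hypothetical names.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical struct holding the three single-precision fields. */
typedef struct {
    uint32_t sign;      /* S: 1 bit */
    int32_t  exponent;  /* E with the bias of 127 already removed */
    uint32_t fraction;  /* the 23 stored bits of M (the x's in 1.xxxxx) */
} float_fields;

float_fields decode_float(float f)
{
    uint32_t bits;
    float_fields out;
    memcpy(&bits, &f, sizeof bits);                 /* raw bit pattern */
    out.sign     = bits >> 31;                      /* top bit */
    out.exponent = (int32_t)((bits >> 23) & 0xFFu) - 127;
    out.fraction = bits & 0x7FFFFFu;                /* low 23 bits */
    return out;
}
```

For 4.0f this gives exponent 2 and fraction 0 (M=1.00...0). For 5.0f it gives exponent 2 and fraction 0x200000, which is .01 followed by 21 zeros (M=1.01...00).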

This works the same for all integers as long as M can hold the raw binary representation of the number. The largest pattern that M can hold is 23 consecutive 1s after the point. That is: 1.11...11 (24x1s in total, counting the implicit leading 1). With E=23 (+bias) this equals 2^24-1.

The next integer 2^24 would be encoded as 1.00...00 (24x0s clamped to 23, but the trailing 0s are meaningless here). With E=24 (+bias) this equals 2^24 (the answer provided above!!).

But the next integer, 2^24+1, would need 1.{23x0s}1, that is, 24 bits after the point, which can’t be encoded in a 23-bit M. So from 2^24 onwards, there is necessarily a loss. Some integers beyond 2^24 are still encoded exactly by float (all powers of 2 within range, and every even integer up to 2^25, for instance). But not all consecutive integers are. Actually, 2^24+1 is the first integer that isn’t.
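A small sketch to probe individual integers around the limit, assuming 32-bit IEEE-754 floats and the default round-to-nearest mode. The helper name is_exact_in_float is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if k survives an int -> float -> int
   round trip, 0 otherwise. The comparison is done in 64 bits so that a
   float rounded up to 2^31 does not overflow the conversion back. */
int is_exact_in_float(int32_t k)
{
    volatile float f = (float)k;   /* force a real 32-bit float */
    return (int64_t)f == (int64_t)k;
}
```

With this, 16777216 (2^24) and 16777218 are reported as exact, while 16777217 and 16777219 are not: between 2^24 and 2^25 float can only step in increments of 2.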

Whoa! As always, apologies for any mistakes in any of my posts.

[EDIT] The same reasoning can be applied to double, where the mantissa M is 52 bits long.

Hence double can encode exactly all integers in the range [-2^53..+2^53].
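The same kind of probe works for double. A minimal sketch, assuming 64-bit IEEE-754 doubles; is_exact_in_double is a hypothetical name, and the helper is only meant for values well below the int64 limits.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if k survives an int64 -> double -> int64
   round trip. Only meant for |k| far below 2^63, where converting the
   double back to int64_t cannot overflow. */
int is_exact_in_double(int64_t k)
{
    volatile double d = (double)k;   /* force a real 64-bit double */
    return (int64_t)d == k;
}
```

Here 9007199254740992 (2^53) is reported as exact and 9007199254740993 (2^53+1) is not.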