Float 8
- class onnx_array_api.validation.f8.CastFloat8
- Helpers to cast float8 into float32 or the other way around.
- static find_closest_value(value, sorted_values)
- Searches for a value in a sorted array of values.
- Parameters:
- value – float32 value to search for
- sorted_values – list of tuples [(float32, byte)]
 
- Returns:
- byte 
 - The function searches the first column for the closest value and returns the matching value from the second column.
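The lookup described above can be sketched with the standard bisect module. This is an independent illustration, not the library's code: the helper name closest_byte and the tie-breaking rule are assumptions.

```python
from bisect import bisect_left


def closest_byte(value, sorted_values):
    """Return the byte paired with the float closest to `value`.

    `sorted_values` is a list of (float, byte) tuples sorted by the float.
    Hypothetical helper, written only to illustrate the idea.
    """
    keys = [v for v, _ in sorted_values]
    pos = bisect_left(keys, value)
    if pos == 0:
        return sorted_values[0][1]
    if pos == len(sorted_values):
        return sorted_values[-1][1]
    below, above = sorted_values[pos - 1], sorted_values[pos]
    # Assumption: on an exact tie the lower neighbour wins.
    if value - below[0] <= above[0] - value:
        return below[1]
    return above[1]


# A tiny table of (value, E4M3 byte) pairs.
table = [(0.0, 0x00), (1.0, 0x38), (2.0, 0x40), (4.0, 0x48)]
print(closest_byte(1.2, table))    # 56
print(closest_byte(3.5, table))    # 72
print(closest_byte(100.0, table))  # 72
```

Values outside the table are clamped to the first or last entry, which is one possible reading of "closest value".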
 
- onnx_array_api.validation.f8.display_fe4m3(value, sign=1, exponent=4, mantissa=3)
- Displays a float 8 E4M3 in binary format.
- Parameters:
- value – value to display (int) 
- sign – number of bits for the sign 
- exponent – number of bits for the exponent 
- mantissa – number of bits for the mantissa 
 
- Returns:
- string 
 
- onnx_array_api.validation.f8.display_fe5m2(value, sign=1, exponent=4, mantissa=3)
- Displays a float 8 E5M2 in binary format.
- Parameters:
- value – value to display (int) 
- sign – number of bits for the sign 
- exponent – number of bits for the exponent 
- mantissa – number of bits for the mantissa 
 
- Returns:
- string 
 
- onnx_array_api.validation.f8.display_fexmx(value, sign, exponent, mantissa)
- Displays any float encoded with 1 bit for the sign, exponent bits for the exponent and mantissa bits for the mantissa.
- Parameters:
- value – value to display (int) 
- sign – number of bits for the sign 
- exponent – number of bits for the exponent 
- mantissa – number of bits for the mantissa 
 
- Returns:
- string 
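As an illustration of splitting an encoded value into its bit fields, here is a minimal standalone sketch. The function name format_bits and the dot separator are assumptions; the library's actual output format may differ.

```python
def format_bits(value, sign, exponent, mantissa):
    """Split the integer `value` into sign, exponent and mantissa bit fields."""
    total = sign + exponent + mantissa
    # Keep only the low `total` bits, then zero-pad to a fixed width.
    bits = format(value & ((1 << total) - 1), f"0{total}b")
    fields = (bits[:sign], bits[sign:sign + exponent], bits[sign + exponent:])
    # Assumption: fields are joined with dots for readability.
    return ".".join(fields)


print(format_bits(0x38, 1, 4, 3))  # 0.0111.000  (1.0 in E4M3)
print(format_bits(0x3C, 1, 5, 2))  # 0.01111.00  (1.0 in E5M2)
```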
 
- onnx_array_api.validation.f8.display_float16(value, sign=1, exponent=5, mantissa=10)
- Displays a float16 in binary format.
- Parameters:
- value – value to display (float16) 
- sign – number of bits for the sign 
- exponent – number of bits for the exponent 
- mantissa – number of bits for the mantissa 
 
- Returns:
- string 
 
- onnx_array_api.validation.f8.display_float32(value, sign=1, exponent=8, mantissa=23)
- Displays a float32 in binary format.
- Parameters:
- value – value to display (float32) 
- sign – number of bits for the sign 
- exponent – number of bits for the exponent 
- mantissa – number of bits for the mantissa 
 
- Returns:
- string 
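For float16 and float32, the bit pattern can be obtained by reinterpreting the packed bytes as an unsigned integer. The sketch below uses only the standard struct module; the helper name ieee_bits and the dot-separated output are assumptions, not the library's implementation.

```python
import struct


def ieee_bits(x, float_code, int_code, exponent):
    """Return the sign, exponent and mantissa bits of `x` as a dotted string.

    `float_code` / `int_code` are struct format characters of equal size,
    for example ("f", "I") for float32 or ("e", "H") for float16.
    """
    packed = struct.pack(">" + float_code, x)
    (as_int,) = struct.unpack(">" + int_code, packed)
    width = len(packed) * 8
    bits = format(as_int, f"0{width}b")
    return ".".join((bits[:1], bits[1:1 + exponent], bits[1 + exponent:]))


print(ieee_bits(1.0, "f", "I", 8))   # float32 layout: 1 + 8 + 23 bits
print(ieee_bits(1.0, "e", "H", 5))   # float16 layout: 1 + 5 + 10 bits
```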
 
- onnx_array_api.validation.f8.display_int(ival, sign=1, exponent=8, mantissa=23)
- Displays an integer as bits.
- Parameters:
- ival – value to display (int)
- sign – number of bits for the sign 
- exponent – number of bits for the exponent 
- mantissa – number of bits for the mantissa 
 
- Returns:
- string 
 
- onnx_array_api.validation.f8.fe4m3_to_float32(ival: int, fn: bool = True, uz: bool = False) -> float
- Casts a float E4M3 encoded as an integer into a float.
- Parameters:
- ival – byte
- fn – no infinite values
- uz – no negative zero 
 
- Returns:
- float (float 32) 
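To make the decoding concrete, here is a standalone sketch for the default case (fn=True, uz=False), that is the E4M3FN layout: 1 sign bit, 4 exponent bits with bias 7, 3 mantissa bits, no infinities, and NaN encoded as S.1111.111. The function name decode_e4m3fn is hypothetical and the code is not taken from the library.

```python
import math


def decode_e4m3fn(byte):
    """Decode one E4M3FN byte into a Python float."""
    sign = -1.0 if byte & 0x80 else 1.0
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    if exponent == 0x0F and mantissa == 0x07:
        # The only NaN pattern; there is no encoding for infinity.
        return math.nan
    if exponent == 0:
        # Subnormal: no implicit leading 1, fixed exponent 1 - bias = -6.
        return sign * (mantissa / 8.0) * 2.0 ** -6
    return sign * (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


print(decode_e4m3fn(0x38))  # 1.0
print(decode_e4m3fn(0x7E))  # 448.0, the largest finite value
print(decode_e4m3fn(0x7F))  # nan
```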
 
- onnx_array_api.validation.f8.fe4m3_to_float32_float(ival: int, fn: bool = True, uz: bool = False) -> float
- Casts a float 8 encoded as an integer into a float.
- Parameters:
- ival – byte 
- fn – no infinite values 
- uz – no negative zero 
 
- Returns:
- float (float 32) 
 
- onnx_array_api.validation.f8.fe5m2_to_float32(ival: int, fn: bool = False, uz: bool = False) -> float
- Casts a float E5M2 encoded as an integer into a float.
- Parameters:
- ival – byte
- fn – no infinite values
- uz – no negative zero
 
- Returns:
- float (float 32) 
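For the default case (fn=False, uz=False), E5M2 has the same sign bit, 5-bit exponent and bias as IEEE float16 and keeps only the top 2 mantissa bits. A byte can therefore be decoded by padding it with a zero low byte and reading the result as float16. This is a sketch of that shortcut, not the library's implementation; decode_e5m2 is a hypothetical name.

```python
import struct


def decode_e5m2(byte):
    """Decode one E5M2 byte by treating it as the high byte of a float16."""
    # Big-endian: the E5M2 byte comes first, followed by 8 zero mantissa bits.
    (value,) = struct.unpack(">e", bytes([byte, 0x00]))
    return value


print(decode_e5m2(0x3C))  # 1.0
print(decode_e5m2(0x7B))  # 57344.0, the largest finite value
print(decode_e5m2(0x7C))  # inf
```

The shortcut does not cover the fn or uz variants, which reassign the infinity, NaN and negative-zero bit patterns.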
 
- onnx_array_api.validation.f8.fe5m2_to_float32_float(ival: int, fn: bool = False, uz: bool = False) -> float
- Casts a float 8 encoded as an integer into a float.
- Parameters:
- ival – byte 
- fn – no infinite values 
- uz – no negative zero 
 
- Returns:
- float (float 32) 
 
- onnx_array_api.validation.f8.float32_to_fe4m3(x, fn: bool = True, uz: bool = False, saturate: bool = True)
- Converts a float32 into a float E4M3.
- Parameters:
- x – numpy.float32 
- fn – no infinite values 
- uz – no negative zero 
- saturate – to convert out of range and infinities to max value if True 
 
- Returns:
- byte 
 
- onnx_array_api.validation.f8.float32_to_fe5m2(x, fn: bool = False, uz: bool = False, saturate: bool = True)
- Converts a float32 into a float E5M2.
- Parameters:
- x – numpy.float32 
- fn – no infinite values 
- uz – no negative zero 
- saturate – to convert out of range and infinities to max value if True 
 
- Returns:
- byte 
 
- onnx_array_api.validation.f8.search_float32_into_fe4m3(value: float, fn: bool = True, uz: bool = False, saturate: bool = True) -> int
- Casts a float 32 into a float E4M3.
- Parameters:
- value – float 
- fn – no infinite values 
- uz – no negative zero 
- saturate – to convert out of range and infinities to max value if True 
 
- Returns:
- byte 
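A search-based cast can be sketched as a brute-force nearest-value lookup over all finite E4M3FN encodings. The code below is an independent illustration under several assumptions: the names decode_e4m3fn and encode_e4m3fn are hypothetical, ties go to the smaller magnitude (real conversions typically round half to even), and anything larger than 448 in magnitude is treated as out of range.

```python
import math


def decode_e4m3fn(byte):
    """Decode a non-negative finite E4M3FN byte (0x00..0x7E)."""
    exponent = (byte >> 3) & 0x0F
    mantissa = byte & 0x07
    if exponent == 0:
        return (mantissa / 8.0) * 2.0 ** -6
    return (1.0 + mantissa / 8.0) * 2.0 ** (exponent - 7)


# All non-negative finite values with their bytes, in increasing order.
TABLE = [(decode_e4m3fn(b), b) for b in range(0x7F)]


def encode_e4m3fn(x, saturate=True):
    """Return the E4M3FN byte whose value is nearest to `x`."""
    if math.isnan(x):
        return 0x7F
    sign_bit = 0x80 if math.copysign(1.0, x) < 0 else 0x00
    if abs(x) > 448.0:
        # Out of range or infinite: clamp to the max value, or give NaN.
        return (0x7E if saturate else 0x7F) | sign_bit
    # Linear scan over 127 candidates; the first minimum wins on ties.
    _, byte = min(TABLE, key=lambda pair: abs(pair[0] - abs(x)))
    return byte | sign_bit


print(hex(encode_e4m3fn(1.0)))      # 0x38
print(hex(encode_e4m3fn(0.3)))      # 0x2a, i.e. 0.3125
print(hex(encode_e4m3fn(-1000.0)))  # 0xfe, saturated to -448
```

A linear scan is slow but easy to check by eye, which makes it a convenient reference against which a bit-manipulation conversion can be compared.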
 
- onnx_array_api.validation.f8.search_float32_into_fe5m2(value: float, fn: bool = False, uz: bool = False, saturate: bool = True) -> int
- Casts a float 32 into a float E5M2.
- Parameters:
- value – float 
- fn – no infinite values 
- uz – no negative zero 
- saturate – to convert out of range and infinities to max value if True 
 
- Returns:
- byte