Faster data unpacking (Different to #1291) #1473
Open
+212
−39
Here is a faster implementation of the data unpacking (src/finn/util/data_packing.py::packed_bytearray_to_finnpy). This implementation differs from the one in #1291.
While very efficient, mdanilow's variant has weaknesses, such as not supporting SIMD>1 and not supporting some data types, such as fixed and floating point.
This PR addresses these problems, as well as the performance problems of the current implementation, by adding a dedicated unpacking routine for each datatype category.
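To illustrate what per-category unpacking can look like, here is a minimal sketch, not the code from this PR. It assumes the raw bit fields have already been extracted as unsigned integers; the function name `decode_fields`, the category strings, and the `frac_bits` parameter are hypothetical and chosen only for this example.

```python
import numpy as np

def decode_fields(raw, bitwidth, category, frac_bits=0):
    """Interpret unsigned bit fields according to a datatype category.

    Hypothetical helper for illustration; categories are "uint", "int"
    (two's complement) and "fixed" (signed fixed point).
    """
    raw = np.asarray(raw, dtype=np.int64)
    if category == "uint":
        return raw
    # Two's complement: values with the top bit set are negative.
    signed = np.where(raw >= (1 << (bitwidth - 1)), raw - (1 << bitwidth), raw)
    if category == "int":
        return signed
    if category == "fixed":
        # Scale by 2**-frac_bits to place the binary point.
        return signed.astype(np.float64) / (1 << frac_bits)
    raise ValueError(f"unsupported category: {category}")

ints = decode_fields([0, 1, 2, 3], bitwidth=2, category="int")
fixed = decode_fields([0b1110], bitwidth=4, category="fixed", frac_bits=2)
```

Keeping each category in its own vectorized branch avoids per-element Python work, which is presumably where most of the speedup in an approach like this comes from.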
Furthermore, I removed the inference of output_shape, as it is ambiguous: for example, a single byte can store 1 to 4 UINT2 values.
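The ambiguity can be shown with a small sketch. It assumes MSB-first packing with zero padding at the front of each packed row, which is an assumption made for this example and not necessarily the layout used in data_packing.py; `unpack_uint` is a hypothetical helper.

```python
import numpy as np

def unpack_uint(packed, bitwidth, num_elems):
    """Unpack `num_elems` unsigned values of `bitwidth` bits from the last axis.

    Assumes MSB-first bit order and leading zero padding (illustrative only).
    """
    bits = np.unpackbits(packed, axis=-1)
    # Drop the leading padding bits; the caller must say how many elements exist.
    bits = bits[..., bits.shape[-1] - num_elems * bitwidth:]
    bits = bits.reshape(*packed.shape[:-1], num_elems, bitwidth)
    weights = 1 << np.arange(bitwidth - 1, -1, -1)
    return (bits * weights).sum(axis=-1)

packed = np.array([[0b00011011]], dtype=np.uint8)
as_four = unpack_uint(packed, bitwidth=2, num_elems=4)  # four UINT2 values
as_two = unpack_uint(packed, bitwidth=2, num_elems=2)   # two UINT2 values
as_one = unpack_uint(packed, bitwidth=2, num_elems=1)   # one UINT2 value
```

The same byte decodes to a different number of elements depending on `num_elems`, so the output shape cannot be recovered from the packed data alone and has to be supplied by the caller.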
I ran a test comparing the current implementation to my variant.
For the first test, I assumed an input of shape (10, 32, 32, 8, 1) with different datatypes, packed into a byte array, and then unpacked it with both variants. Here are the speedups:
For the second test, I assumed an input of shape (10, 8, 1). The speedup decreases for this smaller input but is still substantial: