Custom Quantization #1785
-
Hello, I'm just wondering if it's possible to define a custom data type to do WOQ in this repo? I'm following the MX branch to see how that data type was added, but I wonder if there is a more straightforward approach, since I'm only after WOQ.
Replies: 1 comment
-
Hey @anthony-lemurian, thanks for showing interest in our project! For WOQ, the basic idea is to quantize and then dequantize the weight tensor to mimic the quantization error. The main function for this is quant_tensor, which takes a weight tensor and certain configurations and uses them to select the qdq_weight_actor. Taking 4 bits as an example, qdq_weight_asym applies asymmetric quantization and dequantization to the provided weight. Hope this gives you some insight into defining a new data type :)
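To make the quantize-then-dequantize idea concrete, here is a rough standalone sketch in NumPy. It is not the repo's code: the function name qdq_asym_sketch, its parameters, and the per-group min/max scheme are assumptions chosen for illustration, and the real qdq_weight_asym may differ in signature and details.

```python
import numpy as np


def qdq_asym_sketch(weight, num_bits=4, group_size=32):
    """Fake-quantize `weight` with per-group asymmetric min/max quantization.

    Hypothetical helper for illustration only; not the library's API.
    """
    w = np.asarray(weight, dtype=np.float32)
    orig_shape = w.shape
    if w.size % group_size != 0:
        raise ValueError("weight size must be divisible by group_size")

    # Each row of `groups` shares one scale and one zero point.
    groups = w.reshape(-1, group_size)
    qmax = 2 ** num_bits - 1  # 15 for 4 bits, so integers live in [0, 15]

    wmin = groups.min(axis=1, keepdims=True)
    wmax = groups.max(axis=1, keepdims=True)
    scale = (wmax - wmin) / qmax
    scale = np.where(scale == 0, 1.0, scale)  # avoid division by zero
    zero_point = np.round(-wmin / scale)

    # Quantize: map floats onto the unsigned integer grid.
    q = np.clip(np.round(groups / scale) + zero_point, 0, qmax)
    # Dequantize: map back to floats, keeping the rounding error.
    dq = (q - zero_point) * scale
    return dq.reshape(orig_shape).astype(np.float32)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((8, 64)).astype(np.float32)
    dq = qdq_asym_sketch(w)
    print("max abs error:", float(np.abs(w - dq).max()))
```

Under these assumptions, a custom data type would mostly change the quantize step: instead of rounding onto a uniform integer grid, each scaled value would be snapped to the nearest value the new type can represent, then scaled back the same way.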