Releases: huggingface/kernels
v0.11.3
New features
Use kernel functions to extend layers
Up until now, it was only possible to extend existing layers with kernel layers from the Hub. Starting with this release, it's also possible to extend them with kernel functions from the Hub. For instance, a silu-and-mul layer

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from kernels import use_kernel_forward_from_hub

@use_kernel_forward_from_hub("SiluAndMul")
class SiluAndMul(nn.Module):
    def forward(self, input: torch.Tensor) -> torch.Tensor:
        d = input.shape[-1] // 2
        return F.silu(input[..., :d]) * input[..., d:]
```

can now be extended with a `silu_and_mul` function from the Hub:
```python
from kernels import FuncRepository, kernelize, use_kernel_mapping

with use_kernel_mapping({
    "SiluAndMul": {
        "cuda": FuncRepository(
            repo_id="kernels-community/activation",
            func_name="silu_and_mul",
        ),
    }
}):
    kernelize(...)
```

We have added the `FuncRepository`, `LocalFuncRepository`, and `LockedFuncRepository` classes to load functions from regular, local, and locked repositories.
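For the local and locked variants, a minimal sketch of what usage might look like. The constructor arguments shown here are assumptions modeled on the existing `LocalLayerRepository` and `LockedLayerRepository` classes, not confirmed signatures:

```python
from kernels import LocalFuncRepository, LockedFuncRepository

# Hypothetical: load a function from a local checkout of a kernel
# (arguments assumed to mirror LocalLayerRepository).
local = LocalFuncRepository(
    repo_path="/path/to/activation",
    package_name="activation",
    func_name="silu_and_mul",
)

# Hypothetical: load a function pinned through a lock file
# (arguments assumed to mirror LockedLayerRepository).
locked = LockedFuncRepository(
    repo_id="kernels-community/activation",
    func_name="silu_and_mul",
)
```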
Making functions extensible
The counterpart to the previous enhancement is that functions can now also be made extensible using the new `use_kernel_func_from_hub` decorator:
```python
from kernels import use_kernel_func_from_hub

@use_kernel_func_from_hub("silu_and_mul")
def silu_and_mul(x: torch.Tensor) -> torch.Tensor:
    d = x.shape[-1] // 2
    return F.silu(x[..., :d]) * x[..., d:]
```

This will implicitly replace the function with a Torch `nn.Module`. Since Torch modules implement `__call__`, it can still be called as a function:
```python
out = silu_and_mul(x)
```

However, when the function is stored as part of a model/layer, it will also be kernelized:
```python
class FeedForward(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        # Note: silu_and_mul is a Torch module.
        self.silu_and_mul = silu_and_mul

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.silu_and_mul(self.linear(x))
```

Similar to layers, the function can be kernelized using either a Hub layer or a Hub function.
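Putting the pieces together, a minimal sketch of kernelizing a model that stores the function. The model dimensions and the assumption that the mapping key matches the name passed to the decorator are ours; the repository is the one from the example above:

```python
import torch
from kernels import FuncRepository, kernelize, use_kernel_mapping

model = FeedForward(128, 256).to("cuda")

with use_kernel_mapping({
    "silu_and_mul": {
        "cuda": FuncRepository(
            repo_id="kernels-community/activation",
            func_name="silu_and_mul",
        ),
    }
}):
    model = kernelize(model)

out = model(torch.randn(4, 128, device="cuda"))
```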
What's Changed
- Split up `kernels.layer` into several modules by @danieldk in #187
- Add discord link to the kernel requirements doc by @danieldk in #189
- Support functions as layers by @danieldk in #188
Full Changelog: v0.11.2...v0.11.3
v0.11.2
New feature
This version supports the new noarch build variant that replaces universal kernels. Noarch builds use the build variant format `torch-<backend>`. This solves two issues that the universal variant has:
- A kernel without AoT-compiled code might still be backend-specific. For example, NVIDIA CuTe-based kernels are not universal in the sense that they don't work on non-NVIDIA GPUs.
- We cannot specify dependencies per backend.
This change introduces support for loading noarch kernels. In the future, we will start emitting deprecation warnings for universal kernels (to eventually remove support).
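For illustration, the layout difference might look as follows. The concrete backend name `cuda` is an assumption for the sake of the example; only the `torch-<backend>` format itself comes from this release:

```
# Universal variant (deprecated): claims to work on every backend
build/torch-universal/

# Noarch variant: still Python-only, but declared per backend
build/torch-cuda/
```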
Full Changelog: v0.11.1...v0.11.2
v0.11.1
New features
Kernel Python dependencies
This version adds support for kernel Python dependencies. So far, we mostly considered kernels to be either pure PyTorch + Triton or compiled CUDA/ROCm/XPU with a small Torch wrapper. This assumption made kernels easy to deploy everywhere, since they do not have external dependencies. However, DSLs for writing kernels, such as the CUTLASS DSL, are becoming increasingly popular.
To accommodate such DSLs without bringing back the issues that dependencies have, we allow a small, curated set of dependencies. This kernels release implements the client side, validating dependencies and checking that they are installed. The next version of kernel-builder will add dependencies to builds.
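As a rough illustration of what validating and checking installed dependencies involves, here is a generic sketch using only the standard library and `packaging`. This is not the kernels API, and the requirement list shown is hypothetical; the real metadata format is defined by kernel-builder:

```python
from importlib.metadata import PackageNotFoundError, version
from packaging.requirements import Requirement

# Hypothetical requirement strings a kernel might declare.
declared = ["nvidia-cutlass-dsl>=4.0"]

for spec in declared:
    req = Requirement(spec)
    try:
        installed = version(req.name)
    except PackageNotFoundError:
        raise RuntimeError(f"kernel dependency not installed: {req.name}")
    if not req.specifier.contains(installed, prereleases=True):
        raise RuntimeError(f"{req.name}=={installed} does not satisfy '{spec}'")
```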
Support flattened build directories
Thus far, kernels were stored in `build/<variant>/<module_name>`. This version of kernels also supports kernels that are stored in `build/<variant>`. This solves the issue where a kernel cannot be loaded when `module_name` does not match the repository name (e.g. after a rename). The next version of kernel-builder will build such flattened kernels (with a compatibility layer for older versions of kernels).
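Side by side, the two layouts look roughly like this (the variant and module names are examples, not taken from a real kernel):

```
# Nested layout (old): module name must match the repository name
build/torch28-cxx11-cu128-x86_64-linux/activation/__init__.py

# Flattened layout (new): module contents live directly in the variant directory
build/torch28-cxx11-cu128-x86_64-linux/__init__.py
```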
What's Changed
- Log glibc version and Python version by @danieldk in #178
- misc(upload): specifically define what folder to allow uploading by @mfuntowicz in #180
- Support new flattened kernel builds by @danieldk in #181
- Validate kernel Python dependencies by @danieldk in #182
- Describe new flattened layout and dependencies by @danieldk in #184
- to-wheel: support flat builds by @danieldk in #185
- Set version to 0.11.1.dev0 by @danieldk in #186
Full Changelog: v0.11.0...v0.11.1