[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"project-72144":3},{"id":4,"name":5,"fullName":6,"owner":7,"repo":5,"description":8,"homepage":9,"htmlUrl":10,"language":11,"languages":10,"totalLinesOfCode":10,"stars":12,"forks":13,"watchers":14,"openIssues":15,"contributorsCount":16,"subscribersCount":16,"size":16,"stars1d":16,"stars7d":17,"stars30d":18,"stars90d":16,"forks30d":16,"starsTrendScore":16,"compositeScore":19,"rankGlobal":10,"rankLanguage":10,"license":20,"archived":21,"fork":21,"defaultBranch":22,"hasWiki":23,"hasPages":21,"topics":24,"createdAt":10,"pushedAt":10,"updatedAt":25,"readmeContent":26,"aiSummary":27,"trendingCount":16,"starSnapshotCount":16,"syncStatus":28,"lastSyncTime":29,"discoverSource":30},72144,"efficient-kan","Blealtan\u002Fefficient-kan","Blealtan","An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).","",null,"Python",4645,416,38,34,0,6,9,29.86,"MIT License",false,"master",true,[],"2026-06-12 02:02:59","# An Efficient Implementation of Kolmogorov-Arnold Network\n\nThis repository contains an efficient implementation of Kolmogorov-Arnold Network (KAN).\nThe original implementation of KAN is available [here](https:\u002F\u002Fgithub.com\u002FKindXiaoming\u002Fpykan).\n\nThe original implementation is slow mostly because it needs to expand all intermediate variables to apply the different activation functions.\nFor a layer with `in_features` inputs and `out_features` outputs, the original implementation needs to expand the input to a tensor with shape `(batch_size, out_features, in_features)` to apply the activation functions.\nHowever, all activation functions are linear combinations of a fixed set of basis functions, which are B-splines; given that, we can reformulate the computation as activating the input with the different basis functions and then combining them linearly.\nThis reformulation can significantly reduce the memory cost and make the computation a straightforward matrix 
multiplication, and it works naturally with both the forward and backward passes.\n\nThe problem lies in the **sparsification**, which is claimed to be critical to KAN's interpretability.\nThe authors proposed an L1 regularization defined on the input samples, which requires non-linear operations on the `(batch_size, out_features, in_features)` tensor, and is thus not compatible with the reformulation.\nI instead replace it with an L1 regularization on the weights, which is more common in neural networks and is compatible with the reformulation.\nThe authors' implementation indeed includes this kind of regularization alongside the one described in the paper, so I think it might help.\nMore experiments are needed to verify this, but at least the original approach is infeasible if efficiency is wanted.\n\nAnother difference is that, besides the learnable activation functions (B-splines), the original implementation also includes a learnable scale on each activation function.\nI provided an option `enable_standalone_scale_spline` that defaults to `True` to include this feature; disabling it will make the model more efficient, but may hurt results.\nThis needs more experiments.\n\n2024-05-04 Update: @xiaol hinted that the constant initialization of `base_weight` parameters can be a problem on MNIST.\nFor now I've changed both the `base_weight` and `spline_scaler` matrices to be initialized with `kaiming_uniform_`, following `nn.Linear`'s initialization.\nIt seems to work much better on MNIST (~20% to ~97%), but I'm not sure if it's a good idea in general.\n","This project provides an efficient pure-PyTorch implementation of the Kolmogorov-Arnold Network (KAN). Its core idea is to reformulate the activation functions as linear combinations of basis functions (B-splines), which significantly reduces memory consumption and reduces the computation to a simple matrix multiplication. The project also replaces the original L1 regularization on input samples with an L1 regularization on the weights, further improving model efficiency. This implementation is particularly suited to applications that need efficient use of compute and some degree of model interpretability, such as image recognition tasks like processing the MNIST dataset.",2,"2026-06-11 03:40:35","high_star"]