Package Reference
This section documents the main components of TimeKAN.
Models
- class timekan.models.tKANGRU(input_dim, hidden_dim, sub_kan_configs=None, sub_kan_output_dim=None, sub_kan_input_dim=None, activation=torch.tanh, recurrent_activation=torch.sigmoid, dropout=0.0, recurrent_dropout=0.0, return_sequences=False, bidirectional=False, layer_norm=False, kan_type='fourier')
Bases: Module
A KAN-enhanced GRU model for time series processing.
This class wraps TKANGRUCell to process full sequences, with options for bidirectional processing and sequence output.
- Parameters:
input_dim (int) – Size of the input dimension.
hidden_dim (int) – Size of the hidden state dimension.
sub_kan_configs (dict, optional) – Configuration for KAN sub-layers. Defaults to None.
sub_kan_output_dim (int, optional) – Output dimension of KAN sub-layers. Defaults to None.
sub_kan_input_dim (int, optional) – Input dimension of KAN sub-layers. Defaults to None.
activation (callable, optional) – Activation function for candidate state. Defaults to torch.tanh.
recurrent_activation (callable, optional) – Activation for gates. Defaults to torch.sigmoid.
dropout (float, optional) – Dropout rate for input. Defaults to 0.0.
recurrent_dropout (float, optional) – Dropout rate for recurrent connections. Defaults to 0.0.
return_sequences (bool, optional) – Whether to return the full sequence. Defaults to False.
bidirectional (bool, optional) – Whether to process bidirectionally. Defaults to False.
layer_norm (bool, optional) – Whether to apply layer normalization. Defaults to False.
kan_type (str, optional) – Type of KAN layer (‘spline’, ‘chebyshev’, ‘fourier’). Defaults to ‘fourier’.
Example
>>> import torch
>>> model = tKANGRU(input_dim=1, hidden_dim=16, return_sequences=True, bidirectional=True)
>>> x = torch.randn(32, 10, 1)  # batch_size=32, seq_len=10, input_dim=1
>>> output = model(x)
>>> print(output.shape)  # Expected: torch.Size([32, 10, 32]) due to bidirectional
- forward(x, initial_states=None)
Processes an input sequence through the KAN-enhanced GRU.
- Parameters:
x (torch.Tensor) – Input tensor of shape (batch_size, seq_len, input_dim).
initial_states (list, optional) – Initial states for forward pass. Defaults to None.
- Returns:
Output tensor of shape (batch_size, seq_len, hidden_dim) if return_sequences=True, otherwise (batch_size, hidden_dim); the last dimension doubles when bidirectional=True.
- Return type:
torch.Tensor
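For contrast with the bidirectional example above, a minimal sketch with the defaults (return_sequences=False, unidirectional), where only the final hidden state is returned:
>>> import torch
>>> from timekan.models import tKANGRU
>>> model = tKANGRU(input_dim=1, hidden_dim=16)
>>> x = torch.randn(32, 10, 1)
>>> output = model(x)  # last hidden state only
>>> print(output.shape)  # Expected: torch.Size([32, 16])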
- class timekan.models.tKANLSTM(input_dim, hidden_dim, sub_kan_configs=None, sub_kan_output_dim=None, sub_kan_input_dim=None, activation=torch.tanh, recurrent_activation=torch.sigmoid, dropout=0.0, recurrent_dropout=0.0, return_sequences=False, bidirectional=False, layer_norm=False, kan_type='fourier')
Bases: Module
A KAN-enhanced LSTM model for time series processing.
This class wraps TKANCell to process full sequences, with options for bidirectional processing and sequence output.
- Parameters:
input_dim (int) – Size of the input dimension.
hidden_dim (int) – Size of the hidden state dimension.
sub_kan_configs (dict, optional) – Configuration for KAN sub-layers. Defaults to None.
sub_kan_output_dim (int, optional) – Output dimension of KAN sub-layers. Defaults to None.
sub_kan_input_dim (int, optional) – Input dimension of KAN sub-layers. Defaults to None.
activation (callable, optional) – Activation for cell state. Defaults to torch.tanh.
recurrent_activation (callable, optional) – Activation for gates. Defaults to torch.sigmoid.
dropout (float, optional) – Dropout rate for input. Defaults to 0.0.
recurrent_dropout (float, optional) – Dropout rate for recurrent connections. Defaults to 0.0.
return_sequences (bool, optional) – Whether to return the full sequence. Defaults to False.
bidirectional (bool, optional) – Whether to process bidirectionally. Defaults to False.
layer_norm (bool, optional) – Whether to apply layer normalization. Defaults to False.
kan_type (str, optional) – Type of KAN layer (‘spline’, ‘chebyshev’, ‘fourier’). Defaults to ‘fourier’.
Example
>>> import torch
>>> model = tKANLSTM(input_dim=1, hidden_dim=16, return_sequences=True, bidirectional=True)
>>> x = torch.randn(32, 10, 1)  # batch_size=32, seq_len=10, input_dim=1
>>> output = model(x)
>>> print(output.shape)  # Expected: torch.Size([32, 10, 32]) due to bidirectional
- forward(x, initial_states=None)
Processes an input sequence through the KAN-enhanced LSTM.
- Parameters:
x (torch.Tensor) – Input tensor of shape (batch_size, seq_len, input_dim).
initial_states (list, optional) – Initial states for forward pass. Defaults to None.
- Returns:
Output tensor of shape (batch_size, seq_len, hidden_dim) if return_sequences=True, otherwise (batch_size, hidden_dim); the last dimension doubles when bidirectional=True.
- Return type:
torch.Tensor
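A short end-to-end training sketch pairing tKANLSTM with the mackey_glass utility documented under Utilities; the linear head, learning rate, and epoch count are illustrative choices, not part of TimeKAN:
>>> import torch
>>> import torch.nn as nn
>>> from timekan.models import tKANLSTM
>>> from timekan.utils import mackey_glass
>>> x_train, y_train, x_test, y_test = mackey_glass()
>>> rnn = tKANLSTM(input_dim=1, hidden_dim=16, kan_type='fourier')
>>> head = nn.Linear(16, 1)  # illustrative regression head
>>> params = list(rnn.parameters()) + list(head.parameters())
>>> optimizer = torch.optim.Adam(params, lr=1e-3)
>>> for _ in range(5):
...     optimizer.zero_grad()
...     loss = nn.functional.mse_loss(head(rnn(x_train)).squeeze(-1), y_train)
...     loss.backward()
...     optimizer.step()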
Layers
- class timekan.layers.Chebyshev(inputdim, outdim, degree=3)
Bases: Module
A neural network layer that applies a Chebyshev polynomial transformation to the input.
This layer approximates functions with a Chebyshev series expansion: the input is first normalized to the range [-1, 1] using the hyperbolic tangent (tanh), then expanded in Chebyshev polynomials of the first kind.
- Parameters:
inputdim (int) – The number of input features.
outdim (int) – The number of output features.
degree (int, optional) – The degree of the Chebyshev expansion. Defaults to 3.
- forward(x)
Forward pass of the Chebyshev Kolmogorov-Arnold Network (KAN) layer.
The input is first normalized to the range [-1, 1] using tanh; the layer then evaluates Chebyshev polynomials of the first kind through the identity T_n(x) = cos(n · arccos(x)).
- Parameters:
x (torch.Tensor) – Input tensor of shape (batch_size, inputdim).
- Returns:
Output tensor of shape (batch_size, outdim), obtained via Chebyshev polynomial interpolation.
- Return type:
torch.Tensor
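A minimal usage sketch; the dimensions are illustrative:
>>> import torch
>>> from timekan.layers import Chebyshev
>>> layer = Chebyshev(inputdim=8, outdim=4, degree=3)
>>> x = torch.randn(32, 8)  # arbitrary real inputs; tanh squashes them into [-1, 1]
>>> print(layer(x).shape)  # Expected: torch.Size([32, 4])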
- class timekan.layers.Fourier(inputdim, outdim, gridsize=300, addbias=True)
Bases: Module
A neural network layer that approximates functions using a Fourier series expansion.
This layer transforms input features into a high-dimensional Fourier space using sine and cosine functions and learns Fourier coefficients to approximate functions.
- Parameters:
inputdim (int) – Number of input features.
outdim (int) – Number of output features.
gridsize (int, optional) – Number of Fourier basis frequencies. Defaults to 300.
addbias (bool, optional) – Whether to include a bias term. Defaults to True.
- forward(x)
Forward pass of the naive Fourier Kolmogorov-Arnold Network (KAN) layer.
The input is expanded in a truncated Fourier series: sine and cosine features are computed for each of the gridsize frequencies, and the learned Fourier coefficients combine them into the output.
- Parameters:
x (torch.Tensor) – Input tensor of shape (…, inputdim), where … represents arbitrary batch dimensions.
- Returns:
Output tensor of shape (…, outdim), representing the transformed Fourier features.
- Return type:
torch.Tensor
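A minimal sketch exercising the arbitrary batch dimensions noted above; the dimensions are illustrative:
>>> import torch
>>> from timekan.layers import Fourier
>>> layer = Fourier(inputdim=8, outdim=4, gridsize=64)
>>> x = torch.randn(32, 10, 8)  # leading dimensions are arbitrary batch dims
>>> print(layer(x).shape)  # Expected: torch.Size([32, 10, 4])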
- class timekan.layers.ReLU(inputdim: int, outdim: int, train_ab: bool = True, g=5, k=3)
Bases: Module
A ReLU-based Kolmogorov-Arnold Network (KAN) layer for nonlinear transformations.
This layer uses trainable ReLU activations and a convolutional operation to transform input features into a higher-dimensional output, suitable for enhancing recurrent models.
- Parameters:
inputdim (int) – Number of input features.
outdim (int) – Number of output features.
train_ab (bool, optional) – If True, ReLU thresholds are trainable. Defaults to True.
g (int, optional) – Grid size parameter controlling the number of basis points. Defaults to 5.
k (int, optional) – Parameter controlling the range of ReLU thresholds. Defaults to 3.
- forward(x)
Transforms input through ReLU-based KAN operations and convolution.
- Parameters:
x (torch.Tensor) – Input tensor, either [batch_size, inputdim] or [batch_size, seq_len, inputdim].
- Returns:
Output tensor, either [batch_size, outdim] or [batch_size, seq_len, outdim].
- Return type:
torch.Tensor
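A minimal sketch for both accepted input ranks; note that the class name shadows torch.nn.ReLU, so import it explicitly from timekan.layers:
>>> import torch
>>> from timekan.layers import ReLU
>>> layer = ReLU(inputdim=8, outdim=4, g=5, k=3)
>>> print(layer(torch.randn(32, 8)).shape)  # Expected: torch.Size([32, 4])
>>> print(layer(torch.randn(32, 10, 8)).shape)  # Expected: torch.Size([32, 10, 4])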
- class timekan.layers.Spline(in_features, out_features, grid_size=50, spline_order=5, scale_noise=0.1, scale_base=1.0, scale_spline=1.0, enable_standalone_scale_spline=True, base_activation=torch.nn.SiLU, grid_eps=0.02, grid_range=[-1, 1])
Bases: Module
A B-spline-based Kolmogorov-Arnold Network (KAN) layer that combines a base activation branch (SiLU by default) with a learned B-spline expansion over a configurable grid.
- b_splines(x: Tensor)
Compute the B-spline bases for the given input tensor.
- Parameters:
x (torch.Tensor) – Input tensor of shape (batch_size, in_features).
- Returns:
B-spline bases tensor of shape (batch_size, in_features, grid_size + spline_order).
- Return type:
torch.Tensor
- curve2coeff(x: Tensor, y: Tensor)
Compute the coefficients of the curve that interpolates the given points.
- Parameters:
x (torch.Tensor) – Input tensor of shape (batch_size, in_features).
y (torch.Tensor) – Output tensor of shape (batch_size, in_features, out_features).
- Returns:
Coefficients tensor of shape (out_features, in_features, grid_size + spline_order).
- Return type:
torch.Tensor
- forward(x: Tensor)
Define the computation performed at every call. Should be overridden by all subclasses.
Note: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
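A minimal shape-level sketch of calling the layer; dimensions are illustrative, and the internal combination of the base activation and spline branches is not spelled out here:
>>> import torch
>>> from timekan.layers import Spline
>>> layer = Spline(in_features=8, out_features=4)
>>> x = torch.randn(32, 8)
>>> print(layer(x).shape)  # Expected: torch.Size([32, 4])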
- regularization_loss(regularize_activation=1.0, regularize_entropy=1.0)
Compute the regularization loss.
This is a simplified stand-in for the original L1 regularization described in the paper: the exact formulation requires absolute values and entropy computed from the expanded (batch, in_features, out_features) intermediate tensor, which is hidden behind the F.linear call in a memory-efficient implementation.
The L1 term is therefore computed as the mean absolute value of the spline weights. The authors' implementation also includes this term in addition to the sample-based regularization.
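A sketch of folding the regularizer into a training objective; the 1e-4 weight and the squared-output stand-in loss are illustrative choices:
>>> import torch
>>> from timekan.layers import Spline
>>> layer = Spline(in_features=8, out_features=4)
>>> pred = layer(torch.randn(32, 8))
>>> loss = pred.pow(2).mean() + 1e-4 * layer.regularization_loss()
>>> loss.backward()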
- property scaled_spline_weight
Utilities
- timekan.utils.lorenz(length=1200, sigma=10.0, rho=28.0, beta=8/3, x0=1.0, y0=0.0, z0=0.0, dt=0.01, window_size=20, train_size=1000)
Generate training and testing datasets for the Lorenz system (x-coordinate).
- Parameters:
length (int) – Total number of points in the time series (default: 1200)
sigma (float) – Lorenz parameter sigma (default: 10.0)
rho (float) – Lorenz parameter rho (default: 28.0)
beta (float) – Lorenz parameter beta (default: 8/3)
x0 (float) – Initial x-coordinate (default: 1.0)
y0 (float) – Initial y-coordinate (default: 0.0)
z0 (float) – Initial z-coordinate (default: 0.0)
dt (float) – Time step for Euler integration (default: 0.01)
window_size (int) – Number of past points to predict the next point (default: 20)
train_size (int) – Number of points for training (default: 1000)
- Returns:
- (x_train, y_train, x_test, y_test); shapes below assume the default arguments
x_train (torch.Tensor): Training inputs, shape [980, 20, 1]
y_train (torch.Tensor): Training targets, shape [980]
x_test (torch.Tensor): Testing inputs, shape [180, 20, 1]
y_test (torch.Tensor): Testing targets, shape [180]
- Return type:
tuple
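The documented shapes for the default arguments can be checked directly; the same pattern applies to mackey_glass and rossler below:
>>> from timekan.utils import lorenz
>>> x_train, y_train, x_test, y_test = lorenz()
>>> print(x_train.shape, y_train.shape)  # torch.Size([980, 20, 1]) torch.Size([980])
>>> print(x_test.shape, y_test.shape)  # torch.Size([180, 20, 1]) torch.Size([180])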
- timekan.utils.mackey_glass(length=1200, tau=17, a=0.2, b=0.1, n=10, x0=1.2, window_size=20, train_size=1000)
Generate training and testing datasets for the Mackey-Glass time series.
- Parameters:
length (int) – Total number of points in the time series (default: 1200)
tau (int) – Delay parameter (default: 17)
a (float) – Equation parameter ‘a’ (default: 0.2)
b (float) – Equation parameter ‘b’ (default: 0.1)
n (int) – Equation parameter ‘n’ (default: 10)
x0 (float) – Initial condition (default: 1.2)
window_size (int) – Number of past points to predict the next point (default: 20)
train_size (int) – Number of points for training (default: 1000)
- Returns:
- (x_train, y_train, x_test, y_test); shapes below assume the default arguments
x_train (torch.Tensor): Training inputs, shape [980, 20, 1]
y_train (torch.Tensor): Training targets, shape [980]
x_test (torch.Tensor): Testing inputs, shape [180, 20, 1]
y_test (torch.Tensor): Testing targets, shape [180]
- Return type:
tuple
- timekan.utils.rossler(length=1200, a=0.2, b=0.2, c=5.7, x0=0.1, y0=0.1, z0=0.1, dt=0.01, window_size=20, train_size=1000)
Generate training and testing datasets for the Rössler system (x-coordinate).
- Parameters:
length (int) – Total number of points in the time series (default: 1200)
a (float) – Rössler parameter ‘a’ (default: 0.2)
b (float) – Rössler parameter ‘b’ (default: 0.2)
c (float) – Rössler parameter ‘c’ (default: 5.7)
x0 (float) – Initial x-coordinate (default: 0.1)
y0 (float) – Initial y-coordinate (default: 0.1)
z0 (float) – Initial z-coordinate (default: 0.1)
dt (float) – Time step for Euler integration (default: 0.01)
window_size (int) – Number of past points to predict the next point (default: 20)
train_size (int) – Number of points for training (default: 1000)
- Returns:
- (x_train, y_train, x_test, y_test); shapes below assume the default arguments
x_train (torch.Tensor): Training inputs, shape [980, 20, 1]
y_train (torch.Tensor): Training targets, shape [980]
x_test (torch.Tensor): Testing inputs, shape [180, 20, 1]
y_test (torch.Tensor): Testing targets, shape [180]
- Return type:
tuple