ae

Autoencoder(module, loss_fn='mse', optimizer_fn='sgd', lr=0.001, is_feature_incremental=False, device='cpu', seed=42, **kwargs)

Bases: DeepEstimator, AnomalyDetector

Wrapper for PyTorch autoencoder models that uses the network's reconstruction error to score how anomalous a given example is.

PARAMETER DESCRIPTION
module

Torch module class that builds the autoencoder to be wrapped. The class must accept an n_features parameter so that the network's input shape can be set from the number of features in the first training example.

TYPE: Type[Module]

loss_fn

Loss function to be used for training the wrapped model. Can be a loss function provided by torch.nn.functional or one of the following: 'mse', 'l1', 'cross_entropy', 'binary_crossentropy', 'smooth_l1', 'kl_div'.

TYPE: Union[str, Callable] DEFAULT: 'mse'

optimizer_fn

Optimizer to be used for training the wrapped model. Can be an optimizer class provided by torch.optim or one of the following: "adam", "adam_w", "sgd", "rmsprop", "lbfgs".

TYPE: Union[str, Callable] DEFAULT: 'sgd'

lr

Learning rate of the optimizer.

TYPE: float DEFAULT: 0.001

is_feature_incremental

Whether the model should dynamically grow its input layer when previously unseen features appear.

TYPE: bool DEFAULT: False

device

Device to run the wrapped model on. Can be "cpu" or "cuda".

TYPE: str DEFAULT: 'cpu'

seed

Random seed to be used for training the wrapped model.

TYPE: int DEFAULT: 42

**kwargs

Parameters to be passed to the module class aside from n_features.

DEFAULT: {}
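
For instance, with the MyAutoEncoder module defined in the example below, its latent_dim constructor argument can be forwarded this way (a sketch; any extra constructor argument works the same):

>>> ae = Autoencoder(module=MyAutoEncoder, latent_dim=5, lr=0.005)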

Examples:

>>> from deep_river.anomaly import Autoencoder
>>> from river import metrics
>>> from river.datasets import CreditCard
>>> from river.compose import Pipeline
>>> from river.preprocessing import MinMaxScaler
>>> from torch import nn
>>> dataset = CreditCard().take(5000)
>>> metric = metrics.RollingROCAUC(window_size=5000)
>>> class MyAutoEncoder(nn.Module):
...     def __init__(self, n_features, latent_dim=3):
...         super().__init__()
...         self.linear1 = nn.Linear(n_features, latent_dim)
...         self.nonlin = nn.LeakyReLU()
...         self.linear2 = nn.Linear(latent_dim, n_features)
...         self.sigmoid = nn.Sigmoid()
...
...     def forward(self, X, **kwargs):
...         X = self.linear1(X)
...         X = self.nonlin(X)
...         X = self.linear2(X)
...         return self.sigmoid(X)
>>> ae = Autoencoder(module=MyAutoEncoder, lr=0.005)
>>> scaler = MinMaxScaler()
>>> model = Pipeline(scaler, ae)
>>> for x, y in dataset:
...     score = model.score_one(x)
...     model.learn_one(x=x)
...     metric.update(y, score)
...
>>> print(f"Rolling ROCAUC: {metric.get():.4f}")
Rolling ROCAUC: 0.8901

learn_many(X)

Performs one step of training with a batch of examples.

PARAMETER DESCRIPTION
X

Input batch of examples.

TYPE: DataFrame
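
A minimal sketch of batch training, reusing ae and CreditCard from the example above (X_batch is an illustrative name):

>>> import pandas as pd
>>> X_batch = pd.DataFrame([x for x, _ in CreditCard().take(32)])
>>> ae.learn_many(X_batch)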

learn_one(x, y=None, **kwargs)

Performs one step of training with a single example.

PARAMETER DESCRIPTION
x

Input example.

TYPE: dict

y

Ignored; accepted for API consistency, since the autoencoder is trained unsupervised.

DEFAULT: None

**kwargs

DEFAULT: {}
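
A single observation is passed as a feature dict; a sketch reusing the estimator from above:

>>> x, _ = next(iter(CreditCard().take(1)))
>>> ae.learn_one(x)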

score_many(X)

Returns anomaly scores for the provided batch of examples, based on the autoencoder's reconstruction error.

PARAMETER DESCRIPTION
X

Input batch of examples.

TYPE: DataFrame

RETURNS DESCRIPTION
float

Anomaly scores for the given batch of examples. Larger values indicate more anomalous examples.
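
A sketch of batch scoring, assuming the X_batch DataFrame built above:

>>> scores = ae.score_many(X_batch)  # one score per row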

score_one(x)

Returns an anomaly score for the provided example in the form of the autoencoder's reconstruction error.

PARAMETER DESCRIPTION
x

Input example.

TYPE: dict

RETURNS DESCRIPTION
float

Anomaly score for the given example. Larger values indicate more anomalous examples.
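
Scoring a single example directly (a sketch, reusing ae from above; in practice you would scale features first, as in the pipeline example):

>>> x, _ = next(iter(CreditCard().take(1)))
>>> score = ae.score_one(x)  # larger = more anomalous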

AutoencoderInitialized(module, loss_fn='mse', optimizer_fn='sgd', lr=0.001, is_feature_incremental=False, device='cpu', seed=42, **kwargs)

Bases: DeepEstimatorInitialized, AnomalyDetector

Represents an initialized autoencoder for anomaly detection and feature learning.

This class builds on the DeepEstimatorInitialized and AnomalyDetector base classes. In contrast to Autoencoder, it wraps an already-instantiated PyTorch module rather than a module class. The autoencoder is trained unsupervised on the input data, and anomaly scores are computed from the reconstruction error; learning works on individual examples as well as entire batches.

ATTRIBUTE DESCRIPTION
is_feature_incremental

Indicates whether the model is designed to increment features dynamically.

TYPE: bool

module

The PyTorch model representing the autoencoder architecture.

TYPE: Module

loss_fn

Specifies the loss function to compute the reconstruction error.

TYPE: Union[str, Callable]

optimizer_fn

Specifies the optimizer to be used for training the autoencoder.

TYPE: Union[str, Callable]

lr

The learning rate for optimization.

TYPE: float

device

The device on which the model is loaded and trained (e.g., "cpu", "cuda").

TYPE: str

seed

Random seed for ensuring reproducibility.

TYPE: int
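
A minimal usage sketch, assuming AutoencoderInitialized is imported from deep_river.anomaly like Autoencoder, the MyAutoEncoder module from the example above, and CreditCard's 30 input features:

>>> from deep_river.anomaly import AutoencoderInitialized
>>> torch_module = MyAutoEncoder(n_features=30)  # instantiated up front
>>> ae_init = AutoencoderInitialized(module=torch_module, lr=0.005)
>>> x, _ = next(iter(CreditCard().take(1)))
>>> ae_init.learn_one(x)
>>> score = ae_init.score_one(x)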

learn_many(X)

Performs one step of training with a batch of examples.

PARAMETER DESCRIPTION
X

Input batch of examples.

TYPE: DataFrame

learn_one(x, y=None, **kwargs)

Performs one step of training with a single example.

PARAMETER DESCRIPTION
x

Input example.

TYPE: dict

y

Ignored; accepted for API consistency, since the autoencoder is trained unsupervised.

DEFAULT: None

**kwargs

DEFAULT: {}

score_many(X)

Returns anomaly scores for the provided batch of examples, based on the autoencoder's reconstruction error.

PARAMETER DESCRIPTION
X

Input batch of examples.

TYPE: DataFrame

RETURNS DESCRIPTION
float

Anomaly scores for the given batch of examples. Larger values indicate more anomalous examples.

score_one(x)

Returns an anomaly score for the provided example in the form of the autoencoder's reconstruction error.

PARAMETER DESCRIPTION
x

Input example.

TYPE: dict

RETURNS DESCRIPTION
float

Anomaly score for the given example. Larger values indicate more anomalous examples.