ae

Autoencoder(module, loss_fn='mse', optimizer_fn='sgd', lr=0.001, is_feature_incremental=False, device='cpu', seed=42, **kwargs)

Bases: DeepEstimator, AnomalyDetector

Wrapper for PyTorch autoencoder models that uses the network's reconstruction error to score the anomalousness of a given example.

PARAMETER DESCRIPTION
module

Torch Module that builds the autoencoder to be wrapped. The Module should accept a parameter n_features so that the returned model's input shape can be determined based on the number of features in the initial training example.

TYPE: Type[Module]

loss_fn

Loss function to be used for training the wrapped model. Can be a loss function provided by torch.nn.functional or one of the following: 'mse', 'l1', 'cross_entropy', 'binary_cross_entropy', 'smooth_l1', 'kl_div'.

TYPE: Union[str, Callable] DEFAULT: 'mse'

optimizer_fn

Optimizer to be used for training the wrapped model. Can be an optimizer class provided by torch.optim or one of the following: "adam", "adam_w", "sgd", "rmsprop", "lbfgs".

TYPE: Union[str, Callable] DEFAULT: 'sgd'

lr

Learning rate of the optimizer.

TYPE: float DEFAULT: 0.001

is_feature_incremental

Whether the model should adapt to the appearance of previously unseen features.

TYPE: bool DEFAULT: False

device

Device to run the wrapped model on. Can be "cpu" or "cuda".

TYPE: str DEFAULT: 'cpu'

seed

Random seed to be used for training the wrapped model.

TYPE: int DEFAULT: 42

**kwargs

Parameters to be passed to the wrapped torch.nn.Module class aside from n_features.

DEFAULT: {}

Examples:

>>> from deep_river.anomaly import Autoencoder
>>> from river import metrics
>>> from river.datasets import CreditCard
>>> from torch import nn
>>> import torch
>>> from river.compose import Pipeline
>>> from river.preprocessing import MinMaxScaler
>>> dataset = CreditCard().take(5000)
>>> metric = metrics.ROCAUC(n_thresholds=50)
>>> class MyAutoEncoder(torch.nn.Module):
...     def __init__(self, n_features, latent_dim=3):
...         super(MyAutoEncoder, self).__init__()
...         self.linear1 = nn.Linear(n_features, latent_dim)
...         self.nonlin = torch.nn.LeakyReLU()
...         self.linear2 = nn.Linear(latent_dim, n_features)
...         self.sigmoid = nn.Sigmoid()
...
...     def forward(self, X, **kwargs):
...         X = self.linear1(X)
...         X = self.nonlin(X)
...         X = self.linear2(X)
...         return self.sigmoid(X)
>>> ae = Autoencoder(module=MyAutoEncoder, lr=0.005)
>>> scaler = MinMaxScaler()
>>> model = Pipeline(scaler, ae)
>>> for x, y in dataset:
...    score = model.score_one(x)
...    model.learn_one(x=x)
...    metric.update(y, score)
...
>>> print(f"ROCAUC: {metric.get():.4f}")
ROCAUC: 0.7812
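
Keyword arguments other than the documented parameters are forwarded to the module constructor. A minimal sketch, reusing the MyAutoEncoder class defined above (latent_dim is a parameter of that example module, not of the wrapper itself):

>>> ae_small = Autoencoder(module=MyAutoEncoder, latent_dim=2, lr=0.005)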

learn_many(X)

Performs one step of training with a batch of examples.

PARAMETER DESCRIPTION
X

Input batch of examples.

TYPE: DataFrame

RETURNS DESCRIPTION
Autoencoder

The model itself.
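
A minimal sketch of a batch update, assuming the MyAutoEncoder class and imports from the example above; each DataFrame row is one example and the columns are the feature names:

>>> import pandas as pd
>>> ae_batch = Autoencoder(module=MyAutoEncoder, lr=0.005)
>>> X = pd.DataFrame([x for x, _ in CreditCard().take(64)])
>>> ae_batch = ae_batch.learn_many(X)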

learn_one(x, y=None, **kwargs)

Performs one step of training with a single example.

PARAMETER DESCRIPTION
x

Input example.

TYPE: dict

y

Ignored; accepted for compatibility with supervised river pipelines.

DEFAULT: None

**kwargs

DEFAULT: {}

RETURNS DESCRIPTION
Autoencoder

The model itself.
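
A minimal sketch of a direct single-example update, reusing the trained ae from the example above; y is accepted for pipeline compatibility but the update itself is unsupervised:

>>> x, _ = next(iter(CreditCard().take(1)))
>>> ae = ae.learn_one(x)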

score_many(X)

Returns anomaly scores for the provided batch of examples, in the form of the autoencoder's reconstruction errors.

PARAMETER DESCRIPTION
X

Input batch of examples.

TYPE: DataFrame

RETURNS DESCRIPTION
ndarray

Anomaly scores for the given batch of examples. Larger values indicate more anomalous examples.
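
A minimal sketch of batch scoring, reusing ae_batch and X from the learn_many sketch above and assuming one score per row:

>>> scores = ae_batch.score_many(X)
>>> len(scores) == len(X)
True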

score_one(x)

Returns an anomaly score for the provided example in the form of the autoencoder's reconstruction error.

PARAMETER DESCRIPTION
x

Input example.

TYPE: dict

RETURNS DESCRIPTION
float

Anomaly score for the given example. Larger values indicate more anomalous examples.
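
A minimal sketch of scoring a single example directly, reusing the trained ae and the example x from the learn_one sketch above:

>>> score = ae.score_one(x)  # reconstruction error; larger means more anomalous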