Binary classification

Bananas

Summary

| Model | Accuracy | F1 | Memory (MB) | Time (s) |
|---|---|---|---|---|
| Deep River LSTM | 0.606604 | 0.36219 | 0.0321245 | 1183.4 |
| Deep River Logistic | 0.527925 | 0.222981 | 0.019639 | 151.192 |
| Deep River MLP | 0.548113 | 0.101313 | 0.0418444 | 278.597 |
| Deep River RNN | 0.541321 | 0.271938 | 0.0323954 | 444.79 |
| Logistic regression | 0.543208 | 0.197015 | 0.00424099 | 16.6572 |
| [baseline] Prior class | 0.551236 | 0.00335289 | 0.000611305 | 7.11342 |
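
The Accuracy and F1 columns can diverge sharply, as the baseline row shows. The following is a minimal sketch of the standard definitions of both metrics computed from binary confusion counts. The function name `accuracy_and_f1` and the toy labels are invented for illustration; this is not the code used to produce the benchmark.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for binary labels, with True as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0

    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Toy labels (made up): 6 negatives followed by 4 positives.
y_true = [False] * 6 + [True] * 4
# A classifier that predicts the positive class only once.
y_pred = [False] * 6 + [True] + [False] * 3

acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f}, f1={f1:.2f}")
```

A model that mostly predicts the majority (negative) class keeps a reasonable accuracy but misses most positives, so its F1 drops well below its accuracy.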

Elec2

Summary

| Model | Accuracy | F1 | Memory (MB) | Time (s) |
|---|---|---|---|---|
| Deep River LSTM | 0.835751 | 0.79717 | 0.0359993 | 3685.53 |
| Deep River Logistic | 0.837372 | 0.799445 | 0.0213318 | 1308.76 |
| Deep River MLP | 0.582715 | 0.291357 | 0.0435371 | 2416.23 |
| Deep River RNN | 0.833267 | 0.799565 | 0.0357666 | 3638.42 |
| Logistic regression | 0.822144 | 0.777086 | 0.005373 | 201.865 |
| [baseline] Prior class | 0.575335 | 0.00248834 | 0.000611305 | 86.623 |

Phishing

Summary

| Model | Accuracy | F1 | Memory (MB) | Time (s) |
|---|---|---|---|---|
| Deep River LSTM | 0.8696 | 0.843118 | 0.0370378 | 383.293 |
| Deep River Logistic | 0.8488 | 0.832 | 0.0215311 | 45.0264 |
| Deep River MLP | 0.5528 | 0.231087 | 0.0437365 | 71.7688 |
| Deep River RNN | 0.8784 | 0.86055 | 0.0373087 | 110.05 |
| Logistic regression | 0.8872 | 0.871233 | 0.00556469 | 6.14968 |
| [baseline] Prior class | 0.554844 | 0.0794702 | 0.000611305 | 2.30399 |

Datasets

Bananas

Bananas dataset.

An artificial dataset where instances belong to several clusters with a banana shape. There are two attributes that correspond to the x and y axes, respectively.

Name    : Bananas
Task    : Binary classification
Samples : 5,300
Features: 2
Sparse  : False
Path    : /Users/cedrickulbach/Documents/Projects/deep-river/.venv/lib/python3.10/site-packages/river/datasets/banana.zip

Elec2

Electricity prices in New South Wales.

This is a binary classification task, where the goal is to predict if the price of electricity will go up or down.

This data was collected from the Australian New South Wales Electricity Market. In this market, prices are not fixed and are affected by demand and supply of the market. They are set every five minutes. Electricity transfers to/from the neighboring state of Victoria were done to alleviate fluctuations.

Name      : Elec2
Task      : Binary classification
Samples   : 45,312
Features  : 8
Sparse    : False
Path      : /Users/cedrickulbach/river_data/Elec2/electricity.csv
URL       : https://maxhalford.github.io/files/datasets/electricity.zip
Size      : 2.95 MiB
Downloaded: True

Phishing

Phishing websites.

This dataset contains features from web pages that are classified as phishing or not.

Name    : Phishing
Task    : Binary classification
Samples : 1,250
Features: 9
Sparse  : False
Path    : /Users/cedrickulbach/Documents/Projects/deep-river/.venv/lib/python3.10/site-packages/river/datasets/phishing.csv.gz

Models

Logistic regression

Pipeline (
  StandardScaler (
    with_std=True
  ),
  LogisticRegression (
    optimizer=SGD (
      lr=Constant (
        learning_rate=0.005
      )
    )
    loss=Log (
      weight_pos=1.
      weight_neg=1.
    )
    l2=0.
    l1=0.
    intercept_init=0.
    intercept_lr=Constant (
      learning_rate=0.01
    )
    clip_gradient=1e+12
    initializer=Zeros ()
  )
)
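
To make the configuration above concrete, here is a plain-Python sketch of the same idea: features are standardized with running statistics, then passed to a logistic regression trained by SGD with zero-initialized weights. It is not river's implementation. The class names `RunningScaler` and `OnlineLogisticRegression`, the toy stream, and the test-then-train loop are assumptions made for illustration; only the learning rates (0.005 for weights, 0.01 for the intercept) mirror the values listed above.

```python
import math
import random
from collections import defaultdict


class RunningScaler:
    """Standardize features using running mean and variance (Welford's update)."""

    def __init__(self):
        self.n = defaultdict(int)
        self.mean = defaultdict(float)
        self.m2 = defaultdict(float)  # running sum of squared deviations

    def learn_one(self, x):
        for k, v in x.items():
            self.n[k] += 1
            delta = v - self.mean[k]
            self.mean[k] += delta / self.n[k]
            self.m2[k] += delta * (v - self.mean[k])

    def transform_one(self, x):
        out = {}
        for k, v in x.items():
            var = self.m2[k] / self.n[k] if self.n[k] else 0.0
            out[k] = (v - self.mean[k]) / math.sqrt(var) if var > 0 else 0.0
        return out


class OnlineLogisticRegression:
    """Logistic regression updated one sample at a time with plain SGD."""

    def __init__(self, lr=0.005, intercept_lr=0.01):
        self.lr = lr
        self.intercept_lr = intercept_lr
        self.weights = defaultdict(float)  # weights start at zero
        self.intercept = 0.0

    def predict_proba_one(self, x):
        z = self.intercept + sum(self.weights[k] * v for k, v in x.items())
        z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        # Gradient of the log loss with respect to the linear output.
        error = self.predict_proba_one(x) - float(y)
        for k, v in x.items():
            self.weights[k] -= self.lr * error * v
        self.intercept -= self.intercept_lr * error


# Toy stream (invented): the label depends only on feature "a"; "b" is noise.
rng = random.Random(42)
scaler, model = RunningScaler(), OnlineLogisticRegression()
n_samples, correct = 2000, 0
for _ in range(n_samples):
    x = {"a": rng.gauss(3.0, 2.0), "b": rng.gauss(-1.0, 5.0)}
    y = x["a"] > 3.0
    scaler.learn_one(x)  # uses features only, never the label
    x_scaled = scaler.transform_one(x)
    # Predict before learning, so each sample is scored while still unseen.
    correct += (model.predict_proba_one(x_scaled) > 0.5) == y
    model.learn_one(x_scaled, y)

accuracy = correct / n_samples
print(f"test-then-train accuracy: {accuracy:.3f}")
```

Because the raw features are not centered at zero, the scaling step matters here: without it, the zero-initialized model would need many more updates to move its decision boundary to the right place.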

Deep River Logistic

Pipeline (
  StandardScaler (
    with_std=True
  ),
  LogisticRegressionInitialized (
    n_features=10
    n_init_classes=2
    loss_fn="cross_entropy"
    optimizer_fn="sgd"
    lr=0.005
    output_is_logit=True
    is_feature_incremental=True
    is_class_incremental=True
    device="cpu"
    seed=42
    gradient_clip_value=None
  )
)

Deep River MLP

Pipeline (
  StandardScaler (
    with_std=True
  ),
  MultiLayerPerceptronInitialized (
    n_features=10
    n_width=5
    n_layers=5
    n_init_classes=2
    loss_fn="cross_entropy"
    optimizer_fn="sgd"
    lr=0.005
    output_is_logit=True
    is_feature_incremental=True
    is_class_incremental=True
    device="cpu"
    seed=42
    gradient_clip_value=None
  )
)

Deep River LSTM

Pipeline (
  StandardScaler (
    with_std=True
  ),
  LSTMClassifier (
    n_features=10
    hidden_size=32
    n_init_classes=2
    loss_fn="cross_entropy"
    optimizer_fn="adam"
    lr=0.001
    output_is_logit=True
    is_feature_incremental=True
    is_class_incremental=True
    device="cpu"
    seed=42
    gradient_clip_value=None
  )
)

Deep River RNN

Pipeline (
  StandardScaler (
    with_std=True
  ),
  RNNClassifier (
    n_features=10
    hidden_size=32
    num_layers=1
    nonlinearity="tanh"
    n_init_classes=2
    loss_fn="cross_entropy"
    optimizer_fn="adam"
    lr=0.001
    output_is_logit=True
    is_feature_incremental=True
    is_class_incremental=True
    device="cpu"
    seed=42
    gradient_clip_value=None
  )
)

[baseline] Prior class

PriorClassifier ()
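
The baseline ignores the features and predicts the label it has seen most often so far. The sketch below is a plain-Python illustration of that behaviour, not river's `PriorClassifier` code. The class name `PriorClass`, the toy labels, and the test-then-train loop are assumptions made for illustration.

```python
from collections import Counter


class PriorClass:
    """Predict the most frequent label seen so far (None before any data)."""

    def __init__(self):
        self.counts = Counter()

    def predict_one(self, x):
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn_one(self, x, y):
        self.counts[y] += 1


# Toy label stream (made up): the negative class is the majority.
labels = [False, False, True, False, True, False, False, True, False, False]

model = PriorClass()
predictions = []
for y in labels:
    predictions.append(model.predict_one({}))  # features are ignored
    model.learn_one({}, y)

n_correct = sum(p == y for p, y in zip(predictions[1:], labels[1:]))
print(predictions)
print(f"{n_correct} of {len(labels) - 1} predictions correct after the first sample")
```

Once the negative class takes the lead, this baseline never predicts the positive class again, which is consistent with the pattern in the summary tables: accuracy close to the majority-class share and an F1 score near zero.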

Environment

Python implementation: CPython
Python version       : 3.12.12
IPython version      : 9.6.0

river       : 0.22.0
numpy       : 1.26.4
scikit-learn: 1.5.2
pandas      : 2.2.3
scipy       : 1.16.2

Compiler    : Clang 21.1.4 
OS          : Linux
Release     : 6.11.0-1018-azure
Machine     : x86_64
Processor   : x86_64
CPU cores   : 4
Architecture: 64bit