Binary classification
Bananas
Summary
| Model | Accuracy | F1 | Memory in MB | Time in s |
|---|---|---|---|---|
| Deep River LSTM | 0.606604 | 0.36219 | 0.0321245 | 1183.4 |
| Deep River Logistic | 0.527925 | 0.222981 | 0.019639 | 151.192 |
| Deep River MLP | 0.548113 | 0.101313 | 0.0418444 | 278.597 |
| Deep River RNN | 0.541321 | 0.271938 | 0.0323954 | 444.79 |
| Logistic regression | 0.543208 | 0.197015 | 0.00424099 | 16.6572 |
| [baseline] Prior class | 0.551236 | 0.00335289 | 0.000611305 | 7.11342 |
Elec2
Summary
| Model | Accuracy | F1 | Memory in MB | Time in s |
|---|---|---|---|---|
| Deep River LSTM | 0.835751 | 0.79717 | 0.0359993 | 3685.53 |
| Deep River Logistic | 0.837372 | 0.799445 | 0.0213318 | 1308.76 |
| Deep River MLP | 0.582715 | 0.291357 | 0.0435371 | 2416.23 |
| Deep River RNN | 0.833267 | 0.799565 | 0.0357666 | 3638.42 |
| Logistic regression | 0.822144 | 0.777086 | 0.005373 | 201.865 |
| [baseline] Prior class | 0.575335 | 0.00248834 | 0.000611305 | 86.623 |
Phishing
Summary
| Model | Accuracy | F1 | Memory in MB | Time in s |
|---|---|---|---|---|
| Deep River LSTM | 0.8696 | 0.843118 | 0.0370378 | 383.293 |
| Deep River Logistic | 0.8488 | 0.832 | 0.0215311 | 45.0264 |
| Deep River MLP | 0.5528 | 0.231087 | 0.0437365 | 71.7688 |
| Deep River RNN | 0.8784 | 0.86055 | 0.0373087 | 110.05 |
| Logistic regression | 0.8872 | 0.871233 | 0.00556469 | 6.14968 |
| [baseline] Prior class | 0.554844 | 0.0794702 | 0.000611305 | 2.30399 |
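The prior-class baseline in the tables above reaches roughly 55% accuracy while its F1 stays close to zero. The following sketch illustrates how that can happen, assuming F1 is computed for the positive class (the tables do not state this). The function name `accuracy_and_f1` and the toy labels are hypothetical and are not taken from the benchmark code.

```python
def accuracy_and_f1(y_true, y_pred, positive=True):
    """Return (accuracy, F1 of the positive class) for two label sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = fn = correct = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            correct += 1
        if pred == positive and truth == positive:
            tp += 1  # predicted positive, actually positive
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive

    accuracy = correct / len(y_true) if y_true else 0.0
    denominator = 2 * tp + fp + fn
    f1 = 2 * tp / denominator if denominator else 0.0
    return accuracy, f1


# Hypothetical labels: 55 negatives followed by 45 positives.
# The predictions are negative everywhere except for the last sample.
y_true = [False] * 55 + [True] * 45
y_pred = [False] * 99 + [True]

acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.3f}  f1={f1:.3f}")
```

A model that almost always predicts the majority (negative) class is correct on most negatives, so accuracy tracks the class balance, but it recovers almost no positives, so F1 collapses.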
Datasets
Bananas
Bananas dataset.
An artificial dataset where instances belong to several banana-shaped clusters. There are two attributes, which correspond to the x and y axes, respectively.

| Property | Value |
|---|---|
| Name | Bananas |
| Task | Binary classification |
| Samples | 5,300 |
| Features | 2 |
| Sparse | False |
| Path | /Users/cedrickulbach/Documents/Projects/deep-river/.venv/lib/python3.10/site-packages/river/datasets/banana.zip |
Elec2
Electricity prices in New South Wales.
This is a binary classification task, where the goal is to predict whether the price of electricity will go up or down.
This data was collected from the Australian New South Wales Electricity Market. In this market, prices are not fixed and are affected by supply and demand. They are set every five minutes. Electricity transfers to and from the neighboring state of Victoria were done to alleviate fluctuations.

| Property | Value |
|---|---|
| Name | Elec2 |
| Task | Binary classification |
| Samples | 45,312 |
| Features | 8 |
| Sparse | False |
| Path | /Users/cedrickulbach/river_data/Elec2/electricity.csv |
| URL | https://maxhalford.github.io/files/datasets/electricity.zip |
| Size | 2.95 MiB |
| Downloaded | True |
Phishing
Phishing websites.
This dataset contains features from web pages that are classified as phishing or not.

| Property | Value |
|---|---|
| Name | Phishing |
| Task | Binary classification |
| Samples | 1,250 |
| Features | 9 |
| Sparse | False |
| Path | /Users/cedrickulbach/Documents/Projects/deep-river/.venv/lib/python3.10/site-packages/river/datasets/phishing.csv.gz |
Models
Logistic regression
Pipeline (
  StandardScaler (
    with_std=True
  ),
  LogisticRegression (
    optimizer=SGD (
      lr=Constant (
        learning_rate=0.005
      )
    )
    loss=Log (
      weight_pos=1.
      weight_neg=1.
    )
    l2=0.
    l1=0.
    intercept_init=0.
    intercept_lr=Constant (
      learning_rate=0.01
    )
    clip_gradient=1e+12
    initializer=Zeros ()
  )
)
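As a rough illustration of what a scaler-plus-logistic-regression pipeline like the one printed above does for each incoming sample, here is a plain-Python sketch. It is not river's implementation: the class names `RunningScaler` and `OnlineLogisticRegression`, the toy data stream, and the test-then-train loop are assumptions made for this example. Only the two learning rates (0.005 for the weights, 0.01 for the intercept) and the zero initialization are taken from the configuration; gradient clipping and the (zero) L1/L2 penalties are left out.

```python
import math
import random


class RunningScaler:
    """Standardize features using a running mean and variance (Welford's update).

    Assumes every sample carries the same set of feature keys.
    """

    def __init__(self):
        self.n = 0
        self.mean = {}
        self.m2 = {}  # running sum of squared deviations per feature

    def learn_one(self, x):
        self.n += 1
        for key, value in x.items():
            mean = self.mean.get(key, 0.0)
            delta = value - mean
            mean += delta / self.n
            self.mean[key] = mean
            self.m2[key] = self.m2.get(key, 0.0) + delta * (value - mean)

    def transform_one(self, x):
        scaled = {}
        for key, value in x.items():
            var = self.m2.get(key, 0.0) / self.n if self.n else 0.0
            centered = value - self.mean.get(key, 0.0)
            scaled[key] = centered / math.sqrt(var) if var > 0 else 0.0
        return scaled


class OnlineLogisticRegression:
    """Logistic regression trained one sample at a time with plain SGD on log loss."""

    def __init__(self, lr=0.005, intercept_lr=0.01):
        self.lr = lr
        self.intercept_lr = intercept_lr
        self.weights = {}  # missing keys are treated as zero-initialized weights
        self.intercept = 0.0

    def predict_proba_one(self, x):
        z = self.intercept + sum(self.weights.get(k, 0.0) * v for k, v in x.items())
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def learn_one(self, x, y):
        # Gradient of the log loss with respect to the logit is (p - y).
        error = self.predict_proba_one(x) - float(y)
        for key, value in x.items():
            self.weights[key] = self.weights.get(key, 0.0) - self.lr * error * value
        self.intercept -= self.intercept_lr * error


# Hypothetical toy stream: one feature, label is True when the value exceeds 5.
rng = random.Random(42)
scaler, model = RunningScaler(), OnlineLogisticRegression()
correct, n = 0, 2000
for _ in range(n):
    value = rng.uniform(0, 10)
    x, y = {"v": value}, value > 5
    scaler.learn_one(x)
    x_scaled = scaler.transform_one(x)
    correct += (model.predict_proba_one(x_scaled) > 0.5) == y  # test first ...
    model.learn_one(x_scaled, y)  # ... then train on the same sample

print(f"test-then-train accuracy: {correct / n:.3f}")
```

Each sample is scored before the model learns from it, so the reported accuracy reflects predictions on data the model has not yet seen.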
Deep River Logistic
Pipeline (
  StandardScaler (
    with_std=True
  ),
  LogisticRegressionInitialized (
    n_features=10
    n_init_classes=2
    loss_fn="cross_entropy"
    optimizer_fn="sgd"
    lr=0.005
    output_is_logit=True
    is_feature_incremental=True
    is_class_incremental=True
    device="cpu"
    seed=42
    gradient_clip_value=None
  )
)
Deep River MLP
Pipeline (
  StandardScaler (
    with_std=True
  ),
  MultiLayerPerceptronInitialized (
    n_features=10
    n_width=5
    n_layers=5
    n_init_classes=2
    loss_fn="cross_entropy"
    optimizer_fn="sgd"
    lr=0.005
    output_is_logit=True
    is_feature_incremental=True
    is_class_incremental=True
    device="cpu"
    seed=42
    gradient_clip_value=None
  )
)
Deep River LSTM
Pipeline (
  StandardScaler (
    with_std=True
  ),
  LSTMClassifier (
    n_features=10
    hidden_size=32
    n_init_classes=2
    loss_fn="cross_entropy"
    optimizer_fn="adam"
    lr=0.001
    output_is_logit=True
    is_feature_incremental=True
    is_class_incremental=True
    device="cpu"
    seed=42
    gradient_clip_value=None
  )
)
Deep River RNN
Pipeline (
  StandardScaler (
    with_std=True
  ),
  RNNClassifier (
    n_features=10
    hidden_size=32
    num_layers=1
    nonlinearity="tanh"
    n_init_classes=2
    loss_fn="cross_entropy"
    optimizer_fn="adam"
    lr=0.001
    output_is_logit=True
    is_feature_incremental=True
    is_class_incremental=True
    device="cpu"
    seed=42
    gradient_clip_value=None
  )
)
[baseline] Prior class
PriorClassifier ()
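For orientation, a prior-class baseline can be sketched as a model that ignores the features and predicts the label it has seen most often so far. The class below is a hypothetical plain-Python stand-in named `PriorClassSketch`; it is meant to convey the idea and may differ in detail from river's `PriorClassifier`.

```python
from collections import Counter


class PriorClassSketch:
    """Predict the most frequent label observed so far, ignoring the features."""

    def __init__(self):
        self.counts = Counter()

    def predict_one(self, x=None):
        if not self.counts:
            return None  # nothing learned yet
        # most_common(1) returns [(label, count)] for the top label
        return self.counts.most_common(1)[0][0]

    def learn_one(self, x, y):
        self.counts[y] += 1


baseline = PriorClassSketch()
for label in [False, True, True]:
    baseline.learn_one({}, label)
print(baseline.predict_one({}))
```

Because such a model tends to settle on the majority class, its accuracy approaches the share of that class in the stream, which gives a floor that any learned model should beat.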
Environment
Python implementation: CPython
Python version: 3.12.12
IPython version: 9.6.0
river: 0.22.0
numpy: 1.26.4
scikit-learn: 1.5.2
pandas: 2.2.3
scipy: 1.16.2
Compiler: Clang 21.1.4
OS: Linux
Release: 6.11.0-1018-azure
Machine: x86_64
Processor: x86_64
CPU cores: 4
Architecture: 64bit
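When rerunning the benchmarks on another machine, it can help to record comparable environment details alongside the results. The sketch below collects a similar set of fields using only the standard library. The function name `environment_report` is hypothetical, and this is not necessarily how the listing above was produced.

```python
import os
import platform
from importlib import metadata


def environment_report(packages=("river", "numpy", "scikit-learn", "pandas", "scipy")):
    """Collect interpreter, OS, and package version details into a dict."""
    report = {
        "Python implementation": platform.python_implementation(),
        "Python version": platform.python_version(),
        "OS": platform.system(),
        "Release": platform.release(),
        "Machine": platform.machine(),
        "CPU cores": os.cpu_count(),
        "Architecture": platform.architecture()[0],
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


for key, value in environment_report().items():
    print(f"{key}: {value}")
```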