R - How to interpret the reconstruction MSE from H2O anomaly detection?
I am using H2O anomaly detection on my data. The data contains several continuous and categorical features, and the label is either 0 or 1. Because the count of 1s is less than 1%, I am trying out an anomaly detection technique instead of the usual classification methods. However, the end result is an MSE calculated per row of data, and I am not sure how to interpret it so that I can tell whether a row whose actual label is 0 is an anomaly and should really be 1.
The code I am using so far:
library(h2o)
h2o.init()

# Use every column except the label as an input feature
features <- names(train.df)[!names(train.df) %in% c("label")]
# Train only on the majority class (label == 0)
train.df <- subset(train.df, label == 0)
train.h <- as.h2o(train.df)

mod.dl <- h2o.deeplearning(
  x = features,
  autoencoder = TRUE,
  training_frame = train.h,
  activation = "Tanh",
  hidden = c(10, 10),
  epochs = 20,
  adaptive_rate = FALSE,
  variable_importances = TRUE,
  l1 = 1e-4,
  l2 = 1e-4,
  sparse = TRUE
)

# Per-row reconstruction error of the training rows
pred.oc <- as.data.frame(h2o.anomaly(mod.dl, train.h))
head(pred.oc)
Output:
  Reconstruction.MSE
1        0.012059304
2        0.014490905
3        0.011002231
4        0.013142910
5        0.009631915
6        0.012897779
An autoencoder tries to learn a nonlinear, reduced representation of the original data. It is an unsupervised approach, so it only considers the features of the data. It is not an approach for classification.
The mean squared error is a way to see how hard it is for the autoencoder to reproduce each row at its output. Anomalies are the rows/observations with a high mean squared error.
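To make the per-row score concrete, here is a minimal base-R sketch of how a reconstruction MSE can be computed once you have the model's reconstruction of each row. The matrices `x` and `x_hat` are made-up illustrative values, not H2O output. As far as I know, H2O computes the error on its own normalized, one-hot-encoded representation of the features, so its numbers will not match a calculation done on the raw columns.

```r
# Hypothetical original rows (3 observations, 4 numeric features).
x <- matrix(c(0.1, 0.2, 0.3, 0.4,
              0.5, 0.5, 0.5, 0.5,
              0.9, 0.1, 0.8, 0.2),
            nrow = 3, byrow = TRUE)

# Hypothetical reconstructions; the third row is reconstructed poorly.
x_hat <- matrix(c(0.1, 0.2, 0.3, 0.4,
                  0.4, 0.6, 0.5, 0.5,
                  0.5, 0.5, 0.5, 0.5),
                nrow = 3, byrow = TRUE)

# Per-row MSE: average squared difference across the features of each row.
reconstruction_mse <- rowMeans((x - x_hat)^2)
print(reconstruction_mse)
```

The row that the reconstruction misses by the widest margin gets the largest value, which is why a high score is read as "this row does not look like what the model learned".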
In your case, the rows with the highest MSE should be considered anomalous. They could be rows that are really 1s but are labeled as 0. However, that conclusion can't be drawn from the autoencoder approach alone.
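One common way to act on this is to rank rows by reconstruction MSE, flag those above a cutoff, and compare the flags against known labels where they exist. A rough sketch, assuming a data frame `scores` with an `mse` column and the true `label`. Both the simulated data and the 99th-percentile cutoff are arbitrary assumptions for illustration, not something the autoencoder provides.

```r
set.seed(42)

# Simulated scores: 990 label-0 rows with low error, 10 label-1 rows with higher error.
scores <- data.frame(
  mse   = c(runif(990, 0.005, 0.020), runif(10, 0.050, 0.100)),
  label = c(rep(0, 990), rep(1, 10))
)

# Assumed cutoff: flag roughly the top 1% of rows by reconstruction error.
cutoff <- quantile(scores$mse, probs = 0.99)
scores$flagged <- scores$mse > cutoff

# Cross-tabulate the flags against the known labels.
confusion <- table(flagged = scores$flagged, label = scores$label)
print(confusion)
```

Real data is rarely separated this cleanly, so the cutoff ends up being a trade-off between false alarms and missed anomalies, and flagged rows still need to be checked by other means.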