r - How to interpret the reconstruction MSE from H2O anomaly detection?


I am using H2O anomaly detection on my data. The data contains several continuous and categorical features, and the label is either 0 or 1. Because the count of 1s is less than 1%, I am trying out an anomaly detection technique instead of the usual classification methods. However, in the end an MSE is calculated per row of data, and I am not sure how to interpret it so that I can tell whether a row whose actual label is 0 is an anomaly and should really be 1.

The code I am using so far:

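    library(h2o)
    h2o.init()

    # Use every column except the label as an input feature
    features <- names(train.df)[!names(train.df) %in% c("label")]
    # Train only on the majority class (label == 0)
    train.df <- subset(train.df, label == 0)
    train.h <- as.h2o(train.df)

    mod.dl <- h2o.deeplearning(
      x = features,
      autoencoder = TRUE,
      training_frame = train.h,
      activation = "Tanh",
      hidden = c(10, 10),
      epochs = 20,
      adaptive_rate = FALSE,
      variable_importances = TRUE,
      l1 = 1e-4, l2 = 1e-4,
      sparse = TRUE
    )

    # Per-row reconstruction MSE from the trained autoencoder
    pred.oc <- as.data.frame(h2o.anomaly(mod.dl, train.h))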

head(pred.oc):

      reconstruction.mse
    1        0.012059304
    2        0.014490905
    3        0.011002231
    4        0.013142910
    5        0.009631915
    6        0.012897779

An autoencoder tries to learn a nonlinear, reduced representation of the original data. It is an unsupervised approach, so it only considers the features of the data. It is not an approach for classification.

The mean squared error is a way to see how hard it is for the autoencoder to reproduce a row at its output. Anomalies are the rows/observations with a high mean squared error.
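
Written out, the score for row i can be sketched as below. This assumes the error is averaged over the p input dimensions the network actually sees (categorical columns expanded, numeric columns scaled), which is an assumption about the preprocessing and not something shown in the question. Here x_ij is the j-th input value of row i and \hat{x}_ij is the autoencoder's reconstruction of it.

```latex
\mathrm{MSE}_i = \frac{1}{p} \sum_{j=1}^{p} \left( x_{ij} - \hat{x}_{ij} \right)^2
```

A row the network reconstructs well has a score close to 0. The scale of the values depends on how the inputs were normalized, so scores are best compared against each other, not against a fixed number.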

In your case, the rows with the highest MSE should be considered anomalous. They could be rows that are really 1s but are labeled 0. However, that conclusion can't be drawn from the autoencoder approach alone.
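
A minimal sketch of how the highest-MSE rows could be picked out for inspection. The data frame here is simulated so the snippet runs without an H2O cluster, and the 99% quantile cutoff is an arbitrary assumption, not something H2O chooses for you. The column name follows the output shown in the question; check `names(pred.oc)` on your own result before relying on it.

```r
# Hypothetical stand-in for as.data.frame(h2o.anomaly(...)):
# 990 rows with small errors plus 10 rows with clearly larger errors.
set.seed(42)
pred.oc <- data.frame(
  reconstruction.mse = c(rexp(990, rate = 80), runif(10, 0.2, 0.5))
)

# Assumed cutoff: flag the top 1% of rows by reconstruction error.
threshold <- quantile(pred.oc$reconstruction.mse, probs = 0.99)
pred.oc$anomaly <- pred.oc$reconstruction.mse > threshold

# Rows to inspect, highest error first.
flagged <- pred.oc[pred.oc$anomaly, , drop = FALSE]
flagged <- flagged[order(-flagged$reconstruction.mse), , drop = FALSE]
head(flagged)
```

The flagged rows are only candidates. To check whether a high MSE actually lines up with label 1, you would score a held-out frame that still contains both labels and compare the flags against the true labels.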

