r - How to interpret the reconstruction MSE from H2O anomaly detection? -

May 15, 2013

I am using H2O anomaly detection on my data. The data contains several continuous and categorical features, and the label is either 0 or 1. Because the count of 1s is less than 1%, I am trying out an anomaly detection technique instead of the usual classification methods. However, in the end the MSE is calculated per row of data, and I am not sure how to interpret it: how can I tell that a row whose actual label is 0 is an anomaly and should really be a 1?

The code I am using so far:

    library(h2o)
    h2o.init()

    # Use every column except the label as an input feature.
    features <- names(train.df)[!names(train.df) %in% c("label")]

    # Train only on the majority class (label == 0).
    train.df <- subset(train.df, label == 0)
    train.h <- as.h2o(train.df)

    mod.dl <- h2o.deeplearning(
      x = features,
      autoencoder = TRUE,
      training_frame = train.h,
      activation = "Tanh",
      hidden = c(10, 10),
      epochs = 20,
      adaptive_rate = FALSE,
      variable_importances = TRUE,
      l1 = 1e-4, l2 = 1e-4,
      sparse = TRUE
    )

    # Per-row reconstruction error on the training frame.
    pred.oc <- as.data.frame(h2o.anomaly(mod.dl, train.h))

head(pred.oc):

      Reconstruction.MSE
    1        0.012059304
    2        0.014490905
    3        0.011002231
    4        0.013142910
    5        0.009631915
    6        0.012897779

An autoencoder tries to learn a nonlinear, reduced representation of the original data. It is an unsupervised approach, so it only considers the features of the data. It is not an approach for classification.

The mean squared error is a way to see how hard it is for the autoencoder to reproduce each row from its reduced representation. Anomalies are the rows/observations with a high mean squared error.
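To make the per-row score concrete, here is a minimal base-R sketch of what a reconstruction MSE is: the mean of the squared differences between an input row and its reconstruction. The matrices `x` and `x_hat` are made-up toy values, not H2O output. As far as I know H2O computes the error on its own standardized, encoded version of the features, so these numbers would not match `h2o.anomaly` exactly.

```r
# Toy input (3 rows, 4 features); values are invented for illustration.
x <- matrix(c(0.1, 0.2, 0.3, 0.4,
              0.5, 0.5, 0.5, 0.5,
              0.9, 0.1, 0.8, 0.2), nrow = 3, byrow = TRUE)

# Hypothetical reconstruction of x, as an autoencoder might produce.
x_hat <- matrix(c(0.1, 0.2, 0.3, 0.4,
                  0.4, 0.6, 0.5, 0.5,
                  0.5, 0.5, 0.5, 0.5), nrow = 3, byrow = TRUE)

# Per-row reconstruction MSE: mean squared difference across the features.
row_mse <- rowMeans((x - x_hat)^2)

# The row that is reconstructed worst is the most anomalous under this score.
most_anomalous <- which.max(row_mse)
```

Here the first row is reproduced perfectly and the third row is reproduced worst, so the third row gets the highest score.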

In your case, the rows with the highest MSE should be considered anomalous. They might be rows that are really 1s but are labeled 0. However, that conclusion can't be drawn from the autoencoder approach alone.
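One way to check how well a high MSE lines up with the label is to score rows whose labels you already know and cross-tabulate the flags against them. This is only a sketch: the `scored` data frame, its values, and the 90th-percentile cutoff are all assumptions made up for illustration, not results from the model above.

```r
# Hypothetical per-row scores for rows with known labels.
# These numbers are invented for illustration only.
scored <- data.frame(
  mse   = c(0.012, 0.014, 0.011, 0.013, 0.010,
            0.013, 0.045, 0.012, 0.038, 0.011),
  label = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0)
)

# Cutoff taken from the rows known to be label 0 (assumed 90th percentile).
threshold <- quantile(scored$mse[scored$label == 0], probs = 0.90, names = FALSE)

# Flag rows whose reconstruction error exceeds the cutoff.
scored$flagged <- scored$mse > threshold

# Cross-tabulate the flags against the true label.
confusion <- table(flagged = scored$flagged, label = scored$label)
```

With real data you would take the cutoff from the training rows and build the table on a separate held-out set. Flagged rows with label 0 are the candidates worth inspecting by hand; the table itself only tells you how often the flag and the label agree.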
