Accessing weights and raw activations of all layers in deeplearning4j - java

My goal is to visualize a model classifying an image. For the visualization I need the raw activations / outputs of each layer. Is there a way to access these when predicting? Furthermore, it would be very helpful if there is a way to access the weights. However, this is only optional.
The models to visualize are built dynamically and will be used to classify images of the MNIST and EMNIST data sets.
model.summary() of an exemplary model:
=======================================================================
LayerName (LayerType)   nIn,nOut   TotalParams   ParamsShape
=======================================================================
layer0 (DenseLayer)     784,200    157.000       W:{784,200}, b:{1,200}
layer1 (DenseLayer)     200,100    20.100        W:{200,100}, b:{1,100}
layer2 (OutputLayer)    100,10     1.010         W:{100,10}, b:{1,10}
-----------------------------------------------------------------------
Total Parameters: 178.110
Trainable Parameters: 178.110
Frozen Parameters: 0
=======================================================================
The code for image classification:
INDArray reshaped = reshapeImage(image);
int predictedIndex = model.predict(reshaped)[0];
double conf = model.output(reshaped).getDouble(predictedIndex);
If you need more information / code snippets, please let me know.

Related

H2O : NullPointerException error while building ensemble model using deep learning grid

I am trying to build a stacked ensemble model to predict merchant churn using R (version 3.3.3) and deep learning in h2o (version 3.10.5.1). The response variable is binary. At the moment I am trying to run the code to build a stacked ensemble model using the top 5 models developed by the grid search. However, when the code is run, I get a java.lang.NullPointerException error with the following output:
java.lang.NullPointerException
at hex.StackedEnsembleModel.checkAndInheritModelProperties(StackedEnsembleModel.java:265)
at hex.ensemble.StackedEnsemble$StackedEnsembleDriver.computeImpl(StackedEnsemble.java:115)
at hex.ModelBuilder$Driver.compute2(ModelBuilder.java:173)
at water.H2O$H2OCountedCompleter.compute(H2O.java:1349)
at jsr166y.CountedCompleter.exec(CountedCompleter.java:468)
at jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:263)
at jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:974)
at jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1477)
at jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
Below is the code that I've used to do the hyper-parameter grid search and build the ensemble model:
hyper_params <- list(
activation=c("Rectifier","Tanh","Maxout","RectifierWithDropout","TanhWithDropout","MaxoutWithDropout"),
hidden=list(c(50,50),c(30,30,30),c(32,32,32,32,32),c(64,64,64,64,64),c(100,100,100,100,100)),
input_dropout_ratio=seq(0,0.2,0.05),
l1=seq(0,1e-4,1e-6),
l2=seq(0,1e-4,1e-6),
rho = c(0.9,0.95,0.99,0.999),
epsilon=c(1e-10,1e-09,1e-08,1e-07,1e-06,1e-05,1e-04)
)
search_criteria <- list(
strategy = "RandomDiscrete",
max_runtime_secs = 3600,
max_models = 100,
seed=1234,
stopping_metric="misclassification",
stopping_tolerance=0.01,
stopping_rounds=5
)
dl_ensemble_grid <- h2o.grid(
hyper_params = hyper_params,
search_criteria = search_criteria,
algorithm="deeplearning",
grid_id = "final_grid_ensemble_dl",
x=predictors,
y=response,
training_frame = h2o.rbind(train, valid, test),
nfolds=5,
fold_assignment="Modulo",
keep_cross_validation_predictions = TRUE,
keep_cross_validation_fold_assignment = TRUE,
epochs=12,
max_runtime_secs = 3600,
stopping_metric="misclassification",
stopping_tolerance=0.01,
stopping_rounds=5,
seed = 1234,
max_w2=10
)
DLsortedGridEnsemble_logloss <- h2o.getGrid("final_grid_ensemble_dl",sort_by="logloss",decreasing=FALSE)
ensemble <- h2o.stackedEnsemble(x = predictors,
  y = response,
  training_frame = h2o.rbind(train,valid,test),
  base_models = list(
    DLsortedGridEnsemble_logloss@model_ids[[1]],
    DLsortedGridEnsemble_logloss@model_ids[[2]],
    DLsortedGridEnsemble_logloss@model_ids[[3]],
    DLsortedGridEnsemble_logloss@model_ids[[4]],
    DLsortedGridEnsemble_logloss@model_ids[[5]]
  )
)
Note: what I have realised so far is that the h2o.stackedEnsemble function works when there is only one base model, and it gives the Java error as soon as there are two or more base models.
I would really appreciate some feedback on how this could be resolved.
The error refers to a line of the StackedEnsembleModel.java code that checks that the training_frame in the base models and the training_frame in h2o.stackedEnsemble() have the same checksum. I think the problem is caused because you dynamically created the training frame, rather than defining it explicitly (even though that should work since it's the same data in the end). So, rather than setting training_frame = h2o.rbind(train, valid, test) in the grid and ensemble functions, set the following at the top of your code:
df <- h2o.rbind(train, valid, test)
And then set training_frame = df in the grid and ensemble functions.
As a side note, you may get better DL models if you use a validation frame (for early stopping), rather than using all your data for the training frame. Also, if you want to use all the models in your grid (might lead to better performance, but not always), you can set base_models = DLsortedGridEnsemble_logloss@model_ids in the h2o.stackedEnsemble() function.

Gremlin get all incoming and outgoing vertex, including their edges and directions

I spent a week in the Gremlin shell trying to compose one query to get all incoming and outgoing vertices, including their edges and directions. I tried everything I could think of.
g.V("name","testname").bothE.as('both').select().back('both').bothV.as('bothV').select(){it.map()}
The output I need is (just an example structure):
[v{'name':"testname"}]___[ine{edge_name:"nameofincomingedge"}]____[v{name:'nameofconnectedvertex'}]
[v{'name':"testname"}]___[oute{edge_name:"nameofoutgoingedge"}]____[v{name:'nameofconnectedvertex'}]
So I just want to get 1) all vertices with an exact name, 2) the edges of each such vertex (including the type, inE or outE), and 3) the connected vertex. Ideally, after that I want to get their map() so I'll get the complete object properties. I don't care about the output style; I just need all of the information to be present so I can manipulate it afterwards. I need this to practice Gremlin, but Neo4j examples are welcome. Thanks!
There's a variety of ways to approach this. Here's a few ideas that will hopefully inspire you to an answer:
gremlin> g = TinkerGraphFactory.createTinkerGraph()
==>tinkergraph[vertices:6 edges:6]
gremlin> g.V('name','marko').transform{[v:it,inE:it.inE().as('e').outV().as('v').select().toList(),outE:it.outE().as('e').inV().as('v').select().toList()]}
==>{v=v[1], inE=[], outE=[[e:e[9][1-created->3], v:v[3]], [e:e[7][1-knows->2], v:v[2]], [e:e[8][1-knows->4], v:v[4]]]}
The transform converts the incoming vertex to a Map and does internal traversal over in/out edges. You could also use path as follows to get a similar output:
gremlin> g.V('name','marko').transform{[v:it,inE:it.inE().outV().path().toList().toList(),outE:it.outE().inV().path().toList()]}
==>{v=v[1], inE=[], outE=[[v[1], e[9][1-created->3], v[3]], [v[1], e[7][1-knows->2], v[2]], [v[1], e[8][1-knows->4], v[4]]]}
I provided these answers using TinkerPop 2.x, as that looked like what you were using, judging from the syntax. TinkerPop 3.x is now available, and if you are just getting started, you should take a look at what the latest version has to offer:
http://tinkerpop.incubator.apache.org/
Under 3.0 syntax you might do something like this:
gremlin> g.V().has('name','marko').as('a').bothE().bothV().where(neq('a')).path()
==>[v[1], e[9][1-created->3], v[3]]
==>[v[1], e[7][1-knows->2], v[2]]
==>[v[1], e[8][1-knows->4], v[4]]
I know that you wanted to see the direction of each edge in the output, but that's easy enough to detect by analyzing the path.
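The direction check mentioned above can be sketched in plain Java. This is only an illustration under assumptions: PathEdge and EdgeDirection are hypothetical names, not TinkerPop classes, and the vertex ids are read off the path output (for e[9][1-created->3], the out vertex is 1 and the in vertex is 3).

```java
// Hypothetical stand-in for an edge taken from a path; not a TinkerPop class.
final class PathEdge {
    final int outVertexId; // vertex the edge leaves
    final int inVertexId;  // vertex the edge points to
    final String label;

    PathEdge(int outVertexId, int inVertexId, String label) {
        this.outVertexId = outVertexId;
        this.inVertexId = inVertexId;
        this.label = label;
    }
}

final class EdgeDirection {
    // Direction of the edge as seen from the vertex that starts the path.
    // A self-loop is reported as "OUT" because the out vertex is checked first.
    static String directionFrom(int startVertexId, PathEdge edge) {
        if (edge.outVertexId == startVertexId) {
            return "OUT";
        }
        if (edge.inVertexId == startVertexId) {
            return "IN";
        }
        throw new IllegalArgumentException(
                "edge is not incident to vertex " + startVertexId);
    }

    public static void main(String[] args) {
        PathEdge created = new PathEdge(1, 3, "created"); // e[9][1-created->3]
        System.out.println(directionFrom(1, created)); // prints OUT
        System.out.println(directionFrom(3, created)); // prints IN
    }
}
```

The same comparison works on each [vertex, edge, vertex] triple of a path: compare the id of the first vertex with the edge's out vertex id.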
UPDATE: Here's the above query written with Daniel's suggestion of otherV usage:
gremlin> g.V().has('name','marko').bothE().otherV().path()
==>[v[1], e[9][1-created->3], v[3]]
==>[v[1], e[7][1-knows->2], v[2]]
==>[v[1], e[8][1-knows->4], v[4]]
To see the data from this you can use by() to pick apart each Path object. The following extension to the above query applies valueMap to each piece of each Path:
gremlin> g.V().has('name','marko').bothE().otherV().path().by(__.valueMap(true))
==>[{label=person, name=[marko], id=1, age=[29]}, {label=created, weight=0.4, id=9}, {label=software, name=[lop], id=3, lang=[java]}]
==>[{label=person, name=[marko], id=1, age=[29]}, {label=knows, weight=0.5, id=7}, {label=person, name=[vadas], id=2, age=[27]}]
==>[{label=person, name=[marko], id=1, age=[29]}, {label=knows, weight=1.0, id=8}, {label=person, name=[josh], id=4, age=[32]}]

Spark Performance on "Small Data"

I'm hoping that someone familiar with Spark can give me a "gut check" on whether I'm likely abusing the SparkML framework or if the performance I'm seeing is understandable given the context (#rows, #features).
Briefly, I have a small dataset (~150 rows) that is fairly wide (~180 features). I have coded up analogous Lasso training codes in Spark and Scikit-learn, which result in identical models (same model coefficients and LOOCVE). However, the Spark code takes roughly 100x longer (sklearn takes about 5 seconds, Spark close to 600 seconds).
I understand that Spark is optimized for large distributed datasets and that this difference can reasonably be attributed to overhead latency that would be hidden by data parallelism, but this still feels extremely sluggish.
The spark code is essentially:
//... code to add a number of PipelineStages to a List<PipelineStage> (~90 UnaryTransformer stages), ending in a StandardScaler
// Add Lasso model
LinearRegression lasso = new LinearRegression()
.setLabelCol(response)
.setFeaturesCol("normed_features")
.setMaxIter(100000)
.setPredictionCol(response+"_prediction")
.setElasticNetParam(1.0)
.setFitIntercept(true)
.setRegParam(0.2);
// stages is the List<PipelineStage> loaded with 90 or so UnaryTransformer steps
stages.add(lasso);
Pipeline pipeline = new Pipeline().setStages(stages.toArray(new PipelineStage[0]));
DataFrame df = getTrainingData(trainingData, response);
RegressionEvaluator evaluator = new RegressionEvaluator()
.setLabelCol(response)
.setMetricName("mae")
.setPredictionCol(response+"_prediction");
df.cache();
ParamMap[] paramGrid = new ParamGridBuilder().build();
CrossValidator cv = new CrossValidator()
.setEstimator(pipeline)
.setEvaluator(evaluator)
.setEstimatorParamMaps(paramGrid)
.setNumFolds(20);
double cve = cv.fit(df).avgMetrics()[0];
The Python code uses Lasso and GridSearchCV with the same number of folds (20).
Unfortunately, I can't really provide a MWE as we use a custom Transformer that I'd have to paste in, but I'm wondering if anyone would be willing to weigh in on whether this runtime difference between sklearn and spark implies user error. The only good practice I am knowingly applying is caching the training DataFrame before fitting the CrossValidator.

how to get correctly matched features

I am detecting features in two different images, and the resulting image from the descriptor matcher contains features that do not belong to each other, as seen in img_1 below.
The steps I followed are as follows:
1. Feature detection using the SIFT algorithm. This step yields a MatOfKeyPoint object for each image, which means MatKeyPts_1 and MatKeyPts_2.
2. Descriptor extraction using the SURF algorithm.
3. Descriptor matching using the BRUTEFORCE algorithm. The code of this step is posted below; the descriptors extracted from the query_img and the train_img were used as input in this step. I am also using my own classes that I created to control and maintain this process.
The problem is that the result of step 3 is the image posted below (img_1), which has completely dissimilar features linked to each other. I expected to see, for example, a specific region of the hand (img_1, right) linked to the similar feature of the hand in (img_1, left), but as you can see I got mixed and unrelated features.
My question is: how do I get correct feature matching using SIFT and SURF as feature detector and descriptor extractor, respectively?
private static void descriptorMatcher() {
    MatOfDMatch matDMatch = new MatOfDMatch(); // empty MatOfDMatch object
    // the descriptors extracted from the query and the train image are used as parameters
    dm.match(matFactory.getComputedDescExtMatAt(0), matFactory.getComputedDescExtMatAt(1), matDMatch);
    matFactory.addRawMatchesMatDMatch(matDMatch);

    /* writing the raw MatDMatches */
    Mat outImg = new Mat();
    Features2d.drawMatches(matFactory.getMatAt(0), matFactory.getMatKeyPtsAt(0), matFactory.getMatAt(1), matFactory.getMatKeyPtsAt(1), MatFactory.lastAddedObj(matFactory.getRawMatchesMatDMatchList()),
            outImg);
    matFactory.addRawMatchedImage(outImg);
    // this produces img_2
    MatFactory.writeMat(FilePathUtils.newOutputPath(SystemConstants.RAW_MATCHED_IMAGE), MatFactory.lastAddedObj(matFactory.getRawMatchedImageList()));

    /* getting the top 10 shortest distances */
    List<DMatch> dMtchList = matDMatch.toList();
    // this method sorts dMtchList in ascending order, picks only the 10 smallest distances and assigns them to goodDMatchList
    List<DMatch> goodDMatchList = MatFactory.getTopGoodMatches(dMtchList, 0, 10);

    /* converting the good DMatches to MatOfDMatch */
    MatOfDMatch goodMatDMatches = new MatOfDMatch();
    goodMatDMatches.fromList(goodDMatchList);
    matFactory.addGoodMatchMatDMatch(goodMatDMatches);

    /* drawing the good matches and writing the good matches image */
    Features2d.drawMatches(matFactory.getMatAt(0), matFactory.getMatKeyPtsAt(0), matFactory.getMatAt(1), matFactory.getMatKeyPtsAt(1), MatFactory.lastAddedObj(matFactory.getGoodMatchMatDMatchList()),
            outImg);
    // this produces img_1, posted below
    MatFactory.writeMat(FilePathUtils.newOutputPath(SystemConstants.GOOD_MATCHED_IMAGE), outImg);
}
Img_1
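The "sort ascending and keep the 10 shortest distances" step described in the code comments could look roughly like the sketch below. The author's getTopGoodMatches is not shown, so this is a hypothetical stand-in in plain Java: ScoredMatch and TopMatches are invented names, and ScoredMatch only mimics the index and distance fields of a match.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a descriptor match; not the OpenCV DMatch class.
final class ScoredMatch {
    final int queryIdx;
    final int trainIdx;
    final float distance;

    ScoredMatch(int queryIdx, int trainIdx, float distance) {
        this.queryIdx = queryIdx;
        this.trainIdx = trainIdx;
        this.distance = distance;
    }
}

final class TopMatches {
    // Returns the n matches with the smallest distance, in ascending order.
    // The input list is copied first, so the caller's list is left untouched.
    static List<ScoredMatch> topByDistance(List<ScoredMatch> matches, int n) {
        List<ScoredMatch> sorted = new ArrayList<>(matches);
        sorted.sort(Comparator.comparingDouble((ScoredMatch m) -> m.distance));
        int count = Math.max(0, Math.min(n, sorted.size()));
        return new ArrayList<>(sorted.subList(0, count));
    }
}
```

Keeping the n smallest distances only ranks the matches. It does not by itself reject a wrong match whose descriptors happen to be close.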

Copying Mat to raw array in OpenCV with Java? (Getting "multiple of channels count" error)

I'm trying to load an image in Scala using OpenCV with the Java bindings. After loading the image, I'd like to convert it to a traditional Scala Array[Float].
Following the suggestions in this post, I implemented the following code to achieve this:
val image = Highgui.imread(imgName)
image.convertTo(image, CvType.CV_32FC1) //convert 8-bit char -> single channel 32-bit float
val s = image.size()
val height = s.height.asInstanceOf[Int]
val width = s.width.asInstanceOf[Int]
val nChannels = image.channels()
printf("img size = %d, %d, %d \n", height, width, nChannels); // 512, 512, 3
//thanks: http://answers.opencv.org/question/4761/mat-to-byte-array/
val imageInFloats = new Array[Float](height * width * image.channels())
image.get(0, 0, imageInFloats)
When running the code, I get the following error:
[error] (run-main) java.lang.UnsupportedOperationException:
Provided data element number (1) should be multiple of the Mat channels count (3)
java.lang.UnsupportedOperationException: Provided data element number (1) should
be multiple of the Mat channels count (3)
at org.opencv.core.Mat.get(Mat.java:2587)
at HelloOpenCV$.main(conv.scala:25)
...
There are a couple of reasons why this error doesn't make sense to me:
The image should be 1-channel because we do convertTo(...32FC1). Printing image.channels() reveals that there are 3 channels. Huh?
The size of imageInFloats is a multiple of image.channels(). I think this contradicts the error message about it not being a multiple of the number of channels.
Why does this code throw the "should be multiple of the Mat channels count" error?
Configuration details:
sbt 0.12.4
OpenCV 2.4.9
Final notes:
There may be a more lightweight Scala library that would work as well as OpenCV for loading images into Scala. I'm using OpenCV for this because I've been doing a bunch of other vision stuff in Scala with OpenCV. That said, I'm willing to explore other libraries for image I/O.
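Highgui.imread(imgName) loads the image as a 3-channel color image (OpenCV stores it in BGR order). convertTo() changes only the element depth, not the number of channels, so the Mat still has 3 channels after the conversion.
It should work as you expected if you either load the image as grayscale with Highgui.imread(imgName, 0) or apply cvtColor() to do a manual conversion.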
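To make the channel arithmetic concrete, here is a small plain-Java sketch. It does not call OpenCV: ChannelSketch and its methods are hypothetical names used only for illustration. It assumes interleaved B, G, R samples per pixel and uses the luma weights documented for OpenCV's BGR-to-gray conversion (0.299 R + 0.587 G + 0.114 B).

```java
// Plain-Java illustration only; none of these names belong to OpenCV.
final class ChannelSketch {
    // Number of values needed to hold a whole image:
    // one value per channel per pixel.
    static int bufferLength(int rows, int cols, int channels) {
        return rows * cols * channels;
    }

    // Collapses interleaved B, G, R samples into one gray value per pixel.
    static float[] bgrToGray(float[] bgr) {
        if (bgr.length % 3 != 0) {
            throw new IllegalArgumentException(
                    "length must be a multiple of the 3 channels");
        }
        float[] gray = new float[bgr.length / 3];
        for (int p = 0; p < gray.length; p++) {
            float b = bgr[3 * p];
            float g = bgr[3 * p + 1];
            float r = bgr[3 * p + 2];
            gray[p] = 0.299f * r + 0.587f * g + 0.114f * b;
        }
        return gray;
    }
}
```

For the 512 x 512 image in the question, a 3-channel Mat holds 512 * 512 * 3 values, while a grayscale Mat holds 512 * 512.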
