port java lambda predicate to scala - java

How can I port https://github.com/davidmoten/rtree2/blob/master/src/test/java/com/github/davidmoten/rtree2/LatLongExampleTest.java#L55
Iterables.filter(tree
        // do the first search using the bounds
        .search(bounds),
        // refine using the exact distance
        entry -> {
            Point p = entry.geometry();
            Position position = Position.create(p.y(), p.x());
            return from.getDistanceToKm(position) < distanceKm;
        });
from Java to Scala? My approach below fails:
import com.github.davidmoten.grumpy.core.Position
import com.github.davidmoten.rtree2.{Iterables, RTree}
import com.github.davidmoten.rtree2.geometry.{Geometries, Point}
val sydney = Geometries.point(151.2094, -33.86)
val canberra = Geometries.point(149.1244, -35.3075)
val brisbane = Geometries.point(153.0278, -27.4679)
val bungendore = Geometries.point(149.4500, -35.2500)
var tree = RTree.star.create[String, Point]
tree = tree.add("Sydney", sydney)
tree = tree.add("Brisbane", brisbane)
val distanceKm = 300
val list = Iterables.toList(search(tree, canberra, distanceKm))
def createBounds(from: Position, distanceKm: Double) = {
  // this calculates a pretty accurate bounding box. Depending on the
  // performance you require you wouldn't have to be this accurate because
  // accuracy is enforced later
  val north = from.predict(distanceKm, 0)
  val south = from.predict(distanceKm, 180)
  val east = from.predict(distanceKm, 90)
  val west = from.predict(distanceKm, 270)
  Geometries.rectangle(west.getLon, south.getLat, east.getLon, north.getLat)
}
import com.github.davidmoten.grumpy.core.Position
import com.github.davidmoten.rtree2.RTree
def search[T](tree: RTree[String, Point], lonLat: Point, distanceKm: Double) = {
  // First we need to calculate an enclosing lat long rectangle for this
  // distance then we refine on the exact distance
  val from = Position.create(lonLat.y, lonLat.x)
  val bounds = createBounds(from, distanceKm)
  Iterables.filter(
    tree.search(bounds), // do the first search using the bounds
    // refine using the exact distance
    (entry) => {
      def foo(entry) = {
        val p = entry.geometry
        val position = Position.create(p.y, p.x)
        from.getDistanceToKm(position) < distanceKm
      }
      foo(entry)
    })
}
It fails, as the type of entry does not seem to be well defined.

tree has type RTree[String, Point], so T = String and S = Point. Therefore tree.search(bounds) has type Iterable[Entry[String, Point]], and entry has type Entry[String, Point].
Try
(entry: Entry[String, Point]) => {
  def foo(entry: Entry[String, Point]) = {
    val p = entry.geometry
    val position = Position.create(p.y, p.x)
    from.getDistanceToKm(position) < distanceKm
  }
  foo(entry)
})
Tested in Scala 2.13.0, rtree2 0.9-RC1, grumpy-core 0.2.4.
In Scala 2.11 this should be just
import scala.compat.java8.FunctionConverters._
((entry: Entry[String, Point]) => {
  def foo(entry: Entry[String, Point]) = {
    val p = entry.geometry
    val position = Position.create(p.y, p.x)
    from.getDistanceToKm(position) < distanceKm
  }
  foo(entry)
}).asJava
libraryDependencies += "org.scala-lang.modules" %% "scala-java8-compat" % "0.9.0"

Related

How to Implement Three Level Expandable list/Recycler View/Tree View with Radio Groups in the third level

I am stuck in a situation where I have to use a 3-level expandable list, and at the lowest (third) level there will be a radio group. The 2nd level can have multiple sub-levels; in other words, the 2nd level can have multiple children.
I have tried an ExpandableListView inside a RecyclerView, but since the RecyclerView inflates the view in onCreateViewHolder, I cannot change the size of the view in onBindViewHolder. A similar problem occurs with an ExpandableListView inside another ExpandableListView.
I have tried the AndroidTreeView library:
https://github.com/bmelnychuk/AndroidTreeView
It works fine for checkboxes, as I do not have to handle the checked state of the other boxes. But in the case of radio buttons I am unable to create a radio group so that only one item is selected at a time in the third level, since I have no control over the other radio buttons.
I have tried to modify the view holder class for the third level, but I am unable to handle the checked state of the other radio buttons properly, and even when I do, the activity has to be restarted to apply the changes, which produces a visible jerk. That is something I do not want.
Here is my third-level view holder class:
class ThirdListViewHolder(context: Context, var reloadView: () -> Unit) :
    TreeNode.BaseNodeViewHolder<ThirdListViewHolder.IconTreeItem>(context) {

    override fun createNodeView(node: TreeNode, value: IconTreeItem): View {
        val inflater = LayoutInflater.from(context)
        val view = inflater.inflate(R.layout.item_child_view_package_order_items_expandable_list_view, null, false)
        val tvValue = view.findViewById(R.id.select_item_check_box) as RadioButton
        tvValue.text = value.text
        tvValue.tag = value.tag
        tvValue.setOnCheckedChangeListener { buttonView, isChecked ->
            if (isChecked) {
                val getTag = tvValue.tag.toString().split("_")
                for (i in 0 until checkListControl.size) {
                    val grandParentID = checkListControl.get(i).grandParentID
                    val parentID = checkListControl.get(i).parentID
                    val childID = checkListControl.get(i).childID
                    val isChecked = checkListControl.get(i).isChecked
                    if (grandParentID.equals(getTag[0]) && parentID.equals(getTag[1]) && childID.equals(getTag[2])) {
                        val oldItem = CheckListControl(getTag[0], getTag[1], getTag[2], tvValue.text.toString(), false)
                        checkListControl.remove(oldItem)
                        val newItem = CheckListControl(getTag[0], getTag[1], getTag[2], tvValue.text.toString(), true)
                        tvValue.isChecked = true
                        checkListControl.add(newItem)
                        break
                    }
                }
            }
            reloadView()
        }
        return view
    }

    class IconTreeItem {
        var icon: Int = 0
        var tag: String? = null
        var text: String? = null
        var isChecked: Boolean = false
    }

    fun manageList(tag: String, name: String, isChecked: Boolean) {
        Log.d("tag_value", tag)
        var parentId: String? = null
        var itemId: String? = null
        val ids = tag.split("_")
        parentId = (ids.get(0).toInt() + 1).toString()
        for (i in 0 until PackageOrderItemsActivity.itemIdsList.size) {
            if (PackageOrderItemsActivity.itemIdsList.get(i).name.equals(name)) {
                itemId = PackageOrderItemsActivity.itemIdsList.get(i).item_id
            }
        }
        if (isChecked) {
            // Add
            if (PackageOrderItemsActivity.ParentChildHashMap.containsKey(parentId)) {
                val array = PackageOrderItemsActivity.ParentChildHashMap.get(parentId)
                if (!array!!.contains(itemId)) {
                    array.add(itemId!!)
                    PackageOrderItemsActivity.ParentChildHashMap.put(parentId, array)
                }
            } else {
                val array: ArrayList<String> = ArrayList()
                array.add(itemId!!)
                PackageOrderItemsActivity.ParentChildHashMap.put(parentId, array)
            }
        } else {
            if (PackageOrderItemsActivity.ParentChildHashMap.containsKey(parentId)) {
                val array = PackageOrderItemsActivity.ParentChildHashMap.get(parentId)
                if (array!!.contains(itemId)) {
                    array.remove(itemId)
                    if (array.isEmpty())
                        PackageOrderItemsActivity.ParentChildHashMap.remove(parentId)
                    else
                        PackageOrderItemsActivity.ParentChildHashMap.put(parentId, array)
                }
            }
        }
        Log.d("hashMap", PackageOrderItemsActivity.ParentChildHashMap.toString())
    }
}
And here is my code for the TreeView creation in the Activity:
fun populateTree() {
    val root = TreeNode.root()
    var parent: TreeNode? = null
    // val parent = TreeNode("parentList")
    // val child0 = TreeNode("secondList")
    // val child1 = TreeNode("thirdList")
    for (i in 0 until parentList.size) {
        var child1: TreeNode? = null
        var child2: TreeNode? = null
        val nodeItem = ParentListViewHolder.IconTreeItem()
        nodeItem.text = parentList.get(i)
        parent = TreeNode(nodeItem).setViewHolder(ParentListViewHolder(this@PackageOrderItemsActivity))
        for (j in 0 until secondList.size) {
            val nodeItem1 = SecondListViewHolder.IconTreeItem()
            nodeItem1.text = secondList.get(j)
            child1 = TreeNode(nodeItem1).setViewHolder(SecondListViewHolder(this@PackageOrderItemsActivity))
            parent.addChild(child1)
            if (thirdList.containsKey(secondList.get(j))) {
                val list = thirdList.get(secondList.get(j))!!
                Log.d("firstList", list.toString())
                for (l in 0 until list.size) {
                    val nodeItem2 = ThirdListViewHolder.IconTreeItem()
                    nodeItem2.text = list.get(l)
                    nodeItem2.tag = i.toString() + "_" + j + "_" + l
                    if (firstTime) {
                        val checkListItem =
                            CheckListControl(i.toString(), j.toString(), l.toString(), list.get(l), false)
                        checkListControl.add(checkListItem)
                    }
                    for (m in 0 until checkListControl.size) {
                        val grandParentId = checkListControl.get(m).grandParentID
                        val parentId = checkListControl.get(m).parentID
                        val childId = checkListControl.get(m).childID
                        val isChecked = checkListControl.get(m).isChecked
                        if (grandParentId.equals(i.toString()) && parentId.equals(j.toString()) &&
                            childId.equals(l.toString())
                        ) {
                            nodeItem2.isChecked = isChecked
                        }
                    }
                    child2 =
                        TreeNode(nodeItem2).setViewHolder(
                            ThirdListViewHolder(
                                this@PackageOrderItemsActivity, reloadView = {
                                    // populateTree()
                                    // init()
                                    ParentChildHashMap.clear()
                                    parentList.clear()
                                    secondList.clear()
                                    thirdList.clear()
                                    container.removeAllViews()
                                    getSubscriptionDetails()
                                })
                        )
                    child1.addChild(child2)
                    if (nodeItem2.isChecked) {
                        parent.isExpanded = true
                        child1.isExpanded = true
                    }
                }
            }
        }
        firstTime = false
        root.addChild(parent)
    }
    tView = AndroidTreeView(this@PackageOrderItemsActivity, root)
    container.addView(tView!!.view)
}
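One way to get radio-group behaviour at the third level without rebuilding the whole tree (which is what the reloadView lambda above does) is to remember the currently checked button and clear it in place when another one is checked. A minimal sketch (RadioGroupCoordinator is an illustrative name, not part of the code above):
import android.widget.RadioButton

// Illustrative helper: remembers the checked third-level button so a new
// selection can un-check the previous one without restarting the activity.
object RadioGroupCoordinator {
    private var selectedButton: RadioButton? = null

    fun onChecked(button: RadioButton) {
        val previous = selectedButton
        if (previous != null && previous !== button) {
            previous.isChecked = false // clear the old selection in place
        }
        selectedButton = button
    }
}
In the setOnCheckedChangeListener above, calling RadioGroupCoordinator.onChecked(tvValue) when isChecked is true could then replace the reloadView() call.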

How can I calculate True Positive Rate (TPR) and False Positive Rate (FPR) for different threshold values to generate a ROC curve for a classification model

I built a machine learning model to classify documents using NaiveBayesMultinomial. I am using the Java Weka API to train and test the model. To evaluate model performance I want to generate a ROC curve, but I do not understand how to calculate TPR and FPR for different threshold values. I attached my source code and a sample dataset. I would be very grateful if anyone could help me calculate TPR and FPR for different threshold values to generate the ROC curve. Thanks in advance for your help.
My Java Code:
package smote;
import java.io.File;
import java.util.Random;
import weka.classifiers.Classifier;
import weka.classifiers.bayes.NaiveBayesMultinomial;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.StringToWordVector;
public class calRoc {
    public static void main(String agrs[]) throws Exception {
        String fileRootPath = "...../DocsFIle.arff";
        Instances rawData = DataSource.read(fileRootPath);
        StringToWordVector filter = new StringToWordVector(10000);
        filter.setInputFormat(rawData);
        String[] options = { "-W", "10000", "-L", "-M", "2",
                "-stemmer", "weka.core.stemmers.IteratedLovinsStemmer",
                "-stopwords-handler", "weka.core.stopwords.Rainbow",
                "-tokenizer", "weka.core.tokenizers.AlphabeticTokenizer"
        };
        filter.setOptions(options);
        filter.setIDFTransform(true);
        filter.setStopwords(new File("/Research/DoctoralReseacher/IEICE/Dataset/stopwords.txt"));
        Instances data = Filter.useFilter(rawData, filter);
        data.setClassIndex(0);
        int numRuns = 10;
        double[] recall = new double[numRuns];
        double[] precision = new double[numRuns];
        double[] fmeasure = new double[numRuns];
        double tp, fp, fn, tn;
        String classifierName[] = { "NBM" };
        double totalPrecision, totalRecall, totalFmeasure;
        totalPrecision = totalRecall = totalFmeasure = 0;
        double avgPrecision, avgRecall, avgFmeasure;
        avgPrecision = avgRecall = avgFmeasure = 0;
        for (int run = 0; run < numRuns; run++) {
            Classifier classifier = null;
            classifier = new NaiveBayesMultinomial();
            int folds = 10;
            Random random = new Random(1);
            data.randomize(random);
            data.stratify(folds);
            tp = fp = fn = tn = 0;
            for (int i = 0; i < folds; i++) {
                Instances trains = data.trainCV(folds, i, random);
                Instances tests = data.testCV(folds, i);
                classifier.buildClassifier(trains);
                for (int j = 0; j < tests.numInstances(); j++) {
                    Instance instance = tests.instance(j);
                    double classValue = instance.classValue();
                    double result = classifier.classifyInstance(instance);
                    if (result == 0.0 && classValue == 0.0) {
                        tp++;
                    } else if (result == 0.0 && classValue == 1.0) {
                        fp++;
                    } else if (result == 1.0 && classValue == 0.0) {
                        fn++;
                    } else if (result == 1.0 && classValue == 1.0) {
                        tn++;
                    }
                }
            }
            if (tn + fn > 0)
                precision[run] = tn / (tn + fn);
            if (tn + fp > 0)
                recall[run] = tn / (tn + fp);
            if (precision[run] + recall[run] > 0)
                fmeasure[run] = 2 * precision[run] * recall[run] / (precision[run] + recall[run]);
            System.out.println("The " + (run + 1) + "-th run");
            System.out.println("Precision: " + precision[run]);
            System.out.println("Recall: " + recall[run]);
            System.out.println("Fmeasure: " + fmeasure[run]);
            totalPrecision += precision[run];
            totalRecall += recall[run];
            totalFmeasure += fmeasure[run];
        }
        avgPrecision = totalPrecision / numRuns;
        avgRecall = totalRecall / numRuns;
        avgFmeasure = totalFmeasure / numRuns;
        System.out.println("avgPrecision: " + avgPrecision);
        System.out.println("avgRecall: " + avgRecall);
        System.out.println("avgFmeasure: " + avgFmeasure);
    }
}
Sample dataset with a few instances:
@relation 'CamelBug'
@attribute Feature string
@attribute class-att {0,1}
@data
'XQuery creates an empty out message that makes it impossible to chain more processors behind it ',1
'org apache camel Message hasAttachments is buggy ',0
'unmarshal new JaxbDataFormat com foo bar returning JAXBElement ',0
'Can t get the soap header when the camel cxf endpoint working in the PAYLOAD data fromat ',0
'camel jetty Exchange failures should not be returned as ',1
'Delayer not working as expected ',1
'ParallelProcessing and executor flags are ignored in Multicast processor ',1
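For what it's worth, Weka can enumerate the thresholds for you: Evaluation collects the per-instance predicted probabilities during cross-validation, and weka.classifiers.evaluation.ThresholdCurve turns them into one row per threshold, including "True Positive Rate" and "False Positive Rate" attributes. A sketch (assuming a recent Weka such as 3.8, and the filtered data built in the question):
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayesMultinomial;
import weka.classifiers.evaluation.ThresholdCurve;
import weka.core.Instances;

public class RocPoints {
    // Prints one (threshold, TPR, FPR) row per threshold; "data" is the
    // filtered Instances from the question, class index already set.
    static void printRocPoints(Instances data) throws Exception {
        Evaluation eval = new Evaluation(data);
        eval.crossValidateModel(new NaiveBayesMultinomial(), data, 10, new Random(1));

        ThresholdCurve tc = new ThresholdCurve();
        int positiveClassValueIndex = 0; // class value treated as "positive"
        Instances curve = tc.getCurve(eval.predictions(), positiveClassValueIndex);

        int thr = curve.attribute("Threshold").index();
        int tpr = curve.attribute("True Positive Rate").index();
        int fpr = curve.attribute("False Positive Rate").index();
        for (int i = 0; i < curve.numInstances(); i++) {
            System.out.printf("threshold=%.3f TPR=%.3f FPR=%.3f%n",
                    curve.instance(i).value(thr),
                    curve.instance(i).value(tpr),
                    curve.instance(i).value(fpr));
        }
        System.out.println("AUC = " + ThresholdCurve.getROCArea(curve));
    }
}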

Loading an sklearn model in Java. Model created with DNNClassifier in Python

The goal is to open in Java a model created/trained in Python with tensorflow.contrib.learn.DNNClassifier.
At the moment the main issue is knowing the name of the tensor to feed in Java via the session runner method.
I have this test code in python :
from __future__ import division, print_function, absolute_import
import tensorflow as tf
import pandas as pd
import tensorflow.contrib.learn as learn
import numpy as np
from sklearn import metrics
from sklearn.cross_validation import train_test_split
from tensorflow.contrib import layers
from tensorflow.contrib.learn.python.learn.utils import input_fn_utils
from tensorflow.python.ops import array_ops
from tensorflow.python.framework import dtypes
from tensorflow.python.util.compat import as_text
print(tf.VERSION)
df = pd.read_csv('../NNNormalizeData-out.csv')
inputs = []
target = []
y = 0
for x in df.columns:
    if y != 35:
        #print("added %d" % y)
        inputs.append(x)
    else:
        target.append(x)
    y += 1
total_inputs,total_output = df.as_matrix(inputs).astype(np.float32),df.as_matrix([target]).astype(np.int32)
train_inputs, test_inputs, train_output, test_output = train_test_split(total_inputs, total_output, test_size=0.2, random_state=42)
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=train_inputs.shape[1],dtype=tf.float32)]
#target_column = [tf.contrib.layers.real_valued_column("output", dimension=train_output.shape[1])]
classifier = learn.DNNClassifier(hidden_units=[10, 20, 5], n_classes=5
,feature_columns=feature_columns)
classifier.fit(train_inputs, train_output, steps=100)
#Save Model into saved_model.pbtxt file (possible to Load in Java)
tfrecord_serving_input_fn = tf.contrib.learn.build_parsing_serving_input_fn(layers.create_feature_spec_for_parsing(feature_columns))
classifier.export_savedmodel(export_dir_base="test", serving_input_fn = tfrecord_serving_input_fn,as_text=True)
# Measure accuracy
pred = list(classifier.predict(test_inputs, as_iterable=True))
score = metrics.accuracy_score(test_output, pred)
print("Final score: {}".format(score))
# test individual samples
sample_1 = np.array( [[0.37671986791414125,0.28395908337619136,-0.0966095873607713,-1.0,0.06891621389763203,-0.09716678086712205,0.726029084013637,4.984689881073479E-4,-0.30296253267499107,-0.16192917054985334,0.04820256230479658,0.4951319883569152,0.5269983894210499,-0.2560313828048315,-0.3710980821053321,-0.4845867212612598,-0.8647234314469595,-0.6491591208322198,-1.0,-0.5004549422844073,-0.9880910165770813,0.5540293108747256,0.5625990251930839,0.7420121698556554,0.5445551415657979,0.4644276850235627,0.7316976292340245,0.636690006814346,0.16486621649984112,-0.0466018967678159,0.5261100063227044,0.6256168612312738,-0.544295484930702,0.379125782517193,0.6959368575211544]], dtype=float)
sample_2 = np.array( [[1.0,0.7982741870963959,1.0,-0.46270838239235024,0.040320274521029376,0.443451913224413,-1.0,1.0,1.0,-1.0,0.36689718911339564,-0.13577379160035796,-0.5162916256414466,-0.03373651520104648,1.0,1.0,1.0,1.0,0.786999801054777,-0.43856035121103853,-0.8199093927945158,1.0,-1.0,-1.0,-0.1134921695894473,-1.0,0.6420892436196663,0.7871737734493178,1.0,0.6501788845358409,1.0,1.0,1.0,-0.17586627413625022,0.8817194210401085]], dtype=float)
pred = list(classifier.predict(sample_2, as_iterable=True))
print("Prediction for sample_1 is:{} ".format(pred))
pred = list(classifier.predict_proba(sample_2, as_iterable=True))
print("Prediction for sample_2 is:{} ".format(pred))
A saved_model.pbtxt file is created.
I try to load this model in Java with the following code:
public class HelloTF {
    public static void main(String[] args) throws Exception {
        SavedModelBundle bundle = SavedModelBundle.load("/java/workspace/APIJavaSampleCode/tfModels/dnn/ModelSave", "serve");
        Session s = bundle.session();
        double[] inputDouble = {1.0,0.7982741870963959,1.0,-0.46270838239235024,0.040320274521029376,0.443451913224413,-1.0,1.0,1.0,-1.0,0.36689718911339564,-0.13577379160035796,-0.5162916256414466,-0.03373651520104648,1.0,1.0,1.0,1.0,0.786999801054777,-0.43856035121103853,-0.8199093927945158,1.0,-1.0,-1.0,-0.1134921695894473,-1.0,0.6420892436196663,0.7871737734493178,1.0,0.6501788845358409,1.0,1.0,1.0,-0.17586627413625022,0.8817194210401085};
        float[] inputfloat = new float[inputDouble.length];
        for (int i = 0; i < inputfloat.length; i++) {
            inputfloat[i] = (float) inputDouble[i];
        }
        Tensor inputTensor = Tensor.create(new long[] {35}, FloatBuffer.wrap(inputfloat));
        Tensor result = s.runner()
                .feed("input_example_tensor", inputTensor)
                .fetch("dnn/multi_class_head/predictions/probabilities")
                .run().get(0);
        float[] m = new float[5];
        float[] vector = result.copyTo(m);
        float maxVal = 0;
        int inc = 0;
        int predict = -1;
        for (float val : vector) {
            System.out.println(val + " ");
            if (val > maxVal) {
                predict = inc;
                maxVal = val;
            }
            inc++;
        }
        System.out.println(predict);
    }
}
I get the following error on the .run().get(0); line:
Exception in thread "main" org.tensorflow.TensorFlowException: Output 0 of type float does not match declared output type string for node _recv_input_example_tensor_0 = _Recv[_output_shapes=[[-1]], client_terminated=true, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=3663984897684684554, tensor_name="input_example_tensor:0", tensor_type=DT_STRING, _device="/job:localhost/replica:0/task:0/cpu:0"]()
at org.tensorflow.Session.run(Native Method)
at org.tensorflow.Session.access$100(Session.java:48)
at org.tensorflow.Session$Runner.runHelper(Session.java:285)
at org.tensorflow.Session$Runner.run(Session.java:235)
at tensorflow.HelloTF.main(HelloTF.java:35)
OK, I finally solved it: the main problem was the name of the input to use in Java, which is "dnn/input_from_feature_columns/input_from_feature_columns/concat" and not "input_example_tensor".
I discovered this by navigating the graph with: tensorboard --logdir=D:\python\Workspace\Autoencoder\src\dnn\ModelSave
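Alternatively, the operation names can be listed from Java itself, without TensorBoard; a small sketch (assuming the same TensorFlow 1.x Java API and the bundle loaded above):
import java.util.Iterator;
import org.tensorflow.Graph;
import org.tensorflow.Operation;

// Print every operation in the loaded graph to find feed/fetch names.
Graph g = bundle.graph();
Iterator<Operation> ops = g.operations();
while (ops.hasNext()) {
    Operation op = ops.next();
    System.out.println(op.name() + " : " + op.type());
}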
Here is the Java code:
public class HelloTF {
    public static void main(String[] args) throws Exception {
        SavedModelBundle bundle = SavedModelBundle.load("/java/workspace/APIJavaSampleCode/tfModels/dnn/ModelSave", "serve");
        Session s = bundle.session();
        double[] inputDouble = {1.0,0.7982741870963959,1.0,-0.46270838239235024,0.040320274521029376,0.443451913224413,-1.0,1.0,1.0,-1.0,0.36689718911339564,-0.13577379160035796,-0.5162916256414466,-0.03373651520104648,1.0,1.0,1.0,1.0,0.786999801054777,-0.43856035121103853,-0.8199093927945158,1.0,-1.0,-1.0,-0.1134921695894473,-1.0,0.6420892436196663,0.7871737734493178,1.0,0.6501788845358409,1.0,1.0,1.0,-0.17586627413625022,0.8817194210401085};
        float[] inputfloat = new float[inputDouble.length];
        for (int i = 0; i < inputfloat.length; i++) {
            inputfloat[i] = (float) inputDouble[i];
        }
        float[][] data = new float[1][35];
        data[0] = inputfloat;
        Tensor inputTensor = Tensor.create(data);
        Tensor result = s.runner()
                .feed("dnn/input_from_feature_columns/input_from_feature_columns/concat", inputTensor)
                //.feed("input_example_tensor", inputTensor)
                //.fetch("tensorflow/serving/classify")
                .fetch("dnn/multi_class_head/predictions/probabilities")
                //.fetch("dnn/zero_fraction_3/Cast")
                .run().get(0);
        float[][] m = new float[1][5];
        float[][] vector = result.copyTo(m);
        float maxVal = 0;
        int inc = 0;
        int predict = -1;
        for (float val : vector[0]) {
            System.out.println(val + " ");
            if (val > maxVal) {
                predict = inc;
                maxVal = val;
            }
            inc++;
        }
        System.out.println(predict);
    }
}
I have tested the output:
Python side:
Prediction for sample_2 is:[3]
Prediction for sample_2 is:[array([ 0.17157166, 0.24475774, 0.16158019, 0.24648622, 0.17560424], dtype=float32)]
Java side:
0.17157166
0.24475774
0.16158019
0.24648622
0.17560424
3
The error message offers a clue: the tensor named "input_example_tensor" in the model expects to have string contents, whereas you provided float values.
Judging by the name of the tensor and your code, I'd guess that the tensor you're feeding is defined in input_fn_utils.py. This tensor is passed to the tf.parse_example() op, which expects a vector of tf.train.Example protocol buffers, serialized as strings.
I got an error without feed("input_example_tensor", inputTensor) on TensorFlow 1.1.
But I found that an Example proto can be fed as "input_example_tensor", although it took a lot of time to figure out how to create string tensors for the serialized protocol buffer.
This is how I created inputTensor.
org.tensorflow.example.Example.Builder example = org.tensorflow.example.Example.newBuilder();
/* set some features to example... */
Tensor exampleTensor = Tensor.create(example.build().toByteArray());
// Here, the shape of exampleTensor is not specified yet.
// Set the shape to feed this as "input_example_tensor"
Graph g = bundle.graph();
Output examplePlaceholder =
    g.opBuilder("Placeholder", "example")
        .setAttr("dtype", exampleTensor.dataType())
        .build().output(0);
Tensor shapeTensor = Tensor.create(new long[]{1}, IntBuffer.wrap(new int[]{1}));
Output shapeConst = g.opBuilder("Const", "shape")
    .setAttr("dtype", shapeTensor.dataType())
    .setAttr("value", shapeTensor)
    .build().output(0);
Output shaped = g.opBuilder("Reshape", "output")
    .addInput(examplePlaceholder)
    .addInput(shapeConst)
    .build().output(0);
Tensor inputTensor = s.runner().feed(examplePlaceholder, exampleTensor).fetch(shaped).run().get(0);
// Now, inputTensor has shape of [1] and is ready to feed.
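For the elided "set some features" step, the Example can be populated with the generated protobuf builder API; a hypothetical sketch (the feature name "x" and its value are placeholders, and the real names must match the model's feature columns):
import org.tensorflow.example.Feature;
import org.tensorflow.example.Features;
import org.tensorflow.example.FloatList;

// Hypothetical: one float feature named "x" with a single value.
Features features = Features.newBuilder()
        .putFeature("x", Feature.newBuilder()
                .setFloatList(FloatList.newBuilder().addValue(1.0f))
                .build())
        .build();
example.setFeatures(features);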
Your parameters in .feed() and .fetch() should match your input and output data types.
You can look at your saved_model.pbtxt file; it contains details about your parameters and their input/output types.
For instance, my Java code:
Tensor result = s.runner()
        .feed("ParseExample/ParseExample", inputTensor)
        .fetch("dnn/binary_logistic_head/predictions/probabilities")
        .run().get(0);
And my saved_model.pbtxt (part of it):
node {
  name: "ParseExample/ParseExample"
  op: "ParseExample"
  input: "input_example_tensor"
  input: "ParseExample/ParseExample/names"
  input: "ParseExample/ParseExample/dense_keys_0"
  input: "ParseExample/Const"
  attr {
    key: "Ndense"
    value {
      i: 1
    }
  }
  attr {
    key: "Nsparse"
    value {
      i: 0
    }
  }
  attr {
    key: "Tdense"
    value {
      list {
        type: DT_FLOAT
      }
    }
  }
  attr {
    key: "_output_shapes"
    value {
      list {
        shape {
          dim {
            size: -1
          }
          dim {
            size: 2
          }
        }
      }
    }
  }
  attr {
    key: "dense_shapes"
    value {
      list {
        shape {
          dim {
            size: 2
          }
        }
      }
    }
  }
  attr {
    key: "sparse_types"
    value {
      list {
      }
    }
  }
}
outputs {
  key: "scores"
  value {
    name: "dnn/binary_logistic_head/predictions/probabilities:0"
    dtype: DT_FLOAT
    tensor_shape {
      dim {
        size: -1
      }
      dim {
        size: 2
      }
    }
  }
}
They are both compatible with my data type, float.

Appending a new column to existing CSV file in Spark with Java

I have found a solution to my problem here: Create new column with function in Spark Dataframe
But I am having difficulty converting the code below to Java, since it is in Scala:
import org.apache.spark.sql.functions._
val myDF = sqlContext.parquetFile("hdfs:/to/my/file.parquet")
val coder: (Int => String) = (arg: Int) => {if (arg < 100) "little" else "big"}
val sqlfunc = udf(coder)
myDF.withColumn("Code", sqlfunc(col("Amt")))
Can someone provide me the equivalent Java code for this? I am stuck converting the two lines below:
val coder: (Int => String) = (arg: Int) => {if (arg < 100) "little" else "big"}
val sqlfunc = udf(coder)
Thanks,
Create your User Defined Function:
public class CodeUdf implements UDF1<Integer, String> {
    @Override
    public String call(Integer integer) throws Exception {
        if (integer < 100)
            return "little";
        else
            return "big";
    }
}
Tell Spark about it (note that the UDF returns a String, so it must be registered with DataTypes.StringType):
sqlContext.udf().register("Code", new CodeUdf(), DataTypes.StringType);
Use it in a select.
df.selectExpr("value", "Code(value)").show();
Alternatively, you can express the same logic directly in SQL:
import org.apache.spark.sql.functions._
val myDF = sqlContext.parquetFile("hdfs:/to/my/file.parquet")
//val coder: (Int => String) = (arg: Int) => {if (arg < 100) "little" else "big"}
//val sqlfunc = udf(coder)
myDF.selectExpr("Code", "case when Amt < 100 then 'little' else 'big' end")

Converting an iterative function to recursive

I am trying to convert an iterative function to a recursive one.
But when I try to do that, it runs continuously, like an infinite loop.
This is my iterative code
private static Node buildModelTree(String[] args) {
    String clsIndex = args[3];
    splitted.add(currentsplit);
    double entropy = 0;
    int total_attributes = (Integer.parseInt(clsIndex)); // class index
    int split_size = splitted.size();
    GainRatio gainObj = new GainRatio();
    while (split_size > current_index) { // iterate through all distinct pairs for building children
        currentsplit = (SplitInfo) splitted.get(current_index);
        System.out.println("After currentsplit --->" + currentsplit);
        gainObj = new GainRatio();
        int res = 0;
        res = ToolRunner.run(new Configuration(), new CopyOfFunID3Driver(), args);
        gainObj.getcount(current_index);
        entropy = gainObj.currNodeEntophy();
        clsIndex = gainObj.majorityLabel();
        currentsplit.classIndex = clsIndex;
        if (entropy != 0.0 && currentsplit.attr_index.size() != total_attributes) { // calculate gain ratio
            bestGain(total_attributes, entropy, gainObj);
        } else {
            // When entropy is zero, build the tree
            Node branch = new Node();
            String rule = "";
            Gson gson = new Gson();
            int temp_size = currentsplit.attr_index.size();
            for (int val = 0; val < temp_size; val++) {
                int g = 0;
                g = (Integer) currentsplit.attr_index.get(val);
                if (val == 0) {
                    rule = g + " " + currentsplit.attr_value.get(val);
                    // JSON
                    // branch.add(g, currentsplit.attr_value.get(val).toString(), new Node(currentsplit.classIndex, true));
                } else {
                    rule = rule + " " + g + " " + currentsplit.attr_value.get(val);
                    // branch.add(g, currentsplit.attr_value.get(val).toString(), buildModelTree(args));
                }
            }
            rule = rule + " " + currentsplit.classIndex;
        }
        split_size = splitted.size();
        current_index++;
    }
}
Where should I make changes?
I am trying to build a tree, so in order to get the tree structure I am trying to make my ID3 code recursive.
With my current code I am only getting flat output, but I want it as a tree structure.
Please suggest.
A recursive algorithm must have the following:
1. Each time the function invokes itself, the problem size has to be reduced (i.e. if you first call the function with an array of size n, then the next call has to be with something smaller than n).
2. A base case: the condition for the return statement (for example, if the array size is 0, then return).
In your code, these two are missing.
You keep calling the function with the same size of problem. That's the problem.
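Schematically, the while loop over splitted can be turned into a recursion along these lines (a sketch only, with the loop body elided, to show the two requirements above in place):
// Sketch: each call consumes one split, so the remainder shrinks,
// and the base case stops the recursion when no splits remain.
private static void buildModelTree(String[] args, int currentIndex) {
    if (currentIndex >= splitted.size()) { // base case: nothing left
        return;
    }
    // ... process splitted.get(currentIndex) as in the loop body ...
    buildModelTree(args, currentIndex + 1); // reduced problem size
}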
Thanks
