Google Prediction API: Java client gives different results from the web interface

I am using Google's prediction API. I have trained a few models and made predictions using Google's web interface. I want to make a few thousand predictions but the web interface only lets you make one prediction at a time. I have thus slightly adapted the "prediction-cmdline-sample" which is a sample for using the Java library to interface with the Google prediction API. However the results I am getting using the Java library are different to the web interface.
The code which I use to make a prediction is:
private static String predict(Prediction prediction, String text) throws IOException {
    Input input = new Input();
    InputInput inputInput = new InputInput();
    inputInput.setCsvInstance(Collections.<Object>singletonList(text));
    input.setInput(inputInput);
    Output output = prediction.trainedmodels().predict(PROJECT_ID, MODEL_ID, input).execute();
    return output.getOutputValue();
}
The method returns 0.500305 irrespective of what input I give (0.500305 is roughly the average value of the first column of the training data).
Any suggestions to fix this issue would be greatly appreciated.
If anyone knows of another way to make a few thousand predictions please also let me know.

I found my problem: I wasn't formatting my input correctly. The code
Collections.singletonList(text)
is only correct for String inputs; it is incorrect for inputs of type double. What made this error hard to find is that the Prediction API doesn't report an error when it is given incorrectly typed input. It just returns some result.
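A minimal sketch of the fix described above, assuming the model expects numeric features as numbers rather than as one string. The helper toCsvInstance is hypothetical (it is not part of the Google client library), and it uses a plain comma split that does not handle quoted fields. Its result would be passed to inputInput.setCsvInstance(...) in place of Collections.singletonList(text).

```java
import java.util.ArrayList;
import java.util.List;

class CsvInstanceBuilder {
    // Hypothetical helper: splits one comma-separated row and converts
    // each numeric field to a Double, leaving other fields as Strings.
    static List<Object> toCsvInstance(String row) {
        List<Object> features = new ArrayList<>();
        for (String field : row.split(",")) {
            String trimmed = field.trim();
            try {
                features.add(Double.valueOf(trimmed));
            } catch (NumberFormatException e) {
                features.add(trimmed); // not a number: keep it as text
            }
        }
        return features;
    }

    public static void main(String[] args) {
        // Each element is now typed: Double for numbers, String otherwise.
        for (Object feature : toCsvInstance("1.5, 42, hello")) {
            System.out.println(feature.getClass().getSimpleName() + ": " + feature);
        }
    }
}
```

Because the API reportedly returns a result even for badly typed input, checking the element types locally like this is one way to catch the mistake before sending a few thousand requests.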


ANTLR - How to determine what kind of parse tree "best fits" some code

I'm building a program with ANTLR where I ask the user to enter some Java code, and it spits out equivalent C# code. The program parses whatever Java code the user enters. Up until now I've been assuming that they will enter something that will parse as a valid compilation unit on its own, e.g. something like
package foo;
class A { ... }
class B { ... }
class C { ... }
However, that isn't always the case. They might just enter code from the inside of a class:
public void method1() {
...
}
public void method2() {
...
}
Or the inside of a method:
System.out.print("hello ");
System.out.println("world!");
Or even just an expression:
context.getSystemService(Context.ACTIVITY_SERVICE)
If I try to parse such snippets by calling parser.compilationUnit(), it won't work correctly because most of the code is parsed as error nodes. I need to call the correct method depending on the nature of the code, such as parser.expression() or parser.blockStatements(). However, I don't want to ask the user to explicitly indicate this. What's the best way to infer what kind of code I'm parsing?
Rather than trying to guess a valid grammar rule entry point to parse a language snippet of unknown scope, progressively add scope wrappers to the source text until a valid top-level rule parse is achieved.
That is, with each successive parse failure, progressively add dummy package, class, & method statements as source text wrappers.
Whichever wrapper was added to achieve a successful parse will then be a known quantity. Therefore, the parse tree node representing the original source text can be easily identified.
You will probably want to use a fail-fast parser; construct the parser with the BailErrorStrategy to obtain this behavior.
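The wrapper approach can be sketched roughly as follows. This is an illustration under assumptions, not the answerer's code: the wrapper templates, the scope names and the Dummy identifiers are invented, and the parsesCleanly predicate stands in for "parse the text with the top-level rule and report whether it succeeded". With ANTLR, that predicate would typically build a parser using BailErrorStrategy and treat a ParseCancellationException as a failure.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

class SnippetScopeDetector {
    // Wrapper templates, tried from the outermost scope inwards.
    // Names and dummy identifiers are illustrative only.
    private static final Map<String, String> WRAPPERS = new LinkedHashMap<>();
    static {
        WRAPPERS.put("compilationUnit", "%s");
        WRAPPERS.put("classBody", "class Dummy { %s }");
        WRAPPERS.put("methodBody", "class Dummy { void dummy() { %s } }");
        WRAPPERS.put("expression", "class Dummy { Object dummy = %s; }");
    }

    // Returns the name of the first wrapper under which the snippet
    // parses, or null if no wrapper produces a clean parse.
    static String detectScope(String snippet, Predicate<String> parsesCleanly) {
        for (Map.Entry<String, String> wrapper : WRAPPERS.entrySet()) {
            String wrapped = String.format(wrapper.getValue(), snippet);
            if (parsesCleanly.test(wrapped)) {
                return wrapper.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Toy stand-in for a real parser: accepts only method-wrapped text.
        Predicate<String> toyParser = text -> text.contains("void dummy()");
        System.out.println(detectScope("System.out.println(\"world!\");", toyParser));
    }
}
```

Since the wrapper that succeeded is known, the subtree for the user's original text sits at a known position inside the resulting parse tree.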
Our algorithm in Swiftify tries to select the most suitable parse rule from a defined rule set. This web service converts Objective-C code fragments to Swift, so you can judge the quality of the conversion immediately for yourself.
Algorithm
We use an open-source Objective-C grammar. The detailed steps of the algorithm are:
1. Parse the input Objective-C code fragment with each of the following rules:
   translationUnit
   implementationDefinitionList
   interfaceDeclarationList
   expression
   compoundStatement
2. If the parse result for a rule contains no errors, return that rule at once.
3. Otherwise, select the rule whose parse error is nearest to the end of the input.
4. If two or more rules have the same nearest-to-the-end error location, select the rule with the fewest syntax errors.
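The selection steps can be sketched as below. This is not Swiftify's code: the ParseResult type and its fields are hypothetical, and "the parse error nearest to the end" is interpreted here as the offset of a rule's first syntax error, which is an assumption.

```java
import java.util.Arrays;
import java.util.List;

class RuleSelector {
    // Outcome of parsing the fragment with one start rule (hypothetical type).
    static final class ParseResult {
        final String rule;
        final int errorCount;
        final int firstErrorOffset; // only meaningful when errorCount > 0

        ParseResult(String rule, int errorCount, int firstErrorOffset) {
            this.rule = rule;
            this.errorCount = errorCount;
            this.firstErrorOffset = firstErrorOffset;
        }
    }

    static ParseResult selectBest(List<ParseResult> results) {
        ParseResult best = null;
        for (ParseResult r : results) {
            if (r.errorCount == 0) {
                return r; // an error-free rule wins at once
            }
            boolean laterError = best != null && r.firstErrorOffset > best.firstErrorOffset;
            boolean tieWithFewerErrors = best != null
                    && r.firstErrorOffset == best.firstErrorOffset
                    && r.errorCount < best.errorCount;
            if (best == null || laterError || tieWithFewerErrors) {
                best = r;
            }
        }
        return best; // null only when the list is empty
    }

    public static void main(String[] args) {
        List<ParseResult> results = Arrays.asList(
                new ParseResult("translationUnit", 3, 10),
                new ParseResult("expression", 2, 40),
                new ParseResult("compoundStatement", 1, 40));
        System.out.println(selectBest(results).rule);
    }
}
```

In the sample data, expression and compoundStatement tie on error position, so the one with fewer syntax errors is chosen.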
Demo
Here are test code samples that were parsed with different parse rules:
translationUnit: http://swiftify.me/clye5z
implementationDefinitionList: http://swiftify.me/fpasza
interfaceDeclarationList: http://swiftify.me/13rv2j
compoundStatement: http://swiftify.me/4cpl9n
Our algorithm is able to detect a suitable parse rule even for incorrect input:
compoundStatement with errors: http://swiftify.me/13rv2j/1

Incorrect class prediction using Weka

I am using the WEKA API weka-stable-3.8.1.
I have been trying to use the J48 decision tree (Weka's C4.5 implementation).
My data has around 22 features and a nominal class with 2 possible values : yes or no.
While evaluating with the following code :
Classifier model = (Classifier) weka.core.SerializationHelper.read(trainedModelDestination);
Evaluation evaluation = new Evaluation(trainingInstances);
evaluation.evaluateModel(model, testingInstances);
System.out.println("Number of correct predictions : "+evaluation.correct());
I get all predictions correct.
But when I try these test cases individually using :
for (Instance i : testingInstances) {
    double predictedClassLabel = model.classifyInstance(i);
    System.out.println("predictedClassLabel : " + predictedClassLabel);
}
I always get the same output, i.e. 0.0.
Why is this happening ?
If the provided snippet is indeed from your code, you seem to be always classifying the first test instance: "testingInstances.firstInstance()".
Rather, you may want to make a loop to classify each test instance.
for (Instance i : testingInstances) {
    double predictedClassLabel = model.classifyInstance(i);
    System.out.println("predictedClassLabel : " + predictedClassLabel);
}
Should have updated much sooner.
Here's how I fixed this:
During the training phase, the model learns from your training set. While learning from this set, it encounters categorical/nominal features as well.
Most algorithms require numerical values to work. To deal with this, the algorithm maps each nominal value to a specific numerical value.
Since this mapping is established during the training phase, the Instances object holds this information. During the testing phase you have to use the same Instances object that was created during the training phase. Otherwise, the classifier will not correctly map your nominal values to their expected values.
Note:
This kind of encoding gives biased training results in non-tree-based models; something like one-hot encoding should be used in such cases.
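To make the mapping concrete: for a nominal class, classifyInstance returns the index of the predicted value as a double, so 0.0 means "the first class value declared in the header". The self-contained sketch below does not use Weka; the decode helper and the two value lists are invented to show how the same index decodes differently when the declared order differs. In Weka itself, the equivalent lookup would be along the lines of testingInstances.classAttribute().value((int) predictedClassLabel).

```java
import java.util.Arrays;
import java.util.List;

class NominalIndexDemo {
    // Decodes a predicted class index against the ordered list of
    // nominal values declared in a dataset header (hypothetical helper).
    static String decode(double predictedIndex, List<String> declaredValues) {
        return declaredValues.get((int) predictedIndex);
    }

    public static void main(String[] args) {
        List<String> trainingHeader = Arrays.asList("yes", "no");
        List<String> rebuiltHeader = Arrays.asList("no", "yes"); // different order

        double predicted = 0.0;
        // The same numeric prediction means different labels per header.
        System.out.println(decode(predicted, trainingHeader));
        System.out.println(decode(predicted, rebuiltHeader));
    }
}
```

This is why a test set built with its own, differently ordered header can appear to produce wrong or constant predictions even though the model is fine.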

Explain the functionality of JSON

To make it easier to understand why I have so many problems with JSON, let me explain what my goal is:
I work with Google's App Engine, where I want to store data. The data looks like
user - username
question - question
date1 - date1
date2 - date2
An Android app has the "simple" job of sending the data that the user has entered and receiving the data from the complete database.
Ok, fine.
So I searched for a good "API" for that. The questions were "how can I read the data" and "how can I send it". The "simple" answer was: use JSON. Many people told me that.
The first step was to show the data from the database. In Python I wrote:
json.dumps({"info": [{'user': 'username1', 'question': 'question1', 'date1':'date1', 'date2':'date1'}, {'user': 'username2', 'question': 'question2', 'date1':'date2', 'date2':'date2'}]})
It works. On the client side I wrote this in Java:
JSONObject ob = new JSONObject(result);
JSONArray arNames = ob.getJSONArray("info");
for (int i = 0; i < arNames.length(); i++) {
    JSONObject c = arNames.getJSONObject(i);
    // the server sends the key "user"; there is no "name" key
    Log.i("name", c.getString("user"));
    Log.i("frage", c.getString("question"));
}
This also works.
But (and now the main question of this thread!):
Why do we use JSON as the format? I can build another simple "API" for this data without the JSON libraries and classes.
Example:
Suppose on the server side I only output:
!user:user;question:question;date1:date1;date2:date2
!user:user1;question:question1;data2:date3;date2:date3
... and so on ...
On the client side, the same:
[READ THE DATA WITH ClientHTTP]
String[] all = result.split("!");
// all[0] is the empty string before the first '!', so start at 1
for (int i = 1; i < all.length; i++) {
    String[] split2 = all[i].split(";");
    String[] user = split2[0].split(":");
    // user[1] now holds the user
    String[] question = split2[1].split(":");
    // question[1] now holds the question
    // ... AND SO ON!
}
So why should I use JSON? My approach does the same thing, just with my own syntax.
Thank you for your help.
JSON is a standard format, and its implementations make it easy to use: no split() or similar handling is necessary. It is also supported by all kinds of programming languages (like Python and Java in your own example), so it provides a simple way to exchange data between completely different systems.
It is also well thought out and can, for example, handle questions that contain ':' or ';', a case where your suggested solution would fail.
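The failure case is easy to reproduce. The sketch below parses one record of the hand-rolled key:value;key:value format using the same split() approach as in the question; the class name, method name and sample record are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class DelimiterFormatDemo {
    // Parses one record of the hand-rolled format: key:value;key:value
    static Map<String, String> parseRecord(String record) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String pair : record.split(";")) {
            String[] keyValue = pair.split(":");
            if (keyValue.length >= 2) {
                fields.put(keyValue[0], keyValue[1]);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // The question text itself contains both delimiters.
        String record = "user:anna;question:Tea: hot; or cold?;date1:d1";
        // The question comes back truncated at the first ':' inside it.
        System.out.println(parseRecord(record).get("question"));
    }
}
```

A JSON library escapes and quotes string values, so the same question text survives the round trip without any special handling in application code.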
I am not an expert on JSON, but there already was a thread explaining it (Google knows everything). Maybe you can find some help here:
What is JSON and why would I use it?
http://www.copterlabs.com/blog/json-what-it-is-how-it-works-how-to-use-it/
EDIT: I forgot to answer the question of why not to use your own function. Of course you can use it, and it works. But a lot of services hand you JSON; it is like a standard. Furthermore, there is a Java class for it, so you do not have to redo work that others have already done (see: http://goo.gl/9X4HU)
Best regards
Don't do it by hand; it's error-prone and violates DRY (don't repeat yourself). Instead:
On the server, use a REST framework that automatically produces JSON, for example RESTEasy. Search the net for examples.
On Android, use either the built-in support for JSON or, better, one of the well-known and tested libraries: GSON or Jackson. See some speed comparisons. Alternatively you can use Spring Android, which combines networking and JSON in one easy-to-use package.
I use JSON in Android because it is a lightweight data format which I can easily convert to Java objects using this Google library.
You always have 2 possibilities - to use some library, or to write the code by yourself. I'm not saying that using the library is always an option, but in many cases it can save your time and reduce errors. It's up to you to decide.

Output of JESS in Java

I want to send a "fact" to a JESS file from within Java and get the results back. I basically batch the JESS file and then send my data (called structure here) into the engine with .add(). I tried to get the JESS results, which should be a string, into a "Value".
Rete engine = new Rete();
engine.batch("file.clp");
Value AAA = null;
try {
    engine.add(structure);
    AAA = engine.eval("(run)");
} catch ...
System.out.println(AAA);
The result is always a number, although the result should be a string. I have worked it out in a simple java project and the AAA is returning the string, but here it is not working.
The (run) function returns the number of rules fired; that's the number you're seeing here.
The real results of running your program are the side effects it causes; getting the result in Java depends on what side effects you're expecting. That may mean anything from collecting output printed to the screen, to finding newly created facts in working memory, to having your Jess program call Java methods that affect the outside world. Without seeing the contents of file.clp I can't say what you're expecting, but all of these things are covered in the Jess manual; the phrases above are links to the appropriate sections. I'm happy to answer any follow-up questions you might have.

Invoking AS400 RPG From Java

I have very limited (zero) knowledge of AS400 and RPG, but we have an urgent requirement to invoke an RPG program from a Java class. I found that we can achieve this through JTOpen, but I am stuck at declaring the ProgramParameter list. I have the following information about the RPG program:
Program name: ZM30000R
Parameters:
Branch 7,0 (Numeric)
Account type 2 (01-cheque,02 savings)
Account Number 20 (character)
Error code 7 (character)
DR/CR indicator 1 (character D,C)
But I have no clue which parameters are input and which are output, or how to declare the ProgramParameter list. I have done it as below. I cannot test it either, because I don't have connectivity to these systems.
// Create AS400 Text objects for the different lengths
// of parameters you are sending in.
AS400Text branchTxt = new AS400Text(7);
AS400Text accntTypeTxt = new AS400Text(2);
AS400Text accntNumberTxt = new AS400Text(20);
AS400Text errorCodeTxt = new AS400Text(7);
AS400Text DCIndicatorTxt = new AS400Text(1);
// declare and instantiate your parameter list.
ProgramParameter[] parmList = new ProgramParameter[5];
// assign values to your parameters using the AS400Text class to convert to bytes
// the second parameter is an integer which sets the length of your parameter output
parmList[0] = new ProgramParameter( branchTxt.toBytes(branch),7);
parmList[1] = new ProgramParameter( accntTypeTxt.toBytes(accntTypeTxt),2);
parmList[2] = new ProgramParameter( accntNumberTxt.toBytes(accntNumberTxt),20);
parmList[3] = new ProgramParameter( errorCodeTxt.toBytes(""),7);
parmList[4] = new ProgramParameter( DCIndicatorTxt.toBytes(indicator),5);
Any help will be really highly useful.
Thanks and Regards,
Srinivas
Well, I do have a clue just from the description of the parameters. Branch, account type and account number are IN. You need that information for a financial booking or transaction. The error code is apparently OUT. In my experience with financial systems it is reasonably normal for the API to return the way the amount was booked. Normally one would use the sign, but in financial systems (D)ebit or (C)redit is the better way.
The API is very likely the API of a financial system. If that is true, then I'm missing the amount. Are you sure you have the complete description?
Notice that the first parameter is numeric. You're not in luck. The iSeries and RPG are not very forgiving about the type of a numeric. One can choose from Bit, Zoned, Packed, Decimal, Integer, Float and so on. If the RPG is really RPG instead of ILE RPG, then you can bring that down to Zoned, Packed and Byte.
I assume you have access to the iSeries. Then you can watch the program call, the debug information and the dump information. That will help you if you have to do "trial and error". If you don't have access, the road will be very hard. You'll receive an error in your Java class if the program call is not successful, but it will be hard to identify the real error without being able to look at the information on the iSeries yourself. Therefore, access is really required.
Your sample is mostly on the right track. But your branch parameter is numeric. So you should use AS400ZonedDecimal instead of AS400Text:
AS400ZonedDecimal branchNbr = new AS400ZonedDecimal(7,0);
The RPG program may be expecting packed instead of zoned. No big deal, just use AS400PackedDecimal instead.
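To show why the choice between zoned and packed matters, here is a self-contained sketch of the two byte layouts for a 7,0 field. It does not use JTOpen: toZoned and toPacked are illustrative stand-ins for what AS400ZonedDecimal and AS400PackedDecimal would produce, and the sign nibbles used (0xF positive, 0xD negative) are one common convention, stated here as an assumption.

```java
class DecimalEncodingDemo {
    // Zoned decimal: one byte per digit, zone 0xF in the high nibble,
    // the digit in the low nibble; the last byte's zone carries the sign.
    static byte[] toZoned(long value, int digits) {
        byte[] out = new byte[digits];
        long v = Math.abs(value);
        for (int i = digits - 1; i >= 0; i--) {
            out[i] = (byte) (0xF0 | (v % 10));
            v /= 10;
        }
        if (value < 0) {
            out[digits - 1] = (byte) (0xD0 | (out[digits - 1] & 0x0F));
        }
        return out;
    }

    // Packed decimal: two digits per byte, with the sign in the final nibble.
    static byte[] toPacked(long value, int digits) {
        int length = digits / 2 + 1;
        byte[] out = new byte[length];
        long v = Math.abs(value);
        int sign = value < 0 ? 0x0D : 0x0F;
        out[length - 1] = (byte) (((v % 10) << 4) | sign);
        v /= 10;
        for (int i = length - 2; i >= 0; i--) {
            long low = v % 10;
            v /= 10;
            long high = v % 10;
            v /= 10;
            out[i] = (byte) ((high << 4) | low);
        }
        return out;
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println("zoned : " + hex(toZoned(1234567, 7)));
        System.out.println("packed: " + hex(toPacked(1234567, 7)));
    }
}
```

The same branch number occupies 7 bytes when zoned and 4 bytes when packed, so sending one where the RPG program declares the other shifts every parameter that follows and garbles the value.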
As you construct your ProgramParameter objects, the constructor requirements differ depending on whether they are input or output parameters to your program. For input parameters, just pass the toBytes() results. There is no need to include the length. For output-only parameters, just pass the length.
I agree with Robert's answer that there is some missing information, but his assumption that the error code is an output parameter seems valid. I would guess, however, that the DCIndicator parameter is input, since your sample passes a value. For the error code parameter, after your program call, you'll need to extract the value and do something with it. Given what you have already, here is how the program call would work. Take note that I specified a library name of "MyLibrary". That is for example purposes. You will have to determine which library your program object is in.
ProgramCall pgm = new ProgramCall(as400, QSYSObjectPathName.toPath("MyLibrary","ZM30000R","PGM"), parmList);
if (pgm.run()) {
    String sErrorCode = (String) errorCodeTxt.toObject(parmList[3].getOutputData());
    // Do something with your output data.
}
else {
    AS400Message[] messageList = pgm.getMessageList();
    for (int i = 0; i < messageList.length; i++) {
        String sMessageID = messageList[i].getID();
        String sMessageText = messageList[i].getText();
        // Do something with the error messages
    }
}
Something else to consider is library lists. Does the RPG program expect certain libraries to be in the library list? If so, you should issue CommandCalls to add the libraries to the library list before calling the program.
FWIW: It's a lot easier to call IBM i host programs & service programs using PCML rather than ProgramCall.
The compilers will even generate the PCML document for you.
See http://javadoc.midrange.com/jtopen/com/ibm/as400/data/ProgramCallDocument.html for details.
If you don't have connectivity, then you really can't do what is asked. How would you test it? Are there numeric parameters, or are they all character?
