Mallet HMM Training Problems - java

I am struggling at the moment with Mallet's ridiculously poor documentation regarding HMMs. I have managed to import the data into instances (adapted from the ImportExample.java snippet), and I was wondering how they can be used to train an HMM model.
I first started by creating an HMM instance but I wasn't sure whether to go for:
HMM hmm = new HMM(instances.getDataAlphabet(), instances.getTargetAlphabet());
Or use the same data alphabet twice like so:
HMM hmm = new HMM(instances.getDataAlphabet(), instances.getDataAlphabet());
Either way when I get to
hmm.train(instances);
I get the following error:
cc.mallet.types.FeatureVector cannot be cast to cc.mallet.types.FeatureVectorSequence
I would be grateful for any help you can provide.
Cheers

I have managed to solve this particular problem and thought it may be useful to others with the same problem. There is a solution within the examples package in mallet: http://hg-iesl.cs.umass.edu/hg/mallet/file/83adf71b0824/src/cc/mallet/examples/TrainHMM.java
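The main problem was how the data was imported through the pipe: the cast error suggests the HMM expects each instance to be a sequence (a FeatureVectorSequence) rather than a single FeatureVector. Also, from what I can tell, it helps if your data is in this format: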
TOKEN TAG
TOKEN TAG
I assume you can have features in between the TOKEN and TAG, but I am not 100% sure. If anyone knows of any good examples and documentation about using HMMs within Mallet, please let me know.
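To make the layout above concrete, here is a minimal plain-Java sketch that reads one TOKEN TAG pair per line and groups the pairs into sequences. It does not call Mallet at all. The blank line as a sequence separator, the whitespace delimiter, and the names `TaggedSequenceReader` and `parse` are assumptions made for illustration; check the linked TrainHMM.java for how Mallet itself reads the file.

```java
import java.util.ArrayList;
import java.util.List;

class TaggedSequenceReader {

    /**
     * Splits text into sequences of {token, tag} pairs.
     * Assumption: a blank line ends one sequence and starts the next.
     */
    static List<List<String[]>> parse(String text) {
        List<List<String[]>> sequences = new ArrayList<>();
        List<String[]> current = new ArrayList<>();
        for (String rawLine : text.split("\\r?\\n")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                // Blank line: close the current sequence if it has content.
                if (!current.isEmpty()) {
                    sequences.add(current);
                    current = new ArrayList<>();
                }
                continue;
            }
            String[] fields = line.split("\\s+");
            if (fields.length < 2) {
                throw new IllegalArgumentException("Expected TOKEN TAG, got: " + line);
            }
            // First field is the token, last field is the tag.
            // Anything in between (extra features) is ignored in this sketch.
            current.add(new String[] { fields[0], fields[fields.length - 1] });
        }
        if (!current.isEmpty()) {
            sequences.add(current);
        }
        return sequences;
    }

    public static void main(String[] args) {
        String sample = "the DET\ndog NOUN\n\nit PRON\nbarks VERB\n";
        for (List<String[]> sequence : parse(sample)) {
            StringBuilder sb = new StringBuilder();
            for (String[] pair : sequence) {
                sb.append(pair[0]).append('/').append(pair[1]).append(' ');
            }
            System.out.println(sb.toString().trim());
        }
    }
}
```

The point of the grouping is that each training instance is a whole sequence of tokens, not a single token, which matches what the cast error in the question complains about.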

Related

How to write the whole code regarding wrapping text around image

I'm trying to implement 'wrap text around image', but I'd rather write it step by step on my own so I can understand it fully.
Can somebody tell me how to do so? Any websites worth recommending regarding this issue?
I think FlowTextView is exactly what you need.

deeplearning4j generate response to input

I have recently been trying to learn DL4J but have run into some issues. They have an example of a neural network generating Shakespeare-like text based off an input character, but I can't seem to find anything that would indicate a possible way of creating a response to an input statement.
I would like to use an input string such as "Hello" and have it generate a response of varying length depending on the input. I would like to know if this is possible using an LSTM and to be pointed in the right direction, as I have no idea where to even start.
We have plenty of documentation on this, actually. This gives you a layout of what an RNN looks like:
http://deeplearning4j.org/usingrnns
The model in that example is character-level; in general, though, what you want is question answering. You may want to look at an architecture like this: https://cs.umd.edu/~miyyer/pubs/2014_qb_rnn.pdf
If you are completely new to NLP, I would look at this class:
https://www.youtube.com/playlist?list=PLhVhwi0Pz282aSA2uZX4jR3SkF3BKyMOK
It covers question answering as well.

Possible alternatives or solution to reading and writing large objects with Gson java

I'm attempting to read and write an object through Gson. Early in the project this was completely viable and working great, but as I wrote more data for that object I eventually ran across something along the lines of this:
I can't seem to grab the full stack trace, seeing as it overflows my console within milliseconds, but I've pastebinned everything my console could grab: http://pastebin.com/v36d5qua
If there is a solution to this, or possibly just a better api for this purpose I would really appreciate some advice.
Current usage: http://pastebin.com/2Yk2v0Tm
GsonUtil.save(player, Player.class, new File("./resources/players/"+player.getId()+".json"));
P.S. I'm new to Java and this site in general; if I have misleading tags, title, etc., please let me know.
Don't use gson. It's slow, it's buggy, it's inconvenient to use. Just use org.json - http://theoryapp.com/parse-json-in-java/

Need to export data in CCR format from Java

I'm working on a project which needs to export EHR information in CCR format. I must use Java. The problem that I'm facing is that I can't find an easy way to do it.
The better way to do what I'm doing would be to export as CDA using something like CDAPI but it's overly expensive (30k/year) and complicated. However it shows an example of what I'd like. Something like:
CCR ccr = new CCR();
...
out.print(ccr.toString()); // Returns XML
But it's as if this doesn't exist.
There's CCR4J but it can only read XML files and make Java objects. Not the other way around.
There's Google Health (now discontinued) which might have what I'm looking for, but I can't even figure out how to use it.
There's CCR Binder which has some convenience methods for creating CCR XML from code built on top of Google Health API, but I can't figure out how to use that either.
I could also just read the ASTM CCR Spec and implement something on my own which at this point begins to look like the faster option.
Now I would really like to stay away from Google Health. It seems to be overkill for my task, as is exporting to CDA. Any comments and suggestions are appreciated.
Just for the benefit of people searching for the same info. Here's the CCR Spec.
Sorry for this (very) late answer, but I stumbled upon this post because it still ranks high on Google if you search for Java and CCR. To prevent others from giving up too quickly, I have to correct you:
With CCR4J you CAN create CCRs from Java Objects (since 2008) and it works like a charm! Not just parsing it from a given file.
Perhaps you just didn't figure out how to use the library back then?
So here's a little example (not a valid CCR!) for the next person who stumbles over this post trying to create a CCR with this library:
//New XML-Document
ContinuityOfCareRecordDocument newDoc = ContinuityOfCareRecordDocument.Factory.newInstance();
//New CCR
ContinuityOfCareRecord newCCR = ContinuityOfCareRecord.Factory.newInstance();
//Add Object ID
newCCR.setCCRDocumentObjectID("asdasdbdffdjg343204dsss3490");
//Add new Language
newCCR.addNewLanguage().setText("English");
//Add new Body
newCCR.addNewBody();
//Add new Problem with Code
newCCR.getBody().addNewProblems().addNewProblem().addNewDescription().addNewCode().setCodingSystem("ICD");
newCCR.getBody().getProblems().getProblemArray(0).getDescription().getCodeArray(0).setValue("1225-55558");
//Add CCR to document and save
newDoc.setContinuityOfCareRecord(newCCR);
newDoc.save(new File("My-Generated-CCR.xml"));
I ended up doing something like this:
Video: Quick and Dirty CCR
To summarize: use JAXB to generate the classes, then marshal them using the JAXB marshaller.
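For the hand-rolled route mentioned in the question (building the XML yourself instead of using JAXB or CCR4J), a rough sketch with the JDK's built-in DOM and Transformer APIs could look like the following. This is not JAXB and not CCR4J. The element names are guessed from the accessor names in the CCR4J snippet above, no XML namespace is set, and the result is not a valid CCR; the class name `CcrXmlSketch` and method `buildXml` are made up for illustration. The ASTM spec remains the authority on the real structure.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class CcrXmlSketch {

    /** Builds a tiny CCR-like XML fragment. Element names are assumptions, not taken from the spec. */
    static String buildXml(String objectId, String language) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();

            Element root = doc.createElement("ContinuityOfCareRecord");
            doc.appendChild(root);

            Element id = doc.createElement("CCRDocumentObjectID");
            id.setTextContent(objectId);
            root.appendChild(id);

            Element lang = doc.createElement("Language");
            Element text = doc.createElement("Text");
            text.setTextContent(language);
            lang.appendChild(text);
            root.appendChild(lang);

            // Empty body; problems, medications, etc. would be added here.
            root.appendChild(doc.createElement("Body"));

            // Serialize the DOM tree to a string without the XML declaration.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Parser/transformer configuration problems are not expected with JDK defaults.
            throw new IllegalStateException("Could not build XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildXml("example-object-id", "English"));
    }
}
```

Compared with generated classes, this gives no schema checking at all, so it only makes sense for a small, fixed subset of the format.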

HTML Parser to extract text out of the body (in java)

I am working on a project that requires me to carry out some text manipulation on the text that I obtain from web pages.
Now, the first step towards doing this would be for me to find a parser that would extract the required body text ignoring the redundant information. I am not sure how I would do this, since I am extremely new to programming. I would really appreciate any help I could get.
Thanks in advance
I found this HTML parser very useful. It also provides sample code: http://jericho.htmlparser.net/docs/index.html
I am just now doing it using HTMLParser, available at Sourceforge:
http://sourceforge.net/projects/htmlparser/
Seems very easy and straightforward, but since you claim to be new at this, here is an example with source code:
http://kickjava.com/src/org/htmlparser/parserapplications/StringExtractor.java.htm
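If you want to see the basic idea before pulling in either library, here is a minimal sketch that uses the HTML parser bundled with the JDK (the Swing `ParserDelegator`), not Jericho or HTMLParser. It simply collects every text chunk the parser reports, so it also picks up text outside the body such as the page title, and the inserted spaces can split a word that is partly wrapped in a tag. The class name `HtmlTextSketch` and method `extractText` are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

class HtmlTextSketch {

    /** Returns the text chunks found in the HTML, joined by single spaces. */
    static String extractText(String html) {
        StringBuilder sb = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                // Called once per run of text between tags.
                sb.append(data).append(' ');
            }
        };
        try {
            // 'true' tells the parser to ignore any charset declared in the document.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Collapse runs of whitespace left over from the markup.
        return sb.toString().replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        String html = "<html><body><h1>Title</h1><p>Hello <b>world</b></p></body></html>";
        System.out.println(extractText(html));
    }
}
```

The dedicated libraries mentioned above are more tolerant of messy real-world pages, so treat this only as a way to understand what "extracting the text" means.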
