Has anyone ever persisted a training set for CI-Bayes? I have sample code from this site: http://www.theserverside.com/news/thread.tss?thread_id=49773
Here is the code:
FisherClassifier fc=new FisherClassifierImpl();
fc.train("The quick brown fox jumps over the lazy dog's tail","good");
fc.train("Make money fast!", "bad");
String classification=fc.getClassification("money", "unknown"); // should be "bad"
So I need to be able to store the training set in a local file.
Has anyone ever done this before?
To persist a Java object in a local file, its class must first implement the Serializable interface.
import java.io.Serializable;
public class MyClass implements Serializable {...
Then the class from which you would like to persist this training set should include a method like:
// needs imports: java.io.FileOutputStream, java.io.IOException, java.io.ObjectOutputStream
public void persistTrainingSet(FisherClassifier fc) {
    String outputFile = "<path/to/output/file>"; // placeholder, replace with a real path
    // try-with-resources closes the streams even if writeObject throws
    try (FileOutputStream fos = new FileOutputStream(outputFile);
         ObjectOutputStream oos = new ObjectOutputStream(fos)) {
        oos.writeObject(fc);
    }
    catch (IOException e) {
        //handle exception
    }
}
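For completeness, here is a minimal sketch of the matching load step, written as a generic save/load pair. The class name ObjectFiles and its method names are made up for illustration. Whether this works for a CI-Bayes classifier depends on the implementation class (and everything it references) being Serializable, which I have not verified.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of CI-Bayes.
class ObjectFiles {
    // Writes one serializable object to the given file, replacing any existing content.
    static void save(Serializable obj, Path file) throws IOException {
        try (ObjectOutputStream oos = new ObjectOutputStream(Files.newOutputStream(file))) {
            oos.writeObject(obj);
        }
    }

    // Reads back the object written by save(); the caller casts it to the expected type.
    static Object load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(Files.newInputStream(file))) {
            return ois.readObject();
        }
    }
}
```

Under that assumption, usage would look like ObjectFiles.save(fc, path) after training and fc = (FisherClassifier) ObjectFiles.load(path) at startup.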
I have. After doing a couple of projects with CI-Bayes, I would recommend you look elsewhere (of course, this was a long time ago). It is a very bad idea to use an inference engine that needs to be trained before each use, and once you really consider state management, it gets complicated (e.g. do you want to store just the training data? Or perhaps the trained distributions? Chains?).
CI-Bayes is also a rather convoluted codebase. It was modeled on some Python code that appeared in a book about intelligence. The Java version is not very well designed. It was also not developed with TDD and has no JavaDoc to speak of.
That said, you can get a simple classifier going pretty quickly. The longer-term concern, though, is the one you asked about.
Does anyone know how I can include encryption and decryption in my code? I am using FileInputStream and FileOutputStream for serialized files. I have an ArrayList of students and an ArrayList of books. I can save them to and read them from their individual files, but for security I want to encrypt and decrypt them. How do I do that?
private static void ReadBook() {
try {
FileInputStream fi = new FileInputStream("bookData.ser");
ObjectInputStream oi = new ObjectInputStream(fi);
bookList = (ArrayList<Book>) oi.readObject();
oi.close();
} catch (Exception e) {
e.printStackTrace();
}
}
protected static void SaveBook(ArrayList<Book> books) {
ArrayList<Book> tempbookList = books;
try {
FileOutputStream fs = new FileOutputStream("bookData.ser");
ObjectOutputStream os = new ObjectOutputStream(fs);
os.reset();
os.writeObject(tempbookList);
os.close();
} catch (Exception e) {
e.printStackTrace();
}
}
private static void ReadStudent() {
try {
FileInputStream fi = new FileInputStream("studentData.ser");
ObjectInputStream oi = new ObjectInputStream(fi);
studentList = (ArrayList<Student>) oi.readObject();
oi.close();
} catch (Exception e) {
e.printStackTrace();
}
}
protected static void SaveStudent(ArrayList<Student> students) {
ArrayList<Student> tempstudentList = students;
try {
FileOutputStream fs = new FileOutputStream("studentData.ser");
ObjectOutputStream os = new ObjectOutputStream(fs);
os.reset();
os.writeObject(tempstudentList);
os.close();
} catch (Exception e) {
e.printStackTrace();
}
}
What you want isn't possible without an external secret. The problem is, you can 'encrypt' this, but it's not actually encryption (just obfuscation) unless there is a key involved, and the point of a key is: If you know it, you can decrypt it.
So, where does the key come from? You can't hardcode it into your source (sources can be decompiled or just opened with a hex editor), you can't load it off of a file (because anybody that can fetch the encrypted file can also fetch the file with the key in it and thus now they have all they need to decrypt the data). You can try to add layers into this, but it's turtles all the way down: If the application itself can obtain the secret, and the unauthorized person has full access to the computer that the application runs on, this is just not possible.
One way out is to actually say that the owner of the computer doesn't own it. This gets us into messing with security chips such as Apple's T2 or the Windows ecosystem's TPM. You can't interact with these from Java without native code.
Another, much simpler way out is to ensure that the application cannot decrypt the data unassisted. Simply ask the user for a password every time they start up the app. As long as the app is open, any hacker can just memory-dump the VM and get the data, but once the app is closed and the memory is cleaned up (a little tricky at times), it's a secret again.
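As a rough sketch of the password approach: derive an AES key from the password with PBKDF2, serialize the list to bytes, and encrypt those bytes with AES-GCM. The class name PasswordSealedStore, the method names, the iteration count and the salt + IV + ciphertext layout are all my own choices for illustration, not anything prescribed; only the JDK classes are real.

```java
import java.io.*;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper: output layout is salt (16 bytes) + IV (12 bytes) + ciphertext.
class PasswordSealedStore {
    private static final int SALT_BYTES = 16;
    private static final int IV_BYTES = 12;
    private static final int ITERATIONS = 100_000; // illustrative value, tune for your hardware

    // Stretches the password into a 256-bit AES key; the salt makes each file's key unique.
    private static SecretKey deriveKey(char[] password, byte[] salt) throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, 256);
        byte[] raw = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA256").generateSecret(spec).getEncoded();
        return new SecretKeySpec(raw, "AES");
    }

    // Serializes the value, then encrypts the resulting bytes.
    static byte[] seal(Serializable value, char[] password) throws IOException, GeneralSecurityException {
        SecureRandom random = new SecureRandom();
        byte[] salt = new byte[SALT_BYTES];
        byte[] iv = new byte[IV_BYTES];
        random.nextBytes(salt);
        random.nextBytes(iv);

        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(plain)) {
            oos.writeObject(value);
        }

        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, deriveKey(password, salt), new GCMParameterSpec(128, iv));
        byte[] encrypted = cipher.doFinal(plain.toByteArray());

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(salt);
        out.write(iv);
        out.write(encrypted);
        return out.toByteArray();
    }

    // Decrypts and deserializes; a wrong password or a modified file fails the GCM tag check.
    static Object open(byte[] sealed, char[] password)
            throws IOException, GeneralSecurityException, ClassNotFoundException {
        byte[] salt = Arrays.copyOfRange(sealed, 0, SALT_BYTES);
        byte[] iv = Arrays.copyOfRange(sealed, SALT_BYTES, SALT_BYTES + IV_BYTES);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, deriveKey(password, salt), new GCMParameterSpec(128, iv));
        int offset = SALT_BYTES + IV_BYTES;
        byte[] plain = cipher.doFinal(sealed, offset, sealed.length - offset);
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(plain))) {
            return ois.readObject();
        }
    }
}
```

The byte array from seal() can be written with Files.write and read back with Files.readAllBytes in place of the plain bookData.ser and studentData.ser files. This only protects the file at rest; it does nothing against someone inspecting the running process.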
First think about the higher-level question of exactly which scenarios you want to protect against and how you want to protect against them. Only after that is it time to think about how you implement such things.
Seriously: write down James Bond scenarios. Rate them according to how much you want to protect against them (hint: it won't come for free).
For example: "If the computer is stolen, then as long as the power was pulled and the thieves aren't doing crazy stuff such as pulling the memory chips and blasting a can of CO2 at them to freeze them, I want the data to be gone." That's workable. But note that this is far better achieved by the user themselves: have the OS apply full disk encryption. The OS vendors will do a far better job than you can, and they DO get to enjoy the benefits of security chips (TPM or T2, for example).
Another example: "I want to prevent someone with a little knowledge and access to the room from looking at the data." That's VERY tricky. They can use physical keyloggers (stick a tiny USB dongle in between keyboard and system, or install a camera pointing at screen and keyboard) or just open the computer up and install a custom boot. If you want to keep those out, we need to talk about securing the case, or protecting the room itself with physical alarm systems, custom devices, or other extreme measures. It's good to know that this particular threat (the so-called 'evil maid attack') is most likely not what you want to protect against. Security involves tradeoffs, and to properly assess tradeoffs you need these scenarios.
I am having trouble deserializing objects that contain an enum. The object serializes without complaint, but I get an InvalidObjectException when I deserialize the object. The exception message says that there is "No enum constant com.mypackagname."
I have isolated and reproduced the problem by creating some test code based on the testSerialization() method in SerializationTest.java.
public class SerializationTest {
private static final String TEST_FILE_NAME = "serialization-test.bin";
public enum Gender { MALE, FEMALE }
public void testEnumSerialization() throws IOException, ClassNotFoundException {
Gender gender = Gender.MALE;
// Save the enum to a file.
ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(TEST_FILE_NAME));
out.writeObject(gender);
out.close();
// Read back the enum.
ObjectInputStream in = new ObjectInputStream(new FileInputStream(TEST_FILE_NAME));
Gender gender2 = (Gender) in.readObject();
in.close();
}
}
I have discovered that if I add a string value to the enum initialization in the generated Objective-C code, the deserialization works fine. The resulting initialize method in Objective-C looks like this:
+ (void)initialize {
if (self == [SerializationTest_Gender class]) {
JreEnum(SerializationTest_Gender, MALE) = new_SerializationTest_Gender_initWithNSString_withInt_(@"MALE", 0);
JreEnum(SerializationTest_Gender, FEMALE) = new_SerializationTest_Gender_initWithNSString_withInt_(@"FEMALE", 1);
J2OBJC_SET_INITIALIZED(SerializationTest_Gender)
}
}
Note that I added the @"MALE" and @"FEMALE"; the default from the j2objc output is @"".
I have two questions. (1) Is this the correct way to enable a round trip serialization/deserialization of enums? (2) If so, is there a way to have j2objc automatically populate the string constants in the enum rather than coding them by hand?
Thanks for any help you can provide.
We probably broke this with a recent change eliminating redundant enum constant name strings. We had the name defined both in the enum's class initializer and in its metadata, plus we had an important request to stop making enum constants easily discovered in app binaries (apparently tech writers have been known to dump early access binaries and run strings on them to get scoops on any new features). Now the constant name is only in the metadata (no redundancy), and if an app builds with --strip-reflection, the enum has no metadata and the name becomes the enum class plus the constant's ordinal. However, serialization support was overlooked since Google apps use protocol buffers instead (faster and less version-sensitive).
Thanks for the excellent test case, which will make it easier to fix. Please file a bug if you want to be notified when this is fixed.
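For reference, this is the behavior a standard JVM gives for the same round trip, which is why a missing constant name breaks things: Java serialization writes an enum constant by its name and looks the constant up again on read, so the deserialized value is the very same instance. A small sketch (the class name EnumRoundTrip is made up; it uses an in-memory buffer instead of a file):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class EnumRoundTrip {
    enum Gender { MALE, FEMALE }

    // Serializes the value to a byte array and reads it straight back.
    static Object roundTrip(Serializable value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Gender copy = (Gender) roundTrip(Gender.MALE);
        // Enum constants are resolved by name, so identity is preserved.
        System.out.println(copy == Gender.MALE); // prints "true"
    }
}
```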
I'm still working on the project I already needed a bit of help with:
JavaFX - TableView doesn't update items
Now I want to understand how this whole Serialization process in Java works, because unfortunately, I don't really get it now.
Before I go on: first of all, I'm a student, not a professional. Second, I'm not familiar with using DBs, XML or JSON, so I'd just like to find a solution to my approach. No matter how inelegant it might be in the end, it just needs to work. So please don't feel offended if I reject any advice to use other techniques.
So here's what I want:
Saving three different class objects to separate files BUT maintaining backward compatibility for each of them. The objects are Settings, Statistics and a "database" object containing a list of all the words added to it. In the future I may add more statistics or settings, which means adding new variables, mostly of type IntegerProperty or DoubleProperty.
Now the question is: is it possible to load old version saved files and then during the process just initiate new variables not found in the old version with just null but keep the rest as it has been saved?
All I know is that the first requirement is not to alter the serialVersionUID.
Another option would be saving the whole Model object (which contains the three objects mentioned before), so I just have to implement stuff for one class instead of three. But how would that work concerning backward compatibility? I mean, the class itself would not change, but its attributes would change in their own class structure.
Finally, what approach should I go for? And most of all, how do I do this while maintaining backward compatibility at the same time? I do best with some concrete examples rather than plain theory.
Here are two example methods, if it's of any help. I already have methods for each class to write and read an object.
public static void saveModel(Model model, String destination) throws IOException
{
    // try-with-resources declares the streams locally and closes both,
    // even if writeObject throws
    try (FileOutputStream fileOutput = new FileOutputStream(destination);
         ObjectOutputStream objectOutput = new ObjectOutputStream(fileOutput))
    {
        objectOutput.writeObject(model);
    }
    catch (IOException e)
    {
        e.printStackTrace();
    }
}
public static Settings readSettings(String destination) throws IOException, FileNotFoundException
{
    Settings s = null;
    // try-with-resources declares the streams locally and closes both automatically
    try (FileInputStream fileInput = new FileInputStream(destination);
         ObjectInputStream objectInput = new ObjectInputStream(fileInput))
    {
        Object obj = objectInput.readObject();
        if (obj instanceof Settings)
        {
            s = (Settings) obj;
        }
    }
    catch (IOException | ClassNotFoundException e)
    {
        e.printStackTrace();
    }
    // null if the file could not be read or did not contain a Settings object
    return s;
}
Tell me if you need more of my current code.
Thank you in advance!
... you must be this tall
The best advice for serialisation is to avoid it for application persistence, especially if backwards compatibility is a desired property of your application.
Answers
Is it possible to load old version saved files and then during the process just initiate new variables not found in the old version with just null but keep the rest as it has been saved?
Yes. Deserialising objects saved using previous versions of the class into a new version of this class will work, and keep all the saved data, only if:
fully qualified name of the class has not changed (same name and package)
previous and current class have exactly the same serialVersionUID; if one of the versions is missing it, it will be calculated as a 'hash' of all fields and methods and upon a mismatch deserialisation will fail.
inheritance hierarchy has not changed for that class (the same ancestors)
no fields have been removed in the new version of the class
no fields have become static
no fields have become transient
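To make the rules above concrete, here is a small sketch under my own assumptions: the class name VersionedSettings and its fields are hypothetical. A field that is missing from an old file is left at its default (null, 0 or false) because field initializers do not run during deserialization, so a private readObject hook is one place to fill in a sensible value. The demo simulates an old file by saving an object whose new field is null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical settings class, not the asker's real one.
class VersionedSettings implements Serializable {
    // Pinned so that adding fields later does not change the computed id.
    private static final long serialVersionUID = 1L;

    int fontSize = 12;       // present since the first version
    String theme = "light";  // imagined as added in a later version

    // Called by ObjectInputStream instead of the default field restore.
    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();   // restores every field found in the stream
        if (theme == null) {      // field was absent (or null) in the saved data
            theme = "light";
        }
    }
}

class VersionedSettingsDemo {
    // Serializes to memory and reads back, standing in for save and load.
    static VersionedSettings roundTrip(VersionedSettings s) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(s);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (VersionedSettings) in.readObject();
        }
    }
}
```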
I just have to implement stuff for one class instead of three. But how would that work then concerning backward compatibility?
Yes, provided that the Model class itself and the classes of all its fields adhere to the rules above.
Finally, what approach should I go for? And most of all, how do I do this while maintaining backward compatibility at the same time?
As long as you can guarantee all of the above rules forever, you will be backwards compatible.
I am sure you can appreciate that "forever", or even "for the next year", can be very hard to guarantee, especially in software.
This is why people do application persistence using data exchange formats that are more robust than the binary representation of serialised Java objects.
Raw data for the table could be saved using anything from a CSV file to JSON docs stored as files or as documents in a NoSQL database.
For settings and configuration, have a look at Java's Properties class, which can store and load properties to and from *.properties or *.xml files. Separately, have a look at YAML.
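A minimal sketch of the Properties route, assuming made-up key names and a helper class called SettingsFile. Keys that are missing from an older file fall back to the defaults passed to the constructor, which is what gives you backward compatibility here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical helper around java.util.Properties.
class SettingsFile {
    // Writes all key/value pairs as a plain-text *.properties file.
    static void save(Properties settings, Path file) throws IOException {
        try (Writer out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            settings.store(out, "application settings");
        }
    }

    // Loads the file; getProperty falls back to 'defaults' for keys the file lacks.
    static Properties load(Path file, Properties defaults) throws IOException {
        Properties settings = new Properties(defaults);
        try (Reader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            settings.load(in);
        }
        return settings;
    }
}
```

Numeric values are stored as strings, so something like Integer.parseInt(settings.getProperty("fontSize")) is needed on the way back in.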
Finally, for backwards compatibility, have a look at FlatBuffers.
The field of application persistence is very rich and ripe, so happy exploring.
I want to check if a Windows Workstation is logged on or off. I've found a solution in C#:
public class CheckForWorkstationLocking : IDisposable
{
private SessionSwitchEventHandler sseh;
void SysEventsCheck(object sender, SessionSwitchEventArgs e)
{
switch (e.Reason)
{
case SessionSwitchReason.SessionLock: Console.WriteLine("Lock Encountered"); break;
case SessionSwitchReason.SessionUnlock: Console.WriteLine("UnLock Encountered"); break;
}
}
public void Run()
{
sseh = new SessionSwitchEventHandler(SysEventsCheck);
SystemEvents.SessionSwitch += sseh;
}
#region IDisposable Members
public void Dispose()
{
SystemEvents.SessionSwitch -= sseh;
}
#endregion
}
But in the end I need this boolean in my Java program.
I already tried the following:
I started both programs, and the C# program writes into a file that Java checks every few seconds to see whether the data has changed (needless to say, this solution is slow and insufficient).
Another solution would be :
Java starts the C# .exe which waits until Java connects to it through sockets and they share the data over the open connection.
Is there a better way to solve this with less effort than with this socket interface solution?
You don't have to go to any complicated lengths to get this done. It can be quite simple.
Save the boolean into a file in C#, then have a file watcher watching the directory in Java. When there is a change, the Java side can read the file and find the value of the boolean. Unlike a while loop that keeps checking the file, such a solution is not expensive and does not eat up a lot of CPU cycles.
The beginnings of the Java code can be as simple as
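import static java.nio.file.StandardWatchEventKinds.*;
import java.io.IOException;
import java.nio.file.*;

Path dir = ...;
try {
    // the watcher must exist before the directory can be registered with it
    WatchService watcher = FileSystems.getDefault().newWatchService();
    WatchKey key = dir.register(watcher,
        ENTRY_CREATE,
        ENTRY_DELETE,
        ENTRY_MODIFY);
} catch (IOException x) {
    System.err.println(x);
}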
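A fuller sketch of the same idea, with the event loop included. The class name LockStateWatcher and the file format (the text "true" or "false" written by the C# side) are assumptions of mine, not something the question specifies.

```java
import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_MODIFY;

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

// Hypothetical helper; assumes the C# program writes "true" or "false" into the file.
class LockStateWatcher {
    // Parses the current content of the flag file.
    static boolean readFlag(Path file) throws IOException {
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        return Boolean.parseBoolean(text);
    }

    // Blocks the calling thread and prints the flag each time the file is created or modified.
    static void watch(Path file) throws IOException, InterruptedException {
        Path target = file.toAbsolutePath();
        Path dir = target.getParent();
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher, ENTRY_CREATE, ENTRY_MODIFY);
            while (true) {
                WatchKey key = watcher.take(); // sleeps until the directory changes
                for (WatchEvent<?> event : key.pollEvents()) {
                    Object context = event.context(); // relative Path, or null on overflow
                    if (context instanceof Path && dir.resolve((Path) context).equals(target)) {
                        System.out.println("locked = " + readFlag(target));
                    }
                }
                if (!key.reset()) {
                    break; // the directory is no longer accessible
                }
            }
        }
    }
}
```

One caveat: an event can arrive while the writer still has the file open, so a real version would probably retry the read on failure.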
There are lots of possible solutions to this issue. My personal preference would be to use a message queue to post messages between the applications. (http://zeromq.org/ is light and would be my recommendation)
The advantage of this approach is that the two applications are decoupled and it does not rely on the filesystem, which is notoriously prone to errors.
To call a function that is written in C# (or any .NET library function) from Java, you can use JNI.
However, all JNI will do is get you to C/C++. You will need to write a simple managed C++ object that can forward requests from the unmanaged side to the .NET library.
Example Here
I was planning to use Java serialization for creating data files from my objects (and I implemented it, my bad, because I hadn't read anything on the subject first), but as I've just noticed, it's fairly SLOW on Android, even on powerful devices.
Simply put, I have an application that stores dozens of Stroke objects, and I need to save and load those objects for later use. My current setup looks like this:
public class StrokePoint extends PointF implements Serializable { ... }
public class CardinalStroke extends Stroke implements Serializable {
...
protected ArrayList<StrokePoint> mPoints = new ArrayList<StrokePoint>();
...
}
Finally, I have a class that contains the stroke objects and some other things. It looks like this:
public class NoteElement implements Serializable {
...
private ArrayList<IStroke> mStrokes = new ArrayList<IStroke>();
...
}
I'm using the following code for serialization and saving the serialized data into an SQLite DB (should I keep using SQLite for this, or is it better to save the data into a file for future use?):
private void saveElement(NoteElement element) {
    try {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(baos);
        oos.writeObject(element);
        oos.close(); // make sure any buffered bytes reach baos before it is read
        mProvider.insertOrUpdateElement(mCurrentModel.getBookId(), element.getUuid(), baos.toByteArray());
        element.setIsModified(false);
    } catch (IOException e) {
        e.printStackTrace(); // do not silently swallow serialization failures
    }
}
When I run saveElement, serialization takes at least 500+ ms even with non-complex elements. I could have very complex NoteElements, which would probably take 2000+ ms, and that is not acceptable for serializing this data. Saving into SQLite takes roughly a quarter of the serialization time.
So my question is: how can I improve serialization speed, or should I use a completely different technique for this kind of data storage? I read about Parcelable, but that can't be used for data storage. Maybe there is some built-in alternative, or should I implement my own data serializer classes, which could be a pain at some point? I'm open to any kind of suggestions. Thanks for your time!
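To illustrate what "my own data serializer classes" could look like, here is a rough sketch that writes points as raw floats with DataOutputStream. The class name StrokeCodec and the format (an int count followed by x/y float pairs) are invented for this example, and points are modelled as float[2] rather than StrokePoint so the snippet stands alone. I have not measured it against ObjectOutputStream on Android.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical flat binary format: int count, then 'count' pairs of floats (x, y).
class StrokeCodec {
    // Encodes the points into 4 + 8 * count bytes.
    static byte[] encode(List<float[]> points) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream(4 + points.size() * 8);
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(points.size());
            for (float[] p : points) {
                out.writeFloat(p[0]); // x
                out.writeFloat(p[1]); // y
            }
        }
        return bytes.toByteArray();
    }

    // Decodes a byte array produced by encode().
    static List<float[]> decode(byte[] data) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            List<float[]> points = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                float x = in.readFloat();
                float y = in.readFloat();
                points.add(new float[] { x, y });
            }
            return points;
        }
    }
}
```

The resulting byte array can go into the same SQLite blob column that currently holds the serialized object.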
This question is not about slow SQLite insertion; it's about slow serialization. I wonder who marked it as a duplicate of the slow SQLite insertion question. Weird.