Text file to string matrix java - java

I am making an auto chat client like Cleverbot for school. I have everything working, but I need a way to make a knowledge base of responses. I was going to make a matrix with all the responses that I need the bot to say, but I think it would be hard to edit the code every time I want to add a response to the bot. This is the code that I have for the knowledge base matrix:
`String[][] Database={
{"hi","hello","howdy","hey"},//possible user input
{"hi","hello","hey"},//response
{"how are you", "how r u", "how r you", "how are u"},
{"good","doing well"}
};`
How would I make a matrix like this from a text file? Is there a better way than reading from a text file to deal with this?

You could...
Use a properties file
A properties file is something that can easily be read into Java (and written back out, but you're not interested in that). The class java.util.Properties does the loading for you, and you then access the result like a Map.
hello.input=hi,hello,howdy,hey
hello.output=hi,hello,hey
Note the matching formats there. This has its own set of problems and challenges to work with, but it lets you easily pull things in to and out of properties files.
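As a rough sketch of how the matching `.input`/`.output` keys could be turned back into arrays: the class name `ResponseTable`, the method `load`, and the rule "skip any topic that has no `.output` key" are assumptions made for this illustration, not part of the original answer.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;
import java.util.TreeSet;

// Hypothetical helper: groups "<topic>.input" / "<topic>.output" pairs.
final class ResponseTable {

    /** Maps each topic (e.g. "hello") to a two-row array: {inputs, outputs}. */
    static Map<String, String[][]> load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        Map<String, String[][]> table = new LinkedHashMap<>();
        // Sort the key names so the topics come out in a stable order.
        for (String name : new TreeSet<>(props.stringPropertyNames())) {
            if (!name.endsWith(".input")) {
                continue;
            }
            String topic = name.substring(0, name.length() - ".input".length());
            String outputs = props.getProperty(topic + ".output");
            if (outputs == null) {
                continue; // assumption: a topic without responses is ignored
            }
            table.put(topic, new String[][] {
                props.getProperty(name).split("\\s*,\\s*"),
                outputs.split("\\s*,\\s*")
            });
        }
        return table;
    }

    public static void main(String[] args) {
        String text = "hello.input=hi,hello,howdy,hey\nhello.output=hi,hello,hey\n";
        Map<String, String[][]> table = load(new StringReader(text));
        System.out.println(String.join("|", table.get("hello")[0]));
    }
}
```

To read from disk instead of a string, any Reader over the file should work, for example one obtained from java.nio.file.Files.newBufferedReader.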
Store it in JSON
Lots of things use JSON as a serialization format, and thus there are lots of libraries that you can use to read and write it. It would again make some things easier and have its own set of challenges.
{
"greeting":{
"input":["hi","hello","howdy","hey"],
"output":["hi","hello","hey"]
}
}
Something like that. And then again, you read this and store it into your data structures. You could store JSON in a number of places, such as document databases (like CouchDB), which would make for easy updates, changes, and access... given you're running that database.
Which brings us to...
Embedded databases
There are lots of databases that you can stick right in your application and access like any other database. Nice queries, proper relationships between objects. There are lots of advantages to using a database when you actually want a database rather than cobbling strings together and doing all the work yourself.
Custom serialization
You could create a class (instead of a 2d array) and then store the data in a class (in which it might be a 2d array, but that's an implementation detail). At this point, you could implement Serializable and write the writeObject and readObject methods and store the data somehow in a file which you could then read back into the object directly. If you have the administration ability of adding new things as part of this application (or another that uses the same class) you could forgo the human readable aspect of it and use the admin tool (that you write) to update the object.
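A minimal sketch of that idea, assuming a hypothetical class `KnowledgeBase` that hides the 2D array and relies on Java's default serialization (so no hand-written writeObject/readObject is needed for a plain String[][]). The file name `knowledge.ser` is also just an example.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical wrapper class; the 2D array is an implementation detail.
final class KnowledgeBase implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String[][] rows;

    KnowledgeBase(String[][] rows) {
        this.rows = rows;
    }

    /** Returns a copy of one row so callers cannot modify the stored data. */
    String[] row(int index) {
        return rows[index].clone();
    }

    /** Writes this object to the given file using default Java serialization. */
    void save(Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(this);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads an object previously written by save(). */
    static KnowledgeBase load(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (KnowledgeBase) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Path file = Paths.get("knowledge.ser"); // example file name
        new KnowledgeBase(new String[][] {
            {"hi", "hello", "howdy", "hey"},
            {"hi", "hello", "hey"}
        }).save(file);
        System.out.println(String.join(",", load(file).row(1)));
    }
}
```

The resulting file is binary and not human readable, which is the trade-off described above: edits would go through a tool that loads, changes, and saves the object.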
Lots of others
This is just the tip of the iceberg. There are lots of ways to go about this.
P.S.
Please change the name of the variable from Database to something in lower case that better describes it, such as input2output or the like. Names starting with an upper case letter are typically reserved for class names (unless it's all upper case, in which case it's a static final constant).

A common solution would be to dump the data into a properties file, and then load it with the standard Properties.load(...) method.
Once you have your data like that, you can then access the data by a map-like interface.
You could find different ways of storing the data in the file like:
userinput=hi,hello,howdy,hey
response=hi,hello,hey
...
Then, when you read the file, you can split the values on the comma:
String[] expectHello = properties.getProperty("userinput").split(",");

Related

Is serializing in Java the best/easiest way to store and later access (a small amount of) data?

I am relatively new to Java and have much more experience with Matlab. I was wondering what the best way is to store a relatively small amount of data, which has been calculated in one program, that should be used in another program.
Example: program A computes 100 values to be stored in an array. Now I would like to access this array in program B, as it needs these values. Of course, I could just write one program all together, which also implements the part of A. However, now every time I want to execute the total program, all the values have to be calculated again (in part A), which is a waste of resources. In Matlab, I was able to easily save the array in a .mat file and load it in a different script.
Looking around to find my answer, I found the option of serializing (see "What is object serialization?"), which I think would be suitable for what I want. My question: is serializing the easiest and quickest solution to store a small amount of data in Java, or is there a quicker, more user-friendly option (like .mat files in Matlab)?
I think you have several options to do this job. Java object serialization is one possible way. From my point of view there are other options to serialize the data:
Write and read a simple text file to store the computed values.
Using Java Architecture for XML Binding (JAXB) to write annotated Java classes to an XML file. Similar binding libraries are available for JSON.
Using a lightweight database like SQLite or HSQLDB (native Java database).
Using Apache Thrift or Protocol Buffers to serialize and deserialize Java objects to and from files.
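The first option in the list, a simple text file, could look roughly like this for an array of computed values. The class name `ValuesFile`, the one-number-per-line layout, and the file name `values.txt` are assumptions made for this sketch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: stores one number per line in a plain text file.
final class ValuesFile {

    /** Program A: write the computed values. */
    static void write(Path file, double[] values) {
        List<String> lines = new ArrayList<>();
        for (double v : values) {
            lines.add(Double.toString(v)); // toString/parseDouble round-trip exactly
        }
        try {
            Files.write(file, lines, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Program B: read the values back, skipping blank lines. */
    static double[] read(Path file) {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8).stream()
                    .filter(line -> !line.trim().isEmpty())
                    .mapToDouble(Double::parseDouble)
                    .toArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] computed = new double[100];
        for (int i = 0; i < computed.length; i++) {
            computed[i] = Math.sqrt(i);
        }
        Path file = Paths.get("values.txt"); // example file name
        write(file, computed);
        System.out.println(read(file).length + " values loaded");
    }
}
```

A text file like this can also be opened in an editor or imported into other tools, which Java's binary serialization format does not allow.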

Java framework to manage BLOB data outside of database

I want to store my blobs outside of the database in files, however they are just random blobs of data and aren't directly linked to a file.
So for example I have a table called Data with the following columns:
id
name
comments
...
I can't just include a column called fileLink or something like that because the blob is just raw data. I do however want to store it outside of the database. I would love to create a file called 3.dat where 3 is the id number for that row entry. The only problem with this setup is that the main folder will quickly fill up with a large number of files, because it is a flat folder structure, and there will be OS file issues. And no, the data is not grouped or structured; it's one massive list.
Is there a Java framework or library that will allow me to store and manage the blobs so that I can just do something like MyBlobAPI.saveBlob(id, data); and then do MyBlobAPI.getBlob(id) and so on? In other words something where all the File IO is handled for me?
Simply use an appropriate database which implements blobs as you described, and use JDBC. You really are not looking for another API but a specific implementation. It's up to the DB to take care of effective storing of blobs.
I think a home rolled solution will include something like a fileLink column in your table and your api will create files on the first save and then write that file on update.
I don't know of any code base that will do this for you. There are a bunch that provide an in memory file system for java. But it's only a few lines of code to write something that writes and reads java objects to a file.
You'll have to handle any file system limitations yourself. Though I doubt you'll ever burn through the limitations of modern file systems like btrfs or zfs. FAT32 is limited to 65K files per directory. But even last generation file systems support something on the order of 4 billion files per directory.
So by all means, write a class with two functions: one to serialize an object to a file, giving it a unique key as a name, and another to deserialize the object by that key. If you are using a modern file system, you'll never run out of resources.
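A sketch of such a two-function class, working on raw bytes as in the question rather than on serialized objects. The class name `FileBlobStore` is hypothetical, and spreading the files over subfolders by `id % 1000` is an extra assumption added here to keep any single directory small; with a modern file system a flat folder may be fine.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical blob store: one file per id, named <id>.dat.
final class FileBlobStore {
    private final Path root;

    FileBlobStore(Path root) {
        this.root = root;
    }

    // Assumed layout: <root>/<id mod 1000, zero padded>/<id>.dat
    private Path pathFor(long id) {
        String bucket = String.format("%03d", Math.floorMod(id, 1000L));
        return root.resolve(bucket).resolve(id + ".dat");
    }

    /** Creates the file on first save and overwrites it on later saves. */
    void saveBlob(long id, byte[] data) {
        Path target = pathFor(id);
        try {
            Files.createDirectories(target.getParent());
            Files.write(target, data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the whole blob into memory; fails if the id was never saved. */
    byte[] getBlob(long id) {
        try {
            return Files.readAllBytes(pathFor(id));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        FileBlobStore store = new FileBlobStore(Paths.get("blobs")); // example folder
        store.saveBlob(3L, new byte[] {1, 2, 3});
        System.out.println(store.getBlob(3L).length + " bytes read back");
    }
}
```

Because the path is derived from the id alone, the table needs no fileLink column; the database row and the file are tied together only by the id.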
As far as I can tell there is no framework for this. The closest I could find was Hadoop's HDFS.
That being said, the advice of just putting the BLOBs into the database as per the answers below is not always advisable. Sometimes it's good and sometimes it's not; it really depends on your situation. Here are a few links to such discussions:
Storing Images in DB - Yea or Nay?
https://softwareengineering.stackexchange.com/questions/150669/is-it-a-bad-practice-to-store-large-files-10-mb-in-a-database
I did find some additional really good links but I can't remember them offhand. There was one in particular on Stack Overflow but I can't find it. If you believe you know the link, please add it in the comments so that I can confirm it's the right one.

Store data in an application

I have an application which stores information in a JList. However, of course, when the application is closed all of the information is deleted from memory.
I'm trying to build the app so that when re-launched, it will contain the same data. Is there a way to store this data in a database or similar? If so, where and how do I go about this?
The simplest way to persist IMHO is in a File.
Try using Properties if you need a key-value map.
Or, if you're binding more complex objects, I recommend a Simple XML serialization package.
You need to connect your application to a database using JDBC. JDBC stands for Java Database Connectivity. As you can see from the name, it lets you connect to a database. Hence, you can link your application to a database and store your data persistently. Here's a link to start off with. And here is something for further reading.
If the data is not complex and is not large (no more than a few instances of a few objects), you could persist the list to a file using serialization. This will get you started. If your list is large or complex, you might consider a database. Searching for JDBC in your favorite search engine will get you started.
I think you want a plain flat file. It's simple; you can have one going in no time. (The learning curve is much less than with databases.) And it's fast; you can read a 1 GB file before you can even log on to a DB. Java serialization is a bit tricky, but it can be a very powerful way to save vast amounts of complicated data. (See here for things to watch out for, plus more helpful links.) If, for instance, you wanted to save a large, complex game between sessions, serializing it is the way to go. No need to convert an Object Oriented structure to a relational one.
Use a database:
if you want to add data to a large file, or read only part of the data from a large file. Or if other processes are going to read and modify it.
Consider a DB:
if you are already using one for other purposes. If the user might start on another machine and have trouble finding the file from the last session and the data is not too extensive. Or if the data is relational in nature anyway and someone else may be interested in looking at it.
So if you have a simple case where the user always starts in the same directory, just write and read a simple file. If you have a lot of complex, extensive OO data, use a flat file even if it is not easy to do--you'll need the speed. Otherwise, think about a DB.
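For the simple case, a plain file with one list entry per line might be enough. The sketch below works on a List of strings, which the application would copy into and out of the JList's model. The class name `ListStore` and the file name `items.txt` are hypothetical, and the approach assumes no entry contains a line break.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: persists list entries as one line of text each.
final class ListStore {

    /** Call when the application closes (or whenever the list changes). */
    static void save(Path file, List<String> items) {
        try {
            Files.write(file, items, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Call at start-up; returns an empty list on the very first launch. */
    static List<String> load(Path file) {
        if (!Files.exists(file)) {
            return new ArrayList<>();
        }
        try {
            return new ArrayList<>(Files.readAllLines(file, StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = Paths.get("items.txt"); // example file name
        save(file, Arrays.asList("first entry", "second entry"));
        System.out.println(load(file));
    }
}
```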

Using list of serialized objects

I'm learning Android/Java programming, but I'm confused about persistent data.
After a lot of research it seems that the best way to store objects is to serialize them in a file, but I couldn't find a simple way to retrieve these objects.
If I create two objects and save their serialized versions, how can I retrieve and list both of them? Do I need to create a file for each object with a specific ID in the filename so I can list them with a getFilesDir?
It depends on how complex those objects are (and on your personal preferences, I guess). I have used SharedPreferences to store simple objects before, just for the sake of simplicity, while a co-worker makes generous use of SQLite, but that suits his needs.
Since you do not state what is being stored, the best advice I can give you right now is to have a read here; it covers how persistent data should be dealt with on Android.
"best" way? Please define your criteria.
There are databases (relational and non-relational) or file systems. You can serialize lots of ways: Java serialization, XML, Google's protobuf, and others.
Yes, you'll need a way to associate a unique identifier with each object. In a relational database, you'd use a primary key. You need something like that in any system you use.
If you serialize via some mechanism to a file system, you'll have to write the object into the desired format and stream it to a file. To go the other way, specify the key for the object, read the serialized data, and parse it back into the object.
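One way to avoid a file per object is to serialize a whole list into a single file and look objects up by their key after loading. The sketch below assumes hypothetical classes `Note` and `NoteStore` and uses only java.io, so it is not specific to Android.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical example object; the id plays the role of a primary key.
final class Note implements Serializable {
    private static final long serialVersionUID = 1L;
    final String id;
    final String text;

    Note(String id, String text) {
        this.id = id;
        this.text = text;
    }
}

// Hypothetical store: all notes live in one serialized list.
final class NoteStore {

    static void saveAll(File file, List<Note> notes) {
        try (ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(file))) {
            out.writeObject(new ArrayList<>(notes)); // ArrayList is Serializable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @SuppressWarnings("unchecked")
    static List<Note> loadAll(File file) {
        try (ObjectInputStream in = new ObjectInputStream(new FileInputStream(file))) {
            return (List<Note>) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the note with the given id, or null if there is none. */
    static Note find(List<Note> notes, String id) {
        for (Note note : notes) {
            if (note.id.equals(id)) {
                return note;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        File file = new File("notes.ser"); // example file name
        saveAll(file, Arrays.asList(new Note("a", "first"), new Note("b", "second")));
        List<Note> notes = loadAll(file);
        System.out.println(notes.size() + " notes, b = " + find(notes, "b").text);
    }
}
```

On Android the File would typically be created inside the directory returned by the context's getFilesDir(), so the app only needs to know one file name rather than scanning for many.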

Sanitize json input to a java server

I'm using json to pass data between the browser and a java server.
I'm using Json-lib to convert between java objects and json.
I'd like to strip out suspicious-looking stuff (e.g. "doSomethingNasty()") from the user input while converting from json to java.
I can imagine several points at which I could do this:
I could examine the raw json string and strip out funny-looking stuff
I could look for a way to intercept every json value on its way into the java object, and look for funny stuff there.
I could traverse my new java objects immediately after reconstitution from json, look for any fields that are Strings, and strip stuff out there.
What's the best approach? Are there any technologies built for this task that I can tack on to what I have already?
I suggest approach 3: traverse the reconstructed Java objects immediately upon arrival, and before any other logic can act on them. Build the most restrictive validation you can get away with (that is, do white-listing).
You can probably do this in a single depth-first traversal of the object hierarchy that you retrieve from Json-lib. In general, you will not want to accept any JSON functions in your objects, and you will want to make sure that all values (numbers, strings, depth of object tree, ...) are within expected ranges. Yes, it is a huge hassle to do this well, but believe me, the alternative to good user-input validation is much, much worse. It may be a good idea to add logging for whenever you chop things out, to diagnose any possible bugs in your validation code.
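A rough sketch of such a depth-first, white-list traversal. It assumes the reconstructed data can be viewed as nested Map, List, String, Number and Boolean values; the class name `InputValidator`, the allowed-character pattern, and the depth limit are all illustrative choices, and this version rejects bad input outright instead of chopping parts out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical validator: accepts only what is explicitly allowed.
final class InputValidator {
    // Assumed white-list: letters, digits, spaces and basic punctuation, max 200 chars.
    private static final Pattern SAFE_TEXT = Pattern.compile("[\\p{L}\\p{N} .,!?'-]{0,200}");
    private static final int MAX_DEPTH = 10; // assumed limit on nesting

    static boolean isValid(Object value) {
        return isValid(value, 0);
    }

    private static boolean isValid(Object value, int depth) {
        if (depth > MAX_DEPTH) {
            return false;
        }
        if (value == null || value instanceof Boolean) {
            return true;
        }
        if (value instanceof Number) {
            double d = ((Number) value).doubleValue();
            return !Double.isNaN(d) && !Double.isInfinite(d);
        }
        if (value instanceof String) {
            return SAFE_TEXT.matcher((String) value).matches();
        }
        if (value instanceof List) {
            for (Object item : (List<?>) value) {
                if (!isValid(item, depth + 1)) {
                    return false;
                }
            }
            return true;
        }
        if (value instanceof Map) {
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                // Keys must be strings; every value is checked recursively.
                if (!(entry.getKey() instanceof String) || !isValid(entry.getValue(), depth + 1)) {
                    return false;
                }
            }
            return true;
        }
        return false; // anything else (e.g. a function object) is rejected
    }

    public static void main(String[] args) {
        System.out.println(isValid(Collections.singletonMap("name", "Alice")));
        System.out.println(isValid(Arrays.asList("hello", "doSomethingNasty()")));
    }
}
```

In a real application the pattern and ranges would be tightened per field, and each rejection would be logged to help diagnose bugs in the validation code itself.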
As I understand you need to validate the JSON data coming into your application.
If you want to do white-listing ("you know the data you expect and nothing else is valid"), then it makes sense to validate your java objects once they are created (make sure not to send the java object to the DB or back to the UI in some way before validation is done).
In case you want to do black-listing of characters ("you know some of the threat characters which you want to avoid"), then you can look directly at the json string, as this validation would not change much over time, and even if it does, you only need to enhance one common place. For white-listing it would depend on your business logic.
