How to store matrix information in MySQL? - java

I'm working on an application that analyzes music similarity. In order to do that I process audio data and store the results in txt files. For each audio file I create 2 files: one containing 16 values (each value can look like this: 2.7000023942731723) and the other containing 16 rows, each row holding 16 values like the one previously shown.
I'd like to store the contents of these 2 files in a table of my MySQL database.
My table looks like:
Name varchar(100)
Author varchar (100)
In order to add the content of those 2 files I think I need to use the BLOB data type:
file1 blob
file2 blob
My question is: how should I store this info in the database? I'm working in Java, where I have a double array containing the 16 values (for file1) and a matrix containing the file2 info. Should I convert the values to strings and add them to the columns in my database?
Thanks

Hope I don't get negative-repped into oblivion with this crazy answer, but I am trying to think outside the box. My first question is: how are you processing this data after a potential query? If I were doing something similar, I would likely use something like MATLAB or Octave, which have a specific notation for representing matrices. It is basically comma- and semicolon-delimited text with square brackets at the right spots. I would store just a string that my mathematics software or module can parse natively. After all, it doesn't sound like you want to run queries based on individual data points.
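For illustration, a small sketch (class and method names are mine, not from the question) of producing such an Octave/MATLAB-style string from a Java matrix:

```java
public class MatrixNotation {
    // Builds an Octave/MATLAB-style matrix literal such as "[1.0, 2.0; 3.0, 4.0]":
    // commas separate values within a row, semicolons separate rows.
    static String toMatlabString(double[][] m) {
        StringBuilder sb = new StringBuilder("[");
        for (int r = 0; r < m.length; r++) {
            if (r > 0) sb.append("; ");
            for (int c = 0; c < m[r].length; c++) {
                if (c > 0) sb.append(", ");
                sb.append(m[r][c]);
            }
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        double[][] m = {{1.0, 2.0}, {3.0, 4.0}};
        System.out.println(toMatlabString(m)); // [1.0, 2.0; 3.0, 4.0]
    }
}
```

The resulting string could then go straight into a TEXT column and be pasted into Octave unchanged.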

I think you need to normalize a schema like this if you intend to keep it in a relational database.
Sounds like you have a matrix table that has a one-to-many relationship with its files.
If you insist on one denormalized table, one way to do it would be to store the name of the file, its author, the name of its matrix, and its row and column position in the named matrix that owns it.
Please clarify one thing: Is this a matrix in the linear algebra sense? A mathematical entity?
If yes, and you only use the matrix in its entirety, then maybe you can store it in a single column as a blob. That still forces you to serialize and deserialize to a string or blob every time it goes into and comes out of the database.

Do you need to query the data (say for all the values that are bigger than 2.7) or just store it (you always load the whole file from the database)?
Given the information in the comment, I would save the files in a BLOB or TEXT column, as suggested in the other answers. You don't even need a line delimiter, since you can use a modulus operation on the list of values to recover each row of the matrix.
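As a sketch of that idea (class and method names are illustrative): the 16x16 matrix can be stored as one flat comma-separated string, and each value's row and column recovered with integer division and modulus:

```java
public class FlatMatrix {
    static final int SIZE = 16;

    // Flatten a SIZE x SIZE matrix into a single comma-separated string,
    // with no row delimiter at all.
    static String flatten(double[][] m) {
        StringBuilder sb = new StringBuilder();
        for (double[] row : m)
            for (double v : row) {
                if (sb.length() > 0) sb.append(',');
                sb.append(v);
            }
        return sb.toString();
    }

    // Rebuild the matrix: value i lands at row i / SIZE, column i % SIZE.
    static double[][] unflatten(String s) {
        String[] parts = s.split(",");
        double[][] m = new double[SIZE][SIZE];
        for (int i = 0; i < parts.length; i++)
            m[i / SIZE][i % SIZE] = Double.parseDouble(parts[i]);
        return m;
    }
}
```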

I think the problem that dedalo is facing is that he's working with arrays (I assume one is jagged, one is multi-dimensional) and he wants to serialize these to a BLOB.
But, arrays aren't directly serializable so he's asking how to go about doing this.
The simplest way to go about it would be to loop through the array and build a string, as Dave suggested, and store that string. This would let you read the contents directly from the value in the database instead of deserializing the data whenever you need to inspect it, as duffymo points out.
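A minimal sketch of that loop-and-build approach for the 16-value file (method names are hypothetical); Double.toString/Double.parseDouble round-trip exactly, so no precision is lost:

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class VectorText {
    // Join the doubles into one human-readable, comma-separated column value.
    static String toText(double[] values) {
        return Arrays.stream(values)
                     .mapToObj(Double::toString)
                     .collect(Collectors.joining(","));
    }

    // Parse the stored string back into a double[].
    static double[] fromText(String s) {
        return Arrays.stream(s.split(","))
                     .mapToDouble(Double::parseDouble)
                     .toArray();
    }
}
```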
If you'd like to know how to serialize the array into BLOB...(this just seems like overkill)
You are able to serialize one-dimensional arrays and jagged arrays, e.g.:
import java.io.*;

public class Test {
    public static void main(String[] args) throws Exception {
        // Serialize an int[]
        ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream("test.ser"));
        out.writeObject(new int[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
        out.flush();
        out.close();

        // Deserialize the int[]
        ObjectInputStream in = new ObjectInputStream(new FileInputStream("test.ser"));
        int[] array = (int[]) in.readObject();
        in.close();

        // Print out the contents of the deserialized int[]
        System.out.println("It is " + (array instanceof Serializable) + " that int[] implements Serializable");
        System.out.print("Deserialized array: " + array[0]);
        for (int i = 1; i < array.length; i++) {
            System.out.print(", " + array[i]);
        }
        System.out.println();
    }
}
As for what data type to store it as in MySQL, there are only four blob types to choose from:
The four BLOB types are TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB
Choosing the best one depends on the size of the serialized object. I'd imagine BLOB would be good enough.
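For completeness, a hedged sketch of getting the serialized bytes in memory so they can be handed to JDBC's PreparedStatement.setBytes; the table and column names in the comment are made up, and the JDBC part is shown as a comment since no database is available here:

```java
import java.io.*;

public class BlobBytes {
    // Serialize any Serializable object (e.g. double[] or double[][]) to a byte[].
    static byte[] toBytes(Object obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Reverse: read the object back out of the stored bytes.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    // With JDBC the bytes would go into the BLOB column roughly like:
    //   PreparedStatement ps = conn.prepareStatement(
    //       "INSERT INTO songs (name, author, file1) VALUES (?, ?, ?)");
    //   ps.setBytes(3, toBytes(matrix));
}
```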

Related

How to store array of objects in SQL Table column of String type in Java?

I have a table EMPLOYEE_PREF in ORACLE SQL.
It has columns EMP_ID, EMP_KEY, EMP_TYPE, EMP_VALUE
Basically EMP_VALUE is a string. It is a bookmarked url.
I have to reuse the table for the employee's selection preferences.
The selection is an array of objects:
Facility, Healthplan, ..
Each of the objects has key/value pairs, e.g. for the Facility object:
{ dataType : "string"
isEligible : true
label : "Facility Name"
name :"facilityName"
}
{
...
}
Now I am not sure how to store the array of objects in the EMP_VALUE column, which stores VARCHAR.
Please help me with a solution.
I should be able to store it as a String and retrieve it as an array of objects.
I don't think it is possible. You can store the current object's reference, though, but it doesn't make much sense: in the next run the object will change. What do you mean by bookmarked url?
Instead you can use some format to store the preferences under each category, e.g. [Facility=(isEligible : true, name : "facilityName", ..)] or something like this. To retrieve them you would have to parse the string back in the reverse way.
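A rough sketch of that encode/decode idea (the format and names are illustrative, and it assumes keys and values contain none of the delimiter characters):

```java
import java.util.*;

public class PrefCodec {
    // Encode one category's key/value pairs as "[Category=(k1:v1,k2:v2)]".
    static String encode(String category, Map<String, String> prefs) {
        StringJoiner pairs = new StringJoiner(",");
        for (Map.Entry<String, String> e : prefs.entrySet())
            pairs.add(e.getKey() + ":" + e.getValue());
        return "[" + category + "=(" + pairs + ")]";
    }

    // "Crack it in the reverse way": assumes no ',', ':' or ')' inside keys/values.
    static Map<String, String> decode(String s) {
        Map<String, String> result = new LinkedHashMap<>();
        String inner = s.substring(s.indexOf('(') + 1, s.lastIndexOf(')'));
        if (!inner.isEmpty()) {
            for (String pair : inner.split(",")) {
                String[] kv = pair.split(":", 2);
                result.put(kv[0], kv[1]);
            }
        }
        return result;
    }
}
```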
Note : Consider changing your DB design, and add preferences for each category under different column. That will make your life much easier.
What is your database? When you need to store a large amount of text, which is what you are doing here, you can use the CLOB type in Oracle; in SQL Server you can use VARCHAR(MAX) since SQL Server 2005, or TEXT in older versions.
With these types, you can store the text of an array once you convert it to a String, as in this example:
List<String> strings = new ArrayList<String>();
strings.add("String 1");
strings.add("Strings 2");
strings.add("Strings 3");
String token = strings.toString();
Then you can put the String token on database.
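Note that strings.toString() yields a bracketed, comma-separated form such as [String 1, Strings 2, Strings 3]; parsing it back is possible but fragile, as this sketch shows:

```java
import java.util.*;

public class TokenRoundTrip {
    public static void main(String[] args) {
        List<String> strings = Arrays.asList("String 1", "Strings 2", "Strings 3");
        String token = strings.toString(); // "[String 1, Strings 2, Strings 3]"

        // Parse it back: strip the brackets, then split on ", ".
        // This breaks if any element itself contains ", ".
        String inner = token.substring(1, token.length() - 1);
        List<String> restored = Arrays.asList(inner.split(", "));

        System.out.println(restored.equals(strings)); // true
    }
}
```

A dedicated delimiter (or JSON) would be a safer choice for real data.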

Merge CSV files with dynamic headers in Java

I have two or more .csv files which have the following data:
//CSV#1
Actor.id, Actor.DisplayName, Published, Target.id, Target.ObjectType
1, Test, 2014-04-03, 2, page
//CSV#2
Actor.id, Actor.DisplayName, Published, Object.id
2, Testing, 2014-04-04, 3
Desired Output file:
//CSV#Output
Actor.id, Actor.DisplayName, Published, Target.id, Target.ObjectType, Object.id
1, Test, 2014-04-03, 2, page,
2, Testing, 2014-04-04, , , 3
In case some of you wonder: the "." in the header is just additional information in the .csv file and shouldn't be treated as a separator (the "." results from converting a JSON file to CSV, preserving the nesting level of the JSON data).
My problem is that I have not found any solution so far that accepts different column counts.
Is there a clean way to achieve this? I don't have code so far, but I thought the following would work:
Read two or more files and add each row to a HashMap<Integer,String> //Integer = lineNumber, String = data, so that each file gets its own HashMap
Iterate through all indices and add the data to a new HashMap.
Why I think this approach is not so good:
If the header and the row data from file 1 differ from those of file 2 (etc.), the order won't be kept right.
I think this might be the result if I do the suggested thing:
//CSV#Suggested
Actor.id, Actor.DisplayName, Published, Target.id, Target.ObjectType, Object.id
1, Test, 2014-04-03, 2, page //wrong, because one "," is missing
2, Testing, 2014-04-04, 3 // wrong, because the 3 does not belong to Target.id. Furthermore the empty values won't be considered.
Is there a handy way to merge the data of two or more files without(!) knowing how many elements the headers contain?
This isn't the only answer, but hopefully it can point you in a good direction. Merging is hard; you're going to have to give it some rules, and you need to decide what those rules are. Usually you can break it down to a handful of criteria and then go from there.
I wrote a "database" to deal with situations like this a while back:
https://github.com/danielbchapman/groups
It is basically just a Map<Integer, Map<Integer, Map<String, String>>>, which isn't all that complicated. What I'd recommend is that you read each row into a structure similar to:
(Set One) -> Map<Column, Data>
(Set Two) -> Map<Column, Data>
A Bidi map (as suggested in the comments) will make your lookups faster but carries some pitfalls if you have duplicate values.
Once you have these structures, your lookup can be as simple as:
public List<Data> process(Data one, Data two) //pseudo code
{
    List<Data> result = new List<>();
    for(Row row : one)
    {
        Id id = row.getId();
        Row additional = two.lookup(id);
        if(additional != null)
            merge(row, additional);
        result.add(row);
    }
    return result;
}
public void merge(Row a, Row b)
{
    //Your logic here... either mutating or returning a copy.
}
Nowhere in this solution am I worried about the columns, as this just acts on the raw data types. You can easily remap all the column names, either by storing them each time you do a lookup or by recreating them at output.
The reason I linked my project is that I'm pretty sure I have a few methods in there (such as outputting column names, etc.) that might save you considerable time and point you in the right direction.
I do a lot of TSV processing in my line of work and maps are my best friends.
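One way to sketch the header-union idea behind this answer (the class and method names are mine): collect every header across all rows, then emit a blank cell wherever a row lacks a column. This produces exactly the desired-output shape from the question without knowing the column count up front:

```java
import java.util.*;

public class CsvMerge {
    // Merge rows (each a header->value map) without knowing the headers up front:
    // take the union of all headers in first-seen order, then fill missing cells with "".
    static List<String> merge(List<Map<String, String>> rows) {
        LinkedHashSet<String> headers = new LinkedHashSet<>();
        for (Map<String, String> row : rows)
            headers.addAll(row.keySet());

        List<String> out = new ArrayList<>();
        out.add(String.join(", ", headers));
        for (Map<String, String> row : rows) {
            List<String> cells = new ArrayList<>();
            for (String h : headers)
                cells.add(row.getOrDefault(h, ""));
            out.add(String.join(", ", cells));
        }
        return out;
    }
}
```

Each input row here would come from pairing a file's header line with one of its data lines; a real CSV parser should do that splitting.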

Merge two JSON file through a value in java

I want to implement a method to merge two huge files (each file contains a JsonObject per row) through a common value.
The first file is like this:
{
"Age": "34",
"EmailHash": "2dfa19bf5dc5826c1fe54c2c049a1ff1",
"Id": 3,
...
}
and the second:
{
"LastActivityDate": "2012-10-14T12:17:48.077",
"ParentId": 34,
"OwnerUserId": 3,
}
I have implemented a method that reads the first file and takes the first JsonObject, then takes its Id; if the second file contains a row with the same Id (OwnerUserId == Id), it appends that second JsonObject to the first file; otherwise it writes the row to a separate file that contains only the rows that don't match the first file. This way, if the first JsonObject has 10 matches, the next row of the first file doesn't have to scan those rows again.
The method works fine, but it is too slow.
I have already tried loading the data into mongoDB and querying the DB, but that is slow too.
Is there another way to process the two files?
What you're doing is simply bound to be slow. If you don't have the memory for all the JSON objects, then try storing the data as plain Java objects, as that way you surely need much less.
And there's a simple approach needing even less memory and only n passes, where n is the ratio of required memory to available memory.
On the i-th pass, consider only objects with id % n == i and ignore all the others. This way the memory consumption is reduced by nearly a factor of n, assuming the ids are nicely distributed modulo n.
If this assumption doesn't hold, use f(id) % n instead, where f is some hash function (feel free to ask if you need one).
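A toy sketch of that pass-filtering idea (names are mine). Here both sides are small in-memory maps purely to show the id % n filter; a real run would re-read the files on each pass, keeping only the current pass's ids in memory:

```java
import java.util.*;

public class PassMerge {
    // n-pass merge: pass i only considers records whose id % n == i,
    // so each pass needs roughly 1/n of the total memory.
    static List<String> merge(Map<Integer, String> left, Map<Integer, String> right, int n) {
        List<String> merged = new ArrayList<>();
        for (int pass = 0; pass < n; pass++) {
            for (Map.Entry<Integer, String> e : left.entrySet()) {
                if (e.getKey() % n != pass)
                    continue; // this id belongs to a different pass
                String match = right.get(e.getKey());
                if (match != null)
                    merged.add(e.getValue() + "+" + match);
            }
        }
        return merged;
    }
}
```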
I solved it using a temporary DB.
I created an index on the key on which I want to merge; this way I can query the DB and the response is very fast.

Two-dimensional array of different types

I want to create a two-dimensional array in which I want to store records from the database. So let's say that the first is of type int and the second of type String (here I am describing just one record, so basically the types of the DB columns). How can I do it? Is an array the right data structure for that?
I am not sure I am following, but you might be looking for a Map<Integer,String> or a Map<Integer,List<String>>. (Have a look at List and HashMap.)
Map allows association of a key [Integer] with a value [String or List].
Map also allows fast lookup of a key and its attached value.
(*) You should use Map<Integer,List<String>> if you want to attach more than one String per Integer; alternatively you can use Apache Commons' MultiMap.
Arrays can only contain one type. If that type happens to be Object then it can store Object and any of its sub-types, but that doesn't really sound like what you're trying to accomplish here.
It sounds like what you're describing is a 2D array to store database information, with each element in the array being a column in one of the rows. This isn't an array of records, it's an array of column data.
Instead, just store a one-dimensional array of records, where each element of the array is a reference to the entire DB row.
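A minimal sketch of that suggestion, with hypothetical names: one small row class with strongly typed columns, and a one-dimensional array of it:

```java
public class Rows {
    // One class per table row keeps each column strongly typed.
    static class EmployeeRow {
        final int id;
        final String name;

        EmployeeRow(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    public static void main(String[] args) {
        // A one-dimensional array of rows instead of a 2D array of Objects.
        EmployeeRow[] rows = {
            new EmployeeRow(1, "Alice"),
            new EmployeeRow(2, "Bob")
        };
        System.out.println(rows[1].id + " " + rows[1].name); // 2 Bob
    }
}
```

No casting is needed when reading a column back, unlike with Object[][].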
You can do the same thing with the help of this
Object[][] o = new Object[10][10];
o[0][0] = 1;
o[0][1] ="hello";
System.out.println(o[0][0]);
System.out.println(o[0][1]);
You can use
HashMap<Integer, ArrayList<String>>
If you simply want to have one column of String data and another column of int data, this is what you can consider doing:
Declare a 2 dimensional String array
String[][] words = new String[10][2]; // pick dimensions to fit your data
Your first column can contain all the String data. The second column can have the numeric data, but in the form of a String. You may want to use the Integer.toString() and Integer.parseInt() methods to do this:
words[row][1] = Integer.toString(Integer.parseInt(numericInput)); // numericInput holds the number as a String
I'm not sure what exactly you hope to achieve, but you may consider modifying this snippet to suit your needs.

SQL search for a number in a BLOB (list of numbers)

I've stored an ArrayList of longs (IDs) in a BLOB column.
(follow-up to the question: BLOB vs. VARCHAR for storing arrays in a MySQL table)
That works, but now I have a problem: how can I search within that BLOB?
Imagine I've stored these numbers in the BLOB: 1,2,3,4
And what I want to do is:
SELECT * FROM table WHERE blob_column CONTAINS 3
Is it possible?
The simple answer is "No, it's not possible". Your BLOB contains a serialised Java object, and the database knows nothing about its implementation.
When designing a database, you should always think about how you will need to access the data. In your case, you would be much better off with a separate table holding the IDs from your list as separate rows. Then you could simply join that table into your query.
