Need help with Degrees of Separation using SQL & java program - java

I'm trying my best but I am lost right now. I know that this is horrible code but can someone just try to help me with the syntax and the overall problem if you're really ambitious. Thank you.
This is the SQL schema:
create table Person(
id int primary key,
name varchar(255) not null
);
create table Band(
id int primary key,
name varchar(255),
style varchar(255)
);
create table memberOf(
person int references Person(id),
band int references Band(id),
primary key(person, band)
);
Here is my horribly written program so far:
import java.sql.*;
public class DegreeNumber {
public static void degreesOfSeparation(int origin) throws Exception {
//Loads driver.
Class.forName("com.mysql.jdbc.Driver");
// Makes connection to server
Connection c = DriverManager.getConnection
("jdbc:mysql://cursa.ccs.neu.edu/test");
//Create a new prepared Statement that makes a
temporary table that can hold people in bands
PreparedStatement temp = c.prepareStatement
("create temporary table CurrentBand (
id int primary key);
");
//Execute query, creates table above
temp.execute();
//Create prepared statement to insert people of band into table
PreparedStatement insert = c.prepareStatement
("insert into CurrentBand (id)
select m.musician
from memberOf m
where ? = m.band;");
//Get Bands in table
PreparedStatement getBands = c.prepareStatement
("select b.id
from Bands b");
ResultSet bands = getBands.executeQuery();
int bandCount = 1;
//Sets parameter to first band on list
insert.setInt(1, bands.getInt(bandCount));
//Execute bands being inserted
insert.execute();
//Gives back ResultSet with band listed
PreparedStatement returnCurrentBand = c.prepareStatement
("select * from CurrentBand");
//Execute to give back CurrentBand records
ResultSet currentBand = returnCurrentBand.executeQuery();
//Creates table to hold musicians
PreparedStatement createDegree = c.prepareStatement
("create temporary table Degrees(
id int,
name varchar(255),
degree int,
primary key (id))");
//Execute to create table
createDegree.execute();
//Insert original Person into table Degrees
PreparedStatement insertOrig = c.prepareStement
("insert into Degrees (id, name, degree)
select p.id, p.name, 0
from Person p
where p.id = ?");
insertOrig.setInt(1, origin);
insertOrig.execute();
int count = 1;
//If there are still bands left on list,
while {(!bands.isAfterLast()){
PreparedStatement thisRound = c.prepareStatement
("insert into Degrees (id, name, degree)
select p.id, p.name, ?
from Person p, memberOf m, currentBand c
where p.id = m.musician
and m.band = c.id");
thisRound.setInt (1, count);
count++;
PreparedStatement truncate = c.prepareStatement("truncate table currentBand");
truncate.execute();
bandCount++;
insert.execute();
bands.next();
}
//means all bands have been gone through, find unique people, make new table & sort and print out
PreparedStatement createFinal = c.prepareStatement
("create table Final (
name varchar(255),
degree int,
primary key (name, degree))");
createFinal.execute();
PreparedStatement makeFinal = c.prepareStatement(
("insert into Final (name, degree)
select unique (d.name, d.degree)
from Degrees d
sort by degree asc, name asc");
makeFinal.execute();
ResultSet final = c.prepareStatement("select * from Final").executeQuery();
while (final.next()) {
System.out.println("Musician Name " + final.getString("name") +
" Degree " + rs.getInt("degree"));

You terribly need a DAO. Also take a look at this thread.

The problems you are encountering are mostly due to poor abstraction and design. It's good that you realize that this isn't a great piece of code. To give you a bit of insight into exactly why this is bad, let's consider the following problems with the class you presented.
It's not object-oriented. It's merely a procedure statically tucked into a class.
It throws the generic Exception. We won't get much of a chance to handle specific errors outside of its scope.
Every time you use this function (I wouldn't even call it a method), you make a new connection to the database. That's pretty costly.
Tables in your database aren't well represented as first-class objects in your Java program. They suffer from weak abstraction and can't have their details encapsulated. This is mostly what's contributing to your feeling that this is poor code.
This isn't a very testable function. Break it into several methods, and pass each of them the parameters it needs to get its work done.
It's long, making it rather incomprehensible. Aim for methods of 5-15 lines.
It uses temporary tables. That'll kill efficiency when you could just as well do the work in memory.
The problem isn't well documented. I understand from the code that it has to do with bands and people, but how does this relate to degrees of separation? Make the problem and solution obvious.
I feel that if you address these critical areas, you can refactor towards a better solution.
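As a rough illustration of working in memory instead of with temporary tables, here is a minimal sketch. It assumes the rows of memberOf have already been loaded into an int[][] of {person, band} pairs; the class and method names (DegreesInMemory, degrees) are hypothetical and not part of the original program. It runs a breadth-first search in which two people are one degree apart when they share a band.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DegreesInMemory {

    /**
     * Breadth-first search over the "shares a band with" relation.
     * memberOf holds {person, band} pairs (assumed to be loaded from the memberOf table).
     * Returns person id -> degree for everyone reachable from origin.
     */
    public static Map<Integer, Integer> degrees(int origin, int[][] memberOf) {
        Map<Integer, List<Integer>> bandsOf = new HashMap<>();
        Map<Integer, List<Integer>> membersOf = new HashMap<>();
        for (int[] row : memberOf) {
            bandsOf.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
            membersOf.computeIfAbsent(row[1], k -> new ArrayList<>()).add(row[0]);
        }

        // LinkedHashMap keeps people in the order they were discovered (by degree).
        Map<Integer, Integer> degree = new LinkedHashMap<>();
        Deque<Integer> queue = new ArrayDeque<>();
        degree.put(origin, 0);
        queue.add(origin);
        while (!queue.isEmpty()) {
            int current = queue.remove();
            for (int band : bandsOf.getOrDefault(current, Collections.emptyList())) {
                for (int other : membersOf.getOrDefault(band, Collections.emptyList())) {
                    if (!degree.containsKey(other)) {
                        // First time we see this person: one step further than current.
                        degree.put(other, degree.get(current) + 1);
                        queue.add(other);
                    }
                }
            }
        }
        return degree;
    }
}
```

People who cannot be reached from the origin through any chain of shared bands are simply absent from the returned map.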

Related

Order of columns returned in ResultSet

I have a Java program reading data from an Access database where the table is created dynamically each time, and the number of columns varies depending on the data populated.
The table has columns as shown below. Only columns RowID and StatusOfProcessing are fixed and will be at the end.
column1,column2,column3, ... columnN,RowID,StatusOfProcessing
Below is the relevant piece of code:
String str = "SELECT TOP 50 * FROM Dynamic_table WHERE StatusOfProcessing='0'";
resultSet = stmt.executeQuery(str);
When reading data from the ResultSet, does it always have columns in the order listed above, or should I use
String str = "SELECT TOP 50 column1,column2,column3 .... columnN,RowID,StatusOfProcessing FROM Dynamic_table WHERE StatusOfProcessing='0'";
resultSet = stmt.executeQuery(str);
Can someone clarify?
SELECT * will normally return columns in the order in which they were created, e.g., the order in which they appeared in the CREATE TABLE statement. However, bear in mind that in addition to retrieving the value of a column by its index, as in
int col1value = resultSet.getInt(1);
you can also retrieve the value in a given column by referring to its name, as in
int col1value = resultSet.getInt("column1");
Furthermore, if you need to see the actual order of the columns or verify the names and types of the columns in a ResultSet you can use a ResultSetMetaData object to get that information.
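A minimal sketch of that last point, assuming you already have a ResultSet from the query above. The helper name (ColumnOrder.describe) is hypothetical; getColumnCount, getColumnLabel and getColumnTypeName are standard ResultSetMetaData methods.

```java
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class ColumnOrder {

    /** Returns "position: label (type)" for each column, in result-set order. */
    public static List<String> describe(ResultSetMetaData meta) throws SQLException {
        List<String> columns = new ArrayList<>();
        // JDBC column indexes start at 1, not 0.
        for (int i = 1; i <= meta.getColumnCount(); i++) {
            columns.add(i + ": " + meta.getColumnLabel(i)
                    + " (" + meta.getColumnTypeName(i) + ")");
        }
        return columns;
    }
}
```

It would be called as ColumnOrder.describe(resultSet.getMetaData()), which lets the program discover the dynamic columns at run time instead of relying on their position.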

Retrieve and Insert million records into table

There's a column I want to retrieve and insert into another table.
For example, below is the first table, from which I want to retrieve values:
Table1
Records
1 ABC Singapore
2 DEF Vietnam
I retrieve the above column values from Table1, then insert them into another table as below:
Table 2
ID Name Country
1 ABC Singapore
2 DEF Vietnam
Currently I do this in Java: I first retrieve the records, then split the values and insert them. However, I want to do it in batches or with pagination for better performance, since Table1 can hold millions of records that must be retrieved and inserted into Table2.
Any pointers on how to use pagination in my case would be appreciated.
I'm using MSSQL 2008.
If you need to do that in code (and not in SQL which should be easier even with multiple delimiters), what you probably want to use would be batched inserts with proper batch size combined with a good fetch-size on your select:
//Prepare statements first
try(PreparedStatement select = con.prepareStatement("SELECT * FROM SOURCE_TABLE");
    PreparedStatement insert = con.prepareStatement("INSERT INTO TARGET_TABLE(col1, col2, col3) VALUES (?,?,?)")) {
    //Define Parameters for SELECT
    select.setFetchDirection(ResultSet.FETCH_FORWARD);
    select.setFetchSize(10000);
    int rowCnt = 0;
    try(ResultSet rs = select.executeQuery()) {
        while(rs.next()) {
            String row = rs.getString(1);
            String[] split = row.split(" |\\$|\\*"); //However you want to do that
            //Todo: Error handling for array length
            //Todo: Type-Conversions, if target data is not a string type
            insert.setString(1, split[0]);
            insert.setString(2, split[1]);
            insert.setString(3, split[2]);
            insert.addBatch();
            //Submit insert in batches of a good size:
            if(++rowCnt % 10000 == 0) {
                int[] success = insert.executeBatch();
                //Todo: Check if that worked.
            }
        }
        //Handle remaining inserts
        int[] success = insert.executeBatch();
        //Todo: Check if that worked.
    }
} catch(SQLException e) {
    //Handle your Exceptions
}
When choosing "good" fetch and batch sizes, you'll want to consider some parameters:
Fetchsize impacts memory consumption in your client. If you have enough of that you can make it big.
Committing an insert of millions of rows will take some time. Depending on your requirements you might want to commit the insert transaction every once in a while (every 250.000 inserts?)
Think about your transaction isolation: Make sure auto-commit is turned off as committing each insert will make most of the batching gains go away.
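The points above could be combined roughly as follows. This is a sketch under stated assumptions, not a tuned implementation: the table and column names (Table1.Records, Table2 with ID, Name, Country) are taken from the question, ID is assumed to be an integer, and the fields are assumed to be separated by whitespace. BatchCopy, splitRecord and the two constants are hypothetical names. Whether the open result set survives an intermediate commit depends on the driver's cursor holdability, so verify that against your driver before relying on it.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class BatchCopy {
    static final int BATCH_SIZE = 10_000;       // rows sent per executeBatch()
    static final int COMMIT_INTERVAL = 250_000; // rows per intermediate commit

    /** Splits "1 ABC Singapore" into exactly three fields, or fails loudly. */
    public static String[] splitRecord(String row) {
        // Limit 3: anything after the second separator stays in the last field.
        String[] parts = row.trim().split("\\s+", 3);
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields: " + row);
        }
        return parts;
    }

    /** Copies all rows and returns how many were inserted. */
    public static long copy(Connection con) throws SQLException {
        con.setAutoCommit(false); // otherwise every insert commits on its own
        long rows = 0;
        try (PreparedStatement select = con.prepareStatement("SELECT Records FROM Table1");
             PreparedStatement insert = con.prepareStatement(
                     "INSERT INTO Table2 (ID, Name, Country) VALUES (?, ?, ?)")) {
            select.setFetchSize(BATCH_SIZE);
            try (ResultSet rs = select.executeQuery()) {
                while (rs.next()) {
                    String[] f = splitRecord(rs.getString(1));
                    insert.setInt(1, Integer.parseInt(f[0])); // assumption: ID is numeric
                    insert.setString(2, f[1]);
                    insert.setString(3, f[2]);
                    insert.addBatch();
                    rows++;
                    if (rows % BATCH_SIZE == 0) insert.executeBatch();
                    if (rows % COMMIT_INTERVAL == 0) con.commit();
                }
            }
            insert.executeBatch(); // flush the final partial batch
            con.commit();
            return rows;
        } catch (SQLException e) {
            con.rollback(); // undo whatever has not been committed yet
            throw e;
        }
    }
}
```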

Read MySQL table into two-dimensional array

I have a rather sticky situation:
I have been tasked with designing an application that will read the contents of a table from a MySQL database into a two-dimensional string array, scalable to 100.
The schema for the table is:
id = INT(11)
Username = VARCHAR(100)
Password = VARCHAR(45)
UserType = VARCHAR(10)
Full Name = VARCHAR(100)
Age = INT(11)
DOB = DATE
Location = VARCHAR(100)
Record_Created = DATETIME
The issue is that I cannot read this to a file and then to the application for security reasons (contains actual admin account information). Can anyone think of an efficient way of doing this?
.... into a two-dimensional string array, scalable to 100.
I take it that you mean the array can have up to 100 elements.
If so, then one solution is to preallocate an array of size 100 x 9, then query and read the table rows and populate the array.
A second solution is to query, read the table rows, populate a list-of-arrays, and use List.toArray to extract the list contents into a 2-D array.
A final solution is to use select ... count to find the number of rows, and use that to allocate a 2-D array with the correct size. Then proceed as per the first solution.
However, using a 2-D array is a bad idea from an O-O design perspective. Instead, you should declare a class with a field for each of the 9 columns, and then represent the table contents as an array or list of that class.
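The second solution (list-of-arrays, then List.toArray) might look like the sketch below. TableReader, MAX_ROWS and the method names are hypothetical; the approach only moves forward through the ResultSet, so it does not need a scrollable cursor or a separate count query.

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class TableReader {
    static final int MAX_ROWS = 100; // "scalable to 100" read as an upper bound

    /** Converts a list of rows into a 2-D array sized to the actual row count. */
    public static String[][] toArray(List<String[]> rows) {
        return rows.toArray(new String[0][]);
    }

    /** Reads at most MAX_ROWS rows, every column as a String. */
    public static String[][] read(ResultSet rs) throws SQLException {
        int columns = rs.getMetaData().getColumnCount();
        List<String[]> rows = new ArrayList<>();
        while (rows.size() < MAX_ROWS && rs.next()) {
            String[] row = new String[columns];
            for (int j = 0; j < columns; j++) {
                row[j] = rs.getString(j + 1); // JDBC columns are 1-based
            }
            rows.add(row);
        }
        return toArray(rows);
    }
}
```

The same loop could just as easily fill a list of row objects (one field per column), which is the design recommended in the last paragraph above.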
I implemented the following design for my above question:
public static String[][] Table() throws SQLException
{
    // conn is assumed to be an open java.sql.Connection held in a static field.
    // last(), getRow() and beforeFirst() require a scrollable ResultSet.
    Statement table = conn.createStatement(ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_READ_ONLY);
    String sql = "select * from usersTable";
    ResultSet rs = table.executeQuery(sql);
    rs.last();
    int rowNumb = Math.min(rs.getRow(), 100); // never read more than 100 rows
    ResultSetMetaData rsmd = rs.getMetaData();
    int columnS = rsmd.getColumnCount();
    rs.beforeFirst();
    String[][] dbTable = new String[rowNumb][columnS];
    int i = 0;
    while (rs.next() && i < rowNumb)
    {
        for (int j = 0; j < columnS; j++)
        {
            dbTable[i][j] = rs.getString(j + 1);
        }
        i++;
    }
    return dbTable;
}

OutOfMemoryError: Java heap space

I'm having a problem with a java OutOfMemoryError. The program basically looks at mysql tables that are running on mysql workbench, and queries them to get out certain information, and then puts them in CSV files.
The program works just fine with a smaller data set, but once I use a larger data set (hours of logging information as opposed to perhaps 40 minutes) I get this error, which to me says that the problem comes from having a huge data set and the information not being handled too well by the program. Or it not being possible to handle this amount of data in the way that I have.
Setting the Java VM argument -Xmx1024m worked for a slightly larger data set, but I need it to handle even bigger ones, and with those it still gives the error.
Here is the method which I am quite sure is the cause of the problem:
// CSV is csvwriter (external lib), sment are Statements, rs is a ResultSet
public void pidsforlog() throws IOException
{
String[] procs;
int count = 0;
String temp = "";
System.out.println("Commence getting PID's out of Log");
try {
sment = con.createStatement();
sment2 = con.createStatement();
String query1a = "SELECT * FROM log, cpuinfo, memoryinfo";
rs = sment.executeQuery(query1a);
procs = new String[countThrough(rs)];
// SIMPLY GETS UNIQUE PROCESSES OUT OF TABLES AND STORES IN ARRAY
while (rs.next()) {
temp = rs.getString("Process");
if(Arrays.asList(procs).contains(temp)) {
} else {
procs[count] = temp;
count++;
}
}
// BELIEVE THE PROBLEM LIES BELOW HERE. SIZE OF THE RESULTSET TOO BIG?
for(int i = 0; i < procs.length; i++) {
if(procs[i] == null) {
} else {
String query = "SELECT DISTINCT * FROM log, cpuinfo, memoryinfo WHERE log.Process = " + "'" + procs[i] + "'" + " AND cpuinfo.Process = " + "'" + procs[i] + "'" + " AND memoryinfo.Process = " + "'" + procs[i] + "' AND log.Timestamp = cpuinfo.Timestamp = memoryinfo.Timestamp";
System.out.println(query);
rs = sment.executeQuery(query);
writer = new CSVWriter(new FileWriter(procs[i] + ".csv"), ',');
writer.writeAll(rs, true);
writer.flush();
}
}
writer.close();
} catch (SQLException e) {
notify("Error pidslog", e);
}
}; // end of method
Please feel free to ask if you want source code or more information as I'm desperate to get this fixed!
Thanks.
SELECT * FROM log, cpuinfo, memoryinfo will surely give a huge result set. It will give a Cartesian product of all rows in all 3 tables.
Without seeing the table structure (or knowing the desired result) it's hard to pinpoint a solution, but I suspect that you either want some kind of join conditions to limit the result set, or want to use a UNION, à la:
SELECT Process FROM log
UNION
SELECT Process FROM cpuinfo
UNION
SELECT Process FROM memoryinfo
...which will just give you all distinct values for Process in all 3 tables.
Your second SQL statement also looks a bit strange;
SELECT DISTINCT *
FROM log, cpuinfo, memoryinfo
WHERE log.Process = #param1
AND cpuinfo.Process = #param1
AND memoryinfo.Process = #param1
AND log.Timestamp = cpuinfo.Timestamp = memoryinfo.Timestamp
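Looks like you're trying to select from all 3 logs simultaneously, but ending up with another Cartesian product. Note also that MySQL evaluates a chained comparison such as a = b = c as (a = b) = c, so the last condition does not test that all three timestamps are equal. Are you sure you're getting the result set you're expecting?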
You could limit the result returned by your SQL queries with the LIMIT clause.
For example:
SELECT * FROM `your_table` LIMIT 100
This will return the first 100 rows.
SELECT * FROM `your_table` LIMIT 100, 100
This will skip the first 100 rows and return the next 100 (rows 101 to 200).
Obviously you can iterate with those values so you get to all the elements on the data base no matter how many there are.
I think you are loading too much data into memory at once. Try to use OFFSET and LIMIT in your SQL statement so that you can avoid this problem.
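Iterating with LIMIT could be sketched like this. In MySQL the two-argument form is LIMIT offset, row_count, and an ORDER BY is needed for the pages to be stable, since row order is otherwise not guaranteed. The class name Paging and the ordering column id are hypothetical; because table and column names cannot be bound as ? parameters, they must come from trusted code, never from user input.

```java
public class Paging {

    /**
     * Builds a MySQL query for one page of a table.
     * page is zero-based; LIMIT takes (offset, row count).
     */
    public static String pageQuery(String table, String orderBy, int page, int pageSize) {
        long offset = (long) page * pageSize; // long avoids overflow on deep pages
        return "SELECT * FROM `" + table + "` ORDER BY `" + orderBy + "`"
                + " LIMIT " + offset + ", " + pageSize;
    }
}
```

A caller would request page 0, 1, 2, and so on, processing and discarding each page before fetching the next, and stop when a page comes back with fewer than pageSize rows.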
Your Java code is doing things that the database could do more efficiently. From query1a, it looks like all you really want is the unique processes. select distinct Process from ... should be sufficient to do that.
Then, think carefully about what table or tables are needed in that query. Do you really need log, cpuinfo, and memoryinfo? As Joachim Isaksson mentioned, this is going to return the Cartesian product of those three tables, giving you x * y * z rows (where x, y, and z are the row counts in each of those three tables) and a + b + c columns (where a, b, and c are the column counts in each of the tables). I doubt that's what you want or need. I assume you could get those unique processes from one table, or a union (rather than join) of the three tables.
Lastly, your second loop and query are essentially doing a join, something again better and more efficiently left to the database.
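One way to hand both steps to the database is sketched below. It assumes, as the question's query suggests, that all three tables have Process and Timestamp columns and that log alone contains every process of interest; ProcessExport and its members are hypothetical names. The join conditions replace the chained Timestamp comparison, and the process name is bound as a parameter instead of being concatenated into the SQL.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ProcessExport {

    /** Unique process names, computed by the database instead of a Java array scan. */
    public static final String DISTINCT_PROCESSES = "SELECT DISTINCT Process FROM log";

    /** Rows of the three tables matched on both Process and Timestamp. */
    public static final String JOINED_ROWS =
            "SELECT l.*, c.*, m.* FROM log l "
            + "JOIN cpuinfo c ON c.Process = l.Process AND c.Timestamp = l.Timestamp "
            + "JOIN memoryinfo m ON m.Process = l.Process AND m.Timestamp = l.Timestamp "
            + "WHERE l.Process = ?";

    /** Streams the joined rows for one process and returns how many there were. */
    public static int countRows(Connection con, String process) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(JOINED_ROWS)) {
            ps.setString(1, process);
            try (ResultSet rs = ps.executeQuery()) {
                int n = 0;
                while (rs.next()) {
                    n++; // placeholder: write the current row to the CSV file here
                }
                return n;
            }
        }
    }
}
```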
Like others said, fetching the data in smaller chunks might resolve the issue.
This is one of the other threads on Stack Overflow that talks about this issue:
How to read all rows from huge table?

Get the MYSQL Table Name from ResultSet

I have a SELECT query that combines three tables. I want to add the rows to a JTable, separated by the MySQL table they came from. How can I identify the table name in a ResultSet?
resultLoad("SELECT sqNum,RegDate,JobNo,DecName,NoOfLines,EntryType,EntrySubType,EntrySubType2,Amount,Status FROM paysheet WHERE TypeBy='" + cmbStaff.getSelectedItem().toString() + "' AND CommissionStatus='no' UNION SELECT sqNum,RegDate,JobNo,DecName,NoOfLines,EntryType,EntrySubType,EntrySubType2,Amount,Status FROM creditsheet WHERE TypeBy='" + cmbStaff.getSelectedItem().toString() + "' AND CommissionStatus='no' ORDER BY RegDate UNION SELECT sqNumber,date,JobNumber,DecName,noOfLines,type,type,type,CommissionAmount,status FROM newjobsheet WHERE TypeBy='" + cmbStaff.getSelectedItem().toString() + "' AND CommissionStatus='no' ORDER BY RegDate");
private void resultLoad(String loadQuery) {
try {
Connection c = DB.myConnection();
Statement s = c.createStatement();
ResultSet r = s.executeQuery(loadQuery);
while (r.next()) {
Vector v = new Vector();
v.addElement(r.getString("sqNum"));
v.addElement(r.getString("RegDate"));
v.addElement(r.getString("JobNo"));
v.addElement(r.getString("DecName"));
v.addElement(r.getString("NoOfLines"));
v.addElement(r.getString("EntryType") + " - " + r.getString("EntrySubType") + " - " + r.getString("EntrySubType2"));
v.addElement(r.getString("Amount"));
v.addElement(r.getString("Status"));
tm.addRow(v);
totalComAmount = totalComAmount + Integer.parseInt(r.getString("Amount"));
}
} catch (Exception e) {
// e.printStackTrace();
JOptionPane.showMessageDialog(this, e, "Error!", JOptionPane.ERROR_MESSAGE);
}
}
I want to add the rows to the JTable sorted by date, but the three tables contain different columns.
From your result set, you can get a ResultSetMetaData object. It looks like this:
rs.getMetaData().getTableName(column); // column is the 1-based column index
"I want to add them to a table by separating the tables."
Not sure what you mean by that, but:
"I just want to know how can I identify the table name in a Resultset?"
the answer is that you can't, unless you rewrite the query so that it does it for you.
A SELECT statement yields a new (virtual) table - any columns it delivers are technically columns of that new virtual table. The result does not remember the origin of those columns.
However, you can write the query so as to give every expression in the SELECT list a column alias that allows you to identify the origin. For instance, you could do:
SELECT table1.column1 AS table1_column1
, table1.column2 AS table1_column2
, ...
, table2.column1 AS table2_column1
, ...
FROM table1 INNER JOIN table2 ON ... etc
If the underscore is not suitable as a separator for your purpose, you could quote the aliases and pick whatever character you like as the separator, including a dot:
SELECT table1.column1 AS `table1.column1`
, ....
etc.
UPDATE:
I just noticed your query is a UNION over several tables. Obviously this method won't work in that case. I think your best bet is still to rewrite the query so that it reads:
SELECT 'paysheet' as table_name, ...other columns...
FROM paysheet
UNION
SELECT 'creditsheet', ...other columns...
...
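Generating such a query from Java could look like the following sketch. TaggedUnion.build and its map parameter are hypothetical; the table names and column lists are assumed to come from trusted code rather than user input, and the WHERE clauses from the question are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class TaggedUnion {

    /**
     * Builds one SELECT per table, each tagged with a literal table_name column,
     * and joins them with UNION. Map order decides branch order, so pass a
     * LinkedHashMap if the first branch must supply the column names.
     */
    public static String build(Map<String, String> columnsByTable) {
        List<String> branches = new ArrayList<>();
        for (Map.Entry<String, String> e : columnsByTable.entrySet()) {
            String table = e.getKey();
            branches.add("SELECT '" + table + "' AS table_name, " + e.getValue()
                    + " FROM " + table);
        }
        return String.join(" UNION ", branches);
    }
}
```

When reading the result, r.getString("table_name") then tells you which table each row came from.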
It seems to me like you want to SELECT data from three separate tables and have it returned in one query as one ResultSet, then you want to separate the results back out into individual tables.
This is completely counterintuitive.
If you want to have your results separated so that you are able to look at each table's results separately, your best bet is to perform 3 different SELECT queries. Basically, what you are asking is to combine 3 things and then separate them again in one pass. Don't do this; if you leave them uncombined in the first place, separating them back out won't be an issue for you.
