Retrieve DataFrame Values in a Java Array - java

I am using Apache Spark. I want to retrieve the values of a DataFrame into a String array. I have registered the DataFrame as a temporary table:
dataframe.registerTempTable("table_name");
DataFrame d2=sqlContext.sql("Select * from table_name");
Now I want this data to be retrieved into a Java array (String type would be fine). How can I do that?

You can use the collect() method to get a Row[]. Each Row contains the column values of one row of your DataFrame. If there is a single value in each row, you can add the values to an ArrayList of String. If there is more than one column in each row, use an ArrayList of a custom object type and set its properties from the row. In the code below, instead of printing "Row Data", you can add the values to an ArrayList.
Row[] dataRows = d2.collect();
for (Row row : dataRows) {
System.out.println("Row : "+row);
for (int i = 0; i < row.length(); i++) {
System.out.println("Row Data : "+row.get(i));
}
}

Related

How to get column names of Spark Row using java

I am trying to convert a Spark DataFrame to an RDD and apply a function using map.
In PySpark, we can fetch the value of a column by converting the Row to a dictionary (the key being the column name, the value being the value of that column), as below:
row_dict = row.asDict()
val = row_dict['column1'] # I can access the value of any column
Now, in Java, I am trying to do a similar thing. I get the Row, and I found that it has APIs to get values by index:
JavaRDD<Row> resultRdd = df.javaRDD().map(x -> customFunction(x, customParam1, customParam2));
public static Row customFunction(Row row, Object o1, Object o2) {
// need to access "column1" value from the row
// how to get column name of each index if we have to use row.get(index)
}
How can I access the row values based on column names in java code?

How to select a particular hashmap<String,String> from a list by its value

I'm fairly new to Java and have been searching everywhere for an answer.
I run a SQL query and use the response to build a list of HashMaps from the column names and values.
List<HashMap<String,String>> rulesList = Sql.getStuff("abc");
This gets me a list like this: {column_1=abc, column_3=ghi, column_2=def}
I want to do two things with this list. First, I want to see if any column contains a particular value (ruleName). This part works well enough:
if (rulesList.get(0).containsValue(ruleName)) {
System.out.println("Expected: " + ruleName);
System.out.println("Actual: " + ruleName); //Would like to change this to include the actual column name and result
}
Then I want to check all the other columns that have a particular phrase in their name, such as the word "column" in "column_1, column_2, column_3", to see whether they contain a value or not. If they do contain a value, I want to print out the column name and value.
However, this is when I run into the problem of not knowing how to select from within the list. How can I get only column_2 and its accompanying data, or the column name associated with value abc?
Something like this example?
List<HashMap<String,String>> rulesList = new ArrayList<HashMap<String,String>>() {{
for (int i = 0; i < 4; i++) {
HashMap<String,String> map = new HashMap<>();
map.put("key", Integer.toString(i));
map.put("column1", "Value in Column1 in row " + i);
map.put("column2", "Value in Column2 in row " + i);
add(map);
}
}};
for (HashMap<String,String> row : rulesList) {
for (String columnName : row.keySet()) {
// look for a column name
if(columnName.contains("column")) {
System.out.printf("Column \"%s\" has a value of \"%s\"%n", columnName, row.get(columnName));
}
// look for a cell value
if(row.get(columnName).matches(".+Column\\d in row 1")) {
System.out.printf("Value found in %s, row %s%n", columnName, row.get("key"));
}
}
}
I think you just need the vocabulary: a HashMap associates a unique key with one value.
With those keywords in mind you can easily find answers, for example:
First off I want to see if any column contains a particular value (ruleName)
Here is an example
Then I want to check all the other columns that have a particular phrase in their name to see if ...
And there.
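As a rough illustration of both lookups, here is a minimal sketch that walks `Map.entrySet()` so the key (column name) and the value are available together. The class `RuleLookup` and its method names are hypothetical, and the sample row only mirrors the one shown in the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the names are for illustration only.
class RuleLookup {

    // Returns the names of all columns in the row whose value equals the given (non-null) value.
    static List<String> columnsWithValue(Map<String, String> row, String value) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, String> entry : row.entrySet()) {
            if (value.equals(entry.getValue())) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    // Returns the entries whose column name contains the phrase and whose value is non-empty.
    static Map<String, String> filledColumnsMatching(Map<String, String> row, String phrase) {
        Map<String, String> matches = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : row.entrySet()) {
            String value = entry.getValue();
            if (entry.getKey().contains(phrase) && value != null && !value.isEmpty()) {
                matches.put(entry.getKey(), value);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("column_1", "abc");
        row.put("column_2", "def");
        row.put("column_3", "ghi");
        row.put("other", "");

        System.out.println(columnsWithValue(row, "abc"));         // [column_1]
        System.out.println(filledColumnsMatching(row, "column")); // {column_1=abc, column_2=def, column_3=ghi}
    }
}
```

For a list of rows, the same methods can be called once per map while iterating over the list.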
Your question is premised on a list of HashMaps.
Based on your question, it looks like you might be trying to use this structure:
List_Row<HashMap<String_ColumnName, String_Value>>
If this is the case, consider changing your storage structure to a single HashMap that maps each column name to a list of that column's values:
HashMap<String_ColumnName, ArrayList<String_Value>>
Then you can simply grab a column and look through its data. To get a row back as a map, you can write a function fairly easily:
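// Assumes results is a Map<String, List<String>> field mapping each column name to its values.
Map<String, String> getRow(int i) {
    Map<String, String> row = new HashMap<>();
    // Take the i-th value of every column to rebuild one row.
    for (String k : results.keySet()) {
        row.put(k, results.get(k).get(i));
    }
    return row;
}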

Get individual values from an array created from a resultset in java

I have an array that was created from an ArrayList, which was in turn created from a ResultSet. This array contains the rows of a database table, and each row (with several columns based on my query) exists as a single element in the array. So far so good. My problem is how to get the individual values (columns) from each row which, as I said earlier, now exists as an element. I can get each element (a row, of course), but that is not what I want. Each element is a composite of several values; how do I get those? I am a beginner and really stuck here. I hope this all makes sense. Here is the code showing how I created the array:
List resultsetRowValues = new ArrayList();
while (resultSet.next()){
for (int i = 1; i <= columnCount; i++) {
resultsetRowValues.add(resultSet.getString(i));
}
}
String[] databaseRows = (String[]) resultsetRowValues.toArray(new String[resultsetRowValues.size()]);
EDIT: More explanation
My MySQL query is as follows:
String query = "SELECT FIRSTNAME, LASTNAME, ADDRESS FROM SOMETABLE WHERE CITY='SOMECITY'";
This returns several rows in a ResultSet. According to the sample query, each element of the array will contain three values (columns), i.e. FIRSTNAME, LASTNAME and ADDRESS. But these three values exist in the array as a single element, while I want each column separately from each element (which is actually a row of the database table). When I iterate through the array using a for loop and print the values to the console, I get output similar to the following:
Doe
Jhon
Some Street (End of First element)
Smith
Jhon
Some Apartment (End of Second element and so on)
As is evident from the output, each element of the array contains three values, which are printed on separate lines.
How can I get these individual values?
You probably want something like this:
List<Map<String, String>> data = new ArrayList<>();
while (resultSet.next()){
Map<String, String> map = new HashMap<>();
for (int i = 1; i <= columnCount; i++) {
map.put("column" + i, resultSet.getString(i));
}
data.add(map);
}
// usage: data.get(2).get("column12") returns row 3 / column 12
Note that there are other possible options (2D-array, guava Table, ...)
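As one example of the 2D-array option: if you already have the flat array from the question, where the values of each row sit next to each other, it can be regrouped using the column count. This is only a sketch, assuming the array length is an exact multiple of the column count; `RowGrouper` and `toRows` are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical helper class; the names are for illustration only.
class RowGrouper {

    // Splits a flat array of cell values into rows of columnCount values each.
    static String[][] toRows(String[] flat, int columnCount) {
        if (columnCount <= 0 || flat.length % columnCount != 0) {
            throw new IllegalArgumentException("array length must be a multiple of columnCount");
        }
        String[][] rows = new String[flat.length / columnCount][];
        for (int r = 0; r < rows.length; r++) {
            // Copy the slice that belongs to row r.
            rows[r] = Arrays.copyOfRange(flat, r * columnCount, (r + 1) * columnCount);
        }
        return rows;
    }

    public static void main(String[] args) {
        String[] databaseRows = {"Doe", "Jhon", "Some Street", "Smith", "Jhon", "Some Apartment"};
        String[][] rows = toRows(databaseRows, 3);
        // rows[rowIndex][columnIndex], both zero-based
        System.out.println(rows[1][2]); // Some Apartment
    }
}
```

Compared with the map-based version, this keeps columns addressable only by position, so it fits best when the column order of the query is fixed.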

Turning a ResultSet into 2DArray/HashMap/Lists of Lists

I think I've become code-blind.
I'm currently doing a small project with a friend and we're using JDBC to select from a MySQL database.
What I want is that after running the select statement, I get some sort of '2D' list of all the data back.
I think I want something returned like this -
array[columnName][all of the data inside the selected column]
Pseudo Code
// Count all rows from a column
// Count the columns selected
// Adds the column names to 'array'
for(int i = 1; i <= columnCount; i++)
columns.add(resultSetMeta.getColumnName(i));
// Adds the results of the first column to the 'array'
// How can I add the results from n columns?
for(int i = 1; i <= rowCount; i++ ) {
while (resultSet.next())
rows.add(resultSet.getString(i));
}
columnsAndRows.put(columns, rows);
What is the most appropriate data type to use to 'replicate' a table - ArrayLists, Arrays, or HashMaps?
What's the best way of making a ResultSet into a 2D datatype?
When iterating through a ResultSet, how can I move to the next column?
You can use a HashMap for key-value pairs, where the keys come from the ResultSet metadata (the column names) and the values come from the ResultSet.
HashMap<String, Object> p = new HashMap<String, Object>();
p.put("ResultSetkey1","ResultSetvalue1");
p.put("ResultSetkey2","ResultSetvalue2");
I would also suggest using ResultsetUtils.
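To make the column-name-to-values idea concrete, here is a minimal sketch that reads a whole ResultSet into a map from column label to the list of that column's values. `ResultSetColumns` and `toColumns` are hypothetical names, every value is read with `getString` for simplicity, and duplicate column labels are assumed not to occur (they would share one list).

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the names are for illustration only.
class ResultSetColumns {

    // Reads every row and groups the values by column label, keeping the query's column order.
    static Map<String, List<String>> toColumns(ResultSet resultSet) throws SQLException {
        ResultSetMetaData meta = resultSet.getMetaData();
        int columnCount = meta.getColumnCount();

        // JDBC column indexes start at 1.
        String[] labels = new String[columnCount];
        Map<String, List<String>> columns = new LinkedHashMap<>();
        for (int i = 1; i <= columnCount; i++) {
            labels[i - 1] = meta.getColumnLabel(i);
            columns.put(labels[i - 1], new ArrayList<String>());
        }

        // next() moves to the following row; the inner loop moves across the columns of that row.
        while (resultSet.next()) {
            for (int i = 1; i <= columnCount; i++) {
                columns.get(labels[i - 1]).add(resultSet.getString(i));
            }
        }
        return columns;
    }
}
```

With this shape, `columns.get("FIRSTNAME")` would return all values of that column, and the value at index `r` of each list belongs to row `r`.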

how to export dynamic column data to excel

I have to export data to Excel with a dynamic number of columns, using the Apache workbook in Java. On every execution, the column details are generated dynamically and saved in a list:
List<Object> expColName = new ArrayList<Object>();
From the list, I have to obtain the individual values and export each one into its own column of the Excel sheet:
for(int i=0; i<expColName.size(); i++){
data.put("1",new Object[] {
expColName.get(i)
});
}
The above code gives only the last column value in the Excel sheet.
What type is data, and how do you read the values from the map?
It seems like you are putting every object under the same key of the Map; that's why you only get the last item from the list.
You could test this with:
for(int i=0; i<expColName.size(); i++){
data.put(i+"",new Object[] {
expColName.get(i)
});
}
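To illustrate why only the last value survived: `Map.put` with a key that already exists replaces the previous value. The sketch below contrasts the two loops and adds a third variant that stores all column names in a single row array, which may be closer to what the export needs. The type of `data` (`Map<String, Object[]>`) is an assumption, since the question does not show it, and `ExportKeys` is a hypothetical class name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class; the type of data is assumed, not taken from the question.
class ExportKeys {

    // Original loop: every put uses the key "1", so each call overwrites the previous entry.
    static Map<String, Object[]> sameKey(List<Object> expColName) {
        Map<String, Object[]> data = new LinkedHashMap<>();
        for (int i = 0; i < expColName.size(); i++) {
            data.put("1", new Object[] { expColName.get(i) });
        }
        return data;
    }

    // Suggested loop: a distinct key per value keeps every entry.
    static Map<String, Object[]> distinctKeys(List<Object> expColName) {
        Map<String, Object[]> data = new LinkedHashMap<>();
        for (int i = 0; i < expColName.size(); i++) {
            data.put(String.valueOf(i), new Object[] { expColName.get(i) });
        }
        return data;
    }

    // Alternative: one entry whose array holds all values, one per column.
    static Map<String, Object[]> singleRow(List<Object> expColName) {
        Map<String, Object[]> data = new LinkedHashMap<>();
        data.put("1", expColName.toArray());
        return data;
    }

    public static void main(String[] args) {
        List<Object> expColName = new ArrayList<Object>(Arrays.asList("Name", "City", "Age"));
        System.out.println(sameKey(expColName).size());            // 1
        System.out.println(distinctKeys(expColName).size());       // 3
        System.out.println(singleRow(expColName).get("1").length); // 3
    }
}
```

Which variant fits depends on how the map is read back when the sheet is written: distinct keys give one entry per value, while the single-row variant gives one entry with one array element per column.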
