DataFrame transformation in Spark, Java

After I load a JSON file with:
df = sqlContext.read().json(path);
I get a DataFrame in Java Spark. For example, I have the following DataFrame:
id item1 item2 item3 ....
id1 0 3 4
id2 1 0 2
id3 3 3 0
...
I want to transform it, in the easiest way possible, into the following form (probably as objects of the class Rating, with id and item then converted to Integer via .hashCode()):
id item ranking
id1 item1 0
id1 item2 3
id1 item3 4
....
id2 item1 1
id2 item2 0
id2 item3 2
...
PS: Here is my first attempt at a flatMap function:
void transformTracks() {
JavaRDD<Rating> = df.flatMap(new Function<Row, Rating>(){
public Rating call(Row r) {
for (String i : r) {
return Rating(1, 1, r.apply(Double.parseDouble(i)));
}
}
})
}

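You have to forgive me if the syntax is slightly off (I program in Scala nowadays and it's been a while since I used Java), but something along these lines should work. This version assumes the Spark 1.x Java API, where flatMap is called on df.javaRDD() and the function returns an Iterable, and it assumes the item columns are numeric:
DataFrame df = sqlContext.read().json(path);
String[] columnNames = df.columns();
JavaRDD<Row> rows = df.javaRDD().flatMap(row -> {
    // One output row (id, item, ranking) per item column of the input row.
    List<Row> list = new ArrayList<>(columnNames.length);
    String id = row.getString(0);
    for (int i = 1; i < columnNames.length; i++) {
        // JSON integers are usually read as long, so convert through Number.
        list.add(RowFactory.create(id, columnNames[i], ((Number) row.get(i)).intValue()));
    }
    return list;
});
StructType schema = DataTypes.createStructType(new StructField[] {
    DataTypes.createStructField("id", DataTypes.StringType, false),
    DataTypes.createStructField("item", DataTypes.StringType, false),
    DataTypes.createStructField("ranking", DataTypes.IntegerType, false)});
DataFrame newDF = sqlContext.createDataFrame(rows, schema);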
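The reshaping itself does not depend on Spark. As a minimal sketch of the same wide-to-long step on plain Java collections (the class names Unpivot and Triple are hypothetical stand-ins, not part of Spark or of the asker's Rating class):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class Unpivot {
    // Hypothetical stand-in for a Rating-like record: one (id, item, ranking) triple.
    static final class Triple {
        final String id;
        final String item;
        final int ranking;

        Triple(String id, String item, int ranking) {
            this.id = id;
            this.item = item;
            this.ranking = ranking;
        }

        @Override
        public String toString() {
            return id + " " + item + " " + ranking;
        }
    }

    // columnNames[0] is the id column; every other column yields one Triple per input row.
    static List<Triple> unpivot(String[] columnNames, List<Object[]> rows) {
        List<Triple> out = new ArrayList<>();
        for (Object[] row : rows) {
            String id = (String) row[0];
            for (int i = 1; i < columnNames.length; i++) {
                out.add(new Triple(id, columnNames[i], ((Number) row[i]).intValue()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] columns = {"id", "item1", "item2", "item3"};
        List<Object[]> rows = Arrays.asList(
                new Object[] {"id1", 0, 3, 4},
                new Object[] {"id2", 1, 0, 2},
                new Object[] {"id3", 3, 3, 0});
        for (Triple t : unpivot(columns, rows)) {
            System.out.println(t);
        }
    }
}
```

Each input row with k item columns produces k output triples, which is the same expansion a flatMap over the rows would perform.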

Related

Android database table data to specific model object

I have an SQLite database with a table 'data' as follows:
uid type name value
1 abc North 10
1 abc South 0
1 abc East 0
1 abc West 0
1 abc Total 10
1 xyz Total 20
1 xyz Open 10
1 xyz Close 10
Using a select query I can get this table data. Now I need to create a response object with the following structure:
DataModel(String type, ArrayList<Data> data)
Data(String name, String value)
The expected output JSON is:
{"data":[{"type":"abc", "values":[{"name":"North", "value":"10"}, {"name":"South", "value":"0"}...]},
{"type":"xyz", "values":[{"name":"total", "value":"20"}, {"name":"open", "value":"10"}...]}]}
For now I tried this Kotlin code in Android Studio and it works well (the 'result' ArrayList contains the table data above):
val dataList: ArrayList<DashboardDataModel> = ArrayList()
val typeAyyList: ArrayList<String> = ArrayList()
for(res in result){
typeAyyList.add(res.type)
}
val distinct = typeAyyList.toSet().toList()
for(type in distinct) {
val values: ArrayList<DashboardData> = ArrayList()
for (res in result) {
if (type == res.type) {
val dash=DashboardData()
dash.name=res.name
dash.value=res.value
values.add(dash)
}
}
val data = DashboardDataModel()
data.type=type
data.values=values
dataList.add(data)
}
Is there any better solution available?
Any help/suggestion please?
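The nested loops above scan the whole result list once per distinct type. One possible alternative, sketched here in Java rather than Kotlin, is to group in a single pass with a map. The Row class and the name groupByType are assumptions made for illustration, not the asker's DashboardData types:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class GroupByType {
    // Hypothetical row type mirroring one record of the 'data' table.
    static final class Row {
        final String type;
        final String name;
        final String value;

        Row(String type, String name, String value) {
            this.type = type;
            this.name = name;
            this.value = value;
        }
    }

    // Groups rows by type in one pass; LinkedHashMap keeps the types in first-seen order.
    static Map<String, List<Row>> groupByType(List<Row> rows) {
        Map<String, List<Row>> grouped = new LinkedHashMap<>();
        for (Row r : rows) {
            grouped.computeIfAbsent(r.type, key -> new ArrayList<>()).add(r);
        }
        return grouped;
    }
}
```

Each map entry then corresponds to one DataModel: the key is the type and the list holds its values. In Kotlin, the standard library function groupBy (for example result.groupBy { it.type }) performs the same single-pass grouping.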

How to read from mysql and write column wise in csv file?

I have different tables in a database (say MySQL). I need to extract some columns from different tables and write them to a CSV file. Consider these tables:
table 1
A 1
B 2
C 3
table 2
X 7
Y 8
Z 10
U 11
V 12
W 13
table 3
AB A1
BC B2
CD C3
DE D4
I want to write the 1st column from table 1, the 2nd column from table 2, and the 1st column from table 3 to a CSV file, such that missing rows are written as null.
output:
A,7,AB
B,8,BC
C,10,CD
null,11,DE
null,12,null
null,13,null
I can do the basic reading from MySQL and writing to CSV; I need help with the logic or code to get the output above. I am looking for a generic solution for, say, 'n' columns from 'n' tables. The above is just an example.

I do not know how you read your database, but assuming you do it with JDBC:
// Obtain each ResultSet from Statement.executeQuery(...); null is only a placeholder here.
ResultSet tableAResultSet = null;
ResultSet tableBResultSet = null;
ResultSet tableCResultSet = null;
List<PseudoContainer> container = new ArrayList<>();
while (tableAResultSet.next()) {
    PseudoContainer ps = new PseudoContainer();
    ps.col1 = tableAResultSet.getString("ColumnName");
    container.add(ps);
}
int i = 0;
while (tableBResultSet.next()) {
    // Grow the list when this table has more rows than the ones read so far.
    if (container.size() <= i) {
        container.add(new PseudoContainer());
    }
    container.get(i).col2 = tableBResultSet.getString("ColumnName");
    i++;
}
i = 0;
while (tableCResultSet.next()) {
    if (container.size() <= i) {
        container.add(new PseudoContainer());
    }
    container.get(i).col3 = tableCResultSet.getString("ColumnName");
    i++;
}
// Now you have a collection that you can write out row by row.
class PseudoContainer {
    String col1 = null;
    String col2 = null;
    String col3 = null;
}
The above should work, but it is still pseudocode.

Since you just want the logic, here is some pseudocode that may help you in whatever language you are using. Since I don't know how you are exporting to CSV, I made it pretty generic.
arr1 = select column1 from table1;
arr2 = select column2 from table2;
arr3 = select column1 from table3;
max = isBigger(arr1.length, arr2.length);
max = isBigger(max, arr3.length);
for(i=0; i<max; i++)
{
    // a missing or empty cell becomes null
    v1 = (i < arr1.length && arr1[i] != "") ? arr1[i] : null;
    v2 = (i < arr2.length && arr2[i] != "") ? arr2[i] : null;
    v3 = (i < arr3.length && arr3[i] != "") ? arr3[i] : null;
    print v1 + "," + v2 + "," + v3;
}
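The pseudocode above can be generalized to any number of columns. The following is a minimal Java sketch under the assumption that each selected column has already been read into a list of strings; the names ColumnZip and toCsvLines are hypothetical, and no CSV quoting or escaping is attempted:

```java
import java.util.ArrayList;
import java.util.List;

final class ColumnZip {
    // Combines n columns of unequal length into CSV lines.
    // A missing or empty cell is written as the text "null".
    static List<String> toCsvLines(List<List<String>> columns) {
        int max = 0;
        for (List<String> column : columns) {
            max = Math.max(max, column.size());
        }
        List<String> lines = new ArrayList<>();
        for (int row = 0; row < max; row++) {
            StringBuilder line = new StringBuilder();
            for (int col = 0; col < columns.size(); col++) {
                List<String> column = columns.get(col);
                String cell = row < column.size() ? column.get(row) : null;
                if (cell != null && cell.isEmpty()) {
                    cell = null;
                }
                if (col > 0) {
                    line.append(',');
                }
                // StringBuilder.append(String) writes "null" for a null reference.
                line.append(cell);
            }
            lines.add(line.toString());
        }
        return lines;
    }
}
```

With the three example columns (A, B, C), (7, 8, 10, 11, 12, 13) and (AB, BC, CD, DE), the fourth line comes out as null,11,DE. For real data that may contain commas or quotes, a CSV library is the safer choice.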

I recommend using a third-party library to handle CSV files:
Apache Commons CSV
OpenCSV

mysql query in clause at playframework

In the two SQL queries below, sql1 does not select any rows, and sql2 selects only one row, the one for 111#k2.com.
var ids="'111#k2.com','222#k2.com','333#k2.com','444#k2.com','555#k2.com','666#k2.com'"
val sql1 = SQL("SELECT id,point,privacy FROM `pointTable` WHERE state=1 and id in ({users})").on("users" -> ids)
sql1().map { row =>
val point = if (row[Boolean]("privacy")) { row[Double]("point").toString } else { "0" }
println(write(Map("id" -> row[String]("id"), "point" -> point)))
}
val sql2 = SQL("SELECT id,point,privacy FROM `pointTable` WHERE state=1 and id in (" + ids + ")")
sql2().map { row =>
val point = if (row[Boolean]("privacy")) { row[Double]("point").toString } else { "0" }
println(write(Map("id" -> row[String]("id"), "point" -> point)))
}
When I run this query manually in phpMyAdmin it returns 6 rows, so why does it not work properly here?
I am using Play Framework 2.2 with Scala 2.1.

That's not going to work. Passing users through on is going to escape the entire string, so it will appear as one value instead of a list. Anorm in Play 2.3 actually allows you to pass lists as parameters, but here you'll have to work around that:
val ids: List[String] = List("111#k2.com", "222#k2.com", "333#k2.com")
val indexedIds: List[(String, Int)] = ids.zipWithIndex
// Create a bunch of parameter tokens for the IN clause.. {id_0}, {id_1}, ..
val tokens: String = indexedIds.map{ case (id, index) => s"{id_${index}}" }.mkString(", ")
// Create the parameter bindings for the tokens
val parameters = indexedIds.map{ case (id, index) => (s"id_${index}" -> toParameterValue(id)) }
val sql1 = SQL(s"SELECT id,point,privacy FROM `pointTable` WHERE state=1 and id in (${tokens})")
.on(parameters: _ *)
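The same idea, one placeholder per value with each value bound separately, applies outside Anorm too. As a rough illustration with plain JDBC in Java (the class InClause and its method names are made up for this sketch; JDBC uses positional ? markers instead of Anorm's named tokens):

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Collections;
import java.util.List;

final class InClause {
    // Returns "?, ?, ?" with one placeholder per value.
    static String placeholders(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("IN () with no values is not valid SQL");
        }
        return String.join(", ", Collections.nCopies(count, "?"));
    }

    // Builds the query text; the ids themselves never become part of the SQL string.
    static String query(List<String> ids) {
        return "SELECT id, point, privacy FROM `pointTable` WHERE state = 1 AND id IN ("
                + placeholders(ids.size()) + ")";
    }

    // Binds each id to its own placeholder; JDBC parameter indexes start at 1.
    static void bind(PreparedStatement statement, List<String> ids) throws SQLException {
        for (int i = 0; i < ids.size(); i++) {
            statement.setString(i + 1, ids.get(i));
        }
    }
}
```

Because every id is bound as its own parameter, the driver escapes each value individually, and the list is no longer collapsed into a single quoted string.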

get the column value for the two consecutive rows having the same other column values

My table has two columns: Date and Type.
With the rows sorted in descending order by Date, I want to find the most recent Date at which the user has the same Type, "Clever", in two consecutive rows (see the example below).
For example, if my table contains the following data:
Sr Date Type
1 2013-05-24T16:21:06.728Z Alaska
2 2013-05-27T20:44:32.412Z Clever
3 2013-05-27T20:45:33.301Z Clever
4 2013-05-27T21:45:46.127Z Clever
5 2013-05-27T21:46:27.825Z Self
6 2013-05-28T15:18:48.430Z Clever
then I want the Date 2013-05-27T21:45:46.127Z.
I tried the following:
ArrayList<String> startTimeList = new ArrayList<String>();
cur = dbAdapter.rawQuery("select Date, Type from User ORDER BY Date DESC", null);
cur.moveToFirst();
if(cur.getCount() > 0)
{
int i = 0;
while(cur.isAfterLast() == false)
{
if(cur.getString(1).equals("Clever"))
{
startTimeList.add(cur.getString(0));
cur.moveToNext();
}
else
{
cur.moveToNext();
}
if(startTimeList.size() == 2)
{
return;
}
}
But this is not giving me the Date for the consecutive rows that have the Type as "Clever".

String dateOfLastCleverPrecededByADuplicate = null;
cur = dbAdapter.rawQuery("select Date, Type from User ORDER BY Date DESC", null);
cur.moveToFirst();
// Date and Type of the previously visited (more recent) row; null before the first row.
String prevType = null;
String prevDate = null;
while (!cur.isAfterLast())
{
    if (cur.getString(1) != null &&
        cur.getString(1).equals(prevType) &&
        cur.getString(1).equals("Clever"))
    {
        // found two consecutive records with the identical type 'Clever'
        dateOfLastCleverPrecededByADuplicate = prevDate;
        break;
    }
    prevDate = cur.getString(0);
    prevType = cur.getString(1);
    cur.moveToNext();
}
After the loop, dateOfLastCleverPrecededByADuplicate is "2013-05-27T21:45:46.127Z" in your example, or null if no such pair of rows exists.
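To check the logic without an Android Cursor, the same scan can be sketched over an in-memory list. This assumes the rows are {date, type} pairs already sorted by date in descending order; ConsecutiveType and newestOfFirstRun are hypothetical names:

```java
import java.util.List;

final class ConsecutiveType {
    // rows: {date, type} pairs sorted by date, newest first.
    // Returns the date of the first (newest) row whose type equals wanted and whose
    // next-older neighbour has the same type, or null when there is no such pair.
    static String newestOfFirstRun(List<String[]> rows, String wanted) {
        for (int i = 0; i + 1 < rows.size(); i++) {
            String current = rows.get(i)[1];
            String older = rows.get(i + 1)[1];
            if (wanted.equals(current) && wanted.equals(older)) {
                return rows.get(i)[0];
            }
        }
        return null;
    }
}
```

For the example table, the newest row (Sr 6) is followed by a "Self" row and is skipped; the first matching pair is Sr 4 and Sr 3, so the method returns the date of Sr 4.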

Elements repeated on a php array from sql

I have two arrays: in one I insert all the question IDs from my SELECT, and in the other I want to insert the same IDs, but without repeats this time. My code for the second array doesn't work and I don't know why. I can't use DISTINCT in my SELECT because it doesn't help (the rows are different), and I don't want to use two SELECTs for this.
$query_slidersanswers= "SELECT A.QuestionIDFK, A.AnswerIDPK, A.AnswerValue, A.SortOrder
FROM tblquestionset AS QS
INNER JOIN tblquestion AS Q ON QS.QuestionIDFKPK = Q.QuestionIDPK
INNER JOIN tblanswer AS A ON Q.QuestionIDPK = A.QuestionIDFK
WHERE QS.QuestionSetIDPK = '0'
AND QS.OnPage = '1'
AND Q.Constructor = '".$_session['slider']."'";
$Query_Sliders= mysql_query($query_slidersanswers);
$currentQuestionID= 0;
while($row_Slider=mysql_fetch_array($Query_Sliders)){
$QuestionID=$row_Slider['QuestionIDFK'] ;
$AnswerID=$row_Slider['AnswerIDPK'] ;
$AnswerValue=$row_Slider['AnswerValue'] ;
$SortOrder=$row_Slider['SortOrder'] ;
$tableslidersqid[] = array($QuestionID);
if($QuestionID != $currentQuestionID){
//I DO THIS FOR OBTAIN other array with THE UNIQUES ID'S (non repeated)
$tableslidersREALqid[] = array($QuestionID);
$CurrentQuestionID = $QuestionID;
}
}

Suppose that your array of question IDs is as below:
$input = array( "19", "55", "19", "55", "78" );
$result = array_unique($input);
print_r($result);
The above example will output:
Array
(
[0] => 19
[1] => 55
[4] => 78
)
array_unique($array) detects duplicate values in the array and keeps only the first occurrence of each; the rest are skipped. The original keys are preserved, which is why the last key above is 4.
Edit:
while($row_Slider=mysql_fetch_array($Query_Sliders)){
$QuestionID[]=$row_Slider['QuestionIDFK']; //used $QuestionID[], instead of $QuestionID
//$AnswerID=$row_Slider['AnswerIDPK'] ;
//$AnswerValue=$row_Slider['AnswerValue'] ;
//$SortOrder=$row_Slider['SortOrder'] ;
//$tableslidersqid[] = array($QuestionID);
//if($QuestionID != $currentQuestionID){
//$tableslidersREALqid[] = array($QuestionID);
//$CurrentQuestionID = $QuestionID;
//}
}
$result = array_unique($QuestionID); //make array unique here, after all the ids are in the array
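For comparison, here is how the same "collect everything first, then drop the duplicates" step might look in Java. This is only an analogous sketch, not a port of array_unique; UniqueIds is a hypothetical name:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

final class UniqueIds {
    // Keeps the first occurrence of each id and preserves the original order.
    static List<String> unique(List<String> ids) {
        return new ArrayList<>(new LinkedHashSet<>(ids));
    }
}
```

Unlike array_unique, which keeps the original array keys, the resulting list is indexed contiguously from 0.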
