Is there a way to import a CSV file into Cassandra through Spark's Java API without creating a POJO class for the CSV? I am able to insert the CSV by creating a POJO class as shown below. Is there a way to do this programmatically with the Spark Java API, without the POJO class?
My CSV looks like this:
Name,Age,bg,sex
ammar,67,ab+,M
nehan,88,b+,M
moin,99,m+,M
arbaaz,67,a+,M
...
And the program is below:
import org.apache.commons.lang3.StringUtils;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;
import com.cassandra.insertion.MergeGeneSymDataInsertion;
import com.cassandra.insertion.MergeGeneSymDataInsertion.HgIpsenGeneSym;
import com.publicdata.task.PublicDataInsertion.PublicData;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.*;
public class InsertCsv {
static JavaSparkContext ctx = null;
static boolean isHeader = true;
public static void main(String[] args) {
try {
ctx = new JavaSparkContext(new SparkConf().setMaster("local[4]")
.setAppName("TestCsvInserion"));
insertCsv(ctx);
} catch (Exception e) {
e.printStackTrace();
}
}
private static void insertCsv(JavaSparkContext ctx) {
JavaRDD<String> testfileRdd = ctx
.textFile("/home/syedammar/Pilot Project /test.csv");
JavaRDD<Bats> batsclassRdd = testfileRdd
.map(new Function<String, Bats>() {
@Override
public Bats call(String line) throws Exception {
// TODO Auto-generated method stub
if(!isHeader){
String[] words=StringUtils.split(line, ",");
String name = words[0];
String age = words[1];
String bg = words[2];
String sex = words[3];
return new Bats(name, age, bg, sex);
}
else
{
isHeader=false;
return null;
}
}
}).filter(new Function<Bats, Boolean>() {
@Override
public Boolean call(Bats obj) throws Exception {
// TODO Auto-generated method stub
return obj!=null;
}
}).coalesce(1);
javaFunctions(batsclassRdd).writerBuilder("test", "bats", mapToRow(Bats.class)).saveToCassandra();
}
public static class Bats {
public Bats() {
// TODO Auto-generated constructor stub
}
private String name;
private String age;
private String bg;
public Bats(String name, String age, String bg, String sex) {
super();
this.name = name;
this.age = age;
this.bg = bg;
this.sex = sex;
}
private String sex;
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getAge() {
return age;
}
public void setAge(String age) {
this.age = age;
}
public String getBg() {
return bg;
}
public void setBg(String bg) {
this.bg = bg;
}
public String getSex() {
return sex;
}
public void setSex(String sex) {
this.sex = sex;
}
}
}
Yes, you can do that. I found this while browsing; please refer to:
How to Parsing CSV or JSON File with Apache Spark
There are two approaches; follow the procedure for approach B.
POJO classes are not required for approach B, but they would make your code easier to read if you are using Java.
Hope this helps.
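The general idea of working without a POJO can be sketched in plain Java: each CSV row becomes a generic map keyed by the header names, so no class has to mirror the columns. This is only an illustration of the concept, not the linked article's code and not Spark or Cassandra connector API; the class name CsvRowsWithoutPojo and the method parse are made up for this sketch, and the naive comma split assumes no quoted fields.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: represents CSV rows as maps instead of POJOs.
public class CsvRowsWithoutPojo {

    // The first line is treated as the header; every following line becomes
    // one map from column name to cell value. Assumes no quoted commas.
    public static List<Map<String, String>> parse(List<String> lines) {
        List<Map<String, String>> rows = new ArrayList<>();
        if (lines.isEmpty()) {
            return rows;
        }
        String[] header = lines.get(0).split(",");
        for (String line : lines.subList(1, lines.size())) {
            String[] values = line.split(",", -1); // -1 keeps trailing empty cells
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < header.length && i < values.length; i++) {
                row.put(header[i].trim(), values[i].trim());
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Name,Age,bg,sex",
                "ammar,67,ab+,M",
                "nehan,88,b+,M");
        for (Map<String, String> row : parse(lines)) {
            System.out.println(row);
        }
    }
}
```

Reading the header once and reusing it for every row also avoids the shared isHeader flag from the question, which is fragile when the mapping function runs on several partitions.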
I created a Java class:
public class ReturnObj {
private String returncode;
private String returndesc;
private Pkg pkg;
public String getReturncode() {
return returncode;
}
public void setReturncode(String returncode) {
this.returncode = returncode;
}
public String getReturndesc() {
return returndesc;
}
public void setReturndesc(String returndesc) {
this.returndesc = returndesc;
}
}
and another class:
public class Pkg {
private String packagecode;
private String cycle;
private String price;
private String desc;
public String getPackagecode() {
return packagecode;
}
public void setPackagecode(String packagecode) {
this.packagecode = packagecode;
}
public String getCycle() {
return cycle;
}
public void setCycle(String cycle) {
this.cycle = cycle;
}
public String getPrice() {
return price;
}
public void setPrice(String price) {
this.price = price;
}
public String getDesc() {
return desc;
}
public void setDesc(String desc) {
this.desc = desc;
}
}
And I want to convert a ReturnObj object to this XML:
<return>
<returncode>1</returncode>
<returndesc>DANG_KY_THANH_CONG</returndesc>
<package>
<packagecode>BD30</packagecode>
<cycle>1</cycle>
<price>15000</price>
<desc> BD30</desc>
</package>
</return>
So how do I serialize the attribute pkg as package in the XML? Java doesn't allow a variable to be named package, because package is a keyword in Java.
You can use JAXB marshalling in your class; it will convert the object to XML. Here is a link to help you: JAXB Marshalling
Try XStream:
XStream xstream = new XStream();
xstream.alias("package", Pkg.class);
String xml = xstream.toXML(myReturnObj);
You can use the JAXB API to convert a Java object to XML. It ships with Java SE up to version 10; from Java 11 onwards it has to be added as a separate dependency.
Below is code that meets your requirement.
import javax.xml.bind.annotation.XmlElement;
import javax.xml.bind.annotation.XmlRootElement;
@XmlRootElement(name = "return")
public class ReturnObj {
private String returncode;
private String returndesc;
private Pkg pkg;
public Pkg getPkg() {
return pkg;
}
@XmlElement(name = "package")
public void setPkg(Pkg pkg) {
this.pkg = pkg;
}
public String getReturncode() {
return returncode;
}
@XmlElement(name = "returncode")
public void setReturncode(String returncode) {
this.returncode = returncode;
}
public String getReturndesc() {
return returndesc;
}
@XmlElement(name = "returndesc")
public void setReturndesc(String returndesc) {
this.returndesc = returndesc;
}
}
@XmlRootElement
public class Pkg {
private String packagecode;
private String cycle;
private String price;
private String desc;
public String getPackagecode() {
return packagecode;
}
@XmlElement(name="packagecode")
public void setPackagecode(String packagecode) {
this.packagecode = packagecode;
}
public String getCycle() {
return cycle;
}
@XmlElement(name="cycle")
public void setCycle(String cycle) {
this.cycle = cycle;
}
public String getPrice() {
return price;
}
@XmlElement(name="price")
public void setPrice(String price) {
this.price = price;
}
public String getDesc() {
return desc;
}
@XmlElement
public void setDesc(String desc) {
this.desc = desc;
}
}
import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.JAXBException;
import javax.xml.bind.Marshaller;
public class JAXBExample {
private static final String FILE_NAME = "C:\\ru\\jaxb-returnObj.xml";
public static void main(String[] args) {
ReturnObj returnObj = new ReturnObj();
returnObj.setReturncode("1");
returnObj.setReturndesc("DANG_KY_THANH_CONG");
Pkg pkg = new Pkg();
pkg.setCycle("1");
pkg.setPrice("15000");
pkg.setDesc("BD30");
returnObj.setPkg(pkg);
jaxbObjectToXML(returnObj);
}
private static void jaxbObjectToXML(ReturnObj emp) {
try {
JAXBContext context = JAXBContext.newInstance(ReturnObj.class);
Marshaller m = context.createMarshaller();
// for pretty-print XML in JAXB
m.setProperty(Marshaller.JAXB_FORMATTED_OUTPUT, Boolean.TRUE);
// Write to System.out, this will print the xml on console
m.marshal(emp, System.out);
// Write to File
m.marshal(emp, new File(FILE_NAME));
} catch (JAXBException e) {
e.printStackTrace();
}
}
}
Explanation:
@XmlRootElement: Required on the class that is marshalled as the top-level object. It defines the root element of the XML content.
@XmlElement: Maps a property to an XML element. To give the XML element a name different from the Java property, pass the name attribute to @XmlElement. Example:
@XmlElement(name = "package")
Run the code above to see the desired output.
Happy Coding.
I'm trying to read the values from a JSON URL, but I don't know how to read the values from a List inside an array. Below are my POJO, Main, and JSON code. Thank you so much for your help.
POJO:
package org.jcexchange.FBApp;
import java.util.List;
import org.jcexchange.FBApp.Details;
public class Users {
private List<Details> Values;
public List<Details> getValues() {
return this.Values;
}
public void setValues(List<Details> Values) {
this.Values = Values;
}
}
public class Details {
private String user_name;
private String user_password;
private int age;
private String user_email;
public String getUserName() {
return this.user_name;
}
public void setUserName(String user_name) {
this.user_name = user_name;
}
public String getUserPassword() {
return this.user_password;
}
public void setUserPassword(String user_password) {
this.user_password = user_password;
}
public int getAge() {
return this.age;
}
public void setAge(int age) {
this.age = age;
}
public String getUserEmail() {
return this.user_email;
}
public void setUserEmail(String user_email) {
this.user_email = user_email;
}
}
Main:
public class Main {
public static void main(String[] args) {
try {
URL jsonURL = new URL("https://jchtest.herokuapp.com/index.php?PW=2");
ObjectMapper mapper = new ObjectMapper();
mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES,
false);
Users[] a1 = mapper.readValue(jsonURL, Users[].class);
}
catch (Exception e) {
throw new RuntimeException(e);
}
}
}
I'm able to pull the JSON from a web service, but I'm stuck on how to retrieve, for instance, the user_name from the first "Values" entry of the JSON array.
JSON:
[
{
"Values": {
"user_name": "jhart",
"user_password": "gooddeval1",
"age": 28,
"user_email": "heheh"
}
},
{
"Values": {
"user_name": "bdole",
"user_password": "Passwordd",
"age": 82,
"user_email": "hahah"
}
}
]
Well, this is a little confusing, maybe because I don't have the full context. Your deserializer call says to expect an array of Users, and each Users object declares a List of "Values", but the JSON shows Values as a single object on each user. Anyway, here is a sample that works on the assumption that Values is a single object. It can be adapted if the property really is a collection.
import org.codehaus.jackson.annotate.JsonProperty;
public class Users {
@JsonProperty("Values")
private Details Values;
public Details getValues() {
return this.Values;
}
public void setValues(Details Values) {
this.Values = Values;
}
}
import org.codehaus.jackson.annotate.JsonProperty;
public class Details {
@JsonProperty("user_name")
private String user_name;
@JsonProperty("user_password")
private String user_password;
@JsonProperty("age")
private int age;
@JsonProperty("user_email")
private String user_email;
public String getUserName() {
return this.user_name;
}
public void setUserName(String user_name) {
this.user_name = user_name;
}
public String getUserPassword() {
return this.user_password;
}
public void setUserPassword(String user_password) {
this.user_password = user_password;
}
public int getAge() {
return this.age;
}
public void setAge(int age) {
this.age = age;
}
public String getUserEmail() {
return this.user_email;
}
public void setUserEmail(String user_email) {
this.user_email = user_email;
}
}
import java.net.URL;
import org.codehaus.jackson.map.ObjectMapper;
public class Main {
public static void main(String[] args) {
try {
URL jsonURL = new URL("https://jchtest.herokuapp.com/index.php?PW=2");
ObjectMapper mapper = new ObjectMapper();
Users[] a1 = mapper.readValue(jsonURL, Users[].class);
System.out.println(a1[0].getValues().getUserName());
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}
This prints "jhart" for me.
Please note: one thing you can try is to populate the object with either the array or the single-valued property and write it out as JSON. That way you can see how what the Jackson deserializer expects differs from what is actually being supplied.
import java.io.File;
import com.db4o.Db4o;
import com.db4o.Db4oEmbedded;
import com.db4o.ObjectContainer;
import com.db4o.ObjectSet;
import com.db4o.query.Query;
public class Student {
private String name;
public Student(String name){
this.name = name;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public static void main(String[] args) {
ObjectContainer bd = Db4oEmbedded.openFile("students.db4o");
try {
Student s1 = new Student("Carl");
bd.store(s1);
showStudents(bd);
} catch (Exception e) {
e.printStackTrace();
} finally {
bd.close();
}
}
public static void showResult(ObjectSet rs){
System.out.println("Retrieved "+rs.size()+" objects");
while(rs.hasNext()){
System.out.println(rs.next());
}
}
public static void showStudents(ObjectContainer bd){
Query query = bd.query();
query.constrain(Student.class);
query.descend("name");
ObjectSet rs = query.execute();
showResult(rs);
}
}
I simply want to store a Student in the db4o database, but when I retrieve all of them the output looks like this:
Student@61070a02
I'm using Eclipse Juno and db4o v.8.0, which I already added as an external jar.
Why am I getting those weird characters instead of "Carl"?
That is not weird; it is the default implementation of the toString() method, which prints the class name followed by the object's hash code. To get meaningful information, override this method in your Student class.
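A minimal sketch of such an override, independent of db4o. The wrapper class StudentToStringDemo and the exact output format are choices made for this illustration, not something prescribed by db4o.

```java
// Standalone illustration of overriding toString(); no db4o required.
public class StudentToStringDemo {

    public static class Student {
        private String name;

        public Student(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }

        // Without this override, println would fall back to Object.toString(),
        // which yields the class name, '@', and the hash code in hex.
        @Override
        public String toString() {
            return "Student[name=" + name + "]";
        }
    }

    public static void main(String[] args) {
        // println calls toString() on the object it is given.
        System.out.println(new Student("Carl"));
    }
}
```

With the override in place, the System.out.println(rs.next()) call in showResult would print the formatted text instead of the hash-code form.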
I have been playing around with XML serialization in Java and am a little stuck. When I run this program I get two exceptions and I am not sure what the cause is:
java.lang.InstantiationException: Ship
Continuing ...
java.lang.Exception: XMLEncoder: discarding statement XMLEncoder.writeObject(Ship);
Continuing ...
I suspect that there is something wrong with the class that I am trying to serialize, because when I use an example from the internet it works fine.
Can someone please point out what mistake I am making?
Main:
public class Main {
private static final String XMLLocation = "xmlTest.xml";
static ObjectSerializationToXML serializer = new ObjectSerializationToXML();
public Main() {
// TODO Auto-generated constructor stub
}
/**
* @param args
* @throws Exception
*/
public static void main(String[] args) throws Exception {
// TODO Auto-generated method stub
Ship ship = new Ship("name", "324");
serializer.serializeObjectToXML(XMLLocation, ship);
}
}
Object Serialization-XML Class:
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.FileInputStream;
import java.io.FileOutputStream;
public class ObjectSerializationToXML {
/**
* This method saves (serializes) any Java bean object into an XML file
*/
public void serializeObjectToXML(String xmlFileLocation,
Object objectToSerialize) throws Exception {
FileOutputStream os = new FileOutputStream(xmlFileLocation);
XMLEncoder encoder = new XMLEncoder(os);
encoder.writeObject(objectToSerialize);
encoder.close();
}
/**
* Reads Java Bean Object From XML File
*/
public Object deserializeXMLToObject(String xmlFileLocation)
throws Exception {
FileInputStream os = new FileInputStream(xmlFileLocation);
XMLDecoder decoder = new XMLDecoder(os);
Object deSerializedObject = decoder.readObject();
decoder.close();
return deSerializedObject;
}
}
Object To Serialize (My object that causes the exception):
public class Ship {
private String name;
private String yearBuilt;
public Ship(String name, String yearBuilt) {
this.name = name;
this.yearBuilt = yearBuilt;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getYearBuild() {
return yearBuilt;
}
public void setYearBuild(String yearBuild) {
this.yearBuilt = yearBuild;
}
@Override
public String toString() {
return "ship [name=" + name + ", yearBuilt=" + yearBuilt + "]";
}
}
Object To Serialize (example from the internet that works):
public class MyBeanToSerialize {
private String firstName;
private String lastName;
private int age;
public int getAge() {
return age;
}
public void setAge(int age) {
this.age = age;
}
public String getFirstName() {
return firstName;
}
public void setFirstName(String firstName) {
this.firstName = firstName;
}
public String getLastName() {
return lastName;
}
public void setLastName(String lastName) {
this.lastName = lastName;
}
}
XMLEncoder serializes JavaBeans, and its default persistence mechanism instantiates the class through a public no-argument constructor. Ship declares only a parameterized constructor, so the compiler does not generate a default one, and XMLEncoder fails with the InstantiationException shown above.
The fix is to add a no-argument constructor to Ship alongside the parameterized one.
The corrected Ship class is below.
public class Ship {
private String name;
private String yearBuilt;
public Ship(String name, String yearBuilt) {
this.name = name;
this.yearBuilt = yearBuilt;
}
//Default constructor must be implemented for XMLEncoder serializing
public Ship() {
super();
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getYearBuild() {
return yearBuilt;
}
public void setYearBuild(String yearBuild) {
this.yearBuilt = yearBuild;
}
@Override
public String toString() {
return "ship [name=" + name + ", yearBuilt=" + yearBuilt + "]";
}
}
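To check the fix without touching the file system, the bean can be encoded to memory and decoded again. The following is a sketch under a few assumptions: the wrapper class ShipXmlDemo and the roundTrip helper are invented for this illustration, and the accessors are named getYearBuilt/setYearBuilt here, which differs slightly from the class above.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

// Hypothetical demo: encodes a bean to XML in memory and reads it back.
public class ShipXmlDemo {

    // The bean is public and has a public no-argument constructor,
    // which is what XMLEncoder's default persistence relies on.
    public static class Ship {
        private String name;
        private String yearBuilt;

        public Ship() {
        }

        public Ship(String name, String yearBuilt) {
            this.name = name;
            this.yearBuilt = yearBuilt;
        }

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
        public String getYearBuilt() { return yearBuilt; }
        public void setYearBuilt(String yearBuilt) { this.yearBuilt = yearBuilt; }
    }

    // Serializes the ship to XML and deserializes the result into a new object.
    public static Ship roundTrip(Ship ship) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (XMLEncoder encoder = new XMLEncoder(out)) {
            encoder.writeObject(ship);
        }
        try (XMLDecoder decoder =
                     new XMLDecoder(new ByteArrayInputStream(out.toByteArray()))) {
            return (Ship) decoder.readObject();
        }
    }

    public static void main(String[] args) {
        Ship copy = roundTrip(new Ship("name", "324"));
        System.out.println(copy.getName() + " " + copy.getYearBuilt());
    }
}
```

Only properties with a matching getter and setter pair are written, so a field without both accessors would come back as null after the round trip.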
I get this error:
com.google.gwt.user.client.rpc.SerializationException: Type 'ru.xxx.empeditor.client.Dept$$EnhancerByCGLIB$$2f6af516' was not included in the set of types which can be serialized by this SerializationPolicy or its Class object could not be loaded. For security purposes, this type will not be serialized.: instance = ru.xxx.empeditor.client.Dept@e53d4e
Why is this class not serializable?
package ru.xxx.empeditor.client;
import java.util.HashSet;
import java.util.Set;
import com.google.gwt.user.client.rpc.IsSerializable;
/**
* Dept generated by hbm2java
*/
public class Dept implements IsSerializable {
private byte deptno;
private String dname;
private String loc;
private Set<Emp> emps = new HashSet<Emp>(0);
public Dept() {
}
public Dept(byte deptno) {
this.deptno = deptno;
}
public Dept(byte deptno, String dname, String loc, Set<Emp> emps) {
this.deptno = deptno;
this.dname = dname;
this.loc = loc;
this.emps = emps;
}
public byte getDeptno() {
return this.deptno;
}
public void setDeptno(byte deptno) {
this.deptno = deptno;
}
public String getDname() {
return this.dname;
}
public void setDname(String dname) {
this.dname = dname;
}
public String getLoc() {
return this.loc;
}
public void setLoc(String loc) {
this.loc = loc;
}
public Set<Emp> getEmps() {
return this.emps;
}
public void setEmps(Set<Emp> emps) {
this.emps = emps;
}
}
Check whether the class Emp is serializable.
Another potential issue, since you are using Hibernate (I noticed the auto-generated comment), is the proxies that modify your bean's bytecode. The type named in the error, Dept$$EnhancerByCGLIB$$2f6af516, is such a proxy subclass rather than Dept itself, which is why GWT fails to serialize it. This is covered here: http://code.google.com/webtoolkit/articles/using_gwt_with_hibernate.html
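One way to sidestep the proxy problem is to copy the persistent object into a plain instance of the declared class before returning it over RPC. The sketch below only illustrates that idea with simplified, hypothetical types: PlainCopyDemo, toPlain, and ProxyLikeDept are invented names, ProxyLikeDept merely stands in for a CGLIB-enhanced subclass, and the Emp collection is left out (it may need the same treatment).

```java
// Hypothetical illustration: copy a proxy-like subclass into a plain Dept.
public class PlainCopyDemo {

    // Simplified version of the Dept bean from the question.
    public static class Dept {
        private byte deptno;
        private String dname;
        private String loc;

        public Dept() {
        }

        public Dept(byte deptno, String dname, String loc) {
            this.deptno = deptno;
            this.dname = dname;
            this.loc = loc;
        }

        public byte getDeptno() { return deptno; }
        public String getDname() { return dname; }
        public String getLoc() { return loc; }
    }

    // Stand-in for a runtime-generated proxy subclass; not real Hibernate code.
    public static class ProxyLikeDept extends Dept {
        public ProxyLikeDept(byte deptno, String dname, String loc) {
            super(deptno, dname, loc);
        }
    }

    // Builds a new object whose runtime class is exactly Dept,
    // whatever subclass the source instance happens to be.
    public static Dept toPlain(Dept source) {
        return new Dept(source.getDeptno(), source.getDname(), source.getLoc());
    }

    public static void main(String[] args) {
        Dept proxied = new ProxyLikeDept((byte) 10, "SALES", "MOSCOW");
        Dept plain = toPlain(proxied);
        System.out.println(proxied.getClass().getSimpleName()
                + " -> " + plain.getClass().getSimpleName());
    }
}
```

The copy is deliberately done through the getters, so it reads the actual values even when the source object loads them lazily.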