This is my first Stack Overflow question, so I hope everything is alright: I create a GEXF file in Scala, convert it to an SVG graph (with the Gephi toolkit), and then convert the result to PNG. The GEXF creation is done in Scala, the SVG and PNG conversion in Java.
The script runs flawlessly when I first create the GEXF file and afterwards run the SVG and PNG conversion. I can do this with the two if blocks below: first I set if (1) to true and if (2) to false; then I set if (1) to false and if (2) to true.
When I run the code with both ifs set to true, I get a NullPointerException, followed by a TranscoderException saying the SVG file does not exist. Note: I have uploaded the code to GitHub for a better overview and to make the error reproducible: https://github.com/commutativity/svg-problem
val conf: SparkConf = new SparkConf()
  .setAppName("SparkTest")
  .setMaster("local[*]")
  .set("spark.executor.memory", "16g")
val sparkContext = new SparkContext(conf)
val sparkSession: SparkSession = SparkSession.builder.config(sparkContext.getConf).getOrCreate()
val sqlContext: SQLContext = sparkSession.sqlContext
if (true) { // if (1)
  val v = sqlContext.createDataFrame(List(
    ("a", "Alice", 34),
    ("b", "Bob", 36),
  )).toDF("id", "name", "age")
  val e = sqlContext.createDataFrame(List(
    ("a", "b", "friend"),
  )).toDF("src", "dst", "relationship")
  val g = GraphFrame(v, e)
  val subgraphX = g.toGraphX
  val pw = new PrintWriter("src\\main\\resources\\gexf\\" + "friends" + ".gexf")
  val gexfString = toGexf(subgraphX)
  pw.write(gexfString)
  pw.close()
}
if (true) { // if (2)
  // import java classes; statements in the class body run on construction
  class ScalaDriver extends JavaDriver {
    runGEXFtoSVG("friends", "friends")
    runSVGtoPNG("src\\main\\resources\\svg\\friends.svg",
      "src\\main\\resources\\png\\friends.png")
  }
  // run imported java classes
  new ScalaDriver
}
def toGexf[VD, ED](g: Graph[VD, ED]): String = {
  val header =
    """<?xml version="1.0" encoding="UTF-8"?>
      |<gexf xmlns="https://www.gexf.net/1.2draft" version="1.2">
      |<meta>
      |<description>A gephi graph in GEXF format</description>
      |</meta>
      |<graph mode="static" defaultedgetype="directed">
      |<attributes class="node">
      |<attribute id="1" title="redirect" type="string"/>
      |<attribute id="2" title="namespace" type="string"/>
      |<attribute id="3" title="category" type="string"/>
      |</attributes>
      """.stripMargin

  val vertices = "<nodes>\n" + g.vertices.map(
    v => s"""<node id=\"${v._1}\" label=\"${v._2.asInstanceOf[Row].getAs("id")}\">\n
    </node>"""
  ).collect.mkString + "</nodes>\n"

  val edges = "<edges>\n" + g.edges.map(
    e => s"""<edge source=\"${e.srcId}\" target=\"${e.dstId}\"
    label=\"${e.attr}\"/>\n"""
  ).collect.mkString + "</edges>\n"

  val footer = "</graph>\n</gexf>"
  header + vertices + edges + footer
}
The error message looks as follows:
java.lang.NullPointerException
at java.util.Objects.requireNonNull(Objects.java:203)
at graph.GEXFtoSVG.script(GEXFtoSVG.java:46)
at graph.JavaDriver.runGEXFtoSVG(JavaDriver.java:7)
at utils.ReproduceError$ScalaDriver$1.<init>(ReproduceError.scala:38)
at utils.ReproduceError$.delayedEndpoint$utils$ReproduceError$1(ReproduceError.scala:45)
at utils.ReproduceError$delayedInit$body.apply(ReproduceError.scala:11)
at scala.Function0.apply$mcV$sp(Function0.scala:39)
at scala.Function0.apply$mcV$sp$(Function0.scala:39)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
at scala.App.$anonfun$main$1$adapted(App.scala:80)
at scala.collection.immutable.List.foreach(List.scala:431)
at scala.App.main(App.scala:80)
at scala.App.main$(App.scala:78)
at utils.ReproduceError$.main(ReproduceError.scala:11)
at utils.ReproduceError.main(ReproduceError.scala)
Exception in thread "main" org.apache.batik.transcoder.TranscoderException: null
Enclosed Exception:
File file:/D:/scala_wiki/sbt-test1/src/main/resources/svg/friends.svg does not exist
at org.apache.batik.transcoder.XMLAbstractTranscoder.transcode(XMLAbstractTranscoder.java:136)
at org.apache.batik.transcoder.SVGAbstractTranscoder.transcode(SVGAbstractTranscoder.java:156)
at graph.SVGtoPNG.createImage(SVGtoPNG.java:35)
at graph.SVGtoPNG.<init>(SVGtoPNG.java:19)
at graph.JavaDriver.runSVGtoPNG(JavaDriver.java:11)
at utils.ReproduceError$ScalaDriver$1.<init>(ReproduceError.scala:41)
at utils.ReproduceError$.delayedEndpoint$utils$ReproduceError$1(ReproduceError.scala:45)
at utils.ReproduceError$delayedInit$body.apply(ReproduceError.scala:11)
at scala.Function0.apply$mcV$sp(Function0.scala:39)
at scala.Function0.apply$mcV$sp$(Function0.scala:39)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:17)
at scala.App.$anonfun$main$1$adapted(App.scala:80)
at scala.collection.immutable.List.foreach(List.scala:431)
at scala.App.main(App.scala:80)
at scala.App.main$(App.scala:78)
at utils.ReproduceError$.main(ReproduceError.scala:11)
at utils.ReproduceError.main(ReproduceError.scala)
The JavaDriver class contains:
public void runGEXFtoSVG(String gexfName, String svgName) throws Exception {
    GEXFtoSVG graph = new GEXFtoSVG();
    graph.script(gexfName, svgName);
}

public void runSVGtoPNG(String svgName, String pngName) throws Exception {
    new SVGtoPNG(svgName, pngName);
}
The GEXFtoSVG class contains:
public class GEXFtoSVG {
    public void script(String gexfName, String svgName) throws Exception {
        ProjectController pc = Lookup.getDefault().lookup(ProjectController.class);
        pc.newProject();
        Workspace workspace = pc.getCurrentWorkspace();

        // Get models and controllers for this new workspace - will be useful later
        GraphModel graphModel = Lookup.getDefault().lookup(GraphController.class).getModel();
        PreviewModel model = Lookup.getDefault().lookup(PreviewController.class).getModel();
        ImportController importController = Lookup.getDefault().lookup(ImportController.class);

        Container container;
        try {
            File file = new File(Objects.requireNonNull(getClass().getResource(String.format("/gexf/%s.gexf", gexfName))).toURI());
            container = importController.importFile(file);
            container.getLoader().setEdgeDefault(EdgeDefault.DIRECTED);
        } catch (Exception ex) {
            ex.printStackTrace();
            return;
        }

        // Import the container into the workspace; the workspace contains the graph model and preview model
        importController.process(container, new DefaultProcessor(), workspace);

        // Layout for 1 second
        AutoLayout autoLayout = new AutoLayout(1, TimeUnit.SECONDS);
        autoLayout.setGraphModel(graphModel);
        YifanHuLayout firstLayout = new YifanHuLayout(null, new StepDisplacement(1f));
        ForceAtlasLayout secondLayout = new ForceAtlasLayout(null);
        AutoLayout.DynamicProperty adjustBySizeProperty = AutoLayout
                .createDynamicProperty("forceAtlas.adjustSizes.name", Boolean.TRUE, 0.1f); // true after 10% of layout time
        AutoLayout.DynamicProperty repulsionProperty = AutoLayout
                .createDynamicProperty("forceAtlas.repulsionStrength.name", 500., 0f); // 500 for the complete period
        autoLayout.addLayout(firstLayout, 0.5f);
        autoLayout.addLayout(secondLayout, 0.5f, new AutoLayout
                .DynamicProperty[]{adjustBySizeProperty, repulsionProperty});
        autoLayout.execute();

        // Preview
        model.getProperties().putValue(PreviewProperty.SHOW_NODE_LABELS, Boolean.TRUE);
        model.getProperties().putValue(PreviewProperty.EDGE_COLOR, new EdgeColor(Color.BLACK));
        model.getProperties().putValue(PreviewProperty.EDGE_THICKNESS, 0.1f);
        model.getProperties().putValue(PreviewProperty.SHOW_EDGES, Boolean.TRUE);
        model.getProperties().putValue(PreviewProperty.DIRECTED, true);
        model.getProperties().putValue(PreviewProperty.ARROW_SIZE, 20.0f);
        model.getProperties().putValue(PreviewProperty.EDGE_CURVED, false);
        model.getProperties().putValue(PreviewProperty.NODE_LABEL_FONT, new Font("Arial", Font.PLAIN, 2));

        // Export
        ExportController ec = Lookup.getDefault().lookup(ExportController.class);
        try {
            ec.exportFile(new File(String.format("src\\main\\resources\\svg\\%s.svg", svgName)));
        } catch (IOException ex) {
            ex.printStackTrace();
            return;
        }
        container.closeLoader();
    }
}
And the SVGtoPNG class contains:
public class SVGtoPNG {
    String svgDirAndName;
    String pngDirAndName;

    SVGtoPNG(String svgDirAndName, String pngDirAndName) throws Exception {
        this.svgDirAndName = svgDirAndName;
        this.pngDirAndName = pngDirAndName;
        createImage();
    }

    public void createImage() throws Exception {
        String svg_URI_input = new File(svgDirAndName).toURI().toString();
        TranscoderInput input_svg_image = new TranscoderInput(svg_URI_input);
        // define OutputStream to PNG image and attach to TranscoderOutput
        OutputStream png_ostream = Files.newOutputStream(Paths.get(pngDirAndName));
        TranscoderOutput output_png_image = new TranscoderOutput(png_ostream);
        // create PNGTranscoder and define hints if required
        PNGTranscoder my_converter = new PNGTranscoder();
        // convert and write output
        System.out.println("It will print");
        my_converter.transcode(input_svg_image, output_png_image);
        System.out.println("It will not print");
        png_ostream.flush();
        png_ostream.close();
    }
}
I have tried inserting a wait between the two if blocks; however, this did not solve the error.
The full example is available on GitHub: https://github.com/commutativity/svg-problem
Please make sure to include the libraries "batik-codec-1.16" and "gephi-toolkit-0.8.7" as JARs; I have included links to download the JARs.
Any help, suggestions, and recommendations are highly appreciated.
During the build process, the contents of your source resource folder are copied into your target directory along with the compiled .class files (and then eventually into the generated .jar - if you're generating a .jar). When you run the compiled program, getResource will load the resources from the target directory (or from the .jar if you're running a .jar), not the source directory.
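For example, in a typical sbt layout (an illustration only; the exact target path depends on the build setup and Scala version):

src/main/resources/gexf/friends.gexf          (source, written at runtime)
target/scala-2.13/classes/gexf/friends.gexf   (what getResource("/gexf/friends.gexf") actually resolves against)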
So adding files to the source resources directory will not have any effect on getResource (not to mention that, outside of development, programs are usually run in an environment where the source directory isn't even available). You could, technically, make it work by writing to the target directory instead of the source one, but then it would still only work when running the program directly from the target directory - not from a .jar.
Really, you should use getResource only to access files that exist at build time. In order to read files generated at runtime, you should use normal file paths and I/O classes and methods, not getResource.
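To illustrate (a sketch of mine, not code from the linked repository; it assumes the program's working directory is the project root): GEXFtoSVG could open the freshly generated file by its file-system path instead of looking it up via getResource:

// Hypothetical replacement for the getResource lookup in GEXFtoSVG.script:
// read the GEXF that was just written at runtime directly from disk.
File file = new File(String.format("src/main/resources/gexf/%s.gexf", gexfName));
if (!file.exists()) {
    throw new FileNotFoundException("GEXF not found: " + file.getAbsolutePath());
}
container = importController.importFile(file);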
I am loading a model which I saved from the WEKA Explorer into my Java code, as seen below. I am now trying to give it an instance in the form of an .arff file so that I can get a prediction; however, it gives an output of NaN 0.0 every time.
The prediction should be in the form of levels (e.g. Level 1).
The attached screenshot shows the output I get.
I also attached another screenshot of a dummy .arff file I am giving the model.
try {
    // load the Naive Bayes model saved from the WEKA Explorer
    NaiveBayes nb = (NaiveBayes) weka.core.SerializationHelper.read("Models/NaiveBayesModel.model");
    DataSource source1 = new DataSource(final_filePath);
    Instances testDataSet = source1.getDataSet();
    // the class attribute is the last attribute in the .arff file
    testDataSet.setClassIndex(testDataSet.numAttributes() - 1);
    double actualValue = testDataSet.instance(0).classValue();
    Instance newInst = testDataSet.instance(0);
    double prediction = nb.classifyInstance(newInst);
    System.out.println(actualValue + " " + prediction);
} catch (Exception e) {
    e.printStackTrace();
}
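As a side note (my addition, not from the original post): classifyInstance returns the zero-based index of the predicted class, so a label such as "Level 1" has to be recovered through the class attribute, roughly like this:

// map the numeric prediction back to the nominal class label
String predictedLabel = testDataSet.classAttribute().value((int) prediction);
System.out.println("Predicted: " + predictedLabel);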
[Screenshots were attached here: one of the program output, one of the dummy input .arff file.]
How can I fix this, please?
Thanks,
Andre
I would like to save an RDD to text files grouped by key. Currently I can't figure out how to split the output into multiple files; it seems that all the output for keys that share the same partition gets written to the same file. I would like to have a different file for each key. Here is my code snippet:
JavaPairRDD<String, Iterable<Customer>> groupedResults = customerCityPairRDD.groupByKey();
groupedResults.flatMap(x -> x._2().iterator())
.saveAsTextFile(outputPath + "/cityCounts");
This can be achieved by using foreachPartition to save each partition into a separate file.
You can develop your code as follows:
groupedResults.foreachPartition(new VoidFunction<Iterator<Tuple2<String, Iterable<Customer>>>>() {
    @Override
    public void call(Iterator<Tuple2<String, Iterable<Customer>>> rec) throws Exception {
        FSDataOutputStream fsOutputStream = null;
        BufferedWriter writer = null;
        try {
            fsOutputStream = FileSystem.get(new Configuration()).create(new Path("path1"));
            // FSDataOutputStream must be wrapped in a Writer before buffering
            writer = new BufferedWriter(new OutputStreamWriter(fsOutputStream));
            while (rec.hasNext()) {
                Tuple2<String, Iterable<Customer>> entry = rec.next();
                for (Customer cust : entry._2()) {
                    writer.write(cust.toString());
                    writer.newLine();
                }
            }
        } catch (Exception exp) {
            exp.printStackTrace();
            // handle exception
        } finally {
            if (writer != null) {
                writer.close(); // also closes the underlying stream
            }
        }
    }
});
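One caveat (my note, not part of the original answer): a single partition can hold several keys, so for truly per-key files the target path would have to be derived from the key inside the loop, for example:

// hypothetical per-key output path derived from the key of each entry
Path out = new Path(outputPath + "/cityCounts/" + entry._1() + ".txt");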
Hope this helps.
Ravi
So I figured out how to solve this: convert the RDD to a DataFrame and then just partition by key during the write.
Dataset<Row> dataFrame = spark.createDataFrame(customerRDD, Customer.class);
dataFrame.write()
.partitionBy("city")
.text("cityCounts"); // write as text file at file path cityCounts
I am trying to export a database into XML files using DBUnit. I am facing a problem while generating a separate XML file for each table; I have not been able to do this.
Can someone help me with this?
Following is the code:
QueryDataSet partialDataSet = new QueryDataSet(connection);
addTables(partialDataSet);
// XML file into which data needs to be extracted
FlatXmlDataSet.write(partialDataSet, new FileOutputStream("C:/Users/name/Desktop/test-dataset_temp.xml"));
System.out.println("Data set written");

static private void addTables(QueryDataSet dataSet) {
    if (tableList == null) return;
    for (Iterator k = tableList.iterator(); k.hasNext(); ) {
        String table = (String) k.next();
        try {
            dataSet.addTable(table);
        } catch (AmbiguousTableNameException e) {
            e.printStackTrace();
        }
    }
}
Now my problem is: how do I separate the tables so that I can generate a separate XML file for each table?
Thanks in advance.
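One possible direction (a sketch of mine, reusing only the DBUnit classes already shown above; the output directory is hypothetical): instead of adding every table to one QueryDataSet, create a fresh data set per table and write each one to its own file:

for (Iterator k = tableList.iterator(); k.hasNext(); ) {
    String table = (String) k.next();
    try {
        // one QueryDataSet and one output file per table, named after the table
        QueryDataSet singleTableSet = new QueryDataSet(connection);
        singleTableSet.addTable(table);
        FlatXmlDataSet.write(singleTableSet,
                new FileOutputStream("C:/Users/name/Desktop/" + table + ".xml"));
    } catch (Exception e) {
        e.printStackTrace();
    }
}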
I am trying to implement linear regression over a CSV file. Here is the content of the CSV file:
X1;X2;X3;X4;X5;X6;X7;X8;Y1;Y2;
0.98;514.50;294.00;110.25;7.00;2;0.00;0;15.55;21.33;
0.98;514.50;294.00;110.25;7.00;3;0.00;0;15.55;21.33;
0.98;514.50;294.00;110.25;7.00;4;0.00;0;15.55;21.33;
0.98;514.50;294.00;110.25;7.00;5;0.00;0;15.55;21.33;
0.90;563.50;318.50;122.50;7.00;2;0.00;0;20.84;28.28;
0.90;563.50;318.50;122.50;7.00;3;0.00;0;21.46;25.38;
0.90;563.50;318.50;122.50;7.00;4;0.00;0;20.71;25.16;
0.90;563.50;318.50;122.50;7.00;5;0.00;0;19.68;29.60;
0.86;588.00;294.00;147.00;7.00;2;0.00;0;19.50;27.30;
0.86;588.00;294.00;147.00;7.00;3;0.00;0;19.95;21.97;
0.86;588.00;294.00;147.00;7.00;4;0.00;0;19.34;23.49;
0.86;588.00;294.00;147.00;7.00;5;0.00;0;18.31;27.87;
0.82;612.50;318.50;147.00;7.00;2;0.00;0;17.05;23.77;
...
0.71;710.50;269.50;220.50;3.50;2;0.40;5;12.43;15.59;
0.71;710.50;269.50;220.50;3.50;3;0.40;5;12.63;14.58;
0.71;710.50;269.50;220.50;3.50;4;0.40;5;12.76;15.33;
0.71;710.50;269.50;220.50;3.50;5;0.40;5;12.42;15.31;
0.69;735.00;294.00;220.50;3.50;2;0.40;5;14.12;16.63;
0.69;735.00;294.00;220.50;3.50;3;0.40;5;14.28;15.87;
0.69;735.00;294.00;220.50;3.50;4;0.40;5;14.37;16.54;
0.69;735.00;294.00;220.50;3.50;5;0.40;5;14.21;16.74;
0.66;759.50;318.50;220.50;3.50;2;0.40;5;14.96;17.64;
0.66;759.50;318.50;220.50;3.50;3;0.40;5;14.92;17.79;
0.66;759.50;318.50;220.50;3.50;4;0.40;5;14.92;17.55;
0.66;759.50;318.50;220.50;3.50;5;0.40;5;15.16;18.06;
0.64;784.00;343.00;220.50;3.50;2;0.40;5;17.69;20.82;
0.64;784.00;343.00;220.50;3.50;3;0.40;5;18.19;20.21;
0.64;784.00;343.00;220.50;3.50;4;0.40;5;18.16;20.71;
0.64;784.00;343.00;220.50;3.50;5;0.40;5;17.88;21.40;
0.62;808.50;367.50;220.50;3.50;2;0.40;5;16.54;16.88;
0.62;808.50;367.50;220.50;3.50;3;0.40;5;16.44;17.11;
0.62;808.50;367.50;220.50;3.50;4;0.40;5;16.48;16.61;
0.62;808.50;367.50;220.50;3.50;5;0.40;5;16.64;16.03;
I read this CSV file and implement linear regression. Here is the source code in Java:
public static void main(String[] args) throws IOException {
    String csvFile = null;
    CSVLoader loader = null;
    Remove remove = null;
    Instances data = null;
    LinearRegression model = null;
    int numberofFeatures = 0;
    try {
        csvFile = "C:\\Users\\Taha\\Desktop/ENB2012_data.csv";
        loader = new CSVLoader();
        // load CSV
        loader.setSource(new File(csvFile));
        data = loader.getDataSet();
        //System.out.println(data);
        numberofFeatures = data.numAttributes();
        System.out.println("number of features: " + numberofFeatures);
        data.setClassIndex(data.numAttributes() - 2);
        // remove last attribute Y2
        remove = new Remove();
        remove.setOptions(new String[]{"-R", data.numAttributes() + ""});
        remove.setInputFormat(data);
        data = Filter.useFilter(data, remove);
        // data.setClassIndex(data.numAttributes() - 2);
        model = new LinearRegression();
        model.buildClassifier(data);
        System.out.println(model);
    } catch (Exception e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
I am getting an error, weka.core.UnassignedClassException: Class index is negative (not set)!, at the line model.buildClassifier(data). The number of features is 1; however, it is expected to be 10 (X1;X2;X3;X4;X5;X6;X7;X8;Y1;Y2). What am I missing?
Thanks in advance.
You can add, after the line data = loader.getDataSet(), the following lines, which will resolve your exception:
if (data.classIndex() == -1) {
    System.out.println("reset index...");
    data.setClassIndex(data.numAttributes() - 1);
}
This worked for me.
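Worth checking as well (an assumption based on the symptom, not verified against this data): the file is semicolon-separated, and if CSVLoader parses it with its default comma separator, the whole header line becomes a single attribute, which would explain why only one feature is found. Setting the separator explicitly should look roughly like this:

CSVLoader loader = new CSVLoader();
loader.setFieldSeparator(";");   // assumes the ENB2012 file uses ';' as delimiter
loader.setSource(new File(csvFile));
Instances data = loader.getDataSet();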
Since I could not find any solution to this problem, I decided to put the data into an Oracle database and read it from there. There is an import utility in Oracle SQL Developer, which I used. That solved my problem. I am writing this for people who have the same problem.
Here is detailed information about connecting WEKA to an Oracle database:
http://tahasozgen.blogspot.com.tr/2016/10/connection-to-oracle-database-in-weka.html