How to load a CSV file with Cypher in Java?

I am new to Cypher. I want to load a CSV file using Cypher in Java. I googled and found the following snippet:
LOAD CSV WITH HEADERS FROM "http://neo4j.com/docs/2.3.1/csv/import/movies.csv" AS csvLine
MERGE (country:Country { name: csvLine.country })
.....
How can I use this LOAD CSV query in Java code? I tried something like this:
import java.io.File;
import java.io.IOException;
import java.util.Map;
import java.util.Map.Entry;
import javax.naming.spi.DirStateFactory.Result;
import org.neo4j.cypher.javacompat.ExecutionEngine;
import org.neo4j.cypher.javacompat.ExecutionResult;
import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;
import org.neo4j.kernel.impl.util.FileUtils;
public class test_new {
private static final String DB_PATH = "C:...../default.graphdb";
public static void main( final String[] args ) throws IOException
{
GraphDatabaseService db = new GraphDatabaseFactory().newEmbeddedDatabase( DB_PATH );
Transaction tx1 = db.beginTx();
try{
ExecutionEngine engine = new ExecutionEngine(db);
ExecutionResult result = engine.execute("LOAD CSV WITH HEADERS FROM "C:/..../Mock_data.csv" AS csvLine ");
tx1.success();
} finally {
tx1.close();
}
db.shutdown();
}
}
But I am not sure about this line.
ExecutionResult result = engine.execute("LOAD CSV WITH HEADERS FROM "C:/..../Mock_data.csv" AS csvLine ");
It throws a syntax error:
Exception in thread "main" java.lang.Error: Unresolved compilation problems:
Syntax error, insert ")" to complete MethodInvocation
Syntax error, insert ";" to complete LocalVariableDeclarationStatement
I don't know the correct syntax myself. How do I pass the CSV path?

To correct the Java syntax error, you need to escape the double quotes inside the string; otherwise the string literal ends at the quote before the path:
"LOAD CSV WITH HEADERS FROM \"C:/..../Mock_data.csv\" AS csvLine "
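As a plain-Java illustration of the two quoting options (the file path below is a hypothetical placeholder, not the asker's real path):

```java
public class CypherQuoting {
    // Option 1: escape the inner double quotes with backslashes
    static final String ESCAPED =
        "LOAD CSV WITH HEADERS FROM \"file:///C:/data/Mock_data.csv\" AS csvLine";
    // Option 2: Cypher also accepts single quotes, which need no escaping in Java
    static final String SINGLE_QUOTED =
        "LOAD CSV WITH HEADERS FROM 'file:///C:/data/Mock_data.csv' AS csvLine";

    public static void main(String[] args) {
        // Both print with the quotes intact around the file URL
        System.out.println(ESCAPED);
        System.out.println(SINGLE_QUOTED);
    }
}
```

Single quotes are often the simpler choice here, since they are legal in Cypher and need no escaping inside a Java string literal.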

Finally, this worked for me:
ExecutionResult result = engine.execute("LOAD CSV WITH HEADERS FROM 'file:///Users/xxxxx/Documents/Txx.csv' AS csvLine ");

Related

QueryParser short cannot be dereferenced

I have some indexed docs and I want to search them using a query. I checked the Lucene documentation and wrote this code, but somehow I'm getting "short cannot be dereferenced" on the QueryParser line. I'm new to Java and to Lucene, and I'm using Lucene 5.3.1.
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import static java.time.Clock.system;
import javax.management.Query;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.FuzzyQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import static sun.rmi.transport.TransportConstants.Version;
import static sun.rmi.transport.TransportConstants.Version;
public class Searcher
{
public static void main(String args[]) throws IOException{
String query="computer science";
Analyzer analyzer = new KeywordAnalyzer();
Query q = new QueryParser(Version.LUCENE_CURRENT, "W", analyzer).parse(query); //ERROR IS HERE
Path indexPath = Paths.get("MonIndex");
Directory directory = FSDirectory.open(indexPath);
DirectoryReader reader = DirectoryReader.open(directory);
IndexSearcher iSearcher = new IndexSearcher(reader);
TopDocs topdocs = iSearcher.search(q2, 100);
ScoreDoc[] resultsList = topdocs.scoreDocs;
for(int i = 0; i<resultsList.length; i++){
Document book = iSearcher.doc(resultsList[i].doc);
System.out.println(book.getField("I").stringValue());
}
}
}
The problem is Version.LUCENE_CURRENT. You are not importing Lucene's Version; instead you have sun.rmi.transport.TransportConstants.Version which, while I'm not familiar with that library, certainly appears to be a short. Attempting to dereference it, by referencing the nonexistent sun.rmi.transport.TransportConstants.Version.LUCENE_CURRENT, is what causes that error.
However, in the version of Lucene you say you are using, the QueryParser constructor no longer even accepts a Version argument, so just remove it:
Query q = new QueryParser("W", analyzer).parse(query);
Your next error: the Query that the QueryParser returns is not a javax.management.Query but an org.apache.lucene.search.Query, so fix that import as well.

How to extract rtf tables

I have an RTF file with lots of tables in it. I have been trying to use Java (POI and Tika) to extract the tables. This is easy enough in a .doc, where the tables are defined as such. However, in an RTF file there doesn't seem to be any 'this is a table' tag in the metadata. Does anyone know the best strategy for extracting a table from such a file? Would converting it to another file format help? Any clues for me to look up?
There is a Linux tool called unrtf; see its manual.
With the app you can transform your rtf file into html:
unrtf --html your_input_file.rtf > your_output_file.html
Now you can use any HTML/XML processing API to extract the tables easily. Is that what you need?
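Once unrtf has produced HTML, the cells can be pulled out even without an HTML library. A minimal plain-Java sketch using a regular expression (adequate for simple, well-formed output without nested tables; a real HTML parser is more robust):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TableExtractor {
    // Collects the text of every <td>...</td> cell, in document order;
    // assumes simple, well-formed HTML with no nested tables
    public static List<String> extractCells(String html) {
        List<String> cells = new ArrayList<>();
        Matcher m = Pattern.compile("<td[^>]*>(.*?)</td>",
                Pattern.DOTALL | Pattern.CASE_INSENSITIVE).matcher(html);
        while (m.find()) {
            cells.add(m.group(1).trim());
        }
        return cells;
    }

    public static void main(String[] args) {
        String html = "<table><tr><td>a</td><td>b</td></tr></table>";
        System.out.println(extractCells(html));
    }
}
```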
Thanks hexin for your answer. In the end I was able to use Tika's TXTParser, putting all the segments between bold tags (which is how my tables are separated) into an ArrayList. From there I had to use the tab separators to identify the tables.
Here is the code without the bit to extract the tables based on tabs (still working on it):
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.apache.tika.exception.TikaException;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.txt.TXTParser;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.SAXException;
public class TextParser {
public static void main(final String[] args) throws IOException,TikaException{
//detecting the file type
BodyContentHandler handler = new BodyContentHandler(-1);
Metadata metadata = new Metadata();
FileInputStream inputstream = new FileInputStream(new File("/Users/mydoc.rtf"));
ParseContext pcontext = new ParseContext();
//Text document parser
TXTParser txtParser = new TXTParser();
try {
txtParser.parse(inputstream, handler, metadata, pcontext);
} catch (SAXException e) {
e.printStackTrace();
}
String s=handler.toString();
Pattern pattern = Pattern.compile("(\\\\b\\\\f1\\\\fs24.+?\\\\par .+?)\\\\b\\\\f1\\\\fs24.*?\\{\\\\",Pattern.DOTALL);
Matcher matcher = pattern.matcher(s);
ArrayList<String> arr= new ArrayList<String>();
while (matcher.find()) {
arr.add(matcher.group(1));
}
for(String name : arr){
System.out.println("The array number is: "+arr.indexOf(name)+" \n\n "+name);
}
}
}

Lucene 5.5.0 StopFilter Error

I am trying to use StopFilter in Lucene 5.5.0. I tried the following:
package lucenedemo;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.Iterator;
import org.apache.lucene.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.analysis.standard.*;
import org.apache.lucene.analysis.core.StopFilter;
import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.standard.StandardTokenizer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.util.CharArraySet;
import org.apache.lucene.util.AttributeFactory;
import org.apache.lucene.util.Version;
public class lucenedemo {
public static void main(String[] args) throws Exception {
System.out.println(removeStopWords("hello how are you? I am fine. This is a great day!"));
}
public static String removeStopWords(String strInput) throws Exception {
AttributeFactory factory = AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY;
StandardTokenizer tokenizer = new StandardTokenizer(factory);
tokenizer.setReader(new StringReader(strInput));
tokenizer.reset();
CharArraySet stopWords = EnglishAnalyzer.getDefaultStopSet();
TokenStream streamStop = new StopFilter(tokenizer, stopWords);
StringBuilder sb = new StringBuilder();
CharTermAttribute charTermAttribute = tokenizer.addAttribute(CharTermAttribute.class);
streamStop.reset();
while (streamStop.incrementToken()) {
String term = charTermAttribute.toString();
sb.append(term + " ");
}
streamStop.end();
streamStop.close();
tokenizer.close();
return sb.toString();
}
}
But it gives me the following error:
Exception in thread "main" java.lang.IllegalStateException: TokenStream contract violation: reset()/close() call missing, reset() called multiple times, or subclass does not call super.reset(). Please see Javadocs of TokenStream class for more information about the correct consuming workflow.
at org.apache.lucene.analysis.Tokenizer$1.read(Tokenizer.java:109)
at org.apache.lucene.analysis.standard.StandardTokenizerImpl.zzRefill(StandardTokenizerImpl.java:527)
at org.apache.lucene.analysis.standard.StandardTokenizerImpl.getNextToken(StandardTokenizerImpl.java:738)
at org.apache.lucene.analysis.standard.StandardTokenizer.incrementToken(StandardTokenizer.java:159)
at org.apache.lucene.analysis.util.FilteringTokenFilter.incrementToken(FilteringTokenFilter.java:51)
at lucenedemo.lucenedemo.removeStopWords(lucenedemo.java:42)
at lucenedemo.lucenedemo.main(lucenedemo.java:27)
What exactly am I doing wrong here? I have closed both the Tokenizer and the TokenStream. Is there something else I am missing?
Calling reset on a filter will, in turn, reset the underlying stream. Since you reset the tokenizer manually, then create a StopFilter with the tokenizer as its underlying stream and reset that as well, the Tokenizer ends up being reset twice.
So just remove this line:
tokenizer.reset();
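Lucene aside, the basic stop-word idea the filter chain implements can be sketched in plain Java, which may help clarify what the pipeline is doing (the stop list below is a small illustrative subset, not Lucene's actual default set):

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

public class StopWordDemo {
    // Illustrative subset of English stop words (NOT Lucene's default set)
    private static final Set<String> STOP_WORDS = Set.of(
            "a", "am", "are", "how", "i", "is", "this", "you");

    public static String removeStopWords(String input) {
        return Arrays.stream(input.split("\\s+"))
                // match on the lower-cased token with trailing punctuation stripped,
                // but keep the original token text in the output
                .filter(tok -> !STOP_WORDS.contains(
                        tok.toLowerCase(Locale.ROOT).replaceAll("\\p{Punct}+$", "")))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(removeStopWords(
                "hello how are you? I am fine. This is a great day!"));
    }
}
```

Lucene's StopFilter does the same conceptual job, but inside a TokenStream pipeline with proper tokenization, which is why the reset/close contract matters there.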

load csv file to oracle database

Hi, I want to load a CSV file into an Oracle database using Java, but I am getting the error "ORA-00900: invalid SQL statement". I am using Oracle Database 11g Enterprise Edition, so I don't understand why it doesn't accept my LOAD statement. Any help? Thanks in advance.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
public class test {
public static void main(String[] args){
test t=new test();
t.inserintoDb("C:\\Users\\fiels\\2.csv");
}
public void inserintoDb(String path){
Connection conn=null;
Statement stmt=null;
try{
Class.forName("oracle.jdbc.driver.OracleDriver");
conn=(Connection) DriverManager.getConnection(
"jdbc:oracle:thin:#address:orcl","user","password");
stmt=conn.createStatement();
String select1="truncate table table1";
stmt.execute(select1);
String select2="LOAD DATA INFILE'" +path+"' INTO TABLE table1 FIELDS TERMINATED BY ',' (customer_nbr, nbr)";
stmt.execute(select2);
}catch(Exception e){
e.printStackTrace();
}
}
}
Does LOAD DATA INFILE even work on Oracle? It seems to be MySQL-only. The SQL*Loader alternative is really fast; check the official documentation to see how to configure it.
As the question states that you want to use Java, here is how to call SQL*Loader from Java. It basically uses a Runtime, but the exact command depends on the operating system:
String[] stringCommand = { "bash", "-c", "/usr/bin/sqlldr username/password@sid control=/path/to/sample.ctl"};
Runtime rt = Runtime.getRuntime();
Process proc = null;
try {
proc = rt.exec(stringCommand);
}catch (Exception e) {
// TODO something
}finally {
proc.destroy();
}
But if you just want to load a table for your personal use, you won't need Java; you can call it from a .bat file.
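A slightly safer variant of the Runtime approach is ProcessBuilder, which passes each argument separately and avoids shell quoting issues. A sketch, where the credentials and control-file path are placeholders:

```java
import java.io.IOException;
import java.util.List;

public class SqlLoaderRunner {
    // Builds the sqlldr invocation as a list of arguments;
    // credentials and control file path are placeholders
    public static ProcessBuilder buildSqlldr(String userAtSid, String controlFile) {
        return new ProcessBuilder(List.of(
                "sqlldr", userAtSid, "control=" + controlFile))
                .redirectErrorStream(true); // merge stdout and stderr
    }

    public static void main(String[] args) {
        ProcessBuilder pb = buildSqlldr("user/password@orcl", "/path/to/sample.ctl");
        System.out.println(pb.command());
        // Uncomment to actually run it (requires sqlldr on the PATH):
        // Process p = pb.start(); // throws IOException
        // System.out.println("exit code: " + p.waitFor());
    }
}
```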

Execution engine not printing results

I have a simple Query method that runs cypher queries as noted below. If I run the EXACT same query in the web console (yes, same db instance, correct path), I get a non-empty iterator in the console. Shouldn't I 1) not get that message and 2) get the results I see in my database?
This class has other methods that add data to the database and that functionality works well. This query method is not working...
Class:
import org.neo4j.cypher.javacompat.ExecutionEngine;
import org.neo4j.cypher.javacompat.ExecutionResult;
import org.neo4j.graphdb.Direction;
import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;
import org.neo4j.graphdb.RelationshipType;
import org.neo4j.graphdb.Transaction;
import org.neo4j.graphdb.factory.GraphDatabaseFactory;
import org.neo4j.helpers.collection.IteratorUtil;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Map.Entry;
import java.sql.*;
public class NeoProcessor {
//private GraphDatabaseService handle;
private static final String DB_PATH = "/usr/local/Cellar/neo4j/2.0.1/libexec/data/new_graph.db";
static GraphDatabaseService graphDb = new GraphDatabaseFactory().newEmbeddedDatabase( DB_PATH );
public NeoProcessor()
{
}
public void myQuery(String cypherText)
{
//System.out.println("executing the above query");
cypherText = "MATCH (n:Phone{id:'you'}) MATCH n-[r:calling]->m WHERE n<>m RETURN n, r, m";
ExecutionEngine engine = new ExecutionEngine( this.graphDb );
ExecutionResult result;
try ( Transaction ignored = graphDb.beginTx() )
{
result = engine.execute( cypherText + ";");
System.out.println(result);
ignored.success();
}
}
}
(The question included a screenshot showing the query returning results from the DB.)
result = engine.execute(cypherText + ";");
System.out.println(result.dumpToString());
Specified by:
http://api.neo4j.org/2.0.3/org/neo4j/cypher/javacompat/ExecutionResult.html#dumpToString()
To consume the result you need to use the iterator. If you just want a string representation, use ExecutionResult.dumpToString(). Be aware this method exhausts the iterator.
You should be calling:
System.out.println(result.dumpToString());
Which will prettify it for you. Of course, there is always the possibility that your match returns no results. You should also close the transaction in a finally block, although that won't matter much here.
EDIT: Taking a second look at this, your Cypher query is wrongly formed. It should be:
MATCH (n:Phone) - [r:calling] -> (m)
WHERE n.id = 'you'
RETURN n, r, m
