I have two separate Neo4j databases. How can I pass nodes from one database to the other?
For example:
1. Machine1 - GraphDB1 - (Nodes: Students)
2. Machine2 - GraphDB2 - (Nodes: Books)
How can I pass the Book nodes to GraphDB1?
Any help would be appreciated.
You wouldn't do that, you would create all your data in one database.
In general, you can query one database with Cypher and then create/insert the data in the second database.
On the first database, return a node and relationship-list:
start n=node(*)
match n-[r]->()
return n,r
Use a programming language to create a CSV file or a set of Cypher CREATE statements from those results. For importing CSV see: http://neo4j.org/develop/import especially the "spreadsheet method" and/or the CSV batch-importer.
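For illustration, a minimal Java sketch of that second step, turning exported rows into Cypher CREATE statements, could look like this (the nodes.csv file name and the name/age column layout are assumptions for the example, not part of the original answer):

import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class CsvToCypher {
    public static void main(String[] args) throws IOException {
        // Each line of nodes.csv is assumed to hold "name,age" exported from the first database.
        List<String> lines = Files.readAllLines(Paths.get("nodes.csv"));
        try (PrintWriter out = new PrintWriter(new FileWriter("create_nodes.cql"))) {
            for (String line : lines) {
                String[] cols = line.split(",");
                // Emit one CREATE statement per exported node.
                out.printf("CREATE ({name:\"%s\", age: %s});%n", cols[0].trim(), cols[1].trim());
            }
        }
    }
}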
Enable auto-indexing in your second server: http://docs.neo4j.org/chunked/milestone/auto-indexing.html
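For reference, enabling node auto-indexing for the name property used below typically means settings along these lines in the second server's conf/neo4j.properties (values are illustrative; see the linked docs for details):

node_auto_indexing=true
node_keys_indexable=name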
Cypher Create statements for nodes and relationships look like this:
CREATE ({name:"Foo", age: 12});
CREATE ({name:"Bar", age: 18});
START n=node:node_auto_index(name="Foo"),
m=node:node_auto_index(name="Bar")
CREATE n-[:KNOWS {since:2012}]->m;
You can also check out my Neo4j-Import tools for the Neo4j-Shell: https://github.com/jexp/neo4j-shell-tools
I am using the JdbcIO.write() function of Apache Beam to write streaming data into Cloud SQL. As per my requirement, I have to write the same data into two different tables.
Currently I am creating two different JdbcIO connections to write data into the Cloud SQL tables.
Is there any way to write two insert queries in a single JdbcIO.write() function?
outputStringPcollection
    .apply("Write to CloudSQL table",
        JdbcIO.<String>write()
            .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration
                .create(DRIVER_CLASS_NAME, URL)
                .withUsername(USERNAME)
                .withPassword(PASSWORD))
            .withStatement(insertQueryTable1)
            .withPreparedStatementSetter(new SetQueryParameter())
            .withStatement(insertQueryTable2)
            .withPreparedStatementSetter(new SetQueryParameter()));
I tried to execute the above code, passing two different insert queries to a single JdbcIO write, but data is inserted into only one table (i.e. Table2).
So, can we execute multiple queries in a single connection? If yes, is there another way to do this?
Thanks in advance.
No, you can't do it in a single connection. The best you can do is:
JdbcIO.Write<String> configuredWrite = JdbcIO.<String>write().withDataSourceConfiguration(...);
outputStringPcollection.apply(configuredWrite.withStatement(s1)...);
outputStringPcollection.apply(configuredWrite.withStatement(s2)...);
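Filled in a bit more, that could look like the following sketch; the query strings and the SetQueryParameter class are taken from the question and assumed to exist in your code:

JdbcIO.Write<String> configuredWrite = JdbcIO.<String>write()
        .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration
                .create(DRIVER_CLASS_NAME, URL)
                .withUsername(USERNAME)
                .withPassword(PASSWORD));

// Apply the same PCollection twice, once per target table.
outputStringPcollection.apply("Write to table 1",
        configuredWrite.withStatement(insertQueryTable1)
                .withPreparedStatementSetter(new SetQueryParameter()));

outputStringPcollection.apply("Write to table 2",
        configuredWrite.withStatement(insertQueryTable2)
                .withPreparedStatementSetter(new SetQueryParameter()));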
I'm using Neo4j 3.1 with Java 8, and I want to extract a connected subgraph and store it as a test database.
Is it possible to do it and how?
Can it be done just with a RETURN clause that returns the output? Do I have to create new nodes and relationships, or can I simply export the subgraph and load it into a new database?
Also, how can I extract a connected subgraph given that my overall graph is disconnected?
Thank you
There are two parts to this: getting the connected subgraph, and then finding a means to export it.
APOC Procedures seems like it can cover both of these. The approach in this answer using the path expander should get you all the nodes in the connected subgraph (if the relationship type doesn't matter, leave off the relationshipFilter parameter).
The next step is to get all relationships between all of those nodes. APOC's apoc.algo.cover() function in the graph algorithms section should accomplish this.
Something like this (assuming this is after the subgraph query, and subgraphNode is in scope for the column of distinct subgraph nodes):
...
WITH COLLECT(subgraphNode) as subgraph, COLLECT(id(subgraphNode)) as ids
CALL apoc.algo.cover(ids) YIELD rel
WITH subgraph, COLLECT(rel) as rels
...
Now that you have the collections of both the nodes and relationships in the subgraph, you can export them.
APOC Procedures offers several means of exporting, from CSV to CypherScript. You should be able to find an option that works for you.
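Since you're on Java 8, a rough sketch of tying the pieces together from the Java driver might look like the following. The starting-node lookup, the apoc.path.subgraphNodes expansion, and the apoc.export.cypher.data export call are assumptions about your data model and APOC version, so treat this as a starting point rather than a drop-in solution:

import org.neo4j.driver.v1.AuthTokens;
import org.neo4j.driver.v1.Driver;
import org.neo4j.driver.v1.GraphDatabase;
import org.neo4j.driver.v1.Session;

public class SubgraphExport {
    public static void main(String[] args) {
        // Connection details are placeholders.
        try (Driver driver = GraphDatabase.driver("bolt://localhost:7687",
                AuthTokens.basic("neo4j", "secret"));
             Session session = driver.session()) {
            String query =
                // Find a starting node (label/property are examples), expand to its connected
                // subgraph, collect the covering relationships, then export a Cypher script.
                "MATCH (start:Student {id: 'some-id'}) " +
                "CALL apoc.path.subgraphNodes(start, {}) YIELD node AS subgraphNode " +
                "WITH COLLECT(subgraphNode) AS subgraph, COLLECT(id(subgraphNode)) AS ids " +
                "CALL apoc.algo.cover(ids) YIELD rel " +
                "WITH subgraph, COLLECT(rel) AS rels " +
                "CALL apoc.export.cypher.data(subgraph, rels, 'subgraph.cypher', {}) " +
                "YIELD file RETURN file";
            String file = session.run(query).single().get("file").asString();
            System.out.println("Exported subgraph to " + file);
        }
    }
}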
You can also use the neo4j-shell to extract the result of a query to a file, and use that same file to re-import it into another Neo4j database:
ikwattro@graphaware-team ~/d/_/310> ./bin/neo4j-shell -c 'dump MATCH (n:Product)-[r*2]->(x) RETURN n, r, x;' > result.cypher
Check the file:
ikwattro@graphaware-team ~/d/_/310> cat result.cypher
begin
commit
begin
create (_1:`Product` {`id`:"product123"})
create (_2:`ProductInformation` {`id`:"product123EXCEL"})
create (_3:`ProductInformationElement` {`id`:"product123EXCELtitle", `key`:"title", `value`:"Original Title"})
create (_5:`ProductInformationElement` {`id`:"product123EXCELproduct_type", `key`:"product_type", `value`:"casual_bag"})
create (_1)-[:`PRODUCT_INFORMATION`]->(_2)
create (_2)-[:`INFORMATION_ELEMENT`]->(_3)
create (_2)-[:`INFORMATION_ELEMENT`]->(_5)
;
commit
Use this file to feed another Neo4j instance:
ikwattro@graphaware-team ~/d/_/310> ./bin/neo4j-shell -file result.cypher
Transaction started
Transaction committed
Transaction started
+-------------------+
| No data returned. |
+-------------------+
Nodes created: 4
Relationships created: 3
Properties set: 8
Labels added: 4
52 ms
Transaction committed
I am using Neo4j procedures to create relationships on bulk data.
Initially, I insert all of that data using LOAD CSV:
USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM "file:///XXXX.csv" AS row
....
The data size is large [10M rows], but the load executes successfully.
My problem is that I want to create many-to-many relationships between all of these nodes,
but I get an [OutOfMemoryException] while executing this query:
MATCH(n1:x{REMARKS :"LATEST"}) MATCH(n2:x{REMARKS :"LATEST"}) WHERE n1.DIST_ID=n2.ENROLLER_ID CREATE (n1)-[:ENROLLER]->(n2) ;
I have already created indexes and constraints as well.
Any ideas? Please help.
The problem is that your query is performed in one transaction, which leads to the [OutOfMemoryException]. At the moment, periodic commits are only available for LOAD CSV. So you can, for example, re-read the CSV after the first load:
USING PERIODIC COMMIT 500
LOAD CSV WITH HEADERS FROM "file:///XXXX.csv" AS row
MATCH (n1:x{REMARKS :"LATEST", DIST_ID: row.DIST_ID})
WITH n1
MATCH(n2:x{REMARKS :"LATEST"}) WHERE n1.DIST_ID=n2.ENROLLER_ID
CREATE (n1)-[:ENROLLER]->(n2) ;
Or try the trick with periodic committing from the APOC library:
call apoc.periodic.commit("
MATCH (n2:x {REMARKS:'Latest'}) WHERE exists(n2.ENROLLER_ID)
WITH n2 LIMIT {perCommit}
OPTIONAL MATCH (n1:x {REMARKS:'Latest'}) WHERE n1.DIST_ID = n2.ENROLLER_ID
WITH n2, collect(n1) as n1s
FOREACH(n1 in n1s|
CREATE (n1)-[:ENROLLER]->(n2)
)
REMOVE n2.ENROLLER_ID
RETURN count(n2)",
{perCommit: 1000}
)
P.S. The ENROLLER_ID property is used as a flag to select nodes for processing. Of course, you can use another flag that is set during processing.
Or, for a more precise approach, use apoc.periodic.iterate:
CALL apoc.periodic.iterate("
MATCH (n1:x {REMARKS:'Latest'})
MATCH (n2:x {REMARKS:'Latest'}) WHERE n1.DIST_ID = n2.ENROLLER_ID
RETURN n1,n2
","
WITH {n1} as n1, {n2} as n2
MERGE (n1)-[:ENROLLER]->(n2)
", {batchSize:10000, parallel:true}
)
I am using MapReduce and HFileOutputFormat to produce HFiles and bulk load them directly into the HBase table.
Now, while reading the input files, I want to produce HFiles for two tables and bulk load the outputs within a single MapReduce job.
I searched the web and saw some links about MultiHFileOutputFormat, but couldn't find a real solution.
Do you think that it is possible?
My way is:
Use HFileOutputFormat as before; when the job is completed, doBulkLoad and write into table1.
Keep a List<Put> puts in the mapper, and a MAX_PUTS value as a global constant.
When puts.size() > MAX_PUTS, do:
String tableName = conf.get("hbase.table.name.dic", table2);
HTable table = new HTable(conf, tableName);
table.setAutoFlushTo(false);
table.setWriteBufferSize(1024*1024*64);
table.put(puts);
table.close();
puts.clear();
Note: you must have a cleanup() method that writes the remaining puts; see the sketch below.
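Putting those pieces together, a hedged sketch of such a mapper might look like this (the class name, the buildTable1Put()/buildTable2Put() helpers, and the key/value types are illustrative; the HTable-based flush mirrors the snippet above):

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class DualTableMapper extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

    private static final int MAX_PUTS = 10000;
    private final List<Put> puts = new ArrayList<Put>();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Puts for table1 go to the normal HFileOutputFormat / doBulkLoad path.
        Put table1Put = buildTable1Put(value);
        context.write(new ImmutableBytesWritable(table1Put.getRow()), table1Put);

        // Puts for table2 are buffered and written directly when the buffer is full.
        puts.add(buildTable2Put(value));
        if (puts.size() > MAX_PUTS) {
            flushToTable2(context.getConfiguration());
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException {
        // Write whatever puts are left over at the end of the task.
        if (!puts.isEmpty()) {
            flushToTable2(context.getConfiguration());
        }
    }

    private void flushToTable2(Configuration conf) throws IOException {
        String tableName = conf.get("hbase.table.name.dic", "table2");
        HTable table = new HTable(conf, tableName);
        table.setAutoFlushTo(false);
        table.setWriteBufferSize(1024 * 1024 * 64);
        table.put(puts);
        table.flushCommits();
        table.close();
        puts.clear();
    }

    // Hypothetical helpers: derive row key and columns for each table from the input line.
    private Put buildTable1Put(Text value) {
        return new Put(Bytes.toBytes(value.toString()));
    }

    private Put buildTable2Put(Text value) {
        return new Put(Bytes.toBytes(value.toString()));
    }
}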
I am working with a 3rd-party product called jPOS, and it has an XMLPackager whereby I get a string from this packager that contains a record in an XML format such as:
<MACHINE><B000>STRING_VALUE</B000><B002>STRING_VALUE</B002><B003>STRING_VALUE</B003><B004>STRING_VALUE</B004><B007>STRING_VALUE</B007><B011>STRING_VALUE</B011><B012>STRING_VALUE</B012><B013>STRING_VALUE</B013><B015>STRING_VALUE</B015><B018>STRING_VALUE</B018><B028>STRING_VALUE</B028><B032>STRING_VALUE</B032><B035>STRING_VALUE</B035><B037>STRING_VALUE</B037><B039>STRING_VALUE</B039><B041>STRING_VALUE</B041><B043>STRING_VALUE</B043><B048>STRING_VALUE</B048><B049>STRING_VALUE</B049><B058>STRING_VALUE</B058><B061>STRING_VALUE</B061><B063>STRING_VALUE</B063><B127>STRING_VALUE</B127></MACHINE>
I have a SQL Server table that contains a column for each of the listed elements. Not that it matters, but I could potentially have B000 through B127 defined with specific STRING_VALUEs. I'm not sure what the best way to go about this in Java is. My understanding is that SQL Server can take an XML string (not a document) and do an insert. Is it best to parse each value and put them into a list from which each column is populated? This is the first time I've worked with XML like this, so I'm trying to get some help/direction.
Thanks.
Sorry, one of my colleagues was able to help and provide a quick answer. I'll try it from my Java code and it looks like it should work great. Thanks anyway.
Here is the SP that she created whereby I can pass in my XML string and bit value:
CREATE PROCEDURE [dbo].[sbssp_InsertArchivedMessages]
(
    @doc varchar(max),
    @fromTo bit
)
AS
BEGIN
    DECLARE @idoc int, @lastId int
    EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
    INSERT INTO [dbo].[tblArchivedMessages]
    SELECT * FROM OPENXML(@idoc, '/MACHINE', 2) WITH [dbo].[tblArchivedMessages]
    SET @lastId = (SELECT IDENT_CURRENT('tblArchivedMessages'))
    UPDATE [dbo].[tblArchivedMessages]
    SET FromToMach = @fromTo
    WHERE ID = @lastId
END
GO
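From the Java side, the stored procedure can then be called via plain JDBC, roughly as in the sketch below (the connection URL and the xmlString variable holding the XMLPackager output are assumptions):

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class ArchivedMessageWriter {
    public static void insert(String xmlString, boolean fromTo) throws SQLException {
        // Placeholder connection details for the SQL Server JDBC driver.
        String url = "jdbc:sqlserver://localhost;databaseName=MyDb;user=me;password=secret";
        try (Connection conn = DriverManager.getConnection(url);
             CallableStatement cs = conn.prepareCall("{call dbo.sbssp_InsertArchivedMessages(?, ?)}")) {
            cs.setString(1, xmlString);   // the <MACHINE>...</MACHINE> string from the XMLPackager
            cs.setBoolean(2, fromTo);     // maps to the @fromTo bit parameter
            cs.execute();
        }
    }
}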
Regards.