Talend ETL Job Error in tOracleOutput Component

I am a newbie to Talend ETL and am using Talend Open Studio for Big Data version 5.4.1. I have developed a simple Talend ETL job that picks up data from a CSV file and inserts it into my local Oracle database. Below is how my package looks:
The job throws an ArrayIndexOutOfBoundsException after the last record of the CSV file, but I'm uncertain why it should do that in the first place. I checked out the solution given at this link: http://www.talendforge.org/forum/viewtopic.php?id=21644
But it doesn't seem to work at all. I have the latest driver for the Oracle component, and increasing or decreasing the commit size does not seem to have any effect.
Can someone please help me out with this? Please let me know in case more information is needed.
P.S.: The complete error log is below:
Starting job Kaggle_Data_Load_Training at 09:31 25/06/2014.
[statistics] connecting to socket on port 3957
[statistics] connected
Exception in component tOracleOutput_1
java.lang.ArrayIndexOutOfBoundsException: -32203
at oracle.jdbc.driver.OraclePreparedStatement.setupBindBuffers(OraclePreparedStatement.java:2677)
at oracle.jdbc.driver.OraclePreparedStatement.executeBatch(OraclePreparedStatement.java:9270)
at oracle.jdbc.driver.OracleStatementWrapper.executeBatch(OracleStatementWrapper.java:210)
at test.kaggle_data_load_training_0_1.Kaggle_Data_Load_Training.tFileInputDelimited_1Process(Kaggle_Data_Load_Training.java:4360)
at test.kaggle_data_load_training_0_1.Kaggle_Data_Load_Training.runJobInTOS(Kaggle_Data_Load_Training.java:4717)
at test.kaggle_data_load_training_0_1.Kaggle_Data_Load_Training.main(Kaggle_Data_Load_Training.java:4582)
[statistics] disconnected
Job Kaggle_Data_Load_Training ended at 09:31 25/06/2014. [exit code=1]

Can you try decreasing the commit size on the tOracleOutput component? I remember there was some kind of bug in TOS 5.4.1 that resulted in this error, so lower the commit size (say, to 500) and see if the problem still exists. Here's more information about the bug: http://www.talendforge.org/forum/viewtopic.php?id=5931
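For context, here is roughly what a batched insert with a commit size boils down to in plain JDBC. This is only an illustrative sketch, not the code Talend generates; the table, columns, row type, and connection details below are all made up for the example:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.util.List;

public class BatchInsertSketch {

    // Hypothetical row type standing in for the CSV schema.
    record Row(int id, String name) {}

    static void insert(List<Row> rows) throws Exception {
        // Placeholder connection details, not from the original post.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@localhost:1521:XE", "user", "password")) {
            conn.setAutoCommit(false);
            int commitEvery = 500; // the commit size suggested above
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO TRAINING_DATA (ID, NAME) VALUES (?, ?)")) {
                int pending = 0;
                for (Row r : rows) {
                    ps.setInt(1, r.id());
                    ps.setString(2, r.name());
                    ps.addBatch();
                    if (++pending == commitEvery) {
                        // executeBatch() is where the ArrayIndexOutOfBoundsException
                        // surfaces in the stack trace above; a smaller batch keeps
                        // the driver's bind buffers smaller.
                        ps.executeBatch();
                        conn.commit();
                        pending = 0;
                    }
                }
                if (pending > 0) { // flush the final partial batch
                    ps.executeBatch();
                    conn.commit();
                }
            }
        }
    }
}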

I had the same issue in Talend 6.2.1.
It can be resolved by updating the DB version in the connection's metadata.
The same fix is confirmed on the Talend blog.

Related

Corda Bootstrapper failing after node-info generation

I am currently working on R3 Corda 4.4 and generating my network using the Network Bootstrapper tool. I have a 3-node setup running on 3 different VMs, with a load balancer attached to each of them, and I am using PostgreSQL as the database for the nodes. All the steps up to node-info creation run successfully; during node-info creation, the 2nd party's node-info is generated after the last party's node-info, causing an error.
picocli.CommandLine$ExecutionException: Error while calling command (net.corda.bootstrapper.NetworkBootstrapperRunner#4e28bdd1): java.lang.IllegalThreadStateException: process hasn't exited
The error occurs as soon as the node-info for the 3rd party is generated while the 2nd party's is still in progress. When I open the logs for each of the parties there are no errors; all node-infos are generated and I can see them in their respective folders.
Detailed Error:
picocli.CommandLine$ExecutionException: Error while calling command (net.corda.bootstrapper.NetworkBootstrapperRunner#4e28bdd1): java.lang.IllegalThreadStateException: process hasn't exited
at picocli.CommandLine.execute(CommandLine.java:1180)
at picocli.CommandLine.access$800(CommandLine.java:141)
at picocli.CommandLine$RunLast.handle(CommandLine.java:1367)
at picocli.CommandLine$RunLast.handle(CommandLine.java:1335)
at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:1243)
at picocli.CommandLine.parseWithHandlers(CommandLine.java:1526)
at net.corda.cliutils.CordaCliWrapperKt.start(CordaCliWrapper.kt:73)
at net.corda.bootstrapper.MainKt.main(Main.kt:19)
Caused by: java.lang.IllegalThreadStateException: process hasn't exited
at java.lang.UNIXProcess.exitValue(UNIXProcess.java:421)
at net.corda.nodeapi.internal.network.NetworkBootstrapper$Companion$generateNodeInfo$1.invoke(NetworkBootstrapper.kt:116)
at net.corda.nodeapi.internal.network.NetworkBootstrapper$Companion$generateNodeInfo$1.invoke(NetworkBootstrapper.kt:69)
at net.corda.nodeapi.internal.network.NetworkBootstrapper$Companion.printNodeInfoGenLogToConsole(NetworkBootstrapper.kt:128)
at net.corda.nodeapi.internal.network.NetworkBootstrapper$Companion.generateNodeInfo(NetworkBootstrapper.kt:116)
at net.corda.nodeapi.internal.network.NetworkBootstrapper$Companion.access$generateNodeInfo(NetworkBootstrapper.kt:69)
at net.corda.nodeapi.internal.network.NetworkBootstrapper$Companion$generateNodeInfos$1$1.invoke(NetworkBootstrapper.kt:95)
at net.corda.nodeapi.internal.network.NetworkBootstrapper$Companion$generateNodeInfos$1$1.invoke(NetworkBootstrapper.kt:69)
at net.corda.core.internal.concurrent.ValueOrException$DefaultImpls.capture(CordaFutureImpl.kt:141)
at net.corda.core.internal.concurrent.OpenFuture$DefaultImpls.capture(CordaFutureImpl.kt)
at net.corda.core.internal.concurrent.CordaFutureImpl.capture(CordaFutureImpl.kt:153)
at net.corda.core.internal.concurrent.CordaFutureImplKt$fork$$inlined$also$lambda$1.run(CordaFutureImpl.kt:22)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
This error message unfortunately doesn't give us a lot to work with, but it would seem that there's a problem with a particular node config option.
Is this issue happening when you run the deployNodes task (i.e. when compiling your CorDapps), or when you use the runnodes script?
Can you share more info on what the node configs look like? That's probably the most likely culprit, as it's not clear what you've changed since the last working version.
Here's my recommendation:
See if you're able to isolate the issue and run the network without the problematic node, then share the config of the one that was giving the issue.
Here's a page on the configuration options for nodes: https://docs.corda.net/docs/corda-os/4.6/corda-configuration-fields.html
If you're able to find a specific node config option that's causing the breakage, definitely update here or on Slack so we can get that bug reported if it turns out to be that.
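As an aside on the exception itself: java.lang.IllegalThreadStateException with the message "process hasn't exited" is exactly what java.lang.Process.exitValue() throws whenever the child process is still running, which matches the bootstrapper checking a node-info generation process that hasn't finished yet. A minimal, self-contained sketch of that behaviour ("sleep 5" is just a stand-in for any long-running child process):

import java.util.concurrent.TimeUnit;

public class ExitValueSketch {
    public static void main(String[] args) throws Exception {
        // An illustrative long-running child process.
        Process p = new ProcessBuilder("sleep", "5").start();

        try {
            // Throws IllegalThreadStateException ("process hasn't exited")
            // when the child is still running; this is the same exception
            // seen in the bootstrapper stack trace above.
            int code = p.exitValue();
            System.out.println("already exited with " + code);
        } catch (IllegalThreadStateException e) {
            System.out.println("still running: " + e.getMessage());
        }

        // A blocking (or timed) wait is the safe way to read the exit code.
        if (p.waitFor(10, TimeUnit.SECONDS)) {
            System.out.println("exited with " + p.exitValue());
        }
    }
}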

Spring Mongo's find operation freezes on Windows Server 2008 machine

I am currently using Spring's Mongo persistence layer to query MongoDB. The collection I query contains about 4 GB of data. When I run the find code in my IDE it retrieves the data; however, when I run the same code on my server, it freezes for about 15 to 20 minutes and eventually throws the error below. My concern is that it runs without a hitch in my IDE on my 4 GB RAM Windows PC and fails on my 14 GB RAM server. I have looked through the Mongo log, and there's nothing there that points to the problem. I also assumed the problem might be environmental, since it works in my local Spring IDE, but the libraries on my local PC are the same as the ones on my server. Has anyone had this kind of issue, or can anyone point me to what I'm doing wrong? Also, weirdly, the find operation works when I revert to Mongo's Java driver find methods.
I'm using mongo-java-driver - 2.12.1
spring-data-mongodb - 1.7.0.RELEASE
See below sample find operation code and error message.
List<HTObject> empObjects = mongoOperations.find(new Query(Criteria.where("date").gte(dateS).lte(dateE)), HTObject.class);
The exception I get is:
09:42:01.436 [main] DEBUG o.s.data.mongodb.core.MongoDbUtils - Getting Mongo Database name=[Hansard]
Exception in thread "main" org.springframework.dao.DataAccessResourceFailureException: Cursor 185020098546 not found on server 172.30.128.155:27017; nested exception is com.mongodb.MongoException$CursorNotFound: Cursor 185020098546 not found on server 172.30.128.155:27017
at org.springframework.data.mongodb.core.MongoExceptionTranslator.translateExceptionIfPossible(MongoExceptionTranslator.java:73)
at org.springframework.data.mongodb.core.MongoTemplate.potentiallyConvertRuntimeException(MongoTemplate.java:2002)
at org.springframework.data.mongodb.core.MongoTemplate.executeFindMultiInternal(MongoTemplate.java:1885)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1696)
at org.springframework.data.mongodb.core.MongoTemplate.doFind(MongoTemplate.java:1679)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:598)
at org.springframework.data.mongodb.core.MongoTemplate.find(MongoTemplate.java:589)
at com.sa.dbObject.TestDb.main(TestDb.java:74)
Caused by: com.mongodb.MongoException$CursorNotFound: Cursor 185020098546 not found on server 172.30.128.155:27017
at com.mongodb.QueryResultIterator.throwOnQueryFailure(QueryResultIterator.java:218)
at com.mongodb.QueryResultIterator.init(QueryResultIterator.java:198)
at com.mongodb.QueryResultIterator.initFromQueryResponse(QueryResultIterator.java:176)
at com.mongodb.QueryResultIterator.getMore(QueryResultIterator.java:141)
at com.mongodb.QueryResultIterator.hasNext(QueryResultIterator.java:127)
at com.mongodb.DBCursor._hasNext(DBCursor.java:551)
at com.mongodb.DBCursor.hasNext(DBCursor.java:571)
at org.springframework.data.mongodb.core.MongoTemplate.executeFindMultiInternal(MongoTemplate.java:1871)
... 5 more
In short
The MongoDB result cursor is not available anymore on the server.
Explanation
This can happen when using sharding, if a connection to a mongos fails over, or if you run into timeouts (see http://docs.mongodb.org/manual/core/cursors/#closure-of-inactive-cursors).
You're performing a query that loads all objects into one list (mongoOperations.find). Depending on the result size, this may take a long time. Using an iterator can help, but even loading huge amounts of data through an iterator hits limits at some point.
You should partition the results if you have to query very large amounts of data, either with paging (which gets slower the more records you skip) or by querying in splits of your range (you already have a date range, so this could work; see the sketch below).
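Here is a rough sketch of the range-splitting idea, reusing the HTObject mapping and the date field from the question. The one-week chunk size is an arbitrary assumption you would tune to your data volume:

import java.util.ArrayList;
import java.util.Date;
import java.util.List;

import org.springframework.data.mongodb.core.MongoOperations;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;

public class RangeSplitSketch {

    // Arbitrary chunk size; tune it so each sub-query finishes well
    // before the server's cursor timeout.
    static final long CHUNK_MILLIS = 7L * 24 * 60 * 60 * 1000; // one week

    // HTObject is the mapped document class from the question.
    static List<HTObject> findInChunks(MongoOperations mongoOperations,
                                       Date dateS, Date dateE) {
        List<HTObject> all = new ArrayList<>();
        Date from = dateS;
        while (from.before(dateE)) {
            Date to = new Date(Math.min(from.getTime() + CHUNK_MILLIS,
                                        dateE.getTime()));
            // Keep the original query's inclusive upper bound on the
            // final chunk; intermediate chunks use an exclusive bound
            // so no document is read twice.
            Criteria c = to.equals(dateE)
                    ? Criteria.where("date").gte(from).lte(to)
                    : Criteria.where("date").gte(from).lt(to);
            all.addAll(mongoOperations.find(new Query(c), HTObject.class));
            from = to;
        }
        return all;
    }
}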

Apache Nutch is not crawling any more

I have a two-machine cluster. Nutch is configured on one machine, and HBase and Hadoop on the second; Hadoop is in fully distributed mode and HBase in pseudo-distributed mode. I have crawled about 280 GB of data, but now when I start crawling, it prints the following message and does not crawl anything more into the previous table:
INFO mapreduce.GoraRecordReader - gora.buffer.read.limit = 10000
INFO crawl.FetchScheduleFactory - Using FetchSchedule impl: org.apache.nutch.crawl.DefaultFetchSchedule
and the following error:
ERROR store.HBaseStore - [Ljava.lang.StackTraceElement;#7ae0c96b
Documents are fetched, but they are not saved in HBase. If I crawl data into a new table, however, it works well and crawls properly without any error. I don't think this is a connection problem, since it works for a new table; I think it is because of some property or the like.
Can anyone guide me, as I am not an expert in Apache Nutch?
Not quite my field, but looks like thread exhaustion on the underlying machines.
I was also facing a similar problem. The actual problem was with the RegionServer (an HBase daemon): it shuts down when HBase is used with default settings and holds too much data, so try restarting it. For more information, see the RegionServer's log files.

JavaDB connection issues; database not found

I am having a problem with Java DB that I just don't know how to resolve. I am creating a database and connecting to it using Java DB's native JDBC driver. If I physically relocate that database and try to connect to it using its new path, I consistently get XJ004 errors:
ERROR XJ004: Database 'blahblah' not found.
I am sure I am using the correct connection string. Is there any possibility the database is somehow getting corrupted? Or is the database path encoded inside the database itself, such that if you relocate a Java DB it gets confused?
I'm really at a loss here. :( Please help!
Jim
Have you verified that this error message isn't also used when there's no listener on the host machine, and were you using Java DB on your local machine before the relocation? Many database systems (and I'm not that familiar with Java DB) ship set up to allow connections only from localhost for security reasons. On PostgreSQL, for instance, you have to allow TCP connections and bounce the daemon to obtain a remote connection.
Anyway, since the problem started when you went remote, look for issues related to that first! (And if you can run your application on the remote machine, does that work?)
There must be a file named derby.log somewhere. Check the error there. If it is not detailed enough, try setting derby.stream.error.logSeverityLevel to a lower value. See the manual for more information.
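One quick thing to rule out: the embedded Derby driver resolves a bare database name like 'blahblah' against derby.system.home (or the current working directory), so after moving the database, connecting with an absolute path in the URL is the safest test. A minimal sketch, with a made-up path:

import java.sql.Connection;
import java.sql.DriverManager;

public class DerbyPathSketch {
    public static void main(String[] args) throws Exception {
        // A bare name like "blahblah" is resolved relative to
        // derby.system.home (or the working directory), which changes
        // when the database directory is moved. The path below is made up.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:derby:/new/location/blahblah")) {
            System.out.println("connected: " + conn.getMetaData().getURL());
        }
    }
}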

Help, I don't know how to handle this error (java.lang.RuntimeException: EMBEDDED Broker start failure:code = 1)

I followed this tutorial (http://www.netbeans.org/kb/docs/websvc/rest-mysql.html) and it worked, but when I tried the same steps with my own database, I got the error below. I followed the tutorial step by step but still get the error. Does anyone know how to handle it, or is this a bug too?
MQJMSRA_RA4001: start:Aborting:Exception starting EMBEDDED broker=EMBEDDED Broker start failure:code = 1
java.lang.RuntimeException: EMBEDDED Broker start failure:code = 1
at com.sun.messaging.jms.ra.EmbeddedBrokerRunner.start(EmbeddedBrokerRunner.java:268)
at com.sun.messaging.jms.ra.ResourceAdapter.start(ResourceAdapter.java:472)
I had the same issue. Apparently, this happens when you change your server IP address and the lock file keeps the old IP. Simply delete the lock file:
\imq\instances\imqbroker\lock
Source:
http://old.nabble.com/Glassfish-V2-failed-to-start,-broker%3Dembedded,-failure:code-%3D-1-td17728741.html
This seems to be related to a bug in GlassFish; upgrading to a later GlassFish version will probably help.
