I am working with a route which downloads files from a remote server to local directory using SFTP.
Apache Camel version: 2.15.2
From endpoint: sftp://xxx.xxx.xxx.xxx:xx//User/User01?delay=30s&include=File.*.csv&initialDelay=1m&password=xxxxxx&stepwise=false&streamDownload=true&username=User01
To endpoint: file:///var/opt/myfolder/incoming?doneFileName=${file:name}.done
The remote location has more than 10 files available for download. After downloading 2-3 files, the route gets stuck and after around 30 seconds I get the below error in logs:
DEBUG 04/11/15 07:45:24,183 org.apache.camel.component.file.FileOperations :Using InputStream to write file: /var/opt/myfolder/incoming/File01.csv
...
around 30 secs gap
...
INFO 04/11/15 07:49:53,820 org.apache.camel.component.file.remote.SftpOperations$JSchLogger :JSCH -> Caught an exception, leaving main loop due to Connection reset
INFO 04/11/15 07:49:53,821 org.apache.camel.component.file.remote.SftpOperations$JSchLogger :JSCH -> Disconnecting from xxx.xxx.xxx.xxx port xx
WARN 04/11/15 07:49:53,823 org.apache.camel.util.IOHelper :Cannot close: File01.csv. Reason: Pipe closed
java.io.IOException: Pipe closed
at java.io.PipedInputStream.read(PipedInputStream.java:308)
at java.io.PipedInputStream.read(PipedInputStream.java:378)
at java.io.InputStream.skip(InputStream.java:222)
at com.jcraft.jsch.ChannelSftp.skip(ChannelSftp.java:2894)
at com.jcraft.jsch.ChannelSftp.access$600(ChannelSftp.java:36)
at com.jcraft.jsch.ChannelSftp$RequestQueue.cancel(ChannelSftp.java:1246)
at com.jcraft.jsch.ChannelSftp$2.close(ChannelSftp.java:1503)
at org.apache.camel.util.IOHelper.close(IOHelper.java:326)
at org.apache.camel.component.file.FileOperations.writeFileByStream(FileOperations.java:404)
at org.apache.camel.component.file.FileOperations.storeFile(FileOperations.java:274)
at org.apache.camel.component.file.GenericFileProducer.writeFile(GenericFileProducer.java:277)
at org.apache.camel.component.file.GenericFileProducer.processExchange(GenericFileProducer.java:165)
at org.apache.camel.component.file.GenericFileProducer.process(GenericFileProducer.java:79)
at org.apache.camel.util.AsyncProcessorConverterHelper$ProcessorToAsyncProcessorBridge.process(AsyncProcessorConverterHelper.java:61)
at org.apache.camel.processor.SendProcessor.process(SendProcessor.java:129)
at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:77)
at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:448)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:191)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:118)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:80)
at org.apache.camel.processor.ChoiceProcessor.process(ChoiceProcessor.java:111)
at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:77)
at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:448)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:191)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:118)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:80)
at org.apache.camel.util.AsyncProcessorHelper.process(AsyncProcessorHelper.java:109)
at org.apache.camel.processor.Pipeline.process(Pipeline.java:60)
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:166)
at org.apache.camel.component.file.GenericFileConsumer.processExchange(GenericFileConsumer.java:435)
at org.apache.camel.component.file.remote.RemoteFileConsumer.processExchange(RemoteFileConsumer.java:137)
at org.apache.camel.component.file.GenericFileConsumer.processBatch(GenericFileConsumer.java:211)
at org.apache.camel.component.file.GenericFileConsumer.poll(GenericFileConsumer.java:175)
at org.apache.camel.impl.ScheduledPollConsumer.doRun(ScheduledPollConsumer.java:174)
at org.apache.camel.impl.ScheduledPollConsumer.run(ScheduledPollConsumer.java:101)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
I have set redelivery so it re attempts the download. The reattempt happens for the last file and as per the logs the file is downloaded. But when i check the folder the file size is 0 and even the .done file is created, the actual file size on remote server is 28 KB.
For rest of the files i get the below error for each file and none of the file is downloaded:
WARN 04/11/15 07:49:58,877 org.slf4j.helpers.MarkerIgnoringBase :Error processing file RemoteFile[/User/User01/File02.csv] due to Cannot retrieve file: /User/User01/File02.csv. Caused by: [org.apache.camel.component.file.GenericFileOperationFailedException - Cannot retrieve file: /User/User01/File02.csv]
org.apache.camel.component.file.GenericFileOperationFailedException: Cannot retrieve file: /User/User01/File02.csv
at org.apache.camel.component.file.remote.SftpOperations.retrieveFileToStreamInBody(SftpOperations.java:651)
at org.apache.camel.component.file.remote.SftpOperations.retrieveFile(SftpOperations.java:594)
at org.apache.camel.component.file.GenericFileConsumer.processExchange(GenericFileConsumer.java:396)
at org.apache.camel.component.file.remote.RemoteFileConsumer.processExchange(RemoteFileConsumer.java:137)
at org.apache.camel.component.file.GenericFileConsumer.processBatch(GenericFileConsumer.java:211)
at org.apache.camel.component.file.GenericFileConsumer.poll(GenericFileConsumer.java:175)
at org.apache.camel.impl.ScheduledPollConsumer.doRun(ScheduledPollConsumer.java:174)
at org.apache.camel.impl.ScheduledPollConsumer.run(ScheduledPollConsumer.java:101)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: 4:
at com.jcraft.jsch.ChannelSftp.get(ChannelSftp.java:1513)
at com.jcraft.jsch.ChannelSftp.get(ChannelSftp.java:1266)
at org.apache.camel.component.file.remote.SftpOperations.retrieveFileToStreamInBody(SftpOperations.java:636)
... 14 more
Caused by: java.io.IOException: Pipe closed
at java.io.PipedInputStream.read(PipedInputStream.java:308)
at com.jcraft.jsch.Channel$MyPipedInputStream.updateReadSide(Channel.java:362)
at com.jcraft.jsch.ChannelSftp.get(ChannelSftp.java:1287)
... 16 more
I tried using both disconnect=true and false, the issue is happening in both the cases.
Any suggestions what could be wrong?
You had used both initial delay which is the delay before polling for a file by the consumer and delay which is before the next poll. Both creates a time buffer which is
Initialdelay+DownloadTime+Delay+WriteDestinationDirectoryTime
Try using only "delay".
Try appending a timestamp to the file being stored in destination.
Try camel-ftp2
Related
I am running a Spring-Boot project in IDEA Intelli for training that writes a recipe to the database using REST-full services. The IDE runs the server and I use Postman to query.
My problem is that the H2 database does not persist the data entered. In memory it's fine, but as soon as I shutdown and start up again, the data is gone.
Looking at the database file, I see a trace file that seems to show that the DB wasn't updated because there was a lock on it. I have searched for a lock file (locate *.lock.db), stopped all Java processes, and even rebooted without a change in behavior. I've changed the FILE_LOCK options for H2 but the problem persists.
EDIT: I just noticed that I hadn't tried FILE_LOCK=FILE. I just tried it and the db isn't updated. However, a trace file is not produced.
EDIT 2: I forgot to say that I'm running on Ubuntu 20.04.
EDIT 3: I am being careful to shutdown using the "actuator/shutdown" from Postman. Using file locking, I do this:
start the server running from my IDE. I can see the lock file being created
create a new recipe
make sure I can get it (returns the recipe, no error)
shutdown using the actuator. The lock file is removed.
startup the server again. New lock file.
try to get the recipe. Returns a 404 error. No trace file.
What could be causing this? Here is my trace.db file [note: when I use FILE_LOCK=FILE, no trace file is created, so maybe the locking problem was a red herring?]:
2021-09-09 07:54:08 database: flush
org.h2.message.DbException: General error: "java.lang.IllegalStateException: The file is locked: nio:/home/knute/IdeaProjects/recipes_db.mv.db [1.4.200/7]" [50000-200]
at org.h2.message.DbException.get(DbException.java:194)
at org.h2.message.DbException.convert(DbException.java:347)
at org.h2.mvstore.db.MVTableEngine$1.uncaughtException(MVTableEngine.java:93)
at org.h2.mvstore.MVStore.handleException(MVStore.java:2877)
at org.h2.mvstore.MVStore.panic(MVStore.java:481)
at org.h2.mvstore.MVStore.<init>(MVStore.java:402)
at org.h2.mvstore.MVStore$Builder.open(MVStore.java:3579)
at org.h2.mvstore.db.MVTableEngine$Store.open(MVTableEngine.java:170)
at org.h2.mvstore.db.MVTableEngine.init(MVTableEngine.java:103)
at org.h2.engine.Database.getPageStore(Database.java:2659)
at org.h2.engine.Database.open(Database.java:675)
at org.h2.engine.Database.openDatabase(Database.java:307)
at org.h2.engine.Database.<init>(Database.java:301)
at org.h2.engine.Engine.openSession(Engine.java:74)
at org.h2.engine.Engine.openSession(Engine.java:192)
at org.h2.engine.Engine.createSessionAndValidate(Engine.java:171)
at org.h2.engine.Engine.createSession(Engine.java:166)
at org.h2.engine.Engine.createSession(Engine.java:29)
at org.h2.engine.SessionRemote.connectEmbeddedOrServer(SessionRemote.java:340)
at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:173)
at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:152)
at org.h2.Driver.connect(Driver.java:69)
at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:138)
at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:358)
at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:206)
at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:477)
at com.zaxxer.hikari.pool.HikariPool.access$100(HikariPool.java:71)
at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:725)
at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:711)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.h2.jdbc.JdbcSQLNonTransientException: General error: "java.lang.IllegalStateException: The file is locked: nio:/home/knute/IdeaProjects/recipes_db.mv.db [1.4.200/7]" [50000-200]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:505)
at org.h2.message.DbException.getJdbcSQLException(DbException.java:429)
... 33 more
Caused by: java.lang.IllegalStateException: The file is locked: nio:/home/knute/IdeaProjects/recipes_db.mv.db [1.4.200/7]
at org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:950)
at org.h2.mvstore.FileStore.open(FileStore.java:166)
at org.h2.mvstore.MVStore.<init>(MVStore.java:381)
... 27 more
Caused by: java.nio.channels.OverlappingFileLockException
at java.base/sun.nio.ch.FileLockTable.checkList(FileLockTable.java:229)
at java.base/sun.nio.ch.FileLockTable.add(FileLockTable.java:123)
at java.base/sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1154)
at org.h2.store.fs.FileNio.tryLock(FilePathNio.java:121)
at java.base/java.nio.channels.FileChannel.tryLock(FileChannel.java:1165)
at org.h2.mvstore.FileStore.open(FileStore.java:163)
... 28 more
2021-09-09 07:54:08 database: flush
org.h2.message.DbException: General error: "java.lang.IllegalStateException: The file is locked: nio:/home/knute/IdeaProjects/recipes_db.mv.db [1.4.200/7]" [50000-200]
at org.h2.message.DbException.get(DbException.java:194)
at org.h2.message.DbException.convert(DbException.java:347)
at org.h2.mvstore.db.MVTableEngine$1.uncaughtException(MVTableEngine.java:93)
at org.h2.mvstore.MVStore.handleException(MVStore.java:2877)
at org.h2.mvstore.MVStore.panic(MVStore.java:481)
at org.h2.mvstore.MVStore.<init>(MVStore.java:402)
at org.h2.mvstore.MVStore$Builder.open(MVStore.java:3579)
at org.h2.mvstore.db.MVTableEngine$Store.open(MVTableEngine.java:170)
at org.h2.mvstore.db.MVTableEngine.init(MVTableEngine.java:103)
at org.h2.engine.Database.getPageStore(Database.java:2659)
at org.h2.engine.Database.open(Database.java:675)
at org.h2.engine.Database.openDatabase(Database.java:307)
at org.h2.engine.Database.<init>(Database.java:301)
at org.h2.engine.Engine.openSession(Engine.java:74)
at org.h2.engine.Engine.openSession(Engine.java:192)
at org.h2.engine.Engine.createSessionAndValidate(Engine.java:171)
at org.h2.engine.Engine.createSession(Engine.java:166)
at org.h2.engine.Engine.createSession(Engine.java:29)
at org.h2.engine.SessionRemote.connectEmbeddedOrServer(SessionRemote.java:340)
at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:173)
at org.h2.jdbc.JdbcConnection.<init>(JdbcConnection.java:152)
at org.h2.Driver.connect(Driver.java:69)
at com.zaxxer.hikari.util.DriverDataSource.getConnection(DriverDataSource.java:138)
at com.zaxxer.hikari.pool.PoolBase.newConnection(PoolBase.java:358)
at com.zaxxer.hikari.pool.PoolBase.newPoolEntry(PoolBase.java:206)
at com.zaxxer.hikari.pool.HikariPool.createPoolEntry(HikariPool.java:477)
at com.zaxxer.hikari.pool.HikariPool.access$100(HikariPool.java:71)
at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:725)
at com.zaxxer.hikari.pool.HikariPool$PoolEntryCreator.call(HikariPool.java:711)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.h2.jdbc.JdbcSQLNonTransientException: General error: "java.lang.IllegalStateException: The file is locked: nio:/home/knute/IdeaProjects/recipes_db.mv.db [1.4.200/7]" [50000-200]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:505)
at org.h2.message.DbException.getJdbcSQLException(DbException.java:429)
... 33 more
Caused by: java.lang.IllegalStateException: The file is locked: nio:/home/knute/IdeaProjects/recipes_db.mv.db [1.4.200/7]
at org.h2.mvstore.DataUtils.newIllegalStateException(DataUtils.java:950)
at org.h2.mvstore.FileStore.open(FileStore.java:166)
at org.h2.mvstore.MVStore.<init>(MVStore.java:381)
... 27 more
Caused by: java.nio.channels.OverlappingFileLockException
at java.base/sun.nio.ch.FileLockTable.checkList(FileLockTable.java:229)
at java.base/sun.nio.ch.FileLockTable.add(FileLockTable.java:123)
at java.base/sun.nio.ch.FileChannelImpl.tryLock(FileChannelImpl.java:1154)
at org.h2.store.fs.FileNio.tryLock(FilePathNio.java:121)
at java.base/java.nio.channels.FileChannel.tryLock(FileChannel.java:1165)
at org.h2.mvstore.FileStore.open(FileStore.java:163)
... 28 more
...and the application.properties file [note: in my current iteration, I am using FILE_LOCK=FILE]:
# Required by HyperSkill
server.port=8881
management.endpoints.web.exposure.include=*
management.endpoint.shutdown.enabled=true
spring.datasource.url=jdbc:h2:file:../recipes_db
# Solves file locking problem? No.
#spring.datasource.url=jdbc:h2:file:../recipes_db;DB_CLOSE_ON_EXIT=TRUE;FILE_LOCK=NO
#spring.datasource.url=jdbc:h2:file:../recipes_db;FILE_LOCK=NO
#spring.datasource.url=jdbc:h2:file:../recipes_db;FILE_LOCK=SOCKET
#spring.datasource.url=jdbc:h2:file:../recipes_db;FILE_LOCK=FS
# Needed?
spring.datasource.auto-commit=true
# To remove warning
spring.jpa.open-in-view=true
I found the problem! Adding debug=true to the applications.properties file, I saw this:
Starting delayed evictData of schema as part of SessionFactory shut-down
So for some reason, Spring was explicitly dropping my table! Adding this to the applications.properties file stopped that:
spring.jpa.hibernate.ddl-auto=update
I wrote a mapreduce job to scan an hbase table for a certain time range to count certain elements we need for analysis.
Mappers in the MR job keeps failing but I don't know why. Seems like each time I run the job, a different number of mappers fail. The YARN log (see below) from Cloudera manager isn't helpful in pointing what the problem is, although, someone said I might be running out of memory.
It seems to retry multiple times but each time it fails. What do I need to do to make it stop failing or how can I log things to help me better determine what is happening?
Below is a log from YARN for one of the mappers that failed.
Error: org.apache.hadoop.hbase.client.RetriesExhaustedException:
Failed after attempts=36, exceptions: Thu Jun 15 16:26:57 PDT 2017,
null, java.net.SocketTimeoutException: callTimeout=60000,
callDuration=60301: row '152_p3401.db161139.sjc102.dbi_1496271480' on
table 'dbi_based_data' at
region=dbi_based_data,151_p3413.db162024.iad4.dbi_1476974340,1486675565213.d83250d0682e648d165872afe5abd60e., hostname=hslave35118.ams9.mysecretdomain.com,60020,1483570489305,
seqNum=19308931 at
org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.throwEnrichedException(RpcRetryingCallerWithReadReplicas.java:276)
at
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:207)
at
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:60)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at
org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
at
org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403)
at
org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
at
org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:236)
at
org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:147)
at
org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$1.nextKeyValue(TableInputFormatBase.java:216)
at
org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
at
org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at
org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144) at
org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) at
org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at
org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at
java.security.AccessController.doPrivileged(Native Method) at
javax.security.auth.Subject.doAs(Subject.java:415) at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) Caused
by: java.net.SocketTimeoutException: callTimeout=60000,
callDuration=60301: row '152_p3401.db161139.sjc102.dbi_1496271480' on
table 'dbi_based_data' at
region=dbi_based_data,151_p3413.db162024.iad4.dbi_1476974340,1486675565213.d83250d0682e648d165872afe5abd60e., hostname=hslave35118.ams9.mysecretdomain.com,60020,1483570489305,
seqNum=19308931 at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:159)
at
org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture.run(ResultBoundedCompletionService.java:65)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745) Caused by:
java.io.IOException: Call to
hslave35118.ams9.mysecretdomain.com/10.216.35.118:60020 failed on
local exception: org.apache.hadoop.hbase.ipc.CallTimeoutException:
Call id=12, waitTime=60001, operationTimeout=60000 expired. at
org.apache.hadoop.hbase.ipc.AbstractRpcClient.wrapException(AbstractRpcClient.java:291)
at
org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1272)
at
org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:226)
at
org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:331)
at
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.scan(ClientProtos.java:34094)
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:219)
at
org.apache.hadoop.hbase.client.ScannerCallable.call(ScannerCallable.java:64)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
at
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:360)
at
org.apache.hadoop.hbase.client.ScannerCallableWithReplicas$RetryingRPC.call(ScannerCallableWithReplicas.java:334)
at
org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithRetries(RpcRetryingCaller.java:126)
... 4 more Caused by:
org.apache.hadoop.hbase.ipc.CallTimeoutException: Call id=12,
waitTime=60001, operationTimeout=60000 expired. at
org.apache.hadoop.hbase.ipc.Call.checkAndSetTimeout(Call.java:73) at
org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1246)
... 13 more
So it looks like for my case I needed to extend the timeout setting. In my Java program I had to add the following lines to make the exception go away:
conf.set("hbase.rpc.timeout","90000");
conf.set("hbase.client.scanner.timeout.period","90000");
The answer was found on this link on Cloudera's site
I have big problem with ES :(
When i insert bulk (size = 20) a lot of document to ES, ES server throw below exception.
I find many topic discuss about this, but nothing. :sosad: , Any help me, what actually happened ??? Thks so much.
Sr for my bad english.
I using ES 2.3 , client Transport 2.2.1.
Server config
http.port: 9200
http.max_content_length: 100mb
node.name: "es_test"
nod.master: true
node.data: true
index.store.type: niofs
index.number_of_shards: 5
index.number_of_replicas: 0
discovery.zen.ping.multicast.enabled: false
script.inline: on
script.indexed: on
bootstrap.mlockall: true
Erros1
[2016-03-31 07:45:02,601][ERROR][index.engine ] [es_test] [my_index][1] failed to merge
java.io.EOFException: read past EOF: NIOFSIndexInput(path="/data/es_test/data/es_test/nodes/0/indices/my_index/1/index/_190.fnm")
at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:336)
at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
at org.apache.lucene.store.BufferedChecksumIndexInput.readByte(BufferedChecksumIndexInput.java:41)
at org.apache.lucene.store.DataInput.readInt(DataInput.java:101)
at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:195)
at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:256)
at org.apache.lucene.codecs.lucene50.Lucene50FieldInfosFormat.read(Lucene50FieldInfosFormat.java:115)
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:99)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:65)
at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:145)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4233)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3664)
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
Suppressed: org.apache.lucene.index.CorruptIndexException: checksum status indeterminate: remaining=0, please run checkindex for more details (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/data/es_test/data/es_test/nodes/0/indices/my_index/1/index/_190.fnm")))
at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:371)
at org.apache.lucene.codecs.lucene50.Lucene50FieldInfosFormat.read(Lucene50FieldInfosFormat.java:164)
... 8 more
[2016-03-31 07:45:02,608][WARN ][index.engine ] [es_test] [my_index][1] failed engine [already closed by tragic event on the index writer]
java.io.EOFException: read past EOF: NIOFSIndexInput(path="/data/es_test/data/es_test/nodes/0/indices/my_index/1/index/_190.fnm")
at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:336)
at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
at org.apache.lucene.store.BufferedChecksumIndexInput.readByte(BufferedChecksumIndexInput.java:41)
at org.apache.lucene.store.DataInput.readInt(DataInput.java:101)
at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:195)
at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:256)
at org.apache.lucene.codecs.lucene50.Lucene50FieldInfosFormat.read(Lucene50FieldInfosFormat.java:115)
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:99)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:65)
at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:145)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4233)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3664)
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
Suppressed: org.apache.lucene.index.CorruptIndexException: checksum status indeterminate: remaining=0, please run checkindex for more details (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/data/es_test/data/es_test/nodes/0/indices/my_index/1/index/_190.fnm")))
at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:371)
at org.apache.lucene.codecs.lucene50.Lucene50FieldInfosFormat.read(Lucene50FieldInfosFormat.java:164)
... 8 more
[2016-03-31 07:45:02,609][ERROR][index.engine ] [es_test] [my_index][4] failed to merge
java.io.EOFException: read past EOF: NIOFSIndexInput(path="/data/es_test/data/es_test/nodes/0/indices/my_index/4/index/_190.fdx")
at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:336)
at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:54)
at org.apache.lucene.store.BufferedChecksumIndexInput.readByte(BufferedChecksumIndexInput.java:41)
at org.apache.lucene.store.DataInput.readInt(DataInput.java:101)
at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:195)
at org.apache.lucene.codecs.CodecUtil.checkIndexHeader(CodecUtil.java:256)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.<init>(CompressingStoredFieldsReader.java:133)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsReader(CompressingStoredFieldsFormat.java:121)
at org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.fieldsReader(Lucene50StoredFieldsFormat.java:173)
at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:117)
at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:65)
at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:145)
at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4233)
at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3664)
at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:588)
at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:94)
at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:626)
Suppressed: org.apache.lucene.index.CorruptIndexException: checksum status indeterminate: remaining=0, please run checkindex for more details (resource=BufferedChecksumIndexInput(NIOFSIndexInput(path="/data/es_test/data/es_test/nodes/0/indices/my_index/4/index/_190.fdx")))
at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:371)
at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.<init>(CompressingStoredFieldsReader.java:140)
... 10 more
Erros2
[2016-03-31 20:04:07,419][DEBUG][action.admin.cluster.node.stats] [es_test] failed to execute on node [mplUA6JET92RPgmNx-DPMA]
RemoteTransportException[[es_test][ip:9300][cluster:monitor/nodes/stats[n]]]; nested: AlreadyClosedException[this IndexReader is closed];
Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed
at org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:274)
at org.apache.lucene.index.CompositeReader.getContext(CompositeReader.java:101)
at org.apache.lucene.index.CompositeReader.getContext(CompositeReader.java:55)
at org.apache.lucene.index.IndexReader.leaves(IndexReader.java:438)
at org.elasticsearch.search.suggest.completion.Completion090PostingsFormat.completionStats(Completion090PostingsFormat.java:330)
at org.elasticsearch.index.shard.IndexShard.completionStats(IndexShard.java:765)
at org.elasticsearch.action.admin.indices.stats.CommonStats.<init>(CommonStats.java:164)
at org.elasticsearch.indices.IndicesService.stats(IndicesService.java:253)
at org.elasticsearch.node.service.NodeService.stats(NodeService.java:157)
at org.elasticsearch.action.admin.cluster.node.stats.TransportNodesStatsAction.nodeOperation(TransportNodesStatsAction.java:82)
at org.elasticsearch.action.admin.cluster.node.stats.TransportNodesStatsAction.nodeOperation(TransportNodesStatsAction.java:44)
at org.elasticsearch.action.support.nodes.TransportNodesAction.nodeOperation(TransportNodesAction.java:92)
at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:230)
at org.elasticsearch.action.support.nodes.TransportNodesAction$NodeTransportHandler.messageReceived(TransportNodesAction.java:226)
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:75)
at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I suggest that you receive the error, since upgrading to ES 2.3?
The cause is most likely that your transport client version is older than your clusters version.
When using the transport Client, the versions have to be compatible.
I am very new to Camus and Hadoop, and am running into an exception error. I am trying to write some avro files to a hdfs, and keep getting the following error block:
[EtlMultiOutputRecordWriter] - ExceptionWritable key: topic=_schemas partition=0leaderId=0 server= service= beginOffset=0 offset=0 msgSize=1024 server= checksum=0 time=1450371931447 value: java.lang.Exception
at com.linkedin.camus.etl.kafka.common.KafkaReader.getNext(KafkaReader.java:108)
at com.linkedin.camus.etl.kafka.mapred.EtlRecordReader.nextKeyValue(EtlRecordReader.java:232)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
... 14 more
I looked up line 108 in com.linkedin.camus.etl.kafka.common.KafkaReader.getNext and found it to be this: MessageAndOffset msgAndOffset = messageIter.next();.
I am using io.confluent.camus.etl.kafka.coders.AvroMessageDecoder for my decoder and com.linkedin.camus.example.DummySchemaRegistry for my coder.
At the end of the logs I get another line indicating an error from one of the hdfs files: Error from file [hdfs://localhost:9000/user/username/exec/2015-12-17-17-05-25/errors-m-00000]. The error-m-00000 file contains a somewhat readable beginning, but then changes to an undecipherable string:
SEQ*com.linkedin.camus.etl.kafka.common.EtlKey5com.linkedin.camus.etl.kafka.common.ExceptionWritable*org.apache.hadoop.io.compress.DefaultCodec|Ò ∫±ß˝}pºHí$ò¸·:0schemasQ∞∆øÿxúïîÀN√0E7l‡+∫»¢lFMõ>á*êxU®™ËzÍmàc[ÆÕ„XÚÕÿqZ%#[ÿD±gÓô…¯∆üGœ¯Ç¿Q,·Úçë2ô'«hZL¿3ëSöXÿ5ê·ê„Sé‡ÇÖpÎS¬î4,…LËÕ¥Î{û}wFßáâ*M)>%&uZÑCfi“˚#rKÌÔ¡flÌu^Í%†B∂"Xa*•⁄0ÔQÕpùGzùidy&ñªkT…śԈ≥-#0>›…∆RG∫.ˇÅ¨«JÚ®sÃ≥Ö¡\£Rîfi˚ßéT≥D#%T8ãW®ÚµÌ∫4N˙©W∫©mst√—Ô嶥óhÓ$C~#S+Ñâ{ãÇfl¡ßí⁄L´ÏíÙºÙΩ5wfÃjM¬∏_Äò5RØ£
Ë"Eeúÿëx{ÆÏ«{XW÷XM€O¨-C#É¡Òl•ù9§‰õö2ó:wɲ%Œ-N∫ˇbFXˆ∑:àá5fyQÑ‘ö™:roõ1⁄5•≠≈˚yM0±ú?»ÃW◊.h≈I´êöNæ
[û3
At the end it appears that a hadoop job has run, but a commit never takes place, based of the timing report:
Job time (seconds):
pre setup 1.0 (11%)
get splits 1.0 (11%)
hadoop job 4.0 (44%)
commit 0.0 (0%)
Total: 0 minutes 9 seconds
Any help or an idea of where to look to resolve this would be greatly appreciated. Thank you.
I am trying to use sstableloader to stream data to a Cassandra database, which is in fact in the same node. It used to work when i was using DSE 2.2 but when i upgraded it to DSE 4.5 and made all the relevant changes in the cassandra.yaml file, it stopped working and now it is throwing an error like this:
Established connection to initial hosts
Opening sstables and calculating sections to stream
Streaming relevant part of demo/test_yale/demo-test_yale-jb-2-Data.db demo/test_yale/demo-test_yale-jb-1-Data.db to [/127.0.0.1]
Streaming session ID: 02225ef0-1c17-11e4-a1ea-5f2d4f6a32c1
progress: [/127.0.0.1 1/2 (88%)] [total: 88% - 2147483647MB/s (avg: 14MB/s)]ERROR 16:36:29,029 [Stream #02225ef0-1c17-11e4-a1ea-5f2d4f6a32c1] Streaming error occurred
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:98)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at com.ning.compress.lzf.LZFChunk.writeCompressedHeader(LZFChunk.java:77)
at com.ning.compress.lzf.ChunkEncoder.encodeAndWriteChunk(ChunkEncoder.java:132)
at com.ning.compress.lzf.LZFOutputStream.writeCompressedBlock(LZFOutputStream.java:203)
at com.ning.compress.lzf.LZFOutputStream.write(LZFOutputStream.java:97)
at org.apache.cassandra.streaming.StreamWriter.write(StreamWriter.java:151)
at org.apache.cassandra.streaming.StreamWriter.write(StreamWriter.java:101)
at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:59)
at org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42)
at org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:45)
at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.sendMessage(ConnectionHandler.java:383)
at org.apache.cassandra.streaming.ConnectionHandler$OutgoingMessageHandler.run(ConnectionHandler.java:363)
at java.lang.Thread.run(Thread.java:745)
WARN 16:36:29,032 [Stream #02225ef0-1c17-11e4-a1ea-5f2d4f6a32c1] Stream failed
Streaming to the following hosts failed:
[/127.0.0.1]
java.util.concurrent.ExecutionException: org.apache.cassandra.streaming.StreamException: Stream failed
I have even tried assigning actual ip address of the node for the listen_address, broadcast_address, and rpc_address in the cassandra.yaml file but the same error occurs.
Can anyone be of assistance please?
It's worth looking at your system.log as specified in cassandra/conf/logback.xml, as suggested by Zanson.
In my case the issue was simply with exhausting disk space on the node:
ERROR [STREAM-IN-/xx.xx.xx.xx] 2016-08-02 10:50:31,125 StreamSession.java:505 - [Stream #8420bfa0-589c-11e6-9512-235b1f79cf1b] Streaming error occurred
java.io.IOException: No space left on device
at java.io.RandomAccessFile.writeBytes(Native Method) ~[na:1.8.0_101]
at java.io.RandomAccessFile.write(RandomAccessFile.java:525) ~[na:1.8.0_101]