JavaMail API copies message successfully but email deleted from Exchange server - java

Using the JavaMail API, we monitor the Inbox folder and process emails. If an error occurs while processing an email, we move it to an error folder.
If that succeeds, we delete the email from the Inbox folder. Below is a snippet of the mail debug output. It shows the copy as successful, but the email is never found in the error folder and it is also deleted from the Inbox.
Why would this happen? And why would the JavaMail API report success even though the mail is not copied?
2013-10-04 14:25:20,886 [] [] [] INFO [monitorScheduler-1] monitor.EmailMonitor monitor.EmailMonitor (EmailMonitor.java:393) - Copy error message to error folder
2013-10-04 14:25:20,889 [] [] [] INFO [monitorScheduler-1] STDOUT util.LoggerStream (LoggerStream.java:156) - A10 COPY 1 Inbox/error
2013-10-04 14:25:20,896 [] [] [] INFO [monitorScheduler-1] STDOUT util.LoggerStream (LoggerStream.java:156) - A10 OK COPY completed.
2013-10-04 14:25:20,897 [] [] [] INFO [monitorScheduler-1] monitor.EmailMonitor monitor.EmailMonitor (EmailMonitor.java:400) - Mark message as deleted from monitored folder
2013-10-04 14:25:20,897 [] [] [] INFO [monitorScheduler-1] STDOUT util.LoggerStream (LoggerStream.java:156) - A11 STORE 1 +FLAGS (\Deleted)
2013-10-04 14:25:20,907 [] [] [] INFO [monitorScheduler-1] STDOUT util.LoggerStream (LoggerStream.java:156) - * 1 FETCH (FLAGS (\Seen \Deleted \Recent))
A11 OK STORE completed.
2013-10-04 14:25:20,907 [] [] [] INFO [monitorScheduler-1] monitor.EmailMonitor monitor.EmailMonitor (EmailMonitor.java:404) - Expunge the monitored folder
2013-10-04 14:25:20,908 [] [] [] INFO [monitorScheduler-1] STDOUT util.LoggerStream (LoggerStream.java:156) - A12 EXPUNGE
2013-10-04 14:25:20,922 [] [] [] INFO [monitorScheduler-1] STDOUT util.LoggerStream (LoggerStream.java:156) - * 1 EXPUNGE
* 0 EXISTS
A12 OK EXPUNGE completed.

It's your server that's reporting success, not JavaMail.
Try using a different name for the error folder, something not nested under Inbox, in case that helps.
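For reference, here is a minimal sketch of that move using a top-level error folder, with a check on the destination afterwards (the folder names and the already-connected store are assumptions, not your actual code):
// uses javax.mail.Folder, javax.mail.Flags, javax.mail.Message from an already-connected Store
Folder inbox = store.getFolder("Inbox");
inbox.open(Folder.READ_WRITE);

Folder errorFolder = store.getFolder("errors");   // top level, not nested under Inbox
if (!errorFolder.exists()) {
    errorFolder.create(Folder.HOLDS_MESSAGES);
}

inbox.copyMessages(new Message[] { message }, errorFolder);   // the "OK COPY completed" in your log comes from the server
message.setFlag(Flags.Flag.DELETED, true);
inbox.expunge();

// sanity check that the copy really landed before trusting the delete
errorFolder.open(Folder.READ_ONLY);
System.out.println("Messages in error folder: " + errorFolder.getMessageCount());
errorFolder.close(false);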

Related

Flink TaskManager crashed while accessing S3 FileSource (on Kubernetes/OpenShift)

I have a job receiving events that contain an S3 link. It attempts to load the resource using the following snippet:
// value of s3 source is "s3://bucket_id/path/to/object.json"
List<String> collect = ExecutionEnvironment.getExecutionEnvironment().readTextFile(s3_source.toString()).collect();
Flink is configured accordingly in flink-conf.yaml
s3.access-key: XXX
s3.secret-key: XXX
s3.endpoint: s3.openshift-storage.svc
s3.path.style.access: true
The library flink-s3-fs-hadoop-1.16.0.jar is in path /opt/flink/plugins/flink-s3-fs-hadoop. I had some issues setting up the self-signed certificates (see this Gist for my config), but it seems to be working.
When starting the job through the JobManager's WebUI, I get the following logs
Job is starting
2023-01-26 10:33:09,891 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Receive slot request f9f291f0e4b74471e59a74602212060b for job c62504dec97d185a3e86fc390256e3f9 from resource manager with leader id 00000000000000000000000000000000.
2023-01-26 10:33:09,894 DEBUG org.apache.flink.runtime.memory.MemoryManager [] - Initialized MemoryManager with total memory size 178956973 and page size 32768.
2023-01-26 10:33:09,895 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Allocated slot for f9f291f0e4b74471e59a74602212060b.
2023-01-26 10:33:09,896 INFO org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Add job c62504dec97d185a3e86fc390256e3f9 for job leader monitoring.
2023-01-26 10:33:09,897 DEBUG org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - New leader information for job c62504dec97d185a3e86fc390256e3f9. Address: akka.tcp://flink#flink-jobmanager:6123/user/rpc/jobmanager_10, leader id: 00000000000000000000000000000000.
2023-01-26 10:33:09,897 INFO org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Try to register at job manager akka.tcp://flink#flink-jobmanager:6123/user/rpc/jobmanager_10 with leader id 00000000-0000-0000-0000-000000000000.
2023-01-26 10:33:09,898 DEBUG org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Try to connect to remote RPC endpoint with address akka.tcp://flink#flink-jobmanager:6123/user/rpc/jobmanager_10. Returning a org.apache.flink.runtime.jobmaster.JobMasterGateway gateway.
2023-01-26 10:33:09,910 INFO org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Resolved JobManager address, beginning registration
2023-01-26 10:33:09,910 DEBUG org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Registration at JobManager attempt 1 (timeout=100ms)
2023-01-26 10:33:09,991 DEBUG org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Registration with JobManager at akka.tcp://flink#flink-jobmanager:6123/user/rpc/jobmanager_10 was successful.
2023-01-26 10:33:09,993 INFO org.apache.flink.runtime.taskexecutor.DefaultJobLeaderService [] - Successful registration at job manager akka.tcp://flink#flink-jobmanager:6123/user/rpc/jobmanager_10 for job c62504dec97d185a3e86fc390256e3f9.
2023-01-26 10:33:09,993 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Establish JobManager connection for job c62504dec97d185a3e86fc390256e3f9.
2023-01-26 10:33:09,995 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Offer reserved slots to the leader of job c62504dec97d185a3e86fc390256e3f9.
2023-01-26 10:33:10,011 INFO org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Activate slot f9f291f0e4b74471e59a74602212060b.
Event is received from RabbitMQ Stream
2023-01-26 10:33:11,193 INFO io.av360.maverick.insights.rmqstreams.RMQStreamSource [] - Running event consumer as stream source
2023-01-26 10:33:11,194 INFO io.av360.maverick.insights.rmqstreams.config.RMQStreamsConfig [] - Creating consumer for stream 'events'
2023-01-26 10:33:11,195 INFO io.av360.maverick.insights.rmqstreams.config.RMQStreamsConfig [] - Creating environment required to connect to a RabbitMQ Stream.
2023-01-26 10:33:11,195 DEBUG io.av360.maverick.insights.rmqstreams.config.StreamsClientFactory [] - Building environment
2023-01-26 10:33:11,195 INFO io.av360.maverick.insights.rmqstreams.config.StreamsClientFactory [] - Valid configuration for host 'rabbitmq'
....
2023-01-26 10:33:14,907 INFO XXX [] - Event of type 'crawled.source.channel' with source 's3://bucket-d9c5a56e-c4a9-4b48-82dc-04241cb2b72c/scraped/source/channel/channel_UCucD43ut3DEx6QDK2JOEI1w.json'
Attempting to read from s3
2023-01-26 10:33:15,303 DEBUG org.apache.flink.fs.s3.common.AbstractS3FileSystemFactory [] - Creating S3 file system backed by Hadoop s3a file system
2023-01-26 10:33:15,303 DEBUG org.apache.flink.fs.s3.common.AbstractS3FileSystemFactory [] - Loading Hadoop configuration for Hadoop s3a file system
2023-01-26 10:33:15,500 DEBUG org.apache.flink.fs.s3hadoop.common.HadoopConfigLoader [] - Adding Flink config entry for s3.secret-key as fs.s3a.secret-key to Hadoop config
2023-01-26 10:33:15,500 DEBUG org.apache.flink.fs.s3hadoop.common.HadoopConfigLoader [] - Adding Flink config entry for s3.endpoint as fs.s3a.endpoint to Hadoop config
2023-01-26 10:33:15,500 DEBUG org.apache.flink.fs.s3hadoop.common.HadoopConfigLoader [] - Adding Flink config entry for s3.access-key as fs.s3a.access-key to Hadoop config
2023-01-26 10:33:15,500 DEBUG org.apache.flink.fs.s3hadoop.common.HadoopConfigLoader [] - Adding Flink config entry for s3.path.style.access as fs.s3a.path.style.access to Hadoop config
2023-01-26 10:33:15,705 DEBUG org.apache.flink.fs.s3hadoop.S3FileSystemFactory [] - Using scheme s3://bucket-d9c5a56e-c4a9-4b48-82dc-04241cb2b72c/scraped/source/channel/channel_UCucD43ut3DEx6QDK2JOEI1w.json for s3a file system backing the S3 File System
a few hadoop exceptions (relevant?)
2023-01-26 10:33:15,800 DEBUG org.apache.hadoop.util.Shell [] - Failed to detect a valid hadoop home directory
java.io.FileNotFoundException: HADOOP_HOME and hadoop.home.dir are unset.
...
2023-01-26 10:33:16,108 DEBUG org.apache.hadoop.metrics2.impl.MetricsConfig [] - Could not locate file hadoop-metrics2-s3a-file-system.properties
org.apache.commons.configuration2.ex.ConfigurationException: Could not locate: org.apache.commons.configuration2.io.FileLocator#4ac69a5d[fileName=hadoop-metrics2-s3a-file-system.properties,basePath=<null>,sourceURL=,encoding=<null>,fileSystem=<null>,locationStrategy=<null>]
...
2023-01-26 10:33:16,111 DEBUG org.apache.hadoop.metrics2.impl.MetricsConfig [] - Could not locate file hadoop-metrics2.properties
org.apache.commons.configuration2.ex.ConfigurationException: Could not locate: org.apache.commons.configuration2.io.FileLocator#4faa9611[fileName=hadoop-metrics2.properties,basePath=<null>,sourceURL=,encoding=<null>,fileSystem=<null>,locationStrategy=<null>]
...
2023-01-26 10:33:16,508 DEBUG org.apache.hadoop.util.NativeCodeLoader [] - Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
2023-01-26 10:33:16,508 DEBUG org.apache.hadoop.util.NativeCodeLoader [] - java.library.path=/usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib
2023-01-26 10:33:16,508 WARN org.apache.hadoop.util.NativeCodeLoader [] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2023-01-26 10:33:16,508 DEBUG org.apache.hadoop.util.PerformanceAdvisory [] - Falling back to shell based
...
2023-01-26 10:33:17,119 DEBUG org.apache.hadoop.security.ssl.DelegatingSSLSocketFactory [] - Initializing SSL Context to channel mode Default_JSSE
2023-01-26 10:33:17,710 DEBUG org.apache.hadoop.fs.s3a.impl.NetworkBinding [] - Unable to create class org.apache.hadoop.fs.s3a.impl.ConfigureShadedAWSSocketFactory, value of fs.s3a.ssl.channel.mode will be ignored
java.lang.NoClassDefFoundError: com/amazonaws/thirdparty/apache/http/conn/socket/ConnectionSocketFactory
Connection attempt
2023-01-26 10:33:17,997 DEBUG org.apache.hadoop.fs.s3a.DefaultS3ClientFactory [] - Creating endpoint configuration for "s3.openshift-storage.svc"
2023-01-26 10:33:17,998 DEBUG org.apache.hadoop.fs.s3a.DefaultS3ClientFactory [] - Endpoint URI = https://s3.openshift-storage.svc
2023-01-26 10:33:18,096 DEBUG org.apache.hadoop.fs.s3a.DefaultS3ClientFactory [] - Endpoint https://s3.openshift-storage.svc is not the default; parsing
2023-01-26 10:33:18,097 DEBUG org.apache.hadoop.fs.s3a.DefaultS3ClientFactory [] - Region for endpoint s3.openshift-storage.svc, URI https://s3.openshift-storage.svc is determined as openshift-storage
...
2023-01-26 10:33:18,806 DEBUG com.amazonaws.request [] - Sending Request: HEAD https://s3.openshift-storage.svc /bucket-d9c5a56e-c4a9-4b48-82dc-04241cb2b72c/scraped/source/channel/channel_UCucD43ut3DEx6QDK2JOEI1w.json Headers: (amz-sdk-invocation-id: xxx, Content-Type: application/octet-stream, User-Agent: Hadoop 3.3.2, aws-sdk-java/1.11.951 Linux/4.18.0-372.26.1.el8_6.x86_64 OpenJDK_64-Bit_Server_VM/11.0.17+8 java/11.0.17 vendor/Eclipse_Adoptium, )
...
2023-01-26 10:33:20,108 DEBUG org.apache.hadoop.fs.s3a.Invoker [] - Starting: open s3a://bucket-d9c5a56e-c4a9-4b48-82dc-04241cb2b72c/scraped/source/channel/channel_UCucD43ut3DEx6QDK2JOEI1w.json at 0
...
2023-01-26 10:33:19,598 DEBUG com.amazonaws.request [] - Received successful response: 200, AWS Request ID: ldcyiwlo-6w4j96-om3
Everything looks fine for now, last logs are
2023-01-26 10:33:24,299 INFO org.apache.flink.runtime.taskexecutor.TaskManagerServices [] - Temporary file directory '/tmp': total 119 GB, usable 48 GB (40.34% usable)
2023-01-26 10:33:24,299 DEBUG org.apache.flink.runtime.io.disk.FileChannelManagerImpl [] - FileChannelManager uses directory /tmp/flink-io-15f45aea-fa25-4f90-be7e-ad49e8722980 for spill files.
2023-01-26 10:33:24,299 INFO org.apache.flink.runtime.io.disk.iomanager.IOManager [] - Created a new FileChannelManager for spilling of task related data to disk (joins, sorting, ...). Used directories:
/tmp/flink-io-15f45aea-fa25-4f90-be7e-ad49e8722980
2023-01-26 10:33:24,301 DEBUG org.apache.flink.runtime.io.disk.FileChannelManagerImpl [] - FileChannelManager uses directory /tmp/flink-netty-shuffle-ff78f4af-d02b-412b-b305-414b570917a8 for spill files.
2023-01-26 10:33:24,301 INFO org.apache.flink.runtime.io.network.NettyShuffleServiceFactory [] - Created a new FileChannelManager for storing result partitions of BLOCKING shuffles. Used directories:
/tmp/flink-netty-shuffle-ff78f4af-d02b-412b-b305-414b570917a8
These are the last logs. From this point on, the job fails and the task manager is gone (and restarted through K8S).
My two questions are:
Can I tune the log levels to find out more (the root and Hadoop loggers are already on TRACE)?
What am I missing here?

WSO2AM strange errors after successful startup

I have installed a fresh version of WSO2AM 3.2.0 by just extracting the downloaded ZIP file to /opt/wso2am/wso2am-3.2.0/. I have also set up two databases (db_am and db_shared) and I am using those in the file deployment.toml as described in the documentation. Then I'm starting the application by just executing the wso2server.sh script. It starts up just fine and the following lines appear, telling me that the application has started successfully.
TID: [-1234] [] [2020-10-26 15:59:14,710] INFO {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - Server : WSO2 API Manager-3.2.0
TID: [-1234] [] [2020-10-26 15:59:14,717] INFO {org.wso2.carbon.core.internal.StartupFinalizerServiceComponent} - WSO2 Carbon started in 53 sec
TID: [-1] [] [2020-10-26 15:59:14,776] INFO {org.wso2.carbon.databridge.core.DataBridge} - user admin connected
But right after that it gives me the following errors:
TID: [-1234] [] [2020-10-26 15:59:15,600] ERROR {org.wso2.carbon.user.core.common.AbstractUserStoreManager} - java.lang.NumberFormatException: For input string: "q0MHWf0UB+WyZD03ES/pzA=="
TID: [-1234] [internal/data/v1] [2020-10-26 15:59:15,903] ERROR {org.wso2.carbon.user.core.common.AbstractUserStoreManager} - java.lang.NumberFormatException: For input string: "q0MHWf0UB+WyZD03ES/pzA=="
TID: [-1234] [internal/data/v1] [2020-10-26 15:59:15,903] ERROR {org.wso2.carbon.user.core.common.AbstractUserStoreManager} - java.lang.NumberFormatException: For input string: "q0MHWf0UB+WyZD03ES/pzA=="
TID: [-1234] [internal/data/v1] [2020-10-26 15:59:15,932] ERROR {org.wso2.carbon.user.core.common.AbstractUserStoreManager} - java.lang.NumberFormatException: For input string: "q0MHWf0UB+WyZD03ES/pzA=="
TID: [-1234] [internal/data/v1] [2020-10-26 15:59:15,933] ERROR {org.wso2.carbon.user.core.common.AbstractUserStoreManager} - java.lang.NumberFormatException: For input string: "q0MHWf0UB+WyZD03ES/pzA=="
TID: [-1234] [internal/data/v1] [2020-10-26 15:59:15,933] ERROR {org.wso2.carbon.user.core.common.AbstractUserStoreManager} - java.lang.NumberFormatException: For input string: "q0MHWf0UB+WyZD03ES/pzA=="
TID: [-1] [] [2020-10-26 15:59:15,651] ERROR {org.wso2.carbon.databridge.agent.endpoint.DataEndpointConnectionWorker} - Error while trying to connect to the endpoint. Cannot borrow client for ssl://<the servers local ip>:9711. org.wso2.carbon.databridge.agent.exception.DataEndpointLoginException: Cannot borrow client for ssl://<the servers local ip>:9711.
at org.wso2.carbon.databridge.agent.endpoint.DataEndpointConnectionWorker.connect(DataEndpointConnectionWorker.java:145)
at org.wso2.carbon.databridge.agent.endpoint.DataEndpointConnectionWorker.run(DataEndpointConnectionWorker.java:59)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.wso2.carbon.databridge.agent.exception.DataEndpointLoginException: Error while trying to login to data receiver :/<the servers local ip>:9711
at org.wso2.carbon.databridge.agent.endpoint.binary.BinaryDataEndpoint.login(BinaryDataEndpoint.java:50)
at org.wso2.carbon.databridge.agent.endpoint.DataEndpointConnectionWorker.connect(DataEndpointConnectionWorker.java:139)
... 6 more
Caused by: org.wso2.carbon.databridge.commons.exception.AuthenticationException: org.wso2.carbon.identity.base.IdentityRuntimeException: com.mysql.cj.jdbc.exceptions.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
Those lines then repeat indefinitely. I really don't understand where those errors are coming from.
The string from the first few errors looks like a password or a token but I have not set this string anywhere in the config.
The other errors, about borrowing some client, make even less sense to me.
Where do these errors come from and how could I fix them or even find out what is causing them?
For the database issue: check whether your database server is up and running, whether your database configs (username, password, URL, port) are correctly set in deployment.toml, and whether you have run the correct MySQL scripts against your databases as described in the docs. (Also don't forget to include the JDBC driver.)
Example deployment.toml configs:
[database.shared_db]
type = "mysql"
url = "jdbc:mysql://localhost:3306/shared_db?useSSL=false"
username = "root"
password = "admin"
Check whether port 9711 is open on a different network interface than the one you expect. You can verify this by executing the following command.
sudo lsof -i -P -n
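If the output is long, you can narrow it down to that port (standard lsof syntax, shown as an illustration):
sudo lsof -i :9711 -P -n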
If it is open on another network interface, close that port and then start the APIM server again.
Alternatively, you can change the APIM configs to use a new port:
[[apim.throttling.url_group]]
traffic_manager_urls = ["tcp://<server_ip>:<port>","tcp://<server_ip>:<port>"]
traffic_manager_auth_urls = ["ssl://<server_ip>:<port>","ssl://<server_ip>:<port>"]
type = "failover"
An example configuration is given below.
[[apim.throttling.url_group]]
traffic_manager_urls = ["tcp://localhost:9611","tcp://localhost:9611"]
traffic_manager_auth_urls = ["ssl://localhost:9711","ssl://localhost:9711"]
type = "failover"

Log files are being overwritten on Tomcat shutdown

I am facing a weird issue. The first time I shut down Tomcat on a given day, it overwrites the log file contents. However, on the second or any subsequent restart I don't face that issue.
I am seeing the following in the log on Tomcat shutdown:
23:08:03,390 [] [] INFO XmlWebApplicationContext:873 - Closing Root WebApplicationContext: startup date [Wed Apr 29 23:47:05 BST 2015]; root of context hierarchy
23:08:03,397 [] [] INFO ThreadPoolTaskExecutor:203 - Shutting down ExecutorService 'org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor#1d7b51e8'
23:11:33,880 [] [] [] INFO PropertiesFactoryBean:172 - Loading properties file from class path resource [apppname/application.properties]
23:11:41,413 [] [] [] INFO Reflections:238 - Reflections took 5894 ms to scan 112 urls, producing 5518 keys and 32092 values
23:11:42,242 [] [] [] INFO ThreadPoolTaskExecutor:165 - Initializing ExecutorService 'org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor#28a50da4'
23:11:42,596 [] [] [] INFO ContextLoader:325 - Root WebApplicationContext: initialization completed in 11465 ms
23:11:48,525 [] [] [] INFO PropertiesFactoryBean:172 - Loading properties file from class path resource [apppname/application.properties]
23:11:55,130 [] [] [] INFO Reflections:238 - Reflections took 5765 ms to scan 112 urls, producing 5518 keys and 32092 values
23:11:55,807 [] [] [] INFO ThreadPoolTaskExecutor:165 - Initializing ExecutorService 'org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor#1a46a171'
23:11:56,081 [] [] [] INFO ContextLoader:325 - Root WebApplicationContext: initialization completed in 9491 ms
23:12:01,469 [] [] [] INFO PropertiesFactoryBean:172 - Loading properties file from class path resource [apppname/application.properties]
23:12:08,106 [] [] [] INFO Reflections:238 - Reflections took 5757 ms to scan 112 urls, producing 5518 keys and 32092 values
23:12:08,793 [] [] [] INFO ThreadPoolTaskExecutor:165 - Initializing ExecutorService 'org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor#7213bc54'
23:12:09,062 [] [] [] INFO ContextLoader:325 - Root WebApplicationContext: initialization completed in 9260 ms
Log configuration
log4j.rootLogger=INFO, file
log4j.appender.file=org.apache.log4j.DailyRollingFileAppender
log4j.appender.file.File=/logs/logfilename.log
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{ABSOLUTE} %5p %c{1}:%L - %m%n
What could be the possible reason?
I have the same log4j configuration for other applications, and they work perfectly fine. It looks like Tomcat is somehow writing its logs to the application log instead of the catalina log.
It happens only on the first restart of the day, and only when the log level is set to INFO or DEBUG, not ERROR.
Use the log4j Append property (with your appender name, which is file). By default it should be true though...
log4j.appender.file.Append=true
I also see you are using a rolling appender, but it's not referenced in your root logger:
log4j.rootLogger=INFO, file, RollingAppender
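Putting the Append flag together with your existing appender definition, a sketch of the full block would be (the DatePattern line just spells out the DailyRollingFileAppender default, shown for completeness):
log4j.rootLogger=INFO, file
log4j.appender.file=org.apache.log4j.DailyRollingFileAppender
log4j.appender.file.File=/logs/logfilename.log
log4j.appender.file.Append=true
log4j.appender.file.DatePattern='.'yyyy-MM-dd
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{ABSOLUTE} %5p %c{1}:%L - %m%n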

Execution of stored function using Spring's SimpleJdbcCall not giving output

I am trying to call a stored function using Spring's SimpleJdbcCall. I have written a simple function which takes two numbers as input and returns their sum. I am using Oracle 11.2 as the database. I am not getting any exceptions, but at the same time I am not getting the result either. The function works well when called from an anonymous PL/SQL block through SQL*Plus. The code is as follows:
SimpleJdbcCall _simpleJdbcCall=new SimpleJdbcCall(this.jdbcTemplate);
_simpleJdbcCall.withCatalogName("BROADCASTSHEETMANAGEMENT");
_simpleJdbcCall.withSchemaName("PPV");
_simpleJdbcCall.withFunctionName("TEST");
_simpleJdbcCall.withoutProcedureColumnMetaDataAccess();
_simpleJdbcCall.declareParameters(new SqlParameter("newChangeSequence",java.sql.Types.NUMERIC));
_simpleJdbcCall.declareParameters(new SqlParameter("number1",java.sql.Types.NUMERIC));
_simpleJdbcCall.declareParameters(new SqlParameter("number2",java.sql.Types.NUMERIC));
MapSqlParameterSource mapSqlParameterSource1=new MapSqlParameterSource();
mapSqlParameterSource1.addValue("newChangeSequence", Integer.valueOf(0));
mapSqlParameterSource1.addValue("number1", Integer.valueOf(10));
mapSqlParameterSource1.addValue("number2", Integer.valueOf(20));
newChangeSequence = _simpleJdbcCall.executeFunction(Integer.class,mapSqlParameterSource1);
System.out.println("Returned changeSequence is: " + newChangeSequence);
The debug log shows the following information:
2013/12/19 18:52:53,604 [main] - [] DEBUG org.springframework.jdbc.core.simple.SimpleJdbcCall - Added declared parameter for [TEST]: newChangeSequence
2013/12/19 18:52:53,604 [main] - [] DEBUG org.springframework.jdbc.core.simple.SimpleJdbcCall - Added declared parameter for [TEST]: number1
2013/12/19 18:52:53,604 [main] - [] DEBUG org.springframework.jdbc.core.simple.SimpleJdbcCall - Added declared parameter for [TEST]: number2
2013/12/19 18:52:53,605 [main] - [] DEBUG org.springframework.jdbc.core.simple.SimpleJdbcCall - JdbcCall call not compiled before execution - invoking compile
2013/12/19 18:52:53,608 [main] - [] DEBUG org.springframework.jdbc.datasource.DataSourceUtils - Fetching JDBC Connection from DataSource
2013/12/19 18:52:53,609 [main] - [] DEBUG org.springframework.jdbc.datasource.DriverManagerDataSource - Creating new JDBC DriverManager Connection to [jdbc:oracle:thin:#localhost:1521:orcl]
2013/12/19 18:52:53,647 [main] - [] DEBUG org.springframework.jdbc.datasource.DataSourceUtils - Registering transaction synchronization for JDBC Connection
2013/12/19 18:52:53,649 [main] - [] DEBUG org.springframework.jdbc.core.metadata.CallMetaDataProviderFactory - Using org.springframework.jdbc.core.metadata.OracleCallMetaDataProvider
2013/12/19 18:52:53,649 [main] - [] DEBUG org.springframework.jdbc.core.simple.SimpleJdbcCall - Compiled stored procedure. Call string is [{? = call PPV.BROADCASTSHEETMANAGEMENT.TEST(?, ?)}]
2013/12/19 18:52:53,649 [main] - [] DEBUG org.springframework.jdbc.core.simple.SimpleJdbcCall - SqlCall for function [TEST] compiled
2013/12/19 18:52:53,651 [main] - [] DEBUG org.springframework.jdbc.core.metadata.CallMetaDataContext - Matching [number2, number1, newChangeSequence] with [number2, newChangeSequence, number1]
2013/12/19 18:52:53,651 [main] - [] DEBUG org.springframework.jdbc.core.metadata.CallMetaDataContext - Found match for [number2, number1, newChangeSequence]
2013/12/19 18:52:53,652 [main] - [] DEBUG org.springframework.jdbc.core.simple.SimpleJdbcCall - The following parameters are used for call {? = call PPV.BROADCASTSHEETMANAGEMENT.TEST(?, ?)} with: {number2=20, number1=10, newChangeSequence=0}
2013/12/19 18:52:53,652 [main] - [] DEBUG org.springframework.jdbc.core.simple.SimpleJdbcCall - 1: newChangeSequence SQL Type 2 Type Name null org.springframework.jdbc.core.SqlParameter
2013/12/19 18:52:53,652 [main] - [] DEBUG org.springframework.jdbc.core.simple.SimpleJdbcCall - 2: number1 SQL Type 2 Type Name null org.springframework.jdbc.core.SqlParameter
2013/12/19 18:52:53,652 [main] - [] DEBUG org.springframework.jdbc.core.simple.SimpleJdbcCall - 3: number2 SQL Type 2 Type Name null org.springframework.jdbc.core.SqlParameter
2013/12/19 18:52:53,653 [main] - [] DEBUG org.springframework.jdbc.core.JdbcTemplate - Calling stored procedure [{? = call PPV.BROADCASTSHEETMANAGEMENT.TEST(?, ?)}]
2013/12/19 18:52:53,655 [main] - [] DEBUG org.springframework.jdbc.core.JdbcTemplate - CallableStatement.execute() returned 'false'
2013/12/19 18:52:53,655 [main] - [] DEBUG org.springframework.jdbc.core.JdbcTemplate - CallableStatement.getUpdateCount() returned -1
Returned changeSequence is: null
The stored function code is:
function test(number1 number, number2 number) return number is
newChangeSequence number(4);
begin
newChangeSequence:= number1 + number2;
return newChangeSequence;
end test;
You have to declare the return value with SqlOutParameter, like this:
_simpleJdbcCall.declareParameters(new SqlOutParameter("newChangeSequence", java.sql.Types.NUMERIC));
and please post your stored procedure code too.
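Something along these lines should work (a sketch only, reusing your schema, catalog and function names and assuming the same jdbcTemplate):
SimpleJdbcCall call = new SimpleJdbcCall(this.jdbcTemplate)
        .withSchemaName("PPV")
        .withCatalogName("BROADCASTSHEETMANAGEMENT")
        .withFunctionName("TEST")
        .withoutProcedureColumnMetaDataAccess()
        // the return value is declared as an OUT parameter; the two inputs stay IN parameters
        .declareParameters(
                new SqlOutParameter("newChangeSequence", java.sql.Types.NUMERIC),
                new SqlParameter("number1", java.sql.Types.NUMERIC),
                new SqlParameter("number2", java.sql.Types.NUMERIC));

MapSqlParameterSource in = new MapSqlParameterSource()
        .addValue("number1", 10)
        .addValue("number2", 20);   // no value for newChangeSequence, it is the result

Integer newChangeSequence = call.executeFunction(Integer.class, in);
System.out.println("Returned changeSequence is: " + newChangeSequence);
Declaring the OUT parameter first keeps it lined up with the return slot in the compiled call string {? = call PPV.BROADCASTSHEETMANAGEMENT.TEST(?, ?)} from your log.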

Algebraic error when running "aggregate" function on dataset

I'm learning Hadoop/Pig/Hive by running through the tutorials on hortonworks.com.
I have indeed tried to find a link to the tutorial, but unfortunately it only ships with the ISA image that they provide to you. It's not actually hosted on their website.
batting = load 'Batting.csv' using PigStorage(',');
runs = FOREACH batting GENERATE $0 as playerID, $1 as year, $8 as runs;
grp_data = GROUP runs by (year);
max_runs = FOREACH grp_data GENERATE group as grp,MAX(runs.runs) as max_runs;
join_max_run = JOIN max_runs by ($0, max_runs), runs by (year,runs);
join_data = FOREACH join_max_run GENERATE $0 as year, $2 as playerID, $1 as runs;
dump join_data;
I've copied their code exactly as it was stated in the tutorial and I'm getting this output:
2013-06-14 14:34:37,969 [main] INFO org.apache.pig.Main - Apache Pig version 0.11.1.1.3.0.0-107 (rexported) compiled May 20 2013, 03:04:35
2013-06-14 14:34:37,977 [main] INFO org.apache.pig.Main - Logging error messages to: /hadoop/mapred/taskTracker/hue/jobcache/job_201306140401_0020/attempt_201306140401_0020_m_000000_0/work/pig_1371245677965.log
2013-06-14 14:34:38,412 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /usr/lib/hadoop/.pigbootup not found
2013-06-14 14:34:38,598 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://sandbox:8020
2013-06-14 14:34:38,998 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: sandbox:50300
2013-06-14 14:34:40,819 [main] WARN org.apache.pig.PigServer - Encountered Warning IMPLICIT_CAST_TO_DOUBLE 1 time(s).
2013-06-14 14:34:40,827 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: HASH_JOIN,GROUP_BY
2013-06-14 14:34:41,115 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2013-06-14 14:34:41,160 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.CombinerOptimizer - Choosing to move algebraic foreach to combiner
2013-06-14 14:34:41,201 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler$LastInputStreamingOptimizer - Rewrite: POPackage->POForEach to POJoinPackage
2013-06-14 14:34:41,213 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 3
2013-06-14 14:34:41,213 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 1 map-reduce splittees.
2013-06-14 14:34:41,214 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - Merged 1 out of total 3 MR operators.
2013-06-14 14:34:41,214 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 2
2013-06-14 14:34:41,488 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2013-06-14 14:34:41,551 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2013-06-14 14:34:41,555 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2013-06-14 14:34:41,559 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=6398990
2013-06-14 14:34:41,559 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2013-06-14 14:34:44,244 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job5371236206169131677.jar
2013-06-14 14:34:49,495 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job5371236206169131677.jar created
2013-06-14 14:34:49,517 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up multi store job
2013-06-14 14:34:49,529 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2013-06-14 14:34:49,530 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cacche
2013-06-14 14:34:49,530 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Setting key [pig.schematuple.classes] with classes to deserialize []
2013-06-14 14:34:49,755 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2013-06-14 14:34:50,144 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2013-06-14 14:34:50,145 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2013-06-14 14:34:50,256 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2013-06-14 14:34:50,316 [JobControl] INFO com.hadoop.compression.lzo.GPLNativeCodeLoader - Loaded native gpl library
2013-06-14 14:34:50,444 [JobControl] INFO com.hadoop.compression.lzo.LzoCodec - Successfully loaded & initialized native-lzo library [hadoop-lzo rev cf4e7cbf8ed0f0622504d008101c2729dc0c9ff3]
2013-06-14 14:34:50,665 [JobControl] WARN org.apache.hadoop.io.compress.snappy.LoadSnappy - Snappy native library is available
2013-06-14 14:34:50,666 [JobControl] INFO org.apache.hadoop.util.NativeCodeLoader - Loaded the native-hadoop library
2013-06-14 14:34:50,666 [JobControl] INFO org.apache.hadoop.io.compress.snappy.LoadSnappy - Snappy native library loaded
2013-06-14 14:34:50,680 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2013-06-14 14:34:52,796 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201306140401_0021
2013-06-14 14:34:52,796 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases batting,grp_data,max_runs,runs
2013-06-14 14:34:52,796 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: batting[1,10],runs[2,7],max_runs[4,11],grp_data[3,11] C: max_runs[4,11],grp_data[3,11] R: max_runs[4,11]
2013-06-14 14:34:52,796 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - More information at: http://sandbox:50030/jobdetails.jsp?jobid=job_201306140401_0021
2013-06-14 14:36:01,993 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 50% complete
2013-06-14 14:36:04,767 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2013-06-14 14:36:04,768 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201306140401_0021 has failed! Stop running all dependent jobs
2013-06-14 14:36:04,768 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2013-06-14 14:36:05,029 [main] ERROR org.apache.pig.tools.pigstats.SimplePigStats - ERROR 2106: Error executing an algebraic function
2013-06-14 14:36:05,030 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2013-06-14 14:36:05,042 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:
HadoopVersion PigVersion UserId StartedAt FinishedAt Features
1.2.0.1.3.0.0-107 0.11.1.1.3.0.0-107 mapred 2013-06-14 14:34:41 2013-06-14 14:36:05 HASH_JOIN,GROUP_BY
Failed!
Failed Jobs:
JobId Alias Feature Message Outputs
job_201306140401_0021 batting,grp_data,max_runs,runs MULTI_QUERY,COMBINER Message: Job failed! Error - # of failed Map Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask: task_201306140401_0021_m_000000
Input(s):
Failed to read data from "hdfs://sandbox:8020/user/hue/batting.csv"
Output(s):
Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0
Job DAG:
job_201306140401_0021 -> null,
null
2013-06-14 14:36:05,042 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
2013-06-14 14:36:05,043 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias join_data
Details at logfile: /hadoop/mapred/taskTracker/hue/jobcache/job_201306140401_0020/attempt_201306140401_0020_m_000000_0/work/pig_1371245677965.log
When I switch MAX(runs.runs) to avg(runs.runs), I get a completely different issue:
2013-06-14 14:38:25,694 [main] INFO org.apache.pig.Main - Apache Pig version 0.11.1.1.3.0.0-107 (rexported) compiled May 20 2013, 03:04:35
2013-06-14 14:38:25,695 [main] INFO org.apache.pig.Main - Logging error messages to: /hadoop/mapred/taskTracker/hue/jobcache/job_201306140401_0022/attempt_201306140401_0022_m_000000_0/work/pig_1371245905690.log
2013-06-14 14:38:26,198 [main] INFO org.apache.pig.impl.util.Utils - Default bootup file /usr/lib/hadoop/.pigbootup not found
2013-06-14 14:38:26,438 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://sandbox:8020
2013-06-14 14:38:26,824 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to map-reduce job tracker at: sandbox:50300
2013-06-14 14:38:28,238 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve avg using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]
Details at logfile: /hadoop/mapred/taskTracker/hue/jobcache/job_201306140401_0022/attempt_201306140401_0022_m_000000_0/work/pig_1371245905690.log
Anybody know what the issue might be?
I am sure a lot of people will have figured this out already. I combined Eugene's solution with the original code from Hortonworks so that we get the exact output specified in the tutorial.
The following code works and produces the exact output specified in the tutorial:
batting = LOAD 'Batting.csv' using PigStorage(',');
runs_raw = FOREACH batting GENERATE $0 as playerID, $1 as year, $8 as runs;
runs = FILTER runs_raw BY runs > 0;
grp_data = group runs by (year);
max_runs = FOREACH grp_data GENERATE group as grp, MAX(runs.runs) as max_runs;
join_max_run = JOIN max_runs by ($0, max_runs), runs by (year,runs);
join_data = FOREACH join_max_run GENERATE $0 as year, $2 as playerID, $1 as runs;
dump join_data;
Note: the line "runs = FILTER runs_raw BY runs > 0;" is an addition to what Hortonworks provided. Thanks to Eugene for sharing the working code, which I used to modify the original Hortonworks code and make it work.
UDFs are case sensitive, so at least to answer the second part of your question - you'll need to use AVG(runs.runs) instead of avg(runs.runs)
It's likely that once you correct your syntax you'll get the original error you reported...
I am having the exact same issue with the exact same log output, but this solution doesn't work for me, because I believe changing MAX to AVG here defeats the whole purpose of this hortonworks.com tutorial, which was to get the MAX runs by playerID for each year.
UPDATE
Finally I got it resolved: you have to either remove the first line of Batting.csv (the column names) or edit your Pig Latin code like this:
batting = LOAD 'Batting.csv' using PigStorage(',');
runs_raw = FOREACH batting GENERATE $0 as playerID, $1 as year, $8 as runs;
runs = FILTER runs_raw BY runs > 0;
grp_data = group runs by (year);
max_runs = FOREACH grp_data GENERATE group as grp, MAX(runs.runs) as max_runs;
dump max_runs;
After that you should be able to complete the tutorial correctly and get the proper result.
It also looks like this is due to a "bug" in the older version of Pig which was used in the tutorial.
Please specify appropriate data types for playerID, year & runs as below:
runs = FOREACH batting GENERATE $0 as playerID:int, $1 as year:chararray, $8 as runs:int;
Now it should work.
