Troubleshooting COPY errors on AWS Redshift


Update: I figured this out but am still interested in an explanation. The problem was that I was running the code below while also connected to my Redshift cluster from SqlWorkbenchJ (both running on the same laptop). The second I disconnect my SqlWorkbenchJ session and re-run my code, it doesn't hang. Why?
Please note: Although I mention Java/JDBC in this question, it is strictly a question about troubleshooting Redshift and is language/framework-agnostic!!!
Also here's an SSCCE repo that perfectly reproduces the hanging issue:
https://github.com/bitbythecron/redshift-copy-troubleshooting
I'm trying to run the following Redshift COPY command from Java code (using Postgres JDBC driver):
COPY my_schema.mytable
FROM 's3://com.example.mybucket/mydata.csv/part-00000-bc1b179d-b4c1-459f-8f5e-8fe361d4b40f-c000.csv'
iam_role 'arn:aws:iam::blah:role/MyRedshiftRole'
csv;
If I've read the docs right, this should:
Read a CSV file stored on S3
Copy its contents into a Redshift table (my_schema.mytable)
When I run this command in my Redshift UI client (SqlWorkbenchJ) it executes correctly and runs in a few seconds. However when I execute the following JDBC code (using the exact same connection URL, credentials, etc.) the code just hangs at the executeUpdate command:
Connection conn = null;
Statement statement = null;
try {
    Class.forName("org.postgresql.Driver");
    Properties props = new Properties();
    props.setProperty("user", redshiftInfo.username);
    props.setProperty("password", redshiftInfo.password);
    log.info("\n\nAttempting to connect!\n\n");
    conn = DriverManager.getConnection("jdbc:postgresql://<sameExactUrl_thatIUser_inSqlWorkbenchJ>", props);
    log.info("\n\nConnection made!\n\n");
    statement = conn.createStatement();
    String command = "COPY my_schema.my_table FROM 's3://com.example.mybucket/mydata.csv/part-00000-bc1b179d-b4c1-459f-8f5e-8fe361d4b40f-c000.csv' iam_role 'arn:aws:iam::blah:role/MyRedshiftRole' csv";
    log.info("\n\nExecuting...\n\n");
    statement.executeUpdate(command);
    log.info("\n\nHey I think it worked!!!\n\n");
    statement.close();
    conn.close();
} catch (Exception ex) {
    log.info(ExceptionUtils.getStackTrace(ex));
}
When this runs, I get to the Executing... log statement in the logs, but then the software just hangs. I've waited as long as 30 minutes to see if it was just slow for some reason. I also refreshed my SqlWorkbenchJ connection throughout (and after) those 30 minutes and ran SELECT COUNT(*) FROM my_schema.my_table, and the count is always 0. So it's making the connection, but then nothing is actually being copied, or if it is, it's not being committed.
I'd like to see what's happening on the Redshift side of things: are there any tables or logs (in the AWS console or otherwise) I can tail or inspect to see if records are actually being copied and staged somewhere, or to see if there are any errors being reported from Redshift's perspective?

There is no problem with your Java code. It works perfectly fine when the number of records is small.
create table my_table (
c_name varchar(25) not null,
c_address varchar(25) not null,
c_city varchar(25) not null);
Create a CSV with just 2-3 records and put it in S3:
one,two,three
example1,example2,example3
Then run your code; it will print the following output:
Attempting to connect!
Connection made!
Executing...
Hey I think it worked!!!
Now, do
Select * from my_table;
c_name | c_address | c_city
----------+-----------+----------
one | two | three
example1 | example2 | example3
Coming back to your question of why you see 0 records from Select * from my_table:
Fact:
Amazon Redshift is fully ACID compliant, which means that until your COPY command has completed and been committed, you will not see any of its records in a SELECT.
Solution:
You want to see what is happening with your query: whether it is still executing or has been terminated.
You could run the following commands to see the currently running and most recent queries.
select pid, user_name, starttime, query from stv_recents where status='Running';
-- OR
select query, pid, elapsed, substring from svl_qlog where userid = 100 order by starttime desc limit 5;
Refer to the AWS Redshift system tables documentation for more details.
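For anyone who prefers to run these checks from Java, here is a minimal sketch that executes the first query above from a second JDBC connection, plus a look at stl_load_errors, the Redshift system table that records rows rejected by COPY. The class and method names (RedshiftDiagnostics, run, formatRow) are hypothetical and only for illustration; the Connection is assumed to be an already open session to the same cluster.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: only the system table and column names come from Redshift.
class RedshiftDiagnostics {

    // Same query as in the answer above: statements that are still running.
    static final String RUNNING_QUERIES =
        "select pid, user_name, starttime, query from stv_recents where status = 'Running'";

    // Most recent COPY load failures, newest first.
    static final String LOAD_ERRORS =
        "select starttime, filename, line_number, colname, err_reason "
        + "from stl_load_errors order by starttime desc limit 10";

    // Joins the column values of one row; system tables pad text with spaces, so trim.
    static String formatRow(List<String> values) {
        List<String> cleaned = new ArrayList<>();
        for (String value : values) {
            cleaned.add(value == null ? "NULL" : value.trim());
        }
        return String.join(" | ", cleaned);
    }

    // Runs one diagnostic query and returns every row as a formatted line.
    static List<String> run(Connection conn, String sql) throws SQLException {
        List<String> rows = new ArrayList<>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            int columns = rs.getMetaData().getColumnCount();
            while (rs.next()) {
                List<String> values = new ArrayList<>();
                for (int i = 1; i <= columns; i++) {
                    values.add(rs.getString(i));
                }
                rows.add(formatRow(values));
            }
        }
        return rows;
    }
}
```

Calling RedshiftDiagnostics.run(conn, RedshiftDiagnostics.RUNNING_QUERIES) while the COPY appears to hang should show whether the statement is still listed as running; an empty result from LOAD_ERRORS suggests the COPY did not fail on bad input rows.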

The problem was that I was running the code below while also connected to my Redshift cluster from SqlWorkbenchJ (both running on the same laptop). The second I disconnect my SqlWorkbenchJ session and re-run my code, it doesn't hang.
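The thread does not explain why disconnecting SqlWorkbenchJ helped. One plausible explanation, which is an assumption and not confirmed here, is that the SqlWorkbenchJ session had autocommit turned off and was still holding a lock on the table from its own earlier COPY, so the COPY issued over JDBC waited for that lock. Under that assumption, the svv_transactions system view queried from another session would show the open transaction, and a query timeout would keep the Java side from waiting indefinitely. A minimal sketch (CopyLockCheck and its members are hypothetical names; whether setQueryTimeout is honored depends on the JDBC driver in use):

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper illustrating the lock hypothesis; not taken from the thread.
class CopyLockCheck {

    // Lists open transactions and the locks they hold or are waiting for.
    // Run this from a separate session while the COPY appears to hang.
    static final String OPEN_LOCKS =
        "select txn_owner, pid, xid, txn_start, lock_mode, relation, granted "
        + "from svv_transactions order by txn_start";

    // Executes the COPY but asks the driver to give up after timeoutSeconds,
    // so a blocked statement surfaces as an SQLException instead of a silent hang.
    static int copyWithTimeout(Connection conn, String copySql, int timeoutSeconds)
            throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.setQueryTimeout(timeoutSeconds);
            return st.executeUpdate(copySql);
        }
    }
}
```

If a row in svv_transactions belongs to the SqlWorkbenchJ session and references the target table, committing or rolling back in that session (or disconnecting, as described above) should release it.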

Related

SQL delete query in java app takes too long

I'm starting with SQL and trying to mix it with Java app. I have table ZAMESTNANEC containing 6 rows.
When I issue the command delete from ZAMESTNANEC where ID = 7; in SQL it will delete in no time. A few milliseconds. But when I use this in my Java app, the app will freeze in processing. I waited for 4 minutes and nothing happened (and due to its working state I can't do anything else). Oh and the row wasn't deleted.
I read this topic about deleting but it didn't help me much. In fact it didn't help me at all.
oracle delete query taking too much time
I tried to debug it, but it's frozen on this command. I don't understand why it works fine in SQL but not in the Java app. Other commands like SELECT work fine.
JDBC here - http://pastebin.com/BRh06yc8
Code from button here
private void jButtonOdeberZamActionPerformed(java.awt.event.ActionEvent evt) {
    try {
        OracleConnector.setUpConnection("xxxxxxxx", 1521, "ee11",
                "NAME", "PASSWORD");
        conn = OracleConnector.getConnection();
        stmt = conn.createStatement();
        stmt.executeQuery("delete from ZAMESTNANEC where ID = 7");
    } catch (SQLException ex) {
        System.out.println(ex);
    }
}

executeQuery should be used for queries that are expected to return results. Try executeUpdate instead and see if that helps. It could be that your app is waiting to receive results which never come back. By Tom H
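A minimal sketch of that suggestion, assuming the ZAMESTNANEC table and ID column from the question. The class and method names are hypothetical, and the Connection is assumed to come from the asker's OracleConnector. It uses executeUpdate, which returns the number of affected rows, and closes the statement when done:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper; table and column names are taken from the question.
class DeleteEmployee {

    static final String DELETE_BY_ID = "delete from ZAMESTNANEC where ID = ?";

    // Deletes one row and returns how many rows were removed (0 if the ID was absent).
    static int deleteById(Connection conn, int id) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(DELETE_BY_ID)) {
            ps.setInt(1, id);
            // executeUpdate is the JDBC call for INSERT/UPDATE/DELETE statements.
            return ps.executeUpdate();
        }
    }
}
```

If auto-commit has been switched off on the connection, a conn.commit() is still needed afterwards for the delete to become visible to other sessions.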
Thank you Tom.

How to "package" an SQL database

I am looking for a way to save an SQL database and then reference it by means other than localhost which would not work because it is being used on other computers.
I realize that my terminology may not be correct in asking for a means to "package" an SQL database; however, I am not sure how else to put my request into such a concise title.
I have a database that I created through MySQL here: http://gyazo.com/fcac155a60c0d2587442c3e4807ef98a
I can access this database with no problems through the following code...
try
{
    //Get connection
    Connection myConn = DriverManager.getConnection("jdbc:mysql://localhost:3306/term_database","root", "_cA\"#8X(XHm+++E");
    //**********
    //Connection myConn = DriverManager.getConnection("jdbc:mysql:translationDatabase","root", "_cA\"#8X(XHm+++E");
    //**********
    //create statement
    Statement myStmt = myConn.createStatement();
    //execute sql query
    ResultSet myRs = myStmt.executeQuery("select * from terms WHERE idNumber=" +termNumber);
    //process result set
    while(myRs.next()){
        term= (myRs.getString(language));
    }
}
catch (Exception exc)
{
    exc.printStackTrace();
}
However, I assume that my users will be on different computers and so a "//localhost" reference will not work. They do not have access to the internet either. So I aim to include the database in my program's files to be downloaded with the software or to include it in the jar. I was not able to find any means to do that online. The code I surrounded with *'s was an attempt to reference translationDatabase.sql which I saved through the program mySQL into my software's directory but it did not work as shown here: http://gyazo.com/e9d4339435dedecab4e7ad960e9b13b6
To recap: I am looking for a way to save an SQL database and then reference it by means other than localhost which would not work because it is being used on other computers.

The idiomatic terminology is "embedded" or "serverless" database.
There are several pure-Java solutions. There is also the popular SQLite, which you can manipulate via its command-line client or via a third-party JDBC driver (example 1, example 2).
Any of the above solutions will require that you convert your existing MySQL database to the target system.
Alternatively, you may consider bundling your application with MySQL server (possibly with an automated installation process, so that installation is invisible to the end-user).
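As a rough illustration of the embedded route, here is a sketch that reads the terms table from a SQLite file shipped next to the application. It assumes the data has already been converted to SQLite and that a third-party SQLite JDBC driver accepting jdbc:sqlite: URLs is on the classpath; the class and method names are hypothetical, while the table and column names come from the question.

```java
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper; assumes a converted SQLite copy of the terms table.
class EmbeddedTerms {

    // Builds a file-based JDBC URL, so no server and no localhost are involved.
    static String jdbcUrl(Path dbFile) {
        return "jdbc:sqlite:" + dbFile.toAbsolutePath();
    }

    // Returns the term text in the given language column, or null if the id is unknown.
    static String lookup(Path dbFile, int termNumber, String language) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl(dbFile));
             PreparedStatement ps =
                 conn.prepareStatement("select * from terms where idNumber = ?")) {
            ps.setInt(1, termNumber);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(language) : null;
            }
        }
    }
}
```

Because the URL points at a plain file, the database can be distributed alongside the jar and opened without network access.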

Java Statement.executeUpdate(sql) not working when executeQuery(sql) works

I have a weird behavior in a Java application.
It issues simple queries and modifications to a remote MySQL database. I found that queries run by executeQuery() work just fine, but inserts or deletes run through executeUpdate() fail.
Ruling out the first thing that comes to mind: the user the app connects with has the correct privileges set up, as the same INSERT run from the same machine, but in DBeaver, produces the desired modification.
Some code:
Connection creation
Class.forName("com.mysql.jdbc.Driver");
connection = DriverManager.getConnection(url, user, pass);
Problematic part:
Statement parentIdStatement = connection.createStatement();
String parentQuery = String.format(ProcessDAO.GET_PARENT_ID, parentName);
if (DEBUG_SQL) {
    plugin.getLogger().log(Level.INFO, parentQuery);
}
ResultSet result = parentIdStatement.executeQuery(parentQuery);
result.first();
parentId = result.getInt(1);
if (DEBUG_SQL) {
    plugin.getLogger().log(Level.INFO, parentId.toString()); // works, expected value
}
Statement createContainerStatement = connection.createStatement();
String containerQuery = String.format(ContainerDAO.CREATE_CONTAINER, parentId, myName);
if (DEBUG_SQL) {
    plugin.getLogger().log(Level.INFO, containerQuery); // works when issued through DBeaver
}
createContainerStatement.executeUpdate(containerQuery); // does nothing
"DAOs":
ProcessDAO.GET_PARENT_ID = "SELECT id FROM mon_process WHERE proc_name = '%1$s'";
ContainerDAO.CREATE_CONTAINER = "INSERT INTO mon_container (cont_name, proc_id, cont_expiry, cont_size) VALUES ('%2$s', %1$d, CURRENT_TIMESTAMP(), NULL)";
I suspect this might have to do with my usage of Statement and Connection.
This being a lightweight, lightly used app, I went for simplicity, so no framework and no specific instructions regarding transactions or commits.

So, in the end, this code was just fine. It worked today.
To answer the question: where to look first in a similar case (SELECT works but UPDATE / INSERT / DELETE does not)
If rights are not the problem, then there is probably a lock on the table you are trying to modify. In my case, someone had left an uncommitted transaction open.
Proper SQL exceptions logging (which was suboptimal in my case) will help you figure it out.
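To make the logging point concrete, here is a small sketch that turns an SQLException and its chained exceptions into one log message including the SQLState and vendor error code, which is usually what identifies a lock wait or timeout. The class and method names are hypothetical.

```java
import java.sql.SQLException;

// Hypothetical helper for logging the full SQLException chain.
class SqlErrors {

    // Produces one line per exception in the chain, with SQLState and vendor code.
    static String describe(SQLException ex) {
        StringBuilder sb = new StringBuilder();
        for (SQLException e = ex; e != null; e = e.getNextException()) {
            if (sb.length() > 0) {
                sb.append(System.lineSeparator());
            }
            sb.append("SQLState=").append(e.getSQLState())
              .append(" errorCode=").append(e.getErrorCode())
              .append(" message=").append(e.getMessage());
        }
        return sb.toString();
    }
}
```

In the code from the question this could be used in a catch (SQLException ex) block around executeUpdate, for example plugin.getLogger().log(Level.SEVERE, SqlErrors.describe(ex)).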

Java MySQL PreparedStatement.setBoolean wraps value in quotes

Short version of my question is:
PreparedStatement ps;
ps = connection.prepareStatement("Insert into T values (?)");
ps.setBoolean(1, true);
ps.executeUpdate();
What can be the reasons for this code sample to produce query with value wrapped in quotes?
Long version of my question is:
I have a JavaEE application with plain JDBC for DB interactions, and recently I noticed some MySQLDataTruncation exceptions appearing in my logs. These exceptions occurred on attempts to save an entity into a DB table which has a boolean column defined as BIT(1). This was because the generated query looked like this:
Insert into T values ('1');
Note that value is wrapped with quotes. Query was logged from application with Log4J log.info(ps); statement.
Previous logs demonstrate that there were no quotes.
Furthermore, even the MySQL server logs started to look different. Before this happened I had the following pair of records for each query executed:
12345 Prepare Insert into T values (?)
12345 Execute Insert into T values (1)
And after:
12345 Query Insert into T values ('1')
It is worth noting that those changes weren't a result of deploying a new version of the application or even restarting the MySQL/application server, and the code responsible for query generation is as straightforward as the example in this question.
Application server restart fixed the issue for about 12 hours, and then it happened again. As a temporary solution I changed BIT columns to TINYINT.
P.S. Examining both application and MySQL logs allowed me to narrow down the time span when something went wrong to about 2 minutes, but there was nothing abnormal in the logs in this period.
P.P.S. Application server is Glassfish 2.1.1, MySQL server version is 5.5.31-1~dotdeb and MySQL Connector/J version is 5.0.3.

Well, it turned out it was actually an issue with unclosed prepared statements.
When the count of open statements on the MySQL server reached its allowed maximum, the application was somehow still able to continue working, without producing the SQL error:
Error Code: 1461 Can’t create more than max_prepared_stmt_count statements
But in that mode it started to wrap boolean values with quotes, causing all my troubles affecting BIT(1) columns.
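One common way to avoid leaking prepared statements is try-with-resources, which closes the statement even when execution throws. A minimal sketch using the insert from the question (the class and method names are hypothetical, and this is not the code from the affected application):

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Hypothetical helper; the SQL text is the sample from the question.
class FlagWriter {

    // Inserts one boolean value and always closes the prepared statement,
    // so it no longer counts against the server's open-statement limit.
    static int insertFlag(Connection conn, boolean value) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement("Insert into T values (?)")) {
            ps.setBoolean(1, value);
            return ps.executeUpdate();
        } // ps.close() runs here on both the normal and the exceptional path
    }
}
```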

Java app hangs after calling PreparedStatement (against SQL Server DB)

I'm trying to get to grips with a Java app that talks to a SQL Server 2008 R2 DB. The app imports data into the DB, and it has a 'test mode'; the DB requests are wrapped up in a transaction, which is rolled back at the end.
With a particular dataset, the tool disables a trigger, and then re-enables it after the import. In test mode, on the first pass, everything works as expected - the dataset is 'imported' without problems. However, if I try to repeat the exercise, the app hangs at the point where it tries to disable the trigger.
Looking at SQL Profiler, I can see an RPC:Completed trace item, which suggests that SQL Server has received and successfully processed the request. At which point, I would expect the Java app to pick up control and continue - except that it doesn't, and I'm struggling to think where to look next.
Java code:
String sql = "ALTER TABLE MyTable DISABLE TRIGGER ALL";
PreparedStatement stmt = mDBConnection.prepareStatement (sql);
stmt.execute();
Trace TextData:
declare @p1 int
set @p1=1
exec sp_prepare @p1 output,N'',N'ALTER TABLE MyTable DISABLE TRIGGER ALL',1
select @p1
Q: Any idea what the problem might be? Or any suggestions as to how I investigate further?
UPDATE:
Of course, the trace above only shows the sp_prepare. There is a corresponding sp_execute statement - and the lack of an RPC:Completed trace item indicates that the problem is on SQL Server's side. A modified trace shows an RPC:Starting entry ('exec sp_execute 1'), but no matching RPC:Completed.
I can run sp_prepare & sp_execute in SSMS (providing I remove the set statement), as expected - it executes OK on the first pass after all.
Solution:
Using sp_who2 (see below), I could see that the first connection/spid was blocking the second; on commit, the db connection was closed, but on rollback it wasn't. Since I'm running in test-and-rollback mode, this was the crux of my problem - closing the connection solved the problem.
sp_who2:
CREATE TABLE #sp_who2
(
SPID INT,
Status VARCHAR(1000) NULL,
Login SYSNAME NULL,
HostName SYSNAME NULL,
BlkBy SYSNAME NULL,
DBName SYSNAME NULL,
Command VARCHAR(1000) NULL,
CPUTime INT NULL,
DiskIO INT NULL,
LastBatch VARCHAR(1000) NULL,
ProgramName VARCHAR(1000) NULL,
SPID2 INT,
RequestID int
)
GO
INSERT INTO #sp_who2 EXEC sp_who2
GO
SELECT spid, status, blkby, command, ProgramName FROM #sp_who2 WHERE DBName = 'rio7_bch_test'
GO
DROP TABLE #sp_who2
GO
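The fix described in the solution above (closing the connection after a rollback as well as after a commit) could look roughly like this in Java. This is only a sketch under assumptions: TestModeImport, Work and runImport are hypothetical names, not the tool's real code.

```java
import java.sql.Connection;
import java.sql.SQLException;

// Hypothetical wrapper: the connection is closed on every path, including rollback.
class TestModeImport {

    // The import itself, expressed as a callback that uses the open connection.
    interface Work {
        void run(Connection conn) throws SQLException;
    }

    static void runImport(Connection conn, boolean testMode, Work work) throws SQLException {
        try (Connection c = conn) {          // closed when this block exits
            c.setAutoCommit(false);          // group the import into one transaction
            try {
                work.run(c);
                if (testMode) {
                    c.rollback();            // test mode: discard the changes
                } else {
                    c.commit();
                }
            } catch (SQLException ex) {
                c.rollback();                // failed import: discard and rethrow
                throw ex;
            }
        }
    }
}
```

With the connection closed after each pass, a second test run starts on a fresh session instead of waiting behind the one left over from the first.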

This very much sounds like you have locks that aren't released properly and block your DDL execution.
When your statement hangs, run the stored procedure sp_who2.
In the result of that procedure you'll see which session is blocking your DDL, and then you can take the appropriate action.

Don't use a PreparedStatement for this. Use just a plain Statement.
Statement stmt = mDBConnection.createStatement();
stmt.execute(sql);

The "ALTER TABLE" statement is DDL (Data Definition Language). DDL must wait for all DML (Data Manipulation Language) statements to complete. If you have an unclosed ResultSet, Statement, or PreparedStatement that is querying the table or a view upon that table, or a join with that table, or updating with auto-commit turned off - then that is DML that is not complete.
Before altering the table like this, ensure that every possible result set open on it has been explicitly closed, and similarly any statements. That will ensure that all DML is complete and DDL can be performed.
In general it is better to use PreparedStatements over Statements. A PreparedStatement is compiled once; a Statement is compiled every time it is executed. This means there is no difference for unparameterised statements like yours, and a potential benefit for any parameterised ones.
Assuming a trusted JDBC implementation, there is no time a Statement might work when a PreparedStatement does not.
You may also find this question helpful.
