Postgres COPY command running manually but not through JAVA - java

I am running this command via a Java ProcessBuilder:
plpgsql = "\"path_to_psql_executable\\psql.exe\" -U myuser -w -h myhost -d mydb -a -f \"some_path\\copy_7133.sql\" 2> \"log_path\\plsql_7133.log\"";
ProcessBuilder pb = new ProcessBuilder("C:\\Windows\\System32\\cmd.exe", "/c", plpgsql);
Process p = pb.start();
p.getOutputStream().close();
p.waitFor();
This returns the following error:
ERROR: invalid byte sequence for encoding "UTF8": 0xbd CONTEXT: COPY
copy_7133, line 4892
The catch is that if I run the SQL command manually in cmd, it copies all of the data successfully and reports the number of rows inserted. I am not able to figure out the reason.
NOTE: The code causes a problem only for one particular file; for the rest it works fine.
EDIT:
Copy command being run:
\copy s_m_asset_7140 FROM 'C:\ER\ETL\Unzip_7140\asset.csv' csv HEADER QUOTE '"' ENCODING 'UTF8';
The last error the command gave:
psql:C:/ER/ETL/Unzip_7140/copy_s_m_asset_7140.sql:1: ERROR: invalid
byte sequence for encoding "UTF8": 0xa0 CONTEXT: COPY s_m_asset_7140,
line 10282
But there doesn't seem to be any special character there except a '-'. Not sure what it is not able to read.
A few more details about the DB:
show client_encoding;
"UNICODE"
show server_encoding;
"UTF8"

EDIT 2: It worked after I changed the encoding to LATIN1:
\copy s_m_asset_7140 FROM 'C:\ER\ETL\Unzip_7140\asset.csv' csv HEADER QUOTE '"' ENCODING 'LATIN1';
But I still don't understand why UTF8 did not work. Can somebody please explain?
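For what it's worth, bytes like 0xbd and 0xa0 are legal characters in LATIN1 (where every byte value maps to a character) but are continuation bytes in UTF-8, which cannot begin a sequence; so the file was most likely saved in a single-byte encoding such as LATIN1/Windows-1252 rather than UTF-8. A minimal sketch of locating the first offending byte from Java (class and method names here are illustrative, not from the question):

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.*;

public class FindInvalidUtf8 {
    // Returns the offset of the first byte that is not valid UTF-8, or -1 if the data is clean.
    static int firstInvalidUtf8(byte[] data) {
        CharsetDecoder dec = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)       // fail instead of replacing
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        CharBuffer out = CharBuffer.allocate(data.length);        // UTF-8 decodes to <= data.length chars
        CoderResult r = dec.decode(in, out, true);
        return r.isError() ? in.position() : -1;                  // position = bytes consumed before the error
    }

    public static void main(String[] args) {
        byte[] latin1 = {'c', 'a', 'f', (byte) 0xe9};             // "café" encoded as LATIN1
        System.out.println(firstInvalidUtf8(latin1));             // 3: 0xe9 is not valid UTF-8 here
        // The same bytes decode fine when treated as LATIN1:
        System.out.println(new String(latin1, StandardCharsets.ISO_8859_1));
    }
}
```

Running a check like this over the CSV would tell you the exact offset psql is complaining about, instead of just the line number.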


Jar file escapes "--" from substituted variable - Bash

Code:
#!/bin/bash
MyVariable="--option arg1,arg2"
echo Variable output : $MyVariable
java -jar HelloInternet.jar "$MyVariable"
Expected results:
The jar file should recognize and use the value stored in variable.
Actual results:
The jar file strips the "--" from "--option arg1,arg2" and interprets the variable without the "--".
Include any error messages:
Exception in thread "main" joptsimple.UnrecognizedOptionException: option arg1,arg2 is not a recognized option
Describe what you have tried:
I tried using ' ' instead of " " and vice versa without success.
Use an array when you need an array:
MyVariable=(--option arg1,arg2)
java -jar HelloInternet.jar "${MyVariable[@]}"

IP as Linux array element throws UnknownHostException but as constant works

I have the following script in the directory /home/test/javacall that parses a CSV of IP pairs and invokes a sh file, which calls an executable jar to get output from these IPs.
In the code below, ip1=${IPArray[0]} throws an UnknownHostException from Java.
But if I use the IP directly, ip1="10.10.10.10", the Java code works fine. I did a System.out.println from Java and got the same IP displayed in both cases. But only in the case of ip1=${IPArray[0]} do I get the exception.
#!/bin/bash
INPUT="IPPairs.csv"
array=()
while IFS="," read var1 var2 ; do
echo $var1 $var2
pairString="$var1***$var2"
array+=("$pairString")
done < $INPUT
for i in "${array[@]}" ; do
echo $i
IPString=$(echo $i | tr '***' ' ')
read -ra IPArray <<< "$IPString"
ip1=${IPArray[0]}
#ip1="10.10.10.10"
ip2=${IPArray[1]}
source /home/test/javacall/javacmd.sh "$ip1" "/home/test/javacall/out.txt" "show running-config all-properties"
done
Exception:
com.jcraft.jsch.JSchException: java.net.UnknownHostException: 10.10.10.10
at com.jcraft.jsch.Util.createSocket(Util.java:349)
at com.jcraft.jsch.Session.connect(Session.java:215)
at com.jcraft.jsch.Session.connect(Session.java:183)
That string (357\273\277) indicates that your csv file is encoded with a Byte-Order Mark (BOM) at the front of the file. The read command is not interpreting the BOM as having special meaning, just passing on the raw characters, so you see them as part of your output.
Since you didn't indicate how your source file is generated, you may be able to adjust the settings on that end to prevent writing the BOM, which is optional in many cases. Alternatively, you can work around it various ways on the script side. These questions both offer some examples:
How can I remove the BOM from a UTF-8 file?
Cygwin command not found bad characters found in .bashrc 357\273\277
But honestly, if you just follow Charles Duffy's advice and run your file through dos2unix before parsing it, it should clean this up for you automatically. i.e.:
...
array=()
dos2unix $INPUT
while IFS="," read var1 var2 ; do
...
Or, building on Charles' version:
#!/usr/bin/env bash
case $BASH_VERSION in ''|[123].*) echo "ERROR: Bash 4.0+ needed" >&2; exit 1;; esac
INPUT="IPPairs.csv"
declare -A pairs=( )
dos2unix $INPUT
while IFS=$',\r' read -r var1 var2 _ ; do
pairs[$var1]=$var2
done <"$INPUT"
for ip1 in "${!pairs[@]}"; do
ip2=${pairs[$ip1]}
# Using printf %q causes nonprintable characters to be visibly shown
printf 'Processing pair: %q and %q\n' "$ip1" "$ip2" >&2
done
Do note that running dos2unix in your script is not necessarily the best approach, as the file only needs to be converted once. Generally speaking, it shouldn't hurt anything, especially with such a small file. Nonetheless, a better approach would be to run dos2unix as part of whatever process pushes your csv to the server, and keep it out of this script.
System.out.println() only shows visible characters.
If your input file contains DOS newlines, System.out.println() won't show them, but they'll still be present in your command line, and parsed as part of the IP address to connect to, causing an UnknownHostException. Converting it to a UNIX text file, as with dos2unix, or using :set fileformat=unix in vim, is typically the quickest way to fix this.
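The invisible-\r effect described above is easy to reproduce in plain Java (class name is illustrative): the two strings print identically on most terminals, yet only one of them is a valid host name.

```java
public class CarriageReturnDemo {
    public static void main(String[] args) {
        String fromDosFile = "10.10.10.10\r";   // what read gets from a CRLF-terminated line
        String clean = "10.10.10.10";

        // Both look identical when printed, which is why the debug output was misleading:
        System.out.println(fromDosFile);
        System.out.println(clean);

        System.out.println(fromDosFile.equals(clean));          // false
        System.out.println(fromDosFile.trim().equals(clean));   // true: trim() strips the trailing \r
    }
}
```

So a trim() on the Java side (or the IFS=$',\r' fix on the shell side) would both have avoided the UnknownHostException.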
BTW, if you don't need ordering retained, an associative array is typically a more appropriate data structure to use to store pairs:
#!/usr/bin/env bash
case $BASH_VERSION in ''|[123].*) echo "ERROR: Bash 4.0+ needed" >&2; exit 1;; esac
input="IPPairs.csv"
declare -A pairs=( )
while IFS=$',\r' read -r var1 var2 _ ; do
pairs[$var1]=$var2
done <"$input"
for ip1 in "${!pairs[@]}"; do
ip2=${pairs[$ip1]}
# Using printf %q causes nonprintable characters to be visibly shown
printf 'Processing pair: %q and %q\n' "$ip1" "$ip2" >&2
done
In the above, using IFS=$',\r' prevents LF characters (from the "CRLF" sequence that makes up a DOS newline) from becoming either part of var1 or var2. (Adding an _ placeholder variable to consume any additional content in a given line of the file adds extra insurance towards this point).

How to find Invalid UTF-8 Character in oracle column

I have an Oracle table in which I am storing an XML file; the column is of CLOB type. We then pick that XML file up for further processing. It breaks somewhere with the exception below:
"com.ctc.wstx.exc.WstxIOException: Invalid UTF-8 start byte 0xa0 (at char #931, byte #20)"
When we copy the content into Notepad++, it doesn't show any invalid UTF-8 character.
Could anyone help with how to find the invalid UTF-8 character in the XML file in the Oracle column, considering that the column is of CLOB type?
Any help is greatly appreciated.
Do you have access to Unix? You can use iconv -f utf-8 -t utf-8 -c yourfile.xml (-c silently drops the invalid characters, so diffing the output against the original shows where they were). You can find more possible options in this thread.
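If you can read the CLOB into a Java String, another way to hunt for the culprit is to scan for non-ASCII characters: byte 0xa0 typically corresponds to a non-breaking space (U+00A0), which looks exactly like an ordinary space in most editors, including Notepad++. A rough sketch (class and method names are illustrative):

```java
import java.util.ArrayList;
import java.util.List;

public class FindSuspectChars {
    // Report characters above 0x7F (invisible lookalikes such as U+00A0 included),
    // the usual culprits behind "invalid UTF-8 byte" errors downstream.
    static List<int[]> suspectChars(String s) {
        List<int[]> hits = new ArrayList<>();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c > 0x7F) hits.add(new int[]{i, c});   // {index, code point}
        }
        return hits;
    }

    public static void main(String[] args) {
        String xml = "<name>ACME\u00A0Corp</name>";    // U+00A0 looks like a plain space
        for (int[] hit : suspectChars(xml)) {
            System.out.printf("char U+%04X at index %d%n", hit[1], hit[0]);
        }
    }
}
```

Running this over the CLOB contents would pinpoint the character that the parser later rejects as byte 0xa0.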

How to break up long continuous lines in log by inserting line breaks on Solaris 10

I need your assistance to chop a single continuous line from a log file of HL7 messages into individual lines such as the following, using Unix commands, Java, or both:
Timestamp : 20130221 001805
Code : OUT031
Severity : F
Component : cmHL7OutboundBuild_jcdOutboundBuild1/cmHL7OutboundBuild/dpPRODHL7Outbound
Description : Last HL7 Message read <MSH|^~\&|target application|hostname|source application|A212|201302210016||ORM^O01|62877102|D|2.2|||ER
PID||.|0625176^^^00040||JOHN^SMITH^ ||19390112|F||4|address||^^^^^^93271081||||||41253603422
PV1|1||ED^^^40||||||name of physician||||||||physician operator id
ORC|SC||13-4529701-TFT-0|13-4529701|P|||||||name of physician
OBR|1||13-4529701-TFT-0|0360000^THYROID FUNCTION TESTS^1||201302212108|201302212102|||||||201302212108||name of physician||.|||13-4529701|201302210016||department|P||^^^201302212108
> Exception:com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Cannot add or update a child row: a foreign key constraint fails (`schema/table`, CONSTRAINT `table_ibfk_request_order_panel` FOREIGN KEY (`fk_gp_latest_request_order_panel`) REFERENCES `gp_request_order_panel` (`gp_request_order_panel_seq`))
SendEmail : Y
The issue is that all 11 lines are wrapped into one continuous line, which makes up a single record, followed by the next record of another 11 lines, and so on, on the Solaris 10 Unix server. Yet the same log displays properly when FTPed over to a Windows desktop. As a result, I am looking for a solution, possibly a combination of Unix and Java, to break each record down into its own 11 lines, and the same for the next record, and so on.
Below is a breakdown of what I believe needs to be done on the Solaris 10 server:
( i ) It would be great if this log could be converted to Windows ASCII format while staying on the Solaris 10 server, like the copy that gets FTPed to Windows. When the same file is FTPed from Windows back to the Solaris 10 server, it reverts to its original one-continuous-line format.
( ii ) Otherwise, break it down line by line, or insert an end-of-line; but where?
I have tried various Unix string-manipulation commands such as sed, awk, grep, fold, cut, and unix2dos, including suggestions from
Bash: how to find and break up long lines by inserting continuation character and newline?, without success.
Your assistance would be much appreciated.
Thanks,
Jack
This is a Windows text file (Windows carriage control), since it displays OK in Windows but not on UNIX.
Solaris has a dos2unix command to handle this issue.
dos2unix filename > newfilename
Try that. To see what I mean, use the vi editor on "filename" from the example above: you will see ^M characters in the file. UNIX does not use those to terminate lines.
If you have the vim editor, it will read the "long lines". Or try od -c < filename | more to see the ^M (ASCII 13) characters; od will show them as \r. I'm reasonably sure the issue is that the carriage control in the file prevents UNIX from seeing the line breaks. Try the od approach.
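Since the question allows a Java component, the same normalization dos2unix performs can be sketched in a few lines of Java (class and method names are illustrative); it rewrites CRLF pairs and bare CR characters as LF so the records break on UNIX too:

```java
public class Dos2Unix {
    // Normalize CRLF and bare CR line endings to LF.
    static String toUnix(String text) {
        return text.replace("\r\n", "\n")   // DOS/Windows endings first
                   .replace("\r", "\n");    // then any remaining bare carriage returns
    }

    public static void main(String[] args) {
        String record = "Timestamp : 20130221 001805\r\nCode : OUT031\r\nSeverity : F\r\n";
        System.out.print(toUnix(record));   // prints three separate lines
    }
}
```

For whole files, the string could be read and rewritten with java.nio.file.Files; for logs too large to hold in memory, streaming the conversion line by line would be the safer variant.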

executing csv copy command from remote machine with jdbc

How do I create a stdin object to pass as a parameter to the COPY command for a CSV upload into a DB table, and execute it with the JDBC API?
Example that I tried:
DataInputStream in = new DataInputStream(new BufferedInputStream(
new FileInputStream("C:/Documents and Settings/517037/Desktop/new.csv")));
String sql = "copy temp123 from " + in.read() + " using delimiters '|'";
But here I am getting an error:
org.postgresql.util.PSQLException: ERROR: syntax error at or near "48"
Position: 19
Can anyone help me out with this?
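A side note on the error itself: InputStream.read() returns a single byte as an int, so the concatenated SQL ends up containing a number rather than the file's path or contents, which is where the syntax error at or near "48" comes from. A small illustration (assuming, hypothetically, that the first byte of the file is the character '0'):

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

public class ReadReturnsOneByte {
    public static void main(String[] args) throws Exception {
        // Stand-in for the FileInputStream; assume the CSV starts with '0'.
        InputStream in = new ByteArrayInputStream("0,foo,bar".getBytes("US-ASCII"));
        int first = in.read();                 // one byte, returned as an int
        System.out.println(first);             // 48, the ASCII code of '0'
        String sql = "copy temp123 from " + first + " using delimiters '|'";
        System.out.println(sql);               // copy temp123 from 48 using delimiters '|'
    }
}
```

For actually streaming a local CSV from Java, the PostgreSQL JDBC driver's org.postgresql.copy.CopyManager, e.g. copyIn("COPY temp123 FROM STDIN ...", reader), is the usual approach.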
Have you tried invoking the COPY command with the full file name, like:
copy temp123 from 'C:/Documents and Settings/517037/Desktop/new.csv' using delimiters '|'
If that does not work, you can try:
copy temp123 from stdin using delimiters '|'
and then invoke your Java program with stdin redirected:
c:\tmp>java my_import < "C:\Documents and Settings\517037\Desktop\new.csv"
Please go through the link below on how to store BLOBs or CLOBs (binary/character large objects). It contains a comprehensive list of examples for storing those types. Please also make sure to check the specific database's documentation in addition to that.
Storing CLOB or BLOB to DB
