Java XMLInputFactory - truncates text when reading data with .getData()

I'm using XMLInputFactory to read data (SQL queries) from an XML file.
In some cases, the data is truncated. For example:
select CASE WHEN count(*) > 0 THEN 'LX1VQMSSRV069 OK' ELSE 'LX1VQMSSRV069 NOK' END from [PIWSLog].[dbo].[log]
is read as (text is truncated after the last '.'):
select CASE WHEN count(*) > 0 THEN 'LX1VQMSSRV069 OK' ELSE 'LX1VQMSSRV069 NOK' END from [PIWSLog].[dbo]
I've tested with several strings and it seems that the problem is with the characters in [].[].[].
I'm reading the data using:
mySQLquery = event.asCharacters().getData();
Another case is when the string contains '\n'. If it has two '\n', event.asCharacters().getData() reads it correctly, but if it has three '\n', the string is truncated after the second '\n'. This is very odd!
Any idea what's the problem and how can I solve it?

The StAX API (XMLInputFactory and the readers it creates) is not obliged to give you all the characters of a text node in one go. It is permitted to pass you a sequence of Characters events, each containing a fragment of the string.
If you read the next event after the one containing the truncated string, you will probably find the remainder of your string there (possibly spread over several events).
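As a minimal sketch of that advice (the class name StaxTextReader and the element name passed in are hypothetical, not from the question), the code below concatenates every Characters event until the end tag is reached. It also sets the standard XMLInputFactory.IS_COALESCING property, which asks the parser to merge adjacent character data into a single event; either measure alone should normally be enough for a text-only element.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

class StaxTextReader {
    // Returns the complete text of the first element called elementName,
    // or null if no such element exists. Assumes the element holds text only.
    static String readElementText(String xml, String elementName) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character data into one event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        try {
            XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()
                        && event.asStartElement().getName().getLocalPart().equals(elementName)) {
                    // Do not trust a single event: append fragments until the end tag.
                    StringBuilder text = new StringBuilder();
                    while (reader.hasNext()) {
                        XMLEvent next = reader.nextEvent();
                        if (next.isCharacters()) {
                            text.append(next.asCharacters().getData());
                        } else if (next.isEndElement()) {
                            break;
                        }
                    }
                    return text.toString();
                }
            }
            return null;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<queries><query>select 1 from [PIWSLog].[dbo].[log]</query></queries>";
        System.out.println(readElementText(xml, "query"));
    }
}
```

For a text-only element, XMLEventReader.getElementText() is another way to get the whole content in one call.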

Related

Unable to capture next line character in Java

I have a requirement to parse a Python file that contains multiple SQL queries and to get the start and end positions of each query, so that I can extract only the query part, using Java.
I am using the .contains function to check for sql(''' as the opening marker of a query. For the closing marker I have ''') but in some cases ''') appears in the middle of the query when a variable is involved, and that should not be detected as the end of the query.
Something like this :
spark.sql(''' SELECT .......
FROM.....
WHERE xxx IN ('''+ Variable +''')
''')
Here the last but one line also gets detected as the end of the query if I use line.contains(" ''') "), which is wrong.
All I can think of is to check for the newline character after the closing marker, as each query is separated by two empty lines. So I tried if (line.contains(" ''')\n")) and if (line.contains(" ''')\r\n")) but neither works for me.
Kindly let me know of any other way to do this.
Note that I do not have the privilege to change the query file.
Thanks

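I believe a simple contains won't solve this problem. If you read the file line by line (for example with BufferedReader.readLine()), the line terminator is stripped from each line, so a check for "\n" can never succeed.
You will have to use Pattern on the whole text if you want to match across \n.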
String query = "spark.sql(''' SELECT .......\n" +
"FROM..... \n" +
"WHERE xxx IN ('''+ Variable +''')\n" +
"''')";
Pattern pattern = Pattern.compile("^spark\\.sql\\('''(.*)'''\\)$", Pattern.DOTALL);
System.out.println(pattern.matcher(query).find());
Output:
true
Pattern.DOTALL tells Java to allow the dot to match newline characters, too.
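To get the start and end positions of several queries in one file, the same idea can be extended with Matcher.start and Matcher.end. This is only a sketch under one assumption taken from the sample in the question: the closing ''') is the first non-blank text on its own line, while a ''') that belongs to a variable appears later in a line. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SparkSqlExtractor {
    // DOTALL lets '.' cross line breaks; MULTILINE lets '^' match at each line start.
    // Assumption: the closing ''') starts its own line (optionally indented).
    private static final Pattern QUERY = Pattern.compile(
            "spark\\.sql\\('''(.*?)^[ \\t]*'''\\)",
            Pattern.DOTALL | Pattern.MULTILINE);

    // Returns {start, end} offsets of each query body (end is exclusive).
    static List<int[]> queryPositions(String source) {
        List<int[]> positions = new ArrayList<>();
        Matcher matcher = QUERY.matcher(source);
        while (matcher.find()) {
            positions.add(new int[] {matcher.start(1), matcher.end(1)});
        }
        return positions;
    }

    public static void main(String[] args) {
        String source = "spark.sql(''' SELECT a\nFROM t\nWHERE x IN ('''+ v +''')\n''')\n";
        for (int[] p : queryPositions(source)) {
            System.out.println(source.substring(p[0], p[1]));
        }
    }
}
```

A query written entirely on one line would not match this pattern, so the assumption has to hold for the whole file.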

Can't parse pipe delimited header data into correct variable

I have a file with data in the first row that I want to extract. The data looks like:
20200403|AS421|||FINN|
public void handleLine(String line) {
    if (line.contains(firstJobConfig.DELIMITER_PIPE)) {
        headerInfo.setcreateDate(line.substring(0, line.indexOf(firstJobConfig.DELIMITER_PIPE)));
        headerInfo.setformName(line.substring(line.indexOf(firstJobConfig.DELIMITER_PIPE)));
    }
}
I have code that pulls 20200403 into my createDate variable, but I can't figure out how to get my formName set to AS421. Right now it is set to |AS421|||FINN|. I know that line.substring(9, 14) would work, but I want to start after the first pipe delimiter (|) and stop at the next one.

Right now you are calling headerInfo.setformName(line.substring(line.indexOf(firstJobConfig.DELIMITER_PIPE))). That takes the substring starting at the index of the first delimiter and does not specify an end index, which is why the result is |AS421|||FINN|. A better way is to use line.split("\\|"). In your case it returns an array of 5 elements: ["20200403","AS421","","","FINN"] (the trailing empty string is dropped). Then you can do:
String[] table = line.split("\\|");
headerInfo.setcreateDate(table[0]);
headerInfo.setformName(table[1]);
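If you prefer to keep the substring approach from the question, the two-argument String.indexOf(str, fromIndex) can locate the second pipe. The sketch below uses a hypothetical HeaderParser class, and its DELIMITER_PIPE constant is a stand-in for firstJobConfig.DELIMITER_PIPE, which is assumed to be "|".

```java
class HeaderParser {
    // Stand-in for firstJobConfig.DELIMITER_PIPE (assumed to be "|").
    static final String DELIMITER_PIPE = "|";

    // Returns {createDate, formName}: the text before the first pipe and
    // the text between the first and the second pipe.
    static String[] parseHeader(String line) {
        int first = line.indexOf(DELIMITER_PIPE);
        // Start searching one character after the first delimiter.
        int second = line.indexOf(DELIMITER_PIPE, first + 1);
        if (first < 0 || second < 0) {
            throw new IllegalArgumentException("Expected at least two delimiters: " + line);
        }
        return new String[] {
            line.substring(0, first),
            line.substring(first + 1, second)
        };
    }

    public static void main(String[] args) {
        String[] header = parseHeader("20200403|AS421|||FINN|");
        System.out.println(header[0] + " / " + header[1]);
    }
}
```

If the trailing empty field matters, line.split("\\|", -1) keeps it, at the cost of one extra element at the end of the array.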

You can split the string like below.
Adding a + makes the pattern match one or more consecutive pipes, so empty fields are skipped and the sample line yields ["20200403","AS421","FINN"]:
temp.split("\\|+");

AS400 SQL Script on a parameter file returns

I'm integrating an application to the AS400 using Java/JT400 driver. I'm having an issue when I extract data from a parameter file - the data retrieved seems to be encoded.
SELECT SUBSTR(F00001,1,20) FROM QS36F."FX.PARA" WHERE K00001 LIKE '16FFC%%%%%' FETCH FIRST 5 ROWS ONLY
Output
00001: C6C9D9C540C3D6D4D4C5D9C3C9C1D34040404040, - 1
00001: C6C9D9C5406040C3D6D4D4C5D9C3C9C1D3406040, - 2
How can I convert this to a readable format? Is there a function which I can use to decode this?
On the terminal connection to the AS400 the information is displayed correctly through the same SQL query.
I have no experience working with AS400 before this and could really use some help. This issue is only with the parameter files. The database tables work fine.

What you are seeing is the hexadecimal representation of the EBCDIC bytes rather than readable text. This is due to the CCSID not being specified in the database, as mentioned in other answers, so the field is returned untranslated. The ideal solution is to assign the CCSID to your field in the database. If you don't have the ability to do so and can't convince those responsible to do so, then the following solution should also work:
SELECT CAST(SUBSTR(F00001,1,20) AS CHAR(20) CCSID 37)
FROM QS36F."FX.PARA"
WHERE K00001 LIKE '16FFC%%%%%'
FETCH FIRST 5 ROWS ONLY
Replace the CCSID with whichever one you need. The CCSID definitions can be found here: https://www-01.ibm.com/software/globalization/ccsid/ccsid_registered.html

Since the file is in QS36F, I would guess that the file is a flat file and not externally defined ... so the data in the file would have to be manually interpreted if being accessed via SQL.
You could try casting the field, after you substring it, into a character format.
(I don't have a S/36 file handy, so I really can't try it)

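The output is the hexadecimal representation of the bytes of a text encoded in EBCDIC, the AS/400 character set. You can decode it like this (requires import java.nio.charset.Charset):
static String fromEbcdic(String hex) {
    int m = hex.length();
    if (m % 2 != 0) {
        throw new IllegalArgumentException("Must be even length");
    }
    int n = m / 2;
    byte[] bytes = new byte[n];
    for (int i = 0; i < n; ++i) {
        // Each pair of hex digits is one EBCDIC byte.
        int b = Integer.parseInt(hex.substring(i * 2, i * 2 + 2), 16);
        bytes[i] = (byte) b;
    }
    return new String(bytes, Charset.forName("Cp500"));
}
Passing "C6C9D9C540C3D6D4D4C5D9C3C9C1D34040404040" returns "FIRE COMMERCIAL" followed by five padding spaces.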
Alternatively, if you have the file itself, read it with Cp500 as the charset:
Path path = Paths.get("...");
List<String> lines = Files.readAllLines(path, Charset.forName("Cp500"));
For line endings, which on the AS/400 are the NEL character (U+0085), one can use a regex:
content = content.replaceAll("\\R", "\r\n");
The regex \R matches exactly one line break, whether \r, \n, \r\n or \u0085.

A big thank you for all the answers provided; they are all correct.
It is a flat parameter file in the AS400 and I have no control over changing anything in the system. So it has to be at runtime of the SQL query or once received.
I had absolutely no clue about what the code page was as I have no prior experience with AS400 and files in it. Hence all your answers have helped resolve and enlighten me on this. :)
So, the best answer is the last one. I have changed the SQL as follows and I get the desired result.
SELECT CAST(F00001 AS CHAR(20) CCSID 37) FROM QS36F."FX.PARA" WHERE K00001 LIKE '16FFC%%%%%' FETCH FIRST 5 ROWS ONLY
00001: FIRE COMMERCIAL , - 1
00001: FIRE - COMMERCIAL - , - 2
Thanks once again.
Dilanke

Using trim() but still didn't get expected output

OK, I am developing a Spring MVC based web application. The application shows data in a list, and I also provide filter options to enhance the search functionality. I remove extra spaces using trim(). When the user enters data in the text field and presses Enter, the corresponding result is displayed in the list. But if a space is added after the input, the result is "NOT FOUND", even though I handle the space in JavaScript too.
Java code which fetches data from the database:
if (searchParamDTO.getRegNO().trim() != null && !searchParamDTO.getRegNO().trim().equals("") && !searchParamDTO.getRegNO().trim().equals("null")) {
    query += " AND UR.REG_UNIQUE_ID = :REG_UNIQUE_ID ";
    param.addValue("REG_UNIQUE_ID", searchParamDTO.getRegNO());
}
JavaScript code which fetches the value by id:
function setSearchParameters() {
    regNo = $('#regNo').val().trim();
}
I also attached two screenshots, one without the trailing space and one with it.

As @Greg H said, you're trimming the string when checking whether it's blank, but then adding the raw string to the query, which still includes any trailing spaces.
So this line param.addValue("REG_UNIQUE_ID", searchParamDTO.getRegNO()); should be replaced by param.addValue("REG_UNIQUE_ID", searchParamDTO.getRegNO().trim());
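One way to avoid trimming in several places is to normalize the value once and reuse the result for both the check and the query parameter. The helper below is a hypothetical sketch (SearchParams and normalize are not part of the question's code). It also covers a null input, which the original condition would not survive because trim() is called before the null check.

```java
class SearchParams {
    // Returns the trimmed value, or null when the input is absent,
    // blank, or the literal string "null".
    static String normalize(String raw) {
        if (raw == null) {
            return null;
        }
        String trimmed = raw.trim();
        if (trimmed.isEmpty() || trimmed.equals("null")) {
            return null;
        }
        return trimmed;
    }

    public static void main(String[] args) {
        // A trailing space no longer reaches the query parameter.
        System.out.println("[" + normalize("ABC123 ") + "]");
    }
}
```

In the question's code this would be used as String regNo = normalize(searchParamDTO.getRegNO()); followed by if (regNo != null) { ... param.addValue("REG_UNIQUE_ID", regNo); }.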

How to distinguish in quotes delimiter vs out of quotes delimiter

I have a txt file that contains the following
SELECT TOP 20 personid AS "testQu;otes"
FROM myTable
WHERE lname LIKE '%pi%' OR lname LIKE '%m;i%';
SELECT TOP 10 personid AS "testQu;otes"
FROM myTable2
WHERE lname LIKE '%ti%' OR lname LIKE '%h;i%';
............
The above queries can be any legitimate SQL statements (on one or multiple lines, i.e. any way the user wishes to type them in).
I need to split this text and put the statements into an array:
File file ... blah blah blah
..........................
String myArray [] = text.split(";");
But this does not work properly because it takes into account ALL ; characters. I need to ignore every ; that is inside "..." or '...'. For example, the ; in '%h;i%' does not count because it is inside ''. How can I split correctly?

Assuming that each ; you want to split on is at the end of a line, you can try to split on each ; followed by a line separator, like
text.split(";"+System.lineSeparator())
If your file has line separators other than the default one, you can try with
text.split(";\n")
text.split(";\r\n")
text.split(";\r")
BTW, if you want to keep the ; in the split result (if you don't want to get rid of it), you can use the look-behind mechanism, like
text.split("(?<=;)"+System.lineSeparator())
If you are reading the file line by line, just check whether line.endsWith(";").

I see a new line after each ';'. Does that hold for the whole text file?
If you must or want to use a regular expression, you could split with a regex of the form
(?m);$
In Java, $ matches only at the end of the input by default. With the MULTILINE flag, written inline as (?m), it also matches at the end of each line.
I would not use a regex for this kind of task. Parsing the text and tracking whether you are inside ' or " is sufficient to recognize the real ";" delimiters.
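The quote-tracking idea can be sketched as a single pass over the text. This is a minimal illustration, not a full SQL lexer: it assumes quotes are only ever closed by the same quote character and it ignores SQL comments (-- and /* */). The class name SqlStatementSplitter is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SqlStatementSplitter {
    // Splits on ';' only when the scanner is outside '...' and "..." literals.
    static List<String> split(String text) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        char quote = 0; // 0 = outside quotes, otherwise the quote char that opened the literal
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (quote != 0) {
                if (c == quote) {
                    quote = 0; // closing quote; a doubled '' simply closes and reopens
                }
            } else if (c == '\'' || c == '"') {
                quote = c;
            } else if (c == ';') {
                addIfNotBlank(statements, current);
                current.setLength(0);
                continue; // drop the delimiter itself
            }
            current.append(c);
        }
        addIfNotBlank(statements, current); // last statement without a trailing ';'
        return statements;
    }

    private static void addIfNotBlank(List<String> out, StringBuilder sb) {
        String statement = sb.toString().trim();
        if (!statement.isEmpty()) {
            out.add(statement);
        }
    }

    public static void main(String[] args) {
        String text = "SELECT 1 AS \"a;b\" FROM t WHERE x LIKE '%m;i%';\nSELECT 2 FROM t2;";
        for (String statement : split(text)) {
            System.out.println(statement);
        }
    }
}
```

Unlike splitting on ";" plus a line separator, this does not depend on where the user puts line breaks.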
