Parsing text file based on a pre-defined template - java

I have to parse several text files in a pre-defined format and get the parsed data into Java code. For example, a file would look something like the one below.
12345 5abcd 18864 12585
24584 4frrf 44855 84745
98745 2rgr4 25584 36546
where the first 2 characters in each line are the area code (12, 24, 98, etc.), the characters in positions 3 to 5 are the district code, and similarly each position has its own meaning.
I am looking for a Java library that would allow me to create a template something like the one below:
$for(int i=1;i<=endOfLine();i++){
$(area_code)[2]$(district_code)[3] $(address_line1)[5] $(address_line2)[5] $(address_line3)[5]
}
and read the file content like
Integer areaCode = template.getNumber("area_code");
Integer districtCode = template.getNumber("district_code");
String addressLine1 = template.getString("address_line1");
...
...
Any suggestions on which library can be used, or how to do this?
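Without relying on a particular library, the template idea can be sketched with plain substring offsets. This is a minimal sketch, not an existing library API: the class FixedWidthTemplate and its field and parse methods are hypothetical names, and the offsets assume the layout from the question (2 + 3 characters, then three 5-character fields separated by single spaces).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: maps field names to fixed character ranges in a line.
class FixedWidthTemplate {
    // field name -> {start index (inclusive), end index (exclusive)}
    private final Map<String, int[]> fields = new LinkedHashMap<>();

    // Registers a field that starts at a 0-based offset and spans `length` characters.
    FixedWidthTemplate field(String name, int start, int length) {
        fields.put(name, new int[] {start, start + length});
        return this;
    }

    // Cuts one line into named values according to the registered ranges.
    Map<String, String> parse(String line) {
        Map<String, String> values = new LinkedHashMap<>();
        for (Map.Entry<String, int[]> e : fields.entrySet()) {
            int[] range = e.getValue();
            values.put(e.getKey(), line.substring(range[0], range[1]));
        }
        return values;
    }

    public static void main(String[] args) {
        FixedWidthTemplate template = new FixedWidthTemplate()
                .field("area_code", 0, 2)
                .field("district_code", 2, 3)
                .field("address_line1", 6, 5)
                .field("address_line2", 12, 5)
                .field("address_line3", 18, 5);

        Map<String, String> row = template.parse("12345 5abcd 18864 12585");
        int areaCode = Integer.parseInt(row.get("area_code"));
        int districtCode = Integer.parseInt(row.get("district_code"));
        System.out.println(areaCode + " " + districtCode + " " + row.get("address_line1"));
    }
}
```

Each line of the file would be passed to parse in a loop; a line shorter than the template makes substring throw, so real input probably needs a length check first.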

Related

Map invalid characters to valid characters when creating a file, and back to the original name when reading the filename

I need to map invalid characters to some other characters, like "/" to "_" (forward slash to underscore), while creating a file, because file names are not allowed to contain slashes, question marks, double quotes, etc.
Suppose I have
String name = "Message Test - 22/10/2016";
Now I want to write a file using the above string as its name, but it gives an error because of the slashes.
So I want to map the slash, and all the other invalid characters, to other characters while writing the file. After writing, I need to read the names of all the files and show them on the page.
Once the characters are mapped somehow, the file name would be
Message_Test_-_22-10-2016
When I show it on the web, I need to return the original file name, like
Message Test - 22/10/2016
I am using Java. Can anyone help me with how to start writing this, or tell me whether there is an API for it or some other approach?
I don't want to use a database to correlate the alias file name with the original file name.
I need to map invalid characters to some other characters like "/" to "_"
This is not robust enough, since it assumes that you never use the _ character in a filename.
If you do use it, how would you know whether a file stored as my_file should be displayed as my_file or my/file in your application?
I think a more reliable way would be to have a file (JSON or XML, for example) that stores two properties for each file:
the stored filename
the visual name representing it in your application
It requires an additional file, but it makes things much clearer.
You can use a map to store the mappings:
E.g.
Map<Character,Character> map = new HashMap<Character,Character>();
map.put('/','_');
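And then replace the characters in one traversal. Strings are immutable, so the result has to be built up rather than discarding the return value of str.replace:
StringBuilder sb = new StringBuilder(str.length());
for (int i = 0; i < str.length(); i++) {
    char c = str.charAt(i);
    // use the mapped character if there is one, otherwise keep the original
    sb.append(map.containsKey(c) ? map.get(c) : c);
}
str = sb.toString();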
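A plain character map cannot be reversed once a replacement character also occurs in the original name. One way around that without a database or side file is an escape scheme, sketched below under assumptions: the class name FileNameCodec, the choice of % as escape character, and the set of characters kept as-is are all hypothetical, and the safe set may need tightening for a particular file system. Note that the stored name differs from the Message_Test_-_22-10-2016 form in the question.

```java
// Hypothetical reversible codec: unsafe characters become %XXXX (4 hex digits).
class FileNameCodec {

    // Assumption: letters, digits, space, '-' and '.' are acceptable in a file name.
    private static boolean isSafe(char c) {
        return Character.isLetterOrDigit(c) || c == ' ' || c == '-' || c == '.';
    }

    // Escapes every unsafe character, including '%' itself, so decoding is unambiguous.
    static String encode(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (isSafe(c)) {
                sb.append(c);
            } else {
                sb.append(String.format("%%%04x", (int) c));
            }
        }
        return sb.toString();
    }

    // Turns each %XXXX sequence back into the character it stands for.
    static String decode(String stored) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < stored.length(); i++) {
            char c = stored.charAt(i);
            if (c == '%') {
                sb.append((char) Integer.parseInt(stored.substring(i + 1, i + 5), 16));
                i += 4; // skip the four hex digits just consumed
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String stored = encode("Message Test - 22/10/2016");
        System.out.println(stored);
        System.out.println(decode(stored));
    }
}
```

Because the escape character is itself escaped, decode(encode(name)) returns the original name for any input, including names that already contain % or _.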

How to check for missing Key in JSON using Pig?

I have a JSON file with varying schema.
{"asin":"xxxxxx", "title":"xxxsomething"}
{"asin":"yyyyy"}
{"asin":"zzzzzz", "title":"zzzsomething"}
For this I have written a Pig script that uses Twitter's elephant-bird library to load the JSON data and convert it into a tab-separated file.
However, if a line in the input JSON file is missing the "title" key (line 2 in the above example), the TSV file also has nothing in its place, like:
xxxxxx xxxsomething
yyyyyy
zzzzzz zzzsomething
I would like to supply a custom default value if a particular key is missing. How can I do this using Pig Latin?
Expected output:
xxxxxx xxxsomething
yyyyyy default_string
zzzzzz zzzsomething
Here's my script:
REGISTER elephant-bird-elephant-bird-4.13/pig/target/elephant-bird-pig-4.13.jar;
REGISTER elephant-bird-elephant-bird-4.13/hadoop-compat/target/elephant-bird-hadoop-compat-4.13.jar;
REGISTER elephant-bird-elephant-bird-4.13/core/target/elephant-bird-core-4.13-thrift9.jar;
reviews = load '../data/Amazon/meta_Amazon_Instant_Video.json'
using com.twitter.elephantbird.pig.load.JsonLoader();
tabs = FOREACH reviews generate (chararray)$0#'asin' as asin_new, (chararray)$0#'title';
A = ORDER tabs BY asin_new;
DESCRIBE A;
STORE A INTO 'hdfs://localhost:9000/meta_Amazon_Instant_Video.tsv';
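You can simply write a UDF for that, with a condition that returns the default string whenever the field is null or empty. For a plain null check, a bincond inside the FOREACH should also work without a UDF, for example ((chararray)$0#'title' IS NULL ? 'default_string' : (chararray)$0#'title').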

Unable to retrieve inline images from the mail body in Lotus Notes

I am unable to retrieve inline images/screenshots from Java in Lotus Notes using
document.getItemValueString("Body")
With the above function I can retrieve the text in the mail body, but not the inline images.
Please provide your suggestions on how to retrieve inline images from the mail body.
Thanks in advance.
LSP Jyothi
First of all, Body is a NotesRichTextItem. You would have to use the NotesRichTextItem methods and properties to get the inline image... if there were any for that purpose.
Inline images are not handled by any means in LotusScript. To get them, you need to:
Export the document as XML
Find the part in the XML that represents the inline image
Take the Base64-encoded value there and convert it into a binary format; use the MIME classes for that (trick).
Write the data to a file
There is a lot of code involved in doing this. I will just post the "crucial" parts of the code here (untested, no syntax check, just as a starting point):
EDIT: Sorry, I am not an expert in Java and only saw the tag "lotusscript", therefore my example is LotusScript code (it should be similar in Java, and I think Base64 operations are already built into Java, so there is no need for the MIME trick)
Dim strDxl as String
Dim strFoundBase64 as String
Dim exporter as NotesDXLExporter
Dim stream as NotesStream
Dim docConvert as NotesDocument
Dim mimeEntity as NotesMimeEntity
Set exporter = session.CreateDXLExporter
exporter.ConvertNotesBitmapsToGIF = True
strDxl = exporter.Export(document)
'- Search through strDxl and find everything that is in the following tags:
'- <gif></gif>, <gif originalformat='notesbitmap'></gif>, <jpeg></jpeg>, <png></png>
strFoundBase64 = ...'assign text between tags
'- use Mime class to convert to binary
Set docConvert = New NotesDocument( document.ParentDatabase )
Set mimeEntity = docConvert.CreateMIMEEntity
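'- Note: SetContentFromBytes expects a NotesStream as its first argument, so strFoundBase64 has to be written to a stream first
Call mimeEntity.SetContentFromBytes(strFoundBase64, "image/gif", ENC_BASE64)
'- Write result to file
Set stream = session.CreateStream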
Call stream.Open( "C:\Temp\image.gif", "binary")
Call mimeEntity.GetContentAsBytes(stream)
Call stream.Close()
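For the Java side, the search and decode steps can be sketched with the standard library alone. This is a minimal sketch that assumes the DXL has already been exported into a String and that image data sits as Base64 text inside gif, jpeg or png elements, as described above; the class and method names are hypothetical, and a real implementation would likely use an XML parser rather than a regular expression.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls Base64 image payloads out of exported DXL text.
class DxlImageExtractor {
    // Matches <gif ...>, <jpeg ...> and <png ...> elements and captures their text content
    private static final Pattern IMAGE =
            Pattern.compile("<(gif|jpeg|png)\\b[^>]*>([^<]*)</\\1>");

    // Returns the decoded bytes of every inline image found in the DXL text.
    static List<byte[]> extractImages(String dxl) {
        List<byte[]> images = new ArrayList<>();
        Matcher m = IMAGE.matcher(dxl);
        while (m.find()) {
            // The MIME decoder ignores the line breaks that wrap long Base64 runs
            images.add(Base64.getMimeDecoder().decode(m.group(2)));
        }
        return images;
    }

    public static void main(String[] args) {
        String dxl = "<richtext><gif originalformat='notesbitmap'>R0lG\nODlh</gif></richtext>";
        System.out.println(extractImages(dxl).size() + " image(s) found");
    }
}
```

Each byte array could then be written to disk with java.nio.file.Files.write, which covers the last step of the list.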

How to set special character (UTF-8) in java String and xml?

I want to save special characters (UTF-8) in Java.
In my JSF page I am setting a string value in the model:
<h:inputTextarea id="que" value="#{dataModel.question}"/>
When I fetch that String in my Java controller, it contains different characters. For example, for a sentence like ΔLMN ≠ΔXYZ, the value printed in the controller shows garbled characters in place of Δ and ≠.
In my project I fetch the value from an XML file and write the same value back to XML like this:
option.addContent(new CDATA(new String(this.launchModel.getQuestionList().get(i).getOptionList().get(k).getOption().getBytes("UTF-8"), "UTF-8")));
How can I solve this problem? I am still trying on my side.
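One common cause of this symptom, though not confirmed for this project, is that UTF-8 bytes get decoded with a single-byte charset such as ISO-8859-1 somewhere between the browser and the controller. The sketch below only illustrates that mechanism with hypothetical helper names; it also shows that the getBytes("UTF-8") followed by new String(..., "UTF-8") round trip in the line above leaves the string unchanged, so it cannot repair an already garbled value.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo of how UTF-8 text turns into garbled characters.
class Utf8Demo {
    // Simulates the bug: UTF-8 bytes wrongly decoded as ISO-8859-1.
    static String garble(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses garble(): re-encode as ISO-8859-1, then decode as UTF-8.
    static String repair(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    // The round trip used in the question; it returns an equal string.
    static String roundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of its own encoding
        String original = "\u0394LMN \u2260\u0394XYZ";
        String garbled = garble(original);
        System.out.println(garbled.equals(original));         // false
        System.out.println(repair(garbled).equals(original)); // true
        System.out.println(roundTrip(original).equals(original)); // true
    }
}
```

If this is indeed the cause, the durable fix is to make every layer agree on UTF-8 rather than to repair strings after the fact.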

Load a .txt file into a Java application and save it to an XML file

I read the following answer about loading a file into a Java application.
I need to write a program that loads a .txt file containing a list of records. After parsing it, I need to match the records (against conditions that I will check) and save the result to an XML file.
I am stuck on this issue, and I would be happy to get answers to the following questions:
How do I load the .txt file into Java?
After I load the file, how can I access the information in it? For example, how can I check whether the first line of one of the records is equal to "1"?
How do I export the result to an XML file?
One: you need sample code for reading a file line by line.
Two: the split method of String might be helpful, for instance to get the number in the first element if the information is separated by spaces:
String myLine = "1 foo bar"; // placeholder: a line read from the file
String[] components = myLine.split(" ");
if (components.length >= 1) {
    // first element of the record, parsed as a number
    int num = Integer.parseInt(components[0]);
    ....
}
Three: you can just write it like any text file, or use any XML writer you want.
Basic I/O
Integer.parseInt(firstLine)
There are a plethora of choices:
Create POJOs to represent the records and write them using XMLEncoder
SAX
DOM
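The three steps can be sketched together with the standard library. This is a minimal sketch under assumptions: the class name RecordsToXml, the element names records, record and field, whitespace-separated fields, and the rule "keep records whose first field is 1" are all hypothetical stand-ins for the real format and conditions. It uses XMLStreamWriter, which is one of several possible XML writers.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical converter: text lines in, XML string out.
class RecordsToXml {

    // Builds an XML document from the lines that pass the (assumed) condition.
    static String toXml(List<String> lines) {
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("records");
            for (String line : lines) {
                String[] parts = line.trim().split("\\s+");
                // Assumed condition: keep only records whose first field is "1"
                if (!parts[0].equals("1")) {
                    continue;
                }
                xml.writeStartElement("record");
                for (String part : parts) {
                    xml.writeStartElement("field");
                    xml.writeCharacters(part); // escapes <, > and & automatically
                    xml.writeEndElement();
                }
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.flush();
            xml.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException("Could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path of the input .txt file, args[1]: path of the output XML file
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        Files.write(Paths.get(args[1]), toXml(lines).getBytes("UTF-8"));
    }
}
```

Files.readAllLines loads the whole file at once, which is fine for small inputs; a BufferedReader loop would be the usual choice for large files.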
