Crystal Reports PDF font issue for Japanese characters using Java

I am using Crystal Reports to generate files in different formats (PDF, Excel, Word). I export the .rpt files when sending emails. The problem occurs with PDF attachments: Japanese characters are shown as boxes. When I send the same report as an Excel or Word attachment, the Japanese characters display correctly.
I searched for a solution and learned that this is a font issue in Crystal Reports when exporting to PDF, and that the "MS Gothic" font must be used for Japanese characters. I tried this for one particular field in the Crystal Reports designer, and that field now works fine.
But this is not the solution I want, because there are thousands of reports (.rpt files), each with many fields, so it is not feasible to change every field in every report.
So I want to set the font programmatically in my Java code when exporting to PDF.
private void exportReport(HttpSession session, HttpServletRequest req, ReportClientDocument rptDoc, String sReport)
        throws Exception
{
    String sExportType = req.getParameter("cmbExportTypes");
    if (sReport == null)
        sReport = (String) session.getValue("selected_report");
    sReport = Util.removeEndingChar(sReport, ".rpt");
    String sExtension = "";
    if (sExportType.toLowerCase().indexOf("word") != -1)
        sExtension = ".doc";
    else if (sExportType.toLowerCase().indexOf("excel") != -1)
        sExtension = ".xls";
    else if (sExportType.toLowerCase().indexOf("pdf") != -1)
        sExtension = ".pdf";
    else if (sExportType.toLowerCase().indexOf("rtf") != -1)
        sExtension = ".rtf";
    else if (sExportType.toLowerCase().indexOf("report") != -1)
        sExtension = ".rpt";
    String sFileName = sReport + sExtension;
    FileType file = new FileType(-1, sFileName, false);
    // NOTE: this FieldObject is newly created and never attached to rptDoc
    // (the modify call below is commented out), so the export does not see it.
    FieldObject fieldObject = new FieldObject();
    IFontColor fontColor = fieldObject.getFontColor();
    IFont iFont = fontColor.getIFont();
    iFont.setName("MS Gothic");
    iFont.setSize(9.5f);
    //rptDoc.getReportDefController().getReportObjectController().modify(fieldObject ,fieldObject);
    // 07/03/2012 EA: PR #15087 -- Support Crystal Reports 2008 SP5
    InputStream byteIS = rptDoc.getPrintOutputController().export(ReportExportFormat.from_string(sExportType));
    EmailAttachment attachment = new EmailAttachment(byteIS, file.type, sFileName);
    session.putValue("report_attachment", attachment);
    rptDoc.close();
}
I am still getting square boxes in place of the Japanese characters. I need help. I think the call to the modify method needs some changes.

I don't think this would work programmatically. In our company we also came to the conclusion that we have to change the font in the report itself. :-(
I also wouldn't recommend doing it programmatically. Every font has different metrics. When I take a report and change all fields from "Tahoma" to "Verdana", the fields end up displaced, too long, overlapping and so on.
The same could happen here.
Maybe you don't have to change all the thousands of reports. It may be enough to change only the ones that contain fields with Japanese characters and are used often.

Related

Rendering big Post Script file with Ghost4J in Java

I made a Java application whose purpose is to offer a print preview for PS files.
My program uses Ghostscript and Ghost4J to load the PostScript file and produce a list of Images (one for each page) using the SimpleRenderer.render method. A simple JList then shows only the image corresponding to the page the user selected.
This worked fine until a really big PS file came along, causing an OutOfMemoryError when executing the code
PSDocument pdocument = new PSDocument(new File(filename));
I know that it is possible to read a file a little at a time using InputStreams. The problem is that I can't think of a way to map the bytes I read to the actual pages of the document.
For example, I tried to read 100 MB of the PS file at a time:
int buffer_size = 100000000;
byte[] buffer = new byte[buffer_size];
FileInputStream partial = new FileInputStream(filename);
partial.read(buffer, 0, buffer_size);
document.load(new ByteArrayInputStream(buffer));
SimpleRenderer renderer = new SimpleRenderer();
//how many pages do i have to read?
List<Image> images = renderer.render(document, firstpage ??, lastpage ??);
Am I missing some Ghost4J functionality for reading a file partially?
Or does someone have other suggestions or approaches for solving this problem in a different way?
I am really struggling.
I found out that I can use the Ghost4J Core API to retrieve a reduced set of pages from a PostScript file as Images.
Ghostscript gs = Ghostscript.getInstance();
String[] gsArgs = new String[9];
gsArgs[0] = "-dQUIET";
gsArgs[1] = "-dNOPAUSE";
gsArgs[2] = "-dBATCH";
gsArgs[3] = "-dSAFER";
gsArgs[4] = "-sDEVICE=display";
gsArgs[5] = "-sDisplayHandle=0";
gsArgs[6] = "-dDisplayFormat=16#804";
gsArgs[7] = "-sPageList="+firstPage+"-"+lastPage;
gsArgs[8] = "-f"+filename;
//create display callback (capture display output pages as images)
ImageWriterDisplayCallback displayCallback = new ImageWriterDisplayCallback();
//set display callback
gs.setDisplayCallback(displayCallback);
//run PostScript (also works with PDF) and exit interpreter
try {
    gs.initialize(gsArgs);
    gs.exit();
    Ghostscript.deleteInstance();
} catch (GhostscriptException e) {
    System.out.println("ERROR: " + e.getMessage());
    e.printStackTrace();
}
return displayCallback.getImages(); //return List<Images>
This solves the problem of rendering pages as images in the preview.
However, I could not find a way to use Ghost4J to get the total number of pages of a PS file (in case the file is too big to open with Document.load()).
So I still need some help.
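One possible way to get the page count without loading the document is to stream through the file and count "%%Page:" comments. This is only a sketch, and it assumes the file follows Adobe's Document Structuring Conventions (DSC), which many but not all PostScript producers do; for a non-conforming file it will simply return 0. The class and method names (PsPageCounter, countPages, countPagesInFile) are made up for this example and are not part of Ghost4J.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Ghost4J.
final class PsPageCounter {

    /** Counts top-level "%%Page:" DSC comments, reading one line at a time. */
    static int countPages(Reader source) {
        int pages = 0;
        int embeddedDepth = 0; // > 0 while inside %%BeginDocument ... %%EndDocument
        try {
            BufferedReader reader = new BufferedReader(source);
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("%%BeginDocument")) {
                    embeddedDepth++;            // embedded file (e.g. an EPS) starts
                } else if (line.startsWith("%%EndDocument")) {
                    if (embeddedDepth > 0) {
                        embeddedDepth--;        // embedded file ends
                    }
                } else if (embeddedDepth == 0 && line.startsWith("%%Page:")) {
                    pages++;                    // "%%Pages:" does not match this prefix
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return pages;
    }

    /** Convenience wrapper: ISO-8859-1 maps every byte, so binary data cannot fail to decode. */
    static int countPagesInFile(String filename) {
        try (Reader r = Files.newBufferedReader(Paths.get(filename), StandardCharsets.ISO_8859_1)) {
            return countPages(r);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The result of PsPageCounter.countPagesInFile(filename) could then be used to build the firstPage and lastPage values passed to -sPageList above. Only one line is held in memory at a time, although a file with very long lines of binary data would still produce large line buffers.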

XWPFRun generating runs with whitespaces trimmed

I have written Java code that replaces some string patterns in a template and then generates an output docx file, using Apache POI. It was easy to replace the patterns in the headers and paragraphs, but I ran into an issue when trying to replace text inside text boxes. I am using the code provided by Axel Richter in "Replace text in text box of docx by using Apache POI", but the problem is that it trims some whitespace in each run.
For example:
cp -r basedir destination
Becomes:
cp-r basedir destination
The part of the code responsible for doing this substitution is this (The parameters of the function are: doc_buffer is a XWPFDocument, pattern and replacement are both Strings):
for (XWPFParagraph paragraph : doc_buffer.getParagraphs()) {
    XmlCursor cursor = paragraph.getCTP().newCursor();
    cursor.selectPath(
            "declare namespace w='http://schemas.openxmlformats.org/wordprocessingml/2006/main' .//*/w:txbxContent/w:p/w:r");
    List<XmlObject> ctrsintxtbx = new ArrayList<XmlObject>();
    while (cursor.hasNextSelection()) {
        cursor.toNextSelection();
        XmlObject obj = cursor.getObject();
        ctrsintxtbx.add(obj);
    }
    for (XmlObject obj : ctrsintxtbx) {
        CTR ctr = CTR.Factory.parse(obj.toString());
        XWPFRun bufferrun = new XWPFRun(ctr, (IRunBody) paragraph);
        String text = bufferrun.getText(0);
        if ((text != null) && (text.contains(pattern))) {
            text = text.replaceAll(pattern, replacement);
            bufferrun.setText(text, 0);
        }
        obj.set(bufferrun.getCTR());
    }
}
If you need any additional information, please let me know.
Thanks in advance!
I have managed to find what was causing this. I'll post it here so that anyone with the same problem can see how I solved it.
The CTR.Factory.parse call in the example takes a String, but if you check the XmlObject.Factory docs, there are several parse overloads that accept different parameter types. So I changed this line:
CTR ctr = CTR.Factory.parse(obj.toString());
to the overload that accepts an InputStream, passing a new InputStream created from the XmlObject:
CTR ctr = CTR.Factory.parse(obj.newInputStream());
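A side note on the replacement step in the loop above: String.replaceAll treats its first argument as a regular expression and gives "$" and "\" a special meaning in the second. If the template placeholders are meant to be literal text, something like the following sketch may be safer. The class name LiteralReplace and the "${src}" placeholder syntax are assumptions made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
final class LiteralReplace {

    /** Replaces every literal occurrence of pattern, with no regex interpretation. */
    static String replaceLiteral(String text, String pattern, String replacement) {
        // Pattern.quote makes the pattern literal; Matcher.quoteReplacement
        // protects '$' and '\' in the replacement string.
        return text.replaceAll(Pattern.quote(pattern), Matcher.quoteReplacement(replacement));
    }
}
```

For plain literal replacement, text.replace(pattern, replacement) gives the same result without any quoting.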

PDFBox RichText formatted field

I am currently trying to open, edit and save a PDF file using PDFBox.
With plain-text fields it already works, but I'm having a hard time setting rich text as the value: every time I use "setRichTextValue", save and reopen the document, the field is empty (unchanged).
The code is as follows (stripped down from multiple functions):
PDDocument pdfDoc = PDDocument.load(new File("my pdf path"));
PDDocumentCatalog docCatalog = pdfDoc.getDocumentCatalog();
PDAcroForm acroForm = docCatalog.getAcroForm();
PDField field = acroForm.getField("field-to-change");
if (field instanceof PDTextField) {
    PDTextField tfield = (PDTextField) field;
    COSDictionary dict = field.getCOSObject();
    //COSString defaultAppearance = (COSString) dict.getDictionaryObject(COSName.DA);
    //if (defaultAppearance != null && font != "" && size > 0)
    //    dict.setString(COSName.DA, "/" + font + " " + size + " Tf 0 g");
    boolean rtf = true;
    // Backslashes must be doubled inside a Java string literal.
    String val = "{\\rtf1\\ansi\\deff0 {\\colortbl;\\red0\\green0\\blue0;\\red255\\green0\\blue0;} \\cf2 Red RTF Text \\cf1 }";
    tfield.setRichText(rtf);
    if (rtf)
        tfield.setRichTextValue(val);
    else
        tfield.setValue(val);
}
// save document etc.
Digging through the PDFBox documentation, I found this for .setRichTextValue(String r):
* Set the fields rich text value.
* Setting the rich text value will not generate the appearance
* for the field.
* You can set {@link PDAcroForm#setNeedAppearances(Boolean)} to
* signal a conforming reader to generate the appearance stream.
* Providing null as the value will remove the default style string.
* @param richTextValue a rich text string
so I added
pdfDoc.getDocumentCatalog().getAcroForm().setNeedAppearances(true);
directly after creating the PDDocument object, and it didn't change anything. So I searched further and found the AppearanceGenerator class, which should create the styles automatically? But it doesn't seem to, and you can't call it manually.
I'm at a loss here and Google is no help either. It seems nobody has ever used this before, or I'm just too stupid. I want the solution to use PDFBox since it has no license fees and it already works for everything else I am doing (getting and replacing images, removing text fields), so it must be possible, right?
Thanks in advance.

ItextSharp - diacritic chars

I am reading PDF documents with the iTextSharp library.
But these documents are in Czech, which uses diacritics (ř ě ž š č etc.).
How can I read these characters? Any ideas? Or is there a way to replace them with plain r e z s c?
This is the code in my method. Thanks.
PdfReader reader = new PdfReader("M:/ShareDirs_KSP/RDM_Debtors/DMS_PROD/" + src);
// we can inspect the syntax of the imported page
String text = new String();
for (int page = 1; page <= 1; page++) {
text += PdfTextExtractor.getTextFromPage(reader, page);
}
reader.close();
I have written a small proof of concept that parses the file czech.pdf. This file contains several characters with diacritics. It was created in answer to the following question: Can't get Czech characters while generating a PDF
The text is stored in the file twice: once using a simple font, once using a composite font. In my proof of concept (named ParseCzech), I parse this PDF to a file encoded using UTF-8 (UNICODE):
public void parse(String filename) throws IOException {
    PdfReader reader = new PdfReader(filename);
    FileOutputStream fos = new FileOutputStream(DEST);
    for (int page = 1; page <= 1; page++) {
        fos.write(PdfTextExtractor.getTextFromPage(reader, page).getBytes("UTF-8"));
    }
    fos.flush();
    fos.close();
}
The result is the file czech.txt:
As you can see from the screen shot, the text is extracted correctly (but make sure that the viewer you use knows that the file is encoded as UTF-8, otherwise you may see strange characters instead of the actual text).
Note that some PDFs do not allow text to be extracted correctly. This is explained in the following video: http://www.youtube.com/watch?v=wxGEEv7ibHE
Please share your PDF so that people on StackOverflow can check whether you don't succeed to extract text because of an error in your code, or whether you don't succeed because the PDF doesn't allow you to extract the text.
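As for the second part of the question (replacing the characters with plain r e z s c): once the text has been extracted correctly, one common approach in Java is to decompose each character into a base letter plus combining marks and then drop the marks. The following is only a sketch using the standard java.text.Normalizer class; the class name DiacriticStripper is made up for this example, and letters that have no decomposition (for example "ł" or "đ") are left unchanged.

```java
import java.text.Normalizer;

// Hypothetical helper for illustration only.
final class DiacriticStripper {

    /** Returns text with combining marks removed, e.g. a caron over "r" is dropped. */
    static String stripDiacritics(String text) {
        // NFD splits each accented letter into its base letter followed by combining marks.
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches any Unicode combining mark.
        return decomposed.replaceAll("\\p{M}+", "");
    }
}
```

It could be applied to the string returned by PdfTextExtractor.getTextFromPage before writing it out. This only helps if the extraction itself produced the right characters in the first place.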

Change id3 tag version programmatically (pref java)

I need a way to change the id3 tag version of mp3 files to some id3v2.x programmatically, preferably using Java, though anything that works is better than nothing. Bonus points if it converts the existing tag so that existing data isn't destroyed, rather than creating a new tag entirely.
Edit: Jaudiotagger worked, thanks. Sadly I had to restrict it to mp3 files, and data from previous tags is only kept if they were id3. I decided to convert the tag to ID3v2.3 since Windows Explorer can't handle v2.4. It was a bit tricky, because it was unclear whether the copy constructor or the conversion constructor would be used.
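MP3File mf = null;
try {
    mf = (MP3File) AudioFileIO.read(new File(pathToMp3File));
} catch (Exception e) {
    // Do not swallow the error: mf would stay null and fail below with a NullPointerException.
    throw new IllegalStateException("Could not read " + pathToMp3File, e);
}
ID3v23Tag tag;
if (mf.hasID3v2Tag()) tag = new ID3v23Tag(mf.getID3v2TagAsv24());
else if (mf.hasID3v1Tag()) tag = new ID3v23Tag(mf.getID3v1Tag());
else tag = new ID3v23Tag();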
My application must be able to read id3v1 or id3v11, but shall only write v23, so I needed a little bit longer piece of code:
AudioFile mf;
Tag mTagsInFile;
...
mf = ... // open audio file the usual way
...
mTagsInFile = mf.getTag();
if (mTagsInFile == null)
{
    //contrary to getTag(), getTagOrCreateAndSetDefault() ignores id3v1 tags
    mTagsInFile = mf.getTagOrCreateAndSetDefault();
}
// mp3 id3v1 and id3v11 are suboptimal, convert to id3v23
if (mf instanceof MP3File)
{
    MP3File mf3 = (MP3File) mf;
    if (mf3.hasID3v1Tag() && !mf3.hasID3v2Tag())
    {
        // convert ID3v1 tag to ID3v23
        mTagsInFile = new ID3v23Tag(mf3.getID3v1Tag());
        mf3.setID3v1Tag(null); // remove v1 tags
        mf3.setTag(mTagsInFile); // add v2 tags
    }
}
The key point is that getTagOrCreateAndSetDefault() and similar methods unfortunately ignore id3v1 tags, so we have to call getTag() first and call getTagOrCreateAndSetDefault() only if that returns null.
Additionally, the code must also deal with flac and mp4, so we make sure to do our conversion only for mp3 files.
Finally, there is a bug in JaudioTagger. You may replace this line
String genre = "(" + genreId + ") " + GenreTypes.getInstanceOf().getValueForId(genreId);
in "ID3v24Tag.java" with this one
String genre = GenreTypes.getInstanceOf().getValueForId(genreId);
Otherwise genre 12 from id3v1 becomes "(12) Other", which is later converted to "Other Other", and this is not what we would expect. Maybe someone has a more elegant solution.
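One alternative that avoids patching the library source might be to clean up the genre string after the conversion, before it is written back. The sketch below is plain string handling and does not call JaudioTagger at all; the class name GenreText is made up, and whether this is enough to prevent the "Other Other" result depends on where in your code the genre text is read and written, which is an assumption here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of JaudioTagger.
final class GenreText {

    // Matches a numeric id in parentheses followed by a genre name, e.g. "(12) Other".
    private static final Pattern NUMERIC_PREFIX = Pattern.compile("^\\(\\d+\\)\\s*(.+)$");

    /** Returns the genre name without a leading "(n)" prefix; other input is returned as-is. */
    static String stripNumericPrefix(String genre) {
        Matcher m = NUMERIC_PREFIX.matcher(genre);
        return m.matches() ? m.group(1) : genre;
    }
}
```

A bare "(12)" with no name after it is left untouched, since there would be nothing to keep.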
You can use different libraries for this purpose, for example this or this.
