I don't know what's going on with the FileWriter: it only writes out the HTML part, but nothing from the String array content. content stores a lot of long Strings. Is it because of Java's garbage collector?
I printed out the content and everything is there, but FileWriter did not write anything from content to that file except the HTML part. I added System.out.println(k); inside the enhanced for-loop. The content array is not null.
public void writeHtml(String[] content) {
File file = new File("final.html");
try {
try (FileWriter Fw = new FileWriter(file)) {
Fw.write("<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">\n"
+ "<html>\n"
+ "<head>\n"
+ "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=us-ascii\">\n"
+ "<title>" + fileName +" for The Delinquent</title>\n"
+ "<style type = \"text/css\">\n"
+ "body {font-family: \"Times New Roman, serif\"; font-size: 14 or 18; text-align: justify;};\n"
+ "p { margin-left: 1%; margin-right: 1%; }\n"
+ "</style>\n"
+ "</head><body>");
for (String k : content) {
Fw.write(k+"\n");
}
Fw.write("</body></html>");
}
} catch (Exception e) {
e.printStackTrace();
}
}
What final.html looks like after running the program:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<title>the_delinquent.txt for The Delinquent</title>
<style type = "text/css">
body {font-family: "Times New Roman, serif"; font-size: 14 or 18; text-align: justify;};
p { margin-left: 1%; margin-right: 1%; }
</style>
</head><body>
</body></html>
I know content is not empty because I did this:
for (String k: content) {
System.out.println(k + "\n");
bw.write(k + "\n");
}
Everything printed out. So weird :(
Your code works. The only thing that would prevent content from being written is an empty content array, one with no elements.
Your code is basically correct; maybe the content array is empty.
The following is in a modernized Java style.
public void writeHtml(String[] content) {
Path file = Paths.get("final.html");
try (BufferedWriter fw = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
fw.write("<!DOCTYPE html>\n"
+ "<html>\n"
+ "<head>\n"
+ "<meta charset=\"UTF-8\">\n"
+ "<title>" + fileName + " for The Delinquent</title>\n"
+ "<style type = \"text/css\">\n"
+ "body {font-family: \"Times New Roman, serif\";"
+ " font-size: 14 or 18; text-align: justify;};\n"
+ "p { margin-left: 1%; margin-right: 1%; }\n"
+ "</style>\n"
+ "</head><body>");
fw.write("Content has: " + content.length + " strings.<br>\n");
for (String k : content) {
fw.write("* " + k + "<br>\n");
}
fw.write("</body></html>\n");
} catch (IOException e) {
System.out.println("Error " + e.getMessage());
}
}
FileWriter is an old utility class that uses the platform default charset, so it is not portable. It is better to specify the charset explicitly so that it matches the charset declared in the HTML.
The encoding UTF-8 covers the full Unicode range of characters, such as the comma-like (curly) quotes typical of MS Word. Java also uses Unicode internally, so it is a good match.
HTML 5 is the latest HTML version and is now generally available.
At one spot a typo, /n, had crept in.
In HTML, multiple spaces and line breaks are collapsed into a single space, so I added <br> for a line break.
Normally one would add throws IOException to the method header to let the caller handle any irregularities.
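A minimal sketch of that last suggestion, assuming a hypothetical class HtmlWriterSketch and passing the target path and title as parameters instead of using the fileName field from the question. The method declares IOException rather than catching it, and the caller (here main) decides what to do on failure.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical class name, used only for this illustration.
final class HtmlWriterSketch {

    // Declares IOException instead of catching it, so the caller decides how to react.
    static void writeHtml(Path file, String title, String[] content) throws IOException {
        try (BufferedWriter fw = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            fw.write("<!DOCTYPE html>\n<html>\n<head>\n"
                    + "<meta charset=\"UTF-8\">\n"
                    + "<title>" + title + "</title>\n"
                    + "</head><body>\n");
            for (String k : content) {
                fw.write(k + "<br>\n"); // <br> keeps each string on its own line in the browser
            }
            fw.write("</body></html>\n");
        }
    }

    public static void main(String[] args) {
        try {
            writeHtml(Paths.get("final.html"), "demo", new String[] {"first", "second"});
        } catch (IOException e) {
            System.err.println("Could not write final.html: " + e.getMessage());
        }
    }
}
```

If the body of the resulting file is still empty, the array passed in really has no elements at the time of the call.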
I'm trying to convert html string with Japanese character to PDF using YaHP Html to Pdf Converter.
I am using Eclipse Photon Release (4.8.0)
Here is my main class that invokes the YaHP Html :
public static void main(String[] args) {
String pdfOutFileName = "C:\\test\\JP-Test.pdf";
double pageHeight = 80;
String htmlContent = "<html>\r\n" +
" <head>\r\n" +
" <meta http-equiv=Content-Type content=\"text/html; charset=UTF-8\">\r\n" +
" <style type=\"text/css\">\r\n" +
" span.cls_005hr{font-family:Arial,serif;font-size:16.8px;color:rgb(50,50,50);font-weight:normal;font-style:normal;text-decoration: none}\r\n" +
" div.cls_005hr{font-family:Arial,serif;font-size:14.8px;color:rgb(50,50,50);font-weight:normal;font-style:normal;text-decoration: none}\r\n" +
" </style>\r\n" +
" </head>\r\n" +
" <body>\r\n" +
" <table border=0 cellpadding=0 cellspacing=0 width=720>\r\n" +
" <col width=10 >\r\n" +
" <col width=710 >\r\n" +
" <tr>\r\n" +
" <td valign=\"middle\" height=\"80\" bgcolor=\"#f0f0f0\">\r\n" +
" <div><span class=\"cls_005hr\">JPTesting</span></div>\r\n" +
" </td>\r\n" +
" <td valign=\"middle\" height=\"80\" bgcolor=\"#f0f0f0\">\r\n" +
" <div><span class=\"cls_005hr\">株式会社 ビー・エス・デーインフォメーションテクノロジー</span></div>\r\n" +
" </td>\r\n" +
" </span>\r\n" +
" </tr>\r\n" +
" </table>\r\n" +
" </body>\r\n" +
"</html>";
System.out.println("htmlContent: [" + htmlContent + "]");
try {
ByteArrayOutputStream outFormPDF = new ByteArrayOutputStream();
outFormPDF = PDFUtil.convertHtmlToPDF(htmlContent, pageHeight);
byte[] bOutFormPDF = outFormPDF.toByteArray();
OutputStream os = new FileOutputStream(pdfOutFileName);
os.write(bOutFormPDF);
System.out.println("Successfully Finished writing PDF to output file");
os.close();
} catch (Exception e) {
System.out.println(e.getMessage());
}
}
and here is the PDFUtil class method that calls YaHP Converter
public static ByteArrayOutputStream convertHtmlToPDF (String htmlContent, double pageHeight) throws CConvertException, IOException {
ByteArrayOutputStream outFormPDF = new ByteArrayOutputStream();
Scanner scanner = new Scanner(htmlContent).useDelimiter("\\Z");
String htmlContents = scanner.next();
CYaHPConverter converter = new CYaHPConverter();
Map properties = new HashMap();
List headerFooterList = new ArrayList();
URL resource = PDFUtil.class.getClassLoader().getResource("fonts");
String fontDirectory = resource.getPath() ;
properties.put(IHtmlToPdfTransformer.PDF_RENDERER_CLASS, IHtmlToPdfTransformer.FLYINGSAUCER_PDF_RENDERER);
properties.put(IHtmlToPdfTransformer.FOP_TTF_FONT_PATH, fontDirectory);
PageSize pageSize = IHtmlToPdfTransformer.LEGALP;
if (pageHeight>0) {
String sHeight = Double.toString(pageHeight);
sHeight = sHeight.substring(0,sHeight.indexOf("."));
pageHeight = Double.parseDouble(sHeight);
System.out.println ("pageHeight : " + pageHeight);
pageSize = new PageSize(21.6d, pageHeight, 0.7d, 0.5d, 1.5d, 1.5d);
}
System.out.println ("Calling converter.convertToPdf");
converter.convertToPdf(htmlContents,
pageSize,
headerFooterList,
"file://tmp/Html2PdfConvertTemp",
outFormPDF,
properties);
System.out.println ("Successfully Called converter.convertToPdf");
scanner.close();
return outFormPDF;
}
For some reason, the output PDF file contains "JPTesting", but does not contain the Japanese letters : "株式会社 ビー・エス・デーインフォメーションテクノロジー" .
Any help would be much appreciated.
Found the solution. Posting my solution here in case anyone else may struggle with the same issue that I had.
I downloaded a Japanese font from Google Fonts: https://fonts.google.com/?subset=japanese (I picked Shippori Mincho B1) and added it to my font resource directory.
Updated the CSS in my HTML to pick up the new font:
span.cls_005jp{font-family:Shippori Mincho B1, Arial,serif;...
Updated my HTML to use the new font for tags that may contain Japanese letters:
" <div><span class=\"cls_005jp\">株式会社 ビー・エス・デーインフォメーションテクノロジー</span></div>\r\n" +
Thank you, @g00se, for pointing me in the right direction!
I am generating a PDF file from my HTML string, but the content of the generated PDF does not match the HTML: the PDF shows seemingly random content. I read about the issue online, and the suggestion was to use Unicode notation like %u0627%u0646%u0627%20%u0627%u0633%u0645%u0649%20%u0639%u0628%u062F%u0627%u0644%u0644%u0647. But when I put this into my HTML, it is printed as-is.
related issue: Writing Arabic in pdf using itext
package com.example.demo;
import com.itextpdf.html2pdf.ConverterProperties;
import com.itextpdf.html2pdf.HtmlConverter;
import com.itextpdf.styledxmlparser.css.media.MediaDeviceDescription;
import com.itextpdf.styledxmlparser.css.media.MediaType;
import com.itextpdf.html2pdf.resolver.font.DefaultFontProvider;
import com.itextpdf.layout.font.FontProvider;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
@SpringBootApplication
public class DemoApplication {
public static void main(String[] args) throws IOException {
SpringApplication.run(DemoApplication.class, args);
String htmlSource = getContent();
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
ConverterProperties converterProperties = new ConverterProperties();
FontProvider dfp = new DefaultFontProvider(true, false, false);
dfp.addFont("/Library/Fonts/Arial.ttf");
converterProperties.setFontProvider(dfp);
converterProperties.setMediaDeviceDescription(new MediaDeviceDescription(MediaType.PRINT));
HtmlConverter.convertToPdf(htmlSource, outputStream, converterProperties);
byte[] bytes = outputStream.toByteArray();
File pdfFile = new File("java19.pdf");
FileOutputStream fos = new FileOutputStream(pdfFile);
fos.write(bytes);
fos.flush();
fos.close();
}
private static String getContent() {
return "<!DOCTYPE html>\n" +
"<html lang=\"en\">\n" +
"\n" +
"<head>\n" +
" <meta charset=\"UTF-8\">\n" +
" <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">\n" +
" <meta http-equiv=\"X-UA-Compatible\" content=\"ie=edge\">\n" +
" <title>Document</title>\n" +
" <style>\n" +
" @page {\n" +
" margin: 0;\n" +
" font-family: arial;\n" +
" }\n" +
" </style>\n" +
"</head>\n" +
"\n" +
"<body\n" +
" style=\"margin: 0;padding: 0;font-family: arial, sans-serif;font-size: 14px;line-height: 125%;width: 100%;-ms-text-size-adjust: 100%;-webkit-text-size-adjust: 100%;color: #222222;\">\n" +
" <table cellpadding=\"0\" cellspacing=\"0\" width=\"100%\" style=\"background: white; direction: rtl;\">\n" +
" <tbody>\n" +
" <tr>\n" +
" <td style=\"padding: 0 35px;\">\n" +
" <p> انا اسمى عبدالله\n" +
" </p>\n" +
" </td>\n" +
" </tr>\n" +
" </tbody>\n" +
" </table>\n" +
"\n" +
"</body>\n" +
"\n" +
"</html>";
}
}
It's difficult to determine what the issue is exactly without seeing the faulty output. But your "random content" sounds like an encoding issue.
Since you have your Arabic content directly in your source code, you have to be careful about encoding. For example, using ISO-8859-1, the resulting PDF output is:
Using Unicode escape sequences (\uXXXX), you can indeed avoid some of these encoding issues. Replacing
" <p> انا اسمى عبدالله\n" +
with
" <p>\u0627\u0646\u0627 \u0627\u0633\u0645\u0649 \u0639\u0628\u062F\u0627\u0644\u0644\u0647" +
results in Arabic glyphs, even when using ISO-8859-1 encoding. Alternatively, you can use UTF-8 to get the correct content regardless of the use of Unicode escape sequences.
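To illustrate how such "random content" can arise, here is a small sketch (the class and method names are made up for this example). It encodes three Arabic letters as UTF-8 bytes and then decodes those bytes with the wrong charset. The text is written with Unicode escapes so the example itself does not depend on the source file encoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class name, used only for this illustration.
final class EncodingMismatchSketch {

    // Three Arabic letters (alef, noon, alef), written with Unicode escapes
    // so that this source file stays pure ASCII.
    static final String ARABIC = "\u0627\u0646\u0627";

    // Encodes the text as UTF-8 and decodes the bytes with the given charset.
    static String decodeUtf8BytesAs(Charset charset) {
        byte[] utf8 = ARABIC.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, charset);
    }

    public static void main(String[] args) {
        String wrong = decodeUtf8BytesAs(StandardCharsets.ISO_8859_1);
        String right = decodeUtf8BytesAs(StandardCharsets.UTF_8);
        // Each Arabic letter takes two bytes in UTF-8, so the wrong decoding
        // yields six unrelated Latin-1 characters instead of three letters.
        System.out.println(wrong.length());       // 6
        System.out.println(right.equals(ARABIC)); // true
    }
}
```

The same kind of mismatch happens when the compiler reads a UTF-8 source file as ISO-8859-1, which is why the escaped form is immune to it.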
When your encoding issues are solved, you will likely get output like this:
For correct rendering of certain writing systems, an optional module pdfCalligraph is needed for iText 7. With this module enabled, the resulting output looks like this:
The code used for the tests above:
public static void main(String[] args) throws IOException {
// Needed for pdfCalligraph
LicenseKey.loadLicenseFile("all-products.xml");
File pdfFile = new File("java19.pdf");
OutputStream outputStream = new FileOutputStream(pdfFile);
String htmlSource = getContent();
ConverterProperties converterProperties = new ConverterProperties();
FontProvider dfp = new DefaultFontProvider(true, false, false);
dfp.addFont("/Library/Fonts/Arial.ttf");
converterProperties.setFontProvider(dfp);
converterProperties.setMediaDeviceDescription(new MediaDeviceDescription(MediaType.PRINT));
HtmlConverter.convertToPdf(htmlSource, outputStream, converterProperties);
}
private static String getContent() {
return "<!DOCTYPE html>\n" +
"<html lang=\"en\">\n" +
"\n" +
"<head>\n" +
" <meta charset=\"UTF-8\">\n" +
" <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">\n" +
" <meta http-equiv=\"X-UA-Compatible\" content=\"ie=edge\">\n" +
" <title>Document</title>\n" +
" <style>\n" +
" @page {\n" +
" margin: 0;\n" +
" font-family: arial;\n" +
" }\n" +
" </style>\n" +
"</head>\n" +
"\n" +
"<body\n" +
" style=\"margin: 0;padding: 0;font-family: arial, sans-serif;font-size: 14px;line-height: 125%;width: 100%;-ms-text-size-adjust: 100%;-webkit-text-size-adjust: 100%;color: #222222;\">\n" +
" <table cellpadding=\"0\" cellspacing=\"0\" width=\"100%\" style=\"background: white; direction: rtl;\">\n" +
" <tbody>\n" +
" <tr>\n" +
" <td style=\"padding: 0 35px;\">\n" +
// Arabic content
// " <p> انا اسمى عبدالله\n" +
// Arabic content with Unicode escape sequences
" <p>\u0627\u0646\u0627 \u0627\u0633\u0645\u0649 \u0639\u0628\u062F\u0627\u0644\u0644\u0647" +
" </p>\n" +
" </td>\n" +
" </tr>\n" +
" </tbody>\n" +
" </table>\n" +
"\n" +
"</body>\n" +
"\n" +
"</html>";
}
Please check to make sure that your sourcefile and compiler use the same encoding, e.g. UTF-8. I sometimes check that by including characters that are only available in unicode and not in other classic codepages.
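One way to run such a check, sketched here with a made-up helper class: print the JVM default charset and the code points of a string literal. If a literal typed directly in the source shows code points other than the ones you expect (for Arabic letters, values in the U+06xx range), the compiler most likely read the file with the wrong encoding.

```java
import java.nio.charset.Charset;

// Hypothetical helper, used only for this illustration.
final class CodePointDump {

    // Returns the code points of s in "U+XXXX" form, separated by spaces.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + Charset.defaultCharset());
        // Replace this escaped text with a literal typed directly in your source file.
        System.out.println(describe("\u0627\u0646\u0627")); // U+0627 U+0646 U+0627
    }
}
```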
I tried to reproduce the issue and I got the following warning in the logging when running the example code:
Cannot find pdfCalligraph module, which was implicitly required by one of the layout properties
This was already mentioned by Alexsey Subach and can cause the following issue:
Problems with text direction (I am no expert on Arabic but the text was aligned to the right)
Wrong combination of characters (For the details see this document: https://itextpdf.com/sites/default/files/2018-12/iText_pdfCalligraph_4pager.pdf )
This is the output I got without pdfCalligraph:
pdf result without calligraph
Created with the codebase on this repository
So in order to get everything to work perfectly like your browser does with the HTML for Arabic you will also need:
A commercial license for https://itextpdf.com/en/products/itext-7/pdfcalligraph
Code to load the license file (or you will get a LicenseFileNotLoadedException )
This dependency https://repo.itextsupport.com/releases/com/itextpdf/typography/2.0.6/
Your question is tagged as regarding iText7 but there may be other possible free alternatives depending on your requirements like Apache FOP that should work with Arabic Ligatures according to this source but probably require rework as it is based on XSL-FO. In theory you could generate the XSL-FO with any templating mechanism that you currently use e.g.: JSP/JSF/Thymeleaf etc. and use something like a ServletFilter to convert the XSL-FO to a PDF on the fly during a request (in a web application)
Make sure your fonts support the characters you need and if you use Maven resource directory to include extra fonts during the build check that the font file is not filtered (properties replacement) as that corrupts the file: Maven corrupting binary files in source/main/resources when building jar
I need to send an HTML e-mail from Maximo, but only the text arrives, without the tags and without the table. How can I do it correctly?
Code example:
package com.vetasi.testPackage;
import psdi.server.MXServer;
import psdi.server.SimpleCronTask;
import psdi.util.logging.MXLogger;
import psdi.util.logging.MXLoggerFactory;
public class Test1 extends SimpleCronTask{
MXLogger logger = MXLoggerFactory.getLogger("com.test.TestReportCron");
@Override
public void cronAction(){
String message1 = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">\r\n" +
"<html xmlns=\"http://www.w3.org/1999/xhtml\">\r\n" +
" <head>\r\n" +
" <meta http-equiv=\"Content-Type\" content=\"text/html; charset=UTF-8\" />\r\n" +
" <title>Title</title>\r\n" +
" <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\"/>\r\n" +
"</head>\r\n" +
"<body style=\"margin: 0; padding: 0;\">\r\n" +
" <table border=\"1\" cellpadding=\"0\" cellspacing=\"0\" width=\"100%\">\r\n" +
" <tr>\r\n" +
" <td>\r\n" +
" Hello!\r\n" +
" </td>\r\n" +
" </tr>\r\n" +
" </table>\r\n" +
"</body>\r\n" +
"</html>" ;
try{
MXServer.sendEMail("Misha89uatest@gmail.com", "maxadmin@us.ibm.com", "Hello my friend", message1);
}catch (Exception e) {
logger.error(e.getStackTrace());
e.printStackTrace();
}
}
}
Result:
Comes e-mail
When I use the JavaMail API, everything works fine, but that takes much more code, and I would prefer to use the Maximo methods. Does anybody have any ideas?
Since the label is created inside an array loop, it repeats until the end of the loop. But Room 1 should not be repeated; it should be placed once at the top.
first[ind] = new JLabel("<html>"
+ "<body>"
+ "<div id=r12style=border: 3px solid orange; margin-bottom: 5px;>"
+ " <h2>"
+ " Room 1"
+ " </h2>"
+ "<img src=" + icon + " width=\"95\" height=\"105\"></img>"
+ "</div>"
+ "</body>"
+ "</html>");
Try this (I haven't tested it):
IMM[ind] = new JLabel("<html><style >#aa {margin-left:25px;}</style>"
+ "<div id=\"aa\"></font><font color=\"rgb(0,0,0)\"size=\"5\">"
+ bed_no + "<font color=\"rgb(255, 204, 204, 150)\"size=\"1\">.</div></font></html>"
, "<html><img src=" + icon + "></html>", JLabel.LEFT_ALIGNMENT);
If you want the text under the icon:
IMM[ind].setHorizontalTextPosition(JLabel.CENTER);
IMM[ind].setVerticalTextPosition(JLabel.BOTTOM);
Edit: It's probably a good idea to break up the JLabel text by using a String variable, because as written it is almost unreadable.
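A rough sketch of that idea, using a made-up helper class and a placeholder icon URL. The markup is assembled in a separate method, with the style and attribute values quoted, and the resulting String would then be passed to the JLabel constructor.

```java
// Hypothetical helper, used only for this illustration.
final class RoomLabelHtml {

    // Builds the label markup in one place so the quoting stays readable.
    static String roomHtml(String title, String iconUrl) {
        String style = "border: 3px solid orange; margin-bottom: 5px;";
        return "<html><body>"
                + "<div style=\"" + style + "\">"
                + "<h2>" + title + "</h2>"
                + "<img src=\"" + iconUrl + "\" width=\"95\" height=\"105\">"
                + "</div>"
                + "</body></html>";
    }

    public static void main(String[] args) {
        String html = roomHtml("Room 1", "file:icon.png"); // placeholder icon URL
        System.out.println(html);
        // In the Swing code this string would be passed on, e.g. new JLabel(html).
    }
}
```

Building the "Room 1" heading once, outside the loop that fills the array, is one way to keep it from repeating for every element.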
Recently I was recommended to use JSoup to parse and modify HTML documents.
However, what if I have an HTML document that I want to modify (to send, store somewhere else, etc.)? How might I go about doing that without changing the original document?
Say I have an HTML file like so:
<html>
<head></head>
<body>
<p></p>
<h2>Title: title</h2>
<p></p>
<p>Name: </p>
<p>Address: </p>
<p>Phone Number: </p>
</body>
</html>
And I want to fill in the appropriate data for Name, Address, Phone Number and any other information I'd like, without modifying the original HTML file, how might I go about that using JSoup?
A possible simpler solution is to modify your template to have placeholders like:
<html>
<head></head>
<body>
<p></p>
<h2>Title: title</h2>
<p></p>
<p>Name: <span id="name"></span></p>
<p>Address: <span id="address"></span></p>
<p>Phone Number: <span id="phone"></span></p>
</body>
</html>
Then load your document this way:
Document doc = Jsoup.parse("" +
"<html>\n" +
" <head></head>\n" +
" <body> \n" +
" <p></p>\n" +
" <h2>Title: title</h2>\n" +
" <p></p>\n" +
" <p>Name: <span id=\"name\"></span></p>\n" +
" <p>Address: <span id=\"address\"></span></p>\n" +
" <p>Phone Number: <span id=\"phone\"></span></p>\n" +
" </body>\n" +
"</html>");
doc.getElementById("name").text("Andrey");
doc.getElementById("address").text("Stackoverflow.com");
doc.getElementById("phone").text("secret!");
System.out.println(doc.html());
And this would give the form filled out.
@MarcoS had an excellent solution using a NodeTraversor to make a list of nodes to change at https://stackoverflow.com/a/6594828/1861357, and I only very slightly modified his method, which replaces a node (a set of tags) with the data in the node plus whatever information you would like to add.
To store a String in memory I used a static StringBuilder to save the HTML in memory.
First we read in the HTML file (that is manually specified, this can be changed), then we make a series of checks to change whatever nodes with any data that we want.
The one problem that I didn't fix in the solution by MarcoS was that it split each individual word, instead of looking at a line. However I just used '-' for multiple words, because otherwise it places the string directly after that word.
So a full implementation:
import java.util.*;
import org.jsoup.Jsoup;
import org.jsoup.nodes.*;
import org.jsoup.select.*;
import java.io.*;
public class memoryHTML
{
static String htmlLocation = "C:\\Users\\User\\";
static String fileName = "blah"; // Just for demonstration, easily modified.
static StringBuilder buildTmpHTML = new StringBuilder();
static StringBuilder buildHTML = new StringBuilder();
static String name = "John Doe";
static String address = "42 University Dr., Somewhere, Someplace";
static String phoneNumber = "(123) 456-7890";
public static void main(String[] args)
{
// You can send it the full path with the filename. I split them up because I used this for multiple files.
readHTML(htmlLocation, fileName);
modifyHTML();
System.out.println(buildHTML.toString());
// You need to clear the StringBuilder Object or it will remain in memory and build on each run.
buildTmpHTML.setLength(0);
buildHTML.setLength(0);
System.exit(0);
}
// Simply parse and build a StringBuilder for a temporary HTML file that will be modified in modifyHTML()
public static void readHTML(String directory, String fileName)
{
try
{
BufferedReader br = new BufferedReader(new FileReader(directory + fileName + ".html"));
String line;
while((line = br.readLine()) != null)
{
buildTmpHTML.append(line);
}
br.close();
}
catch (Exception e)
{
e.printStackTrace();
System.exit(1);
}
}
// Excellent method of parsing and modifying nodes in HTML files by @MarcoS at https://stackoverflow.com/a/6594828/1861357
// It has its small problems, but it does the trick.
public static void modifyHTML()
{
String htmld = buildTmpHTML.toString();
Document doc = Jsoup.parse(htmld);
final List<TextNode> nodesToChange = new ArrayList<TextNode>();
NodeTraversor nd = new NodeTraversor(new NodeVisitor()
{
@Override
public void tail(Node node, int depth)
{
if (node instanceof TextNode)
{
TextNode textNode = (TextNode) node;
nodesToChange.add(textNode);
}
}
@Override
public void head(Node node, int depth)
{
}
});
nd.traverse(doc.body());
for (TextNode textNode : nodesToChange)
{
Node newNode = buildElementForText(textNode);
textNode.replaceWith(newNode);
}
buildHTML.append(doc.html());
}
private static Node buildElementForText(TextNode textNode)
{
String text = textNode.getWholeText();
String[] words = text.trim().split(" ");
Set<String> units = new HashSet<String>();
for (String word : words)
units.add(word);
String newText = text;
for (String rpl : units)
{
if(rpl.contains("Name"))
newText = newText.replaceAll(rpl, "" + rpl + " " + name);
if(rpl.contains("Address") || rpl.contains("Residence"))
newText = newText.replaceAll(rpl, "" + rpl + " " + address);
if(rpl.contains("Phone-Number") || rpl.contains("PhoneNumber"))
newText = newText.replaceAll(rpl, "" + rpl + " " + phoneNumber);
}
return new DataNode(newText, textNode.baseUri());
}
}
And you'll get this HTML back (remember I changed "Phone Number" to "Phone-Number"):
<html>
<head></head>
<body>
<p></p>
<h2>Title: title</h2>
<p></p>
<p>Name: John Doe </p>
<p>Address: 42 University Dr., Somewhere, Someplace</p>
<p>Phone-Number: (123) 456-7890</p>
</body>
</html>
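The word-splitting limitation mentioned above (having to write "Phone-Number" instead of "Phone Number") could be avoided by matching a label prefix against the whole text of a node instead of against single words. The following is only a sketch of that idea with made-up names. It is plain string handling and does not touch jsoup, but its result could be used as the replacement text for a text node.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, used only for this illustration.
final class LabelFillSketch {

    // If the trimmed text starts with a known "Label:" prefix, returns the label
    // followed by its value; otherwise returns the text unchanged.
    static String fill(String text, Map<String, String> values) {
        String trimmed = text.trim();
        for (Map.Entry<String, String> e : values.entrySet()) {
            String prefix = e.getKey() + ":";
            if (trimmed.startsWith(prefix)) {
                return prefix + " " + e.getValue();
            }
        }
        return text;
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("Name", "John Doe");
        values.put("Phone Number", "(123) 456-7890");
        System.out.println(fill("Phone Number: ", values)); // Phone Number: (123) 456-7890
        System.out.println(fill("Title: title", values));   // Title: title
    }
}
```

Because the label is compared as a whole, labels that contain spaces need no special spelling in the template.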