File not found in project when parsing - java

I am trying to parse a local HTML file with the jsoup library, but my IDE can't find the file.
Project structure:
- ParsingHTML
- .idea
- files
- index.html
- login.html
- libs
- out
- src
- Parsing.java
Here is my code:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import java.io.File;
import java.io.IOException;
public class Parsing {
public static void main(String[] args) {
try {
File input = new File("./../files/index.html");
Document doc = Jsoup.parse(input, "UTF-8", "https://jsoup.com/");
System.out.println(doc);
} catch (IOException e) {
e.printStackTrace();
}
}
}
How can I fix it?
P.S.:
I changed the directory name from file to files and the file name from "input" to "index".

You are getting this error because that location does not exist.
Try:
File input = new File("files/input.html");
or
Path input = Paths.get("files","input.html");
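A relative path such as files/index.html is resolved against the JVM's working directory (the user.dir system property), not against the folder that holds the .java file. IDEs typically start the program in the project root, which would explain why ./../files/... misses. Here is a minimal sketch for checking where a relative path actually lands. The class and method names are made up for illustration, and the file name is the one from the edited question.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original question or answers.
class PathCheck {

    // Resolves a relative path against the JVM working directory, which is
    // also what new File(relative) uses when the file is accessed.
    static Path resolveAgainstWorkingDir(String relative) {
        Path workingDir = Paths.get(System.getProperty("user.dir"));
        return workingDir.resolve(relative).normalize();
    }

    public static void main(String[] args) {
        String relative = "files/index.html"; // file name taken from the question
        Path resolved = resolveAgainstWorkingDir(relative);
        System.out.println("Working directory: " + System.getProperty("user.dir"));
        System.out.println("Resolved path:     " + resolved);
        System.out.println("Exists:            " + new File(relative).exists());
    }
}
```

If "Exists" prints false, comparing the resolved path with the real location on disk usually shows which segment is wrong.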

Use the working directory from the user.dir system property:
final String dir = System.getProperty("user.dir");
Example:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import java.io.File;
import java.io.IOException;
public class Parsing {
public static void main(String[] args) {
try {
final String dir = System.getProperty("user.dir");
final File input = new File(dir +"/files/input.html");
final Document doc = Jsoup.parse(input, "UTF-8", "https://jsoup.com/");
System.out.println(doc.toString());
} catch (IOException e) {
e.printStackTrace();
}
}
}

Your project structure shows that there is no "input.html" file, so the error is correct. Maybe you copied and pasted the jsoup example without changing the HTML file's name.
Also check:
new File("./../files/input.html");
In your project the folder name is file, not files.

Related

Java Creating copy of file before uploading to AWS cloud

I have an image in a directory.
I want to make a copy of that image under a different name in the same directory, without touching the original.
So there will be two identical images with different names in one folder.
This is roughly what I tried:
File source = new File("resources/"+getImage(0));
File dest = new File("resources/");
source.renameTo("resources/"+getImage(0)+);
try {
FileUtils.copyDirectory(source, dest);
} catch (IOException e) {
e.printStackTrace();
}
When I upload the same image to the Amazon server multiple times in an automated run, the upload starts to fail.
So we want to upload a fresh copy of the image every time.
In Eclipse there is generally a resources folder. I want to copy the original image before each upload and delete the copy after the upload.
Kindly suggest an approach.
You can just copy the file with Files.copy and StandardCopyOption.COPY_ATTRIBUTES:
public static final StandardCopyOption COPY_ATTRIBUTES
Copy attributes to the new file.
Files.copy(Paths.get("/path/to/file/and/filename"),
Paths.get("/path/to/file/and/newfilename"), StandardCopyOption.COPY_ATTRIBUTES);
Not a perfect solution, but instead of handling the pop-up box we can pass the file path directly into the form. I have used a date stamp to create the new file names, but other logic could also be used, e.g. appending a random string.
import org.junit.jupiter.api.Test;
import java.io.*;
import java.nio.file.Files;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
public class Upload {
private static final String SRC_RESOURCES_FILE_PATH = System.getProperty("user.dir")+"/src/resources/";
File s1 = new File(SRC_RESOURCES_FILE_PATH+"Img1.png");
File s2 = new File(SRC_RESOURCES_FILE_PATH+"Img"+getDateStamp()+".png");
@Test
public void uploadFunction() throws IOException {
copyFileUsingJava7Files(s1,s2);
}
private String getDateStamp(){
DateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd");
Date date = new Date();
return dateFormat.format(date).toString();
}
private static void copyFileUsingJava7Files(File source, File dest)
throws IOException {
Files.copy(source.toPath(), dest.toPath());
}
}
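One caveat with a yyyy-MM-dd stamp: a second run on the same day produces the same target name, and Files.copy without REPLACE_EXISTING then throws FileAlreadyExistsException. The sketch below shows the "random string" variant mentioned above, with the copy deleted after use as the question asks. The class name, method name, and the temporary stand-in file are assumptions for illustration only.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical helper; names are illustrative only.
class UniqueCopy {

    // Copies source to a sibling file whose name carries a random suffix,
    // so repeated runs do not collide the way a per-day date stamp can.
    static Path copyWithUniqueName(Path source) throws IOException {
        String name = source.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        String ext = dot > 0 ? name.substring(dot) : "";
        Path target = source.resolveSibling(base + "-" + UUID.randomUUID() + ext);
        return Files.copy(source, target);
    }

    public static void main(String[] args) throws IOException {
        // A temporary file stands in for the real image in this sketch.
        Path original = Files.createTempFile("Img1", ".png");
        Path copy = copyWithUniqueName(original);
        try {
            System.out.println("Upload this file: " + copy);
            // ... the upload step would go here ...
        } finally {
            Files.deleteIfExists(copy);     // remove the copy after the upload
            Files.deleteIfExists(original); // cleanup of the stand-in file only
        }
    }
}
```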

How to Open a PDF at a Named Destination

I need to write a Java program that opens a PDF file at a named destination. The file test.pdf contains the named destination "DestinationX" on page 2. The program opens the PDF file but does not go to the named destination. How do I get to the named destination?
import java.awt.Desktop;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
public class MyLauncher {
static void openFileAtNamedDest(){
if (Desktop.isDesktopSupported()) {
try {
URI myURI = new URI("file:///C:/test.pdf#nameddest=DestinationX");
Desktop.getDesktop().browse( myURI );
} catch (IOException e) {
e.printStackTrace();
}
catch (URISyntaxException e) {
e.printStackTrace();
}
}
}
public static void main(String[] args) {
openFileAtNamedDest();
}
}
According to the spec, the format of your URL is correct. The only question is what application you are actually launching via browse(). I think it acts the same way as if you had double-clicked the file's icon on your desktop: it will launch whatever application is registered as the default handler for PDFs.
Acrobat should be able to handle a URL with a named destination, but other PDF viewers may not support it.
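If the default handler ignores the fragment, one possible workaround is to start the viewer directly. As far as I recall, Adobe's "PDF Open Parameters" document describes a /A command-line switch for Acrobat and Adobe Reader that accepts the same nameddest parameter. The sketch below rests on that assumption. The executable path is hypothetical and depends on the installed version, and the class and method names are made up.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical launcher; assumes a viewer that understands the /A switch.
class AcrobatLauncher {

    // Builds the command: <viewer> /A nameddest=<dest> <pdf>
    static List<String> buildCommand(String viewerExe, String pdfPath, String namedDest) {
        return Arrays.asList(viewerExe, "/A", "nameddest=" + namedDest, pdfPath);
    }

    public static void main(String[] args) throws IOException {
        // Assumed install location; adjust for the actual machine.
        String viewer = "C:\\Program Files\\Adobe\\Acrobat DC\\Acrobat\\Acrobat.exe";
        List<String> command = buildCommand(viewer, "C:\\test.pdf", "DestinationX");
        if (new File(viewer).isFile()) {
            new ProcessBuilder(command).start();
        } else {
            System.out.println("Viewer not found, would have run: " + command);
        }
    }
}
```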

Moving large files in java

I have to move files from one directory to another.
I am using a property file, so the source and destination paths are stored there.
I also have a property reader class.
My source directory contains lots of files. Each file should be moved to the other directory once its operation is complete.
Each file is larger than 500 MB.
import java.io.File;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import static java.nio.file.StandardCopyOption.*;
public class Main1
{
public static String primarydir="";
public static String secondarydir="";
public static void main(String[] argv)
throws Exception
{
primarydir=PropertyReader.getProperty("primarydir");
System.out.println(primarydir);
secondarydir=PropertyReader.getProperty("secondarydir");
File dir = new File(primarydir);
secondarydir=PropertyReader.getProperty("secondarydir");
String[] children = dir.list();
if (children == null)
{
System.out.println("does not exist or is not a directory");
}
else
{
for (int i = 0; i < children.length; i++)
{
String filename = children[i];
System.out.println(filename);
try
{
File oldFile = new File(primarydir,children[i]);
System.out.println( "Before Moving"+oldFile.getName());
if (oldFile.renameTo(new File(secondarydir+oldFile.getName())))
{
System.out.println("The file was moved successfully to the new folder");
}
else
{
System.out.println("The File was not moved.");
}
}
catch (Exception e)
{
e.printStackTrace();
}
}
System.out.println("ok");
}
}
}
My code is not moving the files to the correct path.
This is my property file:
primarydir=C:/Desktop/A
secondarydir=D:/B
The files should end up in the B folder. How can I do this? Can anyone help?
Change this:
oldFile.renameTo(new File(secondarydir+oldFile.getName()))
To this:
oldFile.renameTo(new File(secondarydir, oldFile.getName()))
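It's best not to join path segments by string concatenation. Here secondarydir is D:/B, so secondarydir+oldFile.getName() yields something like D:/Bname.txt, with no separator between the folder and the file name. The two-argument File constructor inserts the separator for you, and the correct separator may be platform-dependent.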
Edit: If you can use JDK 1.7 APIs, you can use Files.move() instead of File.renameTo()
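A minimal sketch of the Files.move() variant follows. Unlike File.renameTo(), which only returns false, Files.move() throws an IOException that says why the move failed. As far as I know, for a regular file the default file system provider falls back to copy-and-delete when source and target are on different drives, which is the C: to D: case here. The class name, method name, and temporary directories are assumptions for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; names are illustrative only.
class MoveExample {

    // Moves one file into targetDir, keeping its file name.
    static Path moveInto(Path file, Path targetDir) throws IOException {
        Files.createDirectories(targetDir);
        Path target = targetDir.resolve(file.getFileName());
        return Files.move(file, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Temporary directories stand in for primarydir and secondarydir.
        Path primary = Files.createTempDirectory("primary");
        Path secondary = Files.createTempDirectory("secondary");
        Path file = Files.createFile(primary.resolve("data.bin"));
        Path moved = moveInto(file, secondary);
        System.out.println("Moved to " + moved);
    }
}
```

Path.resolve adds the separator between directory and file name, so the concatenation problem does not arise.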
Code, a Java method:
/**
* Copy by channel transfer; use this for copies across partitions.
* @param sFile source file
* @param tFile target file
* @throws IOException if reading or writing fails
*/
public static void copyByTransfer(File sFile, File tFile) throws IOException {
// try-with-resources closes both streams (and their channels) even on failure
try (FileInputStream fInput = new FileInputStream(sFile);
FileOutputStream fOutput = new FileOutputStream(tFile)) {
FileChannel fReadChannel = fInput.getChannel();
FileChannel fWriteChannel = fOutput.getChannel();
long size = fReadChannel.size();
long position = 0;
// transferTo may move fewer bytes than requested, so loop until done
while (position < size) {
position += fReadChannel.transferTo(position, size - position, fWriteChannel);
}
}
}
The method uses NIO, which lets the operating system perform the transfer and can improve performance.
Here are the imports:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
If you are using Eclipse, just press Ctrl+Shift+O to organize the imports.

Convert DOCX to HTML including images

I am using docx4j to convert DOCX to HTML. The conversion works and I get the HTML output, which I will embed as the body of an email. But I have the following issues:
- Images are not displayed in the email body
- Spaces and bullets are lost
Here is the code I have written:
WordprocessingMLPackage wordMLPackage;
wordMLPackage = Docx4J.load(new java.io.File(resourcePath2));
HTMLSettings htmlSettings = Docx4J.createHTMLSettings();
htmlSettings.setImageDirPath(imageFolder + resourcePath2 + "_files");
htmlSettings.setImageTargetUri(imageFolder +resourcePath2.substring(resourcePath2.lastIndexOf("/")+1) + "_files");
htmlSettings.setWmlPackage(wordMLPackage);
OutputStream os;
os = new ByteArrayOutputStream();
Docx4jProperties.setProperty("docx4j.Convert.Out.HTML.OutputMethodXML", true);
Docx4J.toHTML(htmlSettings, os, Docx4J.FLAG_SAVE_FLAT_XML);
DOCX = ((ByteArrayOutputStream)os).toString();
You could do the conversion like this instead, using the Apache POI XWPF converter from XDocReport:
package tcg.doc.web.managedBeans;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.poi.xwpf.converter.core.FileImageExtractor;
import org.apache.poi.xwpf.converter.core.FileURIResolver;
import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.context.annotation.Scope;
import org.springframework.stereotype.Component;
@Component
@Scope("session")
@Qualifier("ConvertWord")
public class ConvertWord {
private static final String docName = "TestDocx.docx";
private static final String outputlFolderPath = "d:/";
String htmlNamePath = "docHtml.html";
String zipName="_tmp.zip";
File docFile = new File(outputlFolderPath+docName);
File zipFile = new File(zipName);
public void ConvertWordToHtml() {
try {
// 1) Load DOCX into XWPFDocument
InputStream doc = new FileInputStream(new File(outputlFolderPath+docName));
System.out.println("InputStream"+doc);
XWPFDocument document = new XWPFDocument(doc);
// 2) Prepare XHTML options (here we set the IURIResolver to load images from a "word/media" folder)
XHTMLOptions options = XHTMLOptions.create(); //.URIResolver(new FileURIResolver(new File("word/media")));;
// Extract image
String root = "target";
File imageFolder = new File( root + "/images/" + docName ); // use the document name, not the InputStream object
options.setExtractor( new FileImageExtractor( imageFolder ) );
// URI resolver
options.URIResolver( new FileURIResolver( imageFolder ) );
OutputStream out = new FileOutputStream(new File(htmlPath()));
XHTMLConverter.getInstance().convert(document, out, options);
System.out.println("OutputStream "+out.toString());
} catch (FileNotFoundException ex) {
ex.printStackTrace(); // do not swallow the error silently
} catch (IOException ex) {
ex.printStackTrace();
}
}
public static void main(String[] args) {
ConvertWord cwoWord=new ConvertWord();
cwoWord.ConvertWordToHtml();
System.out.println();
}
public String htmlPath(){
// d:/docHtml.html
return outputlFolderPath+htmlNamePath;
}
public String zipPath(){
// d:/_tmp.zip
return outputlFolderPath+zipName;
}
}
Maven dependency for pom.xml:
<dependency>
<groupId>fr.opensagres.xdocreport</groupId>
<artifactId>org.apache.poi.xwpf.converter.xhtml</artifactId>
<version>1.0.4</version>
</dependency>
or download it from Here
For images to work in an email body, I guess you need to use either a data URI or publish them to a web-reachable location.
In either case, you'll need to write an implementation of:
public interface ConversionImageHandler {
/**
* @param picture
* @param relationship of the image
* @param part of the image, if it is an internal image, otherwise null
* @return uri for the image we've saved, or null
* @throws Docx4JException this exception will be logged, but not propagated
*/
public String handleImage(AbstractWordXmlPicture picture, Relationship relationship, BinaryPart part) throws Docx4JException;
}
and configure docx4j to use it with htmlSettings.setImageHandler.
You can look at some of the existing implementations in the docx4j source code, and take advantage of the helper methods in AbstractConversionImageHandler (eg createEncodedImage if you want data URIs).
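For the data URI route, the value that ends up in the img src attribute is just the image bytes in Base64 with a MIME type prefix. The sketch below shows only that encoding step with the JDK. It is not docx4j code and does not implement ConversionImageHandler. The class name, method name, and placeholder bytes are assumptions for illustration, and some mail clients may not render data URIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper; shows the data URI format only.
class DataUriExample {

    // Encodes raw image bytes as a data URI for use in <img src="...">.
    static String toDataUri(byte[] imageBytes, String mimeType) {
        String base64 = Base64.getEncoder().encodeToString(imageBytes);
        return "data:" + mimeType + ";base64," + base64;
    }

    public static void main(String[] args) {
        // Placeholder bytes, not a real PNG.
        byte[] fakeImage = "hello".getBytes(StandardCharsets.US_ASCII);
        // prints data:image/png;base64,aGVsbG8=
        System.out.println(toDataUri(fakeImage, "image/png"));
    }
}
```

In a real handler, the bytes would come from the image part that docx4j passes in, and the returned string would be the URI the handler gives back.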

JSoup error in data types

I have the following code that is supposed to extract data from an HTML document. I am using Eclipse. It gives me two errors (even though the code is copied and pasted from a tutorial on the jsoup site). The errors are on 1) File and 2) Elements. I can't see any problem with these two types.
import java.io.IOException;
import java.net.MalformedURLException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
public class TestClass
{
public static void main(String args[]) throws IOException
{
try{
File input = new File("/tmp/input.html");
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");
Element content = doc.getElementById("content");
Elements links = content.getElementsByTag("a");
for (Element link : links) {
String linkHref = link.attr("href");
String linkText = link.text();
}
}//try
catch (Exception e){//Catch exception if any
System.err.println("Error: " + e.getMessage());
}//catch
}
}
You forgot to import them.
import java.io.File;
import org.jsoup.select.Elements;
See also:
Java tutorial - Using package members
Hint: read the "Quick Fix" options suggested by Eclipse. It's already the 1st option for File.
