Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
I have been trying to read .doc and .docx files and assign their text to a String variable in Java, but I keep getting this error:
Exception in thread "main" java.lang.Error: Unresolved compilation problems:
The type org.apache.poi.poifs.filesystem.POIFSFileSystem cannot be resolved. It is indirectly referenced from required .class files
The type org.apache.poi.poifs.filesystem.DirectoryNode cannot be resolved. It is indirectly referenced from required .class files
I have the following code to test the program
import java.io.*;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
public class ReadDocFile
{
public static void main(String[] args)
{
File file = null;
WordExtractor extractor = null;
try
{
file = new File("c:\\test.doc");
FileInputStream fis = new FileInputStream(file.getAbsolutePath());
HWPFDocument document = new HWPFDocument(fis);
extractor = new WordExtractor(document);
String[] fileData = extractor.getParagraphText();
for (int i = 0; i < fileData.length; i++)
{
if (fileData[i] != null)
System.out.println(fileData[i]);
}
}
catch (Exception exep)
{
exep.printStackTrace();
}
}
}
I have downloaded a .jar file from
https://mvnrepository.com/artifact/org.apache.poi/poi-scratchpad/3.9
I think the .jar file that I imported is incomplete. If so, can anyone give me a link for the complete library?
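You need to include poi-3.15.jar in your classpath. The poi-scratchpad jar only contains the HWPF classes; POIFSFileSystem and DirectoryNode live in the core poi jar, so use the same version for both jars.
You can find all POI jars and dependencies as a single download here
If you are using Maven, add this dependency: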
<dependency>
<groupId>org.apache.poi</groupId>
<artifactId>poi</artifactId>
<version>3.15</version>
</dependency>
How can I get the latest tweet from HTML content, either with regex or otherwise without any external libraries? I am happy to use external libraries; I would just prefer not to. I just want to know how it would be possible. I have written the HTML download part in Java, and I will post it here if anyone wants it.
I'll write a bit of pseudocode so that I'm not only targeting Java developers. This is how my program looks so far:
1.)Load site("www.twitter.com/user123")
2.)Get initial string and write it to variable->buffer
3.)Loop start
4.) Append string->buffer
5.) If there is no more ->break
6.)print buffer
Obviously the variable buffer will now hold the raw HTML content. How can I sort through it to get the tweet? I have found a way, but it is too inconsistent: I located the string that held the tweets and took the content surrounded by that markup. However, too much changes in this section; some of the content inside it varies, like the font size. I could write multiple if statements, but is there a neater solution?
Let me just start off by saying that jsoup is an amazing lightweight HTML parsing library. You can use things like CSS selectors and whatnot. If you ever decide to use a library jsoup will make your life a lot easier.
You can just query for the element with the class of TweetTextSize, then get the text content. This will give you all text, hashtags, and links. (The downside being pictures are also given in links)
Otherwise, you'll need to manually traverse the DOM. For example, use regex to find the beginning of the first TweetTextSize, and then just keep all text which is not between a < and a >.
Unfortunately, this second solution is volatile and may not work in the future, and you'll end up with a big glob of code which is overly complex and hard to debug.
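The manual approach described above can be sketched with nothing but java.util.regex. This is only a rough illustration, assuming the tweet text sits in a p element whose attributes mention TweetTextSize; the class name TweetTextSketch, the method firstTweetText and the sample HTML are made up for the example, and real markup may differ or change at any time.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TweetTextSketch {
    // Matches the first <p ...TweetTextSize...>...</p>. DOTALL lets the body span
    // several lines, and the reluctant (.*?) stops at the first closing </p>.
    private static final Pattern TWEET = Pattern.compile(
            "<p[^>]*TweetTextSize[^>]*>(.*?)</p>",
            Pattern.DOTALL | Pattern.CASE_INSENSITIVE);

    /** Returns the text of the first matching paragraph with all tags removed, or null if none is found. */
    public static String firstTweetText(String html) {
        Matcher m = TWEET.matcher(html);
        if (!m.find()) {
            return null;
        }
        // Keep only the text that is not between a '<' and a '>'.
        return m.group(1).replaceAll("<[^>]*>", "").trim();
    }

    public static void main(String[] args) {
        // Hypothetical sample markup, not copied from the real site.
        String html = "<div><p class=\"TweetTextSize js-tweet-text\">"
                + "Hello <a href=\"/x\">#world</a></p></div>";
        System.out.println(firstTweetText(html));
    }
}
```

HTML entities such as &amp;amp; are left undecoded here, which is one more reason a real parser tends to be less fragile.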
Simple answer if you want a regex and not a sophisticated third party library.
<p[^>]+js-tweet-text[^>]*>(.*)</p>
Try the above on the "view-source" of https://twitter.com/a
Thanks.
EDIT:
Source Code:
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class TweetSucker {
public static void main(String[] args) throws Exception {
URLConnection urlConnection = new URL("https://twitter.com/a").openConnection();
InputStream inputStream = urlConnection.getInputStream();
// getContentEncoding() returns the Content-Encoding header (e.g. "gzip"), not the
// character set, so it cannot be passed to new String(); UTF-8 is assumed here instead.
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
byte[] buffer = new byte[8192];
int len = 0;
while ((len = inputStream.read(buffer)) != -1) {
byteArrayOutputStream.write(buffer, 0, len);
}
String htmlContent = new String(byteArrayOutputStream.toByteArray(), "UTF-8");
Pattern TWEET_PATTERN = Pattern.compile("(<p[^>]+js-tweet-text[^>]*>(.*)</p>)", Pattern.CASE_INSENSITIVE);
Matcher matcher = TWEET_PATTERN.matcher(htmlContent);
while (matcher.find()) {
System.out.println("Tweet Found: " + matcher.group(2));
}
}
}
I know that you don't want any libraries but if you want something really quick this is working code in C#:
using (IE browser = new IE())
{
browser.GoTo("https://twitter.com/user");
List tweets = browser.List(Find.ById("stream-items-id"));
if (tweets != null)
{
foreach (var tweet in tweets.ListItems)
{
var tweetText = tweet.Paras.FirstOrDefault();
if (tweetText != null)
{
MessageBox.Show(tweetText.Text);
}
}
}
}
This program uses a library called WatiN. If you use Visual Studio, go to the Tools menu, select "NuGet Package Manager", then "Manage NuGet Packages for Solution", then "Browse", and type "WatiN" in the search box. Once you find the library, hit "Install". After it is installed, add a reference in your code and then a using statement:
using WatiN.Core;
You can copy and paste the code above into a button handler and it will work; change the user name in the twitter.com/XXXXXX URL to list that user's tweets. Modify the code to meet your needs.
I want to read a delimiter-separated or fixed-width file (with a defined layout) and get something like a ResultSet that I can use to iterate through the records.
Is there any reliable library for doing this? If not, can anyone suggest how I should proceed? An example code snippet would be very helpful.
You can use Java I/O to iterate over each line in the text file and then implement your own logic to split the line as desired.
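import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;

public static void main(String[] args) throws Exception {
    // input file
    File inputFile = new File("c:/hadoop/sample.txt");
    // try-with-resources closes the reader when the loop finishes
    try (BufferedReader br = new BufferedReader(new FileReader(inputFile))) {
        String s;
        while ((s = br.readLine()) != null) {
            // split() takes a regex and "|" is a metacharacter, so it must be escaped
            String[] cols = s.split("\\|");
            // process cols according to your requirements
        }
    }
}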
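For the fixed-width half of the question, "your own logic" could be as small as cutting each line at known column widths. The following is a minimal sketch under that assumption; the class FixedWidthSketch, the method parseLine and the 10/2/2 layout are all hypothetical and only serve as an illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class FixedWidthSketch {
    /** Cuts one line into trimmed fields using the given column widths. */
    public static List<String> parseLine(String line, int[] widths) {
        List<String> fields = new ArrayList<>();
        int start = 0;
        for (int width : widths) {
            // Clamp the end so that short lines yield empty fields instead of throwing.
            int end = Math.min(start + width, line.length());
            fields.add(start < end ? line.substring(start, end).trim() : "");
            start += width;
        }
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical layout: name (10 chars), age (2 chars), state (2 chars).
        int[] widths = {10, 2, 2};
        String line = String.format("%-10s%2s%2s", "John", "25", "NY");
        System.out.println(parseLine(line, widths));
    }
}
```

Calling parseLine once per line read from the BufferedReader gives a list of records that can be iterated much like a result set.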
public static Stream<String> lines(Path path)
throws IOException
Read all lines from a file as a Stream. Bytes from the file are decoded into characters using the UTF-8 charset.
This method works as if invoking it were equivalent to evaluating the expression:
Files.lines(path, StandardCharsets.UTF_8)
Parameters:
path - the path to the file
Returns: the lines from the file as a Stream
Throws:
IOException - if an I/O error occurs opening the file
SecurityException - In the case of the default provider, and a security manager is installed, the checkRead method is invoked to check read access to the file.
Since: 1.8
Files.lines(Path) expects a Path argument and returns a Stream<String>.
This is Java 8, so you can use lambda expressions or method references to provide a Consumer argument.
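import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class FixedWidthFile {
    public static void main(String[] args) {
        Path path = Paths.get("/home/sample.txt");
        // try-with-resources closes the stream and the underlying file
        try (Stream<String> lines = Files.lines(path)) {
            lines.forEach(s -> System.out.println(s));
        } catch (IOException ex) {
            // report the failure instead of silently ignoring it
            ex.printStackTrace();
        }
    }
}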
Reference: Class Files
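To get closer to the record-by-record iteration the question asks for, the stream of lines can be mapped to arrays of columns. This is a sketch only: DelimitedRecordsSketch, parseRecords and readRecords are hypothetical names, and the approach assumes the delimiter never appears inside a field (no quoting or escaping is handled).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DelimitedRecordsSketch {
    /** Splits every line on the literal delimiter; the -1 limit keeps trailing empty columns. */
    public static List<String[]> parseRecords(Stream<String> lines, String delimiter) {
        String regex = Pattern.quote(delimiter); // treat "|" and similar characters literally
        return lines.map(line -> line.split(regex, -1))
                    .collect(Collectors.toList());
    }

    /** Reads a UTF-8 file and returns one String[] per line. */
    public static List<String[]> readRecords(Path path, String delimiter) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return parseRecords(lines, delimiter);
        }
    }

    public static void main(String[] args) {
        // In-memory sample lines stand in for a file here.
        for (String[] record : parseRecords(Stream.of("a|b|c", "1|2|"), "|")) {
            System.out.println(String.join(", ", record));
        }
    }
}
```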
The JSON example file consists of:
{
"1st_key": "value1",
"2nd_key": "value2",
"object_keys": {
"obj_1st": "value1",
"obj_2nd": "value2",
"obj_3rd": "value3"
}
}
I read the JSON file into a String with this StringBuilder method, in order to add the newlines into the string itself. So the String looks exactly like the JSON file above.
public String getJsonContent(String fileName) {
StringBuilder result = new StringBuilder("");
File file = new File(fileName);
try (Scanner scanner = new Scanner(file)) {
while (scanner.hasNextLine()) {
String line = scanner.nextLine();
result.append(line).append("\n");
}
scanner.close();
} catch (IOException e) {
e.printStackTrace();
}
return result.toString();
}
Then I translate the JSON file into an Object using MongoDB API (with DBObject, BasicDBObject and util.JSON) and I call out the Object section I need to change, which is 'object_keys':
File jsonFile = new File("C:\\example.json");
String jsonString = getJsonContent(jsonFile.getAbsolutePath());
DBObject jsonObject = (DBObject)JSON.parse(jsonString);
BasicDBObject objectKeys = (BasicDBObject) jsonObject.get("object_keys");
Now I can write new values into the Object using the PUT method like this:
objectKeys.put("obj_1st","NEW_VALUE1");
objectKeys.put("obj_2nd","NEW_VALUE2");
objectKeys.put("obj_3rd","NEW_VALUE3");
Note: the following part is not needed; see my answer below.
After I have changed the object, I need to write it back into the json file, so I need to translate the Object into a String. There are two methods to do this, either one works.
String newJSON = jsonObject.toString();
or
String newJSON = JSON.serialize(jsonObject);
Then I write the content back into the file using PrintWriter
PrintWriter writer = new PrintWriter("C:\\example.json");
writer.print(newJSON);
writer.close();
The problem I am facing now is that the String is written on a single line with no formatting whatsoever. Somewhere along the way it lost all the newlines, so it basically looks like this:
{"1st_key": "value1","2nd_key": "value2","object_keys": { "obj_1st": "NEW_VALUE1","obj_2nd": "NEW_VALUE2","obj_3rd": "NEW_VALUE3" }}
I need to write the JSON file back in the same format as shown in the beginning, keeping all the tabulation, spaces etc.
Is this possible somehow ?
Writing output formatted the way you describe is usually called pretty-printing (or beautifying) JSON. A quick Google search for that term turned up what I believe solves your problem.
Solution
You're going to have to use a json parser of some sort. I personally prefer org.json and would recommend it if you are manipulating the json data, but you may also like json-io which is really good for json serialization with no external dependencies.
With json-io, it's as simple as
String formattedJson = JsonWriter.formatJson(jsonObject.toString());
With org.json, you simply pass an int to the toString method.
Thanks Saraiva, I found a surprisingly simple solution by Googling around with the words 'pretty printing JSON' and used the Google GSON library. I downloaded the .jar and added it to my project in Eclipse.
These are the new imports I needed:
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
Since I already had the JSON Object (jsonObject) readily available from my previous code, I only needed to add two new lines:
Gson gson = new GsonBuilder().setPrettyPrinting().create();
String newJSON = gson.toJson(jsonObject);
Now when I use writer.print(newJSON); it will write the JSON in the right format, beautifully formatted and indented.
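If adding a library is not an option, a compact JSON string can also be re-indented with plain JDK code. The sketch below is an illustration rather than a substitute for a real pretty-printer: JsonIndentSketch and indent are hypothetical names, the input is assumed to be valid JSON, and empty objects or arrays come out with a blank line inside them.

```java
public class JsonIndentSketch {
    /** Re-indents a valid JSON string with two spaces per nesting level. */
    public static String indent(String json) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                out.append(c);
                if (c == '\\' && i + 1 < json.length()) {
                    out.append(json.charAt(++i)); // copy the escaped character verbatim
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            switch (c) {
                case '"':
                    inString = true;
                    out.append(c);
                    break;
                case '{':
                case '[':
                    out.append(c);
                    newline(out, ++depth);
                    break;
                case '}':
                case ']':
                    newline(out, --depth);
                    out.append(c);
                    break;
                case ',':
                    out.append(c);
                    newline(out, depth);
                    break;
                case ':':
                    out.append(": ");
                    break;
                default:
                    // drop whitespace outside strings, keep everything else
                    if (!Character.isWhitespace(c)) {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    private static void newline(StringBuilder out, int depth) {
        out.append('\n');
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
    }

    public static void main(String[] args) {
        System.out.println(indent("{\"1st_key\":\"value1\",\"object_keys\":{\"obj_1st\":\"NEW_VALUE1\"}}"));
    }
}
```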
Is there a way to read a text file in the resource into a String?
I suppose this is a popular requirement, but I couldn't find any utility after Googling.
Yes, Guava provides this in the Resources class. For example:
URL url = Resources.getResource("foo.txt");
String text = Resources.toString(url, StandardCharsets.UTF_8);
You can use the old Stupid Scanner trick oneliner to do that without any additional dependency like guava:
String text = new Scanner(AppropriateClass.class.getResourceAsStream("foo.txt"), "UTF-8").useDelimiter("\\A").next();
Guys, don't use 3rd party stuff unless you really need that. There is a lot of functionality in the JDK already.
Pure and simple, jar-friendly, Java 8+ solution
This simple method below will do just fine if you're using Java 8 or greater:
/**
* Reads given resource file as a string.
*
* @param fileName path to the resource file
* @return the file's contents
* @throws IOException if read fails for any reason
*/
static String getResourceFileAsString(String fileName) throws IOException {
ClassLoader classLoader = ClassLoader.getSystemClassLoader();
try (InputStream is = classLoader.getResourceAsStream(fileName)) {
if (is == null) return null;
try (InputStreamReader isr = new InputStreamReader(is);
BufferedReader reader = new BufferedReader(isr)) {
return reader.lines().collect(Collectors.joining(System.lineSeparator()));
}
}
}
And it also works with resources in jar files.
About text encoding: InputStreamReader will use the default system charset in case you don't specify one. You may want to specify it yourself to avoid decoding problems, like this:
new InputStreamReader(is, StandardCharsets.UTF_8);
Avoid unnecessary dependencies
Always prefer not depending on big, fat libraries. Unless you are already using Guava or Apache Commons IO for other tasks, adding those libraries to your project just to be able to read from a file seems a bit too much.
For Java 7:
new String(Files.readAllBytes(Paths.get(getClass().getResource("foo.txt").toURI())));
For Java 11:
Files.readString(Paths.get(getClass().getClassLoader().getResource("foo.txt").toURI()));
yegor256 has found a nice solution using Apache Commons IO:
import org.apache.commons.io.IOUtils;
String text = IOUtils.toString(this.getClass().getResourceAsStream("foo.xml"),
"UTF-8");
Guava has a "toString" method for reading a file into a String:
import com.google.common.base.Charsets;
import com.google.common.io.Files;
String content = Files.toString(new File("/home/x1/text.log"), Charsets.UTF_8);
This method does not require the file to be on the classpath (unlike Jon Skeet's previous answer).
apache-commons-io has a utility class named FileUtils:
URL url = Resources.getResource("myFile.txt");
File myFile = new File(url.toURI());
String content = FileUtils.readFileToString(myFile, "UTF-8"); // or any other encoding
I like akosicki's answer with the Stupid Scanner Trick. It's the simplest I see without external dependencies that works in Java 8 (and in fact all the way back to Java 5). Here's an even simpler answer if you can use Java 9 or higher (since InputStream.readAllBytes() was added at Java 9):
String text = new String(AppropriateClass.class.getResourceAsStream("foo.txt")
.readAllBytes());
If you're concerned about the filename being wrong and/or about closing the stream, you can expand this a little:
String text = null;
InputStream stream = AppropriateClass.class.getResourceAsStream("foo.txt");
if (null != stream) {
// readAllBytes() returns a byte[], so wrap it in a String
text = new String(stream.readAllBytes());
stream.close();
}
You can use the following code in Java:
new String(Files.readAllBytes(Paths.get(getClass().getResource("example.txt").toURI())));
I often had this problem myself. To avoid dependencies on small projects, I often
write a small utility function when I don't need commons io or such. Here is
the code to load the content of the file in a string buffer :
StringBuffer sb = new StringBuffer();
BufferedReader br = new BufferedReader(new InputStreamReader(getClass().getResourceAsStream("path/to/textfile.txt"), "UTF-8"));
for (int c = br.read(); c != -1; c = br.read()) sb.append((char)c);
System.out.println(sb.toString());
Specifying the encoding is important in that case, because you might have
edited your file in UTF-8, and then put it in a jar, and the computer that opens
the file may have CP-1251 as its native file encoding (for example); so in
this case you never know the target encoding, therefore the explicit
encoding information is crucial.
Also the loop to read the file char by char seems inefficient, but it is used on a
BufferedReader, and so actually quite fast.
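The effect of decoding with the wrong charset is easy to reproduce. The sketch below encodes a string as UTF-8 and decodes the bytes once correctly and once incorrectly. ISO-8859-1 stands in for the CP-1251 of the text above only because it is guaranteed to exist on every JVM; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchSketch {
    /** Encodes as UTF-8 and decodes with the same charset: the text survives. */
    public static String decodeRight(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    /** Encodes as UTF-8 but decodes as ISO-8859-1: each multi-byte character turns into several wrong ones. */
    public static String decodeWrong(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "café", written with an escape to keep this source ASCII-only
        System.out.println(decodeRight(original).equals(original)); // true
        System.out.println(decodeWrong(original).length());         // 5, one character more than the original
    }
}
```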
If you want to get your String from a project resource like the file
testcase/foo.json in src/main/resources in your project, do this:
String myString=
new String(Files.readAllBytes(Paths.get(getClass().getClassLoader().getResource("testcase/foo.json").toURI())));
Note that the getClassLoader() method is missing on some of the other examples.
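The difference matters because the two lookups resolve names differently: Class.getResource treats a name without a leading "/" as relative to the class's package, while ClassLoader.getResource always resolves from the classpath root and takes no leading slash. A small sketch of the three spellings follows; it uses the JDK's own String.class file purely as a resource that is certain to exist, and ResourceLookupSketch is a hypothetical name.

```java
import java.net.URL;

public class ResourceLookupSketch {
    /** Relative name: resolved against the package of String, i.e. java/lang/. */
    public static URL viaClassRelative() {
        return String.class.getResource("String.class");
    }

    /** Leading slash: resolved from the root, so the full path is required. */
    public static URL viaClassAbsolute() {
        return String.class.getResource("/java/lang/String.class");
    }

    /** ClassLoader lookup: always from the root, written without a leading slash. */
    public static URL viaClassLoader() {
        return ClassLoader.getSystemClassLoader().getResource("java/lang/String.class");
    }

    public static void main(String[] args) {
        System.out.println(viaClassRelative());
        System.out.println(viaClassAbsolute());
        System.out.println(viaClassLoader());
    }
}
```

For a file such as testcase/foo.json under src/main/resources, the same rule means either getClass().getResource("/testcase/foo.json") or getClass().getClassLoader().getResource("testcase/foo.json").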
Here's a solution using Java 11's Files.readString:
public class Utils {
public static String readResource(String name) throws URISyntaxException, IOException {
var uri = Utils.class.getResource("/" + name).toURI();
var path = Paths.get(uri);
return Files.readString(path);
}
}
Use Apache commons's FileUtils. It has a method readFileToString
I'm using the following for reading resource files from the classpath:
import java.io.IOException;
import java.io.InputStream;
import java.net.URISyntaxException;
import java.util.Scanner;
public class ResourceUtilities
{
public static String resourceToString(String filePath) throws IOException, URISyntaxException
{
try (InputStream inputStream = ResourceUtilities.class.getClassLoader().getResourceAsStream(filePath))
{
return inputStreamToString(inputStream);
}
}
private static String inputStreamToString(InputStream inputStream)
{
try (Scanner scanner = new Scanner(inputStream).useDelimiter("\\A"))
{
return scanner.hasNext() ? scanner.next() : "";
}
}
}
No third party dependencies required.
At least as of Apache commons-io 2.5, the IOUtils.toString() method supports a URI argument and returns the contents of files located inside jars on the classpath:
IOUtils.toString(SomeClass.class.getResource(...).toURI(), ...)
With a set of static imports, the Guava solution can be a very compact one-liner:
toString(getResource("foo.txt"), UTF_8);
The following imports are required:
import static com.google.common.io.Resources.getResource;
import static com.google.common.io.Resources.toString;
import static java.nio.charset.StandardCharsets.UTF_8;
package test;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
public class Main {
public static void main(String[] args) {
try {
String fileContent = getFileFromResources("resourcesFile.txt");
System.out.println(fileContent);
} catch (Exception e) {
e.printStackTrace();
}
}
//USE THIS FUNCTION TO READ CONTENT OF A FILE, IT MUST EXIST IN "RESOURCES" FOLDER
public static String getFileFromResources(String fileName) throws Exception {
ClassLoader classLoader = Main.class.getClassLoader();
InputStream stream = classLoader.getResourceAsStream(fileName);
String text = null;
try (Scanner scanner = new Scanner(stream, StandardCharsets.UTF_8.name())) {
text = scanner.useDelimiter("\\A").next();
}
return text;
}
}
Guava also has Files.readLines() if you want a return value as List<String> line-by-line:
List<String> lines = Files.readLines(new File("/file/path/input.txt"), Charsets.UTF_8);
Please refer to here to compare 3 ways (BufferedReader vs. Guava's Files vs. Guava's Resources) to get String from a text file.
Here is my approach, which worked fine:
public String getFileContent(String fileName) {
    String filePath = "myFolder/" + fileName + ".json";
    try (InputStream stream = Thread.currentThread().getContextClassLoader().getResourceAsStream(filePath)) {
        return IOUtils.toString(stream, "UTF-8");
    } catch (IOException e) {
        // rethrow so that every code path returns or throws; otherwise the method does not compile
        throw new UncheckedIOException(e);
    }
}
If you include Guava, then you can use:
String fileContent = Files.asCharSource(new File(filename), Charset.forName("UTF-8")).read();
(Other answers mention other Guava methods, but those are deprecated.)
The following code works for me:
compile group: 'commons-io', name: 'commons-io', version: '2.6'
@Value("classpath:mockResponse.json")
private Resource mockResponse;
String mockContent = FileUtils.readFileToString(mockResponse.getFile(), "UTF-8");
I made NO-dependency static method like this:
import java.nio.file.Files;
import java.nio.file.Paths;
public class ResourceReader {
public static String asString(String resourceFIleName) {
try {
return new String(Files.readAllBytes(Paths.get(new CheatClassLoaderDummyClass().getClass().getClassLoader().getResource(resourceFIleName).toURI())));
} catch (Exception e) {
throw new RuntimeException(e);
}
}
}
class CheatClassLoaderDummyClass{//cheat class loader - for sql file loading
}
I like Apache commons utils for this type of stuff and use this exact use-case (reading files from classpath) extensively when testing, especially for reading JSON files from /src/test/resources as part of unit / integration testing. e.g.
public class FileUtils {
public static String getResource(String classpathLocation) {
try {
String message = IOUtils.toString(FileUtils.class.getResourceAsStream(classpathLocation),
Charset.defaultCharset());
return message;
}
catch (IOException e) {
throw new RuntimeException("Could not read file [ " + classpathLocation + " ] from classpath", e);
}
}
}
For testing purposes, it can be nice to catch the IOException and throw a RuntimeException - your test class could look like e.g.
@Test
public void shouldDoSomething () {
String json = FileUtils.getResource("/json/input.json");
// Use json as part of test ...
}
public static byte[] readResoureStream(String resourcePath) throws IOException {
ByteArrayOutputStream byteArray = new ByteArrayOutputStream();
InputStream in = CreateBffFile.class.getResourceAsStream(resourcePath);
//Create buffer
byte[] buffer = new byte[4096];
for (;;) {
int nread = in.read(buffer);
if (nread <= 0) {
break;
}
byteArray.write(buffer, 0, nread);
}
return byteArray.toByteArray();
}
Charset charset = StandardCharsets.UTF_8;
String content = new String(FileReader.readResoureStream("/resource/...*.txt"), charset);
String lines[] = content.split("\\n");