Identifying Bullets in Ms Word, using Apache POI - java

I'm trying to make an application that reads a Word file (docx) and does some processing on it. So far I've done pretty much everything except identifying bullets.
I can find isBold(), isItalic() and isStrike(), but I cannot seem to find isBullet().
Can anyone please tell me how to identify bullets?
The application is built in Java.

There's no isBullet() method, because list styling in Word is quite a lot more complicated than that. You have different indent levels, different styles of bullets, numbered lists, bulleted lists, etc.
Probably the easiest method to call for your use case is XWPFParagraph.getNumFmt():
Returns numbering format for this paragraph, eg bullet or lowerLetter. Returns null if this paragraph does not have numeric style.
Call that: if you get null, the paragraph isn't part of a list; otherwise the returned value tells you whether it's bulleted, numbered, lettered, etc.
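As a rough illustration of that check, here is a minimal sketch that classifies the string returned by getNumFmt(). The helper name describe is hypothetical, and the sketch assumes bulleted paragraphs report the value "bullet", as in the Javadoc excerpt above. It takes a plain String so that it runs without POI on the classpath.

```java
public class NumFmtDemo {
    // Hypothetical helper: interprets the value returned by XWPFParagraph.getNumFmt().
    // Assumes "bullet" marks a bulleted list, as the Javadoc excerpt suggests.
    static String describe(String numFmt) {
        if (numFmt == null) {
            return "not a list";
        }
        if ("bullet".equals(numFmt)) {
            return "bulleted list";
        }
        return "numbered list (" + numFmt + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe(null));          // prints not a list
        System.out.println(describe("bullet"));      // prints bulleted list
        System.out.println(describe("lowerLetter")); // prints numbered list (lowerLetter)
    }
}
```

In real code you would pass para.getNumFmt() for each paragraph instead of the literal strings.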

You can use the code below to get a list of all the list paragraphs (bulleted or numbered) in a Word document. I have used Apache POI's XWPF API.
import java.util.ArrayList;
import java.util.List;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
public class ListTest {
    public static void main(String[] args) {
        String filename = "file_path";
        List<String> paraList = new ArrayList<String>();
        try {
            XWPFDocument doc = new XWPFDocument(OPCPackage.open(filename));
            List<XWPFParagraph> paragraphList = doc.getParagraphs();
            for (XWPFParagraph para : paragraphList) {
                // getNumFmt() is non-null only for paragraphs that belong to a list
                if ((para.getStyle() != null) && (para.getNumFmt() != null)) {
                    paraList.add(para.getText());
                }
            }
            // Print once, after every paragraph has been checked
            for (String bullet : paraList) {
                System.out.println(bullet);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Related

Trying to add substrings from newLines in a large file to a list

I downloaded my extended listening history from Spotify and I am trying to make a program to turn the data into a list of artists, without duplicates, that I can easily make sense of. The file is rather huge because it has data on every stream I have done since 2016 (307790 lines of text in total). This is what 2 lines of the file look like:
{"ts":"2016-10-30T18:12:51Z","username":"edgymemes69endmylifepls","platform":"Android OS 6.0.1 API 23 (HTC, 2PQ93)","ms_played":0,"conn_country":"US","ip_addr_decrypted":"68.199.250.233","user_agent_decrypted":"unknown","master_metadata_track_name":"Devil's Daughter (Holy War)","master_metadata_album_artist_name":"Ozzy Osbourne","master_metadata_album_album_name":"No Rest for the Wicked (Expanded Edition)","spotify_track_uri":"spotify:track:0pieqCWDpThDCd7gSkzx9w","episode_name":null,"episode_show_name":null,"spotify_episode_uri":null,"reason_start":"fwdbtn","reason_end":"fwdbtn","shuffle":true,"skipped":null,"offline":false,"offline_timestamp":0,"incognito_mode":false},
{"ts":"2021-03-26T18:15:15Z","username":"edgymemes69endmylifepls","platform":"Android OS 11 API 30 (samsung, SM-F700U1)","ms_played":254120,"conn_country":"US","ip_addr_decrypted":"67.82.66.3","user_agent_decrypted":"unknown","master_metadata_track_name":"Opportunist","master_metadata_album_artist_name":"Sworn In","master_metadata_album_album_name":"Start/End","spotify_track_uri":"spotify:track:3tA4jL0JFwFZRK9Q1WcfSZ","episode_name":null,"episode_show_name":null,"spotify_episode_uri":null,"reason_start":"fwdbtn","reason_end":"trackdone","shuffle":true,"skipped":null,"offline":false,"offline_timestamp":1616782259928,"incognito_mode":false},
It is formatted in the actual text file so that each stream is on its own line. NetBeans is telling me the exception is happening at line 19 and it only fails when I am looking for a substring bounded by the indexOf function. My code is below. I have no idea why this isn't working, any ideas?
import java.util.*;
public class MainClass {
public static void main(String args[]){
File dat = new File("SpotifyListeningData.txt");
List<String> list = new ArrayList<String>();
Scanner swag = null;
try {
swag = new Scanner(dat);
}
catch(Exception e) {
System.out.println("pranked");
}
while (swag.hasNextLine())
if (swag.nextLine().length() > 1)
if (list.contains(swag.nextLine().substring(swag.nextLine().indexOf("artist_name"), swag.nextLine().indexOf("master_metadata_album_album"))))
System.out.print("");
else
try {list.add(swag.nextLine().substring(swag.nextLine().indexOf("artist_name"), swag.nextLine().indexOf("master_metadata_album_album")));}
catch(Exception e) {}
System.out.println(list);
}
}
Find a JSON parser you like.
Create a class with the fields you care about, annotated according to the parser's specs.
Read the file into a collection of objects. Most parsers will stream the contents, so you're not holding one massive string in memory.
You can then load the data into objects and store them as you see fit. For your purposes, a TreeSet is probably what you want.
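To show why a TreeSet fits here, the sketch below collects artist names into one; the set drops duplicates and keeps the names sorted. It assumes the names have already been extracted by whatever parser you choose, and the method name uniqueArtists is made up for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class ArtistSetDemo {
    // Hypothetical helper: a TreeSet silently ignores duplicates and iterates in sorted order.
    static Set<String> uniqueArtists(List<String> artistNames) {
        Set<String> unique = new TreeSet<>();
        for (String name : artistNames) {
            if (name != null) { // a TreeSet with natural ordering rejects null
                unique.add(name);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<String> streamed = Arrays.asList("Sworn In", "Ozzy Osbourne", "Sworn In", null);
        System.out.println(uniqueArtists(streamed)); // prints [Ozzy Osbourne, Sworn In]
    }
}
```

The null check matters for this data set because podcast rows carry null in the artist field.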
Your code will throw a lot of exceptions, and the missing braces make it hard to see why. Please use braces around every block, whether it is an if, an else, or a loop. It's good practice and prevents unnecessary bugs.
More importantly, every call to swag.nextLine() reads the next line from the file, so the several calls inside one statement each operate on a different line. Read the line once into a variable and reuse it.
The best way to deal with this is to write a class whose fields match the JSON on each line of the file, map the JSON to that class, and read the desired field from it.
Your approach is risky because it depends on the exact structure of the data, even on whitespace. However, I fixed some lines in your code so that it works for your purpose, although I don't like operating on strings this way.
while (swag.hasNextLine()) {
    // Read each line exactly once and reuse it
    String swagNextLine = swag.nextLine();
    int start = swagNextLine.indexOf("artist_name");
    int end = swagNextLine.indexOf("master_metadata_album_album");
    if (start < 0 || end < 0) {
        continue; // blank line or line without an artist field
    }
    // Skip past artist_name":" and stop before the closing ","
    String toBeAdded = swagNextLine.substring(start + "artist_name".length() + 3, end - 3);
    if (!list.contains(toBeAdded)) {
        list.add(toBeAdded);
    }
}
// Print the finished list once, after the loop
System.out.println(list);
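The point about nextLine() can be seen in isolation. The sketch below uses a Scanner over an in-memory string (an assumption made so the example is self-contained) and contrasts reading each line once with calling nextLine() twice per iteration. Both method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class NextLineDemo {
    // Reads every line exactly once by storing it in a local variable.
    static List<String> readAll(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner in = new Scanner(text)) {
            while (in.hasNextLine()) {
                String line = in.nextLine(); // one call per iteration
                lines.add(line);
            }
        }
        return lines;
    }

    // Calls nextLine() twice per iteration, so every other line is lost.
    static List<String> readSkipping(String text) {
        List<String> lines = new ArrayList<>();
        try (Scanner in = new Scanner(text)) {
            while (in.hasNextLine()) {
                in.nextLine();                // consumes a line and discards it
                if (in.hasNextLine()) {
                    lines.add(in.nextLine()); // this is already a different line
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String text = "a\nb\nc\nd";
        System.out.println(readAll(text));      // prints [a, b, c, d]
        System.out.println(readSkipping(text)); // prints [b, d]
    }
}
```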

Reading Formatted .txt File into Variables: Java

I have a formatted text file called cars.txt. It's separated by tabs.
Name Length Width
truck1 18.6 8.1
suv1 17.4 7.4
coupe1 14.8 5.4
mini1 14.1 5.0
sedan1 16.4 6.1
suv2 17.5 7.3
mini2 14.3 5.2
sedan2 16.5 6.2
I need to read in this information so it can be used for calculations later on.
This is my current idea but I am having a hard time piecing together what I need to execute.
public class Class{
public void readFileIn(){
Scanner sc = new Scanner(new FileReader("cars.txt");
try{
while (sc.hasNextLine()){
if (/**something that catches strings*/){
method1(string1, double1, double2);
method2(double1, double2);
}
}
}catch(FileNotFoundException exception){
System.out.println("File dosen't exist");
}
}
}
Scanner and BufferedReader are not used as often anymore, since Java now provides a way to achieve the same result with less code.
I can see at least three possible approaches to solve your problem:
approach 1: if you can use at least Java 8, then I would suggest using the java.nio.file library to read the file as a stream of lines:
Stream<String> linesStream = Files.lines(Paths.get("cars.txt"));
Then, depending on what you need to do, you could use forEach, which loops over each line of the stream (myMethod stands for your own method that takes a line):
linesStream.forEach(line -> myMethod(line));
Or use Java Collectors to execute the calculation that you need. A good tutorial about Collectors can be found here. You can also use collectors to separate your string, etc.
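As one possible use of Collectors, the sketch below averages the Length column. It works on an in-memory list of lines so that it runs on its own; with a real file you would start from Files.lines(Paths.get("cars.txt")) instead. The method name averageLength is an assumption made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class AverageLengthDemo {
    // Hypothetical helper: averages the second column (Length) of whitespace-separated rows.
    static double averageLength(List<String> lines) {
        return lines.stream()
                .skip(1)                                 // skip the header row
                .map(line -> line.trim().split("\\s+"))  // columns: name, length, width
                .collect(Collectors.averagingDouble(cols -> Double.parseDouble(cols[1])));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "Name\tLength\tWidth",
                "truck1\t18.6\t8.1",
                "suv1\t17.4\t7.4");
        System.out.println(averageLength(lines));
    }
}
```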
approach 2: you can use the Apache Commons libraries to achieve the same goal. In particular you could use FileUtils and StringUtils. For instance:
File carFile = new File("cars.txt");
LineIterator lineIterator = FileUtils.lineIterator(carFile);
// LineIterator is an Iterator, not an Iterable, so use a while loop
while (lineIterator.hasNext()) {
    String line = lineIterator.nextLine();
    String[] myValues = StringUtils.split(line);
    // do whatever you need
}
lineIterator.close();
approach 3: use Jackson to transform your file into a json or a java object that you can then use for your own transformations. Here is an example explaining how to convert a CSV to JSON. With a bit of digging in the Jackson documentation, you could apply it to your case.
First of all, I recommend you create an Entry class that represents your data.
private class Entry {
private String name;
private double length;
private double width;
// getters and setters omitted
@Override
public String toString() {
// omitted
}
}
Next, create a method that takes a String as an argument and is responsible for parsing a line of text into an instance of Entry. The regex \\s+ matches one or more whitespace characters and will split your line into its individual columns. Remember that in production, Double.valueOf can throw a NumberFormatException (a RuntimeException) if you are not passing a valid String.
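The parsing method itself is not shown in this answer, so the following is only a sketch of what it might look like, not the author's code. The constructor, the column-count check, and the toString format (modelled on the output listed further down) are assumptions.

```java
public class FileReadTest {
    // Simplified stand-in for the Entry class described above.
    static class Entry {
        final String name;
        final double length;
        final double width;

        Entry(String name, double length, double width) {
            this.name = name;
            this.length = length;
            this.width = width;
        }

        @Override
        public String toString() {
            return "Entry[name='" + name + "', length=" + length + ", width=" + width + "]";
        }
    }

    // Parses one data line such as "truck1 18.6 8.1" into an Entry.
    static Entry toEntry(String line) {
        String[] columns = line.trim().split("\\s+");
        if (columns.length != 3) {
            throw new IllegalArgumentException("Expected 3 columns: " + line);
        }
        // Double.valueOf throws NumberFormatException on malformed numbers
        return new Entry(columns[0], Double.valueOf(columns[1]), Double.valueOf(columns[2]));
    }

    public static void main(String[] args) {
        System.out.println(toEntry("truck1\t18.6\t8.1")); // prints Entry[name='truck1', length=18.6, width=8.1]
    }
}
```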
Finally, you can read the file, here using the Java 8 stream API. Skip the first line since it includes the column header and not actual data.
private void readFile() throws Exception {
Path path = Paths.get(/* path to your file */);
Files.readAllLines(path).stream().skip(1).map(FileReadTest::toEntry)
.forEach(this::action);
}
In my example, I am just printing each entry to the console:
private void action(Entry entry) {
System.out.println(entry);
}
Resulting output:
Entry[name='truck1', length=18.6, width=8.1]
Entry[name='suv1', length=17.4, width=7.4]
Entry[name='coupe1', length=14.8, width=5.4]
Entry[name='mini1', length=14.1, width=5.0]
Entry[name='sedan1', length=16.4, width=6.1]
Entry[name='suv2', length=17.5, width=7.3]
Entry[name='mini2', length=14.3, width=5.2]
Entry[name='sedan2', length=16.5, width=6.2]
Here's an example of how to properly read a text file - replace the charset with the one you need.
try (final BufferedReader br = Files.newBufferedReader(file.toPath(), StandardCharsets.UTF_8)) {
String line;
while ((line = br.readLine()) != null) {
System.out.println(line);
}
} catch (IOException e) {
e.printStackTrace();
}
Once you have the individual lines, you can split them by whitespace: str.split("\\s+");
You get an array with three entries. I guess you can figure out the rest.

How can I convert a String object to String[]?

private static void readSplitPrint(String path) {
try {
for (String line : Files.readAllLines(Paths.get(path))) {
for (String word : line.split("\t")) {
System.out.println(path + " : " + word);
}
}
} catch (IOException e) {
e.printStackTrace();
}
}
I was given this default code for my homework. I have to use it, and it works for reading the text file for my project. But I'm having trouble with one thing: I can't convert the String object called 'word' into a String[] object. I'm a beginner and I don't know how to do this with arrays. I need to store these words in a String[] or String[][] object so that I can access them matrix-style, as "x[0][0]" or "x[2][3]". Can anyone help me?
You do not need to "convert" any kind of object in order to "turn it" into an array of object. An array is a container; and you just instruct the program to stuff that object into the container.
String foo = "bar";
String[] arrayOfStringsWithLengthOne = new String[1];
arrayOfStringsWithLengthOne[0] = foo;
is all that you need.
But keep in mind: Stack Overflow is not a programming school. This is something you should not look to other people to explain to you. You should turn to books and tutorials and just read the material. You can't come here for every little problem that you encounter when learning a programming language. And a hint: as long as you have trouble understanding such basic things about arrays, better forget about the two-dimensional ones for the moment. Learn the basics, one by one.
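For the matrix-style access mentioned in the question, it helps to notice that String.split already returns a String[], so each line can become one row of a String[][]. The sketch below assumes tab-separated words and uses an in-memory list where the question's code would use Files.readAllLines. The method name toMatrix is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class WordMatrixDemo {
    // Hypothetical helper: turns each line into a row of words, so matrix[row][column] works.
    static String[][] toMatrix(List<String> lines) {
        String[][] matrix = new String[lines.size()][];
        for (int row = 0; row < lines.size(); row++) {
            matrix[row] = lines.get(row).split("\t"); // split returns a String[]
        }
        return matrix;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a\tb\tc", "d\te");
        String[][] x = toMatrix(lines);
        System.out.println(x[0][0]); // prints a
        System.out.println(x[1][1]); // prints e
    }
}
```

Rows may have different lengths, so check x[row].length before indexing into a column.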

Searching files in a directory and pairing them based on a common sub-string

I have been attempting to program a solution for ImageJ to process my images.
I understand how to get a directory, run commands on it, etc etc. However I've run into a situation where I now need to start using some type of search function in order to pair two images together in a directory full of image pairs.
I'm hoping that you guys can confirm I am on the right track and that my idea is sound. So far it is proving difficult for me to understand, as I have less than a month's worth of experience with Java. Since this project is directly for my research, I have plenty of drive to get it done; I just need some direction on which functions are useful to me.
I initially thought of using regex, but I saw that regex is very slow when you start processing a lot of images (especially with ImageJ, which it seems does not dump data usage well, if that's the correct way to say it).
The general format of these images is:
someString_DAPI_0001.tif
someString_GFP_0001.tif
someString_DAPI_0002.tif
someString_GFP_0002.tif
someString_DAPI_0003.tif
someString_GFP_0003.tif
They are in alphabetical order, so it should be possible to go to the next image in the list. I'm a bit lost on what functions I should use to accomplish this, but I think my overall while structure is correct, thanks to some help from Java forums. However, I'm still stuck on where to go next.
Here is my code so far (thanks to this SO answer for part of the code):
int count = 0;
getFile("C:\");
string DAPI;
string GFP;
private void getFile(String dirPath) {
File f = new File(dirPath);
File[] files = f.listFiles();
while (files.length > 0) {
if (/* File name contains "DAPI"*/){
DAPI = File f;
string substitute to get 'GFP' filename
store GFP file name into variable
do something(DAPI, GFP);
}
advance to next filename in list
}
}
As of right now I don't really know how to search for a string within a string. I've seen regex capture groups, and other solutions but I do not know the "best" one for processing hundreds of images.
I also have no clue what function would be used to substitute substrings.
I'd much appreciate it if you guys could point me towards the functions best for this case. I like to figure out how to make it on my own I just need help getting to the right information. Also want to make sure I am not making major logic mistakes here.
It doesn't seem like you need regex if your file names follow the simple pattern that you mentioned. You can simply iterate over the files and filter based on whether the file name contains DAPI, as shown below. This code may be an oversimplification of your requirements, but I couldn't tell from the details you've provided.
import java.io.*;
public class Temp {
int count = 0;
private void getFile(String dirPath) {
File f = new File(dirPath);
File[] files = f.listFiles();
if (files != null) {
for (File file : files) {
if (file.getName().contains("DAPI")) {
String dapiFile = file.getName();
String gfpFile = dapiFile.replace("DAPI", "GFP");
doSomething(dapiFile, gfpFile);
}
}
}
}
//Do Something does nothing right now, expand on it.
private void doSomething(String dapiFile, String gfpFile) {
System.out.println(new File(dapiFile).getAbsolutePath());
System.out.println(new File(gfpFile).getAbsolutePath());
}
public static void main(String[] args) {
Temp app = new Temp();
app.getFile("C:\\tmp\\");
}
}
NOTE: As per Vogel612's answer, if you have Java 8 and like a functional solution you can have:
private void getFile(String dirPath) {
try {
Files.find(Paths.get(dirPath), 1, (path, basicFileAttributes) -> (path.toFile().getName().contains("DAPI"))).forEach(
dapiPath -> {
Path gfpPath = dapiPath.resolveSibling(dapiPath.getFileName().toString().replace("DAPI", "GFP"));
doSomething(dapiPath, gfpPath);
});
} catch (IOException e) {
e.printStackTrace();
}
}
//Dummy method does nothing yet.
private void doSomething(Path dapiPath, Path gfpPath) {
System.out.println(dapiPath.toAbsolutePath().toString());
System.out.println(gfpPath.toAbsolutePath().toString());
}
Using java.io.File is the wrong way to approach this problem. What you're looking for is a Stream-based solution using Files.find that would look something like this:
Files.find(dirPath, 1, (path, attributes) -> {
return path.getFileName().toString().contains("DAPI");
}).forEach(path -> {
Path gfpFile = path.resolveSibling(/*build GFP name*/);
doSomething(path, gfpFile);
});
What this does is:
Iterate over all Paths below dirPath 1 level deep (may be adjusted)
Check that the File's name contains "DAPI"
Use these files to find the relevant "GFP"-File
give them to doSomething
This is preferable to the File-based solution for several reasons:
It's significantly more informative when failing
It's cleaner and more terse than your File-based solution and doesn't have to check for null
It's forward compatible, and thus preferable to a File-based solution
Files.find is available from Java 8 onwards
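Neither approach above checks that the GFP partner of a DAPI file actually exists. A minimal sketch of that check follows, working on plain file names so that it runs without touching the disk. In practice the names could come from File.list() or from the Path objects produced by Files.find. The method name pairUp is an assumption made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PairingDemo {
    // Hypothetical helper: returns {dapiName, gfpName} for every DAPI file whose GFP partner is present.
    static List<String[]> pairUp(List<String> fileNames) {
        Set<String> known = new HashSet<>(fileNames); // constant-time lookups
        List<String[]> pairs = new ArrayList<>();
        for (String name : fileNames) {
            if (name.contains("_DAPI_")) {
                String partner = name.replace("_DAPI_", "_GFP_");
                if (known.contains(partner)) {
                    pairs.add(new String[] {name, partner});
                }
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "someString_DAPI_0001.tif",
                "someString_GFP_0001.tif",
                "someString_DAPI_0002.tif"); // no GFP partner for 0002
        for (String[] pair : pairUp(names)) {
            System.out.println(pair[0] + " <-> " + pair[1]);
        }
    }
}
```

Plain contains and replace calls like these are cheap, so hundreds of images should not be a performance concern.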

CSV to a String Array - JAVA

I have a CSV file which has only one column with 100+ rows. I would like to put those values in a one-dimensional array (if that's possible), so that it works the same as if I had written a string array manually, i.e.
String[] username = {"lalala", "tatata", "mamama"}; //<---if I did it manually
String[] username = {after passing the CSV values}; //<---I want this like the above ones.
Then later I would like to be able to use that array from a different class. Say the class that holds the array is called ArrayClass; I would like to be able to instantiate it in a different class, like this --
public class MainClass{
ArrayClass array = new ArrayClass();
//Then I would like to be able to do this
someMethod(array.username);
}
I know I asked a lot of things but I seriously appreciate all your help. Even if you see this question and say THIS IS BS. Oh and one more thing I would prefer it to be in JAVA.
It might be easier to use an ArrayList rather than an array, as you don't have to worry about the number of rows. An array has a fixed size that can't be changed.
As you have only one column, you will not need to worry about commas in the CSV.
Example code would look something like this:
import java.util.*;
import java.io.*;
public class MyClass {
private ArrayList<String> MyArray = new ArrayList<String>();
private Scanner scan;
public MyClass(){
try {
scan = new Scanner(new File("MyFile.csv"));
} catch (IOException ioex) {
System.out.println("File Not Found");
}
}
public ArrayList<String> getArray() {
while (scan.hasNext()) {
Scanner line = new Scanner(scan.nextLine());
MyArray.add(line.next());
}
return MyArray;
}
}
And in the main:
MyClass f = new MyClass();
System.out.println(f.getArray());
If it's just a CSV, you can use String's split method with a suitable regex.
Please check the split method.
The first half of your question is easy and can be handled in a number of different ways. Personally, I would use the Scanner class and set the delimiter to ",". Create a new Scanner object and then call useDelimiter(",") on it. Then simply scan through the tokens. See the example in the documentation. This approach is effective because it handles both reading the file and separating it based on your criteria (the ',' character).
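A minimal sketch of the Scanner approach, ending with the String[] the question asks for. It scans an in-memory string so that it runs on its own; for the real file you would construct the Scanner from new File("MyFile.csv") instead. The delimiter also accepts line breaks, which is an assumption about how the single-column file is laid out, and the method name toArray is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

public class CsvToArrayDemo {
    // Hypothetical helper: reads comma- or newline-separated values into a String[].
    static String[] toArray(String csvText) {
        List<String> values = new ArrayList<>();
        try (Scanner scan = new Scanner(csvText)) {
            scan.useDelimiter("[,\\r\\n]+"); // commas and line breaks both separate values
            while (scan.hasNext()) {
                String token = scan.next().trim();
                if (!token.isEmpty()) {
                    values.add(token);
                }
            }
        }
        // An ArrayList grows as needed; convert to a fixed-size array at the end
        return values.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] username = toArray("lalala\ntatata\nmamama\n");
        System.out.println(Arrays.toString(username)); // prints [lalala, tatata, mamama]
    }
}
```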
