Parsing Windows tasklist output in Java


I am trying to build an array of processes running on my machine; to do so I have been trying to use the following two commands:
tasklist /fo csv /nh # For a CSV output
tasklist /nh # For a non-CSV output
The issue I am having is that I cannot properly parse the output.
First Scenario
I have a line like:
"wininit.exe","584","Services","0","5,248 K"
Which I have attempted to parse using "".split(","); however, this fails when it comes to the process memory usage: the comma in the number field results in an extra field.
Second Scenario
With the non-CSV output, I have a line like:
wininit.exe 584 Services 0 5,248 K
Which I am attempting to parse using "".split("\\s+"); however, this fails on a process like System Idle Process, or any other process with a space in the executable name.
How can I parse either of these outputs so that the same split index always contains the correct data column?

To parse a string, always prefer the strictest format available; in this case, CSV. You can then process each line with a regular expression containing five groups:
import java.util.regex.Matcher;
import java.util.regex.Pattern;
// Five quoted fields separated by commas; a comma inside the quotes stays part of its field.
private final Pattern pattern = Pattern
        .compile("\"([^\"]*)\",\"([^\"]*)\",\"([^\"]*)\",\"([^\"]*)\",\"([^\"]*)\"");
private void parseLine(String line) {
    Matcher matcher = pattern.matcher(line);
    // matches() requires the whole line to fit the five-field format
    if (!matcher.matches()) {
        throw new IllegalArgumentException("invalid format: " + line);
    }
    String name = matcher.group(1);
    int pid = Integer.parseInt(matcher.group(2));
    String sessionName = matcher.group(3);
    String sessionId = matcher.group(4);
    String memUsage = matcher.group(5);
    System.out.println(name + ":" + pid + ":" + memUsage);
}
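As a complement to the fixed five-group pattern, here is a minimal sketch that collects however many quoted fields a line contains and turns the memory column into a number. The class and method names (TasklistCsvSketch, fields, memoryKb) are made up for illustration, and the sketch assumes that every field is quoted and that the memory column contains digits (a value without digits would throw a NumberFormatException).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TasklistCsvSketch {
    // One quoted field; assumes every field is wrapped in double quotes.
    private static final Pattern FIELD = Pattern.compile("\"([^\"]*)\"");

    /** Returns the unquoted fields of one CSV line, in order. */
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        Matcher m = FIELD.matcher(line);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    /** Converts a memory column such as "5,248 K" to a number by keeping only the digits. */
    static long memoryKb(String memUsage) {
        return Long.parseLong(memUsage.replaceAll("[^0-9]", ""));
    }

    public static void main(String[] args) {
        List<String> f = fields("\"System Idle Process\",\"0\",\"Services\",\"0\",\"8 K\"");
        System.out.println(f.get(0) + " uses " + memoryKb(f.get(4)) + " K");
    }
}
```

Because the fields are taken from between the quotes, neither the comma in the memory column nor a space in the process name shifts the column index.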

You could use the StringTokenizer class instead of split. Use " as the delimiter and ask for the delimiters to be returned as tokens; you can then use them to tell where each field begins and ends. For instance,
enum State { NONE, BEGIN, END } // declared at class level
StringTokenizer st = new StringTokenizer(input, "\"", true);
State state = State.NONE;
while (st.hasMoreTokens()) {
    String t = st.nextToken();
    switch (state) {
    case NONE:
        // outside a field: an opening quote starts one, anything else is the separating comma
        if ("\"".equals(t)) {
            state = State.BEGIN;
        }
        break;
    case BEGIN:
        if ("\"".equals(t)) {
            // closing quote right after the opening one: store an empty field
            state = State.NONE;
        } else {
            // store t in the entry it corresponds to
            state = State.END;
        }
        break;
    case END:
        // t is the closing quote of the field
        state = State.NONE;
        break;
    }
}
Each token is then stored in its corresponding field, and you can process that information for each process.
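The tokenizer idea can be sketched as a small runnable class. This is a simplified variant, not the exact state machine from the answer: it tracks a single "inside quotes" flag instead of three states, and the names TokenizerFieldsSketch and fields are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class TokenizerFieldsSketch {
    /** Splits a line of fully quoted CSV into its unquoted fields. */
    static List<String> fields(String line) {
        List<String> out = new ArrayList<>();
        // returnDelims = true, so each double quote arrives as its own token
        StringTokenizer st = new StringTokenizer(line, "\"", true);
        boolean inside = false;
        StringBuilder current = new StringBuilder();
        while (st.hasMoreTokens()) {
            String t = st.nextToken();
            if ("\"".equals(t)) {
                if (inside) {
                    // closing quote: the field is complete (possibly empty)
                    out.add(current.toString());
                    current.setLength(0);
                }
                inside = !inside;
            } else if (inside) {
                current.append(t);
            }
            // tokens outside quotes are the separating commas and are skipped
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(fields("\"wininit.exe\",\"584\",\"Services\",\"0\",\"5,248 K\""));
    }
}
```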

I tried this and it seems to work.
public void parse(){
try {
Runtime runtime = Runtime.getRuntime();
Process proc = runtime.exec("tasklist -fo csv /nh");
BufferedReader stdInput = new BufferedReader(new
InputStreamReader(proc.getInputStream()));
String line = "";
while ((line = stdInput.readLine()) != null) {
System.out.println();
for (String column: line.split("\"")){
if (!column.equals(",")&& !column.equals("")){
System.out.print("["+column+"]");
}
}
}
}catch (Exception e){
e.printStackTrace();
}
}

Related

How CSV parsing can be utilized - JAVA

I am given a file that will read the following:
"String",int,int
"String",int,int
"String",int,int
...
The number of lines is unknown, so a while (scanner.hasNextLine()) loop can handle however many entries there are. My goal is to take these three pieces of data and store them in a Node, using the method BinaryTree.addNode(String, int, int). My issue arises when I try to read in the data. I am removing the commas from each line and then re-reading the data using the following:
Scanner firstpass = new Scanner(file);
String input = firstpass.nextLine().replaceAll(",", "");
Scanner secondpass = new Scanner(input);
String variable1 = secondpass.next();
int variable2 = secondpass.nextInt();
int variable3 = secondpass.nextInt();
This, however, is a very ineffective way of going about it.
UPDATED
The compiling errors can be fixed with the following:
try {
Scanner scanner1 = new Scanner(file);
while (scanner1.hasNextLine()) {
String inventory = scanner1.nextLine().replaceAll(",", " ");
Scanner scanner2 = new Scanner(inventory);
while (scanner2.hasNext()){
String i = scanner2.next();
System.out.print(i);
}
scanner2.close();
}
scanner1.close();
}
catch (FileNotFoundException ex) {
ex.printStackTrace();
}
which gives me the output:
"String"intint"String"intint"String"intint...
So I know I am on the right track. However, any spaces within the "String" variable are removed, so it outputs "SomeString" instead of "Some String". I also still don't know how to remove the quotes from the strings.

The format you've shown matches the CSV (Comma-Separated Values) format, so your best option is to use a CSV parser, e.g. Apache Commons CSV.
If you don't want to add a third-party library, you could use a regular expression to parse the line.
Reading lines from a file should not be done with a Scanner. Use a BufferedReader instead. See Scanner vs. BufferedReader.
try (BufferedReader in = new BufferedReader(new FileReader(file))) {
Pattern p = Pattern.compile("\"(.*?)\",(-?\\d+),(-?\\d+)");
for (String line; (line = in.readLine()) != null; ) {
Matcher m = p.matcher(line);
if (! m.matches())
throw new IOException("Invalid line: " + line);
String value1 = m.group(1);
int value2 = Integer.parseInt(m.group(2));
int value3 = Integer.parseInt(m.group(3));
// use values here
}
} catch (IOException | NumberFormatException ex) {
ex.printStackTrace();
}
Note that this will not work if the string contains escaped characters, e.g. if it contains embedded double-quotes. For that, you should use a parser library.
The code above will correctly handle embedded spaces and commas.
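To check the claim about embedded spaces and commas, here is a minimal self-contained sketch built around the same pattern. The class name QuotedRowSketch, the Row holder and the sample line are made up for illustration; only the regular expression comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuotedRowSketch {
    // Same pattern as in the answer: a quoted string followed by two integers.
    private static final Pattern ROW = Pattern.compile("\"(.*?)\",(-?\\d+),(-?\\d+)");

    /** Hypothetical holder for one parsed line. */
    static final class Row {
        final String name;
        final int first;
        final int second;

        Row(String name, int first, int second) {
            this.name = name;
            this.first = first;
            this.second = second;
        }
    }

    /** Parses one line or throws IllegalArgumentException if it does not fit the format. */
    static Row parse(String line) {
        Matcher m = ROW.matcher(line);
        if (!m.matches()) {
            throw new IllegalArgumentException("Invalid line: " + line);
        }
        return new Row(m.group(1), Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3)));
    }

    public static void main(String[] args) {
        Row r = parse("\"Some, String\",12,-3");
        System.out.println(r.name + " | " + r.first + " | " + r.second);
    }
}
```

The quotes are consumed by the pattern rather than captured, so the name comes back without them.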

Instead of using
String input = firstpass.nextLine().replaceAll(",", "");
Scanner secondpass = new Scanner(input);
String variable1 = secondpass.next();
int variable2 = secondpass.nextInt();
int variable3 = secondpass.nextInt();
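use the following approach (this assumes the quoted string itself contains no commas):
String line = firstpass.nextLine();
String[] temp = line.split(",");
String variable1 = temp[0].replace("\"", ""); // strip the surrounding quotes
int variable2 = Integer.parseInt(temp[1]);
int variable3 = Integer.parseInt(temp[2]);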

Deal with PatternSyntaxException and scanning texts

I want to find names in a collection of text documents from a huge list of about 1 million names. I'm making a Pattern from the names of the list first:
BufferedReader TSVFile = new BufferedReader(new FileReader("names.tsv"));
String dataRow = TSVFile.readLine();
dataRow = TSVFile.readLine();// skip first line (header)
String combined = "";
while (dataRow != null) {
String[] dataArray = dataRow.split("\t");
String name = dataArray[1];
combined += name.replace("\"", "") + "|";
dataRow = TSVFile.readLine(); // Read next line of data.
}
TSVFile.close();
Pattern all = Pattern.compile(combined);
After doing so I got a PatternSyntaxException because some names contain a '+' or other regex metacharacters. I tried solving this by ignoring those few names:
if (name.contains("\"")) {
// ignore this name
}
This didn't work properly, and it is also messy: you have to escape everything manually and rerun the program many times.
Then I tried using the quote method:
Pattern all = Pattern.compile(Pattern.quote(combined));
However, now I don't find any matches in the text documents anymore, even when I also use quote on them. How can I solve this issue?

I agree with @dragon66's comment: you should not quote the pipe "|". Your code would then look like the code below, using Pattern.quote() on each name:
BufferedReader TSVFile = new BufferedReader(new FileReader("names.tsv"));
String dataRow = TSVFile.readLine();
dataRow = TSVFile.readLine();// skip first line (header)
String combined = "";
while (dataRow != null) {
String[] dataArray = dataRow.split("\t");
String name = dataArray[1];
combined += Pattern.quote(name.replace("\"", "")) + "|"; //line changed
dataRow = TSVFile.readLine(); // Read next line of data.
}
TSVFile.close();
Pattern all = Pattern.compile(combined);
I also suggest checking whether your problem domain needs optimization: replacing String combined = ""; with a StringBuilder (which is mutable) avoids creating unnecessary new strings inside the loop.
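One detail worth noting: appending "|" after every name leaves a trailing "|" on the combined pattern, which adds an empty alternative that can match at every position. A minimal sketch that quotes each name and joins them without a trailing separator might look like this; the class and method names (QuotedAlternationSketch, anyOf, findAll) and the sample names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class QuotedAlternationSketch {
    /** Builds name1|name2|... with every name quoted literally and no trailing '|'. */
    static Pattern anyOf(List<String> names) {
        String alternation = names.stream()
                .filter(n -> !n.isEmpty())   // an empty alternative would match everywhere
                .map(Pattern::quote)         // '+', '.', '|' inside a name become literals
                .collect(Collectors.joining("|"));
        return Pattern.compile(alternation);
    }

    /** Returns every match of the pattern in the text, in order of appearance. */
    static List<String> findAll(Pattern p, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        Pattern p = anyOf(Arrays.asList("C++", "A.B", "Ann"));
        System.out.println(findAll(p, "Ann likes C++ but not AxB"));
    }
}
```

Collectors.joining also uses a StringBuilder internally, so it covers the string concatenation concern at the same time.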

guilhermerama presented the bug fix for your code.
I will add some performance improvements. As I pointed out, Java's regex library does not scale and is even slower when used for searching.
But one can do better with multi-string search algorithms, for example by using StringsAndChars String Search:
//setting up a test file
Iterable<String> lines = createLines();
Files.write(Paths.get("names.tsv"), lines , CREATE, WRITE, TRUNCATE_EXISTING);
// read the pattern from the file
BufferedReader TSVFile = new BufferedReader(new FileReader("names.tsv"));
Set<String> combined = new LinkedHashSet<>();
String dataRow = TSVFile.readLine();
dataRow = TSVFile.readLine();// skip first line (header)
while (dataRow != null) {
String[] dataArray = dataRow.split("\t");
String name = dataArray[1];
combined.add(name);
dataRow = TSVFile.readLine(); // Read next line of data.
}
TSVFile.close();
// search the pattern in a small text
StringSearchAlgorithm stringSearch = new AhoCorasick(new ArrayList<>(combined));
StringFinder finder = stringSearch.createFinder(new StringCharProvider("test " + name(38) + "\n or " + name(799) + " : " + name(99999), 0));
System.out.println(finder.findAll());
The result will be
[5:10(00038), 15:20(00799), 23:28(99999)]
The search (finder.findAll()) takes less than 1 millisecond on my computer. Doing the same with java.util.regex took around 20 milliseconds.
You may tune this performance by using other algorithms provided by RexLex.
The setup needs the following code:
private static Iterable<String> createLines() {
List<String> list = new ArrayList<>();
for (int i = 0; i < 100000; i++) {
list.add(i + "\t" + name(i));
}
return list;
}
private static String name(int i) {
String s = String.valueOf(i);
while (s.length() < 5) {
s = '0' + s;
}
return s;
}

Empty element inserted into ArrayList when reading from command line

I have code that I'm running to get the list of user groups of a given user from the command line:
private ArrayList<String> accessGroups = new ArrayList<String>();
public void setAccessGroups(String userName) {
try {
Runtime rt = Runtime.getRuntime();
Process pr = rt.exec("/* code to get users */");
BufferedReader input = new BufferedReader(new InputStreamReader(pr.getInputStream()));
String line = null;
// This code needs some work
while ((line = input.readLine()) != null){
System.out.println("#" + line);
String[] temp;
temp = line.split("\\s+");
if(line.contains("GRPNAME-")) {
for(int i = 0; i < temp.length; i++){
accessGroups.add(temp[i]);
}
}
}
// For debugging purposes, to delete
System.out.println(accessGroups);
} catch (IOException e) {
e.printStackTrace();
}
}
The code to get users returns a result containing the following:
#Local Group Memberships *localgroup1 *localgroup2
#Global Group memberships *group1 *group2
# *group3 *group4
# *GRPNAME-1 *GRPNAME-2
The code is designed to extract anything beginning with GRPNAME-. This works fine, except that when I print the ArrayList I get:
[, *GRPNAME-1, *GRPNAME-2]
There is an empty string ("") in the list. Is there a simple way to alter the regex, or another solution I could try, to prevent it from being added in the first place?
The expected output is:
[*GRPNAME-1, *GRPNAME-2]
Edit: answered, edited output to reflect changes in code.

Instead of this tokenization as presented from this snippet:
line.split("\\s+");
Use a pattern to match \S+ and add them to your collection. For example:
// Class level
private static final Pattern TOKEN = Pattern.compile("\\S+");
// Instance level
{
Matcher tokens = TOKEN.matcher(line);
while (tokens.find())
accessGroups.add(tokens.group());
}

Simple answer in the end, in place of:
temp = line.split("\\s+");
use:
temp = line.trim().split("\\s+");
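The reason this works can be shown with a short sketch: when the line starts with whitespace, split("\\s+") returns an empty string as its first element, and trim() removes that leading whitespace first. The class name SplitLeadingWhitespaceSketch, the helper tokens and the sample line are made up for illustration.

```java
import java.util.Arrays;

class SplitLeadingWhitespaceSketch {
    /** Splits on whitespace without producing a leading empty token. */
    static String[] tokens(String line) {
        String trimmed = line.trim();
        // "".split(...) would return one empty string, so handle blank lines explicitly
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String line = "   *GRPNAME-1   *GRPNAME-2";
        // Leading whitespace yields an empty first element
        System.out.println(Arrays.toString(line.split("\\s+")));
        // After trimming, only the two group names remain
        System.out.println(Arrays.toString(tokens(line)));
    }
}
```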

Java String Matching in a Sorted File and grouping similar data

I have a sorted file and need to do the following pattern match. I read a row and compare it (pattern match) with the row just after it; if it matches, I append the string I used for matching after a comma in that row and move on to the next row. I am new to Java and overwhelmed by the options, from OpenCSV to BufferedReader. I intend to iterate through the file until it reaches the end. The data may contain blanks and dates in quotes. The file size would be around 100 MB.
My file has data like
ABCD
ABCD123
ABCD456, 123
XYZ
XYZ890
XYZ123, 890
and output is expected as
ABCD, ABCD
ABCD123, ABCD
ABCD456, 123, ABCD
XYZ, XYZ
XYZ890, XYZ
XYZ123, 890, XYZ
I'm not sure about the best method. Can you please help me?

To open a file, you can use File and FileReader classes:
File csvFile = new File("file.csv");
FileReader fileReader = null;
try {
fileReader = new FileReader(csvFile);
} catch (FileNotFoundException e) {
e.printStackTrace();
}
You can get a line of the file using Scanner:
Scanner reader = new Scanner(fileReader);
while(reader.hasNextLine()){
String line = reader.nextLine();
parseLine(line);
}
You want to parse this line. For that, you need to learn regular expressions in order to use the Pattern and Matcher classes:
private void parseLine(String line) {
Matcher matcher = Pattern.compile("(ABCD)").matcher(line);
if(matcher.find()){
System.out.println("find: " + matcher.group());
}
}
To find the next match in the same row, you can call matcher.find() again. If another match is found, it returns true and you can get the result with matcher.group().

Read the file line by line and use a regex to rewrite each line as needed with String.replaceAll():
^([A-Z]+)([0-9]*)(, [0-9]+)?$
Replacement : $1$2$3, $1
Sample code:
String regex = "^([A-Z]+)([0-9]*)(, [0-9]+)?$";
String replacement = "$1$2$3, $1";
String newLine = line.replaceAll(regex,replacement);
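As a quick check against the sample data from the question, the single-line replacement can be wrapped in a small runnable sketch. The class name AppendPrefixSketch and the method rewrite are hypothetical; the regex and replacement string are the ones shown above.

```java
import java.util.regex.Pattern;

class AppendPrefixSketch {
    // Group 1: leading letters, group 2: optional digits, group 3: optional ", digits" suffix
    private static final Pattern ROW = Pattern.compile("^([A-Z]+)([0-9]*)(, [0-9]+)?$");

    /** Appends ", <letter prefix>" to a matching line; other lines are returned unchanged. */
    static String rewrite(String line) {
        return ROW.matcher(line).replaceAll("$1$2$3, $1");
    }

    public static void main(String[] args) {
        String[] input = {"ABCD", "ABCD123", "ABCD456, 123", "XYZ", "XYZ890", "XYZ123, 890"};
        for (String line : input) {
            System.out.println(rewrite(line));
        }
    }
}
```

In Java, a group that did not take part in the match (here the optional third group) contributes nothing to the replacement, so lines without the ", digits" suffix are handled by the same replacement string.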
For better performance, read 100 or more lines at a time into a buffer and call String#replaceAll() once on the whole batch.
Sample code:
String regex = "([A-Z]+)([0-9]*)(, [0-9]+)?(\r?\n|$)";
String replacement = "$1$2$3, $1$4";
StringBuilder builder = new StringBuilder();
int counter = 0;
String line = null;
try (BufferedReader reader = new BufferedReader(new FileReader("abc.csv"))) {
while ((line = reader.readLine()) != null) {
builder.append(line).append(System.lineSeparator());
if (++counter % 100 == 0) { // flush after every 100 lines
String newLine = builder.toString().replaceAll(regex, replacement);
System.out.print(newLine);
builder.setLength(0); // reset the buffer
}
}
}
if (builder.length() > 0) {
String newLine = builder.toString().replaceAll(regex, replacement);
System.out.print(newLine);
}

How to extract line with word from string (java_android)

I have following code:
private String ReadCPUinfo()
{
ProcessBuilder cmd;
String result="";
try{
String[] args = {"/system/bin/cat", "/proc/cpuinfo"};
cmd = new ProcessBuilder(args);
Process process = cmd.start();
InputStream in = process.getInputStream();
byte[] re = new byte[1024];
int n;
while((n = in.read(re)) != -1){
// convert only the bytes actually read, not the whole buffer
String chunk = new String(re, 0, n);
System.out.println(chunk);
result = result + chunk;
}
in.close();
} catch(IOException ex){
ex.printStackTrace();
}
return result;
}
and I get the contents of /proc/cpuinfo as the result. I need to extract the processor info (Processor: WordIWantToExtract) as a String to put it in a TextView.
I did this in a Python script (print cpuinfo to a text file, look up the number of the line containing the word "Processor", then print that line after editing it). How can I port this to Java?

/proc/cpuinfo is just a text file. Just use a BufferedReader and read the contents instead of using ProcessBuilder. Check for the prefix "Processor" to extract the exact line.
BufferedReader reader =
Files.newBufferedReader(Paths.get("/proc/cpuinfo"), StandardCharsets.UTF_8);
String line;
while ((line = reader.readLine()) != null) {
Matcher m = Pattern.compile("Processor: (.*)").matcher(line);
if (m.find()) {
System.out.println("Processor is " + m.group(1));
...
}
}
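A self-contained variant of this idea is sketched below. It reads from any BufferedReader so it can be tried on a sample string, and it tolerates whitespace around the colon, since the exact layout of /proc/cpuinfo varies by kernel and architecture. The class name CpuInfoSketch, the method processor and the sample text are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CpuInfoSketch {
    // Case-sensitive key "Processor", optional whitespace (e.g. a tab) around the colon
    private static final Pattern PROCESSOR = Pattern.compile("^Processor\\s*:\\s*(.*)$");

    /** Returns the value of the first "Processor" line, or null if there is none. */
    static String processor(BufferedReader reader) throws IOException {
        String line;
        while ((line = reader.readLine()) != null) {
            Matcher m = PROCESSOR.matcher(line);
            if (m.matches()) {
                return m.group(1).trim();
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample; on a device, wrap a reader for /proc/cpuinfo instead
        String sample = "Processor\t: ARMv7 Processor rev 2 (v7l)\nBogoMIPS\t: 1592.52\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            System.out.println("Processor is " + processor(reader));
        }
    }
}
```

The returned String can then be passed to the TextView; compiling the pattern once as a constant avoids recompiling it for every line.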

I would use a JSONObject. You can create the object with a "Processor" key and the word you want. For example,
Map<String, String> processors = new HashMap<String, String>();
processors.put("Processor", "Word");
JSONObject jsonObject = new JSONObject();
jsonObject.element(processors);
The line will look like this, {"Processor": "word", "Other Key": "Other Word"}
Then you can write this to a file,
jsonObject.write(Writer writer);
Then you can read the line from the file and use,
jsonObject.getString("Processor");
I used a HashMap in case you have several keys and values.

I'm not sure I understand your question correctly, but I think you can add this after the while loop:
Matcher matcher = Pattern.compile("Processor: (.*)").matcher(result);
if (matcher.find()) {
String wordYouWantToExtract = matcher.group(1);
}
