Programmatically extract multiple lines of code from a string in Java - java

I have a string like the one below, joined by line breaks. In this string, the first two lines and the last two lines are fixed: "public class MyClass {\n public void code() {\n"
String doc =
"public class MyClass {
public void code() {
try (...) {
...
}
}
}"
I only want to extract the lines of code inside the method code(), that is, everything except the first two lines and the last two lines. This is what I did in my project:
String[] lines = doc.split("\\r?\\n");
String[] codes = Arrays.copyOfRange(lines, 2, lines.length - 2);
String result = String.join("\n", codes);
Is there a better way to extract the lines in the middle?

The only real answer: use an existing parser framework, such as javaparser.
Seriously, that simple.
Anything else means: you are spending time and energy to solve a solved problem. The result will be deficient, compared to any mature product, and it will be a constant liability in the future. You can get your tool to work with code you have in front of you right now, but the second your tool gets used to "parse" slightly different code, it will most likely break.
In case you are asking for educational purposes, then learn how a compiler works, what it takes to tokenize Java source code, and how to turn it into an abstract syntax tree (AST) representation.

Assuming the task is meant for basic educational purposes or a quick hack (otherwise @GhostCat's answer takes precedence):
Even method detection, taken seriously, is not so easy. Basically you have to start implementing your own syntax parser for a fraction of the Java language: chop everything into single words, skip the class declaration, wait for "static", "public", "protected", "private", "synchronized" (I hope I didn't forget one), skip over them, any type parameters ("<T>") and the return type ("void", "String", ...); then you are at the name, followed by "(", optional method parameters, and so on.
Perhaps there are restrictions to the task that make it less complicated. You should ask for clarification.
The problem in any case will be to find the closing braces and skip them. If you can afford to neglect things such as braces in strings (String s = "ab{{c";) or comments ("/* {{{ */"), it is enough to count up for each "{" occurring after e.g. "public void code() {" and count down for each "}". When the brace count is 0 and you see another "}", that one can be skipped, along with everything until the next method declaration.
If that's not precise enough, or your requirements are of a more serious nature, you'd have to get into parsing, e.g. using antlr or Javaparser. Here's a project that seems to do a similar task.
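The brace counting described above can be sketched roughly as follows. This is only a minimal illustration under the same simplifying assumption (no braces inside string literals or comments); the class name BraceCountExtractor and the method extractBody are made up for this example and are not part of any library.

```java
public class BraceCountExtractor {

    /**
     * Returns the text between the "{" that follows the given method
     * signature and its matching "}". Naive on purpose: braces inside
     * string literals or comments are counted like any other brace.
     */
    public static String extractBody(String source, String signature) {
        int header = source.indexOf(signature);
        if (header < 0) {
            throw new IllegalArgumentException("signature not found: " + signature);
        }
        int open = source.indexOf('{', header);
        if (open < 0) {
            throw new IllegalArgumentException("no opening brace after signature");
        }
        int depth = 0;
        for (int i = open; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c == '{') {
                depth++;                  // entering a nested block
            } else if (c == '}') {
                depth--;                  // leaving a block
                if (depth == 0) {         // this brace closes the method
                    return source.substring(open + 1, i).trim();
                }
            }
        }
        throw new IllegalArgumentException("unbalanced braces");
    }

    public static void main(String[] args) {
        String doc = String.join("\n",
                "public class MyClass {",
                "public void code() {",
                "try {",
                "doStuff();",
                "}",
                "}",
                "}");
        // Prints the three lines of the try block, without the method braces.
        System.out.println(extractBody(doc, "public void code()"));
    }
}
```

Unlike dropping a fixed number of lines, this does not depend on where the line breaks fall, but it breaks as soon as a string literal or comment contains a brace.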

Learning JavaParser takes some time. It isn't difficult, and there is a Javadoc documentation page available on the Internet (see here), but unfortunately there isn't much explanatory text in the documentation pages themselves. This class prints the method bodies from a source-code file that is stored as a String.
Every method in the class is printed...
import com.github.javaparser.ast.*;
import com.github.javaparser.ast.stmt.BlockStmt;
import com.github.javaparser.ast.body.MethodDeclaration;
import com.github.javaparser.ast.visitor.VoidVisitor;
import com.github.javaparser.ast.visitor.VoidVisitorAdapter;
import com.github.javaparser.*;
import java.io.IOException;
import java.util.Optional;
public class MethodBody
{
static final String src =
"public class MyClass {" + '\n' +
" public void code() {" + '\n' +
" try {" + '\n' +
" /* do stuff */ " + '\n' +
" }" + '\n' +
" catch (Exception e) { }" + '\n' +
" }" + '\n' +
"}";
public static void main(String[] argv) throws IOException
{
CompilationUnit cu = StaticJavaParser.parse(src);
VoidVisitor<?> visitor = new VoidVisitorAdapter<Void>()
{
@Override
public void visit(MethodDeclaration md, Void arg)
{
System.out.println("Method Name: " + md.getName());
Optional<BlockStmt> optBody = md.getBody();
// Abstract and interface methods have no body; calling get() on an empty Optional would throw.
if (! optBody.isPresent()) System.out.println("No Method Body Definition\n");
else System.out.println("Method Body:\n" + optBody.get().toString() + "\n\n");
}
};
visitor.visit(cu, null);
}
}
The above code will print this to terminal:
Method Name: code
Method Body:
{
try {
/* do stuff */
} catch (Exception e) {
}
}

Related

JavaParser - save modified source code file with new name

I'm using JavaParser to modify Java source code. My goal is to read a single Java source code file (ArithmeticClassToBeMutated) and store it in a compilation unit. Then I'd like to replace/mutate its arithmetic operators (PLUS, MINUS, MULTIPLY, DIVISION, REMAINDER). All instances of an operator (e.g. plus) shall always be replaced with another one (e.g. minus). In the end, I want to have 4 output files:
One java source code file where every "Plus" became a "Minus",
one file where every "Plus" became a "Multiply",
one file where every "Plus" became a "Division", and
one file where every "Plus" became a "Remainder/Modulo". I can't type the symbols or else I get a formatting error.
In my code (see below), the replacement/modification itself works. Now, my question is: how can I change the name of the output source code files? I did it with the methods add and saveAll:
sourceRoot.add("", OperatorToBeMutated.name() + "_TO_" + arithmeticOperators[i].name() + "_MUTATED_"
+ cu.getStorage().get().getFileName(), cu);
sourceRoot.saveAll(
CodeGenerationUtils.mavenModuleRoot(ReplacementAO.class).resolve(Paths.get("output")));
However, this creates two output files for each operator replacement. One file has the same name as the input file, and one file has my naming convention. The content is the same. What can I do to only save a single file (with my own naming) for each loop? Specifying no name would result in the output file overwriting itself with each iteration, as the name stays the same.
Thank you!
public static String filename = "ArithmeticClassToBeMutated.java";
public static void main(String[] args) {
for (i = 0; i < arithmeticOperators.length; i++) {
if (arithmeticOperators[i] == OperatorToBeMutated) {
continue;
}
sourceRoot = new SourceRoot(
CodeGenerationUtils.mavenModuleRoot(ReplacementAO.class).resolve("src/main/resources"));
CompilationUnit cu = sourceRoot.parse("", filename);
cu.accept(new ModifierVisitor<Void>() {
@Override
public Visitable visit(BinaryExpr n, Void arg) {
if (n.getOperator() == OperatorToBeMutated && n.getLeft().isNameExpr()
&& n.getRight().isNameExpr()) {
n.setOperator(arithmeticOperators[i]);
comment.setContent("Here, the operator " + OperatorToBeMutated.name()
+ " was mutated with the operator " + arithmeticOperators[i].name() + ".");
n.setComment(comment);
}
return super.visit(n, arg);
}
}, null);
sourceRoot.add("", OperatorToBeMutated.name() + "_TO_" + arithmeticOperators[i].name() + "_MUTATED_"
+ cu.getStorage().get().getFileName(), cu);
sourceRoot.saveAll(
CodeGenerationUtils.mavenModuleRoot(ReplacementAO.class).resolve(Paths.get("output")));
}
}
You wouldn't want the file name to differ from the class name, as a public Java class must have the same name as its file. As far as I am aware, a mismatch causes a compiler error for public classes:
Can I compile a java file with a different name than the class?
I would suggest putting the mutated classes into different folders. Just adding another directory at the end of your path will automatically create a new folder. So for your example:
sourceRoot.saveAll(CodeGenerationUtils.mavenModuleRoot(LogicalOperators.class).resolve(Paths.get("output" + "/mutation1")));
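One way to apply the "one folder per mutation" idea without going through saveAll at all is to write the mutated source text yourself. The sketch below uses only the standard library; MutantWriter and writeMutant are hypothetical names, and it assumes you obtain the source text from the compilation unit (for example via cu.toString()) before calling it.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class MutantWriter {

    /**
     * Writes source to outputRoot/mutationName/fileName and returns that path.
     * The file keeps its original name, so a public class still matches its
     * file name; only the directory differs per mutation.
     */
    public static Path writeMutant(Path outputRoot, String mutationName,
                                   String fileName, String source) {
        Path target = outputRoot.resolve(mutationName).resolve(fileName);
        try {
            Files.createDirectories(target.getParent());   // e.g. output/PLUS_TO_MINUS
            Files.write(target, source.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("mutants");
        // Stand-in for the mutated source; with JavaParser this would come from the CompilationUnit.
        String source = "public class ArithmeticClassToBeMutated { int f(int a, int b) { return a - b; } }";
        Path written = writeMutant(root, "PLUS_TO_MINUS", "ArithmeticClassToBeMutated.java", source);
        System.out.println(written);
    }
}
```

Called once per loop iteration with a different mutationName, this produces exactly one file per operator replacement.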

Adding a line in a method block of java code using python

I have a lot of Java files in which I have to search for a method; if it is present, I have to add a line inside this method, but only if the line does not already exist. The line has to be added before the closing brace of the method.
So far I have the following code:
import os
import ntpath
extensions = set(['.java','.kt'])
for subdir, dirs, files in os.walk("/src/main"):
    for file in files:
        filepath = subdir + os.sep + file
        extension = os.path.splitext(filepath)[1]
        if extension in extensions:
            if 'onCreate(' in open(filepath).read():
                print (ntpath.basename(filepath))
                if 'onPause' in open (filepath).read():
                    print ("is Activity and contains onPause\n")
                    #Check if Config.pauseCollectingLifecycleData(); is in this code block, if exists do nothing, if does not exist add to the end of code block before }
                if 'onResume' in open (filepath).read():
                    print ("is Activity and contains onResume\n")
                    #Check if Config.resumeCollectingLifecycleData(); is in this code block, if exists do nothing, if does not exist add to the end of code block before }
But I am not sure where to go from here, Python not being my first language. Could I request to be guided in the right direction.
Example:
I am looking for a method with the following signature:
public void onPause(){
super.onPause();
// Add my line here
}
public void onPause(){
super.onPause();
Config.pauseCollectingLifecycleData(); // Line exists do nothing
}
This is actually quite difficult. First of all, your if "onPause" in sourcecode approach currently doesn't distinguish between defining onPause() and calling it. And second of all, finding the correct closing } isn't trivial. Naively, you might just count opening and closing curlies ({ increments the blocklevel, } decrements it), and assume that the } that makes the blocklevel zero is the closing curly of the method. However, this might be wrong! Because the method might contain some string literal containing (possibly unbalanced) curlies. Or comments with curlies. This would mess up the blocklevel count.
To do this properly, you would have to build an actual Java parser. That's a lot of work, even when using libraries such as tatsu.
If you're fine with a rather volatile kludge, you can try and use the blocklevel count mentioned above together with the indentation as a clue (assuming your source code is decently indented). Here's something I've hacked up as a starting point:
def augment_function(sourcecode, function, line_to_insert):
    in_function = False
    blocklevel = 0
    insert_before = None
    source = sourcecode.split("\n")
    for line_no, line in enumerate(source):
        if in_function:
            if "{" in line:
                blocklevel += 1
            if "}" in line:
                blocklevel -= 1
                if blocklevel == 0:
                    # this is the closing curly of the function
                    insert_before = line_no
                    indent = len(line) - len(line.lstrip(" ")) + 4  #4=your indent level
                    break
        elif function in line and "public " in line:
            in_function = True
            if "{" in line:
                blocklevel += 1
    # compare against None so that a match on line 0 is not ignored
    if insert_before is not None:
        source.insert(insert_before, " "*indent + line_to_insert)
    return "\n".join(source)
# test code:
java_code = """class Foo {
    private int foo;
    public void main(String[] args) {
        foo = 1;
    }
    public void setFoo(int f)
    {
        foo = f;
    }
    public int getFoo(int f) {
        return foo;
    }
}
"""
print(augment_function(java_code, "setFoo", "log.debug(\"setFoo\")"))
Note that this is vulnerable to all sorts of edge cases (such as { in a string or in a comment, or tab indent instead of space, or possibly a thousand other things). This is just a starting point for you.
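Two of the edge cases named above, curlies inside string literals and inside comments, can be handled without a full parser by skipping those regions while counting. The following is a rough Java sketch of that idea, not a complete solution: BraceMatcher and its methods are hypothetical names, text blocks (""" ... """) are not handled, and the "already present" check is a plain substring test.

```java
public class BraceMatcher {

    /** Returns the index of the "}" matching the "{" at index open, or -1 if none is found. */
    public static int findMatchingBrace(String src, int open) {
        int depth = 0;
        int i = open;
        int n = src.length();
        while (i < n) {
            char c = src.charAt(i);
            if (c == '/' && i + 1 < n && src.charAt(i + 1) == '/') {
                int eol = src.indexOf('\n', i);            // line comment: skip to end of line
                if (eol < 0) return -1;
                i = eol + 1;
            } else if (c == '/' && i + 1 < n && src.charAt(i + 1) == '*') {
                int end = src.indexOf("*/", i + 2);        // block comment: skip to its end
                if (end < 0) return -1;
                i = end + 2;
            } else if (c == '"' || c == '\'') {
                i = skipLiteral(src, i);                   // string or char literal
            } else {
                if (c == '{') depth++;
                else if (c == '}' && --depth == 0) return i;
                i++;
            }
        }
        return -1;
    }

    /** Returns the index just past the quoted literal that starts at start. */
    private static int skipLiteral(String src, int start) {
        char quote = src.charAt(start);
        int i = start + 1;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '\\') i += 2;                         // skip the escaped character
            else if (c == quote) return i + 1;
            else i++;
        }
        return src.length();
    }

    /** Inserts line before the closing brace of the first method whose header contains methodHeader. */
    public static String insertBeforeClosingBrace(String src, String methodHeader, String line) {
        int header = src.indexOf(methodHeader);
        if (header < 0) return src;
        int open = src.indexOf('{', header);
        if (open < 0) return src;
        int close = findMatchingBrace(src, open);
        if (close < 0) return src;
        if (src.substring(open, close).contains(line)) return src;   // line exists, do nothing
        int lineStart = src.lastIndexOf('\n', close) + 1;
        String indent = src.substring(lineStart, close);
        if (!indent.trim().isEmpty()) {                    // brace shares its line with other code
            lineStart = close;
            indent = "";
        }
        return src.substring(0, lineStart) + indent + "    " + line + "\n" + src.substring(lineStart);
    }

    public static void main(String[] args) {
        String src = "public void onPause(){\n    super.onPause();\n}\n";
        System.out.println(insertBeforeClosingBrace(src, "void onPause()",
                "Config.pauseCollectingLifecycleData();"));
    }
}
```

This still cannot tell a method definition from a call that happens to be followed by a brace, so it remains a kludge in the same spirit as the Python version.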

Match a string from a list and extract values

What would be the most efficient (low CPU time) way of achieving the following in Java ?
Let us say we have a list of strings as follows :
1.T.methodA(p1).methodB(p2,p3).methodC(p4)
2.T.methodX.methodY(p5,p6).methodZ()
3 ...
At runtime we get strings as follows that may match one of the strings in our list :
a.T.methodA(p1Value).methodB(p2Value,p3Value).methodC(p4Value) // Matches 1
b.T.methodM().methodL(p10) // No Match
c.T.methodX.methodY(p5Value,p6Value).methodZ() // Matches 2
I would like to match (a) to (1) and extract the values of p1,p2,p3 and p4
where:
p1Value = p1, p2Value = p2, p3Value = p3 and so on.
Similarly for the other matches like c to 2 for example.
The first method I have in mind is of course a regular expression.
But that could be complicated to update in the future or to extend to edge cases.
Instead you can try using the Nashorn engine (bundled with JDK 8 through 14), which allows you to execute JavaScript code in a JVM.
So you just need to create a special JavaScript object that handles all your methods:
private static final String jsLib = "var T = {" +
"results: new java.util.HashMap()," +
"methodA: function (p1) {" +
" this.results.put('p1', p1);" +
" return this;" +
"}," +
"methodB: function (p2, p3) {" +
" this.results.put('p2', p2);" +
" this.results.put('p3', p3);" +
" return this;" +
"}," +
"methodC: function (p4) {" +
" this.results.put('p4', p4);" +
" return this.results;" +
"}}";
This is a string for simplicity, and it handles your first case.
You can write the code in a .js file and load that one easily.
You create a special attribute in your JavaScript object, a Java HashMap, so you get it back as the result of the evaluation, with all the values by name.
So you just eval the input:
ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
final String inputSctipt = "T.methodA('p1Value').methodB('p2Value','p3Value').methodC('p4Value')";
try {
engine.eval(jsLib);
Map<String, Object> result = (Map<String, Object>)engine.eval(inputSctipt);
System.out.println("Script result:\n" + result.get("p1"));
} catch (ScriptException e) {
e.printStackTrace();
}
And you get:
Script result:
p1Value
In the same way you can get all the other values.
You need to ignore script errors, as they should correspond to paths that are not implemented.
Just remember to reset the script context before each evaluation, to avoid mixing in previous values.
The advantage of this solution compared to regular expressions is that it is easy to understand and easy to update when needed.
The only disadvantages I can see are that it requires knowledge of JavaScript, of course, and the performance.
You didn't mention performance as an issue, so you can try this approach if it fits your needs.
If you need better performance, then you should look at regular expressions.
UPDATE
To have a more complete answer, here is the same example with regular expressions:
Pattern p = Pattern.compile("^T\\.methodA\\(['\"]?(.+?)['\"]?\\)\\.methodB\\(['\"]?([^,]+?)['\"]?,['\"]?(.+?)['\"]?\\)\\.methodC\\(['\"]?(.+?)['\"]?\\)$");
Matcher m = p.matcher(inputSctipt);
if (m.find()) {
System.out.println("With regexp:\n" + m.group(1));
}
Please be aware that this expression doesn't handle edge cases, and you're going to need a regular expression for each string you want to parse to grab the attribute values.
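Writing one regular expression per template by hand can be avoided by generating the pattern from the template string once, at startup. The sketch below assumes that every identifier inside parentheses in a template is a parameter name and that values contain no commas or parentheses; TemplateMatcher is a hypothetical helper, not an existing API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TemplateMatcher {
    // Finds each "(...)" argument list in a template.
    private static final Pattern ARGS = Pattern.compile("\\(([^()]*)\\)");

    private final Pattern pattern;
    private final List<String> names = new ArrayList<>();

    /** Compiles a template such as "T.methodA(p1).methodB(p2,p3)" into a regex, once. */
    public TemplateMatcher(String template) {
        StringBuilder regex = new StringBuilder();
        Matcher m = ARGS.matcher(template);
        int last = 0;
        while (m.find()) {
            // Literal text before the argument list is matched verbatim.
            regex.append(Pattern.quote(template.substring(last, m.start()))).append("\\(");
            String args = m.group(1).trim();
            if (!args.isEmpty()) {
                String[] params = args.split("\\s*,\\s*");
                for (int i = 0; i < params.length; i++) {
                    if (i > 0) regex.append("\\s*,\\s*");
                    names.add(params[i]);
                    regex.append("([^,()]*)");   // one capture group per parameter
                }
            }
            regex.append("\\)");
            last = m.end();
        }
        regex.append(Pattern.quote(template.substring(last)));
        pattern = Pattern.compile(regex.toString());
    }

    /** Returns parameter name -> value, or null if the input does not match this template. */
    public Map<String, String> match(String input) {
        Matcher m = pattern.matcher(input);
        if (!m.matches()) return null;
        Map<String, String> values = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            values.put(names.get(i), m.group(i + 1).trim());
        }
        return values;
    }

    public static void main(String[] args) {
        TemplateMatcher t = new TemplateMatcher("T.methodA(p1).methodB(p2,p3).methodC(p4)");
        System.out.println(t.match("T.methodA(p1Value).methodB(p2Value,p3Value).methodC(p4Value)"));
        System.out.println(t.match("T.methodM().methodL(p10)"));
    }
}
```

Because each Pattern is compiled only once, the per-input cost is a single matches() call per template; trying the templates in a loop until one returns a non-null map covers the whole list.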

Automated conversion from string concatenation to formatted arguments

Our code is littered with things like,
Log.d("Hello there " + x + ", I see your are " + y + " years old!");
I want to be able to script the conversion to something like this,
Log.d("Hello there %s, I see your are %d years old!", x, y);
(Note: I'm not worried about getting the right argument type now. I could pre-process the file to determine the types, or convert to always use strings. Not my concern right now.)
I am wondering if anyone has tackled this. I came up with these regexes for pulling out the static and variable parts of the strings,
static final Pattern P1 = Pattern.compile("\\s*(\".*?\")\\s*");
static final Pattern P2 = Pattern.compile("\\s*\\+?\\s*([^\\+]+)\\s*\\+?\\s*");
By looping on find() for each I can pull out the parts,
"Hello there "
", I see your are "
"years old!"
and,
x
y
But I can't come up with a good way to piece these back together, considering all the possibilities of how they might be concatenated together.
Maybe this is the wrong approach. Should I be trying to pull out, then replace the variable part with the format argument?
If you are willing to replace everything with %s, you could do this:
(P.S.: assuming well-formatted code in terms of whitespace.)
Keep resolving from RIGHT to LEFT, as parameter position is important.
1.) Run this regex to resolve everything of the form Log.d({something} + var) to Log.d({something}, var)
(Log\.d\(.*?)\"\s*\+\s*([^\s]+)(\+)?(\))
with replacement
$1%s", $2$4
(https://regex101.com/r/hY2iK6/8)
2.) Now you need to take care of every variable occurring between strings:
Keep running this regex until no more replacements occur:
(Log\.d\(.*)(\"\s*\+\s*([^\s]+)\s*\+\s*\")(.*?\"),([^\"]+);
with replacement
$1%s$4,$3,$5;
After run 1: https://regex101.com/r/hY2iK6/10
After run 2: https://regex101.com/r/hY2iK6/11
3.) Finally, you need to resolve the Strings containing a leading variable - which is no problem:
(Log\.d\()([^\"]+)\s+\+\s*\"(.*?),([^"]+;)
with replacement
$1"%s$3,$2,$4
https://regex101.com/r/hY2iK6/9
There might be cases not covered, but it should give you an idea.
I added Log.d to the match groups as well, since it's part of the replacement, so you could just as well use Log\.(?:d|f|e) if you like.
You can use the following regex to capture all the arguments and strings together in one go. Therefore you can figure out exactly where the arguments are meant to fit into the overall string using the pairings.
(?:(\w+)\s*\+\s*)?"((?:[^"\\]|\\.)*+)"(?:\s*\+\s*(\w+))?
Regex demo here. (Thanks to nhahtdh for the improved version.)
It will find all the concatenations as part of Log.d in the format:
[<variable> +] <string> [+ <variable>]
Where [] denotes an optional part.
With that you can form the appropriate replacements, take the following example:
import java.util.regex.Pattern;
import java.util.regex.Matcher;
import java.lang.StringBuilder;
import java.util.List;
import java.util.ArrayList;
class Main {
public static void main(String[] args) {
String log = "Log.d(\"Hello there \" + x + \", I see your are \" + y + \" years old!\");";
System.out.println("Input: " + log);
Pattern p = Pattern.compile("(?:(\\w+)\\s*\\+\\s*)?\"((?:[^\"\\\\]|\\\\.)*+)\"(?:\\s*\\+\\s*(\\w+))?");
Matcher m = p.matcher(log);
StringBuilder output = new StringBuilder(25);
List<String> arguments = new ArrayList<String>(5);
output.append("Log.d(\"");
while (m.find()) {
if (m.group(1) != null) {
output.append("%s");
arguments.add(m.group(1));
}
output.append(m.group(2));
if (m.group(3) != null) {
output.append("%s");
arguments.add(m.group(3));
}
}
output.append("\"");
for (String arg : arguments) {
output.append(", ").append(arg);
}
output.append(");");
System.out.println("Output: " + output);
}
}
Input: Log.d("Hello there " + x + ", I see your are " + y + " years old!");
Output: Log.d("Hello there %s, I see your are %s years old!", x, y);
Java demo here.

"null" logic error - Java

Hi so I'm relatively new at Java, I have about 2 months of experience, so please try to answer my question using terms and code relevant to my learning level.
So, I have to make a program for school that makes a letter, fitting the following format:
Dear recipient name:
blank line
first line of the body
second line of the body
. . .
last line of the body
blank line
Sincerely,
blank line
sender name
my code looks like this:
private String body;
private String letter;
public Letter(String from, String to)
{
letter = ("Dear " + to + ":" + "\n" + "\n" + body + "\n" + "Sincerely," + "\n" + "\n" + from);
body = "";
}
public void addLine(String line)
{
body = body + line + "\n";
}
public String getText()
{
return letter;
}
I've tried several different ways to get this program done, and the one that yields the best results is this one. The thing is, we're only supposed to use two instance fields max. It seems that it's null because body isn't given a value in my constructor. There's also a program tester class that looks like this:
public class LetterTester
{
public static void main(String [] args)
{
Letter tyler = new Letter("Mary", "John");
tyler.addLine("I am sorry we must part.");
tyler.addLine("I wish you all the best.");
System.out.println(tyler.getText());
}
}
I skipped all the default stuff and some braces, and there are no syntax errors, but when I run the tester class, I get:
Dear John:
null
Sincerely,
Mary
What am I doing wrong, and can someone please give a solution as to how to get rid of null? Keep in mind I can only use two instance fields, thanks.
body is null when letter is built because null is the default value for reference fields, and the constructor assigns body = "" only afterwards. You could initialize it to the empty string (body = "") before it is used instead. That would work with your addLine() code. You should also move constructing the content from the constructor to getText(). In the constructor the required data is not yet available.
Also consider using a StringBuilder. That's usually a better choice than + when you need to make several concatenations.
Edit: (after a clarifying comment by the OP, and myself reading the question better)
In the constructor you can start the letter like:
body = "Dear " + to + ":" + "\n\n";
sender = from;
Here I made sender a field. You don't need the letter field, so you can still stay at the max 2 fields limit.
You will have to initialize the body variable with an empty string before it is used. Otherwise it is initialized as null, and concatenating with it yields the text "null" instead of an empty string.
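Putting these suggestions together, a version of the class that stays within two instance fields might look like the sketch below. It follows the body/sender split suggested above and builds the full text only in getText(); it is one possible arrangement, not the only one.

```java
public class Letter {
    private String body;           // greeting plus every line added so far
    private final String sender;   // name printed after "Sincerely,"

    public Letter(String from, String to) {
        sender = from;
        // Start the body with the greeting and a blank line, so it is never null.
        body = "Dear " + to + ":\n\n";
    }

    public void addLine(String line) {
        body = body + line + "\n";
    }

    public String getText() {
        // Assemble the letter here, when all lines are known.
        return body + "\nSincerely,\n\n" + sender;
    }

    public static void main(String[] args) {
        Letter tyler = new Letter("Mary", "John");
        tyler.addLine("I am sorry we must part.");
        tyler.addLine("I wish you all the best.");
        System.out.println(tyler.getText());
    }
}
```

Running main prints the greeting, a blank line, the two body lines, a blank line, "Sincerely,", a blank line, and "Mary", with no "null" in the output.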
