How do we modify the source of original Java classes? - java

Hi all, I was wondering whether I could modify and recompile a Java base class.
I would like to add functions to existing classes and be able to call these functions.
For example, I would like to add a function to java.lang.String, recompile it and use it for my project:
public char[] getInternalValue(){
return value;
}
How would we go about doing that?

What you're referring to is called "monkey patching". It's possible in Java, but it isn't advisable and the results can be... uhh interesting. You can download the source for the String class, pop it into a JAR and prepend the bootclasspath with:
-Xbootclasspath/p:MonkeyPatchedString.jar
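to replace the built-in String class with your own. (This option exists up to Java 8; it was removed in Java 9, where --patch-module serves a similar purpose.)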
There's an interesting paper on this very subject here.

If you do this, your class becomes incompatible with the standard java.lang.String class and with every class that relies on it, which is very, very rarely a good idea.
A second problem could be the license. For self-study it is perfectly fine, but if you publish your code (compiled or in source form) you should read the license terms carefully first.
Since the String class is declared final, you can't even inherit from String and implement your own PacerierString, which seems useful at first sight. But so many people would implement their own little helpers that we would end up with a lot of SpecialString classes from everywhere.
A common practice would be for people writing a class Foo to add a method
public Foo toFoo () {
// some conversion for String representation of Foo
}
to their UniversalToolString.
You may, however, write a wrapper that contains a String. You cannot pass your wrapper to a method that expects a String; instead you would need to call its toString() method, if that happens to be a good candidate for that purpose.
Foo foo = new Foo ("foobar", 42);
foo.setMagic (foo.toString ().length ());
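The wrapper idea can be sketched as follows. This is only an illustration: the class name MagicString and its reversed() helper are hypothetical and do not come from the answer above.

```java
import java.util.Objects;

// Hypothetical wrapper: String is final, so we hold one instead of extending it.
final class MagicString {
    private final String value;

    MagicString(String value) {
        this.value = Objects.requireNonNull(value);
    }

    // An extra helper that java.lang.String itself does not offer.
    String reversed() {
        return new StringBuilder(value).reverse().toString();
    }

    // Hands the plain String back to APIs that expect one.
    @Override
    public String toString() {
        return value;
    }
}

class WrapperDemo {
    public static void main(String[] args) {
        MagicString s = new MagicString("foobar");
        // A method expecting a String needs an explicit toString() call.
        System.out.println(s.toString().length()); // prints 6
        System.out.println(s.reversed());          // prints raboof
    }
}
```

The cost of this design is the explicit toString() call at every boundary with code that expects a real String; the benefit is that java.lang.String itself stays untouched.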

Don't do that.
If you want to be evil, you can use reflection to access the internal array of a String (a char[] up to Java 8, a byte[] since Java 9). But of course there's no guarantee that that field will exist in the future as is... ok, it probably will, but caveat emptor.
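A rough sketch of that reflection approach, under two assumptions: the field is named value (as in the question's snippet), and the running JDK permits the access. Recent JDKs refuse it unless java.lang is opened explicitly (for example with --add-opens java.base/java.lang=ALL-UNNAMED), so the helper below returns null instead of failing. The class and method names are made up for the example.

```java
import java.lang.reflect.Field;

class StringInternals {
    // Tries to read String's private "value" field; returns null if the JDK forbids it.
    static Object internalValue(String s) {
        try {
            Field f = String.class.getDeclaredField("value");
            f.setAccessible(true); // rejected on recent JDKs unless java.lang is opened
            return f.get(s);
        } catch (ReflectiveOperationException | RuntimeException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        Object raw = internalValue("hello");
        System.out.println(raw == null
                ? "access denied by this JDK"
                : "got a " + raw.getClass().getSimpleName());
        // The supported way to get the characters is a defensive copy:
        System.out.println("hello".toCharArray().length); // prints 5
    }
}
```

What the first line prints depends on the JDK version and launch flags, which is exactly the fragility the answer warns about.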

Using 'diamond' notation for methods in java

I'm currently working on a component-based architecture management system in java. My current implementation of the retrieval of a component attached to an object works like this:
// ...
private final HashMap<Class<? extends EntityComponent>, EntityComponent> components;
// ...
public <T extends EntityComponent> T getComponent(Class<T> component)
{
    // ... some sanity checks
    if (!this.hasComponent(component))
    {
        // ... some exception handling stuff
    }
    return component.cast(this.components.get(component));
}
// ...
Now, this works fine, but it somewhat bugs me to have to write
object.getComponent(SomeComponent.class)
every time I need to access a component.
Would it be possible to utilize generics in a way to shift the syntax to something more along the lines of
object.getComponent<SomeComponent>()
, utilizing the diamond operator to specify the class, instead of passing the class of the component as a parameter to the method?
I know it's not really a big thing, but making the syntax of often used code as pretty / compact as possible goes a long way I guess.
Unfortunately not, since type-parameters are "erased" in Java. That means that they are only available at compile-time (where the compiler is using them to type-check the code), but not at run-time.
So when your code is running, the <SomeComponent> type-parameter no longer exists, and your code therefore can't do any operations (if/else, etc) based on its value.
In other words:
At compile time, your method call looks like this: object.getComponent<SomeComponent>()
But after compilation your method call just looks like this object.getComponent(). There is no type-parameter any more.
So, yes, unfortunately you still need to pass a Class object along, or something similar (see "Super Type Tokens" for example), if you need to do something that depends on the type parameter at run-time.
The reason the Class workaround works is that it loosely speaking represents the type-parameter, since the type-checker makes sure that its instance fits with the type-parameter, but is an object and thus available at run-time too - unlike the type-parameter.
Note: The Class trick doesn't work for type parameters within type parameters, such as Class<List<Something>>, since at run-time List<Something> and List<OtherThing> are the same class, namely List. So you can't make a Class token to differentiate between those two types. As far as I remember, "Super Type Tokens" can be used instead to fix this. They exploit the fact that there is an exception to erasure: for subclasses of generic classes, the type parameters used when "extending" the superclass are actually available at run-time through reflection. (There are more exceptions too: https://stackoverflow.com/a/2320725/1743225)
(Related google terms: "Erasure", "Reification", "Reified generics")
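To make the erasure point and the "Super Type Token" workaround concrete, here is a minimal sketch. The class name TypeRef is an assumption made for this example, not an existing JDK class; only the reflection calls (getGenericSuperclass, getActualTypeArguments) are standard.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

// Hypothetical super type token; must be instantiated as an anonymous subclass.
abstract class TypeRef<T> {
    private final Type type;

    protected TypeRef() {
        // The anonymous subclass records its generic superclass, e.g. TypeRef<List<String>>.
        Type superType = getClass().getGenericSuperclass();
        if (!(superType instanceof ParameterizedType)) {
            throw new IllegalStateException("TypeRef must be created with a type argument");
        }
        this.type = ((ParameterizedType) superType).getActualTypeArguments()[0];
    }

    Type getType() {
        return type;
    }
}

class ErasureDemo {
    public static void main(String[] args) {
        // Erasure: both lists share one runtime class.
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        System.out.println(strings.getClass() == numbers.getClass()); // prints true

        // The trailing {} creates the anonymous subclass that keeps the type argument.
        Type listOfString = new TypeRef<List<String>>() {}.getType();
        Type listOfInteger = new TypeRef<List<Integer>>() {}.getType();
        System.out.println(listOfString.equals(listOfInteger)); // prints false
    }
}
```

The token distinguishes List<String> from List<Integer>, which a plain Class object cannot do, at the price of the slightly odd `{}` at every use site.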

Why isn't calling a static method by way of an instance an error for the Java compiler?

I'm sure you all know the behaviour I mean - code such as:
Thread thread = new Thread();
int activeCount = thread.activeCount();
provokes a compiler warning. Why isn't it an error?
EDIT:
To be clear: question has nothing to do with Threads. I realise Thread examples are often given when discussing this because of the potential to really mess things up with them. But really the problem is that such usage is always nonsense and you can't (competently) write such a call and mean it. Any example of this type of method call would be barmy. Here's another:
String hello = "hello";
String number123AsString = hello.valueOf(123);
Which makes it look as if each String instance comes with a "String valueOf(int i)" method.
Basically I believe the Java designers made a mistake when they designed the language, and it's too late to fix it due to the compatibility issues involved. Yes, it can lead to very misleading code. Yes, you should avoid it. Yes, you should make sure your IDE is configured to treat it as an error, IMO. Should you ever design a language yourself, bear it in mind as an example of the kind of thing to avoid :)
Just to respond to DJClayworth's point, here's what's allowed in C#:
public class Foo
{
    public static void Bar()
    {
    }
}
public class Abc
{
    public void Test()
    {
        // Static methods in the same class and base classes
        // (and outer classes) are available, with no
        // qualification
        Def();
        // Static methods in other classes are available via
        // the class name
        Foo.Bar();
        Abc abc = new Abc();
        // This would *not* be legal. It being legal has no benefit,
        // and just allows misleading code
        // abc.Def();
    }
    public static void Def()
    {
    }
}
Why do I think it's misleading? Because if I look at code someVariable.SomeMethod() I expect it to use the value of someVariable. If SomeMethod() is a static method, that expectation is invalid; the code is tricking me. How can that possibly be a good thing?
Bizarrely enough, Java won't let you use a potentially uninitialized variable to call a static method, despite the fact that the only information it's going to use is the declared type of the variable. It's an inconsistent and unhelpful mess. Why allow it?
EDIT: This edit is a response to Clayton's answer, which claims it allows inheritance for static methods. It doesn't. Static methods just aren't polymorphic. Here's a short but complete program to demonstrate that:
class Base
{
    static void foo()
    {
        System.out.println("Base.foo()");
    }
}
class Derived extends Base
{
    static void foo()
    {
        System.out.println("Derived.foo()");
    }
}
public class Test
{
    public static void main(String[] args)
    {
        Base b = new Derived();
        b.foo(); // Prints "Base.foo()"
        b = null;
        b.foo(); // Still prints "Base.foo()"
    }
}
As you can see, the execution-time value of b is completely ignored.
Why should it be an error? The instance has access to all the static methods. The static methods can't change the state of the instance (trying to is a compile error).
The problem with the well-known example that you give is very specific to threads, not static method calls. It looks as though you're getting the activeCount() for the thread referred to by thread, but you're really getting the count for the calling thread. This is a logical error that you as a programmer are making. Issuing a warning is the appropriate thing for the compiler to do in this case. It's up to you to heed the warning and fix your code.
EDIT: I realize that the syntax of the language is what's allowing you to write misleading code, but remember that the compiler and its warnings are part of the language too. The language allows you to do something that the compiler considers dubious, but it gives you the warning to make sure you're aware that it could cause problems.
They cannot make it an error anymore, because of all the code that is already out there.
I am with you on that it should be an error.
Maybe there should be an option/profile for the compiler to upgrade some warnings to errors.
Update: When they introduced the assert keyword in 1.4, which has similar potential compatibility issues with old code, they made it available only if you explicitly set the source mode to "1.4". I suppose one could make it an error in a new source mode "java 7". But I doubt they would do it, considering all the hassle it would cause. As others have pointed out, it is not strictly necessary to prevent you from writing confusing code. And language changes to Java should be limited to the strictly necessary at this point.
Short answer - the language allows it, so it's not an error.
The really important thing, from the compiler's perspective, is that it be able to resolve symbols. In the case of a static method, it needs to know what class to look in for it -- since it's not associated with any particular object. Java's designers obviously decided that since they could determine the class of an object, they could also resolve the class of any static method for that object from any instance of the object. They chose to allow this -- swayed, perhaps, by @TofuBeer's observation -- to give the programmer some convenience. Other language designers have made different choices. I probably would have fallen into the latter camp, but it's not that big of a deal to me. I probably would allow the usage that @TofuBeer mentions, but having allowed it, my position on not allowing access from an instance variable is less tenable.
Likely for the same logic that makes this not an error:
public class X
{
    public static void foo()
    {
    }
    public void bar()
    {
        foo(); // no need to do X.foo();
    }
}
It isn't an error because it's part of the spec, but you're obviously asking about the rationale, which we can all guess at.
My guess is that the source of this is actually to allow a method in a class to invoke a static method in the same class without the hassle. Since calling x() is legal (even without the self class name), calling this.x() should be legal as well, and therefore calling via any object was made legal as well.
This also helps encourage users to turn private functions into static if they don't change the state.
Besides, compilers generally try to avoid declaring errors when there is no way that this could lead to a direct error. Since a static method does not change the state or care about the invoking object, it does not cause an actual error (just confusion) to allow this. A warning suffices.
The purpose of the instance variable reference is only to supply the type which encloses the static method. If you look at the bytecode, invoking a static method via instance.staticMethod or EnclosingClass.staticMethod produces the same invokestatic instruction. The call itself makes no use of the instance.
As for why it's in there: well, it just is. As long as you call static methods via the class name and not via an instance, you will help avoid confusion in the future.
Probably you can change it in your IDE (in Eclipse Preferences -> Java -> Compiler -> Errors/Warnings)
There's no option for it. In Java (as in many other languages) you can access all static members of a class through either its class name or an instance of that class. Which one you use is up to you, your case and your software solution; pick whichever gives you more readability.
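A small sketch of what that means in practice: the expression in front of the dot is still evaluated, but its result is thrown away, so even a null reference does not cause a NullPointerException. The class and method names below are made up for the illustration.

```java
class StaticViaInstance {
    static int evaluations = 0;

    static String hello() {
        return "static hello";
    }

    // Has a visible side effect and deliberately returns null.
    static StaticViaInstance make() {
        evaluations++;
        return null;
    }

    public static void main(String[] args) {
        // make() runs, its null result is discarded, then hello() is invoked statically.
        String viaInstance = make().hello();
        String viaClass = StaticViaInstance.hello();
        System.out.println(viaInstance.equals(viaClass)); // prints true
        System.out.println(evaluations);                  // prints 1
    }
}
```

Only the compile-time type of the expression decides which static method is called; its run-time value plays no part.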
It's a pretty old topic, but it is still relevant and, surprisingly, has a bigger impact nowadays. As Jon mentioned, it might just be a mistake that Java's designers made at the very beginning. But I wouldn't have imagined before that it could have an impact on security.
Many coders know Apache Velocity, a flexible and powerful template engine. It's so powerful that it allows you to feed a template with a set of named objects, which are real objects of the programming language (originally Java). Those objects can be accessed from within the template just as in the programming language, so for example a Java String instance can be used with all its public fields, properties and methods:
$input.isEmpty()
(where input is a String; the call runs directly in the JVM and returns true or false to the Velocity parser's output). So far so good.
But in Java all objects inherit from Object, so our end users can also put this into the template
$input.getClass()
to get the Class instance for String.
And with this reference they can also call the static method forName(String) on it
$input.getClass().forName("java.io.FileDescriptor")
to load any class by name and use it to do whatever the web server's account can do (deface, steal DB content, inspect config files, ...).
This exploit is described, in a specific context, here: https://github.com/veracode-research/solr-injection#7-cve-2019-17558-rce-via-velocity-template-by-_s00py
It wouldn't be possible if calling static methods via a reference to an instance of the class were prohibited.
I'm not saying that one programming framework is better than another; I just want to make a comparison. There's a port of Apache Velocity for .NET. In C# it's not possible to call static methods through an instance reference, which makes an exploit like this useless:
$input.GetType().GetType("System.IO.FileStream, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089")
I just consider this:
instanceVar.staticMethod();
to be shorthand for this:
instanceVar.getClass().staticMethod();
If you always had to do this:
SomeClass.staticMethod();
then you wouldn't be able to leverage inheritance for static methods.
That is, by calling the static method via the instance you don't need to know what concrete class the instance is at compile time, only that it implements staticMethod() somewhere along the inheritance chain.
EDIT: This answer is wrong. See comments for details.

How to convert the Class object of a primitive type to its wrapper's Class

I'm working with reflection, reading and writing objects. I have the problem that I am reading in primitive types, but want to tell the read method to read them as their wrapper (so read a char as a Character). It seems as if there should be a simple static method I can call which would take the primitive Class and return its wrapper's Class. So, for example, I could provide char.class and get Character.class returned.
I know it's easy enough to hard-code this, but that looks ugly, and it seems like this would come up commonly enough to be worth Sun including a static helper method. I've looked and can't seem to find it, but I still find it hard to believe the method doesn't exist. Can anyone point me to the name of the method?
Thanks.
There appears to be nothing in the JDK, but there's the Primitives class in Guava, which allows you to write:
Primitives.wrap(char.class)
to get Character.class.
In the JDK's tools.jar there is typeUtils.boxedClass(primitive.type), which deals with Symbol objects. I know it's not what you want, but I'll throw it in for reference and in case what you are looking for is actually somewhere around it.
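For completeness, the hard-coded variant the question mentions is short enough to sketch. The class and method names here are hypothetical. The second helper uses java.lang.invoke.MethodType (Java 7 and later), whose wrap() method converts primitive types to their wrappers; it was not designed as a general-purpose lookup, so treat it as a curiosity rather than a recommendation.

```java
import java.lang.invoke.MethodType;
import java.util.HashMap;
import java.util.Map;

class PrimitiveWrappers {
    private static final Map<Class<?>, Class<?>> WRAPPERS = new HashMap<>();

    static {
        WRAPPERS.put(boolean.class, Boolean.class);
        WRAPPERS.put(byte.class, Byte.class);
        WRAPPERS.put(char.class, Character.class);
        WRAPPERS.put(short.class, Short.class);
        WRAPPERS.put(int.class, Integer.class);
        WRAPPERS.put(long.class, Long.class);
        WRAPPERS.put(float.class, Float.class);
        WRAPPERS.put(double.class, Double.class);
        WRAPPERS.put(void.class, Void.class);
    }

    // Returns the wrapper for a primitive type, or the type itself otherwise.
    static Class<?> wrap(Class<?> type) {
        return WRAPPERS.getOrDefault(type, type);
    }

    // Alternative without a table: build a method type and let the JDK wrap it.
    static Class<?> wrapViaMethodType(Class<?> type) {
        return MethodType.methodType(type).wrap().returnType();
    }

    public static void main(String[] args) {
        System.out.println(wrap(char.class) == Character.class);            // prints true
        System.out.println(wrap(String.class) == String.class);             // prints true
        System.out.println(wrapViaMethodType(int.class) == Integer.class);  // prints true
    }
}
```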

Was the class literal syntax necessary?

I know well what a class literal is in Java; I just wonder what the reason for the .class in the syntax is. Does it remove any ambiguity? I mean, wouldn't an alternative Java syntax using
Class<String> c = String;
instead of
Class<String> c = String.class;
work? To me the class keyword looks like boilerplate.
Sure, you could make that the syntax. But using the .class suffix makes the compiler's job easier; it has to do less work to know that the code is syntactically correct.
Without the suffix, the compiler would have to work harder to understand the difference between this:
String.getName() // a method inherited from java.lang.Class<T>
and this:
String.valueOf(...) // a static method from java.lang.String
If you don't think that the .class suffix is needed, do you also think that the f and L suffixes are useless (for float and long literals, respectively)?
It's just not the same thing. String is a class of type string, and String.member is one of its member variables, String.method() would be one of its methods.
String.class is an object of type Class that defines String. It seems a lot more intuitive that you need to specify .class to indicate that you're trying to refer to an object of type Class.
Not to mention that it's easier to parse this kind of construct, and potentially prevents bugs where you're accidentally returning a Class object when you didn't mean to.
This is even more relevant when you're looking at inner classes, like OuterClass.InnerClass.class.
To work with Matt's example: How would you work on the class object without having to create a temporary variable first? Assuming your class Foo has a static method called getClasses, how would you differentiate between Foo.getClasses and Foo.class.getClasses?
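The getClasses question can be made concrete with a small sketch. Foo and its nested class are invented for the illustration; Class.getClasses() is the standard reflection method that returns a class's public member classes.

```java
class Foo {
    // A user-defined static method that shares its name with Class.getClasses().
    static String getClasses() {
        return "Foo's own static method";
    }

    // A public member class, so that Class.getClasses() has something to report.
    public static class Inner {
    }
}

class ClassLiteralDemo {
    public static void main(String[] args) {
        // Resolved against Foo's static members.
        System.out.println(Foo.getClasses());              // prints Foo's own static method
        // Resolved against java.lang.Class, thanks to the .class suffix.
        System.out.println(Foo.class.getClasses().length); // prints 1
    }
}
```

If a bare Foo also denoted the Class object, the two calls would be spelled identically and the language would need an extra rule to choose between them.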
String is the String class pseudo-object, which provides access to the class's static fields and methods, including class, which refers to the Class instance that describes the String class. So they are distinct, but because Java doesn't have the metaclass arrangement of (say) Smalltalk-80, this isn't very clear.
You could certainly make String and String.class synonymous if you wanted to, but I think there is a valid basis for the distinction.
Let's use Integer as an example:
Class<Integer> c = Integer; // your proposal
int i = Integer.MAX_VALUE; // compare with below
int j = c.MAX_VALUE; // hmm, not a big fan, personally
It just doesn't seem to flow, in my opinion. But that's just my opinion :)
