Can builds from the same source code yield functionally different executables?


Recently, a colleague of mine said something along these lines: "consecutive APKs (executables) produced by build server from the same source code might not be the same". The context for this discussion was whether QA performed on build X also applies to build Y, which was performed by the same build server (configured the same way) from the same source code.
I think that generated executables might not be identical due to various factors (e.g. different timestamp), but the question is whether they can be functionally different.
The only scenario, that I can think of, in which the same source code could produce different functionality is that of multi-threading issue: in case of incorrect synchronization of multi-threaded code, different re-ordering/optimization actions performed at compile time could affect this poorly synchronized code and change its functional behavior.
My questions are:
Is it true that consecutive builds performed by the same build server from the same source code can be functionally different?
If #1 is true, are these differences limited to incorrectly synchronized multi-threaded code?
If #2 is false, what are the other parts that can change?
Links to any related material will be appreciated.

It's certainly possible in a few cases. I'll assume you are using Gradle to build your Android app.
Case 1: You are using a 3rd party dependency that's included with a version wildcard, such as:
compile somelib.1+
It's possible for the dependency to change in this case, which is why it's highly recommended to use explicit dependency versions.
Case 2: You're injecting environment information into your app using Gradle's buildConfigFields. These values will be injected into your app's BuildConfig class. Depending on how you use those values, the app behavior could vary on consecutive builds.
Case 3: You update the JDK on your CI in-between consecutive builds. It's possible, though I'd assume highly unlikely, that your app behavior could change depending on how it's compiled. For example, you might be hitting an edge case in the JDK that gets fixed in a later version, causing code that previously worked before to act differently.
I think this answers your first question and second question.
edit: sorry, I think I missed some important info from your OP. My case 2 is an example of your e.g. different timestamp and case 3 violates your configured the same way. I'll leave the answer here though.

I think that different functionality can only be caused by discrepancies in the environment, or perhaps by a snapshot version of some 3rd party library that was updated between the builds.
Some advice:
If it is possible to rebuild, use the verbose mode of your build tool (-X in Maven, for example) and compare the output line by line with a diff program.

If the same source code could produce different results on the same machine / configuration, programming as we know it would probably not be possible.
There is always a chance that things break when the language level, operating system, or some other dependency changes. If all that changes is the time of the build, you would have to be doing something fundamentally wrong.
With Android / Gradle, one possible cause of different behavior, or of errors in general, is using + for library versions in your build.gradle file. This is why you should avoid doing so: a consecutive build could fetch a newer / different version, so you'd effectively have different source code, and thus it could create a functionally different executable.
A good build should always be repeatable. This means given the same configuration it should have the same results. If it isn't, you could never rely on anything and would have to do total regression testing on everything.
[...] consecutive builds performed by the same build server from the same source code can be functionally different
No. As described above, if you use the same versions, the same source code, it should produce the same behavior. Unless you do something very wrong.
[...] are these differences limited to incorrectly synchronized multi-threaded code?
This would imply a bug with your compiler. While this is possible, it is extremely unlikely.
[...] what are the other parts that can change?
Besides the timestamp and the build number nothing else should change, given the same source code and configuration.
It is always a good idea to include unit (and other) tests in your build. This way you can test specific behavior to be the same with each build.

They should be identical, except when:
there are threading/optimization issues in the build system
there are hardware failures (CPU/RAM/HDD issues) in the build environment
there is time- or platform-dependent code in the build system itself or in the build scripts
So if you are building the exact same code on the exact same hardware using the exact same version of the build system and the same OS version, and your code does not specifically depend on the build time, the result should be the same. The outputs should even have the exact same checksums and sizes.
Also, the results are the same only if your code does not depend on external modules that are downloaded from the Internet at build time, as Gradle/Maven do: you can't guarantee these libraries are the same because they are not in version control. Moreover, there can be a dependency whose module version is not specified exactly (like 2.0.+), so if the maintainer updates this module your build system will use the updated one, and your builds are effectively generated from different source code.
As somebody mentioned, running unit tests on the build server is good practice to make sure your build is stable and doesn't contain obvious bugs.

While this question addresses Java/Android, Jon Skeet blogged about different C# parsers treating some Unicode characters differently, mostly due to changes in the Unicode character database.
In his examples, the Mongolian Vowel Separator (U+180E) is considered either a whitespace character or a character allowed within an identifier, yielding different results in variable assignments.

It is definitely possible. You can construct an example program that behaves differently every time you start it up.
Imagine a strategy design pattern that lets you choose between algorithms during runtime and you load one algorithm based on RNG.
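A minimal sketch of that idea, assuming two made-up strategies (addition and multiplication) and a hypothetical class name RandomStrategyDemo. Note that this shows run-to-run variation of one binary, not a difference between two builds:

```java
import java.util.List;
import java.util.Random;
import java.util.function.IntBinaryOperator;

/** Hypothetical demo: the algorithm is picked at random on each start-up. */
final class RandomStrategyDemo {
    // Two interchangeable strategies with different observable results.
    static final List<IntBinaryOperator> STRATEGIES = List.of(
            (a, b) -> a + b,   // strategy 0: addition
            (a, b) -> a * b);  // strategy 1: multiplication

    /** Picks one strategy using the supplied random source. */
    static IntBinaryOperator pick(Random rng) {
        return STRATEGIES.get(rng.nextInt(STRATEGIES.size()));
    }

    public static void main(String[] args) {
        // Unseeded Random: the choice may differ between runs of the same binary.
        IntBinaryOperator op = pick(new Random());
        System.out.println("3 op 4 = " + op.applyAsInt(3, 4));
    }
}
```

Passing a seeded Random into pick makes the choice reproducible, which is usually what you want in tests.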

Related

Java snippet based AST-Manipulation before/during compilation

The question is whether the functionality I describe below already exists, or whether I need to make an attempt at creating it myself. I am aware that I am probably looking at a lot of work if it does not exist yet, and I am also sure that others have already tried. I am nevertheless grateful for comments such as "project A tried this, but..." or "dude D already failed because...". If somebody has an overall more elegant solution, that would of course be welcome as well.
I want to change the way I develop (private) Java code by introducing a multiplexing layer. What I mean by that is that I want to be able to create library-like parameterizable AST-snippets, which I want to insert into my code via some sort of placeholders (such as annotations). I am aware of project https://projectlombok.org/ and have found that, while I find it useful for small applications, it does not generally suit my requirements, as it does not seem possible to insert own snippets without forking the entire project and making major modifications. Also lombok only ever modifies a single file at a time, while I am looking for a solution that will need to 'know' multiple files at a time.
I imagine a structure like this:
Source S: (Parameterizable) AST-snippets that can be included via some sort of reference in Source A.
Source A: Regular Java-Code, in which I can reference snippets from Source S. This code will not be compiled directly, as it is lacking the referenced snippets, and would thus produce a lot of compile-time errors.
Source T: Target Source, which is an AST-equivalent copy of Source A, except that all references of AST-Snippets have been replaced by their respective Snippet from Source S. It needs to be mappable to the original Source A as well as the resolved snippets from Source S, where applicable, as most development will happen there.
I see several challenges with this concept, not the least of which are debuggability, source-mapping and compatibility with other frameworks/APIs. Also, it seems a challenge to work around the one-file-at-a-time limitation, memory wise.
The advantage over lombok would be flexibility, since lombok only provides a fixed set of snippets for specific purposes, whereas this would enable devs to write own snippets, or make modifications to getters, setters etc. Also, lombok 'quirks' into the compilation step, and does not output the 'fused' source, afaik.
I want to target at least javac and eclipse's ecj compilers.

Multiple versions in a web application: duplication or messy code?

I used to manage versions with tags in Git. But that was a long time ago, for stand-alone applications. Now the problem is that I have a web application, and clients that expect to communicate with different versions of the application may all connect to the same deployment.
So I added a path variable for the version to the input, like this:
@PathParam("version") String version
And the client can specify the version in the URL:
https://whatever.com/v.2/show
Then across the code I added conditions like this:
if (version.equals("v.2")) {
// Do something
}
else if (version.equals("v.3")) {
// Do something else
}
else {
// Or something different
}
The problem is that my code is becoming very messy. So I decided to do in a different way. I added this condition only in one point of the code, and from there I call different classes according to the version:
MyClassVersion2.java
MyClassVersion3.java
MyClassVersion4.java
The problem now is that I have a lot of duplication.
And I want to solve this problem as well. How can I build a web application that:
1) Deals with multiple versions
2) Is not messy (with a lot of conditions)
3) Doesn't have much duplication

Normally, when we speak of an old version of an application, we mean that the behavior and appearance of that version is cast in stone and does not change. If you make even the slightest modification to the source files of that application, then its behavior and/or appearance may change, (and according to Murphy's law it will change,) which is unacceptable.
So, if I were you, I would lock all the source files of the old version in the source code repository, so that nobody can commit to them, ever. This approach solves the problem and dictates how you have to go about everything else: Every version would have to have its own set of source files which would be completely unrelated to the source files of all other versions.
Now, if the old versions of the application must have something in common with the newest version, and this thing changes, (say, the database,) then we are not exactly talking about different versions of the application, we have something more akin to different skins: The core of the application evolves, but users who picked a skin some time ago are allowed to stick with that skin. In this case, the polymorphism solution which has already been suggested by others might be a better approach.
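For the polymorphism route, a rough sketch could look like the following. All names here (VersionedHandlers, BaseHandler, the greeting texts) are hypothetical and only illustrate the shape: shared behaviour sits in a base class, each version overrides only what differs, and the version string is inspected in exactly one place.

```java
import java.util.Map;

/** Hypothetical sketch: one handler per API version, shared logic in a base class. */
final class VersionedHandlers {
    /** Shared behaviour lives here; versions override only what differs. */
    static class BaseHandler {
        String greeting() { return "Hello"; }
        String show(String user) { return greeting() + ", " + user; }
    }

    /** Version 2 uses the shared behaviour unchanged. */
    static class V2Handler extends BaseHandler { }

    /** Version 3 changes a single detail. */
    static class V3Handler extends BaseHandler {
        @Override String greeting() { return "Welcome"; }
    }

    // The single place where the version string is mapped to behaviour.
    static final Map<String, BaseHandler> HANDLERS = Map.of(
            "v.2", new V2Handler(),
            "v.3", new V3Handler());

    static String show(String version, String user) {
        BaseHandler handler = HANDLERS.get(version);
        if (handler == null) {
            throw new IllegalArgumentException("Unsupported version: " + version);
        }
        return handler.show(user);
    }

    public static void main(String[] args) {
        System.out.println(show("v.2", "Ann")); // Hello, Ann
        System.out.println(show("v.3", "Ann")); // Welcome, Ann
    }
}
```

The trade-off is that a change to BaseHandler affects every version that has not overridden the method, so this fits the "skins over an evolving core" case rather than the "frozen old version" case.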

your version number is in a place in the URL named the 'Context Root'.
You could release multiple different WAR files each of which is configured to respond on different Context Roots.
So one war for version 1, one war for version 2 etc.
This leaves you with code duplication.
So what you are really asking is, "how do I efficiently modularise Java web applications?".
This is a big question, and leads you into "Enterprise Java".
Essentially you need to solve it by abstracting your common code to a different application. Usually this is called 'n-tier' design.
So you'd create an 'integration tier' application which your 'presentation' layer WAR files speak to.
The Integration tier contains all the common code so that it isn't repeated.
Your integration tier could be EJB or webservices etc.
Or you could investigate using OSGi.

Developing different versions of a product

I have a Java-based server, transmitting data from many remote devices to one app via TCP/IP. I need to develop several versions of it. How can I develop and then maintain them without having to code two separate projects? I'm asking not only about this project, but about approaches in general.

Where the behaviour differs, make the behaviour "data driven", typically by externalizing the data that drives the behaviour to properties files that are read at runtime/startup.
The goal is to have a single binary whose behaviour varies depending on the properties files found in the runtime environment.
Java supports this pattern through the Properties class, which offers convenient ways of loading properties. In fact, most websites operate in this way, for example the production database user/pass details are never (should never be) in the code. The sysadmins will edit a properties file that is read at start up, and which is protected by the operating system's file permissions.
Other options are to use a database to store the data that drives behaviour.
It can be a very powerful pattern, but it can be abused too, so some discretion is advised.
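As an illustration, here is a small sketch using java.util.Properties. The key server.maxDevices, its default of 10, and the class name DataDrivenConfig are made up for the example; a real deployment would more likely read from a file than from a string.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical sketch: behaviour is driven by an external properties source. */
final class DataDrivenConfig {
    /** Loads key=value pairs from any Reader (a FileReader in production). */
    static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Reads the made-up key "server.maxDevices", falling back to 10. */
    static int maxDevices(Properties props) {
        return Integer.parseInt(props.getProperty("server.maxDevices", "10"));
    }

    public static void main(String[] args) {
        // Same binary, different behaviour depending on the configuration found.
        Properties props = load(new StringReader("server.maxDevices=250\n"));
        System.out.println("Max devices: " + maxDevices(props));
    }
}
```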

I think you need to read up on Source Control Management (SCM) and Version Control Systems (VCS).
I would recommend setting up a git or Subversion repository and adding the code initially to trunk and then branching it off to the number of branches (versions you'll be working on).
The idea of different versions is this:
You're developing your code and have it in your SCM's trunk (or otherwise known as a HEAD). At some point you consider the code stable enough for a release. You therefore create a tag (let's call it version 1.0). You cannot (should not) make changes to tags -- they're only there as a marker in time for you. If you have a client who has version 1.0 and reports bugs which you would like to fix, you create a branch based on a copy of your tag. The produced version would (normally) be 1.x (1.1, 1.2, etc). When you're done with your fixes, you tag again and release the new version.
Usually, most of the development happens on your trunk.
When you are ready with certain fixes, or know that certain fixes have already been applied to your trunk, you can merge these changes to other branches, if necessary.

Base each new version on the previous one by reusing the code base, configuration and any other assets. If several versions have to be in place at the same time, use configuration management practices. You should probably also consider some routing and client version checks on the server side. This is where 'backward compatibility' comes into play.

The main approach is first to find and extract the code that won't change from one version to another. The best is to maximize this part to share the maximum of code base and to ease the maintenance (correcting a bug for one means correcting for all).
Then it depends on what really changes from one version to another. The best is that on the main project you can use some abstract classes or interfaces that you will be able to implement for each specific project.

How to manage multiple versions of same class file for different SDK targets?

This is for an Android application but I'm broadening the question to Java as I don't know how this is usually implemented.
Assuming you have a project that targets a specific SDK version. A new release of the SDK is backward incompatible and requires changing three lines in one class.
How is this managed in Java without duplicating any code (or by duplicating the least amount)?
I don't want to create two projects for only 3 lines that are different.
What I'm trying to achieve in the end is a single executable that'll work for both versions. In C/C++, you'd have a #define based on the version. How do I achieve the same thing in Java?
Edit: after reading the comments about the #define, I realized there were two issues I was merging into one:
The first issue is: how do I avoid duplicating code? What construct is the equivalent of a #define in C?
The second one is: is it possible to bundle everything in the same executable? (This is less of a concern than the first one.)

It depends heavily on the incompatibility. If it is simply behavior, you can check the java.version system property and branch the code accordingly (for three lines, something as simple as an if statement).
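A possible sketch of such a check follows. The parsing helper majorVersion and the class name JavaVersionCheck are my own; it assumes the two common formats of the java.version property, the older "1.8.0_292" style and the newer "17.0.1" style.

```java
/** Hypothetical sketch: branch on the major version taken from java.version. */
final class JavaVersionCheck {
    /** Extracts the major version from strings like "1.8.0_292" or "17.0.1". */
    static int majorVersion(String version) {
        String[] parts = version.split("[._\\-+]");
        int first = Integer.parseInt(parts[0]);
        // Old scheme: "1.8" means Java 8; new scheme: the first number is the major.
        return (first == 1 && parts.length > 1) ? Integer.parseInt(parts[1]) : first;
    }

    public static void main(String[] args) {
        int major = majorVersion(System.getProperty("java.version"));
        if (major >= 9) {
            System.out.println("Running the code path for newer runtimes");
        } else {
            System.out.println("Running the code path for older runtimes");
        }
    }
}
```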
If, however, it is a lack of a class or something similar that will throw an error when the class is loaded or when the code gets closer to execution (not necessarily something you can reasonably avoid by checking before calling), then the solution gets a lot harder. The notion of having a separate version is the cleanest from a code point of view, but it does mean you have to distribute two versions.
Another solution is reflection. Don't reference the class directly, call it via reflection (test for the methods or classes to determine what environment you are currently running in and execute the methods). This is probably the "official" approach in that reflection exists to deal with classes that you don't have or don't know you will have at compile time. It is just being applied to libraries within the JDK. It gets very ugly very fast, however. For three lines of code, it's ok, but doing anything extensive is going to get bad.
The last thing I can think of is to write common denominator code - that is code that gets the job done in both, finding another way to do it that doesn't trigger the problematic class or method.

I would isolate the code that needs to be different in a separate class (or multiple classes if necessary), and include / exclude them when building the project for the different versions.
So I would have something like src/java/org/myproj/Foo.java, which is the common stuff, and then oldversion/java/org/myproj/Bar.java and newversion/java/org/myproj/Bar.java, which are the different implementations of the class that uses the changed API.
Then I either compile "src/java and oldversion/java" or "src/java and newversion/java".

Possibly a similar situation, I had a method which wasn't available in the previous version of the JDK but if it was there I wanted to call it, I didn't want to force people to use the more recent version though. I used reflection to look for the method, if it was there I called it, if it wasn't I didn't.
Pretty hacky but might give you what you want.
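Roughly, that "call it if it exists" trick can be sketched like this. The helper callIfPresent and the class OptionalCall are hypothetical names, and the sketch only handles public no-argument methods:

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Optional;

/** Hypothetical sketch: invoke a public no-arg method only if the runtime has it. */
final class OptionalCall {
    /**
     * Returns the method's result if the method exists, or an empty Optional
     * if it is missing (a null result is also reported as empty).
     */
    static Optional<Object> callIfPresent(Object target, String methodName) {
        try {
            Method method = target.getClass().getMethod(methodName);
            return Optional.ofNullable(method.invoke(target));
        } catch (NoSuchMethodException e) {
            // Older runtime or library: the method is not there, so skip the call.
            return Optional.empty();
        } catch (IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("Call to " + methodName + " failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callIfPresent("abc", "length"));       // method exists
        System.out.println(callIfPresent("abc", "noSuchMethod")); // method missing
    }
}
```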

Addressing Java in general, I see two primary approaches.
1). Refactor the specific code to its own library. Have different versions of that library. Effectively your app is creating an abstraction above the different SDKs. Heavyweight for 3 lines of code, but perhaps quite reasonable for larger scale problems.
2). Injection using annotation. Write your own annotation processor to manage the appropriate injection. More work, but maybe more fun.

Separate changing code in different classes with the same interface. Place classes in the same jar. Use factory design pattern to instantiate one or another class depending on SDK version.
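A bare-bones sketch of that layout is below. The interface Notifier, both implementations and the threshold 26 are invented for illustration; on Android the version number would typically come from Build.VERSION.SDK_INT, which is passed in as a plain int here so the example runs anywhere.

```java
/** Hypothetical sketch: a factory picks the implementation for the running SDK. */
final class SdkFactoryDemo {
    /** Common interface; callers never see which implementation they got. */
    interface Notifier {
        String send(String message);
    }

    /** Would contain the three lines written against the old SDK. */
    static final class LegacyNotifier implements Notifier {
        @Override public String send(String message) { return "legacy:" + message; }
    }

    /** Would contain the three lines written against the new SDK. */
    static final class ModernNotifier implements Notifier {
        @Override public String send(String message) { return "modern:" + message; }
    }

    /** The only place that knows about SDK versions (26 is an arbitrary cut-off). */
    static Notifier create(int sdkVersion) {
        return sdkVersion >= 26 ? new ModernNotifier() : new LegacyNotifier();
    }

    public static void main(String[] args) {
        System.out.println(create(21).send("hi")); // legacy:hi
        System.out.println(create(30).send("hi")); // modern:hi
    }
}
```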

What settings affect the layout of compiled java .class files? How can you tell if two compiled classes are equal?

I have an app that was compiled with the built-in Eclipse "Compile" task. Then I decided to move the build procedure into Ant's javac, and the result ended being smaller files.
Later I discovered that adjusting the debuglevel to "vars,lines,source" I could embed the same debug information that Eclipse did, and in a lot of cases files stayed exactly the same size but the internal layout was different. And as a consequence, I couldn't use md5sum signatures to determine if they were exactly the same version.
Besides debug information, what can be the reason that 2 supposedly equal files get a different internal layout or size?
And how can you compare compiled .class files?

There is no required order for things such as the constant pool entries (essentially all of the symbol info) or the attributes of each field/method/class. Different compilers are free to write them out in whatever order they want.
You can compare compiled classes, but you would need to dig into the class file structure and parse it. There are libraries out there for doing that, like BCEL or ASM, but I am not 100% sure they will help you with what you want to do.
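To give a feel for what "parsing the class file structure" means, here is a small sketch that reads only the fixed header of a class file (magic number, minor/major version, constant pool count) with plain JDK classes. The class name ClassFileHeader is made up, and a real comparison would have to go on to parse and normalize the constant pool, fields, methods and attributes, which is where BCEL or ASM come in.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical sketch: reads only the fixed header of a .class file. */
final class ClassFileHeader {
    final int minor;
    final int major;
    final int constantPoolCount;

    private ClassFileHeader(int minor, int major, int constantPoolCount) {
        this.minor = minor;
        this.major = major;
        this.constantPoolCount = constantPoolCount;
    }

    /** Parses magic (u4), minor_version (u2), major_version (u2), constant_pool_count (u2). */
    static ClassFileHeader parse(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("Not a class file");
            }
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();
            int constantPoolCount = in.readUnsignedShort();
            return new ClassFileHeader(minor, major, constantPoolCount);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: ClassFileHeader <file.class>");
            return;
        }
        ClassFileHeader h = parse(Files.readAllBytes(Paths.get(args[0])));
        System.out.println("version " + h.major + "." + h.minor
                + ", constant pool count " + h.constantPoolCount);
    }
}
```

Two files compiled with different target settings will already differ in the major version field; two files with identical headers can still differ in the ordering of everything that follows.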

The ASM Eclipse plugin has a bytecode comparer in it. You select two classes, right click, and do Compare With / Each Other Bytecode.

An important thing to note is that Eclipse does not use javac. Eclipse has its own compiler, the JDT, and so differences in the resulting .class files do not surprise me. I'd expect them to not be verbatim, because they are different compilers.
Due to their differences, there exists code that compiles with javac but not JDT, and vice versa. Typically I have seen the differences in the two become apparent in cases of heavy use of generics

Most importantly, the stack slots for local variables can be arranged arbitrarily without changing the semantics of the code. So basically, you cannot compare compiled class files without parsing and normalizing them - quite a lot of effort.
Why do you want to do that anyway?

as Michale B said, it can be arbitrary.
I work on systems that are using file sizes as security. If the .class files change in size, the class won't be given certain permissions.
Normally that would be easy to get around, but we have fairly complete control over the environment, so it's actually pretty functional.
Anyway, any time the classes that are watched are recompiled, it seems, we have to recalculate the size.
Another thing--a special key number is generated when the file is compiled. I don't know much about this, but it often prevents classes from working together. I believe the procedure is, compile class A and save it (call it a1). compile class a again (a2). Compile class b against class a2. Try to run b against a1. I believe that in this case it will fail at runtime.
If you could learn more about that key number, it might give you the info you are after.

For the comparison, you can decompile your class files and work with the generated sources.

Is Eclipse doing some instrumentation to assist with running in the debugger?

Ultimately the configurations being used are probably making the difference. Assuming they are using the same versions of Java, there are a host of options that are available for the compile configuration (JDK compliance, class file compatibility and a host of debugging information options).
