Maven test failing as file modified timestamps get updated - java

I have a java project with some test files in the following location:
src\test\resources\data\file\daily
I have some JUnit test cases that check and assert based on the file's modified time.
FileTime modFileTime = Files.getLastModifiedTime(Paths.get(classPathResource.getFile().getPath()));
When I execute the test cases using IntelliJ without Maven, my tests pass and modFileTime has a time from the past, e.g. 16/04/21 19:48.
However, my test cases fail when I run them with mvn clean test, because the files in the target\test-classes\data\file\daily directory get updated modification timestamps.
How can I preserve the original file modification timestamps? Or is there a common solution for this?
The method being called by the test:
private boolean isFileAvailable(String file) throws IOException {
    ClassPathResource classPathResource = new ClassPathResource(file);
    boolean exists = Files.exists(Paths.get(classPathResource.getFile().getPath()));
    if (exists) {
        FileTime modFileTime = Files.getLastModifiedTime(Paths.get(classPathResource.getFile().getPath()));
        long modFileMinutes = modFileTime.to(TimeUnit.MINUTES);
        long minutes = FileTime.from(Instant.now()).to(TimeUnit.MINUTES);
        return minutes - modFileMinutes >= 5;
    } else {
        return false;
    }
}

mvn clean removes everything in your target/ directory before the tests run, and the build then repopulates it, so the timestamps change on every run. This will also be the case for an initial (clean) checkout of the project, which you should be doing before any release build, so this is a perfectly normal thing to happen.
However, I agree with all the comments: your test (not posted) doesn't make a lot of sense. If you want your test to check a file against a relative timestamp, then e.g. set the timestamp on the file to 4 minutes ago and confirm it isn't loaded, then set it to 6 minutes ago and confirm it is loaded. You can set the last-modified value on your test file from within the test; see the sketch below. This is much more reliable than relying on something in the test execution system (Maven) itself, especially if you generate the test file as part of the test (a good idea).
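A minimal sketch of that approach, assuming JUnit 4; the class name, the Path-based helper, and the temp-file naming are illustrative rather than taken from the question:

import static org.junit.Assert.assertFalse;
import static org.junit.Assert.assertTrue;

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import org.junit.Test;

public class FileAvailabilityTest {

    @Test
    public void fileYoungerThanFiveMinutesIsNotLoaded() throws Exception {
        Path file = Files.createTempFile("daily-", ".dat");
        Files.setLastModifiedTime(file, FileTime.from(Instant.now().minus(4, ChronoUnit.MINUTES)));
        assertFalse(isFileAvailable(file));
    }

    @Test
    public void fileOlderThanFiveMinutesIsLoaded() throws Exception {
        Path file = Files.createTempFile("daily-", ".dat");
        Files.setLastModifiedTime(file, FileTime.from(Instant.now().minus(6, ChronoUnit.MINUTES)));
        assertTrue(isFileAvailable(file));
    }

    // Variant of the method under test that takes a Path directly instead of
    // resolving a classpath resource, so the test fully controls the file.
    private boolean isFileAvailable(Path file) throws IOException {
        FileTime modFileTime = Files.getLastModifiedTime(file);
        return Instant.now().minus(5, ChronoUnit.MINUTES).isAfter(modFileTime.toInstant());
    }
}

Because the test both creates the file and sets its timestamp, it no longer matters what mvn clean does to target/test-classes.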
Also: if you only want to load data files older than a certain age, then I doubt you really want those to be classpath resources. They should probably be loaded from some other known location. I suspect you are trying to solve a problem with cleverness that would be better solved by something from, e.g., https://commons.apache.org/ (a sketch follows below).
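For example, a minimal sketch using commons-io, assuming the data files live in an external directory rather than on the test classpath and commons-io is on the classpath:

import java.io.File;
import java.util.Date;
import java.util.concurrent.TimeUnit;
import org.apache.commons.io.FileUtils;

public class DailyFileCheck {

    // True if the file exists and was last modified more than five minutes ago.
    public static boolean isFileAvailable(File file) {
        long fiveMinutesAgo = System.currentTimeMillis() - TimeUnit.MINUTES.toMillis(5);
        return file.exists() && FileUtils.isFileOlder(file, new Date(fiveMinutesAgo));
    }
}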

Related

New value added to Java Enum not available during debug

I am having the following problem:
I have an Enum that was originally declared with 5 elements.
public enum GraphFormat {
    DOT,
    GML,
    PUML,
    JSON,
    NEO4J,
    TEXT {
        @Override
        public String getFileExtension() {
            return ".txt";
        }
    };
Now I need to add an additional element to it (NEO4J). When I run my code or try to debug it I am getting an exception because the value can't be found in the enum.
I am using IntelliJ as my IDE, and have cleaned the cache, forced a rebuild, etc., and nothing changes. When I look at the .class file created in my target folder, it also has the new element.
Any ideas on what could be causing this issue?
I found my problem and want to share what was causing it. My code was actually for a Maven plug-in, which I was pointing at another project of mine to run as a goal. However, the pom.xml of that target test project was pointing to the original version of the plug-in instead of the one I am working on, and that version is of course outdated and does not include the new value. Thank you.

Google Cloud Dataflow: Submitted job is executing but using old code

I'm writing a Dataflow pipeline that should do 3 things:
Reading .csv files from GCP Storage
Parsing the data to BigQuery-compatible TableRows
Writing the data to a BigQuery table
Up until now this all worked like a charm. And it still does, but when I change the source and destination variables nothing changes. The job that actually runs is an old one, not the recently changed (and committed) code. Somehow when I run the code from Eclipse using the BlockingDataflowPipelineRunner the code itself is not uploaded but an older version is used.
There is normally nothing wrong with the code, but to be as complete as possible:
public class BatchPipeline {

    public static void main(String[] args) {
        String source = "gs://sourcebucket/*.csv";
        String destination = "projectID:datasetID.testing1";

        // Creation of the pipeline with default arguments
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());

        PCollection<String> line = p.apply(TextIO.Read.named("ReadFromCloudStorage")
                .from(source));

        @SuppressWarnings("serial")
        PCollection<TableRow> tablerows = line.apply(ParDo.named("ParsingCSVLines").of(new DoFn<String, TableRow>() {
            @Override
            public void processElement(ProcessContext c) {
                // processing code goes here
            }
        }));

        // Defining the BigQuery table schema
        List<TableFieldSchema> fields = new ArrayList<>();
        fields.add(new TableFieldSchema().setName("datetime").setType("TIMESTAMP").setMode("REQUIRED"));
        fields.add(new TableFieldSchema().setName("consumption").setType("FLOAT").setMode("REQUIRED"));
        fields.add(new TableFieldSchema().setName("meterID").setType("STRING").setMode("REQUIRED"));
        TableSchema schema = new TableSchema().setFields(fields);
        String table = destination;

        tablerows.apply(BigQueryIO.Write
                .named("BigQueryWrite")
                .to(table)
                .withSchema(schema)
                .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND)
                .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_IF_NEEDED)
                .withoutValidation());

        // Runs the pipeline
        p.run();
    }
}
This problem arose because I've just changed laptops and had to reconfigure everything. I'm working on a clean Ubuntu 16.04 LTS OS with all the dependencies for GCP development installed (normally). Everything seems to be configured quite well, since I'm able to start a job (which shouldn't be possible if my configuration were wrong, right?). I'm using Eclipse Neon, by the way.
So where could the problem lie? It seems to me that there is a problem uploading the code, but I've made sure that my cloud git repo is up-to-date and the staging bucket has been cleaned up ...
**** UPDATE ****
I never found out exactly what was going wrong, but when I checked the creation dates of the files in my deployed jar, I saw that they had indeed never really been updated. However, the jar file itself had a recent timestamp, which made me overlook that problem completely (rookie mistake).
I eventually got it all working again by simply creating a new Dataflow project in Eclipse and copying my .java files from the broken project into the new one. Everything worked like a charm from then on.
Once you submit a Dataflow job, you can check which artifacts were part of the job specification by inspecting the list of staged files, which is available via DataflowPipelineWorkerPoolOptions#getFilesToStage. The code snippet below gives a small sample of how to get this information.
PipelineOptions myOptions = ...
myOptions.setRunner(DataflowPipelineRunner.class);
Pipeline p = Pipeline.create(myOptions);
// Build up your pipeline and run it.
p.apply(...)
p.run();
// At this point in time, the files which were staged by the
// DataflowPipelineRunner will have been populated into the
// DataflowPipelineWorkerPoolOptions#getFilesToStage
List<String> stagedFiles = myOptions.as(DataflowPipelineWorkerPoolOptions.class).getFilesToStage();
for (String stagedFile : stagedFiles) {
    System.out.println(stagedFile);
}
The above code should print out something like:
/my/path/to/file/dataflow.jar
/another/path/to/file/myapplication.jar
/a/path/to/file/alibrary.jar
It is likely that the resources that are part of the job you're uploading are out of date in some way and contain your old code. Look through all the directories and jar files in the staging list, find all instances of BatchPipeline, and verify their age. Jar files can be extracted using the jar tool or any zip file reader. Alternatively, use javap or any other class-file inspector to validate that the BatchPipeline class file lines up with the changes you expect to have made.
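For example, here is a minimal sketch (assuming Java 8+, with the jar path taken from the example getFilesToStage() output above) that lists the class entries of a staged jar together with their timestamps, so you can check whether BatchPipeline.class is as recent as you expect:

import java.util.Date;
import java.util.jar.JarFile;

public class StagedJarInspector {
    public static void main(String[] args) throws Exception {
        try (JarFile jar = new JarFile("/my/path/to/file/myapplication.jar")) {
            // Print each class entry and the time it was added to the jar.
            jar.stream()
               .filter(entry -> entry.getName().endsWith(".class"))
               .forEach(entry -> System.out.println(
                       entry.getName() + "  " + new Date(entry.getTime())));
        }
    }
}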

How to test methods that download big files

I usually have methods responsible for downloading large file(s) to a local directory.
It annoys me because I don't really know how to test them properly.
Should I run a test case that downloads these files to a temp directory using a Rule in JUnit? Or maybe have it download files to the same local directory used in production?
Importantly, such a method takes a long time to download a 1 GB+ file, so is it bad that a test case would take a long time?
What would be an ideal test case for this?
public File downloadFile(File dir, URL url) {
    // url points to a file of 1 GB or more
    // the method takes a long time
    return file; // it's located in dir
}
My belief here is that you are approaching the problem in the wrong way altogether.
You want to test "if that method works". But this method is highly dependent on "side effects" which are, in this case, failures which can occur at any of these steps:
connection failure: the given URL cannot be accessed (for whatever reason);
network failure: connection is cut while the transfer was in progress;
file system failure: the specified file cannot be created/opened in write mode; or the write fails while you are downloading contents.
In short: it is impossible to test whether the method "works". And what is more, due to the amount of possible failures mentioned above, it means that your method should at least throw an exception so that the caller of the method can deal with it.
Finally, this is 2016; it is assumed below that you use Java 7 or later. Here is how you can rewrite your method:
public Path downloadFile(final Path dir, final URL url)
    throws IOException
{
    final String filename = /* decide about the filename here */;
    final Path ret = dir.resolve(filename);
    try (
        final InputStream in = url.openStream();
    ) {
        Files.copy(in, ret);
    }
    return ret;
}
Now, in your tests, just mock the behavior of that method: make it fail, make it return a valid path, make the URL fail on .openStream()... You can simulate the behavior of that method whichever way you want so that you can test how the callers of this method behave; a sketch follows below.
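A minimal sketch, assuming Mockito and JUnit 4; "FileDownloader" is a hypothetical name for the class that declares the rewritten downloadFile(Path, URL) method:

import static org.mockito.Mockito.any;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.io.IOException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.junit.Test;

public class DownloadCallerTest {

    @Test(expected = IOException.class)
    public void callerSeesConnectionFailure() throws Exception {
        FileDownloader downloader = mock(FileDownloader.class);
        when(downloader.downloadFile(any(Path.class), any(URL.class)))
                .thenThrow(new IOException("simulated network failure"));

        // Hand the mock to the code under test; it is called directly here only for brevity.
        downloader.downloadFile(Paths.get("/tmp"), new URL("http://example.com/big.bin"));
    }

    @Test
    public void callerHandlesSuccessfulDownload() throws Exception {
        FileDownloader downloader = mock(FileDownloader.class);
        when(downloader.downloadFile(any(Path.class), any(URL.class)))
                .thenReturn(Paths.get("/tmp/big.bin"));

        // ... pass the mock to the caller under test and assert on its behaviour.
    }
}

The point is that no real 1 GB download ever happens: the mock stands in for the slow, failure-prone method so the tests stay fast and deterministic.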
But such a method is, in itself, so dependent on "side effects" that it cannot be tested reliably.

JUnit test case failed

I have a simple test case:
public class FileManagerTest {

    private MyFileManager mFileManager;
    private String dirPath = "/myDir/";

    @Before
    public void setUp() {
        mFileManager = MyFileManager.getInstance();
    }

    @Test
    public void testPersistFiles() {
        System.out.println("testPersistFiles()...");
        // it deletes old files & persists new files to the /myDir/ directory
        boolean successful = mFileManager.persistFiles();
        Assert.assertTrue(successful);
    }

    @Test
    public void testGetFiles() {
        System.out.println("testGetFiles()...");
        mFileManager.persistFiles();
        // I double checked, persistFiles() works, the files are persisted.
        List<File> files = mFileManager.getFilesAtPath(dirPath);
        Assert.assertNotNull(files); // Failure here!!!!
    }

    @Test
    public void testGetFilesMap() {
        System.out.println("testGetFilesMap()...");
        mFileManager.persistFiles();
        Map<String, File> filesMap = mFileManager.getFilesMapAtPath(dirPath);
        Assert.assertNotNull(filesMap);
    }
}
The persistFiles() function in FileManager deletes all files under /myDir/ and then persists the files again.
As you see above, I have a System.out.println(...) in each test function. When I run it, I can see all the prints in the following order:
testGetFilesMap()...
testGetFiles()...
testPersistFiles()...
However, the test fails at testGetFiles(). There are two things I don't understand:
If it failed at testGetFiles(), why can I still see the print from testPersistFiles()? It sounds like even though one test fails, it doesn't stop running but continues with the next test, testPersistFiles(). What is happening behind the scenes in a JUnit test case?
Another thing I don't understand is why testGetFiles() fails. I can see from the log that persistFiles() has persisted the files. Why does it get null after that?
If it failed at testGetFiles(), why can I still see the print from testPersistFiles()? It sounds like even though one test fails, it doesn't stop running.
That is how unit testing works. Each test should be isolated and work using only its own set of data. Unit test frameworks run every test so you can see which parts of the system work and which do not; they do not stop at the first failure.
mFileManager.getFilesAtPath(dirPath);
You are not searching for the files in the right place.
String dirPath = "/myDir/"
Are you sure that this path is correct, with a leading slash before the directory name?
For each of your tests, JUnit creates a separate instance of that class and runs it. Since you seem to have 3 tests, JUnit will create 3 instances of your class, execute @Before on each of them to initialize state, and then run them.
The order in which they are run is typically the order in which the tests are written but this is not guaranteed.
Now, about the print statement: you see that it's the first statement in your test, so it will be executed. Then mFileManager.persistFiles(); is executed. For some reason it returns false and hence the test fails.
As to why it returns false, you can run a local debugger, put a breakpoint at the beginning of that method, and single-step through it to see.
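As an illustration of that isolation, here is a minimal sketch (assuming JUnit 4 and the MyFileManager class from the question) in which every test prepares its own state in @Before, so no test depends on another having run first or on execution order:

import static org.junit.Assert.assertNotNull;
import static org.junit.Assert.assertTrue;
import org.junit.Before;
import org.junit.Test;

public class FileManagerIsolatedTest {

    private MyFileManager mFileManager;

    @Before
    public void setUp() {
        mFileManager = MyFileManager.getInstance();
        // Give every test the same starting point instead of relying on
        // testPersistFiles() happening to run before the others.
        assertTrue(mFileManager.persistFiles());
    }

    @Test
    public void testGetFiles() {
        assertNotNull(mFileManager.getFilesAtPath("/myDir/"));
    }

    @Test
    public void testGetFilesMap() {
        assertNotNull(mFileManager.getFilesMapAtPath("/myDir/"));
    }
}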

'Programming by Coincidence' Exercise: Java File Writer

I just read the article Programming by Coincidence. At the end of the page there are exercises: a few code fragments that are cases of "programming by coincidence". But I can't figure out the error in this piece:
This code comes from a general-purpose Java tracing suite. The function writes a string to a log file. It passes its unit test, but fails when one of the Web developers uses it. What coincidence does it rely on?
public static void debug(String s) throws IOException {
    FileWriter fw = new FileWriter("debug.log", true);
    fw.write(s);
    fw.flush();
    fw.close();
}
What is wrong with this?
This code relies on the fact that there is a file called debug.log that is writable in the application's executing directory. Most likely the web developer's application is not set up with this file and the method fails when he tries to use it.
A unit test of this code will work because the original developer had the right file in the right place (and with the right permissions). This is the coincidence that allowed the unit test to succeed.
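A minimal sketch of one way to remove the coincidence, assuming the log location is made explicit and configurable rather than implied by the working directory (the "app.debug.log" property name is purely illustrative):

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public final class DebugLog {

    public static void debug(String s) throws IOException {
        // Resolve the log file from configuration, falling back to debug.log.
        Path logFile = Paths.get(System.getProperty("app.debug.log", "debug.log"));
        // Create the file if it is missing and append; a misconfigured or
        // unwritable location surfaces as a clear IOException at a known path
        // rather than writes silently depending on whatever directory the
        // process happens to run from.
        Files.write(logFile,
                (s + System.lineSeparator()).getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }
}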
Interesting tidbit. Ideally, resources should be pulled from the classpath. However, there is no end to human stupidity. What would happen if the file were present in the test environment's classpath (say, Eclipse) but missing in production deployments?
