I was wondering if someone could explain the best solution for the smallest memory footprint for an object that has a file in the following situation...
There could be 1 to a few hundred Foo classes.
Thread safety will be important down the road.
Not every Foo class's file is accessed every time.
Each file is unique.
The file in a Foo class may be accessed more than once.
I was planning to profile the solutions below to find the one with the lowest memory footprint, and I have a good idea which one would work best, but I am interested in some feedback. Solution 1 seems like the best approach, but it feels prone to memory leaks the more often something calls the getter. Thoughts?
Solution 1:
public class Foo {
    private final String pathToFile;
    // A constructor is declared without the "class" keyword.
    public Foo(String pathToFile) {
        this.pathToFile = pathToFile;
    }
    public File getFile() {
        return new File(pathToFile);
    }
}
Solution 2:
public class Foo {
    private final File file;
    public Foo(String pathToFile) {
        this.file = new File(pathToFile);
    }
    public File getFile() {
        return file;
    }
}
Solution 3:
public class Foo {
    private final String pathToFile;
    private File file = null;
    public Foo(String pathToFile) {
        this.pathToFile = pathToFile;
    }
    public File getFile() {
        // Lazily create the File on first access (not thread-safe as written).
        if (file == null) {
            file = new File(pathToFile);
        }
        return file;
    }
}
It all depends on what you want to do with the program. If you need the path in other places, keep a reference to the path; if you need the file, keep a reference to the file. Alternatively, with the second solution you can still get the path from the file: file.getPath();
So overall: the first solution if you need the path at some point, or solution 2 if you do not.
It really shouldn't make a big difference either way. The first option creates a new File object every time it is called, so if you call it 100,000 times AND keep the references to the files, it might have an impact. Otherwise, it just depends on whether it makes sense for Foo to hold a reference to its file, or whether Foo is more of a service than an object and its goal is to return any reasonable reference to the file.
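Since the question says thread safety will matter later, here is one way Solution 3 could be made safe for concurrent callers. This is only a sketch: the class name LazyFoo is made up for illustration, and it uses a volatile field with double-checked locking. Keep in mind that a java.io.File is just a wrapper around a path string, so creating one does not open or read the file.

```java
import java.io.File;

// Hypothetical thread-safe variant of Solution 3 (the name LazyFoo is an assumption).
class LazyFoo {
    private final String pathToFile;
    // volatile so that a File published by one thread is visible to all others
    private volatile File file;

    LazyFoo(String pathToFile) {
        this.pathToFile = pathToFile;
    }

    File getFile() {
        File result = file;
        if (result == null) {
            synchronized (this) {
                // Re-check under the lock: another thread may have created it already.
                result = file;
                if (result == null) {
                    result = new File(pathToFile);
                    file = result;
                }
            }
        }
        return result;
    }
}
```

Every caller gets the same File instance, so repeated calls to getFile() allocate nothing after the first one.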
I need to use variables that are initialized in an outer class inside an inner class, so I used static variables. This is a Flink application.
When built in Eclipse with "Export > Runnable JAR", it works fine: the state of the variables is retained.
When built with Maven or with Eclipse "Export > JAR", it fails: the state of the variables is lost.
FileMonitorWrapper.fileInputDir has the value "" and does not pick up the value that was passed in.
This sounds strange. Any thoughts?
static transient String fileInputDir="";
static transient String fileArchiveDir="";
@SuppressWarnings("serial")
public DataStream<String> ScanDirectoryForFile(String inputDir, String inputFilePattern,String archiveDir, StreamExecutionEnvironment env) {
try {
FileMonitorWrapper.fileArchiveDir = archiveDir;
FileMonitorWrapper.fileInputDir = inputDir;
filteredDirFiles = dirFiles.filter(new FileMapper());
.
.
.
}
}
@SuppressWarnings("serial")
static class FileMapper implements FilterFunction<TimestampedFileInputSplit>{
@Override
public boolean filter(TimestampedFileInputSplit value) throws Exception {
if(value.toString().contains("done"))
FileMonitorWrapper.doneFound = true;
if(value.toString().contains("dat"));
FileMonitorWrapper.datFound = true;
if(FileMonitorWrapper.datFound && FileMonitorWrapper.doneFound) {
try {
if(value.getPath().toString().contains("done")) {
Files.move(Paths.get(FileMonitorWrapper.fileInputDir+"\\"+value.getPath().getName()),
Paths.get(FileMonitorWrapper.fileArchiveDir+"\\"+value.getPath().getName()));
}
}catch(Exception e){
e.printStackTrace();
}
return (!value.toString().contains("done"));
}
else
return false;
}
}
}
Generally speaking, serialization of POJOs does not capture the state of static variables. From what I have read about it, Flink serialization is no different.
So when you say that the static variable state is "retained" in some cases, I think you are misinterpreting the evidence. Something else is preserving the state of the static variables OR they are being initialized to the values that happen to be the same in the "before" and "after" cases.
Why am I so sure about this? The issue is that serializing static variables doesn't make much sense. Consider this
public class Cat {
private static List<Cat> allCats = new ArrayList<>();
private String name;
private String colour;
public Cat(...) {
...
allCats.add(this);
}
...
}
Cat fluffy = new Cat("fluffy", ...);
Cat claus = new Cat("claus", ...);
If the static field of Cat is serialized:
Every time a serial stream contains a Cat it will (must) contain all cats created so far.
Whenever I deserialize a stream that contains a Cat, I also need to deserialize the ArrayList<Cat>. What do I do with it?
Do I overwrite allCats with it? (And lose track of the other cats?)
Do I throw it away?
Do I try to merge the lists? (How? What semantics? Do I get two cats called "fluffy"?)
Basically, there is no semantic for this scenario that is going to work out well in general. The (universal) solution is to NOT serialize static variables.
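The behaviour described above can be checked with plain Java serialization. The sketch below uses a made-up class, DirConfig, with one static field and one instance field; it is not Flink code, and it only illustrates the general rule that static state stays behind while instance state travels with the object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical class for illustration: one static field, one instance field.
class DirConfig implements Serializable {
    private static final long serialVersionUID = 1L;

    static String staticDir = "";  // class-level state, skipped by Java serialization
    final String instanceDir;      // instance state, written to the stream

    DirConfig(String instanceDir) {
        this.instanceDir = instanceDir;
    }

    // Serializes the object to a byte array, as if it were shipped to another JVM.
    static byte[] toBytes(DirConfig config) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(config);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Rebuilds the object from the byte array.
    static DirConfig fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (DirConfig) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If you set DirConfig.staticDir, serialize an instance, reset staticDir to "" (to imitate a JVM where it was never set) and then deserialize, the instance field comes back with its value but staticDir is still "". A value that has to travel with a serialized object therefore normally belongs in an instance field, for example one set through a constructor.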
The Java Path API is a better replacement for the Java File API, but its heavy use of static methods makes it difficult to mock with Mockito.
In my own classes, I inject a FileSystem instance, which I replace with a mock during unit tests.
However, I need to mock a lot of methods (and create a lot of mocks) to achieve this, and this happens repeatedly across my test classes. So I started thinking about setting up a simple API to register Paths and declare their associated behaviour.
For example, I need to check error handling on stream opening.
The main class:
class MyClass {
private FileSystem fileSystem;
public MyClass(FileSystem fileSystem) {
this.fileSystem = fileSystem;
}
public void operation() {
String filename = /* such way to retrieve filename, ie database access */
try (InputStream in = Files.newInputStream(fileSystem.getPath(filename))) {
/* file content handling */
} catch (IOException e) {
/* business error management */
}
}
}
The test class:
class MyClassTest {
@Test
public void operation_encounterIOException() throws Exception {
//Arrange
FileSystem fileSystem = mock(FileSystem.class);
MyClass instance = new MyClass(fileSystem);
FileSystemProvider fileSystemProvider = mock(FileSystemProvider.class);
Path path = mock(Path.class);
doReturn(path).when(fileSystem).getPath("/dir/file.txt");
doReturn(fileSystemProvider).when(path).provider();
doThrow(new IOException("fileOperation_checkError")).when(fileSystemProvider).newInputStream(path, (OpenOption)anyVararg());
//Act
instance.operation();
//Assert
/* ... */
}
@Test
public void operation_normalBehaviour() throws Exception {
//Arrange
FileSystem fileSystem = mock(FileSystem.class);
MyClass instance = new MyClass(fileSystem);
FileSystemProvider fileSystemProvider = mock(FileSystemProvider.class);
Path path = mock(Path.class);
doReturn(path).when(fileSystem).getPath("/dir/file.txt");
doReturn(fileSystemProvider).when(path).provider();
ByteArrayInputStream in = new ByteArrayInputStream(/* arranged content */);
doReturn(in).when(fileSystemProvider).newInputStream(path, (OpenOption)anyVararg());
//Act
instance.operation();
//Assert
/* ... */
}
}
I have many classes and tests of this kind, and the mock setup can get trickier, as a static method may call 3 to 6 non-static methods of the Path API. I have refactored the tests to avoid most of the redundant code, but my simple API has become very limiting as my Path API usage has grown. So again it's time to refactor.
However, the logic I'm thinking about seems ugly and requires a lot of code for basic usage. The way I would like to ease API mocking (whether it is the Java Path API or not) is based on the following principles:
Create abstract classes that implement the interface or extend the class to mock.
Implement the methods that I don't want to mock.
When invoking a "partial mock", I want to execute (in order of preference): explicitly mocked methods, implemented methods, default answer.
To achieve the third step, I am thinking about creating an Answer that looks up an implemented method and falls back to a default answer. An instance of this Answer is then passed at mock creation.
Are there existing ways to achieve this directly with Mockito, or other ways to handle the problem?
Your problem is that you are violating the Single Responsibility Principle.
You have two concerns:
Find and locate a file, get an InputStream
Process the file.
Actually, this should most likely be broken into sub concerns also, but that's outside the scope of this question.
You are attempting to do both of those jobs in one method, which is forcing you to do a ton of extra work. Instead, break the work into two different classes. For example, if your code were instead constructed like this:
class MyClass {
private FileSystem fileSystem;
private final StreamProcessor processor;
public MyClass(FileSystem fileSystem, StreamProcessor processor) {
this.fileSystem = fileSystem;
this.processor = processor;
}
public void operation() {
String filename = /* such way to retrieve filename, ie database access */
try (InputStream in = Files.newInputStream(fileSystem.getPath(filename))) {
processor.process(in);
} catch (IOException e) {
/* business error management */
}
}
}
class StreamProcessor {
public StreamProcessor() {
// maybe set dependencies, depending on the need of your app
}
public void process(InputStream in) throws IOException {
/* file content handling */
}
}
Now we've broken the responsibilities into two places. The class that does all the business logic work that you want to test, from an InputStream, just needs an input stream. In fact, I wouldn't even mock that, because it's just data. You can load the InputStream any way you want, for example using a ByteArrayInputStream as you mention in your question. There doesn't need to be any code for Java Path API in your StreamProcessor test.
Additionally, if you are accessing files in a common way, you only need to have one test to make sure that behavior works. You can also make StreamProcessor be an interface, and then, in the different parts of your code base, do the different jobs for different types of files, while passing in different StreamProcessors into the file API.
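To make the interface idea concrete, here is a rough sketch. StreamProcessor is the name used above; LineCountingProcessor is a made-up implementation that only exists to show that such a class can be exercised with nothing but an in-memory stream, without touching the Path API or Mockito.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

// StreamProcessor as an interface, so different file types can get different processors.
interface StreamProcessor {
    void process(InputStream in) throws IOException;
}

// Hypothetical implementation used only for illustration: counts the lines it reads.
class LineCountingProcessor implements StreamProcessor {
    private int lineCount;

    @Override
    public void process(InputStream in) throws IOException {
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        while (reader.readLine() != null) {
            lineCount++;
        }
    }

    int getLineCount() {
        return lineCount;
    }
}
```

A test for it can pass new ByteArrayInputStream("a\nb\nc\n".getBytes(StandardCharsets.UTF_8)) to process() and assert on getLineCount(); no FileSystem, Path or provider mocks are involved.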
In the comments you said:
Sounds good but I have to live with tons of legacy code. I'm starting to introduce unit test and don't want to refactor too much "application" code.
The best way to do it is what I said above. However, if you want to do the smallest amount of changes to add tests, here is what you should do:
Old code:
public void operation() {
String filename = /* such way to retrieve filename, ie database access */
try (InputStream in = Files.newInputStream(fileSystem.getPath(filename))) {
/* file content handling */
} catch (IOException e) {
/* business error management */
}
}
New code:
public void operation() {
String filename = /* such way to retrieve filename, ie database access */
try (InputStream in = Files.newInputStream(fileSystem.getPath(filename))) {
new StreamProcessor().process(in);
} catch (IOException e) {
/* business error management */
}
}
public class StreamProcessor {
public void process(InputStream in) throws IOException {
/* file content handling */
/* just cut-paste the other code */
}
}
This is the least invasive way to do what I describe above. The original way I describe is better, but obviously it's a more involved refactor. This way should involve almost no other code changes, but will allow you to write your tests.
How can I mock every object of a class?
What I want is for any file object to return true when exists() is called on it.
Something like:
Mockito.mock(File.class)
//return true for any object of File that calls exist()
File file = new File("thisDoesntExist");
assertEquals(true, file.exists());
How can this be done?
This is the method under test (cut down)
@Override
public void load(InputArchive archive, int idx)
{
archive.beginNode("Files", idx);
File file = new File(archive.load("Path"));
if(file.exists())
{
//if it gets here it'll pass the test
}
}
I think that the above will solve my problem, but in case there's a better/alternative way to solve my problem I'll tell you why I'm trying to do this:
The reason I want to do this is that I'm reading an XML which will create a file based off a tag, it will then test this fileObjectCreatedFromXML to see if it exists and if it does then it will do some other stuff which I need it to do.
It is possible to mock your File object even if it is created inside your class and you have no way to inject or reference it.
I had this problem a few weeks ago, and PowerMock can help you here.
You have to annotate your test class to run with PowerMockRunner. See the following example:
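import static org.mockito.Mockito.mock;
import java.io.File;
import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.powermock.api.mockito.PowerMockito;
import org.powermock.core.classloader.annotations.PrepareForTest;
import org.powermock.modules.junit4.PowerMockRunner;

@RunWith(PowerMockRunner.class)
@PrepareForTest(MyClassThatWillBeTested.class)
public class MyUnitTest {
    private File mockedFile = mock(File.class);
    @Before
    public void setUp() throws Exception {
        // Every "new File(...)" inside the prepared class now returns the mock.
        PowerMockito.whenNew(File.class).withAnyArguments().thenReturn(mockedFile);
    }
    @Test
    public void myTestMethod() {
        //test your method here...
    }
}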
If you create only one file object this should work well for you.
Also now you can manipulate your mock object to return what you want.
when(mockedFile.exists()).thenReturn(true);
We have a class in our project, let it be named AttributeUpdater, that handles copying values from one entity to another. The core method traverses the attributes of an entity and copies them, as specified, into the second one. During that loop the AttributeUpdater collects all reports, which contain information about which values were overwritten during copying, into a list for eventual logging purposes. This list is cleared if the old entity whose values were overwritten was never persisted to the database, because in that case you would only overwrite default values, and logging that is deemed redundant. In pseudo Java code:
public class AttributeUpdater {
public static CopyResult updateAttributes(Entity source, Entity target, String[] attributes) {
List<CopyReport> reports = new ArrayList<CopyReport>();
for(String attribute : attributes) {
reports.add(copy(source, target, attribute));
}
if(target.isNotPersisted()) {
reports.clear();
}
return new CopyResult(reports);
}
}
Now someone had the epiphany that there is one case in which the reports actually matter even if the entity has not been persisted yet. This would not be a big deal if I could just add another parameter to the method signature, but that is not really an option due to the actual structure of the class and the amount of refactoring required. Since the method is static, the only other solution I came up with is adding a flag as a static field and setting it just before the function call.
public class AttributeUpdater {
public static final ThreadLocal<Boolean> isDeletionEnabled = new ThreadLocal<Boolean>() {
@Override protected Boolean initialValue() {
return Boolean.TRUE;
}
};
public static Boolean getDeletionEnabled() { return isDeletionEnabled.get(); }
public static void setDeletionEnabled(Boolean b) { isDeletionEnabled.set(b); }
public static CopyResult updateAttributes(Entity source, Entity target, String[] attributes) {
List<CopyReport> reports = new ArrayList<CopyReport>();
for(String attribute : attributes) {
reports.add(copy(source, target, attribute));
}
if(isDeletionEnabled.get() && target.isNotPersisted()) {
reports.clear();
}
return new CopyResult(reports);
}
}
ThreadLocal is a container used for thread safety. This solution, while it does the job, has, at least for me, one major drawback: for all the other methods that assume the reports are deleted, there is now no way of guaranteeing that those reports will be deleted as expected. Again, refactoring is not an option. So I came up with this:
public class AttributeUpdater {
private static final ThreadLocal<Boolean> isDeletionEnabled = new ThreadLocal<Boolean>() {
@Override protected Boolean initialValue() {
return Boolean.TRUE;
}
};
public static Boolean getDeletionEnabled() { return isDeletionEnabled.get(); }
public static void disableDeletionForNextCall() { isDeletionEnabled.set(Boolean.FALSE); }
public static CopyResult updateAttributes(Entity source, Entity target, String[] attributes) {
List<CopyReport> reports = new ArrayList<CopyReport>();
for(String attribute : attributes) {
reports.add(copy(source, target, attribute));
}
if(isDeletionEnabled.get() && target.isNotPersisted()) {
reports.clear();
}
isDeletionEnabled.set(Boolean.TRUE);
return new CopyResult(reports);
}
}
This way I can guarantee that for old code the function will always work like it did before the change. The downside of this solution, especially for nested entities, is that I am going to be accessing the ThreadLocal container a lot: iterating over one of those means calling disableDeletionForNextCall() for each nested element. Also, as the method is called a lot overall, there are valid performance concerns.
TL;DR: Look at pseudo Java source code. First one is old code, second and third are different attempts to allow deletion disabling. Parameters cannot be added to method signature.
Is there a possibility to determine which solution is better or is this merely a philosophical issue? Or is there even a better solution to this problem?
The obvious way to decide which solution is better in terms of performance would be benchmarking this. As both solutions access the thread-local variable at least for reading, I doubt that they would differ too much. You could perhaps combine them like this:
if(!isDeletionEnabled.get())
isDeletionEnabled.set(Boolean.TRUE);
else if (target.isNotPersisted())
reports.clear();
In this case, you will have the benefit of the second solution (guaranteed resetting of the flag) without unnecessary writes.
I doubt there will be much practical difference. With a bit of luck, the HotSpot JVM will compile the thread local variable into some nice native code which works without too much of a performance penalty, though I have no actual experience there.
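As a rough, self-contained illustration of the combined check, the sketch below uses a simplified stand-in for AttributeUpdater. The class name ReportCollector, the string reports and the boolean parameter are all assumptions made so that the example compiles on its own; the real method keeps its original signature.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for AttributeUpdater: reports are plain strings and the
// "persisted" state is passed as a boolean, both purely for illustration.
class ReportCollector {
    private static final ThreadLocal<Boolean> deletionEnabled =
            ThreadLocal.withInitial(() -> Boolean.TRUE);

    static void disableDeletionForNextCall() {
        deletionEnabled.set(Boolean.FALSE);
    }

    static List<String> collect(List<String> attributes, boolean targetPersisted) {
        List<String> reports = new ArrayList<>();
        for (String attribute : attributes) {
            reports.add("copied " + attribute);
        }
        if (!deletionEnabled.get()) {
            // Deletion was switched off for this call: keep the reports and re-arm the flag.
            deletionEnabled.set(Boolean.TRUE);
        } else if (!targetPersisted) {
            // Default behaviour: reports for a never-persisted target are discarded.
            reports.clear();
        }
        return reports;
    }
}
```

The thread-local is written only on the calls that were preceded by disableDeletionForNextCall(); all other calls just read it once.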
I inherited an application which uses a Java properties file to define configuration parameters such as the database name.
There is a class called MyAppProps that looks like this:
public class MyAppProps {
    protected static final String PROP_FILENAME = "myapp.properties";
    protected static Properties myAppProps = null;
    public static final String DATABASE_NAME = "database_name";
    public static final String DATABASE_USER = "database_user";
    // etc...
    protected static void init() throws MyAppException {
        try {
            ClassLoader loader = MyAppException.class.getClassLoader();
            InputStream is = loader.getResourceAsStream(PROP_FILENAME);
            myAppProps = new Properties();
            myAppProps.load(is);
        } catch (Exception e) {
            throw new MyAppException(e.getMessage());
        }
    }
    protected static String getProperty(String name) throws MyAppException {
        if (myAppProps == null) {
            throw new MyAppException("Properties was not initialized properly.");
        }
        return myAppProps.getProperty(name);
    }
}
Other classes which need to get property values contain code such as:
String dbname = MyAppProps.getProperty(MyAppProps.DATABASE_NAME);
Of course, before the first call to MyAppProps.getProperty, MyAppProps needs to be initialized like this:
MyAppProps.init();
I don't like the fact that init() needs to be called. Shouldn't the initialization take place in a static initialization block or in a private constructor?
Besides that, something else seems wrong with the code, and I can't quite put my finger on it. Are Properties instances typically wrapped in a customized class? Is there anything else here that is wrong?
If I make my own wrapper class like this, I always prefer to make strongly typed getters for the values, instead of exposing all the inner workings through the static final variables.
private static final String DATABASE_NAME = "database_name";
private static final String DATABASE_USER = "database_user";
public String getDatabaseName(){
return getProperty(MyAppProps.DATABASE_NAME);
}
public String getDatabaseUser(){
return getProperty(MyAppProps.DATABASE_USER);
}
A static initializer looks like this:
static {
init();
}
This being said, I will readily say that I am no big fan of static initializers.
You may consider looking into dependency injection (DI) frameworks like Spring or Guice. These let you inject the appropriate value directly into the places where you need it, instead of going through the indirection of an additional class. A lot of people find that using these frameworks reduces the focus on this kind of plumbing code, but only once you are past the learning curve of the framework. (DI frameworks are quick to learn but take quite some time to master, so this may be a bigger hammer than you really want.)
Reasons to use static initializer:
Can't forget to call it
Reasons to use an init() function:
You can pass parameters to it
Easier to handle errors
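The error-handling point can be seen in a small sketch. A static initializer cannot throw a checked exception, so a failure has to be wrapped in an unchecked error. The class name StaticProps is made up for this example, and a StringReader stands in for the real myapp.properties stream so that the code runs on its own.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical sketch: properties loaded once in a static initializer.
class StaticProps {
    private static final Properties PROPS;

    static {
        Properties loaded = new Properties();
        // A StringReader stands in for the real properties file (assumption for the demo).
        try (StringReader source = new StringReader("database_name=demo_db\n")) {
            loaded.load(source);
        } catch (IOException e) {
            // Checked exceptions cannot escape a static initializer, so wrap the failure.
            throw new ExceptionInInitializerError(e);
        }
        PROPS = loaded;
    }

    static String getDatabaseName() {
        return PROPS.getProperty("database_name");
    }
}
```

Callers never have to remember an init() call, but a loading failure surfaces as an error the first time the class is used, and nothing can be passed in to change where the properties come from.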
I've created property wrappers in the past to good effect. For a class like the example, the important thing to ensure is that the properties are truly global, i.e. a singleton really makes sense. With that in mind a custom property class can have type-safe getters. You can also do cool things like variable expansion in your custom getters, e.g.:
myapp.data.path=${myapp.home}/data
Furthermore, in your initializer, you can take advantage of property file overloading:
Load in "myapp.properties" from the classpath
Load in "myapp.user.properties" from the current directory using the Properties override constructor
Finally, load System.getProperties() as a final override
The "user" properties file doesn't go in version control, which is nice. It avoids the problem of people customizing the properties file and accidentally checking it in with hard-coded paths, etc.
Good times.
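A minimal sketch of the layering and variable expansion described above might look like the following. The class name LayeredProps and its method names are assumptions, the expansion is only one level deep, and the two Properties arguments stand in for "myapp.properties" and "myapp.user.properties".

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper; the class and method names are assumptions for illustration.
class LayeredProps {
    private static final Pattern VARIABLE = Pattern.compile("\\$\\{([^}]+)\\}");
    private final Properties props;

    // 'defaults' plays the role of myapp.properties, 'overrides' of myapp.user.properties.
    LayeredProps(Properties defaults, Properties overrides) {
        // Properties(Properties) makes lookups fall back to 'defaults' when a key is absent.
        this.props = new Properties(defaults);
        this.props.putAll(overrides);
    }

    // Returns the value with each ${name} replaced by that property's value (one level deep).
    String get(String key) {
        String raw = props.getProperty(key);
        if (raw == null) {
            return null;
        }
        Matcher matcher = VARIABLE.matcher(raw);
        StringBuffer expanded = new StringBuffer();
        while (matcher.find()) {
            // Unknown variables are left untouched, e.g. "${nope}" stays "${nope}".
            String replacement = props.getProperty(matcher.group(1), matcher.group(0));
            matcher.appendReplacement(expanded, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(expanded);
        return expanded.toString();
    }
}
```

With myapp.home=/opt/myapp and myapp.data.path=${myapp.home}/data in the defaults, and a user file that overrides myapp.home, get("myapp.data.path") resolves against the overridden home directory. A final layer for System.getProperties() could be added in the same way.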
You can use either a static block or a constructor. The only advice I have is to use ResourceBundle instead; that might suit your requirements better. For more, please follow the link below.
Edit:
ResourceBundles vs Properties
The problem with static methods and classes is that you can't override them for test doubles. That makes unit testing much harder. I have all variables declared final and initialized in the constructor. Whatever is needed is passed in as parameters to the constructor (dependency injection). That way you can substitute test doubles for some of the parameters during unit tests.
For example:
public class MyAppProps {
protected static final String PROP_FILENAME = "myapp.properties";
protected Properties props = null;
public String DATABASE_NAME = "database_name";
public String DATABASE_USER = "database_user";
// etc...
public MyAppProps(InputStream is) throws MyAppException {
try {
props = new Properties();
props.load(is);
} catch (Exception e) {
throw new MyAppException(e.getMessage());
}
}
public String getProperty(String name) {
return props.getProperty(name);
}
// Need this function static so
// client objects can load the
// file before an instance of this class is created.
public static String getFileName() {
return PROP_FILENAME;
}
}
Now, call it from production code like this:
String fileName = MyAppProps.getFileName();
ClassLoader loader = MyAppException.class.getClassLoader();
InputStream is = loader.getResourceAsStream(fileName);
MyAppProps p = new MyAppProps(is);
The dependency injection is when you include the input stream in the constructor parameters. While this is slightly more of a pain than just using the static class / Singleton, things go from impossible to simple when doing unit tests.
For unit testing, it might go something like:
@Test
public void testStuff() throws Exception {
    // Setup
    InputStreamTestDouble isTD = new InputStreamTestDouble();
    MyAppProps instance = new MyAppProps(isTD);
    // Exercise: getProperty returns a String, so parse it before comparing numbers
    int actualNum = Integer.parseInt(instance.getProperty("foo"));
    // Verify
    int expectedNum = 42;
    assertEquals("MyAppProps didn't get the right number!", expectedNum, actualNum);
}
The dependency injection made it really easy to substitute a test double for the input stream. Now, just load whatever stuff you want into the test double before giving it to the MyAppProps constructor. This way you can test how the properties are loaded very easily.
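In practice the test double can often be a plain ByteArrayInputStream holding the properties text. The sketch below uses a simplified stand-in for MyAppProps: the names InjectedProps and streamOf are made up, and it throws an unchecked exception instead of MyAppException so that it compiles by itself.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Simplified stand-in for MyAppProps; it throws an unchecked exception instead of
// MyAppException so that the sketch compiles on its own.
class InjectedProps {
    private final Properties props = new Properties();

    // The stream is injected, so tests can supply any content they like.
    InjectedProps(InputStream is) {
        try {
            props.load(is);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    String getProperty(String name) {
        return props.getProperty(name);
    }

    // Test helper (an assumption, not part of the answer above): wraps literal
    // properties text in an in-memory stream. Properties.load(InputStream)
    // expects ISO-8859-1 encoded input.
    static InputStream streamOf(String propertiesText) {
        return new ByteArrayInputStream(propertiesText.getBytes(StandardCharsets.ISO_8859_1));
    }
}
```

A test can then build new InjectedProps(InjectedProps.streamOf("foo=42\n")) and assert on getProperty("foo") without any file on disk or on the classpath.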