A complication in using the Factory design pattern (Java)

As it is said "if we have a super class and n sub-classes, and based on data provided, we have to return the object of one of the sub-classes, we use a factory pattern"
Situation:
I have 20 clients, and more can be added over time. Each will provide a file from which data is extracted and inserted into the db. Every client has its own style of maintaining the file, i.e. the data fields are in different places.
Solution:
For this I think I will have to use the Factory design pattern: I create 20 classes, and every class has its own implementation for every field, i.e. how and from which place in the file to extract it. When a new client is added, I just create a new class and I am done; no other changes are required.
Am I correct so far?
Complexity:
Now the problem is that the files the clients provide can be in any of 4 formats (PDF, XLS(X), HTML, TXT). The engine that extracts text from these formats has to be static; for example, I use pdftoXML to convert PDF to XML. If I don't create a separate engine class that just converts PDF to XML, I would have to rewrite the PDF text extraction in the class of every client that provides a PDF file. The same is true of the Excel extraction engine.
Question:
How should I incorporate these engines into the Factory pattern? Should the engine classes be static, so that a subclass that has to deal with PDF, for example, calls the PDF class's extract method to get the required data, or something else?
Hope I made myself clear, thanks

I think this is a good use of Factory Method, but don't try to use the client class hierarchy to model the extraction algorithms as well. In terms of patterns, you could use Strategy to dynamically configure a client object with the right kind of extraction algorithm.

You have (at least) two things going on here.
Extracting text from a range of file formats
Parsing this text according to client specific rules.
I would separate these functions completely. Perhaps even have two factories, one for text extraction and one for parsing. The client-specific code that does the parsing does not need to know whether the text came from a PDF or a CSV, or was piped in over HTTP. It only needs to know about parsing text.
Hope this helps.
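The separation described above could look roughly like the following. This is a minimal sketch, not the poster's code: the names TextExtractor, ClientParser, ExtractorFactory, ParserFactory and ImportPipeline, the client ids and the file layouts are all hypothetical, and the extraction bodies are trivial stand-ins for real PDF/XLS tooling.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Format-specific concern: raw file content in, plain text out.
interface TextExtractor {
    String extract(String rawContent);
}

// Client-specific concern: plain text in, named fields out.
interface ClientParser {
    Map<String, String> parse(String text);
}

// Factory #1: picks an extractor by file format (hypothetical, only two formats shown).
final class ExtractorFactory {
    static TextExtractor forFormat(String format) {
        switch (format.toUpperCase(Locale.ROOT)) {
            case "TXT":
                return raw -> raw; // plain text needs no conversion
            case "HTML":
                return raw -> raw.replaceAll("<[^>]*>", ""); // naive tag stripping, illustration only
            default:
                throw new IllegalArgumentException("Unsupported format: " + format);
        }
    }
}

// Factory #2: picks a parser by client (hypothetical clients and layouts).
final class ParserFactory {
    static ClientParser forClient(String clientId) {
        switch (clientId) {
            case "acme":   // assumed layout: "name;amount"
                return text -> fields(text.trim().split(";"), 0, 1);
            case "globex": // assumed layout: "amount|name"
                return text -> fields(text.trim().split("\\|"), 1, 0);
            default:
                throw new IllegalArgumentException("Unknown client: " + clientId);
        }
    }

    private static Map<String, String> fields(String[] parts, int nameIdx, int amountIdx) {
        Map<String, String> result = new HashMap<>();
        result.put("name", parts[nameIdx]);
        result.put("amount", parts[amountIdx]);
        return result;
    }
}

// Glue: the parser never learns which format the text came from.
final class ImportPipeline {
    static Map<String, String> importFile(String clientId, String format, String raw) {
        String text = ExtractorFactory.forFormat(format).extract(raw);
        return ParserFactory.forClient(clientId).parse(text);
    }
}
```

With this split, a new client adds one parser case and a new format adds one extractor case; neither touches the other.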

To start with, a few questions. These may change the suggested solution:
Does each of the 20 clients have a single FileFormat?
What is common (behavior/methods) between the subclasses?
Is there ever a chance that a subclass might change such that it utilizes a different FileFormat?
In the simplest solution, where subclasses do not need to switch file types, it seems you could have an abstract ContentProvider which is subclassed by a set of abstract classes PdfProvider, XlsProvider, HtmlProvider, and TxtProvider. This middle layer of classes implements the file-format-specific functionality. Then your 20 "client" classes inherit from the appropriate FileFormat-specific base.
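A rough sketch of that three-layer hierarchy, under stated assumptions: only two of the four format layers are shown, the method names and the two client classes (AcmeProvider, GlobexProvider) are invented for illustration, and the extraction logic is a placeholder rather than real HTML handling.

```java
// Top layer: defines the two steps and how they combine.
abstract class ContentProvider {
    // Implemented once per file format.
    protected abstract String extractText(String raw);

    // Implemented once per client.
    protected abstract String findCustomerName(String text);

    // Fixed workflow: extract first, then locate the field.
    final String customerName(String raw) {
        return findCustomerName(extractText(raw));
    }
}

// Middle layer: format-specific functionality.
abstract class TxtProvider extends ContentProvider {
    @Override
    protected String extractText(String raw) {
        return raw; // already plain text
    }
}

abstract class HtmlProvider extends ContentProvider {
    @Override
    protected String extractText(String raw) {
        return raw.replaceAll("<[^>]*>", " ").trim(); // naive tag stripping, illustration only
    }
}

// Bottom layer: hypothetical client whose name is on the first line of a text file.
final class AcmeProvider extends TxtProvider {
    @Override
    protected String findCustomerName(String text) {
        return text.split("\n")[0].trim();
    }
}

// Bottom layer: hypothetical client whose HTML contains a "Customer:" label.
final class GlobexProvider extends HtmlProvider {
    private static final String LABEL = "Customer:";

    @Override
    protected String findCustomerName(String text) {
        int start = text.indexOf(LABEL);
        if (start < 0) {
            throw new IllegalArgumentException("No customer label found");
        }
        return text.substring(start + LABEL.length()).trim();
    }
}
```

Note that this ties each client to one format at compile time, which is why the third question above matters.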


Why use a separate string constants file?

Given the way the String pool works, why is it preferred to create a separate file for storing constants?
String str = "test";
Suppose the String "test" is used over 50 times in the entire application. Why is it recommended to store this in a constants file? The String pool should create only one object, if it doesn't already exist in the pool, and then share references to that object wherever it is needed. So why create a separate constants file?
The major importance comes with internationalisation, having a strings file means you have a few places in which you have to search for translatable text.
Consider the case when your app becomes a hit and it's being shipped outside of the locale in which it was developed, with a unified strings file, you just have to do a simple conversion of the file content, while with other methods you will have to search every single file and replace every single instance before your application is barely usable to people in different locales.
A very good example is the android resource system for strings. If you keep strings in specific files, it's easier managing them and translating them.
You get the following advantages:
Code duplication is avoided across the project. The defined string is constant across the project and can be reused in multiple places
If a user wants to modify the string value across the codebase, he/she has to change it in only one place in the file.
Strings don’t clutter up your project code, leaving it clear and easy to maintain
If you want to support localization/i18n in other languages, it is highly useful. Putting strings in resource files makes it much easier to provide separate translations of each string for different languages.
You need some storage for these literals (not necessarily a file) for a simple reason: if you need to change the value of a literal, you change it in only one place. There are 2 main solutions for this:
use constant class with public static final Object CONSTANT fields.
use external storage (for example, file)
The first case has 2 problems: first, if you change these values, you need to recompile the project; second, if these variables are used by some framework, you need to pass them directly in your code. Also, when you need to support i18n, you can run into problems managing the right variants.
But when you use external storage, all you need is to implement a parser for your storage (JSON, XML, a db and so on). Some frameworks already implement this feature, so you don't need to do this work yourself.
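As an illustration of the external-storage option, here is a minimal sketch that reads literals from `.properties`-style text with `java.util.Properties`. The Messages class, the key names and the fallback format are assumptions made for the example; in a real project the text would typically come from a file or classpath resource rather than a string.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical wrapper around externally stored string literals.
final class Messages {
    private final Properties props = new Properties();

    // Takes the properties text directly so the sketch stays self-contained.
    Messages(String propertiesText) {
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns the stored value, or a visible marker when the key is not defined.
    String get(String key) {
        return props.getProperty(key, "!" + key + "!");
    }
}
```

Changing a value then means editing the stored text, not recompiling the classes that use it.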
However, you are not required to do this; it's just really good practice.
Simple: because code duplication is the root of all evil.
Constants are less about performance than about maintaining an application in the long run. As soon as some "property" is used with the same meaning in different places - you want to make sure that all usages are based on the very same definition. So that when updates are necessary, only one line of code needs to be adapted. And not 50, with an ever growing chance of missing to update one or the other occurrence.
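A small sketch of the "one definition, many usages" idea. The class names AppConstants, OrderService and ReportService and the constant name are hypothetical.

```java
// Single place where the value is defined.
final class AppConstants {
    static final String STATUS_TEST = "test";

    private AppConstants() {
        // no instances, this class only holds constants
    }
}

// Two unrelated usages that both depend on the same definition.
final class OrderService {
    static boolean isTestOrder(String status) {
        return AppConstants.STATUS_TEST.equals(status);
    }
}

final class ReportService {
    static String label() {
        return "mode=" + AppConstants.STATUS_TEST;
    }
}
```

If the value ever changes, only the line in AppConstants is edited. Because the compiler inlines such compile-time constants into the classes that use them, those classes still have to be recompiled.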
Because everything has its place and there's a place for everything. If it's a constant string that, hopefully, will never change and it's used application-wide then logic dictates it (and things like it) should be defined separately for both maintenance and code readability concerns.
That is because if anyone wants to change this string in the future, they don't need to search for it across the whole project. They just change it in the constants file.

How do I replace Java's default deserialization with my own readObject call?

Someone thought it would be a good idea to store Objects in the database in a blob column using Java's default serialization methods.
The structure of these objects is controlled by another group and they changed a field type from BigDecimal to a Long,
but the data in our database remains the same.
Now we can't read the objects back because it causes ClassCastExceptions.
I tried to override it by writing my own readObject method,
but that throws a StreamCorruptedException because of what was written by the default writeObject method.
How do I make my readObject call behave like Java's default one?
Is there a certain number of bytes I can skip to get to my data?
Externalizable allows you to take full control of serialization/deserialization. But it means you're responsible for writing and reading every field.
When it gets difficult though is when something was written out using the default serialization and you want to read it via Externalizable. (Or rather, it's impossible. If you try to read an object serialized with the default method using Externalizable, it'll just throw an exception.)
If you've got absolutely no control on the output, your only option is to keep two versions of the class: use the default deserialization of the old version, then convert to the new. The upside of this solution is that it keeps the "dirty" code in one place, separate from your nice and clean objects.
Again, unless you want to do things really complicated, your best option is to keep the old class as the "transport" bean and rename the class your code really uses to something else.
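A minimal sketch of the "transport bean" idea, under stated assumptions: LegacyOrder stands in for the old class shape (BigDecimal field), Order for the class your code really uses (Long field), and BlobCodec for the database blob round trip. All three names and the `quantity` field are hypothetical. In a real migration the legacy class would have to keep the fully qualified name and serialVersionUID recorded in the stored stream, which this toy example sidesteps by writing the blob itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.math.BigDecimal;

// Old shape of the class, kept only for reading existing blobs.
final class LegacyOrder implements Serializable {
    private static final long serialVersionUID = 1L;
    final BigDecimal quantity;

    LegacyOrder(BigDecimal quantity) {
        this.quantity = quantity;
    }
}

// New shape used by the rest of the code; the conversion lives in one place.
final class Order {
    final Long quantity;

    Order(Long quantity) {
        this.quantity = quantity;
    }

    static Order fromLegacy(LegacyOrder legacy) {
        // longValueExact throws ArithmeticException instead of silently dropping a fraction
        return new Order(legacy.quantity == null ? null : legacy.quantity.longValueExact());
    }
}

// Default Java serialization to and from a byte array (stand-in for the blob column).
final class BlobCodec {
    static byte[] write(Serializable value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object read(byte[] blob) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(blob))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Reading is then a two-step operation: deserialize into the legacy type with the default mechanism, then call `Order.fromLegacy` on the result.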
If you want to read what's already in your database your only option is to get them to change the class back again, and to institute some awareness that you're relying on the class definition as it was when the class was serialized. Merely implementing your own readObject() call can't fix this, and if the class is under someone else's control you can't do that anyway.
If you're prepared to throw away the existing data you have many other choices starting with custom Serialization, writeReplace()/readResolve(), Externalizable, ... or a different mechanism such as XML.
But if you're going to have third parties changing things whenever they feel like it you're always going to have problems of one kind or another.
BigDecimal to Long sounds like a retrograde step anyway.
Implement the readObject and readObjectNoData methods in your class.
Read the appropriate type using ObjectInputStream.readObject and convert it to the new type.
See the Serializable interface API for details.
More Details
You can only fix this easily if you control the source of the class that was serialized into the blob.
If you do not control this class,
then you have only a few limited and difficult options:
Have the controlling party give you a version of the class that reads the old format and writes the new format.
Write your own form of serialization (as in, you read the blob and convert the bytes to classes) that can read the old format and generate new versions of the classes.
Write your own version of the class in question (remove the other from the class path) which reads the old format and produces some intermediate form (perhaps JSON).
Next you have to do one of these
Convince the powers that be that the blob technique is shitty and should be done away with. Use the current class change as evidence. Almost any technique is better than this. Writing JSON to the db in the blob is better.
Stop depending on shitty classes from other people. (shitty is a judgement which I can only suspect, not know, is true). Instead create a suite of classes that represent the data in the database and convert from the externally controlled classes to the new data classes before writing to the database.

Design pattern for mapping field from one object to another

I am writing a component that fits into a 3rd party framework. The component exports orders into a specific file format, ready to be transported to a separate backend system.
The backend system has a very different view of the data, with specific restrictions on field lengths and formats that the framework doesn't have. Therefore I need to be able to:
1. Store/know about these rules
2. Take the data from the framework
3. Transform based on the data received and the rules I mentioned in point 1
4. Write the transformed data to file
Are there any design patterns for this type of functionality? In particular, where should the mapping rules go:
- xml config
- directly in a class
- something else?
Adapter is used to adapt from one interface to another.
There are different ways to accomplish this, but you can simply implement the two interfaces on one adapter class, and/or compose the adapter from an instance of another class or classes.
The Adapter pattern (more specifically Object Adapter pattern) contains an instance of the class it wraps. In this situation, the adapter makes calls to the instance of the wrapped object. The pattern itself allows the interface of an existing class to be used from another interface. Hope this helps!
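Applied to the order export from the question, an Object Adapter might look like the sketch below. Everything here is assumed for illustration: FrameworkOrder stands in for the third-party type, BackendOrderRecord for what the file writer expects, and the rules (name limited to 10 characters, amount as 8 zero-padded digits) are made-up examples of backend restrictions kept directly in the class.

```java
import java.util.Locale;

// Hypothetical stand-in for the framework's order type.
final class FrameworkOrder {
    final String customerName;
    final long totalCents;

    FrameworkOrder(String customerName, long totalCents) {
        this.customerName = customerName;
        this.totalCents = totalCents;
    }
}

// Hypothetical target interface: the backend's view of an order.
interface BackendOrderRecord {
    String name();   // assumed rule: at most 10 characters
    String amount(); // assumed rule: cents, zero-padded to 8 digits
}

// Object Adapter: wraps the framework object and applies the backend rules on access.
final class OrderAdapter implements BackendOrderRecord {
    private static final int MAX_NAME_LENGTH = 10;

    private final FrameworkOrder order;

    OrderAdapter(FrameworkOrder order) {
        this.order = order;
    }

    @Override
    public String name() {
        String n = order.customerName;
        return n.length() > MAX_NAME_LENGTH ? n.substring(0, MAX_NAME_LENGTH) : n;
    }

    @Override
    public String amount() {
        return String.format(Locale.ROOT, "%08d", order.totalCents);
    }
}
```

The file writer then depends only on BackendOrderRecord, so the rules could later move to configuration without changing the writer.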

How to find all Java method calls and given parameters in a project programmatically?

I have a multilingual web application that gets all of the translations from a single object, for example lang.getTranslation("Login") and the object is responsible for finding the translation in an xml file in the user's language.
What I'd like to do is write a script / custom static analysis that outputs all the missing translations and the translations that are no longer used in the application. I believe I should start by programmatically finding every call to the getTranslation method and its string parameter, storing the data in a special structure and comparing it to all the translation files.
Is there a library that will allow me to do this easily? I already found Javassist, but I can't use it to read the parameter values. I also tried grepping, but I'm not sure that's a robust solution (in case there is a call to another class that has a getTranslation method). I know Qt has a similar mechanism for finding translatable strings in the code, but that's a totally different technology.
I'm asking this because I'm quite sure there's a good existing solution for this and I don't want to reinvent the wheel.
Ok, here's how I did it. Not the optimal solution, but works for me. I created a utility program in Java to find all the method calls and compare the parameters to existing translations.
Find all classes in my project's root package using the Reflections library
Find all getTranslation method calls of the correct class in the classes using the Javassist library and create a list of them (contains: package, class, row number)
Read the appropriate .java files in the project directory from the given row until the ';' character
Extract the parameter value and add it to a list
Find the missing translations and output them
Find the redundant translations and output them
It took me a while to do this, but at least I now have a reusable utility to keep the translation files up to date.
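The last three steps (extracting the parameter and computing missing and redundant translations) can be sketched with the standard library alone. This is not the poster's utility: it swaps the Reflections/Javassist lookup for a plain regular expression over source text, so it does not check which class the getTranslation call belongs to and only sees string-literal arguments. The class name TranslationAudit and its method names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TranslationAudit {
    // Matches getTranslation("...") with a single string-literal argument.
    private static final Pattern CALL =
            Pattern.compile("getTranslation\\(\\s*\"([^\"]*)\"\\s*\\)");

    // Collects every literal key passed to getTranslation in the given source text.
    static Set<String> usedKeys(String source) {
        Set<String> keys = new TreeSet<>();
        Matcher m = CALL.matcher(source);
        while (m.find()) {
            keys.add(m.group(1));
        }
        return keys;
    }

    // Keys used in code but absent from the translation file.
    static Set<String> missing(Set<String> used, Set<String> defined) {
        Set<String> result = new TreeSet<>(used);
        result.removeAll(defined);
        return result;
    }

    // Keys present in the translation file but never used in code.
    static Set<String> redundant(Set<String> used, Set<String> defined) {
        Set<String> result = new TreeSet<>(defined);
        result.removeAll(used);
        return result;
    }
}
```

Both comparisons are plain set differences, so the same two methods can be run once per language file.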

Generate object model out of RelaxNG schema with RNGOM - how to start?

I want to generate an object model out of a RelaxNG schema.
Therefore I want to use the RNGOM Object Model/Parser (mainly because I could not find any alternative, although I don't even care about the language the parser is written in or generates). Now that I have checked out the RNGOM source from SVN, I have no idea how to use RNGOM, since there is no information out there about its usage.
A useful hint on how to start with RNGOM (a link, example, or any description which saves me from having to read and understand the whole source code of RNGOM) will be awarded as an answer.
Even better would be a simple example how to use the parser to generate an Object model out of an RNG file.
More info:
I want to generate Java classes out of the following RelaxNG Schema:
http://libvirt.org/git/?p=libvirt.git;a=tree;f=docs/schemas;hb=HEAD
I found out that the Glassfish guys are using rngom to generate the same object model I need, but I could not yet find out how they are using rngom.
A way to proceed could be to:
use Trang (the schema converter from the same project as Jing) to convert from Relax NG to XML Schema
use more common tools to generate classes (e.g. JaxB).
Hi, I ran into mostly the same requirement, except that I am concentrating on the Compact Syntax. Here is one way of doing what you want, but YMMV.
To give some context, my goal has 2 phases: (a) Trying to slurp RelaxNG Compact Syntax and traverse an object/tree to create Spring 4 POJOs usable in a Spring 4 Rest Controller. (b) From there I want to develop a request validator that uses the RNG Compact and automatically validates the request before Spring de-serializes the request. Basically scaffolding JSON REST API development using RelaxNG Compact Syntax as both design/documentation and JSON schema definition/validation.
For the first objective I thought about annotating the CompactSyntax with JJTree, but I am obviously not fluent in JavaCC, so I decided to take a more programmatic approach...
I analyzed and tested the code in several ways to determine if there was a tree implementation in binary, digested and/or nc packages but I don't think there is one (an om/tree) as such.
So my latest, actually successful approach has been to build upon binary: extend SchemaBuilderImpl, implement the visitor interface, and pass my custom SchemaBuilderImpl to CompactSyntax using the long constructor: CompactSyntax(CompactParseable parseable, Reader r, String sourceUri, SchemaBuilder sb, ErrorHandler eh, String inheritedNs)
When you call CompactParseable.parse you will get structured events in the visitor interface and I think this is good enough to traverse the rng schema and from here you could easily create an OM or tree.
But I am not sure this is the best approach. Maybe I missed something and there is in fact an OM/Tree built by the rngom implementation (in my case CompactSyntax) that you can traverse to determine parent/child relationships more easily. Or maybe there are other approaches to this.
Anyway, this is one approach that seems to be working for what I want. It is mostly visitor pattern based, and since the interfaces were there I decided to use them. Maybe it will work for you. Bottom line: I could not find a traversable OM/AST anywhere in the implementation packages (nc, binary, digested).
