How to handle multiple schemas containing the same data


I'm working on a system which predicts soccer matches at work. I have several pre-existing databases which each contain broadly the same data, although some vendors provide more data than others. I have a core set of fields that my application uses and which all vendors provide:
homeTeamId, awayTeamId, fullTimeHomeGoals, fullTimeAwayGoals, homeShotsOnTarget, awayShotsOnTarget, etc...
Because these databases have come from different sources the field-names vary. Also some of this data is subjective (the definition of a shot on target varies). This means that I need to know which vendor a match came from. There is also overlap because several vendors will have the data for a particular match.
At the moment we are using one source of data at a time, but we will use two or more vendors at once based on the competition covered by that vendor in future (by selecting based on competition we remove the issue of duplicate matches).
My solution was to use XML to store a mapping of the field names, e.g.:
<Schemas>
  <Schema>
    <SchemaName>VendorA</SchemaName>
    <TableName>VendorA_MatchResults</TableName>
    <FullTimeHomeGoals>homeFullTimeScore</FullTimeHomeGoals>
    Etc...
  </Schema>
</Schemas>
Then, whenever I need a field for a SQL query, I look at the vendor the user has specified in the job configuration XML and look up the fields relevant to that data vendor. When we come to use results from two vendors, I was planning to use a view and treat this as a new vendor in the XML.
This must be a reasonably common problem but I couldn't find anything online discussing how to tackle it. My gut instinct says the DB should be able to handle this internally instead, perhaps with a view?
I'd be grateful for any advice or ideas.
For background, I'm using MySQL and Java to develop this application.

I think what you are doing is good.
You should create a class which stores these configurations for each schema. Store these configs in a Map. Make sure that these entries and their configurations come from a config file, just like you did.
I would recommend Spring here, it makes your life easier. Each time when you want to add a new vendor, you just need to edit this config file, and restart your app.
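A minimal sketch of the class-plus-Map idea, using plain Java rather than Spring. The names VendorSchema, SchemaRegistry, column and selectFor are made up for illustration; in a real application the entries would be filled in from the XML config file instead of being registered by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for one vendor's table name and field-name mapping.
final class VendorSchema {
    private final String tableName;
    private final Map<String, String> columnsByField;

    VendorSchema(String tableName, Map<String, String> columnsByField) {
        this.tableName = tableName;
        this.columnsByField = new LinkedHashMap<>(columnsByField);
    }

    // Translate a core field name into this vendor's column name.
    String column(String field) {
        String column = columnsByField.get(field);
        if (column == null) {
            throw new IllegalArgumentException("No mapping for field: " + field);
        }
        return column;
    }

    // Build a SELECT that aliases each vendor column back to the core field name.
    // Table and column names are assumed to come from trusted config, not user input.
    String selectFor(String... fields) {
        StringBuilder sql = new StringBuilder("SELECT ");
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sql.append(", ");
            }
            sql.append(column(fields[i])).append(" AS ").append(fields[i]);
        }
        return sql.append(" FROM ").append(tableName).toString();
    }
}

// Hypothetical registry: one entry per vendor, keyed by the name used in the job config.
final class SchemaRegistry {
    private final Map<String, VendorSchema> schemas = new LinkedHashMap<>();

    void register(String vendor, VendorSchema schema) {
        schemas.put(vendor, schema);
    }

    VendorSchema forVendor(String vendor) {
        VendorSchema schema = schemas.get(vendor);
        if (schema == null) {
            throw new IllegalArgumentException("Unknown vendor: " + vendor);
        }
        return schema;
    }
}
```

Aliasing every column back to the core field name means the code that reads the result set never needs to know which vendor the row came from.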

Related

Data abstraction or Data Connector framework for Java

Note: There is a good chance I'm not using the correct terminology here, and that may be the reason I'm not finding the answers to my question. I apologize up front if this has already been answered; if so, please just direct me there.
I am looking for an open source framework written in Java that would allow me to build pluggable data connectors (and obviously have some built in already) and almost have a query language (abstraction layer) that would translate into any of those connections.
For example: I would be able to say:
Fetch 1 record from a Mongo DB that matches name='John Doe'
and get JSON as a response
or I could say
Fetch all records from a MySQL DB that matches name='John Doe'
and get a JSON as a response
If not exactly what I described, I am willing to work with anything that would have a part of this solved.
Thank you in advance!

You're not going to find a "Swiss army knife" data abstraction framework that does all of the above. Perhaps the closest things to what you ask for would be JPA providers for both Mongo and MySQL (Hibernate is a well-regarded JPA provider for MySQL, and a quick google search shows Kundera, DataNucleus and Hibernate OGM for Mongo). This will let you map your data to Java Objects, which might be a step further than what you ask for since you explicitly asked for JSON; however, there are numerous options for mapping the resulting objects into JSON if you need to present JSON to a user or another system (Jackson comes to mind for this).

Try YADA, an open source data-abstraction framework.
From the README:
YADA is like a Universal Remote Control for data.
For example, what if you could access
any data set
at any data source
in any format
from any environment
using just a URL
with just one-time configuration?
You can with YADA.
Or, what if you could get data
from multiple sources
in different formats
merging the results
into a single set
on-the-fly
with uniform column names
using just one URL?
You can with YADA.
Full disclosure: I am the creator of YADA.

Saving Data from a JavaFX-Application without Database

Unfortunately I couldn't find anything specific to this topic / to my problem. Here we go:
I'm building a JavaFX Business Application for a friend of mine. Unfortunately I do not have any possibility to connect to a Database. I want the Application to load a savestate from a file. The application contains a list with clients and the clients got some specific properties. I do not want to hardcode this to a .prop or .txt file, because I'm sure that there's a different way of doing this, isn't there?
Thanks in advance, appreciate it!

Lots of choices for persisting data to local storage. The exact choice depends on your needs. You do not describe enough details to make a specific recommendation.
Here is a list of possibilities, roughly in increasing order of complexity of your data.
Text file
If you have small amounts of simple data, save to a text file. You can store each piece in a separate file, or combine into a single file. Recent versions of Java have new classes to make this easier than ever. See Oracle Tutorial.
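As a rough illustration of the text-file option, here is a sketch using java.nio.file.Files. The class name ClientTextStore and the one-name-per-line layout are assumptions made for the example, not something from the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper: stores one client name per line in a UTF-8 text file.
final class ClientTextStore {

    // Overwrites the file with the given names, one per line.
    static void save(Path file, List<String> clientNames) throws IOException {
        Files.write(file, clientNames, StandardCharsets.UTF_8);
    }

    // Reads all lines back into a list, in file order.
    static List<String> load(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8);
    }
}
```

This only works while each value fits on a single line; once the clients have several properties each, one of the structured options below is a better fit.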
Comma-separated & tab-delimited
For sets of structured data, write to text files in comma-separated values (CSV) or tab-delimited values. For example a list of people with rows for each person, and columns for name, phone number, and email address.
While reading/writing such files is easy enough to program yourself, I suggest using an established library to eliminate the drudgery, avoid bugs, and save yourself some time. There are a few such libraries written in Java.
My favorite is the Apache Commons CSV project. This library makes easy work of the chore of reading/writing such files. Despite the name, this library supports tab-delimited as well as comma-separated formats. I've written a few Answers here on Stack Overflow showing how to use this library, as you can see here, here, and here.
By the way, plain old ASCII defines a few character positions explicitly for delimiting in data files, with four levels of grouping (document, group, record/row, and field). Unicode, of course, inherits these from ASCII as code points. I am puzzled why these have remained so obscure and so infrequently used. Seems much more logical to me than using commas and tabs which may well exist inside the data payload.
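To make the ASCII-delimiter idea concrete, here is a small sketch that uses the unit separator (code point 0x1F) between fields and the record separator (0x1E) between rows. The class and method names are invented for the example, and it assumes the data itself never contains those two control characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical encoder/decoder built on the ASCII unit and record separators.
final class AsciiDelimited {
    static final String UNIT_SEPARATOR = String.valueOf((char) 0x1F);   // between fields
    static final String RECORD_SEPARATOR = String.valueOf((char) 0x1E); // between records

    // Joins fields with the unit separator and records with the record separator.
    static String encode(List<List<String>> records) {
        StringBuilder out = new StringBuilder();
        for (int r = 0; r < records.size(); r++) {
            if (r > 0) {
                out.append(RECORD_SEPARATOR);
            }
            out.append(String.join(UNIT_SEPARATOR, records.get(r)));
        }
        return out.toString();
    }

    // Splits the text back into records and fields; the -1 limit keeps trailing empty fields.
    static List<List<String>> decode(String data) {
        List<List<String>> records = new ArrayList<>();
        for (String record : data.split(RECORD_SEPARATOR, -1)) {
            records.add(Arrays.asList(record.split(UNIT_SEPARATOR, -1)));
        }
        return records;
    }
}
```

Because commas, tabs and quotes are ordinary characters here, no quoting or escaping rules are needed for them.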
Serialization
You can write out the data values stored within an object. This is called serialization. Java has a serialization facility built-in, but be sure to study up on the details.
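A minimal sketch of the built-in facility, assuming a hypothetical ClientRecord class with two string fields. It serializes to a byte array to stay self-contained; writing to a file would wrap a FileOutputStream instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical client type; implementing Serializable opts it in to Java serialization.
final class ClientRecord implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final String email;

    ClientRecord(String name, String email) {
        this.name = name;
        this.email = email;
    }
}

final class ClientSerializer {

    // Writes the object graph into an in-memory buffer.
    static byte[] toBytes(ClientRecord client) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(client);
        }
        return buffer.toByteArray();
    }

    // Reconstitutes the object; only do this with bytes from a trusted source.
    static ClientRecord fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (ClientRecord) in.readObject();
        }
    }
}
```

One of the details worth studying is versioning: changing the class later can make old saved data unreadable unless serialVersionUID and the field layout are managed deliberately.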
To more simply write out an object’s values and later read them back in to reconstitute an object, I have enjoyed using the Simple XML Serialization project. This works well for relatively simple needs, and is aimed at the situation where you want the structure of a class to drive the process of determining what to write.
Java has other XML binding facilities both built-in and third-party. These are much more powerful in their flexibility. They are especially good for when you want to define and verify the XML structure in a rigid fashion such as defining a XML DTD or XML Schema against which to validate the data and perhaps even generate the Java class in which to represent the data.
Embedded database
For more complicated data, use an embedded relational database.
The SQLite database is bundled with many platforms. This is a C-based library, not pure Java. As the name indicates, SQLite is indeed quite “lite”, lacking rigid data types and many other common database features. SQLite is meant more as an alternative to writing text files than as a competitor to more serious databases. It is a great product if your needs fit the sweet spot of its capabilities.
My first choice for an embedded database would be H2 Database Engine. Built in pure Java. Can be run inside your app, or separately as a server (your choice). Has sophisticated relational database features. Has been around for years, often updated, and is well-worn. The principal author has much experience in the field.

Good practice for layered application with internationalization

I'm designing a new application in JSE which I want to internationalize.
I've never built such an application, so I'm looking for best practices for internationalization. The application will be writing the translated data to files or a DB. I've searched for best practices, but I didn't find anything about my main question (the first one below).
Should I put all the internationalization data in a layer of its own, or next to the objects it relates to?
Could I directly use the properties files as a kind of enum to do a switch case?
Or can I reverse engineer the data caught, determine the default internationalized value, and work with it?

I did encounter several strategies. I would start with a properties file.
One factor is that the data must be professionally maintained:
keep it in version control.
keep a version number for us humans, "1.0.23"
keep the texts ordered and nice, to help translation.
keep a second properties file with a glossary for consistent translation.
Furthermore, I have seen properties files or Java ListResourceBundles generated from DocBook XML, Excel, and translation memories. And yes, from a database.
Maintenance of the data must be done carefully, as several different parties will use the text at different times.
Programming tools, consistency checks, data preparation, and communication are tasks not to neglect.
Properties files are not entirely ideal, but IDEs generally have some support for them.
Set up everything for UTF-8, though take note that properties files use ISO-8859-1; you can use \uXXXX escaping or do an encoding conversion in your build process. Generated ListResourceBundle Java sources would then be an alternative.
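The \uXXXX escaping mentioned above can be sketched as follows. EscapedProperties and its two methods are hypothetical names; the escape method is a simplified stand-in for a build-time conversion step and only handles non-ASCII characters, not backslashes or other special characters in properties syntax.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper showing how escaped text survives an ASCII-only properties file.
final class EscapedProperties {

    // Replaces every non-ASCII character with its backslash-u escape (simplified).
    static String escape(String value) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    // Parses properties text; Properties.load decodes the escapes back into characters.
    static Properties parse(String text) throws IOException {
        Properties properties = new Properties();
        properties.load(new StringReader(text));
        return properties;
    }
}
```

With this, a translated value can be stored as plain ASCII in the file and still come back as the original character when the application reads it.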

Set up permissions/features in desktop application?

I have a desktop application that consists of 10 features, and some clients ask for only 7 or 8 of them.
I want a way to manage adding/removing permissions/features per client (only I can control that), so that I can hide/show a feature based on a flag.
Should that be done through a properties file that contains the name of each feature with a boolean flag, or something else?
Please give me some ideas, thanks.

From your other answers, it sounds to me like the following additional details have cropped up; please let me know if I have these wrong:
You're delivering your application as a .jar file,
Each customer gets their build directly from you, and there's a small number of customers,
You configure a build specifically for each customer, and
You don't want your customers to be able to modify their feature access.
In that scenario, I'd store the "active" feature list in a hashed property value stored in a .properties file bound into the .jar. I'll describe one way to do that below. You generate the properties file just before delivery, add the file to the jar:
jar -uf applicationJarFile.jar configuration.properties
then sign the .jar and deliver it. At runtime, your app can load the properties file, run the hash of each feature, compare with the properties you've stored, and determine which ones are off or on.
Your properties, which determine which features are enabled, might consist of a list like this:
feature1=enabled
feature2=disabled
feature3=disabled
feature4=enabled
Write yourself a utility which hashes the whole string "feature1=enabled" plus a salt value, e.g. "feature1=enabledaKn087*h5^jbAS5yt". (There's code for this built into java; see How can I generate an MD5 hash?, for example.) The result will be an opaque 16-byte number, which you can then store in another properties file to be included in your app: feature1=1865834.... The salt value should be broken into multiple shorter strings in your code so your customer can't just retrieve it and easily duplicate the process themselves.
In your app, at startup, you construct the string above using both the "enabled" and the "disabled" value, run the MD5 of both, and compare it with the stored hash. That'll tell you what features to enable.
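The startup check described above might look roughly like this. FeatureFlags, hash and isEnabled are hypothetical names, the salt is the example value from this answer, and MD5 is used only because the answer names it; a stronger digest such as SHA-256 can be requested through the same MessageDigest API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical feature check: the stored value is a salted hash, not a readable flag.
final class FeatureFlags {
    // Example salt from the text, split into pieces as the answer suggests.
    private static final String SALT = "aKn087*" + "h5^jb" + "AS5yt";

    // Hashes "feature=state" plus the salt and returns the digest as lowercase hex.
    static String hash(String feature, String state) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest((feature + "=" + state + SALT).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Compares the stored hash against both possible states for this feature.
    static boolean isEnabled(String feature, String storedHash) {
        if (hash(feature, "enabled").equals(storedHash)) {
            return true;
        }
        if (hash(feature, "disabled").equals(storedHash)) {
            return false;
        }
        // Neither state matches: the value was edited or belongs to another feature.
        throw new IllegalStateException("Unrecognized hash for " + feature);
    }
}
```

The same hash method can serve as the delivery-time utility that generates the values to put into the properties file.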
I think a separate .jar or .properties is a bad idea; it clutters your delivery.
You can automate the whole process fairly easily, since you can generate the properties on the fly any time, and bind them into your app.
You can add other "baked in" properties which gives you a lot of flexibility in the final deliverable, including things like skinning for customer branding.
As others have pointed out, though: there's lots of ways to approach this, depending on the rest of the details of your product and your overall goals. This is one way to do it, given the assumptions above. AFAIK, there's no "canonical" way to do this sort of thing.

You should consider using a license management API for this, which will give you both security and the ability to change licenses before or after installation.
It is not advisable to build an ad hoc licensing capability. Take a look at License3j and TrueLicense; they are both free and can help you gain perspective or better fulfil your requirement.

You could try and encode that in a file. I assume each user has an own installation/version of the application, right? I further assume the application should not need to check some web resource. Thus you need to implement that in a file.
However, you should encrypt that file and put the salt and key somewhere in the code where they can't easily be decompiled. Additionally create a hash to check for modifications of the file. That hash could be based on the application's size or something else.
Please note that there's no 100% security and any hacker could still crack your application. But in that case this would need some form of criminal energy not commonly present in the business world.

Modularize the application and deploy to each client only those parts that he wants/has access to. There's many ways to do it (the most complete but heavyweight being OSGi), but the specifics depend on your circumstances and requirements.
The quickest way to implement it might be to simply extract your extra functionality in separate JARs, and on deployment update the classpath appropriately.

It depends on the kind of application, the kind of security you want, and the number of people likely to use the application.
If the number of clients is not that big, you can store their preferences in an in-memory data structure like a Map. Otherwise you can use the file system or a DB, depending on the kind of security you want.

This is very open ended - it really depends on what you're trying to achieve, and what you mean by a feature.
One approach is to use a plugin based architecture. e.g. you have an interface
public interface Feature {}
and provide each of your ten features as implementors of this interface. Then have some method which runs at application start and looks for Feature implementations on the classpath.
You can control which features a client has by including only the relevant features on the classpath, e.g. using maven.
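One simple way to sketch the startup lookup is to try loading each known feature class by name and skip the ones that were not shipped. Everything here beyond the Feature interface is an assumption for illustration: the name() method, ReportsFeature and FeatureLoader are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

// The interface from the text, with a hypothetical name() method added for the demo.
interface Feature {
    String name();
}

// Hypothetical feature that happens to be on the classpath.
final class ReportsFeature implements Feature {
    public String name() {
        return "reports";
    }
}

final class FeatureLoader {

    // Instantiates every listed class that is present and implements Feature.
    static List<Feature> load(List<String> classNames) {
        List<Feature> found = new ArrayList<>();
        for (String className : classNames) {
            try {
                Class<?> type = Class.forName(className);
                if (Feature.class.isAssignableFrom(type)) {
                    found.add((Feature) type.getDeclaredConstructor().newInstance());
                }
            } catch (ClassNotFoundException e) {
                // Not on the classpath: this client was not given the feature.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + className, e);
            }
        }
        return found;
    }
}
```

The JDK's java.util.ServiceLoader offers a more standard form of the same idea, where each JAR declares its implementations in a META-INF/services file.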

Application configuration files [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Closed 6 years ago.
OK, so I don't want to start a holy war here, but we're in the process of trying to consolidate the way we handle our application configuration files and we're struggling to make a decision on the best approach to take. At the moment, every application we distribute is using its own ad hoc configuration files, whether it's property files (ini style), XML or JSON (internal use only at the moment!).
Most of our code is Java at the moment, so we've been looking at Apache Commons Config, but we've found it to be quite verbose. We've also looked at XMLBeans, but it seems like a lot of faffing around. I also feel as though I'm being pushed towards XML as a format, but my clients and colleagues are apprehensive about trying something else. I can understand it from the client's perspective, everybody's heard of XML, but at the end of the day, shouldn't we be using the right tool for the job?
What formats and libraries are people using in production systems these days, is anyone else trying to avoid the angle bracket tax?
Edit: really needs to be a cross platform solution: Linux, Windows, Solaris etc. and the choice of library used to interface with configuration files is just as important as the choice of format.

YAML, for the simple reason that it makes for very readable configuration files compared to XML.
XML:
<user id="babooey" on="cpu1">
    <firstname>Bob</firstname>
    <lastname>Abooey</lastname>
    <department>adv</department>
    <cell>555-1212</cell>
    <address password="xxxx">ahunter@example1.com</address>
    <address password="xxxx">babooey@example2.com</address>
</user>
YAML:
babooey:
    computer : cpu1
    firstname: Bob
    lastname: Abooey
    cell: 555-1212
    addresses:
        - address: babooey@example1.com
          password: xxxx
        - address: babooey@example2.com
          password: xxxx
The examples were taken from this page: http://www.kuro5hin.org/story/2004/10/29/14225/062

First: This is a really big debate issue, not a quick Q+A.
My favourite right now is to simply include Lua, because
I can permit things like width=height*(1+1/3)
I can make custom functions available
I can forbid anything else. (impossible in, for instance, Python (including pickles.))
I'll probably want a scripting language somewhere else in the project anyway.
Another option, if there's a lot of data, is to use sqlite3, because they're right to claim
Small.
Fast.
Reliable.
Choose any three.
To which I would like to add:
backups are a snap. (just copy the db file.)
easier to switch to another db, ODBC, whatever. (than it is from fugly-file)
But again, this is a bigger issue. A "big" answer to this probably involves some kind of feature matrix or list of situations like:
Amount of data, or short runtime
For large amounts of data, you might want efficient storage, like a db.
For short runs (often), you might want something that you don't need to do a lot of parsing for, consider something that can be mmap:ed in directly.
What does the configuration relate to?
Host:
I like YAML in /etc. Is there an equivalent on Windows?
User:
Do you permit users to edit config with text editor?
Should it be centrally manageable? Registry / gconf / remote db?
May the user have several different profiles?
Project:
File(s) in project directory? (Version control usually follows this model...)
Complexity
Are there only a few flat values? Consider YAML.
Is the data nested, or dependent in some way? (This is where it gets interesting.)
Might it be a desirable feature to permit some form of scripting?
Templates can be viewed as a kind of configuration files..

XML XML XML XML. We're talking config files here. There is no "angle bracket tax" if you're not serializing objects in a performance-intense situation.
Config files must be human readable and human understandable, in addition to machine readable. XML is a good compromise between the two.
If your shop has people that are afraid of that new-fangled XML technology, I feel bad for you.

Without starting a new holy war, the sentiment of the 'angle bracket tax' post is one area where I majorly disagree with Jeff. There's nothing wrong with XML; it's reasonably human readable (as much as YAML or JSON or INI files are), but remember its intent is to be read by machines. Most language/framework combos come with an XML parser of some sort for free, which makes XML a pretty good choice.
Also, if you're using a good IDE like Visual Studio, and if the XML comes with a schema, you can give the schema to VS and magically you get intellisense (you can get one for NHibernate for example).
Ultimately you need to think about how often you're going to be touching these files once in production: probably not that often.
This still says it all for me about XML and why it's still a valid choice for config files (from Tim Bray):
"If you want to provide general-purpose data that the receiver might want to do unforeseen weird and crazy things with, or if you want to be really paranoid and picky about i18n, or if what you’re sending is more like a document than a struct, or if the order of the data matters, or if the data is potentially long-lived (as in, more than seconds) XML is the way to go.
It also seems to me that the combination of XML and XPath hits a sweet spot for data formats that need to be extensible; that is to say, it’s pretty easy to write XML-processing code that won’t fail in the presence of changes to the message format that don’t touch the piece you care about."

@Guy
But application config isn't always just key/value pairs. Look at something like the tomcat configuration for what ports it listens on. Here's an example:
<Connector port="80" maxHttpHeaderSize="8192"
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="100"
connectionTimeout="20000" disableUploadTimeout="true" />
<Connector port="8009"
enableLookups="false" redirectPort="8443" protocol="AJP/1.3" />
You can have any number of connectors. Define more in the file and more connectors exist. Don't define any more and no more exist. There's no good way (imho) to do that with plain old key/value pairs.
If your app's config is simple, then something simple like an INI file that's read into a dictionary is probably fine. But for something more complex like server configuration, an INI file would be a huge pain to maintain, and something more structural like XML or YAML would be better. It all depends on the problem set.

We are using ini style config files. We use the Nini library to manage them, which makes them very easy to work with. Nini was originally for .NET but it has been ported to other platforms using Mono.

XML, JSON, INI.
They all have their strengths and weaknesses.
In an application context, I feel that the abstraction layer is the important thing.
If you can choose a way to structure the data that is a good middle ground between human readability and how you want to access/abstract the data in code, you're golden.
We mostly use XML where I work, and I can't really believe that a configuration file loaded into a cache as objects when first read or after it has been written to, and then abstracted away from the rest of the program, is really that much of a hit on either CPU or disk space.
And it is pretty readable too, as long as you structure the file right.
And all languages on all platforms support XML through some pretty common libraries.

@Herms
What I really meant was to stick to the recommended way software should store configuration values for any given platform.
What you often get then is also the recommended way these should/can be modified, like a configuration menu in a program or a configuration panel in a "system prefs" application (for system services software, for example), rather than letting end users modify them directly via RegEdit or Notepad...
Why?
The end users (=customers) are used to their platforms
System for backups can better save "safe setups" etc
@ninesided
About " choice of library ", try to link in (static link) any selected library to lower the risk of getting into a version-conflict-war on end users machines.

If your configuration file is write-once, read-only-at-bootup, and your data is a bunch of name value pairs, your best choice is the one your developer can get working first.
If your data is a bit more complicated, with nesting etc, you are probably better off with YAML, XML, or SQLite.
If you need nested data and/or the ability to query the configuration data after bootup, use XML or SQLite. Both have pretty good query languages (XPATH and SQL) for structured/nested data.
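As an illustration of querying nested configuration with XPath, here is a sketch using the javax.xml.xpath API that ships with the JDK. XmlConfigQuery and its values method are hypothetical names, and the XML used with it is assumed to be small enough to parse on each call.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: evaluates an XPath expression against XML config text.
final class XmlConfigQuery {

    // Returns the text of every node (element or attribute) matched by the expression.
    static List<String> values(String xml, String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Bad expression or XML: " + expression, e);
        }
    }
}
```

For example, given a document with several Connector elements under a Server root, the expression /Server/Connector/@port selects every configured port, and a predicate such as [@protocol='AJP/1.3'] narrows that to a single connector.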
If your configuration data is highly normalized (e.g. 5th normal form) you are better off with SQLite because SQL is better for dealing with highly normalized data.
If you are planning to write to the configuration data set during program operation, then you are better off going with SQLite. For example, if you are downloading configuration data from another computer, or if you are basing future program execution decisions on data collected in previous program execution. SQLite implements a very robust data storage engine that is extremely difficult to corrupt when you have power outages or programs that are hung in an inconsistent state due to errors. Corruptible data leads to high field support costs, and SQLite will do much better than any home-grown solution or even popular libraries around XML or YAML.
Check out my page for more information on SQLite.

As far as I know, the Windows registry is no longer the preferred way of storing configuration if you are using .NET - most applications now make use of System.Configuration [1, 2]. Since this is also XML based, it seems to me that everything is moving in the direction of using XML for configuration.
If you want to stay cross-platform I would say that using some sort of a text file would be the best route to go. As for the formatting of said file, you might want to take into account if a human is going to be manipulating it or not. XML seems to be a bit more friendly to manual manipulation than INI files due to the visible structure of the file.
As for the angle bracket tax - I don't worry about it too often as the XML libraries take care of abstracting it. The only time it might be a consideration is if you have very little storage space to work with and every byte counts.
[1] System.Configuration Namespace - http://msdn.microsoft.com/en-us/library/system.configuration.aspx
[2] Using Application Configuration Files in .NET - http://www.developer.com/net/net/article.php/3396111

We are using properties files, simply because Java supports them natively. A couple of months ago I saw that SpringSource Application Platform uses JSON to configure their server and it looks very interesting. I compared various configuration notations and came to the conclusion that XML seems to be the best fit at the moment. It has nice tools support and is rather platform independent.

Re: epatel's comment
I think the original question was asking about application configuration that an admin would be doing, not just storing user preferences. The suggestions you gave seem more for user prefs than application config, and aren't usually something that the user would ever deal with directly (the app should provide the configuration options in the UI, and then update the files). I really hope you'd never make the user have to view/edit the Registry. :)
As for the actual question, I'd say XML is probably OK, as plenty of people will be used to using that for configuration. As long as you organize the configuration values in an easy to use manner then the "angle bracket tax" shouldn't be too bad.

Maybe a bit of a tangent here but my opinion is that the config file should be read into a key value dictionary/hash table when the app first starts up and always accessed via this object from then on for speed. Typically the key/value table starts off as string to string but helper functions in the object do things such DateTime GetConfigDate(string key) etc...
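In Java, that dictionary-with-helpers idea might be sketched like this. AppConfig and its getters are hypothetical names, and dates are assumed to be stored in ISO form (yyyy-MM-dd); how the map gets filled (properties, XML, JSON) is left out on purpose.

```java
import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

// Hypothetical config object: string-to-string storage with typed accessors on top.
final class AppConfig {
    private final Map<String, String> values;

    // Copies the map once at startup so later lookups never touch the file again.
    AppConfig(Map<String, String> values) {
        this.values = new HashMap<>(values);
    }

    // Returns the raw string, failing loudly when a key is missing.
    String getString(String key) {
        String value = values.get(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing config key: " + key);
        }
        return value;
    }

    // Parses the value as an integer.
    int getInt(String key) {
        return Integer.parseInt(getString(key).trim());
    }

    // Parses the value as an ISO date; the format is an assumption of this sketch.
    LocalDate getDate(String key) {
        return LocalDate.parse(getString(key).trim());
    }
}
```

Keeping the parsing in one place means the rest of the program never deals with raw strings or with the file format that produced them.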

I think the only important thing is to choose a format that you prefer and can navigate quickly. XML and JSON are both fine formats for configs and are widely supported--technical implementation isn't at the crux of the issue, methinks. It's 100% about what makes the task of config files easier for you.
I have started using JSON, because I work quite a bit with it as a data transport format, and the serializers make it easy to load into any development framework. I find JSON easier to read than XML, which makes handling multiple services, each using a config file that is modified quite frequently, that much easier for me!

What platform are you working on? I'd recommend trying to use the preferred/common method for it.
MacOSX - plists
Win32 - Registry (or is there a newer one now? It's been a long time since I developed on it)
Linux/Unix - ~/.apprc (name-value perhaps)
