Some kind of Data persistency - java

Basically what I need to know is this:
I have to show a drop-down list of countries to my users. Each country also has a code associated with it, and I will have to work with both the country and the code. What would be the best approach?
- We (the devs) are thinking about a table in our app database with this data, or an XML file.
- Our "architect" says that is old school and that we should use constants in our app, with a map that associates the country with the code.
Please help me feel smart.

I agree with you that you should not hard code this or use constants. There are a few good options depending on your needs:
Java Properties Files - If you just have a few key-value pairs to store, this is the simplest option and easy to use (a minimal sketch follows these options).
XML Storage - If you are looking for persistence and are looking at XML for storage, I would recommend looking at JAXB. It is part of Java 6 and will make your life easier than trying to use the DOM.
Database Persistence - If you have more data that changes often, you could also look at storing it in a database. JPA is a great standard API for doing this, but it is probably overkill for what you are looking for.
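For illustration, a minimal sketch of the Properties option, assuming a countries.properties file on the classpath (the file name and keys here are invented for the example):

import java.io.InputStream;
import java.util.Properties;

public class CountryCatalog {

    private final Properties countries = new Properties();

    public CountryCatalog() throws java.io.IOException {
        // countries.properties sits on the classpath, one line per country, e.g.  DE=Germany
        try (InputStream in = CountryCatalog.class.getResourceAsStream("/countries.properties")) {
            countries.load(in);
        }
    }

    public String nameFor(String code) {
        return countries.getProperty(code);   // e.g. nameFor("DE") -> "Germany"
    }
}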
Bottom line: hard coding is a thing of the past. There are lots of great ways to get data in quickly and easily without resorting to hard coding everything.

Countries rarely change, so adding them statically as code or a config file seems reasonable. If you don't use a database for anything else, don't add one just for this feature.
If you already have XML parsing in your app, use an XML file to define the data. It already solves all kinds of issues (for example if you need to add a second attribute per country or something).
If you don't use XML for anything else, I suggest giving it a try. It doesn't add much to your app. Otherwise, use a plain text file, maybe a CSV one.
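If you go the plain-text route, here is a hedged sketch of loading a two-column CSV into a map; the file name and the "code,name" layout are assumptions for the example:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public final class CountryCsvLoader {

    // Reads lines like "DE,Germany" from countries.csv on the classpath.
    public static Map<String, String> load() throws java.io.IOException {
        Map<String, String> byCode = new LinkedHashMap<>();   // keeps file order for the drop-down
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                CountryCsvLoader.class.getResourceAsStream("/countries.csv"), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    continue;
                }
                String[] parts = line.split(",", 2);
                byCode.put(parts[0].trim(), parts[1].trim());
            }
        }
        return byCode;
    }
}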

The different methods have different advantages and drawbacks:
Database:
allows you to use the country data in queries
data can be changed without redeploying the app
editing the data requires you to write some sort of frontend or do it manually via some generic SQL browser
requires database access code, and some sort of caching strategy
Any country-based logic in the code can break when the DB changes, or has to be reflected in the DB
XML:
Very easy to edit
can be changed without recompiling the app, but changes have to be deployed somehow
Requires parsing code and some sort of caching strategy
Any country-based logic in the code can break when the XML changes, or has to be reflected in the XML
Code:
Easy to edit - for developers
Changes require compilation and deployment
Requires no extra technical layers
Code and country data can't get out of synch
All in all, the "code as data" solution is indeed the nicest, if the compile&deploy step for each change is acceptable to you. The other solutions create overhead and duplication of structure (or even logic) - and no, they don't magically make it "safe" to do last-minute changes "because it's not code". Code is data, and data is code.

In short, your architect is wrong (or at least he is, if your paraphrase of his position is accurate). It shouldn't be in the code.
This data is not static; country names change, new countries are founded, and some cease to exist.
As far as the mechanism goes, it doesn't necessarily matter. Just make sure you can retrieve the data easily, that you have unit tests, and that there is a straightforward mechanism for updating the data.

I think that "table solution" has more flexible approach:
1. You can manage data and connecting properties
2. You can work with table directly
3. You can create associated map, based on db table))
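A hedged sketch of point 3; the COUNTRY table and its columns are assumptions for the example. Load the table once with plain JDBC and expose it as a map.

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.LinkedHashMap;
import java.util.Map;

public final class CountryDao {

    // Builds a code -> name map from a hypothetical COUNTRY(code, name) table.
    public static Map<String, String> loadCountries(Connection connection) throws SQLException {
        Map<String, String> countries = new LinkedHashMap<>();
        try (Statement stmt = connection.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT code, name FROM country ORDER BY name")) {
            while (rs.next()) {
                countries.put(rs.getString("code"), rs.getString("name"));
            }
        }
        return countries;
    }
}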

I would certainly not use them as constants in the code.
Names can change, while countries can be created, merge, disappear, etc.
If you are already using a database, adding this may make sense. For one, it ensures that the codes that may be stored with client data are valid in terms of your country code list. So if a country disappears but a client record still refers to it, the data stays valid.
Make sure that your UI loads and caches the list; no point making a query every time if you can avoid it.
By the way, correctly handling countries in an internationalized application is much more complicated than just dealing with renames. For example, if a country or part of a country declares independence, some countries will recognize it, while others do not.


Getting database value to display it on JSP [duplicate]

We have a rather large application, with a great deal of dynamic content. Is there any way to force Struts to use a database for the i18n lookups, instead of properties files?
I'd be open for other ways to solve this as well, if anyone has ever done i18n with dynamic content.
I don't know of an easy plug-and-play solution for this, so you will probably have to implement it yourself -- plan on spending quite a bit of time just coming to grips with how the localization features of struts 2 (and XWork) are implemented. The key will probably be to provide your own implementation of com.opensymphony.xwork2.TextProvider (and tell struts to use it by providing a <bean> tag in struts.xml). I can think of at least two ways of fitting this into the overall architecture:
Have your TextProvider implementation access the database directly. In the spirit of YAGNI, this is probably the best way to start (you can always refactor later, if necessary).
Alternatively, you could place the database code into a subclass of Java's ResourceBundle class, which is what XWork uses internally. To me this sounds like an even more design-heavy approach, but on the plus side there are some articles around describing how to do this.
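As a rough illustration of that second option (the table and column names are invented for this sketch), a ResourceBundle subclass can read its key/value pairs from a table and be handed to the framework in place of a properties-backed bundle:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.Collections;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.ResourceBundle;
import javax.sql.DataSource;

public class DatabaseResourceBundle extends ResourceBundle {

    private final Map<String, String> messages = new HashMap<>();

    // Loads all messages for one locale from a hypothetical MESSAGES(msg_key, msg_text, locale) table.
    public DatabaseResourceBundle(DataSource dataSource, Locale locale) throws Exception {
        try (Connection con = dataSource.getConnection();
             PreparedStatement ps = con.prepareStatement(
                     "SELECT msg_key, msg_text FROM messages WHERE locale = ?")) {
            ps.setString(1, locale.toString());
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    messages.put(rs.getString("msg_key"), rs.getString("msg_text"));
                }
            }
        }
    }

    @Override
    protected Object handleGetObject(String key) {
        return messages.get(key);
    }

    @Override
    public Enumeration<String> getKeys() {
        return Collections.enumeration(messages.keySet());
    }
}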
No, there is no built-in way to have Struts2 load localized content from a database. You would need to write that yourself.
What are your requirements? Do you need for users to be able to dynamically change field prompts, error messages, etc.?
You may be able to do something like that by building a custom interceptor. You could have the interceptor read all the key-value pairs from your database and inject them into the value stack. The only thing I am not sure about, not having really worked with i18n in Struts before, is whether the i18n machinery pulls that information from the value stack. If not, you might be able to do something else in the interceptor to load up the information.
Building a custom interceptor is not too terribly complicated. There are plenty of tutorial sites out there, including (brace for self promotion here) my blog: http://ddubbya.blogspot.com/2011/01/creating-custom-struts2-interceptors.html.
Use properties files just for static content, like labels, messages etc.
For dynamic content start with a database table that includes a language-code-id for every language you want to use. All the dynamic content entries that are already translated go with their respective language-code-id added to their primary key. If a translation is missing, you can program your application to fall back to your default language in order to make things easier until the right translation is present.
Let your users provide their contributions in the language they like and store it with the appropriate language-id. Someone should provide the translation to the other languages in order to make the contribution complete.
...
PRIMARY KEY (`subject_id`,`language_id`),
...
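A hedged Java sketch of that fallback idea (the subject_translation table and its columns are invented for the example): try the requested language first and fall back to the default when no translation exists.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public final class TranslationDao {

    private static final String DEFAULT_LANGUAGE = "en";

    // Looks up subject_translation(subject_id, language_id, text); falls back to the default language.
    public static String textFor(Connection con, long subjectId, String languageId) throws SQLException {
        String text = find(con, subjectId, languageId);
        return text != null ? text : find(con, subjectId, DEFAULT_LANGUAGE);
    }

    private static String find(Connection con, long subjectId, String languageId) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                "SELECT text FROM subject_translation WHERE subject_id = ? AND language_id = ?")) {
            ps.setLong(1, subjectId);
            ps.setString(2, languageId);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}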

Permanently saving the value of a string

I am currently making a flash card program, and I want the user to be able to put in their own questions and answers, then test themselves. My only problem is that I want the values they enter to persist until they change them. How do I do this? (P.S.: if you need the code, I can give it.)
I assume that you are new to programming and have not yet worked with persistence in any form. In that case, for your simple example, the Java Properties class might be a good entry point into the field of file persistence.
In general, there are plenty of ways to persist data: databases, files, web storage, etc... It depends on your application and what you want to do with the data. For an example of the Java Properties file see for example this tutorial: http://www.mkyong.com/java/java-properties-file-examples/
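A minimal sketch of that entry point for the flash card case (the file name is arbitrary): each question becomes a key, each answer a value, and the file survives program restarts.

import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.Properties;

public class FlashCardStore {

    private static final String FILE = "flashcards.properties";
    private final Properties cards = new Properties();

    public void load() throws IOException {
        try (FileInputStream in = new FileInputStream(FILE)) {
            cards.load(in);
        }
    }

    public void save() throws IOException {
        try (FileOutputStream out = new FileOutputStream(FILE)) {
            cards.store(out, "question=answer pairs");
        }
    }

    public void put(String question, String answer) {
        cards.setProperty(question, answer);
    }

    public String answerFor(String question) {
        return cards.getProperty(question);
    }
}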
Well, without seeing any of the code it is hard to tell you exactly how you should approach this, but in general you will need some method of persistence to save this type of information, i.e. a database, flat file, etc. Try looking into SQLite or even XML for storage and retrieval.

How can I compare 2 large objects running on separate jvm's?

I am looking at changing the way some large objects which maintain the data for a large website are reloaded; they contain data relating to catalogue structure, products etc. and get reloaded daily.
After changing how they are reloaded, I need to be able to see whether there is any difference in the resulting data, so the intention is to reload both and compare the content.
There may be some issues (i.e. lists used where ordering is not important) that make the comparison harder, so I would need to be able to alter the structure before comparison. I have tried to serialise to JSON using Gson but I run out of memory. I'm thinking of trying other serialisation methods or writing my own simple one.
I imagine this is something that other people will have wanted to do when changing critical things like this, but I haven't managed to find anything about it.
In this special case (separate VMs) I suggest adding something like a dump method to each class which writes the relevant content into a file (human readable text). This method calls dump on each aggregated object as well.
In the end you have two files, one from each VM, and then you can compare them using an MD5 checksum, for example.
This is probably a lot of work, but if you encounter any differences, you can use diff on both files, and this will be a great help.
You can start with a simple version, and refine it step-by-step by adding more output.
Adding (complete) serialization to a class later is cumbersome. There might be tools which simplify this (using reflection etc.), but in my experience you have to tweak your classes: exclude fields which are not relevant, define a sort order for lists, handle cyclic relations, etc.
Actually I use a similar approach for the same reasons (to check whether a new version still returns the same result): The application contains multiple services (for each version), the results are always data transfer objects, serialization is added immediately to the DTOs, and DTOs must provide a comparison method dedicated for this purpose.
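A rough sketch of the dump-and-checksum idea (class and field names are placeholders): each class appends its relevant state as text lines, and the resulting files are fingerprinted with MD5 before you bother running a full diff.

import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;

public class CatalogueDumper {

    // Writes a human-readable dump; nested objects would append their own lines here.
    static void dump(Path target, Iterable<String> productLines) throws Exception {
        Files.write(target, productLines);
    }

    // MD5 fingerprint of a dump file; equal hashes mean the dumps are byte-identical.
    static String md5Of(Path file) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5").digest(Files.readAllBytes(file));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(md5Of(Paths.get(args[0])).equals(md5Of(Paths.get(args[1])))
                ? "dumps match" : "dumps differ - run diff on the two files");
    }
}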
Looking at the complications and memory issues, and since you have mentioned you don't want to maintain versions, I would look at using a database for the comparison.
It will need some effort in terms of mapping your data in the JVM to DB tables, but once you have done that it will be straightforward. You can dump the data from one large object into DB tables and then simply run a check against the 2nd object in the DB.
Creating a stored proc can simplify things. This solution can support data checks from any number of JVMs.

Continue with Object serialization or use database?

I have written a math game in Java and have distributed some copies to a few beta-testers. The problem is that the version I have given them saves the GameData via object serialization, which I found out is mainly for sending objects, or in this case ArrayLists of GameData, over a network. It is NOT persistence; that is what a relational database is for. Knowing this, I would like to know whether it would be better to create a database on the beta-testers' machines (and rewrite the game), or to continue with the object serialization version of the game and then retrieve the objects when they are ready to send the data?
My guess would be to just move their data to a database that is created on their computer, and then give them the database version of the game. That way, the data can be persisted and be much easier to manipulate. What turns me away from that idea is the question of how am I going to write their database into mine (in the future)?
Although relatively rare, there are still lots of applications that use serialization for storage and retrieval of objects. It's not wrong to do this, just slightly unusual. If it's working for you, stick with it because DB's are a heavyweight solution. What you found out, about serialization, is only an opinion and an ill-formed one at that.
In terms of using an embedded database, two options to consider are SQLite and HyperSQL. However, serialization is also an option, and in my opinion it should be your default option if you've already implemented it. Some considerations:
With serialization you've generally got to retrieve the entire object, which is slow if you've got an object with several dozen fields and you only want to read one of them. If you're making queries like these, then use a database. I suspect that you're just reading in all of your serialized objects at startup and serializing them back out to disk at shutdown, in which case there's no reason to use a database instead of serialization.
Java's default serialization mechanism is fairly slow. You may want to consider another serialization mechanism, such as Kryo or Jackson, but only if you're not happy with your program's serialization performance.
It is difficult to advise on the best choice of technology without knowing what you are persisting and why.
If the state is simply a snapshot of your game state (i.e. a save file) or a "best scores" table, then you don't need a database. Serializing using JSON, XML or ... Java Object serialization is sufficient.
If the state needs to be read or updated incrementally or shared with other applications ... or users on other machines ... then a database is more appropriate.
Serialization mechanisms are problematic if the requirements include incremental changes, etcetera. You end up building a database-like layer over the top of the serialization.
As to whether you should stick with Java serialization ... or switch to JSON or XML or something like that:
Object serialization is simple, but it can be fragile if you change the classes that you are serializing. This fragility can be mitigated, but it is messy and you lose the simplicity. (You need to write custom readObject and writeObject methods that know how to read "old versions" of the serialized objects.)
JSON and XML are a bit more complicated, but still relatively simple if you use an object binding mechanism.
It is worth noting that changes to the persisted object classes (or the database schemas) are potentially problematic no matter what you do. There is no easy universal solution to this problem.
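To make the fragility point concrete, here is a hedged sketch (the class and its fields are hypothetical) of the kind of boilerplate you end up writing so that old serialized files stay readable after a field is added:

import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.Serializable;

public class GameData implements Serializable {

    // Fixing the UID keeps old files loadable after compatible class changes.
    private static final long serialVersionUID = 1L;

    private String question;
    private String answer;
    private Integer timesAnswered;   // field added in a later version

    private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // Files written by the old class version carry no value for the new field.
        if (timesAnswered == null) {
            timesAnswered = 0;
        }
    }
}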
UPDATE
Given the additional information that you provided in your first comment (below), it seems like you don't need a database in the game itself. All you need is something that can read and analyse the session state save files that your beta testers provide for you. Indeed, it doesn't even seem like the actual app needs to be able read the files. (But that's unclear, because you've not said what the real purpose of these files is ... or at least, not what the entire purpose is.)
It is also worth noting that you are probably saving the wrong information if your aim is to tune the sets of questions. What you really need to do is record whether the user got the right or wrong answer and the time taken ... for each individual question. And you probably need to know what the actual answer given was ... so that you can spot cases where the user's answer was actually right and you "marked" it as wrong ... or vice versa.
"What turns me away from that idea is the question of how am I going to write their database into mine (in the future)?"
Exactly. If you hadn't prematurely "analysed" the data, you wouldn't have this problem.
But ignoring that, it seems like that a simple state saving mechanism is sufficient to meet your (still hypothetical / inferred) requirement of keeping a personal score board for the end user. Your "tuning" stuff would be better implemented using a custom log file. I cannot see any value in incorporating a database as part of the app itself.
I presume you are using Java serialisation; if so, there is nothing wrong with it. Just be aware of its limitations - different versions of Java might not be able to retrieve the file.
Also, if you change the class, previously saved data cannot be retrieved.
If you decide to change, you could look at XML, JSON, Protocol Buffers, Thrift, Avro etc. as well as a DB.
Note:
XML support is built into Java
Java DB (Derby) is also included with the JDK
Other serialisation schemes require a separate library.

Is there a dbunit-like framework that doesn't suck for java/scala?

I was thinking of making a new, light-weight database population framework. I absolutely hate dbunit. Before I do, I want to know if someone already did it.
Things i dislike about dbunit:
1) The simplest format to write and get started is deprecated. They want you to use formats that are bloated. Some even require xml schemas. Yeah, whatever.
2) They populate rows not in the order you write them, but in the order tables are defined in the xml file. This is really bad because you can't order your data in such a way that foreign key constraints won't cause problems. This just forces you to go through the hassle of turning them off altogether.
This also wastes time and bloats up your junit base classes to include code to disable the foreign key constraints. You will probably have to test for the database type (hsqldb, etc.) and disable them in database-specific ways. This is way bad.
It could be better if dbunit helped in disabling foreign key constraints as part of their framework automatically, but they don't do this. They do keep track of dialects... so why not use them for this? Ultimately, all of this does is force the programmer to waste time and not get up and testing quickly.
3) XML is a pain to write. I don't need to say more about this. They also offer so many ways to do it, that I think it just complicates matters. Just offer one really solid way and be done with it.
4) When your data gets large, keeping track of the ids and their consistent/correct relationships is a royal pain.
Also, if you don't work on a project for a month, how are you to remember that user_id 1 was an admin, user_id 2 was a business user, user_id 3 was an engineer and user_id 4 was something else? Going back to check this is wasting more time. There should be a meaningful way to retrieve it other than an arbitrary number.
5) It's slow. I've found that unless hsqldb is used, it is painfully slow. It doesn't have to be. There are also numerous ways to mess up its configuration, as it is not easy to get right "out of the box". There is a hump that you must get over to make it work properly. All this does is encourage people not to use it, or to be pissed off when they do start to use it.
6) Some values tend to repeat a lot, like dates. It'd be nice to specify defaults, or even have the framework put defaults in automatically, without you telling it to. That way you can create objects with just the values you want and leave the rest off. This sure beats specifying every nook and cranny of a column if it's not required.
7) Probably the most annoying thing is that the first entry must include ALL the values - even null placeholders - or future rows won't pick the columns that you actually specified.
DBunit doesn't have a sensible default for translating [NULL] to a real null value either. You have to manually add it. Tell me, who hasn't done this with dbunit? Everyone has. It shouldn't be like this!
What this means is that if you have a polymorphic object, you must declare all the foreign keys to the joining tables of each subclass in the first row, even though they are null. If you do a table for all subclasses pattern, you still have to specify all the fields on the first row. This is just awful.
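For reference, this is the kind of manual plumbing the [NULL] complaint refers to; a typical snippet (not the only way to write it) that wraps the data set so the marker becomes a real null:

import java.io.File;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.ReplacementDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;

public final class DataSets {

    // Everyone ends up writing something like this: map the "[NULL]" marker to a real null.
    public static IDataSet load(File flatXml) throws Exception {
        IDataSet raw = new FlatXmlDataSetBuilder().build(flatXml);
        ReplacementDataSet dataSet = new ReplacementDataSet(raw);
        dataSet.addReplacementObject("[NULL]", null);
        return dataSet;
    }
}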
Anything out there to satisfy me, or should I become the next framework developer of a much better database testing framework?
I'm not aware of any real alternative to DbUnit, and none of the tools mentioned by @Joe is one in my eyes:
Incanto: not DB agnostic
SQLUnit: a regression and unit testing harness for testing database stored procedures (that's not what DbUnit is about)
Cactus: a tool for In-container testing (I fail to see where it helps with databases)
Liquibase: a database migration tool (doesn't load/verify data)
ORMUnit: can initialize a database but that's all
JMock: doesn't compete with DbUnit at all
That being said, I've personally used DbUnit successfully several times, on small and huge projects, and I find it pretty usable, especially when using Unitils and its DbUnit module. This doesn't mean it's perfect and can't be improved but with decent tooling (either custom made or something like Unitils), using it has been a decent experience.
So let me answer some of your points:
The simplest format to write and get started is deprecated. They want you to use formats that are bloated. Some even require xml schemas. Yeah, whatever.
DbUnit supports flat or structured XML, XLS, CSV. What revolutionary format would you like to use? By the way, a DTD or schema is not mandatory when using XML. But it gives you nice things like validation and auto-completion, how is that bad? And Unitils can generate it easily for you, see Generate an XSD or DTD of the database structure.
It could be better if dbunit helped in disabling foreign key constraints as part of their framework automatically, but they don't do this. They do keep track of dialects... so why not use them for this? Ultimately, all of this does is force the programmer to waste time and not get up and testing quickly.
They are waiting for your patch.
Meanwhile, Unitils provides support to handle constraints transparently, see Disabling constraints and updating sequences.
XML is a pain to write. I don't need to say more about this. They also offer so many ways to do it, that I think it just complicates matters. Just offer one really solid way and be done with it.
I guess pain is subjective but I don't find it painful, especially when using a schema and autocompletion. What is the silver bullet you're suggesting?
When your data gets large, keeping track of the ids and their consistent/correct relationships is a royal pain.
Keep them small, that's a known best practice. You're going against a known best practice and then complaining about it...
Also, if you don't work on a project for a month, how are you to remember that user_id 1 was an admin, user_id 2 was a business user, user_id 3 was an engineer and user_id 4 was something else? Going back to check this is wasting more time. There should be a meaningful way to retrieve it other than an arbitrary number.
Yes, task switching is counter productive. But since you're working with low level data, you have to know how they are represented, there is no magic solution unless you use a higher level API of course (but that's not the purpose of DbUnit).
It's slow. I've found that unless hsqldb is used, it is painfully slow. It doesn't have to be. There are also numerous ways to mess up its configuration as it is not easy to do "out of the box". There is a hump that you must go through to get it working right. All this does is encourage people to not use it, or be pissed of when they do start to use it.
That's inherent to databases and JDBC, not DbUnit. Use a fast database like H2 if you want things to be as fast as possible (if you have a better agnostic way to do things, I'd be glad to learn about it).
Probably the most annoying thing is that the first entry must include ALL the values - even null placeholders - or future rows won't pick the columns that you actually specified.
Not when using Unitils as mentioned in presentations like Unitils - Home - JavaPolis 2008 or Unit testing: unitils & dbmaintain.
Anything out there to satisfy me, or should I become the next framework developer of a much better database testing framework?
If you think you can make things better, maybe contribute to existing solutions. If that's not possible and if you think you can create the killer database testing framework, what can I say, do it. But don't forget, ranting is easy; coming up with solutions of your own is less so.
As a DbUnit developer I'm grateful for criticism and I must partially agree with you. We are currently starting the design of the next DbUnit major release and I wish to invite you to participate both in the discussion and development.
I'm not going to answer your points as your question is not really related to DbUnit, but to DbUnit alternatives. Anyway, I just want to highlight your point 7 is completely false: you do not need to specify all the columns on first row any more, the feature is called column sensing. I'm not going to tell you why it's not enabled by default as you are surely smart enough to understand it by yourself.
I'll give scaladbtest a deep examination in the hope we can integrate their ideas.
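For anyone landing here: column sensing is switched on when building the flat XML data set, roughly along these lines (a sketch based on the 2.4.x builder API):

import java.io.File;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;

public final class ColumnSensingExample {

    // With column sensing enabled, later rows may introduce columns the first row left out.
    public static IDataSet load(File flatXml) throws Exception {
        FlatXmlDataSetBuilder builder = new FlatXmlDataSetBuilder();
        builder.setColumnSensing(true);
        return builder.build(flatXml);
    }
}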
Faced with similar concerns using DBUnit, I found this: http://dbsetup.ninja-squad.com/index.html, which may address them. For example, instead of representing test data in separate files, all DB content is contained within the Java class itself.
If you use the Spring Framework (or don’t mind using it at least for testing), then Spring DBUnit is currently the best (maintained) alternative to plain DBUnit that I know and use. Quoting their website:
Spring DBUnit provides integration between the Spring testing
framework and the popular DBUnit project. It allows you to setup and
teardown database tables using simple annotations as well as checking
expected table contents once a test completes.
Spring DBUnit appears to be the ‘somewhat official’ Spring solution for DB unit testing (with DBUnit); at least the author/maintainer of the library, Phil Webb, is working at SpringSource/Pivotal.
I use DBUnit, with a few wrappers to smooth over the rough edges. A nice tool that can either complement or overlap the functionality is Jailer. It can extract subsets of data from a reference database, and store this as either DBUnit compatible XML files, or as "topologically sorted DML files", which respect the foreign key constraints.
I just released a library called JDBDT (Java Database Delta Testing) that
you may use for database setup and validation in software tests.
Have a look at http://jdbdt.org
Best,
Eduardo
You're making excellent points.
I've been working for a lot of web portals over the last years, mostly with PHP, but also some Java now and then.
And like you, I don't get why, after all these years, framework and unit-testing developers don't seem to realize how much storage handling has changed in the last decade.
It's not enough to just send create/insert/truncate statements to some database!
If you're operating at large scale you end up employing all sorts of storage backends, organized in layers to push hot content out fast. Plus on the Database front there's the issue of data partitioning. If you don't have a proper foreign key abstraction provided you will certainly go nuts when your storage setup changes. And while we're at it: fixture ordering by foreign key precedence has many pitfalls and I have yet to see a real solution for that with DBUnit.
Anyway, the point is having just a basic database storage in place for unittesting is not enough for complex storage setups, since they often fail to reproduce problems in the live environment and are a pain in the ass to maintain.
Without wanting to sound like a fanboy: one place where things are okay is ruby on rails.
That has a persistent model concept that people seem to have actually put some thought into. If you're dealing in PHP, Symfony is the place to go. It is limited by the default inclusion of Doctrine, which is also quite DB-centric, but it has clean interfaces and great extensibility and copied the Rails fixture system completely. Professionally I need to stick to homebrew solutions for now, but they work okay.
Another vote for wrapping DBUnit with a modern library to improve usability and conciseness. My choice is database-rider, which makes DBUnit a breeze to use and even supports JUnit 5 as demonstrated in the following example:
@RunWith(JUnitPlatform.class)
@ExtendWith(DBUnitExtension.class)
@DBUnit(cacheConnection = true, cacheTableNames = true)
class TestInstrumentQueryService {

    private ConnectionHolder connHolder = () -> EntityManagerProvider.instance("my-jta-unit").connection();

    @DBRider
    @DataSet("datasets/instrumentIds.yml")
    void testFindInstrumentById() {
        InstrumentQueryService iqs = new InstrumentQueryService(EntityManagerProvider.em());
        Instrument instr = iqs.findInstrumentById(InstrumentIdType.TICKER_BBG, "AAPL");
        assertEquals(100, instr.getId());
    }
}
Notice how this allows leveraging (concise) YAML test data sets seamlessly (YAML not XML, though I'm led to believe that DBUnit actually supports those natively).
Here's a short list of a few tools in this vein (besides DBunit) that I particularly like, or find interesting. At the very least they may offer some inspiration:
Incanto
SQLunit
Cactus
Liquibase
ORMUnit
JMock
Note that none of these are really competitors to DBunit in terms of scope or feature sets. However, there are some interesting ideas there that might be worth taking a look at. Good luck!
We are writing Daleq as a wrapper around DbUnit to address some of the mentioned concerns. It allows populating a DB just within your unit test rather than relying on editing XML files.
I too had similar issues with DBUnit, especially when using it to populate local development data and to export data from a real database. I ran into several cases where it would export a dataset that it couldn't then import.
This inspired me to write a new library for it: https://github.com/jeffskj/phonydata
This uses a groovy DSL to define the datasets which makes for a very compact representation of the data and makes it possible to do cool things like generate random data since it's just groovy code.
The situation with DBUnit is indeed sometimes frustrating. Some of the problems are solved by Marc Philipp's dbunit-datasetbuilder, especially if you combine it with the validator, which is at a very early stage. You can see it in action at SZE.
Disclaimer: All referenced github-resources are maintained by me.
An alternative using Spring configuration and Specs2 testing can be found here
I just released a groovy DSL based framework called pedal-loader available via github. Documentation here.
It allows you to work with JPA entity level abstraction directly. Since it is a groovy script, you can use all of the groovy constructs.
To insert rows into a table backed by a JPA entity called Student, with fields (not database columns, but mapped fields) called id, name and grade, you would do something like this:
allStudents = table(Student, ['id', 'name', 'grade']) {
    row 1, 'Joe', Grade.A
    rowOfInterest = row 2, 'John', Grade.B
}
Grade is an enum in the Student class that is mapped to the database column (perhaps using the JPA 2.1 @Convert annotation). allStudents is a list that will hold the rows, and rowOfInterest is a reference to a particular row. These properties (allStudents and rowOfInterest) become available to your unit test.
