Some background.
I am currently working on a distributed system in Java. The code is mostly hand-rolled and consists of a client-server framework and an application to be served. The application is for running a turn-based system.
The design issues.
The application is supposed to be almost completely configurable. The base of the program is a processor that works through objects in a linked-list style, moving from one object in the chain to the next until the processing is done. I have already written a simple blackjack game with it, nothing fancy, just something to exercise the API. The problem I am having is that I am unsure about the granularity of the objects. I know I do not want to write a programming-language-style architecture (for example, classes to represent add, subtract, and other language constructs), but I want the architecture to be flexible enough that a user can specify a program, within its capabilities, in a file and have this construct instantiate the proper classes, link them, and then run the pseudo-program.
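To make the granularity question concrete, here is a minimal sketch (all class and method names are hypothetical, not part of any existing API) of the kind of processor chain described above:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical building block: one step in the turn-based processing chain.
interface GameStep {
    // Returns the index of the next step to run, or -1 when processing is done.
    int process(GameContext context);
}

// Shared state passed along the chain (hands, scores, whose turn it is, etc.).
class GameContext {
    final List<String> log = new ArrayList<>();
}

// The generic processor: walks the chain until a step signals completion.
class ChainProcessor {
    private final List<GameStep> steps;

    ChainProcessor(List<GameStep> steps) {
        this.steps = steps;
    }

    void run(GameContext context) {
        int index = 0;
        while (index >= 0 && index < steps.size()) {
            index = steps.get(index).process(context);
        }
    }
}
```

A configuration file could then list the concrete step classes for a given game (deal, bet, resolve, and so on), and the construct could instantiate and wire them via reflection, keeping the granularity at the level of game actions rather than language constructs.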
Any suggestions and/or ideas would be helpful.
Thanks in advance,
Zix
This kind of thing is pretty tough, mostly because the requirements are so broad. When we look at successful, real-world platforms that are highly flexible and configurable, it's too easy to think that they started that way, or that you could have designed them from the beginning to be that way.
This is not the case.
The reality is more about picking a subset of problems. In your case, that means not turn-based games in general, or even turn-based card games, but blackjack, followed by some other game (bridge, poker, whatever), and then building upon that.
Let your application (and its features) grow based on actual needs, and not on the set of problems you think you will need to solve. Focus on the problems that are in front of you as you try to build actual games. Over time, you'll figure out which parts to abstract (and which parts not to abstract), and in the meantime you'll actually have something to show for it.
Go the other way, and you'll have an interesting idea, an API no one uses to solve actual problems, and a bunch of code but no games. Don't fool yourself into thinking that no one has tried to build the platform for turn-based games. It's been tried 1,000 times. The reason it doesn't really exist is that building platforms without clear goals (i.e. a set of requirements leading to a finished game) leads you to software without a clear use.
That means they don't get used, games don't get built, you never hear about them, and therefore they aren't successful. Successful, flexible platforms have a long list of successful applications built using them, at many stages throughout their lifetime.
Related
I started a job as an app developer for Android. After a week they decided that I wasn't getting the apps ready fast enough, so they fired me and offered me an internship instead, for free.
I'm wondering how I can get better fast. Should I read more (I'm missing architecture knowledge, I think)? What should I read? Should I just start coding things, or should I do a course of some sort?
I'm reading Head First Design Patterns. Will Java design patterns help me with Android, or are they not applicable there? Basically I'm at home reading Android material. The company said they would take me back when I'm good enough, so I need to get better quickly.
The best way to learn (IMHO) is by looking into already-written projects and dissecting them. Try to understand how the program is put together. Then work through some tutorials and expand on them!
Resources:
Java: http://download.oracle.com/javase/tutorial/java/index.html
Android tutorials: http://p-xr.com
I assume you are the only developer in a very small company. If I'm wrong and there are multiple developers then they should be doing design and architecture and you shouldn't need to worry about that--just code the sections they give you to code.
So the rest of this is based on the assumption that you are the sole developer.
Don't worry about architecture or design or anything--Get a GUI together as fast as you possibly can--it's all about what you can show. Even if it has nothing at all behind it, it's visible progress.
A well-architected solution will save you lots of time in the future, but from the sound of it you don't have the experience yet, and a well-architected job is also front-loaded: a lot more time is spent before you can show anything. It doesn't sound like that will make your employers happy.
I'd actually spend 2 or 3 hours some evening researching the project management and design portions of XP (you may have to do this on your own time at first--but then, if you're getting paid $0, all your time is your own time right now).
XP is all about your process being transparent to your customer (in this case your customer is your employer). This allows them to understand why something might take time and it actually allows them to make decisions on the fly to correct problems or speed the process by eliminating features.
I'd start by writing down each task on a piece of paper. Each task should take a couple of hours to a day or two, so make them pretty fine-grained, but don't write down a time estimate. Instead, when you are done with the cards, go through them and rank them all in difficulty from 1 to 5 (the purpose becomes clear when you see how XP does time estimation).
Then you start interacting with your employer--they choose which tasks they want done for the day/week, and you do them.
As you are executing tasks you may encounter new ones or have to break some up. This is expected--make more sheets of paper. If your employers want to know when it will be ready, look into how to do burndown charts and time prediction. In XP this practice does a good job of accounting for the fact that programmers are bad at estimating time (it uses your history and the 1-5 difficulty rankings to determine how long a "2" actually takes you to do).
Soon after implementing this practice, decide on what metrics they want you to maintain in order to get paid, or see if they want to pay you per task.
On the other hand, they are only treating you like that because you let them. You're fired now; start looking for another job. If you find one, just walk away.
The best way to learn is by doing. All the reading in the world will get you nowhere if you don't practice.
And design patterns may help, but much of the classical wisdom may not directly apply, if only because the devices have limited memory.
Take the JavaBeans approach, where a bean has to have a getter and a setter for each property. On Android (before 2.3, IIRC), those accessor calls had a negative impact on performance, so making fields public and using them directly was preferred.
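A minimal sketch of the difference being described (hypothetical classes; the performance caveat only applies to old Android versions, as noted above):

```java
// Classic JavaBeans style: private field with getter/setter.
class BeanStylePoint {
    private int x;

    public int getX() { return x; }
    public void setX(int x) { this.x = x; }
}

// Style once recommended for hot code paths on old Android devices:
// a public field, accessed directly without method-call overhead.
class FieldStylePoint {
    public int x;
}
```

Usage difference: `point.x = 5;` versus `point.setX(5);` -- same effect, fewer virtual method calls in the first case.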
To learn, take an open source project (e.g. my Zwitscher app), use it, try to find out what you want to improve and improve it.
Firstly, make sure that your employer was justified in his reasons for firing you. Some of the questions you should be asking them are:
- Were the deadlines not respected? In this case, if you did give them timelines for completion of phases of your work, then you may need to look at improving your effort estimation. If they gave you timelines, see if they were reasonable.
- Were features added to your work on an ad-hoc basis? Many times I've found that a freeze on requirements is consistently broken.
- If you are slow and haven't picked up some of the basics, then come up with a sample application that closely resembles your employer's application, and DO it. Use SO, developer docs and books to clear doubts and hone your skills.
Best of Luck with Programming :)
I have been working on a Java Swing project. Its design is pretty poor, and I have been given the task of improving it. My primary idea was to decouple the Java Swing code from the business code by following the MVC pattern.
Doing this manually seems error-prone. Is there any tool/software that decouples GUI-layer code (written in Java Swing) from business-layer code (written in core Java)?
As far as I can tell the answer to that question is "No, there is not such a tool."
I would hazard a guess that it is not even likely someone could make such a tool. The problem is that there is no easy way to distinguish between "business logic" code and "GUI" code.
On projects that I have worked on, we made a concerted effort in the design phase to keep the GUI and the business logic separate, but I can't count the number of times that it was unavoidable that there was cross-over, especially when some business rule drove how the GUI should behave.
I don't envy your task. Dealing with and modifying existing code, especially code that you didn't create, IS error-prone. And the whole concept of MVC programming is really just a way to think about how you build a program, not an existing library of tools to use.
I learned MVC programming back in the Smalltalk days, when it was originally invented by the smart people at Xerox PARC, and even they couldn't avoid some cross-over coupling of code.
Automated tools work well when the input is well defined and structured. Based on what you have described, your inputs are far from that.
Yes, doing this work is going to be complicated and a lot of work. But this is something that you just need to roll up your sleeves and dive into it. Before you start touching the code, sit down and make sure you fully understand how things are laid out now (and why that is a bad thing). Then start planning how you would do it better.
Now that you have a plan of how things should be, start working with the code and updating the legacy code.
There isn't a tool, but there are techniques.
Refactoring is the practice of making small changes to code that alter its architecture without altering its function. You've probably already done refactoring to a degree, but sometimes reading up on a subject you already know can help sharpen your skills.
Many IDEs do support some of the simple refactorings. Eclipse and NetBeans have different strengths in their refactoring support, and depending on your current environment, you may find one to be more useful than the other.
What refactoring won't do for you is give you a roadmap of how to get from point A (your spaghetti code) to point B (your cleanly separated MVC). Instead, it will just make you a better driver along the path. Unfortunately (or fortunately, for all the developers out there), good brains are still needed to figure out what sections of code are intended to do, and whether they are more appropriately placed in the Model, View, or Controller. The rest is just teasing out the dependencies in such a manner that you can move the code up or down in the sequence and eventually push it into a method that can then be moved to the appropriate class.
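To illustrate the kind of small, behavior-preserving step involved, here is a hedged sketch (all names hypothetical) in which arithmetic that started life inside a Swing ActionListener is extracted into a plain model class with no Swing dependency:

```java
import javax.swing.JButton;
import javax.swing.JLabel;

// Model: pure business logic, no Swing imports, easy to unit test.
class InvoiceCalculator {
    double totalWithTax(double net, double taxRate) {
        return net * (1.0 + taxRate);
    }
}

// View/controller glue: the listener only translates between widgets and the model.
class InvoicePanelWiring {
    void wire(JButton calculateButton, JLabel resultLabel, double net) {
        InvoiceCalculator calculator = new InvoiceCalculator();
        calculateButton.addActionListener(e -> {
            // Before refactoring, this arithmetic lived directly in the listener.
            double total = calculator.totalWithTax(net, 0.2);
            resultLabel.setText(String.valueOf(total));
        });
    }
}
```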
Good luck.
I also don't know of any tool. And as others here - kind of - point out, I think such tools would end up making programmers redundant. We don't want that, do we? :-D
But checking manually whether you actually managed to decouple the code should not be too complicated. Just make sure your "business part" imports no Swing packages (a small sketch of automating this check follows the list):
find in files "swing"
move all those files in one package
check if there is any "business logic" in there => take it out
...
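Here is a minimal, self-contained sketch of automating that "find in files" step using only the standard library; the path in the usage comment and the class name are made up:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Scans a source tree and prints every .java file that still imports Swing.
// Usage (hypothetical path): java SwingImportCheck src/main/java/com/example/business
public class SwingImportCheck {
    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args[0]);
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(p -> p.toString().endsWith(".java"))
                 .filter(SwingImportCheck::importsSwing)
                 .forEach(p -> System.out.println("Still depends on Swing: " + p));
        }
    }

    private static boolean importsSwing(Path file) {
        try {
            return Files.readAllLines(file).stream()
                        .anyMatch(line -> line.startsWith("import javax.swing"));
        } catch (IOException e) {
            return false;
        }
    }
}
```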
Is it wise to develop a prototype GUI before designing other part of the system?
I am using Java for this small project. It will be a program with GUI and database connection. Say the database has table A and B, the user can choose which table to interact with. The program then display the contents of, say, table A in the GUI, and allows the user to change the content and submit the changes, or delete, or insert.
I think the GUI should be developed before any back-end development starts. There are a couple of reasons to do this:
You gain clarity on how model objects should interact.
Usability poses lots of restrictions on the way you want to pull data. You will probably want to develop and architect after you're 100% sure what the constraints are.
From a business point of view, managers like to see a dumb mock UI before any development starts. Many times, the feedback leads to major changes in back-end assumptions, which is a lot less painful than getting a change request after the back-end development is over.
My personal experience is that simultaneous development of the GUI and the back end is a bit messy. Plus, the GUI provides a solid expectation of behavior from the back end. Moreover, this approach makes sure all the developers, your client, and your manager are on the same page.
I agree with Joel Spolsky that it is a great idea to write a functional spec before writing code. Part of that spec should include a collection of screen mockups. #O.D. is right, Balsamiq is a great tool. It has saved me a lot of time in the past.
Once you have a functional spec in place that the business users are happy with, you will then have a better idea of how to design your system to meet the requirements, e.g. whether high performance is a requirement, domain model vs. simple CRUD, etc.
Then you should start by taking a single use case and building a vertical slice of your application: build the GUI, service layer, persistence layer, and database schema in one iteration. This will hopefully point out any problems with your design and give you the chance to modify it before you start building out the horizontal functionality.
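As a hedged illustration of what such a vertical slice might contain for a single made-up use case (all names are hypothetical, and the persistence layer is stubbed with an in-memory map so the sketch stays self-contained):

```java
import java.util.HashMap;
import java.util.Map;

// Domain object for the single use case in this slice.
class Customer {
    final String id;
    final String name;
    Customer(String id, String name) { this.id = id; this.name = name; }
}

// Persistence layer (stubbed in memory; a real slice would hit the actual schema).
class CustomerRepository {
    private final Map<String, Customer> storage = new HashMap<>();
    void save(Customer c) { storage.put(c.id, c); }
    Customer find(String id) { return storage.get(id); }
}

// Service layer: the business rule for this one use case.
class CustomerService {
    private final CustomerRepository repository = new CustomerRepository();
    Customer register(String id, String name) {
        Customer c = new Customer(id, name);
        repository.save(c);
        return c;
    }
}

// Stand-in for the GUI; in the real slice this would be the actual screen.
public class RegisterCustomerSlice {
    public static void main(String[] args) {
        CustomerService service = new CustomerService();
        System.out.println("Registered: " + service.register("c-1", "Ada").name);
    }
}
```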
I'd say yes and no.
No, because you should design your application to be modular enough that your logic and data do not depend on the UI design.
Yes, because it is always smart to design everything before you actually start implementing it.
So what I mean is that you should make a concept, but not let your UI concept 'tie your hands' when you implement your logic. That way, if your manager's clients don't like your conceptual UI, you can always change it without actually changing your application logic.
Well, showing your GUI before starting to program is a very good idea, especially because it enables the end user (customer) to check whether the UI is up to their expectations, which can save you lots of time.
In order to do that you don't necessarily need to develop a "real" prototype; you can use programs which let you quickly design the UI of your app, including a minimal workflow simulation instead of full functionality.
I had a very good experience with Balsamiq and can really recommend it.
Writing a spec before your code is always a good idea, because it makes you think. But most specs I have seen are not that good, and if the spec is too technical, users will end up signing off on it without really understanding what they are going to get.
I have seen the best results when either presenting the user manual to the client or discussing mockups of the system one scenario at a time.
Note that half-baked mockups won't do the trick. You need your mockups to be fully populated with relevant data (Ever tried to discuss some screens with accounting while the numbers on the screen don't match? There's no way at all you could explain to them these are only dummy numbers...)
And the caveat of using mockups is that users will more often than not believe the app is "almost finished", whatever you do or say. It must be some subconscious thing, I'm not sure. But to avoid that, most specialized tools have either a deliberately plain "black & white" look and feel or multiple skins you can switch between.
There is a pretty complete list of mockup tools here. Many of them are free:
http://c2.com/cgi/wiki?GuiPrototypingTools
My own tool is pretty popular: http://MockupScreens.com. I created it a long time ago precisely because of my own frustration with the problems mentioned above.
Question in nutshell
What is the best way to get Python and Java to play nice with each other?
More detailed explanation
I have a somewhat complicated situation. I'll try my best to explain both in pictures and words. Here's the current system architecture:
We have an agent-based modeling simulation written in Java. It has options of either writing locally to CSV files, or remotely via a connection to a Java server to an HDF5 file. Each simulation run spits out over a gigabyte of data, and we run the simulation dozens of times. We need to be able to aggregate over multiple runs of the same scenario (with different random seeds) in order to see some trends (e.g. min, max, median, mean). As you can imagine, trying to move around all these CSV files is a nightmare; there are multiple files produced per run, and like I said some of them are enormous. That's the reason we've been trying to move towards an HDF5 solution, where all the data for a study is stored in one place, rather than scattered across dozens of plain text files. Furthermore, since it is a binary file format, it should yield significant space savings compared to uncompressed CSVs.
As the diagram shows, the current post-processing we do of the raw output data from simulation also takes place in Java, and reads in the CSV files produced by local output. This post-processing module uses JFreeChart to create some charts and graphs related to the simulation.
The Problem
As I alluded to earlier, the CSVs are really untenable and are not scaling well as we generate more and more data from the simulation. Furthermore, the post-processing code is doing more than it should have to, essentially performing the work of a very, very poor man's relational database (making joins across 'tables' (CSV files) based on foreign keys (the unique agent IDs)). It is also difficult in this system to visualize the data in other ways (e.g. in Prefuse, Processing, or JMonkeyEngine, or getting some subset of the raw data to play with in MATLAB or SPSS).
Solution?
My group decided we really need a way of filtering and querying the data we have, as well as performing cross-table joins. Given this is a write-once, read-many situation, we really don't need the overhead of a real relational database; instead we just need some way to put a nicer front end on the HDF5 files. I found a few papers about this, such as one describing how to use [XQuery as the query language on HDF5 files][3], but that paper describes having to write a compiler to convert from XQuery/XPath into the native HDF5 calls, which is way beyond our needs.
Enter [PyTables][4]. It seems to do exactly what we need, providing two different ways of querying data: either through Python list comprehensions or through [in-kernel (C level) searches][5].
The proposed architecture I envision is this:
What I'm not really sure how to do is link together the Python code that will be written for querying with the Java code that serves up the HDF5 files, and the Java code that does the post-processing of the data. Obviously I will want to rewrite much of the post-processing code that is implicitly doing queries and instead let the excellent PyTables do this much more elegantly.
Java/Python options
A simple Google search turns up a few options for [communicating between Java and Python][7], but I am so new to the topic that I'm looking for some actual expertise and criticism of the proposed architecture. It seems like the Python process should be running on the same machine as the Datahose so that the large .h5 files do not have to be transferred over the network; rather, the much smaller, filtered views of them would be transmitted to the clients. [Pyro][8] seems to be an interesting choice - does anyone have experience with that?
This is an epic question, and there are lots of considerations. Since you didn't mention any specific performance or architectural constraints, I'll try and offer the best well-rounded suggestions.
The initial plan of using PyTables as an intermediary layer between your other elements and the data files seems solid. However, one design constraint that wasn't mentioned is one of the most critical in all data processing: which of these data processing tasks can be done in a batch style, and which are more of a live stream?
This differentiation between "we know exactly our input and output and can just do the processing" (batch) and "we know our input and what needs to be available for something else to ask" (live) makes all the difference to an architectural question. Looking at your diagram, there are several relationships that imply the different processing styles.
Additionally, on your diagram you have components of different types all using the same symbols. It makes it a little bit difficult to analyze the expected performance and efficiency.
Another significant constraint is your IT infrastructure. Do you have high-speed network-attached storage available? If you do, intermediary files become a brilliant, simple, and fast way of sharing data between the elements of your infrastructure for all batch-processing needs. You mentioned running your PyTables-using application on the same server that's running the Java simulation. However, that means that server will experience load for both writing and reading the data. (That is to say, the simulation environment could be affected by the needs of unrelated software when they query the data.)
To answer your questions directly:
PyTables looks like a nice match.
There are many ways for Python and Java to communicate, but consider a language-agnostic communication method so these components can be changed later if necessary. This is as simple as finding libraries that support both Java and Python and trying them. The API you choose to implement with whatever library should be the same anyway. (XML-RPC would be fine for prototyping, as it's in the standard library; Google's Protocol Buffers or Facebook's Thrift make good production choices. But don't underestimate how great and simple just "writing things to intermediary files" can be if the data is predictable and batchable.)
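To illustrate only the "intermediary files" option (not XML-RPC, Protocol Buffers, or Thrift), here is a hedged sketch in which the Java side drops a plain-text request into a shared directory and polls for the result file a Python/PyTables worker is assumed to write back; all paths, file names, and the request format are made up:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Batch-style handoff between the Java post-processor and a Python/PyTables worker.
public class IntermediaryFileHandoff {
    public static void main(String[] args) throws IOException, InterruptedException {
        Path exchangeDir = Paths.get("/shared/exchange");   // hypothetical shared storage
        Path request = exchangeDir.resolve("query-001.req");
        Path result = exchangeDir.resolve("query-001.csv");

        // 1. Describe the query in whatever flat format both sides agree on.
        Files.write(request, "scenario=baseline;stat=median".getBytes(StandardCharsets.UTF_8));

        // 2. Poll until the Python side has produced the filtered view.
        while (!Files.exists(result)) {
            Thread.sleep(1000);
        }
        System.out.println("Result ready: " + Files.readAllLines(result).size() + " rows");
    }
}
```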
To help with the design process more and flesh out your needs:
It's easy to look at a small piece of the puzzle, make some reasonable assumptions, and jump into solution evaluation. But it's even better to look at the problem holistically with a clear understanding of your constraints. May I suggest this process:
Create two diagrams of your current architecture, physical and logical.
On the physical diagram, create boxes for each physical server and diagram the physical connections between each.
Be certain to label the resources available to each server and the type and resources available to each connection.
Include physical hardware that isn't involved in your current setup if it might be useful. (If you have a SAN available, but aren't using it, include it in case the solution might want to.)
On the logical diagram, create boxes for every application that is running in your current architecture.
Include relevant libraries as boxes inside the application boxes. (This is important, because your future solution diagram currently has PyTables as a box, but it's just a library and can't do anything on its own.)
Draw on-disk resources (like the HDF5 and CSV files) as cylinders.
Connect the applications with arrows to other applications and resources as necessary. Always draw the arrow from the "actor" to the "target". So if an app writes an HDF5 file, the arrow goes from the app to the file. If an app reads a CSV file, the arrow goes from the app to the file.
Every arrow must be labeled with the communication mechanism. Unlabeled arrows show a relationship, but they don't show what relationship and so they won't help you make decisions or communicate constraints.
Once you've got these diagrams done, make a few copies of them, and then right on top of them start to do data-flow doodles. With a copy of the diagram for each "end point" application that needs your original data, start at the simulation and end at the end point with a pretty much solid flowing arrow. Any time your data arrow flows across a communication/protocol arrow, make notes of how the data changes (if any).
At this point, if you and your team all agree on what's on paper, then you've explained your current architecture in a manner that should be easily communicable to anyone. (Not just helpers here on stackoverflow, but also to bosses and project managers and other purse holders.)
To start planning your solution, look at your dataflow diagrams and work your way backwards from endpoint to start point, creating a nested list that contains every app and intermediary format on the way back to the start. Then, list requirements for every application. Be sure to cover:
What data formats or methods can this application use to communicate?
What data does it actually want? (Is this always the same, or does it change on a whim depending on other requirements?)
How often does it need it?
Approximately how many resources does the application need?
What does the application do now that it doesn't do well?
What could this application do that would help, but that it isn't doing now?
If you do a good job with this list, you can see how this will help define what protocols and solutions you choose. You look at the situations where the data crosses a communication line, and you compare the requirements list for both sides of the communication.
You've already described one particular situation: you have quite a bit of Java post-processing code that is doing "joins" on tables of data in CSV files; that's a "does it now but doesn't do it well". So you look at the other side of that communication to see whether the other side can do that thing well. At this point the other side is the CSV file, and before that the simulation, so no, there's nothing that can do it better in the current architecture.
So you've proposed a new Python application that uses the PyTables library to make that process better. Sounds good so far! But in your next diagram, you added a bunch of other things that talk to "PyTables". Now we've extended past the understanding of the group here at StackOverflow, because we don't know the requirements of those other applications. But if you make the requirements list like mentioned above, you'll know exactly what to consider. Maybe your Python application using PyTables to provide querying on the HDF5 files can support all of these applications. Maybe it will only support one or two of them. Maybe it will provide live querying to the post-processor, but periodically write intermediary files for the other applications. We can't tell, but with planning, you can.
Some final guidelines:
Keep things simple! The enemy here is complexity. The more complex your solution, the more difficult it is to implement and the more likely it is to fail. Use the smallest number of operations, and use the least complex operations. Sometimes just one application to handle the queries for all the other parts of your architecture is the simplest. Sometimes an application to handle "live" queries and a separate application to handle "batch requests" is better.
Keep things simple! It's a big deal! Don't write anything that can already be done for you. (This is why intermediary files can be so great, the OS handles all the difficult parts.) Also, you mention that a relational database is too much overhead, but consider that a relational database also comes with a very expressive and well-known query language, the network communication protocol that goes with it, and you don't have to develop anything to use it! Whatever solution you come up with has to be better than using the off-the-shelf solution that's going to work, for certain, very well, or it's not the best solution.
Refer to your physical layer documentation frequently so you understand the resource use of your considerations. A slow network link or putting too much on one server can both rule out otherwise good solutions.
Save those docs. Whatever you decide, the documentation you generated in the process is valuable. Wiki them or file them away so you can whip them out again when the topic comes up.
And the answer to the direct question, "How do I get Python and Java to play nice together?" is simply "use a language-agnostic communication method." The truth of the matter is that neither Python nor Java is particularly important to the problem set you describe. What's important is the data that's flowing through it. Anything that can easily and effectively share data is going to be just fine.
Do not make this more complex than it needs to be.
Your Java process can -- simply -- spawn a separate subprocess to run your PyTables queries. Let the Operating System do what OS's do best.
Your Java application can simply fork a process which has the necessary parameters as command-line options. Then your Java can move on to the next thing while Python runs in the background.
This has HUGE advantages in terms of concurrent performance. Your Python "backend" runs concurrently with your Java simulation "front end".
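Here is a minimal sketch of that fork-a-subprocess idea using the standard ProcessBuilder API; the script name, its options, and the output file name are hypothetical:

```java
import java.io.IOException;

public class QuerySubprocessLauncher {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Hypothetical PyTables script and arguments; only ProcessBuilder itself is standard.
        ProcessBuilder builder = new ProcessBuilder(
                "python", "run_query.py", "--study", "baseline", "--out", "summary.csv");
        builder.inheritIO();                    // let the child write straight to our console
        Process pythonWorker = builder.start(); // Java can carry on while Python runs

        // ... other Java work could happen here ...

        int exitCode = pythonWorker.waitFor(); // block only when the result is actually needed
        System.out.println("PyTables query finished with exit code " + exitCode);
    }
}
```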
You could try Jython, a Python interpreter for the JVM which can import Java classes.
Jython project homepage
Unfortunately, that's all I know on the subject.
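For completeness, a hedged sketch of what embedding Jython from Java looks like, assuming the Jython jar is on the classpath; note that libraries backed by C extensions (such as PyTables) generally cannot be imported under Jython, so this route fits pure-Python code only:

```java
import org.python.core.PyObject;
import org.python.util.PythonInterpreter;

public class JythonEmbeddingSketch {
    public static void main(String[] args) {
        // Run a small piece of Python inside the JVM and read a value back into Java.
        PythonInterpreter interpreter = new PythonInterpreter();
        interpreter.exec("total = sum(range(10))");
        PyObject total = interpreter.get("total");
        System.out.println("Computed in Jython: " + total);
    }
}
```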
Not sure if this is good etiquette. I couldn't fit all my comments into a normal comment, and the post has had no activity for 8 months.
Just wanted to see how this was going for you. We have a very, very similar situation where I work - only the simulation is written in C and the storage format is binary files. Every time a boss wants a different summary, we have to make or modify hand-written code to produce it. Our binary files are about 10 GB in size and there is one of these for every year of the simulation, so as you can imagine, things get hairy when we want to run it with different seeds and such.
I've just discovered PyTables and had a similar idea to yours. I was hoping to change our storage format to HDF5 and then run our summary reports/queries using PyTables. Part of this involves joining tables from each year. Have you had much luck doing these types of "joins" using PyTables?
It's been mentioned to me that I'll be the sole developer behind a large new system. Among other things I'll be designing a UI and database schema.
I'm sure I'll receive some guidance, but I'd like to be able to knock their socks off. What can I do in the meantime to prepare, and what will I need to keep in mind when I sit down at my computer with the spec?
A few things to keep in mind: I'm a college student at my first real programming job. I'll be using Java. We already have SCM set up with automated testing, etc...so tools are not an issue.
Do you know much about OOP? If so, look into Spring and Hibernate to keep your implementation clean and orthogonal. If you get that, you should find TDD a good way to keep your design compact and lean, especially since you have "automated testing" up and running.
UPDATE:
Looking at the first slew of answers, I couldn't disagree more. Particularly in the Java space, you should find plenty of mentors/resources on working out your application with Objects, not a database-centric approach. Database design is typically the first step for Microsoft folks (which I do daily, but am in a recovery program, er, Alt.Net). If you keep the focus on what you need to deliver to a customer and let your ORM figure out how to persist your objects, your design should be better.
This sounds very much like my first job. Straight out of university, I was asked to design the database and business logic layer, while other people would take care of the UI. Meanwhile the boss was looking over my shoulder, unwilling to let go of what used to be his baby and was now mine, and poking his finger in it. Three years later, developers were fleeing the company and we were still X months away from actually selling anything.
The big mistake was in being too ambitious. If this is your first job, you will make mistakes and you will need to change how things work long after you've written them. We had all sorts of features that made the system more complicated than it needed to be, both on the database level and in the API that it presented to other developers. In the end, the whole thing was just far too complicated to support all at once and just died.
So my advice:
If you're not sure about taking on such a big job single-handed, don't. Tell your employers, and get them to find or hire somebody for you to work with who can help you out. If people need to be added to the project, then it should be done near the start rather than after stuff starts going wrong.
Think very carefully about what the product is for, and boil it down to the simplest set of requirements you can think of. If the people giving you the spec aren't technical, try to see past what they've written to what will actually work and make money. Talk to customers and salespeople, and understand the market.
There's no shame in admitting you're wrong. If it turns out that the entire system needs to be rewritten, because you made some mistake in your first version, then it's better to admit this as soon as possible so you can get to it. Correspondingly, don't try to make an architecture that can anticipate every possible contingency in your first version, because you don't know what every contingency is and will just get it wrong. Write once with an eye to throwing away and starting again - you may not have to, the first version may be fine, but admit it if you do.
I also disagree about starting with the database. The DB is simply an artifact of how your business objects are persisted. I don't know of an equivalent in Java, but .NET has stellar tools such as SubSonic that allow your DB design to stay fluid as you iterate through your business object design. I'd say first and foremost (even before deciding on what technologies to introduce), focus on the process and identify your nouns and verbs ... then build out from those abstractions. Hey, it really does work in the "real world", just like OOP 101 taught you!
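As a tiny, purely hypothetical illustration of the nouns-and-verbs exercise: nouns from a spec sentence become classes, verbs become methods on the noun that performs them, and persistence is deliberately absent at this stage.

```java
import java.util.ArrayList;
import java.util.List;

// Noun from a hypothetical spec sentence: "A customer places an order."
class Order {
    final String product;
    Order(String product) { this.product = product; }
}

class Customer {
    private final List<Order> orders = new ArrayList<>();

    // The verb from the same sentence becomes a method on the noun that performs it.
    Order placeOrder(String product) {
        Order order = new Order(product);
        orders.add(order);
        return order;
    }

    List<Order> orders() { return orders; }
}
```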
Before you start coding, plan out your database schema - everything else will flow from that. Getting the database reasonably correct early on will save you time and headaches later.
The main thing is being able to abstract the complexity of the system so that you don't get bogged down by it as soon as you start off.
First read the spec like a story (skimming through it). Don't stop at every requirement to analyze it right there and then. This will allow you to get an overall picture of the system without too many details. At this point you would start identifying the major functional components of the system. Start putting these down (use a mindmap tool if you like).
Then take each component and start exploding it (and tying each detail with requirements in the spec document). Do this for all components, till you have covered all requirements.
Now, you should start looking at relationships between the components, and whether there are repetitions of features or functions across the various components (which you can then pull out to create utility components, or such). Around now, you would have a good detailed map of your requirements in your mind.
NOW, you should think of designing the database, ER diagrams, Class Design, DFDs, deployment, etc.
The problem with doing the last step first is that you can get bogged down in the complexity of your system without really gaining an overall understanding in the first place.
I do it the other way around. I find that doing it database-schema-first gets the system stuck in a data-driven-design that is difficult to abstract from persistence. We try to do domain model designs first and then base the database schema on those.
And then there's the infrastructure design: the team should settle on conventions on how to structure the program first and foremost. And then we work together to agree first on a design for the common functionality of the system (e.g., things everyone needs like persistence, logging, etc.). This becomes the framework of the system.
We all work on that together first before we split the rest of the functionalities amongst ourselves.
It has been my experience that Java applications (.NET also) that consider the database last are highly likely to perform poorly when placed into a corporate environment. You need to really think about your audience. You didn't say if it was a web app or not. Either way the infrastructure that you are implementing on is important when considering how you handle your data.
No matter what methodology you consider, how you get and save your data and its impact on performance should be right up there among your top priorities.
I'd suggest thinking about how this application will be used. How will future users work with it? I'm sure you know at least a few things about what this application needs to handle, but my first advice is "think of the user and what he or she needs".
Draw it up on plain paper, thinking about where to section off the code. Remember not to mix logic with GUI code (a common error). This way you will be set to extend your application's reach in the future to servlets and/or applets or whatever platform comes along. Section it in layers so that you can respond to large changes faster without rebuilding everything. Layers should not see any layers other than their closest neighbouring layers.
Begin with the true core functionality. All that time-consuming fluff (that will make your project 4 weeks late) won't matter much to the vast majority of users. It can be added later once you are sure you can deliver on time.
Btw, even though this has nothing to do with design, I'd just like to say that you won't deliver on time. Make a realistic estimate of time consumption and then double it :-) I assume here that you will not be alone in this project and that people will come and go as the project progresses. You may need to train people midway through the project; people go on holiday or need surgery, etc.
Split the big system to smaller pieces.
And don't assume that it's all that complex, because it usually isn't. Thinking of it as too complex just muddles your thoughts and, eventually, the design. At some point you realize that you could do the same thing more simply, and then you redesign it.
At least, this has been my major mistake in designing.
Keep it simple!
I found very insightful ideas about starting a new large project, based on
common good practices
Test Driven Development
and pragmatic approach
in the book Growing Object-Oriented Software, Guided by Tests.
It is still under development, but the first 3 chapters may be what you are looking for and are, IMHO, worth reading.