I am used to adding logging to standalone Java applications and writing the logs to files using log4j and slf4j. I am moving some applications to the Java Web Start format, and I am not clear on the best way to perform logging so I can monitor the behaviour of the application. I have thought of two options:
Write the log to the local machine and provide an option to send the information to a central server under some condition (time, error, etc.)
Send the output of the log to the server directly
What is best practice?
I've seen 1. implemented by many programs.
But 2. seems bandwidth intensive, intrusive, and overkill.
Agreed, 2 seems like it's not such a good option: an error with the web services wouldn't be logged in that case. I was wondering if there was any other option, but I can't think of any.
I was thinking of entirely local sources of problems connecting to the server, but good point.
What is best practice?
Stick with the majority and use method 1. Unless you have a marvelous inspiration about how the entire logging/reporting system could be improved, I'd go with "tried and tested": it is likely to be the easiest, is best supported by existing frameworks, and, should your code falter, has the greatest number of people who have "been there, done that" to potentially help.
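If you do implement option 1, bear in mind that a Web Start application can typically only write somewhere like the user's home directory (a sandboxed, unsigned app would have to use the JNLP PersistenceService instead). A minimal log4j sketch for a signed app; the ".myapp" directory and the size limits are placeholders:

    import java.io.File;
    import org.apache.log4j.Level;
    import org.apache.log4j.Logger;
    import org.apache.log4j.PatternLayout;
    import org.apache.log4j.RollingFileAppender;

    public class WebStartLogging {
        public static void init() throws Exception {
            // A signed Web Start app can write under user.home on every platform
            String logFile = System.getProperty("user.home") + "/.myapp/app.log";
            new File(logFile).getParentFile().mkdirs(); // make sure the directory exists
            RollingFileAppender appender = new RollingFileAppender(
                    new PatternLayout("%d %-5p %c - %m%n"), logFile, true);
            appender.setMaxFileSize("1MB");  // cap disk usage on the client machine
            appender.setMaxBackupIndex(3);
            Logger.getRootLogger().addAppender(appender);
            Logger.getRootLogger().setLevel(Level.INFO);
        }
    }

Capping the file size matters more on a client machine than on a server, since you can't watch the client's disk yourself.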
Related
I work for an enterprise with a code base of many millions of lines of Java. Unfortunately, very poor practices were put in place to track when one Java EAR calls another EAR on another system. The problem gets even worse: we run DB2, and all the DB2 schemas run over the same data connection, which means there is no standard way to look at a config file or database connection to even tell which databases the application accesses. The problem extends to other systems too, since we have REST data services, MQ systems, JMS, EJB RMI, etc. Trying to do impact analysis is a nightmare.
Is there a tool, maybe a FindBugs plugin, that I can run on an application to generate a report of the systems the application accesses?
If not, if I set, say, TRACE logging on java.io and java.nio to log everything, would that capture any network connections that Java attempts to make through the app server?
My ultimate goal, if I can't find a static analysis tool that can help with these problems, is to write some AOP application that would live between the EAR and WebSphere and log all outbound (and possibly inbound) connections to the EAR's resources.
Is this possible?
Tricky one ;-)
FindBugs can help you identify all communication-related places in the Java code, but you have to do some work for that:
Identify all the kinds of connections you want to flag (e.g. DB connections, EJB communication, REST client code, ...).
Once you have that, write your own FindBugs plugin that detects those places. This may sound complicated, but depending on how many kinds of places you want to identify, a versed developer can do it in 2-3 days, I would guess. As a starting point, have a look at the source code of the bug patterns that ship with FindBugs, find a similar one, and work from that. There are also lots of tutorials on the web about writing your own bug pattern. (A minimal detector sketch follows this list.)
Configure FindBugs to use only your bug pattern and run it on your code base (otherwise all the other bugs will clutter the result, especially since your code base is this huge).
FindBugs will then generate a report showing you all the "communication" places.
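To make the plugin idea concrete, here is a minimal detector sketch that flags direct JDBC connection acquisition. The class name and the "EXT_DB_CONNECTION" bug type are invented for illustration, and the bug type would also need to be declared in your plugin's findbugs.xml/messages.xml:

    import org.apache.bcel.Constants;
    import edu.umd.cs.findbugs.BugInstance;
    import edu.umd.cs.findbugs.BugReporter;
    import edu.umd.cs.findbugs.bcel.OpcodeStackDetector;

    // Flags calls to DriverManager.getConnection as "communication places"
    public class ExternalConnectionDetector extends OpcodeStackDetector {
        private final BugReporter bugReporter;

        public ExternalConnectionDetector(BugReporter bugReporter) {
            this.bugReporter = bugReporter; // injected by the FindBugs framework
        }

        @Override
        public void sawOpcode(int seen) {
            if (seen == Constants.INVOKESTATIC
                    && "java/sql/DriverManager".equals(getClassConstantOperand())
                    && "getConnection".equals(getNameConstantOperand())) {
                bugReporter.reportBug(
                        new BugInstance(this, "EXT_DB_CONNECTION", NORMAL_PRIORITY)
                                .addClassAndMethod(this)
                                .addSourceLine(this));
            }
        }
    }

You would add one such check per connection kind (EJB lookups, REST client classes, MQ/JMS factories, ...), which is where most of the 2-3 days goes.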
I'm looking for a way to centralise the logging concerns of distributed software (written in Java), which would be quite easy, since the system in question has only one server. But keeping in mind that more instances of that server are likely to run in the future (and more applications will need this), there would have to be something like a logging server that takes care of incoming logs and makes them accessible to the support team.
The situation right now is that several Java applications use log4j, which writes its data to local files. So if a client experiences problems, the support team has to ask for the logs, which isn't always easy and takes a lot of time. In the case of a server fault the diagnosis problem is not as big, since there is remote access anyway, but even so, monitoring everything through a logging server would still make a lot of sense.
While going through the questions regarding "centralised logging" I found another question, actually the only one with an answer usable in this case. The problem is that all applications run in a closed environment (within one network), and security guidelines do not permit anything concerning internal software to leave the environment's network.
I also found a wonderful article about how one would implement such a logging server. Since the article was written in 2001, I would have thought that someone might already have solved this particular problem, but my search results came up with nothing.
My question: is there a logging framework that handles logging over the network, with a centralised server that the support team can access?
Specification:
Availability
Server has to be run by us.
Java 1.5 compatibility
Compatibility with a heterogeneous network.
Best-Case: Protocol uses HTTP to send logs (to avoid firewall-issues)
Best-Case: Uses log4j or LogBack or basically anything that implements slf4j
Not necessary, but nice to have
Authentication and security are of course an issue, but could be deferred for at least a while (if it is open-source software we would extend it to our needs; OT: we always give back to the projects).
Data mining and analysis are very helpful for making software better, but that could just as well be an external application.
My worst-case scenario is that there is no software like that. In that case, we would probably implement it ourselves. But if such a client-server application exists, I would very much appreciate not having to do this particularly problematic bit of work.
Thanks in advance
Update: The solution has to run on several Java-enabled platforms (mostly Windows, Linux, some HP-UX).
Update: After a lot more research we actually found a solution we were able to acquire. clusterlog.net (offline since at least mid-2015) provides logging services for distributed software and is compatible with log4j and logback (which is compatible with slf4j). It lets us analyze every single user's way through the application, which makes it very easy to reproduce reported bugs (and even unreported ones). It also notifies us of important events by email, and it has a report system where logs of the same origin are summarized into an easily accessible format. They deployed it here just a couple of days ago (the deployment was flawless) and it is running great.
Update (2016): this question still gets a lot of traffic, but the site I referred to does not exist anymore.
You can use log4j with the SocketAppender; you then have to write the server part yourself to process the incoming LoggingEvents.
See http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/net/SocketAppender.html
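A minimal sketch of the client side, attaching the appender programmatically (host and port are placeholders); on the receiving end, the bundled org.apache.log4j.net.SimpleSocketServer is a reasonable starting point to replace with your own LoggingEvent processing:

    import org.apache.log4j.Logger;
    import org.apache.log4j.net.SocketAppender;

    public class SocketLoggingExample {
        public static void main(String[] args) {
            // Ships serialized LoggingEvents over TCP to the collector
            Logger.getRootLogger().addAppender(
                    new SocketAppender("loghost.example.com", 4712));
            Logger.getLogger(SocketLoggingExample.class).info("hello, central server");
        }
    }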
NXLog, Logstash, or Graylog2
or
Logstash + ElasticSearch (+ optionally Kibana)
Example:
1) http://logstash.net/docs/1.3.3/tutorials/getting-started-simple
2) http://logstash.net/docs/1.3.3/tutorials/getting-started-centralized
Have a look at logFaces; it looks like your specifications are met.
http://www.moonlit-software.com/
Availability (check)
Server has to be run by us. (check)
Java 1.5 compatibility (check)
Compatibility with a heterogeneous network. (check)
Best-Case: Protocol uses HTTP to send logs (to avoid firewall issues) (almost: it uses TCP/UDP)
Best-Case: Uses log4j or LogBack or basically anything that implements slf4j (check)
Authentication (check)
Data mining and analysis (possible through extension api)
There's a ready-to-use solution from Facebook - Scribe - which can feed its logs into Apache Hadoop. However, most companies I'm aware of still tend to develop in-house systems for this. I worked at one such company and dealt with logs there about two years ago. We also used Hadoop. In our case we had the following setup:
We had a small dedicated cluster of machines for log aggregation.
Workers mined logs from the production service and then parsed the individual lines.
Then reducers would aggregate the necessary data and prepare reports.
We had a small and fixed number of reports that we were interested in. In rare cases when we wanted to perform a different kind of analysis we would simply add a specialized reducer code for that and optionally run it against old logs.
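Our actual pipeline was in-house, but to illustrate the worker/reducer split, a Hadoop job that counts ERROR lines per logger might look roughly like this (the class names and the assumed log line layout are invented):

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class ErrorsPerLogger {
        // Assumes lines like "2013-01-01 12:00:00,123 ERROR com.example.Foo - msg"
        public static class LineMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            @Override
            protected void map(LongWritable key, Text line, Context ctx)
                    throws IOException, InterruptedException {
                String[] parts = line.toString().split("\\s+");
                if (parts.length > 3 && "ERROR".equals(parts[2])) {
                    ctx.write(new Text(parts[3]), ONE); // logger name -> count of 1
                }
            }
        }

        public static class SumReducer
                extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text logger, Iterable<IntWritable> counts, Context ctx)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable c : counts) sum += c.get();
                ctx.write(logger, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "errors-per-logger");
            job.setJarByClass(ErrorsPerLogger.class);
            job.setMapperClass(LineMapper.class);
            job.setReducerClass(SumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }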
If you can't decide in advance what kinds of analyses you are interested in, it'll be better to store the structured data prepared by the workers in HBase or some other NoSQL database (for example, some people use MongoDB for this). That way you won't need to re-aggregate data from the raw logs and will be able to query the datastore instead.
There are a number of good articles about such log aggregation solutions, for example about using Pig to query the aggregated data. Pig lets you query large Hadoop-based datasets with SQL-like queries.
Our development team hosts many different applications, both .NET- and Java-based. Currently we handle our error logging with Log4J and use emails to alert the development team when problems arise. We get thousands of alerts a day, and it's becoming tedious to maintain.
We've been discussing creating a central dashboard for all our apps. The ideal tool would track errors, warnings, info, etc. over the life of an application (it doesn't necessarily need to be DB-driven). The idea is that the data can be viewed on a dashboard, drillable to specific errors, with the capability of alerting via email when triggers and/or thresholds are met.
ELMAH is good for .NET, but we need a tool that could also work for Java EE. What is the best way to go about this? Should we:
Just use ELMAH for the .NET apps, find something similar for Java, and build our own dashboard to create a unified look & feel?
OR
Is there a tool that already exists that we can leverage to do this cross platform?
I've tried looking on SourceForge, but it's difficult to describe what I'm looking for.
I don't think you have a logging problem; I think you have an integration problem. Whether it is logging or any other area, your root issue is the same: how do I make my completely different components talk to each other?
There are a lot of approaches, but probably the easiest to implement across different technologies is web services or REST. You will probably need a central logger that you implement independently, and then build a web service/REST interface that your components connect to.
A different line of investigation is to see whether there is a logging product on the market that accepts web service calls. If there is, you only need to change your components to make a service call every time.
Something else you need to consider is that your remote logging should never supersede your local logging; that is, do both. The reason is very simple: remote calls can fail, so code as if they will fail.
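A minimal sketch of that "log locally, then best-effort remote" pattern; the endpoint URL is a placeholder and the JSON is built naively (real code should escape the message):

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import org.apache.log4j.Logger;

    public class RemoteLogClient {
        private static final Logger LOCAL = Logger.getLogger(RemoteLogClient.class);

        /** Logs locally first, then makes a best-effort call to the central server. */
        public static void log(String level, String message) {
            LOCAL.info(message); // local logging always happens
            try {
                URL url = new URL("http://loghost.example.com/logs"); // placeholder
                HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                conn.setConnectTimeout(2000); // don't let a dead server stall us
                conn.setDoOutput(true);
                conn.setRequestMethod("POST");
                conn.setRequestProperty("Content-Type", "application/json");
                String json = "{\"level\":\"" + level + "\",\"message\":\"" + message + "\"}";
                OutputStream out = conn.getOutputStream();
                out.write(json.getBytes("UTF-8"));
                out.close();
                conn.getResponseCode(); // force the request; ignore the response
                conn.disconnect();
            } catch (Exception e) {
                // The remote call failed, exactly as planned for: note it locally
                LOCAL.warn("Could not ship log entry to central server", e);
            }
        }
    }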
We have been using http://www.exceptional.io/ for error tracking for some time now: it's cheap and extremely simple.
To report errors you just post a JSON document to its endpoint.
I am designing a server for logging. The business logic for this application is written in multiple languages (C++ and Java for now, but other languages might be added to the mix at a later stage).
I am considering making this a separate server with a well-defined interface, to ensure that I need not port it for other languages at a later date. For scalability, the main application can run as multiple instances on multiple machines behind load balancers.
One of the important design considerations (other than the usual ones, like logging level) is performance, along with support for multiple logging targets (flat file, console, DB(?), etc.).
How do I ensure that the logger is not impacting the performance of the application? Would communicating using a socket make sense? Is there a better way to do this?
Is there a need to have all your logs shared? I would use whatever logging mechanism is best for each part of the system (log4j or Java's built-in logging on the Java side; I don't know C++'s logging libraries well enough to suggest one).
For the most part, logs should only be used for debugging and outside-the-app parsing. I would not recommend integrating logging into your business logic. If you really need the data that ends up in the logs, you'll be much better off making a direct communication call rather than spitting out a log entry and having it slurped in by another application.
If you absolutely need it, you can have an external (very low priority) application that feeds off the logs and sends them back to a centralized logging server.
There is data you need to see in near real time and data which needs to be recorded for offline processing. They have different requirements.
Real-time data needs to be in a machine-readable format and is usually directed to the places where it is used. The central logger can be on this path, provided it doesn't delay the real-time information unacceptably. For this I would use a socket (or JMS) rather than a buffered file.
Offline-processing logs can be in a machine-readable format (for overnight reports) or human-readable (for debugging). For these I would use a file, a database, or both: files can be simpler to manage, especially if they are large, while a database makes building reports easier.
In either case I would pass the information that needs to be sent via socket, or written to a file, to another thread, so that random delays in the system do not impact the code producing the log. In fact, I would consider delaying the sending of any logs until the critical processing is complete, i.e. you process everything that needs to be done first, then log everything of interest later. (A sketch of the thread hand-off follows.)
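A bare-bones sketch of that hand-off, using a bounded queue and a low-priority daemon thread (log4j's AsyncAppender gives you much the same thing out of the box):

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;

    /** Hands log records to a background thread so the hot path never blocks on I/O. */
    public class AsyncLogWriter {
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<String>(10000);

        public AsyncLogWriter() {
            Thread writer = new Thread(new Runnable() {
                public void run() {
                    try {
                        while (true) {
                            String record = queue.take();   // blocks until a record arrives
                            System.out.println(record);     // stand-in for file/socket write
                        }
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt(); // shut down quietly
                    }
                }
            }, "async-log-writer");
            writer.setDaemon(true);
            writer.setPriority(Thread.MIN_PRIORITY); // keep the writer out of the way
            writer.start();
        }

        /** Non-blocking: drops the record if the queue is full rather than stalling. */
        public void log(String record) {
            queue.offer(record);
        }
    }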
Check this:
http://logging.apache.org/log4j/1.2/manual.html
Take a look at the performance section. It will address your questions about the logging overhead in your application.
As far as supporting multiple logging targets goes, this is easily achievable with log4j, but you need to delve into some details (refer to the URL I posted); a minimal configuration example follows.
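For example, a minimal log4j.properties that sends the root logger to both a console target and a rolling file target might look like this (the file path and sizes are placeholders):

    log4j.rootLogger=INFO, console, file

    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d %-5p %c - %m%n

    log4j.appender.file=org.apache.log4j.RollingFileAppender
    log4j.appender.file.File=/var/log/myapp/app.log
    log4j.appender.file.MaxFileSize=10MB
    log4j.appender.file.MaxBackupIndex=5
    log4j.appender.file.layout=org.apache.log4j.PatternLayout
    log4j.appender.file.layout.ConversionPattern=%d %-5p %c - %m%n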
In general, in my experience log4j is excellent. I have generated thousands of static and dynamic logs of "considerable size" (for my application - the term may be interpreted differently for yours) without any problem, despite the heavy processing I perform. (For context: I am evaluating/simulating a distributed P2P algorithm on a local PC, and all is going well despite creating hundreds of logger instances for the simulation.)
We have a Java web application and we'd like to set up some basic monitoring with a view to expanding this monitoring in future. Our plan is as follows:
(1) Collect generic information (e.g. memory and threads) about the virtual machine of the web container that the application is running in.
(2) Monitor the "state" of the application. This is rather vague but at the least we'd like to see if the web application is still alive and can respond to requests.
(3) In the future we'd like to collect more information that is specific to our application. Again this is rather vague but you can assume that we might want to make certain statistics collected internally by the application available to the support staff.
Usually the web application will be deployed in a Tomcat 5.5 or 6 environment. A quick bit of searching on the web shows that JMX can be enabled for Tomcat and that JConsole can then be used to connect to the server. This gives us lots of basic information that solves point (1). Also, some information is available in the MBeans section for "Catalina" and drilling down on this I can at least, for example, see how many requests a particular servlet has received. This is not quite what we want for point (2) but at least gives us some information. There seems to be quite a lot of information there but it's rather difficult to interpret using JConsole. Perhaps there is a better tool for interpreting the MBeans exposed by Tomcat.
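For reference, remote JMX access to the Tomcat JVM is typically switched on with system properties along these lines (the port is arbitrary, and authentication/SSL should only be disabled inside a trusted network):

    # e.g. in $CATALINA_HOME/bin/setenv.sh (Tomcat 6) or your startup script
    export CATALINA_OPTS="-Dcom.sun.management.jmxremote \
      -Dcom.sun.management.jmxremote.port=9010 \
      -Dcom.sun.management.jmxremote.authenticate=false \
      -Dcom.sun.management.jmxremote.ssl=false"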
For point (3) it seems, at first glance, that we could write our own MBeans and then make these available to something like JConsole. Personally, this would involve me learning about JMX, which I'm quite happy to do, but I have a concern: having looked around, I notice that most of the textbooks on the subject haven't been updated for several years, and the open source tools seem to be languishing without recent updates. So my main question is a simple one. What are your opinions on JMX? Does it have a future, or has it been superseded by something else? Given that we already have our web application but are starting from scratch on the management console, should we choose JMX, or is there something more appropriate with a better future ahead of it?
I ask this question with no personal axe to grind, I'm simply interested to hear your opinions and experiences. I'm sure there's no one correct answer but I think an informed discussion would be useful.
Thanks in advance,
Adam.
JMX is certainly a good solution here. I wouldn't worry about it languishing. Most enterprises I've worked for recently use (or plan to use) JMX, and I'd have to hear a pretty convincing argument before choosing something else in the Java world. It's easy to write clients (monitoring solutions) for it, and you can return complex data very easily indeed. Most third-party components support monitoring via JMX as well.
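For your point (3), writing your own MBean is straightforward. A minimal standard MBean sketch, with placeholder names; note that by the standard MBean convention the interface must be public and named after the implementation class with an MBean suffix:

    // AppStatsMBean.java -- the public management interface
    public interface AppStatsMBean {
        long getRequestCount();
    }

    // AppStats.java -- the implementation, registered with the platform MBean server
    import java.lang.management.ManagementFactory;
    import java.util.concurrent.atomic.AtomicLong;
    import javax.management.ObjectName;

    public class AppStats implements AppStatsMBean {
        private final AtomicLong requests = new AtomicLong();

        public void recordRequest() { requests.incrementAndGet(); }

        public long getRequestCount() { return requests.get(); } // shown as an attribute

        public static void main(String[] args) throws Exception {
            ManagementFactory.getPlatformMBeanServer().registerMBean(
                    new AppStats(),
                    new ObjectName("com.example:type=AppStats")); // placeholder name
            Thread.sleep(Long.MAX_VALUE); // keep the JVM alive so you can connect
        }
    }

Once registered, the attribute shows up under the MBeans tab in JConsole or jvisualvm, and any JMX-aware monitoring client can read it.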
Note that you may want to consider integration with any existing management solutions (e.g. Nagios, BMC Patrol, HP OpenView, etc.) as well. They may not be so Java-aware, preferring tests like simple HTTP connectivity to check whether a web site is up (easy with Nagios), or integration via SNMP (which OpenView talks natively).
If applicable to your situation (Java 6 update 10 JDK or later, running on the same machine), then consider using jvisualvm instead, as it can dig even deeper than JConsole.
You may find that the easiest way to do what you need is a jvisualvm plugin that knows about your application.