Splitting log4j Output with Quartz Worker Threads - java

I'm working on an application that consists of an overall Quartz-based scheduler and "CycledJob" run using CronTriggers. The purpose of the application is to process inputs from different email inboxes based on the source country.
Based on the country that it comes in from (e.g. US, UK, FR), the application triggers one job thread to run each country's processing cycle, so there would be a UK worker thread, one for the US, one for France, etc. When formatting the output to log4j, I'm using the thread parameter, so it emits [ApplicationName_Worker-1], [ApplicationName_Worker-2], etc. Try as I might, I can't find a way to name the threads, since they're pulled out of Quartz's thread pools. Although I could possibly go so far as to extend Quartz, I'd like to work out a different solution instead of messing with the standard library.
Here's the problem: When using log4j, I'd like to have all log items from the US thread output to a US-only file, and likewise for each of the other country threads. I don't care if they all stay in one unified ConsoleAppender; the FileAppender split is what I'm after here. I already know how to specify multiple file appenders and such; my issue is I can't differentiate based on country. There are 20+ classes within the application that can be on the execution chain, very few of which I want to burden with the knowledge of passing an extra "context" parameter through EVERY method... I've considered a Strategy pattern extending a log4j wrapper class, but unless I can let every class in the chain know which thread it's on to parameterize the logger call, that seems impossible. Not being able to name the thread creates a challenge as well (otherwise this would be easy!).
So here's the question: What would be a suggested approach to let the many subordinate classes that are used by every country's thread know, when they log, that they are running within the context of a particular country's thread?
Good luck understanding, and please ask clarifying questions! I hope someone is able to help me figure out a decent way to tackle this. All suggestions welcome.

At the top of each country's processing thread, put the country code into Log4j's mapped diagnostic context (MDC). This uses a ThreadLocal variable so that you don't have to pass the country up and down the call stack explicitly. Then create a custom filter that looks at the MDC, and filters out any events that don't contain the current appender's country code.
In your Job:
...
public static final String MDC_COUNTRY = "com.y.foo.Country";
public void execute(JobExecutionContext context) {
    /* Just guessing that you have the country in your JobExecutionContext. */
    MDC.put(MDC_COUNTRY, context.get(MDC_COUNTRY));
    try {
        /* Perform your job here. */
        ...
    } finally {
        /* Always clear the MDC so the pooled thread doesn't carry the country into its next job. */
        MDC.remove(MDC_COUNTRY);
    }
}
...
Write a custom Filter:
package com.y.log4j;

import org.apache.log4j.spi.LoggingEvent;

/**
 * This is a general purpose filter. If its "value" property is null,
 * it requires only that the specified key be set in the MDC. If its
 * value is not null, it further requires that the value in the MDC
 * is equal.
 */
public final class ContextFilter extends org.apache.log4j.spi.Filter {

    private String key;
    private String value;

    public int decide(LoggingEvent event) {
        Object ctx = event.getMDC(key);
        if (value == null)
            return (ctx != null) ? NEUTRAL : DENY;
        else
            return value.equals(ctx) ? NEUTRAL : DENY;
    }

    public void setContextKey(String key) { this.key = key; }
    public String getContextKey() { return key; }
    public void setValue(String value) { this.value = value; }
    public String getValue() { return value; }
}
In your log4j.xml:
<appender name="fr" class="org.apache.log4j.FileAppender">
<param name="file" value="france.log"/>
...
<filter class="com.y.log4j.ContextFilter">
<param name="key" value="com.y.foo.Country" />
<param name="value" value="fr" />
</filter>
</appender>
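If you also want the country visible in the shared console output, log4j's PatternLayout can print MDC values with the %X conversion character; for example (just a sketch, reusing the MDC key from above):
<layout class="org.apache.log4j.PatternLayout">
    <param name="ConversionPattern" value="%d [%t] [%X{com.y.foo.Country}] %-5p %c - %m%n"/>
</layout>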

I wish I could be a bit more helpful than this, but you may want to investigate using some filters? Perhaps your logging could output the country code and you could match your filter based on that?
A StringMatchFilter should probably be able to match it for you.
Couldn't get the below address to work properly as a link, but if you look at it, it has some stuff on separate file logging using filters.
http://mail-archives.apache.org/mod_mbox/logging-log4j-user/200512.mbox/<1CC26C83B6E5AA49A9540FAC8D35158B01E2968E#pune.kaleconsultants.com > (just remove the space before the >)
http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/spi/Filter.html
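As a rough sketch of that idea (assuming each message already contains its country code, e.g. "[US]", somewhere in the rendered text; the appender name and file here are made up), a StringMatchFilter followed by a DenyAllFilter on each country's file appender would look like:
<appender name="usfile" class="org.apache.log4j.FileAppender">
    <param name="File" value="us.log"/>
    <layout class="org.apache.log4j.PatternLayout">
        <param name="ConversionPattern" value="%d %-5p %c - %m%n"/>
    </layout>
    <filter class="org.apache.log4j.varia.StringMatchFilter">
        <param name="StringToMatch" value="[US]"/>
        <param name="AcceptOnMatch" value="true"/>
    </filter>
    <filter class="org.apache.log4j.varia.DenyAllFilter"/>
</appender>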

Why not just call Thread.setName() when your job starts, to set the name of the Thread? If there is an access problem, configure Quartz to use your own thread pool.
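For example, a minimal sketch inside the Quartz job (the "country" job-data key is an assumption; the original name is restored so the pooled thread goes back unchanged):
import org.quartz.Job;
import org.quartz.JobExecutionContext;
import org.quartz.JobExecutionException;

public class CycledJob implements Job {
    public void execute(JobExecutionContext context) throws JobExecutionException {
        String country = context.getMergedJobDataMap().getString("country"); // assumed job-data key
        String originalName = Thread.currentThread().getName();
        Thread.currentThread().setName("ApplicationName_" + country + "_Worker");
        try {
            // process this country's inbox here
        } finally {
            Thread.currentThread().setName(originalName); // hand the pooled thread back unchanged
        }
    }
}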

I may be completely off base on my understanding of what you are attempting to accomplish, but I will take a stab at the solution. It sounds like you want a separate log file for each country for which you are processing email. Based on that understanding, here is a possible solution:
Set up an appender in your log4j configuration for each country for which you wish to log separately (US example provided):
log4j.appender.usfile=org.apache.log4j.FileAppender
log4j.appender.usfile.File=us.log
log4j.appender.usfile.layout=org.apache.log4j.PatternLayout
log4j.appender.usfile.layout.ConversionPattern=%m%n
Create a logger for each country and direct each of them to the appropriate appender (US example provided):
log4j.logger.my-us-logger=debug,usfile
In your code, create your Logger based on the country for which the email is being processed:
Logger logger = Logger.getLogger("my-us-logger");
Determine how you will accomplish step 3 for the subsequent method calls. You could repeat step 3 in each class/method; or you could modify the method signatures to accept a Logger as input; or you could possibly use ThreadLocal to pass the Logger between methods.
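For instance, a rough sketch of the ThreadLocal option (the LogContext holder class is an assumption, not part of log4j):
import org.apache.log4j.Logger;

public final class LogContext {
    private static final ThreadLocal<Logger> CURRENT = new ThreadLocal<Logger>();

    private LogContext() {}

    // Called once at the top of each country's processing, e.g. LogContext.set("my-us-logger").
    public static void set(String loggerName) {
        CURRENT.set(Logger.getLogger(loggerName));
    }

    // Called by any class on the execution chain instead of its own static logger.
    public static Logger get() {
        return CURRENT.get();
    }

    // Call in a finally block so a reused thread doesn't keep the old logger.
    public static void clear() {
        CURRENT.remove();
    }
}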
Extra info: If you do not want the log statements going to parent loggers (e.g. the rootLogger), you can set their additivity flags to false (US example provided):
log4j.additivity.my-us-logger=false


how to add new log value to all logging statements

I am using slf4j with logback in my project. There is one request_id stored in a ThreadLocal.
I want to add the value of this request id to all log statements.
Is there any way for the logger to implicitly pick up the value of request_id and log it as well, without passing it to the existing log statements?
Slf4j and logback both support the use of a mapped diagnostic context (MDC). You can add named values to the MDC, and they are passed along to the logger. The logging pattern supports a %X{key} token for including MDC values in the output.
Note that the MDC is tied to your thread, i.e. on a different thread the context is lost. And with thread reuse, the old context will reappear, so cleaning it up is important in such situations.
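A minimal sketch, assuming the request_id is put into the MDC at the point where it becomes known (the RequestHandler class and key name here are illustrative):
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class RequestHandler {
    private static final Logger log = LoggerFactory.getLogger(RequestHandler.class);

    public void handle(String requestId) {
        MDC.put("request_id", requestId); // key must match the %X token in the pattern
        try {
            log.info("processing request"); // request_id is added by the pattern, not the message
        } finally {
            MDC.remove("request_id"); // clean up so a reused thread doesn't leak the id
        }
    }
}
A matching logback pattern such as %d %-5level [%X{request_id}] %logger - %msg%n then prints the id with every statement.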
You should be able to do this using a custom Converter.
This Answer to another question shows how you can use a Converter to incorporate a thread identifier that has been saved in a thread-local variable. You should be able to adapt this to incorporate a request id.
This section of the logback manual describes how to hook a custom converter into your logback configs.
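A rough sketch of such a converter (RequestIdHolder is a hypothetical ThreadLocal holder for the request id, and the conversion word is made up):
import ch.qos.logback.classic.pattern.ClassicConverter;
import ch.qos.logback.classic.spi.ILoggingEvent;

public class RequestIdConverter extends ClassicConverter {
    @Override
    public String convert(ILoggingEvent event) {
        // RequestIdHolder: hypothetical ThreadLocal-backed holder for the current request id.
        String id = RequestIdHolder.get();
        return id != null ? id : "-";
    }
}
It is hooked into the config with a conversion rule and then used in the pattern:
<conversionRule conversionWord="reqId" converterClass="com.example.RequestIdConverter"/>
<pattern>%d %-5level [%reqId] %logger - %msg%n</pattern>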
This can be tedious, but I am sure it would do it.
First, configure slf4j logging with the JDK logger.
Write your own formatter, or extend the Java SimpleFormatter like so:
package com.mypackage;

import java.util.logging.LogRecord;
import java.util.logging.SimpleFormatter;

public class MyFormatter extends SimpleFormatter {

    @Override
    public String format(LogRecord record) {
        String simpleFormattedLog = super.format(record);
        String simpleFormattedLogWithThreadId = "THREAD_ID(" + getThreadId() + ") _ " + simpleFormattedLog;
        return simpleFormattedLogWithThreadId;
    }

    private String getThreadId() {
        // e.g. the current thread's id; substitute whatever identifier you need here
        return String.valueOf(Thread.currentThread().getId());
    }
}
Then specify the formatter in the properties file as
java.util.logging.ConsoleHandler.formatter = com.mypackage.MyFormatter
This should work, all things being equal.

Generic Java Logger output null.null for class and member name

I am using general java.util.logging.Logger with the following initialization line:
private static final Logger _log = Logger.getLogger(Login.class.getName());
Then I use it like this (just an example):
_log.log(Level.FINE, "Session id: {0}.", session.getId());
In the log sometimes I can see the following:
24-Nov-2014 17:26:13.692 FINE [http-nio-80-exec-1] null.null Session id: 18BD6989930169F77565EA2D001A5759.
Most of the time, however, it shows me the calling class and function correctly. This happens in other classes and members too. I cannot figure out why this happens. Is it a bug in the Logger?
The output looks like it is from the 'org.apache.juli.OneLineFormatter' and the 'null.null' is the source class and source method names. From the LogRecord.getSourceMethodName documentation:
May be null if no information could be obtained.
It is possible it can't be determined or has been forced to null.
Looking at the source code for 'org.apache.juli.AsyncFileHandler', there is a bug where that code does not capture the original call site.
You can create and install a filter on the AsyncFileHandler to force computation of the method and class names before the thread hand-off. Here is an example of such a filter:
import java.util.logging.Filter;
import java.util.logging.LogRecord;

public class InferCallerFilter implements Filter {
    @Override
    public boolean isLoggable(LogRecord record) {
        record.getSourceMethodName(); // Forces the caller to be inferred now, on the logging thread.
        return true;
    }
}
Even the MemoryHandler included with the JDK gets this wrong. Per the LogRecord documentation:
Therefore, if a logging Handler wants to pass off a LogRecord to another thread, or to transmit it over RMI, and if it wishes to subsequently obtain method name or class name information it should call one of getSourceClassName or getSourceMethodName to force the values to be filled in.
I "solved" this by changing my custom LogFormatter to get the source class name and source method name at the very beginning of the format method.
The return values of these methods seem to have an "expiration date"...
@Override
public String format(LogRecord record)
{
    String sourceClassName = record.getSourceClassName();
    String sourceMethodName = record.getSourceMethodName();
    ...
}
So far, after that change I have not observed any null.null entries anymore, when before I had lots of them.

IDs for log messages when using SLF4J

I would like to add a unique identifier to log statements, so I am able to add documentation (externally, e.g. a wiki) to every log statement, so a user can quickly access the message related documentation using the id. The logging framework I would like to use is SLF4J/logback.
I was not able to find documentation about related approaches except for some bits regarding auditing frameworks.
There is the Marker concept which I thought could be usable for ID injection, or I could just add the ID to the message text itself.
How would I add IDs to the logging statements "the right way"? Are there possibilities I didn't think of?
EDIT
By unique ID I just mean that there should be an identifier per log statement. A developer adds such an ID to a table/enum/whatever manually, for example, so it could be done wrong.
Such ID has to be stable, so documentation can be based on it. So the ID itself is not what I am wondering about.
My question is: what would be the right way of pushing the ID to the logger together with the message text? Would Markers be suited for this kind of requirement, should I embed the ID into the message text or is there some other possibility?
So, basically, would I use
logger.info(IDMarkers.DB_CONNECTION_FAILED, "no connection to the database");
or instead just
logger.info("[{}] no connection to the database", LogIDs.DB_CONNECTION_FAILED);
First approach has the advantage that showing the IDs is up to the logging system/its configuration.
Slf4j has http://www.slf4j.org/apidocs/org/slf4j/Marker.html
Unfortunately Markers are advertised for a different purpose. Still you can use them to uniquely mark logging statements.
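A minimal sketch of that approach, mirroring the naming already used in the question (the marker name is simply the message id):
import org.slf4j.Marker;
import org.slf4j.MarkerFactory;

public final class IDMarkers {
    // One stable marker per documented message id.
    public static final Marker DB_CONNECTION_FAILED = MarkerFactory.getMarker("EV-1234");

    private IDMarkers() {}
}
The call site then looks exactly like the first snippet in the question, and with logback the %marker conversion word prints the id alongside the message.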
A more cumbersome solution is MDC:
MDC.put("MsgId", "EV-1234");
log.info("no connection to the database");
MDC.remove("MsgId");
or with structured logging via the fluent API (requires slf4j 2.0.0):
logger.atDebug()
.addKeyValue("MsgId", "EV-1234")
.log("Temperature changed.");
Unique is only unique within some scope. Eventually, even every int or long value will be used.
So think about what "uniqueness" means to you. Then use a wrapper that will ensure your logging is handled with that id inserted.
Note that with slf4j you are dealing with an interface which will make a number of logging APIs consistent. This means you probably won't have the option to sub-class or even inject your implementation of the interface to ensure your consistent logging. Therefore you will be constrained to techniques like wrapping your logging API (preferably through the "consistent" interface).
package mypackage.log;

public class LoggerWrapper implements org.slf4j.Logger {

    private final org.slf4j.Logger logger;

    public LoggerWrapper(org.slf4j.Logger logger) {
        this.logger = logger;
    }

    public String getUniqueId() {
        return ...;
    }

    public void info(String message, Object... params) {
        logger.info(String.format("[%s] %s", getUniqueId(), message), params);
    }

    ... implement all the other methods ...
}
And this means that you will have to make your own LoggerFactory too:
public static Logger getLogger(String name) {
    return new LoggerWrapper(org.slf4j.LoggerFactory.getLogger(name));
}
The code above has a few warts (I have not actually tested it), but hopefully it will give you an idea.

What libraries exist for multitenant/conditional configurations for Java?

I'm trying to find a solution for configuring a server-side Java application such that different users of the system interact with the system as if it were configured differently (Multitenancy). For example, when my application services a request from user1, I wish my application to respond in Klingon, but for all other users I want it to reply in English. (I've picked a deliberately absurd example, to avoid specifics: the important thing is that I want the app to behave differently for different requests).
Ideally there's a generic solution (i.e. one that allows me to add
user-specific overrides to any part of my config without having to change code).
I've had a look at Apache Commons Configuration which has built in support for multitenant configuration, but as far as I can tell this is done by combining some base config with some set of overrides. This means that I'd have a config specifying:
application.lang=english
and, say a user1.properties override file:
application.lang=klingon
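(For concreteness, a rough sketch of how that base-plus-override combination might look with Commons Configuration 2.x; the file names are assumed:)
import java.io.File;

import org.apache.commons.configuration2.CombinedConfiguration;
import org.apache.commons.configuration2.Configuration;
import org.apache.commons.configuration2.builder.fluent.Configurations;
import org.apache.commons.configuration2.ex.ConfigurationException;
import org.apache.commons.configuration2.tree.OverrideCombiner;

public class UserConfigLoader {
    // With OverrideCombiner, configurations added earlier win, so the per-user
    // override file is added before the base application file.
    public static Configuration forUser(String user) throws ConfigurationException {
        Configurations configs = new Configurations();
        CombinedConfiguration cc = new CombinedConfiguration(new OverrideCombiner());
        cc.addConfiguration(configs.properties(new File(user + ".properties")));
        cc.addConfiguration(configs.properties(new File("application.properties")));
        return cc; // cc.getString("application.lang") -> "klingon" for user1
    }
}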
Unfortunately it's much easier for our support team if they can see all related configurations in one place, with overrides specified somehow inline, rather than having separate files for base vs. overrides.
I think some combination of Commons Config's multitenancy + something like a Velocity template to describe the conditional elements within underlying config is kind of what I'm aiming for - Commons Config for the ease of interacting with my configuration and Velocity for very expressively describing any overrides, in a single configuration, e.g.:
#if ($user=="user1")
application.lang=klingon
#else
application.lang=english
#end
What solutions are people using for this kind of problem?
Is it acceptable for you to code each server operation like in the following?
void op1(String username, ...)
{
String userScope = getConfigurationScopeForUser(username);
String language = cfg.lookupString(userScope, "language");
int fontSize = cfg.lookupInt(userScope, "font_size");
... // business logic expressed in terms of language and fontSize
}
(The above pseudocode assumes the name of a user is passed as a parameter, but you might pass it via another mechanism, for example, thread-local storage.)
If the above is acceptable, then Config4* could satisfy your requirements. Using Config4*, the getConfigurationScopeForUser() method used in the above pseudocode can be implemented as follows (this assumes cfg is a Configuration object that has been previously initialized by parsing a configuration file):
String getConfigurationScopeForUser(String username)
{
if (cfg.type("user", username) == Configuration.CFG_SCOPE) {
return Configuration.mergeNames("user", username);
} else {
return "user.default";
}
}
Here is a sample configuration file to work with the above. Most users get their configuration from the "user.default" scope, but Mary and John have their own overrides of some of those default values:
user.default {
language = "English";
font_size = "12";
# ... many other configuration settings
}
user.John {
#copyFrom "user.default";
language = "Klingon"; # override a default value
}
user.Mary {
#copyFrom "user.default";
font_size = "18"; # override a default value
}
If the above sounds like it might meet your needs, then I suggest you read Chapters 2 and 3 of the "Getting Started Guide" to get a good-enough understanding of the Config4* syntax and API to be able to confirm/refute the suitability of Config4* for your needs. You can find that documentation on the Config4* website.
Disclaimer: I am the maintainer of Config4*.
Edit: I am providing more details in response to comments by bacar.
I have not put Config4* in a Maven repository. However, it is trivial to build Config4* with its bundled Ant build file, because Config4* does not have any dependencies on third-party libraries.
Another approach for using Config4* in a server application (prompted by a comment by bacar) is as follows...
Implement each server operation like in the following pseudo-code:
void op1(String username, ...)
{
Configuration cfg = getConfigurationForUser(username);
String language = cfg.lookupString("settings", "language");
int fontSize = cfg.lookupInt("settings", "font_size");
... // business logic expressed in terms of language and fontSize
}
The getConfigurationForUser() method used above can be implemented as shown in the following pseudocode:
HashMap<String,Configuration> map = new HashMap<String,Configuration>();
synchronized String getConfigurationForUser(String username)
{
Configuration cfg = map.get(username);
if (cfg == null) {
// Create a config object tailored for the user & add to the map
cfg = Configuration.create();
cfg.insertString("", "user", username); // in global scope
cfg.parse("/path/to/file.cfg");
map.put(username, cfg);
}
return cfg;
}
Here is a sample configuration file to work with the above.
user ?= "";   # will be set via insertString()
settings {
#if (user #in ["John", "Sam", "Jane"]) {
language = "Klingon";
} #else {
language = "English";
}
#if (user == "Mary") {
font_size = "12";
} #else {
font_size = "10";
}
... # many other configuration settings
}
The main comments I have on the two approaches are as follows:
The first approach (one Configuration object that contains lots of variables and scopes) is likely to use slightly less memory than the second approach (many Configuration objects, each with a small number of variables). But my guess is that the memory usage of either approach will be measured in KB or tens of KB, and this will be insignificant compared to the overall memory footprint of your server application.
I prefer the first approach because a single Configuration object is initialized just once, and then it is accessed via read-only lookup()-style operations. This means you don't have to worry about synchronizing access to the Configuration object, even if your server application is multi-threaded. In contrast, the second approach requires you to synchronize access to the HashMap if your server application is multi-threaded.
The overhead of a lookup()-style operation is in the order of, say, nanoseconds or microseconds, while the overhead of parsing a configuration file is in the order of, say, milliseconds or tens of milliseconds (depending on the size of the file). The first approach performs that relatively expensive parsing of a configuration file only once, and that is done in the initialization of the application. In contrast, the second approach performs that relatively expensive parsing of a configuration file "N" times (once for each of "N" users), and that repeated expense occurs while the server is processing requests from clients. That performance hit may or may not be an issue for your application.
I think ease of use is more important than ease of implementation. So, if you feel that the second approach will make it easier to maintain the configuration file, then I suggest you use that approach.
In the second approach, you may wonder why I put most of the variables in a named scope (settings) rather than in the global scope along with the "injected" user variable. I did that for a reason that is outside the scope of your question: separating the "injected" variables from the application-visible variables makes it easier to perform schema validation on the application-visible variables.
Normally user profiles go into a DB and the user must open a session with a login. The user name may go into the HttpSession (via cookies), and on every request the server can get the user name and read the profile from the DB. Sure, the "DB" can be some config files like joe.properties, jim.properties, admin.properties, etc.

Separate INFO and ERROR logs from java.util.logging

I'm configuring the logging for a Java application. What I'm aiming for is two logs: one for all messages and one for just messages above a certain level.
The app uses the java.util.logging.* classes: I'm using it as is, so I'm limited to configuration through a logging.properties file.
I don't see a way to configure two FileHandlers differently: the docs and examples I've seen set properties like:
java.util.logging.FileHandler.level = INFO
While I want two different Handlers logging at different levels to different files.
Any suggestions?
http://java.sun.com/j2se/1.4.2/docs/guide/util/logging/overview.html is helpful. You can only set one Level for any individual logger (as you can tell from the setLevel() method on the logger). However, you can take the lowest of the two common levels, and then filter programmatically.
Unfortunately, you can't do this just with the configuration file. To switch with just the configuration file you would have to switch to something like log4j, which you've said isn't an option.
So I would suggest altering the logging in code, with Filters, with something like this:
import java.util.logging.Filter;
import java.util.logging.Level;
import java.util.logging.LogRecord;

class LevelFilter implements Filter {
    private final Level level;

    public LevelFilter(Level level) {
        this.level = level;
    }

    @Override
    public boolean isLoggable(LogRecord record) {
        // Passes only records whose level is above the configured threshold.
        return level.intValue() < record.getLevel().intValue();
    }
}
And then on the second handler, do setFilter(new LevelFilter(Level.INFO)) or whatever. If you want it file configurable you could use a logging properties setting you've made up yourself, and use the normal Properties methods.
I think the configuration code for setting up the two file handlers and the programmatic code is fairly simple once you have the design, but if you want more detail add a comment and I'll edit.
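For illustration, a sketch of that setup using the LevelFilter above (the file names and the INFO threshold are assumptions):
import java.io.IOException;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogSetup {
    public static void configure() throws IOException {
        Logger root = Logger.getLogger(""); // the root logger
        root.setLevel(Level.ALL); // let the handlers decide what to keep

        FileHandler allHandler = new FileHandler("all.log");
        allHandler.setLevel(Level.ALL); // everything goes to the full log

        FileHandler importantHandler = new FileHandler("important.log");
        importantHandler.setFilter(new LevelFilter(Level.INFO)); // only records above INFO

        root.addHandler(allHandler);
        root.addHandler(importantHandler);
    }
}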
I think you should be able to just subclass a handler and then override the methods to allow output to go to multiple files depending on the level of the message. This would be done by overriding the publish() method.
Alternatively, if you have to use the system-provided FileHandler, you could do a setFilter() on it to inject your own filter into the mix and, in that filter code, send ALL messages to your other file and return true if the LogRecord level is INFO or higher, causing FileHandler.publish() to write it to the real file.
I'm not sure this is the way you should be using filters but I can't see why it won't work.
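A rough sketch of that filter idea (the secondary handler and the INFO threshold are assumptions):
import java.util.logging.Filter;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;

public class TeeFilter implements Filter {
    private final Handler allMessagesHandler; // e.g. a FileHandler for the "everything" file

    public TeeFilter(Handler allMessagesHandler) {
        this.allMessagesHandler = allMessagesHandler;
    }

    @Override
    public boolean isLoggable(LogRecord record) {
        allMessagesHandler.publish(record); // every record also goes to the secondary file
        // Only INFO and above make it into the handler this filter is attached to.
        return record.getLevel().intValue() >= Level.INFO.intValue();
    }
}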
