Timestamp issue in log4j - java

What time do we actually see in logs written with log4j? Is it the time at which we call the logging method, or the time at which the message is actually written to the log file on disk? Consider that there is some load on the system.

My answer is valid for log4j 1.2 only.
The log entry contains the time at which logger.log(..) is called, not the time at which the message is written to disk.
For a better understanding, check the log4j source code. Every logger.log(..) call creates a new LoggingEvent instance, and the LoggingEvent constructor captures the timestamp; that value is what the appender later writes to disk.
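You can see this for yourself with a minimal sketch (assuming log4j 1.2 configured with a file appender and a PatternLayout containing %d): the %d timestamp in the written line reflects the moment logger.info(...) was called, even if the appender only flushes the line to disk later.

import org.apache.log4j.Logger;

public class TimestampDemo {
    private static final Logger logger = Logger.getLogger(TimestampDemo.class);

    public static void main(String[] args) throws InterruptedException {
        // The LoggingEvent and its timestamp are created here, at call time...
        logger.info("event created now");
        // ...so even if a buffered or slow appender writes the line to disk
        // seconds later, %d in the layout still shows the time of this call.
        Thread.sleep(5000);
    }
}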


Hadoop Serialization and De-Serialization

The file I need to process is stored in HDFS in a binary stream format.
Now I have to do some processing over the file using MapReduce.
The input file is split into a number of blocks (the file is still in its original format when it reaches the input split).
My question is: when does the de-serialization occur?
I have the Writable interface implemented in my code, and it has two methods, readFields and write. Are these methods responsible for de-serialization and serialization of the actual data stored in HDFS?
If yes, could you please explain the flow of data?
I've been stuck on this concept for the whole day. Please help.
Serialization occurs in the write method of the Context object during the map phase. When your code calls context.write(key, value) with your own object as the value, serialization starts. Once the map output has been written to the local disk, the shuffle and sort phase comes into the picture; there the intermediate output is processed by the framework, and that is where de-serialization happens (via readFields()). So you can see the serialized data right after the mapper.
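As an illustration, here is a minimal sketch of a custom value class (the class and field names are made up, not taken from your code): write() is invoked when the framework serializes the object, for example after context.write(key, record) in the mapper, and readFields() is invoked when the framework reconstructs the object, for example on the reduce side after shuffle and sort.

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class MyRecord implements Writable {

    private long id;
    private String payload;

    @Override
    public void write(DataOutput out) throws IOException {
        // Serialization: write the fields in a fixed order.
        out.writeLong(id);
        out.writeUTF(payload);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // De-serialization: read the fields back in exactly the same order.
        id = in.readLong();
        payload = in.readUTF();
    }
}

Note that keys additionally need to implement WritableComparable so the framework can sort them during shuffle and sort.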

Prevent request to get data from json file while JSON file updating

We have seen a lot of applications that work with a JSON file, but I have a case study for which I want a solution.
Let us see...
An app works with a JSON file, receives requests from millions of users, and completes thousands of requests every second.
The JSON file is updated from an admin panel every minute, every second, or at some specific interval.
What will the behaviour be when a request to read the JSON file arrives while the file is open for update by the admin at the same time? (I have read that the JSON file will be fetched in read-only mode.)
Suppose the JSON file is being written by some script and the write takes a third of a second: what happens to a read that arrives when only 50% of the file has been updated?
Will the reader only get the new content once the write has completed, or could it see the partially updated file?
Don't bother with locking, just use rename().
Assuming you're running on an OS where a rename() is an atomic operation, create a new file, say "/data/file/name.json.new", then when that's complete, rename the file. In C that would look like this:
rename( "/data/file/name.json.new", "/data/file/name.json" );
This way, any process opening "/data/file/name.json" will always see a consistent data file.
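The same idea in Java, as a sketch (the paths are just the ones from the example above): write the complete new content to a temporary file on the same filesystem, then atomically swap it into place.

import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

public class AtomicJsonUpdate {

    public static void updateJson(String newContent) throws IOException {
        Path target = Paths.get("/data/file/name.json");
        // The temporary file must live on the same filesystem as the target,
        // otherwise the move cannot be atomic.
        Path temp = Paths.get("/data/file/name.json.new");

        // Write the full new content first.
        Files.write(temp, newContent.getBytes(StandardCharsets.UTF_8));

        // Atomically replace the old file; readers always see either the old
        // or the new version, never a half-written one (this maps to rename()
        // on POSIX systems).
        Files.move(temp, target,
                StandardCopyOption.REPLACE_EXISTING,
                StandardCopyOption.ATOMIC_MOVE);
    }
}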
Practically, from what you describe, you want a service that applies operations on a file server-side.
You should, however, avoid taking on the responsibility of Creating, Reading, Updating and Deleting (CRUD) yourself, because you will have trouble preserving properties such as Atomicity, Consistency, Isolation and Durability (ACID), while there are systems that do that for you: Database Management Systems.
In simple words, scenarios like the one you describe should be the responsibility of a DBMS, not yours.
You probably need a NoSQL DBMS that is responsible for the CRUD operations on your database, which can be file-based, in JSON format or other forms, and preserves ACID always (or almost always, but that is something you will learn as you research it). MongoDB is a great example of such a system.
Because you mentioned JSON, take into consideration that transferring data and storing it are two different stories. I suggest you use the JSON format for requests and responses, but explore other options for storage. For instance, even a relational DBMS that uses SQL could be good for you; it always depends on your needs. You might just need to encode and decode the data in JSON wherever it is received from or sent to each client.
Take a look here for more info.

Including log file name in log entry in Log4j

I have a requirement to include the name of the log file in the log entry itself.
For example, say the final name of the log file is something like trx_log.2014-09-22-12-42; the log entries I print to that file should contain that same name. The following is an example log entry.
123456|test value|xyz|trx_log.2014-09-22-12-42
I'm using Log4j's DailyRollingFileAppender to write the log at the moment. Is there a way I can implement this requirement with some log4j/logback configuration?
Not that I'm aware of.
But a solution does exist nevertheless: write your own custom extension of DailyRollingFileAppender.
Please note, though, that the file name will be available only to your custom appender. If you want to use that information in another appender (the only use case I can think of where this might be useful), you need a more convoluted solution using some shared data storage (shared memory, file system, database, whatever), the simplest option being a static member of your new appender. In that case the other appender (let's say Console) needs to be extended as well in order to append the new information to the log statement. A minimal sketch of the static-member idea follows below.
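For example, a sketch of that static-member approach (the class and method names are illustrative, not part of log4j): the appender records the file it is currently writing to, and anything that builds a log message can read it back.

import java.io.IOException;
import org.apache.log4j.DailyRollingFileAppender;

public class FileNameTrackingAppender extends DailyRollingFileAppender {

    // Simplest shared storage: a static member of the custom appender.
    private static volatile String currentFileName = "";

    public static String getCurrentFileName() {
        return currentFileName;
    }

    @Override
    public synchronized void setFile(String fileName, boolean append,
                                     boolean bufferedIO, int bufferSize) throws IOException {
        super.setFile(fileName, append, bufferedIO, bufferSize);
        currentFileName = fileName;
    }
}

Note that DailyRollingFileAppender writes to the base file name and only renames the file to its dated form on rollover, so to reproduce a name like trx_log.2014-09-22-12-42 you would also have to append the configured date pattern yourself.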
Use this method logger.getName()
logger.log(Level.SEVERE,"Exception in "+e.getMessage()+logger.getName());

Camel route picking up file before ftp is complete

I have a customer who FTPs a file over to our server. I have a route defined to select certain files from this directory and move them to a different directory to be processed. The problem is that the route picks the file up as soon as it sees it and doesn't wait until the FTP transfer is complete. The result is a 0-byte file in the path described in the to URI. I have tried each of the readLock options (markerFile, rename, changed, fileLock) but none have worked. I am using Spring DSL to define my Camel routes. Here is an example of one that is not working. The Camel version is 2.10.0.
<route>
    <from uri="file:pathName?initialDelay=10s&amp;move=ARCHIVE&amp;sortBy=ignoreCase:file:name&amp;readLock=fileLock&amp;readLockCheckInterval=5000&amp;readLockTimeout=10m&amp;filter=#FileFilter" />
    <to uri="file:pathName/newDirectory/" />
</route>
Any help would be appreciated. Thanks!
Just to note: at one point this route was running on a different server and I had to FTP the file to another server that processed it. When I was using the FTP component in Camel, that route worked fine; that is, it did wait until the file was fully received before doing the FTP. I had the same options defined on my route. That's why I think there should be a way to do this, since the FTP component reuses the file component options in Camel.
I took @PeteH's suggestion #2 below and did the following. I am still hoping there is another way, but this works.
I added the following method, which returns a Date that is the current time minus x seconds:
public static Date getDateMinusSeconds(Integer seconds) {
    Calendar cal = Calendar.getInstance();
    cal.add(Calendar.SECOND, seconds);
    return cal.getTime();
}
Then, within my filter, I check whether the initial filtering is true. If it is, I compare the file's last modified date to getDateMinusSeconds(). I return false from the filter if the comparison is true.
if (filter) {
    if (new Date(pathname.getLastModified()).after(DateUtil.getDateMinusSeconds(-30))) {
        return false;
    }
}
I have not done any of this in your environment, but have had this kind of problem before with FTP.
The better option of the two I can suggest is if you can get the customer to send two files. File1 is their data, File2 can be anything. They send them sequentially. You trap when File2 arrives, but all you're doing is using it as a "signal" that File1 has arrived safely.
The less good option (and this is the one we ended up implementing because we couldn't control the files being sent) is to write your code such that you refuse to process any file until its last modified timestamp is at least x minutes old. I think we settled on 5 minutes. This is pretty horrible since you're essentially firing, checking, sleeping, checking etc. etc.
But the problem you describe is quite well known with FTP. Like I say, I don't know whether either of these approaches will work in your environment, but certainly at a high level they're sound.
The Camel FTP component inherits its options from the file component, whose documentation describes this very issue at the top:
Beware the JDK File IO API is a bit limited in detecting whether another application is currently writing/copying a file. The implementation can also differ depending on the OS platform. This could lead Camel to think the file is not locked by another process and to start consuming it. Therefore you have to do your own investigation of what suits your environment. To help with this, Camel provides different readLock options and a doneFileName option that you can use. See also the section "Consuming files from folders where others drop files directly".
To get around this problem I had my publishers put out a "done" file; that solved it for us. Camel can be told to wait for such a marker, as sketched below.
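If you can get the sender to drop such a marker, the file component supports this directly via the doneFileName option. A sketch in Java DSL (the directory and marker name are illustrative; a fixed marker name is the simplest form, and dynamic placeholders based on the data file's name are also supported, so check the file component docs for your Camel version):

import org.apache.camel.builder.RouteBuilder;

public class DoneFileRoute extends RouteBuilder {
    @Override
    public void configure() {
        // The data file is only picked up once the "ready.done" marker exists
        // in the same directory.
        from("file:pathName?doneFileName=ready.done&move=ARCHIVE")
            .to("file:pathName/newDirectory/");
    }
}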
Another way is to use a watcher that triggers the job once a file is deposited, and to delay consuming the file by a significant amount of time, to be reasonably sure that its upload has finished.
from("file-watch://{{ftp.file_input}}?events=CREATE&recursive=false")
.id("FILE_WATCHER")
.log("File event: ${header.CamelFileEventType} occurred on file ${header.CamelFileName} at ${header.CamelFileLastModified}")
.delay(20000)
.to("direct:file_processor");
from("direct:file_processor")
.id("FILE_DISPATCHER")
.log("Sending To SFTP Uploader")
.to("sftp://{{ftp.user}}#{{ftp.host}}:{{ftp.port}}//upload?password={{ftp.password}}&fileName={{file_pattern}}-${date:now:yyyyMMdd-HH:mm}.csv")
.log("File sent to SFTP");
It's never too late to respond.
Hope this helps someone struggling in the deepest, creepiest corners of the SFTP world...

Obtain last 500 logged messages

We are using the SLF4J/Logback combination for our logging. One of our requirements is that if anything fails, we send an email to the support/dev group with the last 500 logged messages.
I tried going through the documentation, but haven't found anything relevant.
One approach I can think of is to obtain the current log file name, read the file, and send the last 500 records. But I don't know how to get the current log file name. Does anyone know how? Or is there a better option for retrieving the log tail?
Thanks
It sounds like Log4j's SMTPAppender has the features you require, and Logback ships an equivalent ch.qos.logback.classic.net.SMTPAppender, so you should not need to write one from scratch.
Essentially, this email appender keeps a cyclic (ring) buffer of recent log events (256 by default in Logback, and the size is configurable). When a triggering event occurs (by default, an event at ERROR level or worse), the buffer is flushed into an email and sent.
Create a custom Appender that caches the last 500 log messages. You can extend the SMTPAppender to send the email, reading the contents from this cache. A sketch of such a cache appender follows.
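A minimal sketch of that caching idea (the class is illustrative, not a Logback built-in): a Logback appender that keeps only the most recent 500 formatted messages in memory, which the email-sending code can snapshot when a failure is detected.

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

import ch.qos.logback.classic.spi.ILoggingEvent;
import ch.qos.logback.core.AppenderBase;

public class LastMessagesAppender extends AppenderBase<ILoggingEvent> {

    private static final int MAX_MESSAGES = 500;
    private final Deque<String> buffer = new ArrayDeque<>(MAX_MESSAGES);

    @Override
    protected synchronized void append(ILoggingEvent event) {
        if (buffer.size() == MAX_MESSAGES) {
            buffer.removeFirst(); // drop the oldest entry to keep only 500
        }
        buffer.addLast(event.getFormattedMessage());
    }

    // Called by whatever code assembles the support email.
    public synchronized List<String> snapshot() {
        return new ArrayList<>(buffer);
    }
}

The appender is registered in logback.xml like any other appender and attached to the root logger.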
start here
