Logging access to a Java servlet

I'm currently developing a Java Web application using Servlets. What I need to do is log to a file every access made to the website. To do that, I used Filters. So far, I've made it to the point where I can print everything to the console.
What I now need to do is store that in a file holding a maximum of 10,000 entries, each at most 30 days old (once the maximum number of entries is reached, the oldest ones are replaced as new ones are written).
How can I do that?
P.S: I cannot use a database for this assignment
Edit: I am not using a web framework. I can use logging frameworks.

So, this question prompted me to investigate whether any of the popular logging frameworks can actually do the task as requested.
While most do rolling logs based on file size and date/time, none of them offers an easy way to roll based on the number of entries in the log file. Existing logging frameworks also typically store each day (and sometimes smaller units of time) in its own separate file, which makes cleanup based on date/time efficient.
A requirement for a maximum number of lines inside a single file forces you to read the entire file into memory (very inefficient!). And when everything, past and present, is written to a single file, removing older entries requires parsing each line for the date/time that entry was written (also inefficient!).
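For comparison, the closest off-the-shelf behaviour is size-and-count rotation; a minimal sketch using java.util.logging (the 1 MB limit, five-file count, and file pattern are illustrative assumptions, and note this bounds the log by size, not by entry count or age):
import java.util.logging.FileHandler;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

public class AccessLogSetup
{
    public static Logger createAccessLogger() throws Exception
    {
        // Rotate across five files of at most 1 MB each ("access0.log" ... "access4.log");
        // once all are full, the oldest generation is discarded on the next rotation.
        final FileHandler handler = new FileHandler("access%g.log", 1_000_000, 5, true);
        handler.setFormatter(new SimpleFormatter());

        final Logger logger = Logger.getLogger("access");
        logger.addHandler(handler);
        return logger;
    }
}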
Below is a simple program to demonstrate that this can be done, but there are some serious problems with this approach:
Not thread safe (if two threads try to read/write an entry simultaneously, one will be clobbered and the message will be skipped)
Slurping is bad (ten thousand entries is a lot: can the server slurp all that into memory?)
This is probably suitable for a toy project, a demonstration, or a school assignment.
This is NOT suitable for production applications, or really anything on the web that more than one person is going to use at a time.
In short, if you try to use a handcrafted program that you found on the internet for a mission-critical application that other people depend on, you are going to get exactly what you deserve.
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.Writer;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.stream.Stream;

public static void main(final String[] args) throws Exception
{
    final File logFile = new File("C:/", "EverythingInOneBigGiant.log");
    final int maxDays = 30;
    final int maxEntries = 10000;

    while (true)
    {
        // Just log the current time for this example; it also makes parsing real simple
        final String msg = Instant.now().toString();
        slurpAndParse(logFile, msg, maxDays, maxEntries);

        // Wait a moment before writing another entry
        Thread.sleep(750);
    }
}

private static void slurpAndParse(final File f, final String msg, final int maxDays, final int maxEntries)
    throws Exception
{
    // Slurp the entire file into this buffer (possibly very large!)
    // Could crash your server if you run out of memory
    final StringBuffer sb = new StringBuffer();

    if (f.exists() && f.isFile())
    {
        final LocalDateTime now = LocalDateTime.now();

        // Count lines first, so we know where the newest entries begin
        final long totalLineCount;
        try (final Stream<String> lines = Files.lines(Paths.get(f.getAbsolutePath())))
        {
            totalLineCount = lines.count();
        }
        final long startAtLine = (totalLineCount < maxEntries ? 0 : (totalLineCount - maxEntries) + 1);

        long currentLineCount = 0;
        try (final BufferedReader br = new BufferedReader(new FileReader(f)))
        {
            String line;
            while (null != (line = br.readLine()))
            {
                // Ignore all lines before the start counter
                if (currentLineCount < startAtLine)
                {
                    ++currentLineCount;
                    continue;
                }

                // Parsing log data... while writing to the same log... ugh... how hideous
                final LocalDateTime lineDate = LocalDateTime.parse(line, DateTimeFormatter.ISO_ZONED_DATE_TIME);
                final Duration timeBetween = Duration.between(lineDate, now);

                // ... or maybe just use Math.abs() here? I defer to the date/time buffs
                final long dayDiff = (timeBetween.isNegative() ? timeBetween.negated() : timeBetween).toDays();

                // Only accept lines less than the max age in days
                if (dayDiff <= maxDays)
                {
                    sb.append(line);
                    sb.append(System.lineSeparator());
                }
            }
        }
    }

    System.out.println(msg);

    // Add the new log entry
    sb.append(msg);
    sb.append(System.lineSeparator());

    writeLog(f, sb.toString());
}

private static void writeLog(final File f, final String content) throws IOException
{
    try (final Writer out = new FileWriter(f))
    {
        out.write(content);
    }
}
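If something like this were wired into a servlet Filter, the file access would at the very least need to be serialized across requests. A minimal sketch, assuming slurpAndParse above is made accessible to the filter class (the class name, lock object, and hard-coded limits are illustrative; the Filter lifecycle methods are omitted):
import java.io.File;
import java.io.IOException;
import java.time.Instant;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;

public class AccessLogFilter implements Filter
{
    private static final Object LOCK = new Object();
    private final File logFile = new File("C:/", "EverythingInOneBigGiant.log");

    @Override
    public void doFilter(final ServletRequest req, final ServletResponse res, final FilterChain chain)
        throws IOException, ServletException
    {
        // Serialize all log updates so concurrent requests do not clobber each other's entries
        synchronized (LOCK)
        {
            try
            {
                slurpAndParse(logFile, Instant.now().toString(), 30, 10000);
            }
            catch (final Exception e)
            {
                throw new ServletException(e);
            }
        }
        chain.doFilter(req, res);
    }

    // init(FilterConfig) and destroy() omitted; older servlet-api versions require them explicitly
}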

Related

Write multiple files with same string without hanging the UI

I am working on an Android app that changes the CPU frequency when the foreground app changes. The frequencies for each foreground app are defined in my application itself. But while changing the frequencies, my app has to open multiple system files and replace the frequency with my text. This makes my UI slow, and when I change apps continuously it makes the SystemUI crash. What can I do to write these multiple files all together at the same time?
I have tried using AsyncTaskLoader but that too crashes the SystemUI later.
public static boolean setFreq(String max_freq, String min_freq) {
    ByteArrayInputStream inputStream = new ByteArrayInputStream(max_freq.getBytes(Charset.forName("UTF-8")));
    ByteArrayInputStream inputStream1 = new ByteArrayInputStream(min_freq.getBytes(Charset.forName("UTF-8")));
    SuFileOutputStream outputStream;
    SuFileOutputStream outputStream1;
    try {
        if (max_freq != null) {
            int cpus = 0;
            while (true) {
                SuFile f = new SuFile(CPUActivity.MAX_FREQ_PATH.replace("cpu0", "cpu" + cpus));
                SuFile f1 = new SuFile(CPUActivity.MIN_FREQ_PATH.replace("cpu0", "cpu" + cpus));
                outputStream = new SuFileOutputStream(f);
                outputStream1 = new SuFileOutputStream(f1);
                ShellUtils.pump(inputStream, outputStream);
                ShellUtils.pump(inputStream1, outputStream1);
                if (!f.exists()) {
                    break;
                }
                cpus++;
            }
        }
    } catch (Exception ex) {
    }
    return true;
}
I assume SuFile and SuFileOutputStream are your custom implementations extending Java File and FileOutputStream classes.
A couple of points need to be fixed first (a combined sketch follows this list):
The f.exists() check should come before initializing the OutputStream; otherwise opening the stream creates the file before the existence check runs, which turns your while loop into an infinite loop.
As #Daryll suggested, bound the loop with the number of CPUs; I suggest a for loop.
Close your streams after the pump(..) call.
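Putting those fixes together, the loop might look roughly like the sketch below. SuFile, SuFileOutputStream and ShellUtils are the same classes used in the question; the availableProcessors() call is an assumption for determining the CPU count, and fresh input streams are created per CPU because a ByteArrayInputStream is consumed by the first pump(..) call:
public static boolean setFreq(String max_freq, String min_freq) {
    if (max_freq == null || min_freq == null) {
        return false;
    }
    // Determine the CPU count up front instead of probing with exists() forever
    int cpus = Runtime.getRuntime().availableProcessors();
    try {
        for (int cpu = 0; cpu < cpus; cpu++) {
            SuFile f = new SuFile(CPUActivity.MAX_FREQ_PATH.replace("cpu0", "cpu" + cpu));
            SuFile f1 = new SuFile(CPUActivity.MIN_FREQ_PATH.replace("cpu0", "cpu" + cpu));
            // Check existence before opening streams, so nothing gets created by accident
            if (!f.exists() || !f1.exists()) {
                break;
            }
            // Fresh input streams each iteration; streams are closed automatically
            try (ByteArrayInputStream maxIn = new ByteArrayInputStream(max_freq.getBytes(Charset.forName("UTF-8")));
                 ByteArrayInputStream minIn = new ByteArrayInputStream(min_freq.getBytes(Charset.forName("UTF-8")));
                 SuFileOutputStream maxOut = new SuFileOutputStream(f);
                 SuFileOutputStream minOut = new SuFileOutputStream(f1)) {
                ShellUtils.pump(maxIn, maxOut);
                ShellUtils.pump(minIn, minOut);
            }
        }
    } catch (Exception ex) {
        return false;
    }
    return true;
}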
If you want to keep the main thread free, you can do something like this code segment:
public static void setFreq(final String max_freq, final String min_freq) {
    new Thread(new Runnable() {
        @Override
        public void run() {
            // Put all the stuff here
        }
    }).start();
}
This should solve your problem.
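If the foreground app can change in rapid succession, a single background worker is usually preferable to spawning a new Thread per call; a minimal sketch using java.util.concurrent (the executor field and method name are illustrative):
private static final ExecutorService FREQ_WRITER = Executors.newSingleThreadExecutor();

public static void setFreqAsync(final String max_freq, final String min_freq) {
    // Queue the file writes on one long-lived background thread so the UI thread never blocks
    FREQ_WRITER.submit(new Runnable() {
        @Override
        public void run() {
            setFreq(max_freq, min_freq);
        }
    });
}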
Determine the number of CPUs beforehand and use that number in your loop, rather than using while (true) and having to call SuFile.exists() every cycle.
I don't know what SuFileOutputStream is but you may need to close those file output streams or find a faster way to write the file if that implementation is too slow.

Java, why reading from MappedByteBuffer is slower than reading from BufferedReader

I tried to read lines from a file which may be large.
To get better performance, I tried to use a memory-mapped file. But when I compare the performance, I find that the mapped-file approach is even a little slower than reading with a BufferedReader.
public long chunkMappedFile(String filePath, int trunkSize) throws IOException {
    long begin = System.currentTimeMillis();
    logger.info("Processing imei file, mapped file [{}], trunk size = {} ", filePath, trunkSize);

    // Create file object
    File file = new File(filePath);

    // Get file channel in read-only mode
    FileChannel fileChannel = new RandomAccessFile(file, "r").getChannel();

    long positionStart = 0;
    StringBuilder line = new StringBuilder();
    long lineCnt = 0;
    while (positionStart < fileChannel.size()) {
        long mapSize = positionStart + trunkSize < fileChannel.size() ? trunkSize : fileChannel.size() - positionStart;
        MappedByteBuffer buffer = fileChannel.map(FileChannel.MapMode.READ_ONLY, positionStart, mapSize); // mapped read
        for (int i = 0; i < buffer.limit(); i++) {
            char c = (char) buffer.get();
            // System.out.print(c); // Print the content of file
            if ('\n' != c) {
                line.append(c);
            } else { // line ends
                processor.processLine(line.toString());
                if (++lineCnt % 100000 == 0) {
                    try {
                        logger.info("mappedfile processed {} lines already, sleep 1ms", lineCnt);
                        Thread.sleep(1);
                    } catch (InterruptedException e) {}
                }
                line = new StringBuilder();
            }
        }
        closeDirectBuffer(buffer);
        positionStart = positionStart + buffer.limit();
    }
    long end = System.currentTimeMillis();
    logger.info("chunkMappedFile {} , trunkSize: {}, cost : {} ", filePath, trunkSize, end - begin);
    return lineCnt;
}
public long normalFileRead(String filePath) throws IOException {
    long begin = System.currentTimeMillis();
    logger.info("Processing imei file, Normal read file [{}] ", filePath);
    long lineCnt = 0;
    try (BufferedReader br = new BufferedReader(new FileReader(filePath))) {
        String line;
        while ((line = br.readLine()) != null) {
            processor.processLine(line);
            if (++lineCnt % 100000 == 0) {
                try {
                    logger.info("file processed {} lines already, sleep 1ms", lineCnt);
                    Thread.sleep(1);
                } catch (InterruptedException e) {}
            }
        }
    }
    long end = System.currentTimeMillis();
    logger.info("normalFileRead {} , cost : {} ", filePath, end - begin);
    return lineCnt;
}
Test result on Linux, reading a 537 MB file:
MappedBuffer way:
2017-09-28 14:33:19.277 [main] INFO com.oppo.push.ts.dispatcher.imei2device.ImeiTransformerOfflineImpl - process imei file ends:/push/file/imei2device-local/20170928/imei2device-13 , lines :12758858 , cost :14804 , lines per seconds: 861852.0670089165
BufferedReader way:
2017-09-28 14:27:03.374 [main] INFO com.oppo.push.ts.dispatcher.imei2device.ImeiTransformerOfflineImpl - process imei file ends:/push/file/imei2device-local/20170928/imei2device-13 , lines :12758858 , cost :13001 , lines per seconds: 981375.1249903854
That is the thing: file I/O isn't straightforward and easy.
You have to keep in mind that your operating system has a huge impact on what exactly is going to happen. In that sense: there are no solid rules that would work for all JVM implementations on all platforms.
When you really have to worry about the last bit of performance, doing in-depth profiling on your target platform is the primary solution.
Beyond that, you are getting that "performance" aspect wrong. Meaning: memory mapped IO doesn't magically increase the performance of reading a single file within an application once. Its major advantages go along this path:
mmap is great if you have multiple processes accessing data in a read only fashion from the same file, which is common in the kind of server systems I write. mmap allows all those processes to share the same physical memory pages, saving a lot of memory.
( quoted from this answer on using the C mmap() system call )
In other words: your example is about reading a file's contents. In the end, the OS still has to turn to the drive to read all the bytes from there. Meaning: it reads disc content and puts it in memory. When you do that the first time... it really doesn't matter that you do some "special" things on top of that. To the contrary, because you do "special" things, the memory-mapped approach might even be slower, owing to the overhead compared to an "ordinary" read.
And coming back to my first point: even if you had 5 processes reading the same file, the memory-mapped approach isn't necessarily faster. As Linux might figure: I already read that file into memory, and it didn't change, so even without explicit "memory mapping" the Linux kernel might cache the information.
The memory mapping doesn't really give any advantage, since even though you're bulk loading a file into memory, you're still processing it one byte at a time. You might see a performance increase if you processed the buffer in suitably sized byte[] chunks. Even then the BufferedReader version may perform better or at least almost the same.
The nature of your task is to process a file sequentially. BufferedReader already does this very well and the code is simple, so if I had to choose I'd go with the simplest option.
Also note that your buffer code doesn't work except for single byte encodings. As soon as you get multiple bytes per character, it will fail magnificently.
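For example, the inner loop of chunkMappedFile could copy the mapped region out in blocks and scan those, reusing the question's line and processor variables; a rough sketch (the 8 KB block size is an arbitrary illustrative choice, and it still assumes a single-byte encoding):
byte[] block = new byte[8192];
while (buffer.hasRemaining()) {
    int n = Math.min(block.length, buffer.remaining());
    buffer.get(block, 0, n);              // bulk copy out of the mapped region
    for (int i = 0; i < n; i++) {
        char c = (char) block[i];         // still single-byte decoding, as in the original
        if (c == '\n') {
            processor.processLine(line.toString());
            line = new StringBuilder();
        } else {
            line.append(c);
        }
    }
}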
GhostCat is correct. In addition to your OS choice, here are other things that can affect performance:
Mapping a file will place greater demand on physical memory. If physical memory is "tight", that could cause paging activity and a performance hit.
The OS could use a different read-ahead strategy if you read a file using read syscalls versus mapping it into memory. Read-ahead (into the buffer cache) can make file reading a lot faster.
The default buffer size for BufferedReader and the OS memory page size are likely to be different. This may result in the size of disk read requests being different. (Larger reads often result in greater I/O throughput, at least up to a point.) A one-line sketch of setting the buffer size explicitly follows below.
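For instance, the buffer size can be passed explicitly to the BufferedReader constructor; the 64 KB figure here is an arbitrary illustrative choice:
BufferedReader br = new BufferedReader(new FileReader(filePath), 64 * 1024);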
There could also be "artefacts" caused by the way that you benchmark. For example:
The first time you read a file, a copy of some or all of the file will land in the buffer cache (in memory)
The second time you read the same file, parts of it may still be in memory, and the apparent read time will be shorter.

Fastest way to read a large XML file in Java

I'm working on a Java project to optimize existing code. Currently I'm using BufferedReader/FileInputStream to read the content of an XML file as a String in Java.
But my question is: is there any faster way to read XML content? Are SAX/DOM faster than BufferedReader/FileInputStream?
Need help regarding the above issue.
Thanks in advance.
I think that your code shown in the other question is faster than DOM-like parsers, which would definitely require more memory and likely some computation in order to reconstruct the document in full. You may want to profile the code, though.
I also think that your code could be tidied up a bit for streaming processing if you used the javax.xml.stream XMLStreamReader, which I have found quite helpful for many tasks. That class "... is designed to be the lowest level and most efficient way to read XML data", according to Oracle.
Here is the excerpt from my code where I parse StackOverflow users XML file distributed as a public data dump:
// the input file location
private static final String fileLocation = "/media/My Book/Stack/users.xml";

// the target elements
private static final String USERS_ELEMENT = "users";
private static final String ROW_ELEMENT = "row";

// get the XML file handler
//
FileInputStream fileInputStream = new FileInputStream(fileLocation);
XMLStreamReader xmlStreamReader = XMLInputFactory.newInstance().createXMLStreamReader(
        fileInputStream);

// reading the data
//
while (xmlStreamReader.hasNext()) {
    int eventCode = xmlStreamReader.next();

    // this triggers _users records_ logic
    //
    if ((XMLStreamConstants.START_ELEMENT == eventCode)
            && xmlStreamReader.getLocalName().equalsIgnoreCase(USERS_ELEMENT)) {

        // read and parse the user data rows
        //
        while (xmlStreamReader.hasNext()) {
            eventCode = xmlStreamReader.next();

            // this breaks _users record_ reading logic
            //
            if ((XMLStreamConstants.END_ELEMENT == eventCode)
                    && xmlStreamReader.getLocalName().equalsIgnoreCase(USERS_ELEMENT)) {
                break;
            }
            else {
                if ((XMLStreamConstants.START_ELEMENT == eventCode)
                        && xmlStreamReader.getLocalName().equalsIgnoreCase(ROW_ELEMENT)) {

                    // extract the user data
                    //
                    User user = new User();
                    int attributesCount = xmlStreamReader.getAttributeCount();
                    for (int i = 0; i < attributesCount; i++) {
                        user.setAttribute(xmlStreamReader.getAttributeLocalName(i),
                                xmlStreamReader.getAttributeValue(i));
                    }

                    // all other user record-related logic
                    //
                }
            }
        }
    }
}
That users file format is quite simple and similar to your Bank.xml file:
<users>
<row Id="1567200" Reputation="1" CreationDate="2012-07-31T23:57:57.770" DisplayName="XXX" EmailHash="XXX" LastAccessDate="2012-08-01T00:55:12.953" Views="0" UpVotes="0" DownVotes="0" />
...
</users>
There are different parser options available.
Consider using a streaming parser, because the DOM may become quite big. I.e. either a push or a pull parser.
It's not as if XML parsers are necessarily slow. Consider your web browser. It does XML parsing all the time, and tries really hard to be robust to syntax errors. Usually, memory is the bigger issue.
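For contrast with the pull-style XMLStreamReader excerpt above, a push-style SAX sketch might look like this (the element and attribute names assume the users.xml layout shown earlier; the class name is illustrative):
import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class UsersSaxDemo {
    public static void main(String[] args) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new File("/media/My Book/Stack/users.xml"), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attributes) {
                // The parser pushes each element at us; only <row .../> entries are of interest here
                if ("row".equalsIgnoreCase(qName)) {
                    String id = attributes.getValue("Id");
                    String displayName = attributes.getValue("DisplayName");
                    // build the User object here, as in the pull-parser version
                }
            }
        });
    }
}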

How to read files with an offset from Hadoop using Java

Problem: I want to read a section of a file from HDFS and return it, such as lines 101-120 from a file of 1000 lines.
I don't want to use seek because I have read that it is expensive.
I have log files which I am using PIG to process down into meaningful sets of data. I've been writing an API to return the data for consumption and display by a front end. Those processed data sets can be large enough that I don't want to read the entire file out of Hadoop in one slurp to save wire time and bandwidth. (Let's say 5 - 10MB)
Currently I am using a BufferedReader to return small summary files, which is working fine:
ArrayList lines = new ArrayList();
...
for (FileStatus item : items) {
    // ignoring files like _SUCCESS
    if (item.getPath().getName().startsWith("_")) {
        continue;
    }

    in = fs.open(item.getPath());
    BufferedReader br = new BufferedReader(new InputStreamReader(in));
    String line;
    line = br.readLine();
    while (line != null) {
        line = line.replaceAll("(\\r|\\n)", "");
        lines.add(line.split("\t"));
        line = br.readLine();
    }
}
I've poked around the interwebs quite a bit as well as Stack but haven't found exactly what I need.
Perhaps this is completely the wrong way to go about doing it and I need a completely separate set of code and different functions to manage this. Open to any suggestions.
Thanks!
Added note, based on research from the discussions below:
How does Hadoop process records split across block boundaries?
Hadoop FileSplit Reading
I think seek is the best option for reading files with huge volumes. It did not cause any problems for me, as the volume of data I was reading was in the range of 2-3 GB. I have not run into any issues to date, but we did use file splitting to handle the large data set. Below is the code you can use for reading, to test it out.
public class HDFSClientTesting {

    /**
     * @param args
     */
    public static void main(String[] args) {
        // TODO Auto-generated method stub

        try {
            // System.loadLibrary("libhadoop.so");
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            conf.addResource(new Path("core-site.xml"));

            String Filename = "/dir/00000027";
            long ByteOffset = 3185041;

            SequenceFile.Reader rdr = new SequenceFile.Reader(fs, new Path(Filename), conf);
            Text key = new Text();
            Text value = new Text();

            rdr.seek(ByteOffset);
            rdr.next(key, value);

            // Plain text
            JSONObject jso = new JSONObject(value.toString());
            String content = jso.getString("body");
            System.out.println("\n\n\n" + content + "\n\n\n");

            File file = new File("test.gz");
            file.createNewFile();
        }
        catch (Exception e) {
            throw new RuntimeException(e);
        }
        finally {
        }
    }
}
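If the data is plain text rather than a SequenceFile, the same seek idea works on the FSDataInputStream returned by FileSystem.open; a rough sketch (the path, byte offset, and line count are illustrative, and the offset must be tracked in bytes, not line numbers):
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public static void printLinesFromOffset(String hdfsPath, long byteOffset, int maxLines) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    FSDataInputStream in = fs.open(new Path(hdfsPath));
    in.seek(byteOffset); // jump past the bytes you do not want to ship over the wire

    try (BufferedReader br = new BufferedReader(new InputStreamReader(in))) {
        // If the offset landed mid-line, the first line read here will be partial
        String line;
        int returned = 0;
        while ((line = br.readLine()) != null && returned < maxLines) {
            System.out.println(line);
            returned++;
        }
    }
}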

How to read and update row in file with Java

Currently I am creating a Java app with no database required,
which is why I am using a text file instead.
The structure of the file is like this:
unique6id username identitynumber point
unique6id username identitynumber point
How can I read the file, find the row matching a unique6id, and then update that row's points?
Sorry for the lack of information;
here is the code I have typed:
public class Cust {
    String name;
    long idenid, uniqueid;
    int pts;

    Cust() {}

    Cust(String n, long ide, long uni, int pt) {
        name = n;
        idenid = ide;
        uniqueid = uni;
        pts = pt;
    }
}

FileWriter fstream = new FileWriter("Data.txt", true);
BufferedWriter fbw = new BufferedWriter(fstream);

Cust newCust = new Cust();
newCust.name = memUNTF.getText();
newCust.idenid = Long.parseLong(memICTF.getText());
newCust.uniqueid = Long.parseLong(memIDTF.getText());
newCust.pts = points;

fbw.write(newCust.name + " " + newCust.idenid + " " + newCust.uniqueid + " " + newCust.pts);
fbw.newLine();
fbw.close();
This is the way I enter the data,
and the result inside Data.txt is:
spencerlim 900419129876 448505 0
Eugene 900419081234 586026 0
When the user types in 586026, it should grab Eugene's row,
bind it into a Cust,
and update the pts (0 in this case; try updating it to another number, e.g. 30).
Thx for reply =D
Reading is pretty easy, but updating a text file in-place (ie without rewriting the whole file) is very awkward.
So, you have two options:
Read the whole file, make your changes, and then write the whole file to disk, overwriting the old version; this is quite easy (a sketch follows below) and will be fast enough for small files, but is not a good idea for very large files.
Use a format that is not a simple text file. A database would be one option (and bear in mind that there is one, Derby, built into the JDK); there are other ways of keeping simple key-value stores on disk (like a HashMap, but in a file), but there's nothing built into the JDK.
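A minimal sketch of the first option using java.nio.file, with the column order taken from the sample rows shown in the question (name, identity number, unique id, points); error handling is omitted:
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public static void updatePoints(long uniqueId, int newPoints) throws Exception {
    Path file = Paths.get("Data.txt");
    List<String> lines = Files.readAllLines(file);

    for (int i = 0; i < lines.size(); i++) {
        String[] cols = lines.get(i).split(" ");
        // columns: name identitynumber unique6id points, as in the sample rows
        if (cols.length == 4 && Long.parseLong(cols[2]) == uniqueId) {
            cols[3] = String.valueOf(newPoints);
            lines.set(i, String.join(" ", cols));
            break;
        }
    }

    // rewrite the whole file with the updated row
    Files.write(file, lines);
}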
You can use OpenCSV with custom separators.
Here's a sample method that updates the info for a specified user:
public static void updateUserInfo(
        String userId,    // user id
        String[] values   // new values
) throws IOException {
    String fileName = "yourfile.txt.csv";
    CSVReader reader = new CSVReader(new FileReader(fileName), ' ');
    List<String[]> lines = reader.readAll();
    Iterator<String[]> iterator = lines.iterator();
    while (iterator.hasNext()) {
        String[] items = (String[]) iterator.next();
        if (items[0].equals(userId)) {
            for (int i = 0; i < values.length; i++) {
                String value = values[i];
                if (value != null) {
                    // for every array value that's not null,
                    // update the corresponding field
                    items[i + 1] = value;
                }
            }
            break;
        }
    }
    new CSVWriter(new FileWriter(fileName), ' ').writeAll(lines);
}
Use InputStream(s) and Reader(s) to read the file.
Here is a code snippet that shows how to read a file.
BufferedReader reader = new BufferedReader(new InputStreamReader(new FileInputStream("c:/myfile.txt")));
String line = null;
while ((line = reader.readLine()) != null) {
    // do something with the line.
}
Use OutputStream(s) and Writer(s) to write to the file. Although you can use random access files, i.e. write to a specific place in the file, I do not recommend doing this. A much easier and more robust way is to create a new file every time you have to write something. I know that is probably not the most efficient way, but you do not want to use a DB for some reason... If you have to save and update partial information relatively often and search within the file, I'd recommend using a DB after all. There are very lightweight implementations, including pure Java implementations (e.g. H2: http://www.h2database.com/html/main.html).
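If a lightweight embedded database does turn out to be acceptable, an H2 sketch might look like this (the database file name, table, and column names are illustrative and the table would need to be created first):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public static void updatePoints(long uniqueId, int newPoints) throws Exception {
    // Stores its data in ./customers.mv.db next to the application
    try (Connection conn = DriverManager.getConnection("jdbc:h2:./customers", "sa", "");
         PreparedStatement ps = conn.prepareStatement(
                 "UPDATE customer SET points = ? WHERE unique_id = ?")) {
        ps.setInt(1, newPoints);
        ps.setLong(2, uniqueId);
        ps.executeUpdate();
    }
}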
