Configure checkstyle for method chaining? - java

Our project contains many statements in the method chaining fluent style:
int totalCount = ((Number) em
.createQuery("select count(up) from UserPermission up where " +
"up.database.id = :dbId and " +
"up.user.id <> :currentUserId ")
.setParameter("dbId", cmd.getDatabaseId())
.setParameter("currentUserId", currentUser.getId())
.getSingleResult())
.intValue();
I've got checkstyle mostly configured to match our existing code style, but now it's failing on these snippets, preferring instead:
int totalCount = ((Number) em
.createQuery("select count(up) from UserPermission up where " +
"up.database.id = :dbId and " +
"up.user.id <> :currentUserId ")
.setParameter("dbId", cmd.getDatabaseId())
.setParameter("currentUserId", currentUser.getId())
.getSingleResult())
.intValue();
Which is totally inappropriate. Is there anyway to configure checkstyle to accept the method chaining style? Is there an alternate tool I can run from maven to enforce this kind of indentation?

I never made this work in Eclipse so we barely use Format Source. In the end it is often best to extend. We tried hard and failed. It was one and half year ago. In the end we use formatting text only in Eclipse by Selecting the line or to preformat before we format by hand.
Usually the formating done by a engineer carries a certain meaning. And so automatic format will never work. Especially if you do something like
public static void myMethod(
int value, String value2, String value3)
If you autoformat this it fails similar to your example.
So feel free to join the club of not using automatic formatting beside as a step before you format it the human way.

with intellij , it can be done by selecting "align when multiline" in case of "method chain calls" so i guess this property is misconfigured in the configurations.

Related

ElasticSearch - define custom letter order for sorting

I'm using ElasticSearch 2.4.2 (via HibernateSearch 5.7.1.Final from Java).
I have a problem with string sorting.
The language of my application has diacritics, which have a specific alphabetic
ordering. For example Ł goes directly after L, Ó goes after O, etc.
So you are supposed to sort the strings like this:
Dla
Dła
Doa
Dóa
Dza
Eza
ElasticSearch sorts by typical letters first, and moves all strange
letters to at the end:
Dla
Doa
Dza
Dła
Dóa
Eza
Can I add a custom letter ordering for ElasticSearch?
Maybe there are some plugins for this?
Do I need to write my own plugin? How do I start?
I found a plugin for Polish language for ElasticSearch,
but as I understand it is for analysing, and analysing is not a solution
in my case, because it will ignore diacritics and leave words with L and Ł mixed:
Dla
Dłb
Dlc
This would sometimes be acceptable, but is not acceptable in my specific usecase.
I will be grateful for any remarks on this.
I've never used it, but there is a plugin that could fit your needs: the ICU collation plugin.
You will have to use the icu_collation token filter, which will turns the tokens into collation keys. For that reason you will need to use a separate #Field (e.g. myField_sort) in Hibernate Search.
You can assign a specific analyzer to your field with #Field(name = "myField_sort", analyzer = #Analyzer(definition = "myCollationAnalyzer")), and define this analyzer (type, parameters) with something like that on one of your entities:
#Entity
#Indexed
#AnalyzerDef(
name = "myCollationAnalyzer",
filters = {
#TokenFilterDef(
name = "polish_collation",
factory = ElasticsearchTokenFilterFactory.class,
params = {
#Parameter(name = "type", value = "'icu_collation'"),
#Parameter(name = "language", value = "'pl'")
}
)
}
)
public class MyEntity {
See the documentation for more information: https://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#_custom_analyzers
It's admittedly a bit clumsy right now, but analyzer configuration will get a bit cleaner in the next Hibernate Search version with normalizers and analyzer definition providers.
Note: as usual, your field will need to be declared as sortable (#SortableField(forField = "myField_sort")).

Built-in string formatting vs string concatenation as logging parameter

I'm using SonarLint that shows me an issue in the following line.
LOGGER.debug("Comparing objects: " + object1 + " and " + object2);
Side-note: The method that contains this line might get called quite often.
The description for this issue is
"Preconditions" and logging arguments should not require evaluation
(squid:S2629)
Passing message arguments that require further evaluation into a Guava
com.google.common.base.Preconditions check can result in a performance
penalty. That's because whether or not they're needed, each argument
must be resolved before the method is actually called.
Similarly, passing concatenated strings into a logging method can also
incur a needless performance hit because the concatenation will be
performed every time the method is called, whether or not the log
level is low enough to show the message.
Instead, you should structure your code to pass static or pre-computed
values into Preconditions conditions check and logging calls.
Specifically, the built-in string formatting should be used instead of
string concatenation, and if the message is the result of a method
call, then Preconditions should be skipped altoghether, and the
relevant exception should be conditionally thrown instead.
Noncompliant Code Example
logger.log(Level.DEBUG, "Something went wrong: " + message); // Noncompliant; string concatenation performed even when log level too high to show DEBUG messages
LOG.error("Unable to open file " + csvPath, e); // Noncompliant
Preconditions.checkState(a > 0, "Arg must be positive, but got " + a); // Noncompliant. String concatenation performed even when a > 0
Preconditions.checkState(condition, formatMessage()); //Noncompliant. formatMessage() invoked regardless of condition
Preconditions.checkState(condition, "message: %s", formatMessage()); // Noncompliant
Compliant Solution
logger.log(Level.SEVERE, "Something went wrong: %s", message); // String formatting only applied if needed
logger.log(Level.SEVERE, () -> "Something went wrong: " + message); //since Java 8, we can use Supplier , which will be evaluated lazily
LOG.error("Unable to open file {}", csvPath, e);
if (LOG.isDebugEnabled() { LOG.debug("Unable to open file " + csvPath, e); // this is compliant, because it will not evaluate if log level is above debug. }
Preconditions.checkState(arg > 0, "Arg must be positive, but got %d", a); // String formatting only applied if needed
if (!condition) { throw new IllegalStateException(formatMessage()); // formatMessage() only invoked conditionally }
if (!condition) { throw new IllegalStateException("message: " + formatMessage()); }
I'm not 100% sure whether i understand this right. So why is this really an issue. Especially the part about the performance hit when using string concatenation. Because I often read that string concatenation is faster than formatting it.
EDIT: Maybe someone can explain me the difference between
LOGGER.debug("Comparing objects: " + object1 + " and " + object2);
AND
LOGGER.debug("Comparing objects: {} and {}",object1, object2);
is in the background. Because I think the String will get created before it is passed to the method. Right? So for me there is no difference. But obviously I'm wrong because SonarLint is complaining about it
I believe you have your answer there.
Concatenation is calculated beforehand the condition check. So if you call your logging framework 10K times conditionally and all of them evaluates to false, you will be concatenating 10K times with no reason.
Also check this topic. And check Icaro's answer's comments.
Take a look to StringBuilder too.
String concatenation means
LOGGER.info("The program started at " + new Date());
Built in formatting of logger means
LOGGER.info("The program started at {}", new Date());
very good article to understand the difference
http://dba-presents.com/index.php/jvm/java/120-use-the-built-in-formatting-to-construct-this-argument
Consider the below logging statement :
LOGGER.debug("Comparing objects: " + object1 + " and " + object2);
what is this 'debug' ?
This is the level of logging statement and not level of the LOGGER.
See, there are 2 levels :
a) one of the logging statement (which is debug here) :
"Comparing objects: " + object1 + " and " + object2
b) One is level of the LOGGER. So, what is the level of LOGGER object :
This also must be defined in the code or in some xml , else it takes level from it's ancestor .
Now why am I telling all this ?
Now the logging statement will be printed (or in more technical term send to its 'appender') if and only if :
Level of logging statement >= Level of LOGGER defined/obtained from somewhere in the code
Possible values of a Level can be
DEBUG < INFO < WARN < ERROR
(There can be few more depending on logging framework)
Now lets come back to question :
"Comparing objects: " + object1 + " and " + object2
will always lead to creation of string even if we find that 'level rule' explained above fails.
However,
LOGGER.debug("Comparing objects: {} and {}",object1, object2);
will only result in string formation if 'level rule explained above' satisfies.
So which is more smarter ?
Consult this url.
First let's understand the problem, then talk about solutions.
We can make it simple, assume the following example
LOGGER.debug("User name is " + userName + " and his email is " + email );
The above logging message string consists of 4 parts
And will require 3 String concatenations to be constructed.
Now, let's go to what is the issue of this logging statement.
Assume our logging level is OFF, which means that we don't interested in logging now.
We can imagine that the String concatenations (slow operation) will be ALWAYS applied and will not consider the logging level.
Wow, after understanding the performance issue, let's talk about the best practice.
Solution 1 (NOT optimal)
Instead of using String concatenations, we can use String Builder
StringBuilder loggingMsgStringBuilder = new StringBuilder();
loggingMsgStringBuilder.append("User name is ");
loggingMsgStringBuilder.append(userName);
loggingMsgStringBuilder.append(" and his email is ");
loggingMsgStringBuilder.append(email );
LOGGER.debug(loggingMsgStringBuilder.toString());
Solution 2 (optimal)
We don't need to construct the logging message before check the debugging level.
So we can pass logging message format and all parts as parameters to the LOGGING engine, then delegate String concatenations operations to it, and according to the logging level, the engine will decide to concatenate or not.
So, It's recommended to use parameterized logging as the following example
LOGGER.debug("User name is {} and his email is {}", userName, email);

Comparing Date type in Oracle PACKAGE does not work

I made a Oracle Package like below.
And I will pass parameter String like '2014-11-05'.
--SEARCH 2014 11 04
FUNCTION SEARCHMYPAGE(v_created_after IN DATE, v_created_before IN DATE)
return CURSORTYPE is rtn_cursor CURSORTYPE;
BEGIN
OPEN
rtn_cursor FOR
select
news_id
from
(
select
news_id,
news_title, news_desc,
created, news_cd
from
news
)
where
1=1
AND (created BETWEEN decode(v_created_after, '', to_date('2000-01-01', 'YYYY-MM-DD'), to_date(v_created_after, 'YYYY-MM-DD'))
AND (decode(v_created_before, '', sysdate, to_date(v_created_before, 'YYYY-MM-DD')) + 0.999999));
return rtn_cursor ;
END SEARCHMYPAGE;
I confirmed my parameter in Eclipse console Message, since I am working on Eclipse IDE.
I got contents, which are made in 2014-10-29 ~ 2014-10-31.
when I pass '2014-11-01' as created_after, It returns 0 records.(But I expected all contents, since every contents are made between 10-29 and 10-31)
Would you find anything wrong with my Function?
Thanks :D
create function search_my_page(p_created_after in date, p_created_before in date)
return cursortype
is rtn_cursor cursortype;
begin
open rtn_cursor for
select news_id
from news
where created between
nvl(v_created_after, date '1234-01-01')
and
nvl(v_created_before, sysdate) + interval '1' day - interval '1' second;
return rtn_cursor;
end search_my_page;
/
Changes:
Re-wrote predicates - there was a misplaced parentheses changing the meaning.
Replaced to_date with date literals and variables. Since you're already using ANSI date format, might as well use literals. And date variables do not need to be cast to dates.
Replace DECODE with simpler NVL.
Removed extra parentheses.
Renamed v_ to p_. It's typical to use p_ to mean "parameter" and v for "(local) variable".
Removed extra inline view. Normally inline views are underused, in this case it doesn't seem to help much.
Removed unnecessary 1=1.
Replaced 0.99999 with date intervals, to make the math clearer.
Changed to lower case (this ain't COBOL), added underscores to function name.
Changed 2000-01-01 to 1234-01-01. If you use a magic value it should look unusual - don't try to hide it.

Java String formatting solution

I have a string description of a company, which is nasty written by different users (hand-typed). Here is a example (focus on the dots, spaces, first letters etc..):
XXXX is a Global menagement consulting,Technology services and
outsourcing company, with 257000people serving clients in more than
120 countries.. combining unparalleled experience, comprehensive
capabilities across all industries and business functions,and
extensive research on the worlds most successfull companies, XXXX
collaborates with clients to help them become high-performance
businesses and governments., the company generated net revenues of
US$27.9 Billion for the fiscal year ended 31.07.2012..
Now what i want is to format the string to a bit nicer version like this:
XXXX is a global management consulting, technology services and
outsourcing company, with 257,000 people serving clients in more than
120 countries. Combining unparalleled experience, comprehensive
capabilities across all industries and business functions, and
extensive research on the world’s most successful companies, XXXX
collaborates with clients to help them become high-performance
businesses and governments. The company generated net revenues of
US$27.9 billion for the fiscal year ended Aug. 31, 2012.
My question is: Is there any library with already defined methods which could do all the spelling corrections, unneeded space removal, etc .. ?
So far, I do it be replacing stuff like " ," with ", " and toUpperCase() if the is a "///." in front etc..
desc = desc.replace(" ", " ");
desc = desc.replace("..", ".");
desc = desc.replace(" .", ".");
desc = desc.replace(" ,", ", ");
desc = desc.replace(".,", ".");
desc = desc.replace(",.", ".");
desc = desc.replace(", .", ".");
desc = desc.replace("*", "");
I'm sure there is a cleaner and better version to do this. Using regex maybe??
Any solution would be appreciated.
If I were trying to solve your problem, I would probably read the text 1 char at a time, and format it as you go. For example, in psuedocode...
while (has more chars){
char letter = readChar();
if (letter == ','){
// checking for the ',.' combination
letter = readChar();
if (readChar == '.'){
// write out a '.' only
out.print('.');
}
else {
// it wasn't the ',.' combination, so you need to output both characters, whatever they are
out.print(',');
out.print(letter);
}
}
else if (another letter you want to filter){
// etc.
}
else {
// doesn't match any of the filters, so just output the letter
out.print(letter);
}
}
Basically if you read the text 1 char at a time, you can detect any of your chosen formatting problems as you go, and correct them immediately. This provides a performance improvement, as you're only reading over the text string once (not 8 times, like you are currently doing), and allows you to add as many different/complex formatting changes as you want. The downside, however, is that you need to write the logic yourself rather than relying on in-built functions.

hbase: querying for specific value with dynamically created qualifier

Hy,
Hbase allows a column family to have different qualifiers in different rows. In my case a column family has the following specification
abc[cnt] # where cnt is an integer that can be any positive integer
what I want to achieve is to get all the data from a different column family, only if the value of the described qualifier (in a different column family) matches.
for narrowing the Scan down I just add those two families I need for the query. but that is as far as I could get for now.
I already achieved the same behaviour with a SingleColumnValueFilter, but then the qualifier was known in advance. but for this one the qualifier can be abc1, abc2 ... there would be too many options, thus too many SingleColumnValueFilter's.
Then I tried using the ValueFilter, but this filter only returns those columns that match the value, thus the wrong column family.
Can you think of any way to achieve my goal, querying for a value within a dynamically created qualifier in a column family and returning the contents of the column family and another column family (as specified when creating the Scan)? preferably only querying once.
Thanks in advance for any input.
UPDATE: (for clarification as discussed in the comments)
in a more graphical way, a row may have the following:
colfam1:aaa
colfam1:aab
colfam1:aac
colfam2:abc1
colfam2:abc2
whereas I want to get all of the family colfam1 if any value of colfam2 has e.g. the value x, with regard to the fact that colfam2:abc[cnt] is dynamically created with cnt being any positive integer
I see two approaches for this: client-side filtering or server-side filtering.
Client-side filtering is more straightforward. The Scan adds only the two families "colfam1" and "colfam2". Then, for each Result you get from scanner.next(), you must filter according to the qualifiers in "colfam2".
byte[] queryValue = Bytes.toBytes("x");
Scan scan = new Scan();
scan.addFamily(Bytes.toBytes("colfam1");
scan.addFamily(Bytes.toBytes("colfam2");
ResultScanner scanner = myTable.getScanner(scan);
Result res;
while((res = scanner.next()) != null) {
NavigableMap<byte[],byte[]> colfam2 = res.getFamilyMap(Bytes.toBytes("colfam2"));
boolean foundQueryValue = false;
SearchForQueryValue: while(!colfam2.isEmpty()) {
Entry<byte[], byte[]> cell = colfam2.pollFirstEntry();
if( Bytes.equals(cell.getValue(), queryValue) ) {
foundQueryValue = true;
break SearchForQueryValue;
}
}
if(foundQueryValue) {
NavigableMap<byte[],byte[]> colfam1 = res.getFamilyMap(Bytes.toBytes("colfam1"));
LinkedList<KeyValue> listKV = new LinkedList<KeyValue>();
while(!colfam1.isEmpty()) {
Entry<byte[], byte[]> cell = colfam1.pollFirstEntry();
listKV.add(new KeyValue(res.getRow(), Bytes.toBytes("colfam1"), cell.getKey(), cell.getValue());
}
Result filteredResult = new Result(listKV);
}
}
(This code was not tested)
And then finally filteredResult is what you want. This approach is not elegant and might also give you performance issues if you have a lot of data in those families. If "colfam1" has a lot of data, you don't want to transfer it to the client if it will end up not being used if value "x" is not in a qualifier of "colfam2".
Server-side filtering. This requires you to implement your own Filter class. I believe you cannot use the provided filter types to do this. Implementing your own Filter takes some work, you also need to compile it as a .jar and make it available to all RegionServers. But then, it helps you to avoid sending loads of data of "colfam1" in vain.
It is too much work for me to show you how to custom implement a Filter, so I recommend reading a good book (HBase: The Definitive Guide for example). However, the Filter code will look pretty much like the client-side filtering I showed you, so that's half of the work done.

Categories