I see that Cobertura has a <cobertura:check> task that can be used to enforce coverage at build time (if coverage metrics dip below a certain value, the build fails). The website shows examples with several of the available attributes, but doesn't really describe what they are or what they do:
branchrate
linerate
totalbranchrate
etc.
Also, what are the standard values for each of these attributes? I'm sure it will differ between projects, but there has to be some way for an organization to gauge what is acceptable and what isn't, and I'm wondering how to even arrive at that. Thanks in advance.
Perhaps the documentation has changed since you asked the question, because I think your answer is right there now.
At the time that I'm writing this, the answers to your specific questions are:
branchrate
Specify the minimum acceptable branch coverage rate needed by each class. This should be an integer value between 0 and 100.
linerate
Specify the minimum acceptable line coverage rate needed by each class. This should be an integer value between 0 and 100.
totalbranchrate
Specify the minimum acceptable average branch coverage rate needed by the project as a whole. This should be an integer value between 0 and 100.
If you do not specify branchrate, linerate, totalbranchrate or totallinerate, then Cobertura will use 50% for all of these values.
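Putting those attributes together, a check in the Ant build file might look something like this (a sketch: it assumes the cobertura taskdef is already declared earlier in the build file, and the datafile and haltonfailure attributes are my reading of the same docs, so verify them against your Cobertura version):

```xml
<!-- Fail the build if any class drops below 75% branch / 80% line coverage,
     or the project as a whole drops below 70% / 80%. -->
<cobertura:check datafile="cobertura.ser"
                 branchrate="75"
                 linerate="80"
                 totalbranchrate="70"
                 totallinerate="80"
                 haltonfailure="true"/>
```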
A bit of googling shows that most people agree that a "good" coverage number is somewhere from 75% to 95%. I use 85% for new projects. However, I think the most useful metric for gauging whether you have enough test coverage is how comfortable your developers are making and releasing changes to the code (assuming you have responsible developers who care about introducing bugs). Remember, you can have 100% test coverage without a single assert in any test!
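To illustrate that last point with a hypothetical example: both "tests" below give discount() full line and branch coverage, but only the second one can ever fail.

```java
public class CoverageExample {
    // The production code under test (hypothetical).
    static double discount(double price, boolean member) {
        return member ? price * 0.9 : price;
    }

    // 100% line and branch coverage, zero asserts: this "passes"
    // even if discount() is completely wrong.
    static void coverageOnlyTest() {
        discount(100.0, true);
        discount(100.0, false);
    }

    // Same coverage, but this one actually verifies behavior.
    static void realTest() {
        if (Math.abs(discount(100.0, true) - 90.0) > 1e-9)
            throw new AssertionError("member discount wrong");
        if (discount(100.0, false) != 100.0)
            throw new AssertionError("non-member price wrong");
    }

    public static void main(String[] args) {
        coverageOnlyTest();
        realTest();
        System.out.println("both tests passed");
    }
}
```

A coverage tool reports both tests identically, which is exactly why the comfort of your developers is a better signal than the raw number.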
For legacy projects things are usually more complicated. It's rare that you can get time to just focus on coverage alone, so most of the time you find out what your code coverage is, and then try to improve it over time. My dream cobertura-check task would check if the coverage on any given line/method/class/package/project is the same as or better than the last build, and have separate thresholds for any code that is "new in this build." Maybe Sonar has something like that...
Sorry, I'm quite the beginner in the field of NLP. As the title says, what is the best optimization interval in the Mallet API? I was also wondering whether it depends on the number of iterations, topics, corpus size, etc.
The optimization interval is the number of iterations between hyperparameter updates. Values between 20 and 50 seem to work well, but I haven't done any systematic tests. One possible failure mode to look out for is that too many optimization rounds could lead to instability, with the alpha hyperparameters going to zero.
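For reference, the interval is set with the --optimize-interval flag of MALLET's train-topics tool (or setOptimizeInterval in the Java API). A typical invocation might look like the following; the file names are placeholders, and the specific values are just the range discussed above, not recommendations:

```shell
# Train 100 topics for 1000 iterations, re-estimating the hyperparameters
# every 20 iterations after a burn-in period.
bin/mallet train-topics \
  --input corpus.mallet \
  --num-topics 100 \
  --num-iterations 1000 \
  --optimize-interval 20 \
  --optimize-burn-in 200 \
  --output-state topic-state.gz
```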
Here is an interesting blog post where Christof Schöch did some systematic tests:
Topic Modeling with MALLET: Hyperparameter Optimization
TL;DR: "It all depends on the project's aims. But it is important that we are aware of the massive effects Mallet's inconspicuous parameter of the hyperparameter optimization can have on the resulting models."
EDIT: The authors did not fix the random seed. So results might be explained by random initialization of MALLET.
I am using the 10-fold cross-validation technique to train on 200K records. The target class is
Status {PASS, FAIL}
PASS has ~144K and FAIL has ~6K instances.
While training the model using J48, it's not able to find the failures. The accuracy is 95%, but in most cases it just predicts success, whereas in our case we need to find the failures that are actually happening.
So my question is mainly hypothetical analysis.
Does the distribution of instances among the classes (in my case PASS/FAIL) really matter during training?
What parameter values for the Weka J48 tree could train a better model? I see about 2% failures in every 1000 records I pass in, so the imbalance only grows as more records come in.
What should the ratio between the classes be in order to train them better?
I could find nothing in the API as far as this ratio is concerned.
I am not adding the code because this is happening both with the Java API and with the Weka GUI tool.
Many Thanks.
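A quick back-of-the-envelope check shows why 95% accuracy is not reassuring here: with the instance counts from the question, a degenerate classifier that always predicts PASS already scores about the same. A small sketch:

```java
public class MajorityBaseline {
    // Accuracy of a classifier that always predicts the majority class:
    // it gets every majority instance right and every minority instance wrong.
    static double majorityAccuracy(int majorityCount, int minorityCount) {
        return (double) majorityCount / (majorityCount + minorityCount);
    }

    public static void main(String[] args) {
        // ~144K PASS vs ~6K FAIL, as in the question.
        double acc = majorityAccuracy(144_000, 6_000);
        System.out.printf("always-PASS accuracy: %.1f%%%n", acc * 100); // prints 96.0%
    }
}
```

Any model should be judged against this baseline (e.g. via recall on FAIL, or the confusion matrix Weka prints), not raw accuracy.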
The problem here is that your dataset is very unbalanced. You do have a few options on how to help your classification task:
Generate synthetic instances for your minority class using an algorithm like SMOTE. This should increase your performance.
It's not possible in every case, but you could maybe try splitting your majority class into a couple of smaller classes. This would help the balance.
I believe Weka has a One Class Classifier. This learns the decision boundary of the larger class and treats the minority class as outliers, hopefully allowing for better classification. See here for Weka's implementation.
Edit:
You could also use a cost-sensitive classifier, which weights classifications by the cost of getting them wrong, so misclassified FAIL instances can be penalized more heavily than misclassified PASS instances. Again, Weka has this as a meta-classifier that can be applied to most base classifiers; see here again.
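Going back to the SMOTE suggestion above: its core idea is simple enough to sketch in plain Java. This is a simplified illustration, not Weka's implementation (Weka ships SMOTE as a supervised instance filter, which also samples among k neighbors and handles nominal attributes):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class SmoteSketch {
    // A synthetic point on the segment between a and its neighbor b:
    // synthetic = a + gap * (b - a), with gap drawn uniformly from [0, 1).
    static double[] interpolate(double[] a, double[] b, double gap) {
        double[] s = new double[a.length];
        for (int i = 0; i < a.length; i++) {
            s[i] = a[i] + gap * (b[i] - a[i]);
        }
        return s;
    }

    // Naive SMOTE: for each minority instance, find its nearest minority
    // neighbor and create perInstance interpolated points.
    static List<double[]> oversample(List<double[]> minority, int perInstance, Random rnd) {
        List<double[]> synthetic = new ArrayList<>();
        for (double[] a : minority) {
            double[] nearest = null;
            double best = Double.MAX_VALUE;
            for (double[] b : minority) {
                if (b == a) continue;
                double d = 0;
                for (int i = 0; i < a.length; i++) d += (a[i] - b[i]) * (a[i] - b[i]);
                if (d < best) { best = d; nearest = b; }
            }
            for (int n = 0; n < perInstance; n++) {
                synthetic.add(interpolate(a, nearest, rnd.nextDouble()));
            }
        }
        return synthetic;
    }
}
```

Because the synthetic FAIL instances lie between real FAIL instances, the classifier sees a denser minority region instead of isolated points it can afford to ignore.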
We are using Drools Planner 5.4.0.Final.
We want to profile our java application to understand if we can improve performance.
Is there a way to profile how much time a rule needs to be evaluated?
We use a lot of eval(....) and our "average calculate count per second" is nearly 37. Removing all eval(...) our "average calculate count per second" remains the same.
We already profiled the application and we saw most of the time is spent in doMove ... afterVariableChanged(...).
So we suspect some of our rules are inefficient, but we don't understand where the problem is.
Thanks!
A decent average calculate count per second is higher than 1000 (at least), a good one higher than 5000. Follow these steps in order:
1) First, I strongly recommend upgrading to 6.0.0.CR5. Just follow the upgrade recipe, which will guide you step by step in a few hours. That alone will double your average calculate count (and potentially far more), due to several improvements (selectors, constraint match system, ...).
2) Open the black box by enabling logging: first DEBUG, then TRACE. The logs can show if the moves are slow (= rules are slow) or the step initialization is slow (= you need JIT selection).
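With Logback, for example, that could look like the fragment below (the logger name org.drools.planner is my assumption based on the 5.x package naming; in 6.x it becomes org.optaplanner, so check the docs for your version):

```xml
<configuration>
  <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
    <encoder><pattern>%d [%t] %-5p %m%n</pattern></encoder>
  </appender>
  <!-- Start with debug to see steps; switch to trace to see every move. -->
  <logger name="org.drools.planner" level="debug"/>
  <root level="warn"><appender-ref ref="CONSOLE"/></root>
</configuration>
```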
3) Use the stepLimit benchmark technique to find out which rule(s) are slow.
4) Use the benchmarker (if you aren't already) and play with JIT selection, late acceptance, etc. See those topics in the docs.
Our team is responsible for a large codebase containing legal rules.
The codebase works mostly like this:
class SNR_15_UNR extends Rule {
    public double getValue(RuleContext context) {
        double snr_15_ABK = context.getValue(SNR_15_ABK.class);
        double UNR = context.getValue(GLOBAL_UNR.class);
        if (UNR <= 0) // if UNR value would reduce snr, apply the reduction
            return snr_15_ABK + UNR;
        return snr_15_ABK;
    }
}
When context.getValue(Class<? extends Rule>) is called, it just evaluates the specific rule and returns the result. This allows you to create a dependency graph while a rule is evaluating, and also to detect cyclic dependencies.
There are about 500 rule classes like this. We now want to implement tests to verify the correctness of these rules.
Our goal is to implement a testing list as follows:
TEST org.project.rules.SNR_15_UNR
INPUT org.project.rules.SNR_15_ABK = 50
INPUT org.project.rules.UNR = 15
OUTPUT SHOULD BE 50
TEST org.project.rules.SNR_15_UNR
INPUT org.project.rules.SNR_15_ABK = 50
INPUT org.project.rules.UNR = -15
OUTPUT SHOULD BE 35
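The two scenarios above can be sketched as plain Java tests. The StubContext here is a simplified stand-in for the real RuleContext (keyed by name instead of class, so it stays self-contained), and the rule body mirrors SNR_15_UNR from the question:

```java
import java.util.HashMap;
import java.util.Map;

public class RuleTestSketch {
    // Minimal stand-in for the real RuleContext: input rules are stubbed by name.
    static class StubContext {
        private final Map<String, Double> values = new HashMap<>();
        StubContext with(String rule, double value) { values.put(rule, value); return this; }
        double getValue(String rule) { return values.get(rule); }
    }

    // Same logic as SNR_15_UNR.getValue, written against the stub.
    static double snr15Unr(StubContext context) {
        double snr_15_ABK = context.getValue("SNR_15_ABK");
        double UNR = context.getValue("UNR");
        return UNR <= 0 ? snr_15_ABK + UNR : snr_15_ABK;
    }

    public static void main(String[] args) {
        // Scenario 1: positive UNR is ignored.
        double out1 = snr15Unr(new StubContext().with("SNR_15_ABK", 50).with("UNR", 15));
        if (out1 != 50) throw new AssertionError("expected 50, got " + out1);
        // Scenario 2: negative UNR reduces the result.
        double out2 = snr15Unr(new StubContext().with("SNR_15_ABK", 50).with("UNR", -15));
        if (out2 != 35) throw new AssertionError("expected 35, got " + out2);
        System.out.println("both scenarios pass");
    }
}
```

A small parser for the TEST/INPUT/OUTPUT list format could generate such cases mechanically, since each scenario is just a map of inputs plus one expected value.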
Question is: how many test scenarios are needed? Is it possible to use static code analysis to detect how many unique code paths exist throughout the code? Does any such tool exist, or do I have to start mucking about with Eclipse JDT?
For clarity: I am not looking for code coverage tools. These tell me which code has been executed and which code was not. I want to estimate the development effort required to implement unit tests.
(EDIT 2/25, focused on test-coding effort):
You have 500 sub-classes, and each appears (based on your example with one conditional) to have 2 cases. I'd guess you need 500*2 tests.
If your code is not as regular as you imply, a conventional (branch) code coverage tool might not be the answer you think you want as a starting place, but it might actually help you make an estimate. Code T (< 50) tests across randomly chosen classes, and collect code coverage data P (as a fraction between 0 and 1) over whatever part of the code base you think needs testing (particularly your classes). Then you need roughly (1-P)*100*T tests.
If your extended classes are all as regular as you imply, you might consider generating them. If you trust the generation process, you might be able to avoid writing the tests.
(ORIGINAL RESPONSE, focused on path coverage tools)
Most code coverage tools are "line" or "branch" coverage tools; they do not count unique paths through the code. At best they count basic blocks.
Path coverage tools do exist; people have built them for research demos, but commercial versions are relatively rare. You can find one at http://testingfaqs.org/t-eval.html#TCATPATH. I don't think this one handles Java.
One of the issues is that the number of apparent paths through code is generally exponential in the number of decisions, since each decision encountered generates a True path and a False path based on the outcome of the conditional (1 decision --> 2 paths, 2 decisions --> 4 paths, ...). Worse, loops are in effect a decision repeated as many times as the loop iterates; a loop that repeats 100 times in effect has 2**100 paths. To control this problem, the more interesting path coverage tools try to determine the feasibility of a path: if the symbolically combined predicates from the conditionals in a prefix of that path are effectively false, the path is infeasible and can be ignored, since it can't really occur. Another standard trick is to treat loops as 0, 1, and N iterations to reduce the number of apparent paths. Managing the number of paths requires rather a lot of machinery, considerably above what most branch-coverage test tools need, which helps explain why real path coverage tools are rare.
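The growth described above is easy to make concrete with a little arithmetic (a sketch; the 0/1/N loop convention follows the standard trick mentioned in the answer):

```java
public class PathCountSketch {
    // Independent two-way decisions multiply the path count: 2^decisions.
    // (Valid for decisions < 63; beyond that the count overflows a long.)
    static long pathsForDecisions(int decisions) {
        return 1L << decisions;
    }

    // Treating each loop as only 0, 1, or N iterations means a loop
    // contributes a factor of 3 instead of a factor per iteration count.
    static long pathsWithLoops(int decisions, int loops) {
        long paths = pathsForDecisions(decisions);
        for (int i = 0; i < loops; i++) paths *= 3;
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(pathsForDecisions(1));  // 2
        System.out.println(pathsForDecisions(2));  // 4
        System.out.println(pathsWithLoops(2, 1));  // 12: 4 branch paths x 3 loop cases
    }
}
```

Even with the 0/1/N reduction, 20 independent decisions already give over a million apparent paths, which is why feasibility pruning matters.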
how many test scenario's are needed?
Many. 500 might be a good start.
Is it possible to use static code analysis to detect how many unique code paths exist throughout the code?
Yes. It's called a code coverage tool. Here are some free ones: http://www.java-sources.com/open-source/code-coverage
I am using javancss to detect the CCN of methods. There are methods in our source code with values varying from 1 to 35 (some are even larger).
Is there any guideline on what a realistic limit could be? The article here gives some ideas: http://java-metrics.com/cyclomatic-complexity/cyclomatic-complexity-what-is-it-and-why-should-you-care
I am thinking of 10 as a soft limit and 15 as a hard limit. The main reason is that testing gets complicated with larger values.
I would like to hear from the SO community.
I have two methods I use:
Pick a maximum value and stick to it (mine is 10).
Just regularly review your code and fix the method with the highest score.
Another method to use is to be very rigorous when fixing bugs - check back with source control to see which methods changed to fix the bug. Refactor those methods to reduce their complexity.
As also mentioned in the article you linked, I think the most important aspect is that you don't treat the limit of 10 or 15 as a hard cutoff, but always give a good justification when the limit is exceeded. That way you're "forced" to carefully examine critical methods and check whether it is really necessary for them to be this complex.
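As an example of the kind of refactoring that brings a method back under such a limit, a chain of conditionals (each adding one decision point) can often be replaced by a table lookup. The code below is hypothetical, not from the question:

```java
import java.util.Map;

public class CcnRefactorSketch {
    // Before: three decisions, so roughly CCN 4 (decisions + 1).
    static int statusCodeBranchy(String status) {
        if (status.equals("PASS")) return 0;
        else if (status.equals("FAIL")) return 1;
        else if (status.equals("SKIP")) return 2;
        else return -1;
    }

    // After: the mapping moves into data and the method body has no
    // branches left, so the complexity metric drops to 1.
    private static final Map<String, Integer> CODES =
            Map.of("PASS", 0, "FAIL", 1, "SKIP", 2);

    static int statusCodeTable(String status) {
        return CODES.getOrDefault(status, -1);
    }
}
```

The behavior is unchanged, but the table version needs far fewer test cases to cover and is trivial to extend.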