problem of testing file worker in java - java

I have a question which is described below:
What problems would arise for testing a Java class which counts number of words in a file?
The function's signature is below:
public int wordCount(String filename)
This is a JUnit testing question.
If you know what the problems are, how would you solve them?

So your question is what to test for? If yes, I'd say you should check if the definition of "word" is implemented correctly (e.g. is "stack-overflow" one word or two), are new lines handled correctly, are numbers counted as words (e.g. difference between "8" and "eight"), are (groups of special) characters (e.g. a hyphen) counted correctly.
Additionally, you should test whether the method returns the expected value (or exception) if the file does not exist.
This should be a good starting point.
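To make those decisions concrete, here is a minimal sketch in plain Java (no JUnit, so it runs on its own). The class name WordCounter, the helper countWords and the rule it encodes (a word is a run of letters or digits, optionally joined by hyphens or apostrophes) are assumptions chosen for illustration, not the behaviour of the asker's method.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: one possible definition of "word", chosen only for illustration.
class WordCounter {
    // Assumed rule: letters/digits, optionally joined by single hyphens or apostrophes.
    private static final Pattern WORD =
            Pattern.compile("[\\p{L}\\p{N}]+(?:['-][\\p{L}\\p{N}]+)*");

    static int countWords(String text) {
        Matcher m = WORD.matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countWords("stack-overflow"));     // hyphenated term
        System.out.println(countWords("8 eight"));            // digits next to a spelled-out number
        System.out.println(countWords("one\ntwo\r\nthree"));  // different line endings
        System.out.println(countWords("-- ... !"));           // special characters only
    }
}
```

Each call in main corresponds to one of the questions above; a different definition of "word" would simply change the expected numbers, which is exactly what the tests should pin down.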

To sfussenegger's list, I'd add the file handling checks: does the method respond correctly to files not found (including null filename), or lacking read permission?
Also, to sfussenegger's correctness list, I'd add whether duplicates count and case sensitivity rules, as well.
Of course, all of this requires that you know how the method is supposed to behave for all of these specifics. It's easy to tell someone to "go count words", but there are subtleties to that assignment.
Which is one of the big benefits of writing a good set of unit tests.
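A rough sketch of the file handling checks, again in plain Java so it runs without a test framework. The contract shown here is an assumption made for the example: a null filename gives an IllegalArgumentException, a missing or unreadable file gives an UncheckedIOException, and duplicates and different cases each count as separate words. The class FileWordCounter and the helper failureFor are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FileWordCounter {
    // Assumed contract: null -> IllegalArgumentException, I/O failure -> UncheckedIOException.
    static int wordCount(String filename) {
        if (filename == null) {
            throw new IllegalArgumentException("filename must not be null");
        }
        try {
            byte[] bytes = Files.readAllBytes(Paths.get(filename));
            String text = new String(bytes, StandardCharsets.UTF_8).trim();
            // Whitespace-separated tokens; every occurrence counts, whatever its case.
            return text.isEmpty() ? 0 : text.split("\\s+").length;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Stand-in for an "expected exception" test: names the exception thrown, or "none".
    static String failureFor(String filename) {
        try {
            wordCount(filename);
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("words", ".txt");
        Files.write(file, "Hello hello\nWORLD".getBytes(StandardCharsets.UTF_8));
        System.out.println(wordCount(file.toString()));
        System.out.println(failureFor(null));
        System.out.println(failureFor(file.toString() + ".missing"));
        Files.delete(file);
    }
}
```

The read permission case is left out because revoking permissions works differently on each platform; in a real suite it would probably need its own setup.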

This really sounds like a task for FIT: Framework for Integrated Test. It's an acceptance testing framework that works with ant and JUnit.
One of my lecturers did such a task and used this framework. It allows you to write a whole bunch of test cases within one HTML/wiki table. FIT interprets each row as a parameter set for the function under test and checks the output.
For example:
This table displays the result of three test cases. Two passed, one failed.
You can use FIT by writing sentences in your table together with the number of words you expect for each. FIT runs them and displays the results in a new table.
For further information, please read Introduction to FIT.

Related

Java Assertions : How to assert for 2 fields on same value

In my code, I have to assert one value against 2 fields. This is what I have to do :
assertThat(request.get(0).name()).isEqualTo("ABC");
assertThat(request.get(0).name2()).isEqualTo("ABC");
How can I use one single line assertion for the above 2 lines?
For example to explain more what I need :
Is there a way I can achieve something like :
assertThat(request.get(0).name() && request.get(0).name2()).isEqualTo("ABC");
How can I use one single line assertion for the above 2 lines?
Why do you want to do such a thing ?
By trying to be too clever, you get two drawbacks:
you make your test more complex to read and to maintain.
you lose the relevant feedback information when a test fails.
Actually your test is fine.
If any of these two values doesn't respect the assertion, you have the exact line that spots the issue and you also have a relevant information message.
As a hint, you could maybe just remove the duplication :
final String expected = "ABC";
assertThat(request.get(0).name()).isEqualTo(expected);
assertThat(request.get(0).name2()).isEqualTo(expected);
I'm not saying that it is bad to make multiple assertions in a single statement. Not at all.
I'm only saying that you have to adapt your way of asserting to the tools you are using.
On that point, you don't say which matcher library you use.
If the library supports this kind of assertion, use it.
Otherwise, don't hand-roll it, or you will lose the benefit of useful failure messages.
Here is an example with AssertJ that provides this feature out of box.
@Test
void namesEquals() {
    List<Request> requests = new ArrayList<>();
    // name2 is "ABD" on purpose, so the assertion below fails
    requests.add(new Request("ABC", "ABD"));
    Assertions.assertThat(requests.get(0)).extracting(Request::name, Request::name2)
        .containsExactly("ABC", "ABC");
}
And in this failing test, you will get a useful information message :
java.lang.AssertionError:
Expecting:
<["ABC", "ABD"]>
to contain exactly (and in same order):
<["ABC", "ABC"]>
but some elements were not found:
<[]>
and others were not expected:
<["ABD"]>
A bit too clever perhaps, but you can try this:
assertTrue(Stream.of(request.get(0).name(), request.get(0).name2())
.allMatch("ABC"::equals));
Or you can give this a spin:
assertThat(Arrays.asList(request.get(0).name(), request.get(0).name2()),
Every.everyItem(IsEqual.equalTo("ABC")));
something like:
assertThat(request.get(0).name().equals( request.get(0).name2()) ?
request.get(0).name() : "false").isEqualTo("ABC");

Match a String against a large list of regexps, performance, in Java

I have following:
private static List<Pattern> pats;
This list contains around 90 patterns that are instantiated before the iteration. The patterns are complex, like:
System.out.println("pat: " + pats.get(0).toString());
// pat: \bsingle1\b|\bsingle2\b|(?=.*\bcombo1\b)(?=.*\bcombo2\b)|\bsingle3\b|\bwild.*card\b ...
Some of the patterns contain around 40-50 single words or combinations of words, as the regex above shows. The words can contain wildcards.
Now, I have a list of strings, sentences of around 30-60 characters each. For every string in the list, I iterate through the list of patterns and call pattern.matcher("This is one of the strings in my list").find() until I get a match, which I record somewhere else; then I break out of the pattern loop and continue with the next string in the list.
This is a categorization job, so several strings can match on the same pattern.
My problem is that this of course takes a lot of execution time, I am looking for a more efficient way to solve this problem.
Any suggestions?
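For reference, the loop described in the question can be sketched as follows. The two patterns are shortened placeholders in the style of the one printed above, and the class and method names are made up for the example; this is only the baseline that the answers try to speed up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class FirstMatchCategorizer {
    // Placeholder patterns, compiled once up front as in the question.
    static final List<Pattern> PATS = Arrays.asList(
            Pattern.compile("\\bsingle1\\b|(?=.*\\bcombo1\\b)(?=.*\\bcombo2\\b)"),
            Pattern.compile("\\bwild.*card\\b"));

    // Index of the first pattern found somewhere in the sentence, or -1 if none matches.
    static int firstMatch(List<Pattern> patterns, String sentence) {
        for (int i = 0; i < patterns.size(); i++) {
            if (patterns.get(i).matcher(sentence).find()) {
                return i; // stop at the first hit, like the break in the question
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstMatch(PATS, "a combo2 then combo1"));
        System.out.println(firstMatch(PATS, "some wildest card here"));
        System.out.println(firstMatch(PATS, "nothing relevant"));
    }
}
```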
One thing that solved my problem (to 90%) was to give up regex partially where String.indexOf() made more sense out of a performance perspective.
This post inspired me: Quickest way to return list of Strings by using wildcard from collection in Java
I wrote my own implementation since the one in the link handles only full words, while I'm dealing with sentences.
It helped performance with wildcards ("*") and alternations such as "hel(l|lo)", the former more than the latter.
I went in this direction because of several recommendations, and it cut the time for 200000 sentences from 1.5 hours down to 15 minutes.
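The poster's implementation is not shown, so the following is only a guess at the general idea: replace \bword\b alternatives with String.indexOf() plus a manual boundary check, and replace the lookahead combos with an "all of these words" test. All names are hypothetical, and the boundary rule (letters, digits and underscore are word characters) only approximates what \b does.

```java
import java.util.Arrays;
import java.util.List;

class KeywordMatcher {
    // True if 'word' occurs in 'text' as a whole word; roughly what \bword\b checks.
    static boolean containsWord(String text, String word) {
        if (word.isEmpty()) {
            return false; // assumption: empty keywords never match
        }
        int from = 0;
        while (true) {
            int at = text.indexOf(word, from);
            if (at < 0) {
                return false;
            }
            int end = at + word.length();
            boolean startOk = at == 0 || !isWordChar(text.charAt(at - 1));
            boolean endOk = end == text.length() || !isWordChar(text.charAt(end));
            if (startOk && endOk) {
                return true;
            }
            from = at + 1; // inside a longer word, keep looking
        }
    }

    private static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    // Stands in for \bsingle1\b|\bsingle2\b|...
    static boolean containsAny(String text, List<String> words) {
        for (String w : words) {
            if (containsWord(text, w)) {
                return true;
            }
        }
        return false;
    }

    // Stands in for (?=.*\bcombo1\b)(?=.*\bcombo2\b)
    static boolean containsAll(String text, List<String> words) {
        for (String w : words) {
            if (!containsWord(text, w)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String s = "This is one of the strings in my list";
        System.out.println(containsWord(s, "string"));
        System.out.println(containsAny(s, Arrays.asList("foo", "my")));
        System.out.println(containsAll(s, Arrays.asList("one", "list")));
    }
}
```

Wildcard patterns such as \bwild.*card\b are not covered here; they could presumably be handled by two chained indexOf() calls or left to a regex.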
You could also offload the regular expressions to a dedicated service. I believe that could be faster (and perhaps safer) than partially giving up regexps.
If your app is intended to run on multiple servers, you may also gain performance by centralizing the computation cost.
Here is an example of such implementation via a REST api : http://www.rex-daemon.com/tutorial/more-advanced-queries/

When do we use 'Theories' in JUnit?

Some tests use the annotation @RunWith(Theories.class) in JUnit, and I don't know when and why we use it.
You should use them when you would like your tests to focus on the generalized relationship between inputs and outputs. See: https://blogs.oracle.com/jacobc/entry/junit_theories.
I think capturing multiple types of input is one use. The high-level idea is that you want to test whether a property of your method holds for all possible inputs.
For example, let's say I have some complex business process that takes 5 different inputs, and let's say that for each input there are 10 possible states, so we end up with 10*10*10*10*10 = 100,000 possible input states, which means we would need to know beforehand what all these 100,000 output values are.
However, you probably realize that you don't need to actually enumerate all 100,000 states. There's probably a subset that you are interested in. Let's theorize for example:
"Admins have no permission restriction". And if I wanted to assert that this is true, my Test ends up looking like the pseudo-code below.
@Theory
public void adminsHaveNoPermissionRestriction(User user, BusinessProcess bp, Input a, Input b ...) {
    Assume.assumeTrue("User is an admin", user.hasRole(admin));
    // .. rest of test which uses bp, a, b etc...
}
The nice thing is that we skip the inputs we are not interested in (non-admins), because they fail the assumption.
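To show what happens with such a theory, here is a hand-rolled imitation in plain Java. It is not the JUnit API: the loops stand in for the Theories runner feeding every combination of data points to the method, and the continue stands in for a failed assumption. The roles, actions and isAllowed rule are invented for the example.

```java
class TheorySketch {
    enum Role { ADMIN, EDITOR, GUEST }
    enum Action { READ, WRITE, DELETE }

    // Hypothetical system under test.
    static boolean isAllowed(Role role, Action action) {
        switch (role) {
            case ADMIN:
                return true;
            case EDITOR:
                return action != Action.DELETE;
            default:
                return action == Action.READ;
        }
    }

    // "Admins have no permission restriction": returns how many combinations were checked.
    static int checkAdminsHaveNoRestriction() {
        int checked = 0;
        for (Role role : Role.values()) {
            for (Action action : Action.values()) {
                if (role != Role.ADMIN) {
                    continue; // assumption not met: skipped, not counted as a failure
                }
                if (!isAllowed(role, action)) {
                    throw new AssertionError(role + " was denied " + action);
                }
                checked++;
            }
        }
        return checked;
    }

    public static void main(String[] args) {
        int total = Role.values().length * Action.values().length;
        System.out.println(checkAdminsHaveNoRestriction() + " of " + total + " combinations checked");
    }
}
```

Only the three admin combinations are asserted; the other six are skipped without having to spell out their expected results.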

Java - How to test some code?

I am not looking for the answer to this question but just a brief outline of how to do it. This is a question from a past exam paper.
It states: Describe in English a sequence of tests that you might use to test code to implement the NumberConversion class. Your tests should cover all the conditions described in the above definition.
I won't write the specification of the class but it contains things like: it must take String as input and output, accepting two parameters and returning null if a number is not valid etc.
The question is worth 10%, so will I just be required to write a series of things like: ensure that the constructor only accepts two parameters of type int, and nothing else (e.g. a double, or 3 parameters)?
Would it be worth writing possible JUnit test methods in English/pseudocode?
Would this be the right sort of thing to write for tests in English?
I think the goal is to describe a test case which checks each of the specifications in the question, whilst also avoiding attempting to test things which are limited by the language construct (e.g. wrong number/type of arguments).
Describe in English what you will do if you write tests. Typically it's usage of the NumberConversion class.
According to the question, you need to describe tests in English. I think that jumping to JUnit unit tests is more than being asked for. If I answered this question, I would start by looking at the definition given of the NumberConversion class. I would describe tests that use valid inputs as well as tests that use invalid ones. For each test describe how you will ensure that the NumberConversion class behaves as expected, including expected error conditions.
As an example for what might be appropriate...
If the specifications were:
Takes a string as input
The string can be an arbitrarily large non-negative integer
If the string is not a non-negative integer, an exception is thrown
Then I would probably answer something along the lines of:
I would test with "42" as input to check that the method works with a "normal" number
I would test with "0" as input to check that the method works with the edge case number
I would test with "9223372036854775808" (1 more than Long.MAX_VALUE) as input to check that the method works with a number larger than the fixed length integers provided by Java
I would test with "-1" as input and ensure that an exception is thrown, as negative input is invalid
I would test with "0xa" as input and ensure that an exception is thrown, as hexadecimal input is invalid
I would test with "0.1" as input and ensure that an exception is thrown, as non-integral input is invalid
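The exam asks for English, but for comparison the six checks above could look like this in code. This is a sketch under the three example specifications only; the class NonNegativeParser, the choice of BigInteger and of IllegalArgumentException, and the helper rejects (a stand-in for a test framework's "expect exception" feature) are all assumptions.

```java
import java.math.BigInteger;

class NonNegativeParser {
    // Hypothetical method matching the three example specifications above.
    static BigInteger parse(String input) {
        if (input == null || !input.matches("[0-9]+")) {
            throw new IllegalArgumentException("not a non-negative integer: " + input);
        }
        return new BigInteger(input);
    }

    // True if parse refuses the input with an exception.
    static boolean rejects(String input) {
        try {
            parse(input);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("42"));                   // "normal" number
        System.out.println(parse("0"));                    // edge case
        System.out.println(parse("9223372036854775808"));  // Long.MAX_VALUE + 1
        System.out.println(rejects("-1"));                 // negative
        System.out.println(rejects("0xa"));                // hexadecimal
        System.out.println(rejects("0.1"));                // non-integral
    }
}
```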

Is it possible to automate generation of wrong choices from a correct word?

The following list contains 1 correct word, "disastrous", and other incorrect words which sound like the correct word.
A. disastrus
B. disasstrous
C. desastrous
D. desastrus
E. disastrous
F. disasstrous
Is it possible to automate generation of wrong choices given a correct word, through some kind of java dictionary API?
No, there is nothing for this in the Java API. You can write a simple algorithm that does the job.
Just make up some rules about letters permutations and doubling and add generated words to the Set until you get enough words.
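A minimal sketch of that approach, with three made-up rules: drop the first vowel of a vowel pair, swap i and e, and double a consonant that follows a vowel. The rules, the class name and the method name are illustrative assumptions; real distractors would probably need rules tuned to the language.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class MisspellingGenerator {
    private static final String VOWELS = "aeiou";

    // Applies each rule in turn and collects distinct wrong spellings until 'wanted' is reached.
    static Set<String> wrongChoices(String correct, int wanted) {
        Set<String> result = new LinkedHashSet<>();
        // Rule 1: drop the first vowel of two adjacent vowels ("ou" -> "u").
        for (int i = 0; i + 1 < correct.length(); i++) {
            if (isVowel(correct.charAt(i)) && isVowel(correct.charAt(i + 1))) {
                add(result, correct, correct.substring(0, i) + correct.substring(i + 1), wanted);
            }
        }
        // Rule 2: swap 'i' and 'e'.
        for (int i = 0; i < correct.length(); i++) {
            char c = correct.charAt(i);
            if (c == 'i' || c == 'e') {
                char swapped = (c == 'i') ? 'e' : 'i';
                add(result, correct, correct.substring(0, i) + swapped + correct.substring(i + 1), wanted);
            }
        }
        // Rule 3: double a consonant that directly follows a vowel.
        for (int i = 1; i < correct.length(); i++) {
            char c = correct.charAt(i);
            if (!isVowel(c) && isVowel(correct.charAt(i - 1))) {
                add(result, correct, correct.substring(0, i) + c + correct.substring(i), wanted);
            }
        }
        return result;
    }

    private static boolean isVowel(char c) {
        return VOWELS.indexOf(c) >= 0;
    }

    private static void add(Set<String> result, String correct, String candidate, int wanted) {
        if (result.size() < wanted && !candidate.equals(correct)) {
            result.add(candidate);
        }
    }

    public static void main(String[] args) {
        System.out.println(wrongChoices("disastrous", 4));
    }
}
```

Choices such as "desastrus" combine two rules; feeding the generated words back into wrongChoices would be one way to get those.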
There are a number of algorithms for matching words by sound - 'soundex' is the one that springs to mind, but I remember uncovering a few when I did some research on this a couple of years ago. I expect the problem you would find is that they take a word and return a value that represents how the word sounds so you can see if two spellings sound similar (so the words in the question should generate similar values); but I expect doing the reverse, i.e. taking the value and generating similar sounding spellings, would be quite hard.
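To illustrate the one-way nature of such codes, here is a simplified American Soundex written from scratch (not a library implementation; the class name is made up). Under this sketch all the spellings from the question collapse to the same code, which is why going back from a code to candidate spellings is the hard part.

```java
import java.util.Locale;

class SimpleSoundex {
    // Codes for A..Z: '0' marks vowels and Y (separators), '-' marks H and W (ignored).
    private static final String CODES = "0123012-02245501262301-202";

    static String soundex(String word) {
        String s = word.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder();
        char last = '0';
        for (int i = 0; i < s.length() && out.length() < 4; i++) {
            char c = s.charAt(i);
            if (c < 'A' || c > 'Z') {
                continue; // skip anything that is not a plain letter
            }
            char code = CODES.charAt(c - 'A');
            if (out.length() == 0) {
                out.append(c); // the first letter is kept as-is
                last = code;
                continue;
            }
            if (code == '-') {
                continue; // H and W do not break a run of equal codes
            }
            if (code != '0' && code != last) {
                out.append(code);
            }
            last = code;
        }
        while (out.length() > 0 && out.length() < 4) {
            out.append('0'); // pad short codes with zeros
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] words = {"disastrous", "disastrus", "disasstrous", "desastrous", "desastrus"};
        for (String w : words) {
            System.out.println(w + " -> " + soundex(w));
        }
    }
}
```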
