I have a semicolon-separated text file. The idea is to read the text file line by line. Every line will be split into array elements.
Now I want to do some checks: is the ID (the first element, called "Referenz") unique, are all mandatory "fields" filled, etc.
I guess I have to take the ID and put it into a list, and for each following line compare its ID with the IDs in the list?
So my question is: is that the right way, and how would I implement it?
Here is my code so far:
public class Test_Line2Array {
public static void main(String[] args) {
String strLine = "Referenz;field2;field3;field4;field5;field6;field7;Titel;Name1;Name2;Name3;field8;field9;field10;field11;field12;field13;field14;Street;field15;ZIP;field16;city;field17;dob;field18;field19;field20;field21;field22;field23;field24;field25;field26;field27;field28;field29;field30;field31;field32;field33;field34;field35;field36;field37;field38;field39;phone;mobile;CustomField1;CustomField2;CustomField3;CustomField4;CustomField5;CustomField6;CustomField7;CustomField8;CustomField9;CustomField10";
//declaration
String[] stringArray;
String delimiter = ";";
// allocates memory for 59 strings
stringArray = new String[59];
// split the String after separator ";"
stringArray = strLine.split(";", -1);
// print array
for(int j = 0; j < stringArray.length; j++) {
System.out.println(j + " " + stringArray[j]);
}
}
}
I recommend splitting the string on the delimiter ; and adding the separated Strings to a List, which you can easily validate with the static Collections.frequency() method; it returns the number of occurrences as an int.
String[] values = strLine.split(";");
List<String> list = Arrays.asList(values);
if (Collections.frequency(list, list.get(0)) > 1) {
System.out.println("The first value is not unique in the list");
}
Since Java 8 you can also use a Stream:
if (list.stream().filter(a -> a.equals(list.get(0))).count() > 1) {
System.out.println("The first value is not unique in the list");
}
// allocates memory for 59 strings
stringArray = new String[59];
// split the String after separator ";"
stringArray = strLine.split(";", -1);
Initializing the String[59] isn't helping you; the split method is just returning something that overwrites it immediately afterwards.
If you needed to check for any duplicates, using a HashSet would help here.
If you only need to make sure the first element isn't duplicated, you can just do it in a loop. You've already got one, so...
// print array
for(int j = 0; j < stringArray.length; j++) {
if (j > 0 && stringArray[0].equals(stringArray[j])) { // skip j == 0, which is the first element itself
System.out.println("Duplicate!");
}
System.out.println(j + " " + stringArray[j]);
}
}
To check if the first element is unique, you can use the following:
Collections.frequency(Arrays.asList(stringArray), stringArray[0]) == 1
This returns a boolean that is true if the first element of stringArray is unique, otherwise false.
For each line, put its Referenz in a HashSet. Checking whether a subsequent Referenz has already been seen is then as simple as referenzSet.contains(theNewReferenz).
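The HashSet idea can be sketched as follows. This is only an illustration under assumptions: the lines are already in a list, and the class name ReferenzCheck, the method findProblems and the choice of mandatory column indexes are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ReferenzCheck {
    // Hypothetical: indexes of the columns that must not be empty.
    static final int[] MANDATORY = {0, 7, 8};

    // Returns one message per problem found; an empty list means every line passed.
    static List<String> findProblems(List<String> lines) {
        Set<String> seen = new HashSet<>();
        List<String> problems = new ArrayList<>();
        for (int n = 0; n < lines.size(); n++) {
            // limit -1 keeps trailing empty fields, so column positions stay stable
            String[] fields = lines.get(n).split(";", -1);
            // Set.add returns false when the value was already present
            if (!seen.add(fields[0])) {
                problems.add("line " + (n + 1) + ": duplicate Referenz " + fields[0]);
            }
            for (int col : MANDATORY) {
                if (col >= fields.length || fields[col].trim().isEmpty()) {
                    problems.add("line " + (n + 1) + ": mandatory field " + col + " is empty");
                }
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "A1;x;x;x;x;x;x;Dr;Smith",
                "A1;x;x;x;x;x;x;;Jones");
        findProblems(lines).forEach(System.out::println);
    }
}
```

Using the return value of Set.add avoids a separate contains call, and the set lookup stays fast however many lines have been read.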
I have a task that involves creating a program that reads text from a text file, produces a word count, and lists the occurrences of each word used in the file. I managed to remove punctuation from the word count, but I'm really stumped on this:
I want Java to see the string "hello-funny-world" as 3 separate strings and store them in my array list. This is what I have so far; with this section of code, "hello funny world" is still seen as one string:
while (reader.hasNext()){
String nextword2 = reader.next();
String nextWord3 = nextword2.replaceAll("[^a-zA-Z0-9'-]", "");
String nextWord = nextWord3.replace("-", " ");
int apcount = 0;
for (int i = 0; i < nextWord.length(); i++){
if (nextWord.charAt(i)== 39){
apcount++;
}
}
int i = nextWord.length() - apcount;
if (wordlist.contains(nextWord)){
int index = wordlist.indexOf(nextWord);
count.set(index, count.get(index) + 1);
}
else{
wordlist.add(nextWord);
count.add(1);
if (i / 2 * 2 == i){
wordlisteven.add(nextWord);
}
else{
wordlistodd.add(nextWord);
}
}
This can work for you:
List<String> items = Arrays.asList("hello-funny-world".split("-"));
This assumes that '-' is your separator.
I would suggest using Java's simple split():
String name="this-is-string";
String arr[]=name.split("-");
System.out.println("Here " +arr.length);
You can also iterate through this array with a for loop.
Hope this helps.
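As a rough sketch of how split("-") could fit into the word count from the question: the class name HyphenSplit and the method countWords are invented for this example, and a Map replaces the two parallel lists used in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HyphenSplit {
    // Counts words, treating '-' as a separator between words.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String token : text.split("\\s+")) {
            // keep letters, digits, apostrophes and hyphens, as in the question
            String cleaned = token.replaceAll("[^a-zA-Z0-9'-]", "");
            // split the cleaned token instead of replacing '-' with a space
            for (String word : cleaned.split("-")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("hello-funny-world, hello!"));
    }
}
```

Replacing "-" with " " only changes the characters inside one String; it is the split call that actually produces separate strings.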
I've searched high and low and finally have to ask.
I have an array containing, for example, ["123456","132457", "468591", ... ].
I have a string with a value of "46891".
How do I search through the array and find the object that contains all the characters from my string value? For example the object with "468591" contains all the digits from my string value even though it's not an exact match because there's an added "5" between the "8" and "9".
My initial thought was to split the string into its own array of numbers (i.e. ["4","6","8","9","1"] ), then to search through the array for objects containing the number, to create a new array from it, and to keep whittling it down until I have just one remaining.
Since this is likely a learning assignment, I'll give you an idea instead of an implementation.
Start by defining a function that takes two strings, and returns true if the first one contains all characters of the second in any order, and false otherwise. It should look like this:
boolean containsAllCharsInAnyOrder(String str, String chars) {
...
}
Inside the function set up a loop that picks characters ch from the chars string one by one, and then uses str.indexOf(ch) to see if the character is present in the string. If the index is non-negative, continue; otherwise, return false.
If the loop finishes without returning, you know that all characters from chars are present in str, so you can return true.
With this function in hand, set up another loop in your main function to go through elements of the array, and call containsAllCharsInAnyOrder on each one in turn.
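For reference, one possible way to fill in that outline is sketched below. The wrapper class CharSearch and the helper findMatches are names made up for this example; like the description above, it ignores how often a character is repeated.

```java
import java.util.ArrayList;
import java.util.List;

class CharSearch {
    // True if every character of chars occurs somewhere in str (repetitions are not counted).
    static boolean containsAllCharsInAnyOrder(String str, String chars) {
        for (int i = 0; i < chars.length(); i++) {
            if (str.indexOf(chars.charAt(i)) < 0) {
                return false; // this character is missing from str
            }
        }
        return true;
    }

    // Collects every array element that contains all characters of chars.
    static List<String> findMatches(String[] array, String chars) {
        List<String> matches = new ArrayList<>();
        for (String element : array) {
            if (containsAllCharsInAnyOrder(element, chars)) {
                matches.add(element);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        String[] array = {"123456", "132457", "468591"};
        System.out.println(findMatches(array, "46891"));
    }
}
```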
I think you can use sets for this.
List<String> result = new ArrayList<>();
Set<String> chars = new HashSet<>(Arrays.asList(str.split("")));
for(String string : stringList) {
Set<String> stringListChars = new HashSet<>(Arrays.asList(string.split("")));
// the candidate must contain every character of the search string
if(stringListChars.containsAll(chars)) {
result.add(string);
}
}
There is a caveat here: repeated characters are ignored, and you haven't specified how you want to handle them (for example, 1154 compared against 154 will be considered a positive match). If you do want every repeated character to be matched by its own occurrence in the other string, note that List.containsAll only tests membership as well, so use a List and remove each character as it is matched:
List<String> result = new ArrayList<>();
List<String> chars = Arrays.asList(str.split(""));
for(String string : stringList) {
List<String> remaining = new ArrayList<>(Arrays.asList(string.split("")));
boolean all = true;
for(String c : chars) {
// remove returns false if no unmatched occurrence of c is left
if(!remaining.remove(c)) { all = false; break; }
}
if(all) {
result.add(string);
}
}
Your initial idea was a good start. What you can do is create a set instead of an array, use Guava's Sets#powerSet method to create all possible subsets, keep only those that have "46891".length() members, convert each set into a String, and look those strings up in the original array :)
You could do this with the List containsAll method. Arrays.asList cannot turn a char[] into a List<Character>, so the characters are boxed with a stream instead (Java 8+, needs java.util.List and java.util.stream.Collectors):
List<Character> lookingForChars = lookingForString.chars().mapToObj(c -> (char) c).collect(Collectors.toList());
for (String toSearchString : array) {
List<Character> toSearchChars = toSearchString.chars().mapToObj(c -> (char) c).collect(Collectors.toList());
if (toSearchChars.containsAll(lookingForChars)) {
System.out.println("Match Found!");
}
}
You can use String#charAt() in a nested for loop to compare your string with each of the array's elements.
This method would help you check whether a character is contained in both strings.
This is trickier than a straightforward solution.
There are better algorithms, but here is one that is easy to implement and understand.
Ways of solving:
A. Go through every char of your given string and check whether it occurs in the given array.
B. Collect a list of every string from the array that contains the given char.
C. Check whether there is another char left to check.
D. If there is, perform A again, but on the collected list (the result list). Else, return all possible matches.
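The narrowing steps above could be sketched like this. The names Narrowing and narrow are hypothetical, and the loop simply keeps shrinking the candidate list one character at a time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class Narrowing {
    // Keeps only the candidates that contain every character of search.
    static List<String> narrow(List<String> candidates, String search) {
        List<String> remaining = new ArrayList<>(candidates);
        for (char ch : search.toCharArray()) {
            List<String> next = new ArrayList<>();
            for (String candidate : remaining) {
                if (candidate.indexOf(ch) >= 0) {
                    next.add(candidate); // still a possible match
                }
            }
            remaining = next; // the next character is checked on the collected list only
            if (remaining.isEmpty()) {
                break; // nothing left to narrow down
            }
        }
        return remaining;
    }

    public static void main(String[] args) {
        System.out.println(narrow(Arrays.asList("123456", "132457", "468591"), "46891"));
    }
}
```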
Try this:
public static void main(String args[]) {
String[] array = {"123456", "132457", "468591"};
String search = "46891";
for (String element : array) {
boolean isPresent = true;
for (int index = 0; index < search.length(); index++) {
if(element.indexOf(search.charAt(index)) == -1){
isPresent = false;
break;
}
}
if(isPresent)
System.out.println("Element "+ element + " contains the search string");
else
System.out.println("Element "+ element + " does not contain the search string");
}
}
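This sorts the char[]s of the search string and of the string to search on. Sorting costs O(n log n) and lets the inner loop stop early; note that the inner loop still restarts from the first element each time, so the worst case remains quadratic.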
private static boolean contains(String searchMe, String searchOn){
char[] sm = searchMe.toCharArray();
Arrays.sort(sm);
char[] so = searchOn.toCharArray();
Arrays.sort(so);
boolean found = false;
for(int i = 0; i<so.length; i++){
found = false; // necessary to reset 'found' on subsequent searches
for(int j=0; j<sm.length; j++){
if(sm[j] == so[i]){
// Match! Break to the next char of the search string.
found = true;
break;
}else if(sm[j] > so[i]){ // No need to continue because they are sorted.
break;
}
}
if(!found){
// We can quit here because the arrays are sorted.
// I know if I did not find a match of the current character
// for so in sm, then no other characters will match because they are
// sorted.
break;
}
}
return found;
}
public static void main(String[] args0){
String value = "12345";
String[] testValues = { "34523452346", "1112", "1122009988776655443322",
"54321","7172839405","9495929193"};
System.out.println("\n Search where order does not matter.");
for(String s : testValues){
System.out.println(" Does " + s + " contain " + value + "? " + contains(s , value));
}
}
And the results
Search where order does not matter.
Does 34523452346 contain 12345? false
Does 1112 contain 12345? false
Does 1122009988776655443322 contain 12345? true
Does 54321 contain 12345? true
Does 7172839405 contain 12345? true
Does 9495929193 contain 12345? true
How would I remove the chars from the data in this file so I could sum up the numbers?
Alice Jones,80,90,100,95,75,85,90,100,90,92
Bob Manfred,98,89,87,89,9,98,7,89,98,78
I want to do this so that, for every line, all the chars are removed but the ints are kept.
The following code might be useful to you; try running it once:
public static void main(String ar[])
{
String s = "kasdkasd,1,2,3,4,5,6,7,8,9,10";
int sum=0;
String[] spl = s.split(",");
for(int i=0;i<spl.length;i++)
{
try{
int x = Integer.parseInt(spl[i]);
sum = sum + x;
}
catch(NumberFormatException e)
{
System.out.println("error parsing "+spl[i]);
System.out.println("\n the stack of the exception");
e.printStackTrace();
System.out.println("\n");
}
}
System.out.println("The sum of the numbers in the string : "+ sum);
}
Even a String of the form "abcd,1,2,3,asdas,12,34,asd" would give you the sum of the numbers.
You need to split each line into a String array and parse the numbers starting from index 1
String[] arr = line.split(",");
for(int i = 1; i < arr.length; i++) {
int n = Integer.parseInt(arr[i]);
...
try this:
String input = "Name,2,1,3,4,5,10,100";
String[] strings = input.split(",");
int result=0;
for (int i = 1; i < strings.length; i++)
{
result += Integer.parseInt(strings[i]);
}
You can make use of the split method of course, supplying "," as the parameter, but that's not all.
The trick is to put each line of the text file into an ArrayList. Once you have that, follow this pseudocode:
1) Put each line of the text file inside an ArrayList
2) For each line, split it into an array using ","
3) If the array's size is bigger than 1, there are numbers to be summed up; otherwise only the name is in the array and you should continue to the next line
4) If the size is bigger than 1, iterate through the strings in the String[] array generated by split, from index 1 to < size (this excludes the name string itself)
5) Use Integer.parseInt( iterated number as String ) and sum it up
There you go
A NumberFormatException would occur if a string is not a number, but since you are excluding the name there should be no problem :)
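The pseudocode above might translate to something like the following sketch. The class name LineSums and the method sumLine are invented here, and the two sample lines come from the question.

```java
import java.util.ArrayList;
import java.util.List;

class LineSums {
    // Sums every field after the name; a line holding only a name sums to 0.
    static int sumLine(String line) {
        String[] parts = line.split(",");
        int sum = 0;
        for (int i = 1; i < parts.length; i++) { // index 0 is the name
            sum += Integer.parseInt(parts[i].trim());
        }
        return sum;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("Alice Jones,80,90,100,95,75,85,90,100,90,92");
        lines.add("Bob Manfred,98,89,87,89,9,98,7,89,98,78");
        for (String line : lines) {
            System.out.println(line.split(",")[0] + ": " + sumLine(line));
        }
    }
}
```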
Well, if you know that it's a CSV file, in this exact format, you could read the line, execute string.split(",") and then disregard the first returned string in the array of results. See Evgenly's answer.
Edit: here's the complete program:
class Foo {
static String input = "Name,2,1,3,4,5,10,100";
public static void main(String[] args) {
String[] strings = input.split(",");
int result=0;
for (int i = 1; i < strings.length; i++)
{
result += Integer.parseInt(strings[i]);
}
System.out.println(result);
}
}
(wow, I never wrote a program before that didn't import anything.)
And here's the output:
125
If you're not interested in parsing the file, but just want to remove the first field, then split it, disregard the first field, and rejoin the remaining fields.
String[] fields = line.split(",");
StringBuilder sb = new StringBuilder(fields[1]);
for (int i=2; i < fields.length; ++i)
sb.append(',').append(fields[i]);
line = sb.toString();
You could also use a Pattern (regular expression):
line = line.replaceFirst("[^,]*,", "");
Of course, this assumes that the first field contains no commas. If it does, things get more complicated. I assume the commas are escaped somehow.
There are a couple of CsvReader/Writers that might be helpful to you for handling CSV data. Apart from that:
1) I'm not sure whether you are summing up rows, columns, or both. In any case, create an array of the target sum counters, int[] sums (or just one int sum)
2) Read one row, then process it either using split (a bit heavy, but clear) or by parsing the line into numbers yourself (likely to generate less garbage and work faster)
3) Add the numbers to the counters
4) Continue until end of file
Loading the whole file before starting to process it is not a good idea, as you are doing 2 bad things:
1) Stuffing the file into memory; if it's a large file you'll run out of memory (very bad)
2) Iterating over the data 2 times instead of once (probably not the end of the world)
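A minimal sketch of that row-by-row processing, assuming the first field of every row is a name to skip. StreamingSum and sumAll are made-up names, and a StringReader stands in for the real file so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class StreamingSum {
    // Reads one row at a time and adds its numbers to a single counter,
    // so the whole input is never held in memory.
    static long sumAll(Reader source) {
        long total = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String[] parts = line.split(",");
                for (int i = 1; i < parts.length; i++) { // skip the name field
                    total += Integer.parseInt(parts[i].trim());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        String data = "Alice Jones,80,90,100\nBob Manfred,98,89,87\n";
        System.out.println(sumAll(new StringReader(data)));
    }
}
```

For a real file, a java.io.FileReader (or Files.newBufferedReader) could be passed in place of the StringReader.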
Suppose the format of the string is fixed.
String s = "Alice Jones,80,90,100,95,75,85,90,100,90,92";
First, I would isolate the comma-separated numbers and get rid of the other characters:
Matcher matcher = Pattern.compile("(\\d+,)+\\d+").matcher(s);
int sum = 0;
After getting string of integers, separated by a comma, I would split them into array of Strings, parse it into integer value and sum ints:
if (matcher.find()){
for (String ele: matcher.group(0).split(",")){
sum+= Integer.parseInt(ele);
}
}
System.out.println(sum);
I have a two-dimensional Object array (Object[][] data) that holds product-price pairs.
I am trying to pass these values to a Map in the following way.
private String myPairs = "";
private String[] l, m;
for (int i=0; i<data.length; i++){
myPairs += (String)data[i][0] + ":" + String.valueOf(data[i][1]) + ",";
}
Map<String, Double> pairs = new java.util.HashMap<>();
l = myPairs.split(",");
for (int i=0; i<l.length; i++){
m = l[i].split(":");
pairs.put((String)m[0], Double.parseDouble((String)m[1]));
}
I get a java.lang.ArrayIndexOutOfBoundsException. What have I done wrong?
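Try checking that the split produced both parts before reading m[1]:
for (int i=0; i<l.length; i++){
m = l[i].split(":");
if (m.length < 2) continue; // skip empty or malformed entries
pairs.put(m[0], Double.parseDouble(m[1]));
}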
Your problem is here:
pairs.put((String)m[0], Double.parseDouble((String)m[1]));
The first for loop creates a string that ends with a ,. For example "foo:0.1,bar:0.2,".
Then, you split by ,. Note that in Java split(",") discards trailing empty strings, so the above example returns ["foo:0.1", "bar:0.2"] and the final , alone adds no element. An element without a : can still appear, though: when data is empty, myPairs is "" and the split returns [""], and a product name containing a , is cut into fragments.
Finally, for each value, you split by :. It works for well-formed values (i.e. ["foo", "0.1"] and ["bar", "0.2"]), but an element without a : gives a 1-value array, for instance [""].
When trying to access the second value of that array (i.e. index 1, since arrays are 0-based), the ArrayIndexOutOfBoundsException gets thrown.
Several options:
In the first loop, put a condition to add the , or not:
myPairs += (i == 0 ? "" : ",") + (String)data[i][0] + ":" + String.valueOf(data[i][1]);
OR just after your first loop, remove the trailing , if there is one:
if (myPairs.endsWith(",")) myPairs = myPairs.substring(0, myPairs.length() - 1);
OR in the second loop, skip every element that does not split into two parts:
if (m.length < 2) continue;
OR even better, only if you don't need the string representation you're building in the first loop, replace all your code by:
for (int i=0; i<data.length; i++) {
pairs.put((String)data[i][0], Double.parseDouble(String.valueOf(data[i][1])));
}
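How split treats a trailing separator is easy to check directly. The small sketch below uses made-up names (SplitDemo, countParts) and only standard String.split calls.

```java
import java.util.Arrays;

class SplitDemo {
    // Number of elements produced by the one-argument split.
    static int countParts(String s) {
        return s.split(",").length;
    }

    public static void main(String[] args) {
        // One-argument split: trailing empty strings are removed.
        System.out.println(Arrays.toString("foo:0.1,bar:0.2,".split(",")));     // [foo:0.1, bar:0.2]
        // A negative limit keeps them.
        System.out.println(Arrays.toString("foo:0.1,bar:0.2,".split(",", -1))); // [foo:0.1, bar:0.2, ]
        // An empty input still yields one (empty) element.
        System.out.println(countParts(""));                                     // 1
    }
}
```

The last case matters here: an empty string splits into a single empty element, and splitting that element on ":" gives an array with no index 1.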
When the first for-loop ends, you have all the pairs separated with ',' and an extra ',' at the end. Java's split(",") drops trailing empty strings, so the extra ',' does not by itself add an element to l.
The problem arises when you split an element on ':' and that element is blank or contains no ':' (for example, l is [""] when data is empty).
The split then produces a 1-element array. The error occurs because you ask for m[1].
Try checking m.length before reading m[1], or skip the intermediate string and put the values into the map directly, and the problem should be solved.
I hope this helps :)
An element of l can be empty or lack a : (for example, when data is empty the split yields a single empty string), so skip such elements in the second loop.
for (int i = 0; i < l.length; i++)
{
m = l[i].split(":");
if (m.length < 2) continue; // nothing to parse in this element
pairs.put(m[0], Double.parseDouble(m[1]));
}
Also note that if the supplied strings contain :s or ,s, your algorithm would probably throw an exception too.
Note - A much better way (which also avoids the above) would be to do it in the first loop, something like:
for (int i = 0; i < data.length; i++)
{
pairs.put((String)data[i][0], Double.parseDouble(String.valueOf(data[i][1])));
}
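A self-contained sketch of filling the map straight from the array, without the intermediate string. The names PairsDemo and toMap are hypothetical, and it assumes each row holds a product name followed by a price stored either as a Number or as numeric text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PairsDemo {
    // Builds the map directly from the rows, so no separator characters are involved.
    static Map<String, Double> toMap(Object[][] data) {
        Map<String, Double> pairs = new LinkedHashMap<>();
        for (Object[] row : data) {
            String product = String.valueOf(row[0]);
            Object price = row[1];
            // Accept both numeric objects and numeric text.
            double value = (price instanceof Number)
                    ? ((Number) price).doubleValue()
                    : Double.parseDouble(String.valueOf(price));
            pairs.put(product, value);
        }
        return pairs;
    }

    public static void main(String[] args) {
        Object[][] data = {{"foo", 0.1}, {"bar", "0.2"}, {"a,b:c", 3}};
        System.out.println(toMap(data));
    }
}
```

Because nothing is joined and split again, product names containing , or : are handled like any other name.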
I am trying to find and print the words in a string that occur more than once, and it almost works. I am, however, struggling with a small problem: the words are printed out twice, since they occur twice in the sentence. I want them printed only once.
This is my code:
public class Main {
/**
* @param args the command line arguments
*/
public static void main(String[] args) {
String sentence = "is this a sentence or is this not ";
String[] myStringArray = sentence.split(" "); //Split the sentence by space.
int[] count = new int[myStringArray.length];
for (int i = 0; i < myStringArray.length; i++){
for (int j = 0; j < myStringArray.length; j++){
if (myStringArray[i].matches(myStringArray[j]))
count[i]++;
//else break;
}
}
for (int i = 0; i < myStringArray.length; i++) {
if (count[i] > 1)
System.out.println("1b. - Tokens that occurs more than once: " + myStringArray[i] + "\n");
}
}
}
You can try for (int i = 0; i < myStringArray.length; i+=2) instead.
Break on the first match, after incrementing. Then it won't also increment the second match.
Your code has some problems.
Your code looks through the list of n elements n times, so it performs n^2 comparisons.
If a word occurs twice, the count for each of its two positions becomes 2, so the word is printed twice.
What you need to keep track of is the set of words you have already seen, and check whether a new word you encounter has already been seen or not.
If a word occurred 3 times in your sentence, each of its positions would have a count of 3. The 3 is redundant data that doesn't need to be stored for each token, but rather just once for the word.
All this can be done easily if you know how a Map works.
Here is an implementation that would work.
import java.util.HashMap;
import java.util.Map;
public class Main {
public static void main(String[] args) {
String sentence = "is this a sentence or is this not ";
String[] myStringArray = sentence.split("\\s"); //Split the sentence by whitespace.
Map <String, Integer> wordOccurrences = new HashMap <String, Integer> (myStringArray.length);
for (String word : myStringArray)
if (wordOccurrences.containsKey(word)) // Map has containsKey, not contains
wordOccurrences.put(word, wordOccurrences.get(word) + 1);
else wordOccurrences.put(word, 1);
for (String word : wordOccurrences.keySet())
if (wordOccurrences.get(word) > 1)
System.out.println("1b. - Tokens that occurs more than once: " + word + "\n");
}
}
We want to find the repeated words in an input string. So, I suggest the following approach, which is fairly simple:
Make a HashMap instance. The key (String) will be the word and the value (Integer) will be the frequency of its occurrence.
Split the string using the split("\\s") method to make an array of only words.
Introduce an Integer 'frequency' variable with initial value 0.
Iterate over the string array and, after checking the frequency, add each element (word) to the map with a frequency of 1 if it is not there yet, or, if the key (word) already exists, only increment its frequency by 1.
So you are now left with each word and its frequency.
For example, suppose the input string is "We are getting dirty as this earth is getting polluted. We must stop it."
Then the map will be
{ ("We",2), ("are",1), ("getting",2), ("dirty",1), ("as",1), ("this",1), ("earth",1), ("is",1), ("polluted.",1), ("must",1), ("stop",1), ("it.",1) }
Now you know what the next step is and how to use the map. I agree with Kaushik.
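The approach described above could look roughly like this. RepeatedWords, countWords and repeated are names chosen for the sketch, and Map.merge (Java 8+) takes the place of the explicit frequency check.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RepeatedWords {
    // Maps each word to the number of times it occurs, in first-seen order.
    static Map<String, Integer> countWords(String sentence) {
        Map<String, Integer> frequency = new LinkedHashMap<>();
        for (String word : sentence.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                // inserts 1 for a new word, otherwise adds 1 to the stored count
                frequency.merge(word, 1, Integer::sum);
            }
        }
        return frequency;
    }

    // Each word that occurs more than once is listed exactly once.
    static List<String> repeated(String sentence) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : countWords(sentence).entrySet()) {
            if (entry.getValue() > 1) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(repeated("is this a sentence or is this not "));
    }
}
```

Since the map holds one entry per distinct word, iterating over it prints every repeated word a single time.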