Compare two String[] arrays and print out the strings which differ - java

I've got a list of all the file names in a folder and a list of files which have been manually "checked" by a developer. How would I go about comparing the two arrays so that we print out only those which are not contained in the master list?
public static void main(String[] args) throws java.lang.Exception {
String[] list = {"my_purchases", "my_reservation_history", "my_reservations", "my_sales", "my_wallet", "notifications", "order_confirmation", "payment", "payment_methods", "pricing", "privacy", "privacy_policy", "profile_menu", "ratings", "register", "reviews", "search_listings", "search_listings_forms", "submit_listing", "submit_listing_forms", "terms_of_service", "transaction_history", "trust_verification", "unsubscribe", "user", "verify_email", "verify_shipping", "404", "account_menu", "auth", "base", "dashboard_base", "dashboard_menu", "fiveohthree", "footer", "header", "header_menu", "listings_menu", "main_searchbar", "primary_navbar"};
String[] checked = {"404", "account_menu", "auth", "base", "dashboard_base", "dashboard_menu", "fiveohthree", "footer", "header", "header_menu", "listings_menu"};
ArrayList<String> ar = new ArrayList<String>();
for(int i = 0; i < checked.length; i++)
{
if(!Arrays.asList(list).contains(checked[i]))
ar.add(checked[i]);
}
}

Change your loop to :
ArrayList<String> ar = new ArrayList<String>();
for(int i = 0; i < checked.length; i++) {
if(!Arrays.asList(list).contains(checked[i]))
ar.add(checked[i]);
}
The ArrayList ar should be declared outside of the for loop. Otherwise ar will be recreated every time an element of the checked array exists in list.
Edit:
if(!Arrays.asList(list).contains(checked))
With this statement you are checking whether the checked array reference itself is an element of list. It should be checked[i], to check whether each element of checked exists in list or not.
If you want to print the elements in list that are not in checked, then use:
for(int i = 0; i < list.length; i++) {
if(!Arrays.asList(checked).contains(list[i]))
ar.add(list[i]);
}
System.out.println(ar);

Your updated solution seems kind of odd to me, not sure why you would add list[i] to the result list. Generally this sounds like something hashsets are made for:
String[] list = { "my_purchases", "my_reservation_history","my_reservations","my_sales", "my_wallet", "notifications", "order_confirmation", "payment", "payment_methods", "pricing", "privacy", "privacy_policy", "profile_menu", "ratings", "register", "reviews", "search_listings", "search_listings_forms", "submit_listing", "submit_listing_forms", "terms_of_service", "transaction_history", "trust_verification", "unsubscribe", "user", "verify_email", "verify_shipping", "404", "account_menu", "auth", "base", "dashboard_base", "dashboard_menu", "fiveohthree", "footer", "header", "header_menu", "listings_menu", "main_searchbar", "primary_navbar"};
String[] checked = { "404", "account_menu", "auth", "base", "dashboard_base", "dashboard_menu", "fiveohthree", "footer", "header", "header_menu", "listings_menu"};
HashSet<String> s1 = new HashSet<String>(Arrays.asList(checked));
s1.removeAll(Arrays.asList(list));
System.out.println(s1);

for (String s: checked) { // go through all in second list
if (!Arrays.asList(list).contains(s)) { // if string not in master list (a String[] has no contains method)
System.out.println(s); // print that string
}
}

First of all, I think your code has some errors:
s1 is not defined
ar is not defined
you mean to use Arrays.toString instead of Array.toString
So I fixed your code (using Java 8) and it should work like that:
public static void main(String[] args) throws java.lang.Exception {
String[] list = {"my_purchases", "my_reservation_history", "my_reservations", "my_sales", "my_wallet", "notifications", "order_confirmation", "payment", "payment_methods", "pricing", "privacy", "privacy_policy", "profile_menu", "ratings", "register", "reviews", "search_listings", "search_listings_forms", "submit_listing", "submit_listing_forms", "terms_of_service", "transaction_history", "trust_verification", "unsubscribe", "user", "verify_email", "verify_shipping", "404", "account_menu", "auth", "base", "dashboard_base", "dashboard_menu", "fiveohthree", "footer", "header", "header_menu", "listings_menu", "main_searchbar", "primary_navbar"};
String[] checked = {"404", "account_menu", "auth", "base", "dashboard_base", "dashboard_menu", "fiveohthree", "footer", "header", "header_menu", "listings_menu"};
final List<String> result = Stream.of(list)
// keep only the entries of list that match no entry of checked
.filter(listEntry -> Stream.of(checked).noneMatch(listEntry::equals))
.collect(Collectors.toList());
System.out.println(result);
}
If you don't want to use Java 8, you have to replace the usage of Streams and filters and collect with the appropriate functions in Java 7 (see e.g., Satya's post).
Anyways, I should mention that there are better (regarding performance) implementations to solve your problem, e.g.,
you could sort your lists prior to searching for duplicates,
you could use, e.g., hash-based implementations to increase the speed when searching for duplicates,
you could move the code outside of the inner loop,
and many more
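As a rough illustration of the hash-based suggestion, here is a minimal sketch. The class name ArrayDiff and the method notIn are made up for this example and the arrays are shortened. It builds a HashSet from one array once, so each lookup is a constant-time hash probe on average instead of a scan of the whole array, and it keeps the original order of the reported entries.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ArrayDiff {
    // Hypothetical helper: returns the entries of `candidates` that do not
    // occur in `master`, in their original order.
    static List<String> notIn(String[] master, String[] candidates) {
        // Build the lookup set once instead of calling Arrays.asList(...).contains(...)
        // inside the loop.
        Set<String> lookup = new HashSet<>(Arrays.asList(master));
        List<String> missing = new ArrayList<>();
        for (String candidate : candidates) {
            if (!lookup.contains(candidate)) {
                missing.add(candidate);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Shortened sample data, not the full arrays from the question.
        String[] list = {"404", "auth", "base", "footer", "header"};
        String[] checked = {"404", "auth", "base"};
        // Entries of list that have not been checked yet.
        System.out.println(notIn(checked, list)); // prints [footer, header]
    }
}
```

Swapping the two arguments gives the other direction, i.e. the checked entries that are missing from the full list.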


finding the most popular word in a person's tweets

In a project, I'm trying to query the tweets of a particular user's handle and find the most common word in the user's tweets and also return the frequency of that most common word.
Below is my code:
public String mostPopularWord()
{
this.removeCommonEnglishWords();
this.sortAndRemoveEmpties();
Map<String, Integer> termsCount = new HashMap<>();
for(String term : terms)
{
Integer c = termsCount.get(term);
if(c==null)
c = new Integer(0);
c++;
termsCount.put(term, c);
}
Map.Entry<String,Integer> mostRepeated = null;
for(Map.Entry<String, Integer> curr: termsCount.entrySet())
{
if(mostRepeated == null || mostRepeated.getValue()<curr.getValue())
mostRepeated = curr;
}
//frequencyMax = termsCount.get(mostRepeated.getKey());
try
{
frequencyMax = termsCount.get(mostRepeated.getKey());
return mostRepeated.getKey();
}
catch (NullPointerException e)
{
System.out.println("Cannot find most popular word from the tweets.");
}
return "";
}
I also think it would help to show the codes for the first two methods I call in the method above, as shown below. They are all in the same class, with the following defined:
private Twitter twitter;
private PrintStream consolePrint;
private List<Status> statuses;
private List<String> terms;
private String popularWord;
private int frequencyMax;
@SuppressWarnings("unchecked")
public void sortAndRemoveEmpties()
{
Collections.sort(terms);
terms.removeAll(Arrays.asList("", null));
}
private void removeCommonEnglishWords()
{
Scanner sc = null;
try
{
sc = new Scanner(new File("commonWords.txt"));
}
catch(Exception e)
{
System.out.println("The file is not found");
}
List<String> commonWords = new ArrayList<String>();
int count = 0;
while(sc.hasNextLine())
{
count++;
commonWords.add(sc.nextLine());
}
Iterator<String> termIt = terms.iterator();
while(termIt.hasNext())
{
String term = termIt.next();
for(String word : commonWords)
if(term.equalsIgnoreCase(word))
termIt.remove();
}
}
I apologise for the rather long code snippets. One frustrating thing is that even though my removeCommonEnglishWords() method is apparently right (discussed in another post), mostPopularWord() returns "the" when I run it, which is clearly part of the common English words list that I have and meant to eliminate from the List terms. What might I be doing wrong?
UPDATE 1:
Here is the link to the commonWords file:
https://drive.google.com/file/d/1VKNI-b883uQhfKLVg-L8QHgPTLNb22uS/view?usp=sharing
UPDATE 2: One thing I've noticed while debugging is that the
while(sc.hasNext())
in removeCommonEnglishWords() is entirely skipped. I don't understand why, though.
It can be simpler if you use streams, like so:
String mostPopularWord() {
return terms.stream()
.collect(Collectors.groupingBy(s -> s, Collectors.counting()))
.entrySet().stream()
.sorted(Map.Entry.comparingByValue(Comparator.reverseOrder()))
.findFirst()
.map(Map.Entry::getKey)
.orElse("");
}
I tried your code. Here is what you will have to do. Replace the following part in removeCommonEnglishWords()
Iterator<String> termIt = terms.iterator();
while(termIt.hasNext())
{
String term = termIt.next();
for(String word : commonWords)
if(term.equalsIgnoreCase(word))
termIt.remove();
}
with this:
List<String> reducedTerms = new ArrayList<>();
for( String term : this.terms ) {
if( !commonWords.contains( term ) ) reducedTerms.add( term );
}
this.terms = reducedTerms;
Since you hadn't provided the class, I created one with some assumptions, but I think this code will go through.
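One thing to keep in mind is that List.contains compares with equals, so it is case-sensitive, whereas the original loop used equalsIgnoreCase. Below is a minimal sketch of a case-insensitive variant, assuming it is acceptable to lower-case and trim both sides before comparing. The class name StopWordFilter and the method removeCommonWords are hypothetical and not part of the asker's class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopWordFilter {
    // Hypothetical helper: returns a new list with the terms that are not
    // common words, ignoring case and surrounding whitespace.
    // Null terms are dropped as well.
    static List<String> removeCommonWords(List<String> terms, List<String> commonWords) {
        // Normalise the common words once into a set for fast lookups.
        Set<String> common = new HashSet<>();
        for (String word : commonWords) {
            common.add(word.trim().toLowerCase(Locale.ROOT));
        }
        List<String> reduced = new ArrayList<>();
        for (String term : terms) {
            if (term != null && !common.contains(term.trim().toLowerCase(Locale.ROOT))) {
                reduced.add(term);
            }
        }
        return reduced;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("The", "cat", "the", "hat");
        System.out.println(removeCommonWords(terms, Arrays.asList("the"))); // prints [cat, hat]
    }
}
```

Because the result is a new list, nothing is removed while iterating, which avoids the iterator bookkeeping of the original loop.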
A slightly different approach using streams.
This uses the relatively common frequency count idiom using streams and stores the counts in a map.
It then does a simple scan to find the largest count obtained and returns either that word or the string "No words found".
It also filters out the words in a Set<String> called ignore, so you need to create that too.
import java.util.Arrays;
import java.util.Comparator;
import java.util.Map;
import java.util.Map.Entry;
import java.util.Set;
import java.util.stream.Collectors;
Set<String> ignore = Set.of("the", "of", "and", "a",
"to", "in", "is", "that", "it", "he", "was",
"you", "for", "on", "are", "as", "with",
"his", "they", "at", "be", "this", "have",
"via", "from", "or", "one", "had", "by",
"but", "not", "what", "all", "were", "we",
"RT", "I", "&", "when", "your", "can",
"said", "there", "use", "an", "each",
"which", "she", "do", "how", "their", "if",
"will", "up", "about", "out", "many",
"then", "them", "these", "so", "some",
"her", "would", "make", "him", "into",
"has", "two", "go", "see", "no", "way",
"could", "my", "than", "been", "who", "its",
"did", "get", "may", "…", "#", "??", "I'm",
"me", "u", "just", "our", "like");
Map.Entry<String, Long> entry = terms.stream()
.map(String::trim).filter(wd -> !ignore.contains(wd)) // trim first so padded words are also matched against ignore
.collect(Collectors.groupingBy(a -> a,
Collectors.counting()))
.entrySet().stream()
.collect(Collectors.maxBy(Comparator
.comparing(Entry::getValue)))
.orElse(Map.entry("No words found", 0L));
System.out.println(entry.getKey() + " " + entry.getValue());

Sort a Groovy flattened JsonSlurper object after parsing

I have a JSON message whose ordering is messed up after parsing it with JsonSlurper. I know the ordering isn't important, but I need to put the message back into ascending order after it is parsed and flattened into single objects, so I can build a JsonArray and present the message in the proper ascending order.
String test = """[
{
"AF": "test1",
"BE": "test2",
"CD": "test3",
"DC": "test4",
"EB": "test5",
"FA": "test5"
},
{
"AF": "test1",
"BE": "test2",
"CD": "test3",
"DC": "test4",
"EB": "test5",
"FA": "test5"
}
]"""
The parseText produces this:
def json = new groovy.json.JsonSlurper().parseText(test);
[{CD=test3, BE=test2, AF=test1, FA=test5, EB=test5, DC=test4}, {CD=test3,
BE=test2, AF=test1, FA=test5, EB=test5, DC=test4}]
After parsing the JSON message, I need to pass each flattened JSON object into a method, at which point it needs to be sorted in ascending order by the map keys prior to being added to a JSONArray, like below.
def json = new groovy.json.JsonSlurper().parseText(test);
for( int c = 0; c < json?.size(); c++ )
doSomething(json[c]);
void doSomething( Object json ){
def jSort= json.????
JSONArray jsonArray = new JSONArray();
jsonArray.add(jSort);
}
You can just sort entries before adding them. The following uses collectEntries, which creates LinkedHashMap objects (thus preserving order):
def json = new groovy.json.JsonSlurper().parseText(test);
def sortedJson = json.collect{map -> map.entrySet().sort{it.key}
.collectEntries{[it.key, it.value]}}
sortedJson has this content, which seems to be sorted as required:
[[AF:test1, BE:test2, CD:test3, DC:test4, EB:test5, FA:test5],
[AF:test1, BE:test2, CD:test3, DC:test4, EB:test5, FA:test5]]
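The same idea can be sketched with plain java.util classes, which Groovy code can also use: copying each parsed map into a TreeMap yields keys in ascending order. This is only a sketch under the assumption that the parsed objects are ordinary maps with String keys. The class name KeySorter and the method sortKeys are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class KeySorter {
    // Hypothetical helper: copies each map into a TreeMap, which iterates
    // its keys in ascending (natural String) order.
    static List<Map<String, Object>> sortKeys(List<Map<String, Object>> rows) {
        List<Map<String, Object>> sorted = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            Map<String, Object> copy = new TreeMap<>(row);
            sorted.add(copy);
        }
        return sorted;
    }

    public static void main(String[] args) {
        // Insertion order deliberately scrambled, as in the parsed output above.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("CD", "test3");
        row.put("BE", "test2");
        row.put("AF", "test1");
        System.out.println(sortKeys(Collections.singletonList(row)));
        // prints [{AF=test1, BE=test2, CD=test3}]
    }
}
```

A TreeMap keeps its keys sorted even if entries are added later, whereas the LinkedHashMap produced by collectEntries only preserves the order in which entries were inserted.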

Parsing through JSON and selecting multiple items in java

I currently have this JSON:
[
{
"example": "12345678",
"test": "0",
"name": "tom",
"testdata": "",
"testtime": 1531209885613
},
{
"example": "12634346",
"test": "43223452234",
"name": "jerry",
"testdata": "pawenkls",
"testtime": 1531209888196
}
]
I am trying to parse through the array to find a value of "testdata" that matches the value of "testdata" that I have generated, which I am currently doing like so:
JsonArray entries = (JsonArray) new JsonParser().parse(blockchainJson);
JsonElement dataHash = ((JsonObject)entries.get(i)).get("dataHash");
Then I wish to find the value of "example" that is in the same array as the "testdata" with the value "pawenkls".
How do I search for the "example" value that is in the same group as the value of "test data" that I have found?
You need to run through the objects in the array and check the value of each object's testdata field against yours. Then read the example field of the matching object.
String testData = "pawenkls";
JsonArray entries = (JsonArray) new JsonParser().parse(blockchainJson);
String example = null;
for(JsonElement dataHashElement : entries) {
JsonObject currentObject = dataHashElement.getAsJsonObject();
if(testData.equals(currentObject.get("testdata").getAsString())) {
example = currentObject.get("example").getAsString();
break;
}
}
System.out.println("example: "+example);
This prints out
example: 12634346
Here is a Java 8 version doing the same thing:
String testData = "pawenkls";
JsonObject[] objects = new Gson().fromJson(blockchainJson, JsonObject[].class);
Optional<JsonObject> object = Arrays.stream(objects)
.filter(o -> testData.equals(o.get("testdata").getAsString()))
.findFirst();
String example = null;
if(object.isPresent())
example = object.get().get("example").getAsString();
System.out.println("example: "+example);

How to access elements in java subarray without key but with index

I have a JSON document that looks like this:
{ "Message": "None", "PDFS": [
[
"test.pdf",
"localhost/",
"777"
],
[
"retest.pdf",
"localhost\",
"666"
] ], "Success": true }
I'm trying to access the individual strings within the arrays but I'm having difficulty doing it as getString is requiring me to use a key and not indexes.
I've tried this to access the first string in each sub-array:
JSONArray pdfArray = resultJson.getJSONArray("PDFS");
for (int i = 0; i < pdfArray.length(); i++) {
JSONObject pdfObject = pdfArray.getJSONObject(i);
String fileName = pdfObject.getString(0);
}
Read the array as an array:
JSONArray array = pdfArray.getJSONArray(i);
String fileName = array.getString(0);

How to use indri for indexing in java?

import lemurproject.indri.*;
import java.io.*;
public class Indritest {
public static void main(String[] args) throws Exception {
String [] stopWordList = {"a", "an", "and", "are", "as", "at", "be",
"by","for", "from", "has", "he", "in", "is",
"it", "its", "of", "on", "that", "the", "to",
"was", "were", "will", "with"};
String myIndex = "C:/Program Files/lemur/lemur4.12/src/app/obj/myIndex5";
try {
IndexEnvironment envI = new IndexEnvironment();
envI.setStoreDocs(true);
// create an Indri repository
envI.setMemory(256000000);
envI.setStemmer("krovetz");
envI.setStopwords(stopWordList);
envI.setIndexedFields( new String[] {"article", "header", "p", "title", "link"});
envI.open(myIndex);
envI.create( myIndex );
// add xml files to the just created index i.e myIndex
// xml_data is a folder which contains the list of xml files to be added
File filesDir = new File("C:/NetbeanProg2/xml_data");
File[] files = filesDir.listFiles();
int noOffiles = files.length;
for (int i = 0; i < noOffiles; i++) {
System.out.println(files[i].getCanonicalPath() + "\t" + files[i].getCanonicalFile());
envI.addFile(files[i].getCanonicalPath(), "xml");
}
} catch (Exception e) {
System.out.println("issue is: " + e);
}
}
}
I found this code in a tutorial, but it isn't working. It's giving me an exception:
Exception in thread "main" java.lang.UnsatisfiedLinkError: C:\Program Files\Indri\Indri 5.9\bin\indri_jni.dll: Can't find dependent libraries
In the myIndex variable I have provided the path of my IndexUI.jar file.
I am new to Indri and don't know much about its usage. I have downloaded Indri 5.9.
The issue was the version of Indri.
