Finding the value pair that has the highest affinity in Java

Hi, I am currently working on an algorithm problem set.
Given the following contents of a file.txt file,
yahoo,ap42
google,ap42
twitter,thl76
google,aa314
google,aa314
google,thl76
twitter,aa314
twitter,ap42
yahoo,aa314
A web server logs page views in a log file. The log file consists of one line per page view. A page view consists of page id and a user id, separated by a comma. The affinity of a pair of pages is the number of distinct users who viewed both pages. For example in the quoted log file, the affinity of yahoo and google is 2 (because ap42 viewed both and aa314 viewed both).
My requirement is to create an algorithm that returns the pair of pages with the highest affinity.
I have written the code below, but it does not return the pair of pages with the highest affinity. Any suggestions on how I can modify the code to make it work? Thanks.
Scanner in = new Scanner(new File("./file.txt"));
ArrayList<String[]> logList = new ArrayList<String[]>();
while (in.hasNextLine()) {
logList.add(in.nextLine().split(","));
}
String currentPage;
String currentUser;
int highestCount =0;
for (int i = 0; i < logList.size()-1; i++) {
int affinityCount =0;
currentPage = logList.get(i)[0];
currentUser = logList.get(i)[1];
for (int j = logList.size()-1; j > 0; j--) {
if (i != j) {
if (!currentPage.equals(logList.get(j)[0])
&& currentUser.equals(logList.get(j)[1])) {
affinityCount++;
System.out.println("currentPage: "+currentPage+" currentUser: "+ currentUser);
System.out.println("logList.get(j)[0]: "+logList.get(j)[0]+" logList.get(j)[1]): "+ logList.get(j)[1]);
System.out.println(affinityCount);
}
}
}
}

I am going to describe the algorithm here; you can convert it into code.
Traverse the file and create a HashMap of <user, pages viewed>.
After this traversal, you have the pages viewed by each user.
Now traverse this dataset. For each user, take the list of pages they viewed. Make all possible pairs of pages and put each pair into a max heap with its value set to 1. If the pair already exists in the heap, increment its value.
Make sure you treat yahoo,google the same as google,yahoo when comparing.
At the end, the element at the top of the heap is your output.
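The steps above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the class and method names (PageAffinity, highestAffinityPair) are made up for illustration, and it swaps the max heap for a plain HashMap of pair counts plus a running maximum, which gives the same result for a single query. Pages are kept sorted per user so that yahoo,google and google,yahoo map to the same key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

class PageAffinity {

    /**
     * Returns the pair with the highest affinity as "pageA,pageB"
     * (alphabetical order), or null if no user viewed two pages.
     * On ties, whichever pair reaches the top count first is returned.
     */
    static String highestAffinityPair(List<String> logLines) {
        // Step 1: user -> distinct pages viewed (TreeSet sorts and de-duplicates).
        Map<String, TreeSet<String>> pagesByUser = new HashMap<>();
        for (String line : logLines) {
            String[] parts = line.trim().split(",");
            if (parts.length != 2) {
                continue; // skip malformed lines
            }
            pagesByUser.computeIfAbsent(parts[1], u -> new TreeSet<>()).add(parts[0]);
        }

        // Step 2: count every unordered pair once per user, tracking the best so far.
        Map<String, Integer> affinity = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (TreeSet<String> pages : pagesByUser.values()) {
            List<String> sorted = new ArrayList<>(pages);
            for (int i = 0; i < sorted.size(); i++) {
                for (int j = i + 1; j < sorted.size(); j++) {
                    String key = sorted.get(i) + "," + sorted.get(j);
                    int count = affinity.merge(key, 1, Integer::sum);
                    if (count > bestCount) {
                        bestCount = count;
                        best = key;
                    }
                }
            }
        }
        return best;
    }
}
```

For the log in the question this returns google,twitter, because all three users (ap42, thl76 and aa314) viewed both pages.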

Related

Verifying the data from two arraylist and delete it

I have a master ArrayList called toBeDeleted which stores a timestamp and an email. The following is sample data from the toBeDeleted ArrayList:
[1507075234, bunny#outlook.com]
I have another ArrayList called logData1 which stores status, email, timestamp and ID. The following is sample data from the logData1 ArrayList:
[16, bunny#outlook, 1507075234, 0OX9VQB-01-00P-02]
I want to delete data from the logData1 ArrayList by first comparing its timestamp with the timestamps in the toBeDeleted1 ArrayList. If the timestamps match, I then compare the emails in both lists. If both match, I want to delete the whole record (status, email, timestamp, ID). But I can't make it work.
This is the sample output from my source code:
[16, bunny#outlook.com, 1507075234, 0OX9VQB-01-00P-02]
The data inside toBeDeleted1 is :[1507075234, bunny#outlook.com]
The time1 is :1507075234
The email1 is :bunny#outlook.com
The time is :1507075234
The emails is :bunny#outlook.com
The data is :bunny#outlook.com
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: -3
at java.util.ArrayList.elementData(Unknown Source)
at java.util.ArrayList.get(Unknown Source)
at EmailReporting.main(EmailReporting.java:83)
This is my sample program:
System.out.println(logData1);
System.out.println("The data inside toBeDeleted1 is :"+toBeDeleted1);
for(int v = 0;v<toBeDeleted1.size();v++) //look through the logdata1 for removing the record base on timestamp
{
String time1 = toBeDeleted1.get(v);
String email1 = toBeDeleted1.get(v+1);
System.out.println("The time1 is :"+time1);
System.out.println("The email1 is :"+email1);
for(int f = logData1.size();f>logData1.size()-1;f--)
{
// System.out.println(logData1.size());
// System.out.println("The data in logdata1 is "+logData1.get(f-2));
if(time1.equals(logData1.get(f-2)))
{
System.out.println("The time is :"+logData1.get(f-2));
System.out.println("The emails is :"+logData1.get(f-3));
if(email1.equals(logData1.get(f-3)))
{
System.out.println("The data is :"+logData1.get(f-3));
logData1.remove(f-1);
logData1.remove(f-2);
logData1.remove(f-3);
logData1.remove(f-4);
f-=4;
}
}
}
}
The error occurred after this line of code was executed:
System.out.println("The data is :"+logData1.get(f-3));
You can find elements in the list in order using Collections.indexOfSubList:
List<String> toFind = Arrays.asList(email1, time1); // same order as in logData1: email, then timestamp
int emailIndex = Collections.indexOfSubList(logData1, toFind);
A similar lastIndexOfSubList method also exists. That might be more appropriate for your use case.
You can then use this to remove the matching record from logData1:
int emailIndex = Collections.lastIndexOfSubList(logData1, toFind);
if (emailIndex >= 1) {
    logData1.subList(emailIndex - 1, emailIndex + 3).clear();
}
Just do this in a loop to keep going until all occurrences have been removed.
Note that just doing this in a loop naively will keep on searching over the tail of the list repeatedly. Instead, you can use subList to "chop" the end of the list, to avoid re-searching it:
List<String> view = logData1;
int emailIndex;
while ((emailIndex = Collections.lastIndexOfSubList(view, toFind)) >= 1) {
logData1.subList(emailIndex-1, emailIndex+3).clear();
view = logData1.subList(0, emailIndex-1);
}
Additionally, note that deleting from the middle of an ArrayList is inefficient, because the elements after the ones you delete have to be shifted down. This is why using subList(...).clear() is better, because it does all of those shifts at once. But if you are removing lots of 4-element batches, you can do better.
Instead of the subList(...).clear(), you can set the bits of elements to be deleted into a BitSet:
List<String> view = logData1;
BitSet bits = new BitSet(logData1.size());
int emailIndex;
while ((emailIndex = Collections.lastIndexOfSubList(view, toFind)) >= 1) {
bits.set(emailIndex-1, emailIndex+3);
view = logData1.subList(0, emailIndex-1);
}
And then shift all the elements down at once, discarding the elements you want to delete:
int dst = 0;
for (int src = 0; src < logData1.size(); ++src) {
if (!bits.get(src)) {
logData1.set(dst++, logData1.get(src));
}
}
And now truncate the list:
logData1.subList(dst, logData1.size()).clear();
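Putting the mark, shift and truncate steps together might look like the sketch below. The class and method names (BitSetCompaction, removeRecords) are hypothetical, and it assumes that logData1 always holds complete 4-element records in the order status, email, timestamp, ID, so a match on [email, timestamp] always starts at index 1 or later.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Collections;
import java.util.List;

class BitSetCompaction {

    /** Removes every [status, email, timestamp, id] record whose email and timestamp match. */
    static void removeRecords(List<String> logData, String email, String timestamp) {
        List<String> toFind = Arrays.asList(email, timestamp);
        BitSet doomed = new BitSet(logData.size());

        // Mark phase: nothing is removed yet, so every index stays valid.
        List<String> view = logData;
        int emailIndex;
        while ((emailIndex = Collections.lastIndexOfSubList(view, toFind)) >= 1) {
            doomed.set(emailIndex - 1, emailIndex + 3);   // the whole 4-element record
            view = logData.subList(0, emailIndex - 1);    // only search before this record next time
        }

        // Shift phase: copy every surviving element down exactly once.
        int dst = 0;
        for (int src = 0; src < logData.size(); ++src) {
            if (!doomed.get(src)) {
                logData.set(dst++, logData.get(src));
            }
        }

        // Truncate phase: drop the now-unused tail in a single operation.
        logData.subList(dst, logData.size()).clear();
    }
}
```

The list passed in must be mutable (for example an ArrayList), since the last step shrinks it.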

Read weighted graph from text file

I have to create a weighted graph from a text file. Below is an example of what the text file looks like. The first number on each line is the id of the current train station. The second number is a possible destination, and the number after the comma is the travel time in seconds. The third number is another possible destination, and so on.
060060101832 060063101842,78 060054104822,90
060054104822 060060101832,90 060057104812,90 060058101502,90 060054105611,66
060057104812 060054104822,90 060057102802,72
I want to store the routes in an ArrayList. Each route object should look like this:
Start: 060060101832
Destination: 060063101842
Time: 78
The problem is that I have to store multiple routes for the same starting location. How do I read the lines in properly using a Scanner? My approach was this:
while (routes.hasNext()) {
routes.useDelimiter(",| |\\n");
String start = routes.next();
String dest= routes.next();
String time= routes.next();
Edge edge = new Edge(start, dest, time);
edges.add(edge);
}
Since I cannot go back in the text file, I can't imagine what a correct solution would look like.
This code is neither complete nor tested. It may or may not work, but it should guide you anyway.
// Java 8 (needs java.nio.file.Files, java.nio.file.Paths, java.util.HashMap, java.util.List)
Node n;
Edge e;
String[] splittedLine;
String[] splittedEdge;
HashMap<String, Node> stationNumberToNode = new HashMap<>();
// if the file is not too large, you can read the whole file at once
List<String> lines = Files.readAllLines(Paths.get("path/to/file.txt"));
for (String line : lines) {
    splittedLine = line.split(" ");
    if ((n = stationNumberToNode.get(splittedLine[0])) == null) {
        n = new Node(splittedLine[0]); // assuming your Node has a constructor that takes the station id
        stationNumberToNode.put(splittedLine[0], n);
    }
    for (int i = 1; i < splittedLine.length; ++i) {
        splittedEdge = splittedLine[i].split(",");
        e = new Edge(splittedEdge[0], splittedEdge[1]); // assuming your Edge has a constructor that takes the destination station and the cost
        n.addEdge(e);
    }
}
Explanation
Node n;
Edge e;
String[] splittedLine;
String[] splittedEdge;
HashMap<String, Node> stationNumberToNode = new HashMap<>();
The five variables are declared before entering the loop. This is a style choice: in Java, declaring a local variable inside a loop does not allocate new memory on each iteration, so either placement works. The HashMap covers the case where your input is not always grouped by station, and it avoids a list search on every lookup.
List<String> lines = Files.readAllLines(Paths.get("path/to/file.txt"));
Reads all the lines of the file at once. Alternatively, as requested in the question, you can read the file using Scanner as in this answer. You have to change the way you iterate over the lines, though.
splittedLine = line.split(" ");
Splits the line on " ", since your input file is well formatted.
if ((n = stationNumberToNode.get(splittedLine[0])) == null) {
    n = new Node(splittedLine[0]); // assuming your Node has a constructor that takes the station id
    stationNumberToNode.put(splittedLine[0], n);
}
Checks whether the current node is already in the HashMap. If it is, it is stored in the variable n. Otherwise, a Node is created with the current id and added to the HashMap.
for (int i = 1; i < splittedLine.length; ++i) {
    splittedEdge = splittedLine[i].split(",");
    e = new Edge(splittedEdge[0], splittedEdge[1]); // assuming your Edge has a constructor that takes the destination station and the cost
    n.addEdge(e);
}
Since everything after the first token on a line is a destination station and its cost (id,cost), we iterate over splittedLine from index 1 onwards.
For every edge entry, we split on "," (as in your input file), so that splittedEdge[0] is the destination id and splittedEdge[1] is the cost to that destination. We create an Edge with that information and add that Edge to the Node object.
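If you would rather end up with the flat list of routes described in the question (start, destination, time), the same splitting idea can be sketched as below. The Edge class here is a hypothetical stand-in for your own route class, and the parser takes the lines as a List<String> so it works with either Files.readAllLines or a Scanner loop. It assumes every destination token has the form id,seconds.

```java
import java.util.ArrayList;
import java.util.List;

class RouteParser {

    /** Hypothetical route type: one directed connection and its travel time in seconds. */
    static final class Edge {
        final String start;
        final String destination;
        final int seconds;

        Edge(String start, String destination, int seconds) {
            this.start = start;
            this.destination = destination;
            this.seconds = seconds;
        }
    }

    /** Turns each "start dest,time dest,time ..." line into one Edge per destination. */
    static List<Edge> parse(List<String> lines) {
        List<Edge> edges = new ArrayList<>();
        for (String line : lines) {
            String[] tokens = line.trim().split("\\s+");
            if (tokens[0].isEmpty()) {
                continue; // blank line
            }
            String start = tokens[0]; // remembered for every destination on this line
            for (int i = 1; i < tokens.length; i++) {
                String[] pair = tokens[i].split(",");
                edges.add(new Edge(start, pair[0], Integer.parseInt(pair[1])));
            }
        }
        return edges;
    }
}
```

Because the start id is kept in a local variable while the rest of the line is processed, there is no need to go back in the file.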

Processing XML elements inline with text

I have a program which reads an XML file using Java DOM and processes certain elements. For example, here is part of the document I am looking at:
<Flow>
<Id>306</Id>
<Type>Simple</Type>
<FlowContent Width="0.2000000000000000111">
<P Id="523"><T xml:space="preserve" Id="652">A spouse’s pension would be paid equal to <O Id="351"/>% of your Core pension at date of death.</T>
</P>
</FlowContent>
(Note: this is exported from a program called GMC Inspire Designer, so I have no control over its format.)
I can process most elements fine, but have issues with text content which also contains elements. In the example above, another layout object <O Id="351"/> (referencing another piece of text or a variable) occurs in the body of the text.
I can look up this element and retrieve it using the ID number. This is the element linked in the above snippet:
<Variable>
<Id>351</Id>
<Name>CAMT44</Name>
What I would then like to do is output information from the linked node (e.g., I could look up the node with ID 351 and retrieve the name etc. then display this information in place of where the element appears within the string).
I currently look up children and store the ID in a string array like so:
NodeList nl = e.getElementsByTagName("O");
sa = new String[nl.getLength()]; // Set up new array to hold child ids
for (int i = 0; i < nl.getLength(); i++) {
sa[i] = nodeToElement(nl.item(i)).getAttribute("Id");
}
I'm very much a Java beginner, so I've been wondering if DOM was the correct choice for this project. Perhaps I should have used SAX instead, but as I don't have much XML experience, I'm not sure which best suits my needs and, as I mentioned, I have managed to do most of the things I need, it's just this last tricky bit that I'm stuck on.
Currently my output looks like this:
IF CR.SCHEME == "EXCT" PRINT:
"A spouse’s pension would be paid equal to % of your Core pension at
date of death, ignoring the fact that you may have chosen to convert
part of your pension into a lump sum at retirement."
Child flow: 351
It would be great if there is some way to do this using DOM. Apologies if anything is unclear, I'm new to most of this.
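You should be able to do something like this, where nl holds all child nodes of the <T> element (for example from getChildNodes()) rather than only the <O> elements:
StringBuilder output = new StringBuilder();
for (int i = 0; i < nl.getLength(); i++) {
    Node n = nl.item(i);
    if (n.getNodeType() == Node.TEXT_NODE) {
        output.append(n.getTextContent());
    } else if (n.getNodeType() == Node.ELEMENT_NODE && n.getNodeName().equals("O")) {
        // attribute names are case-sensitive; the document uses "Id"
        output.append(lookup(doc, ((Element) n).getAttribute("Id")));
    }
}
System.out.println(output);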
The lookup method is something you would need to implement yourself but it would look something like this:
private static String lookup(Document doc, String id) {
return "<IMPLEMENT_LOOKUP_HERE>";
}
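One possible way to fill in lookup is sketched below, together with the rendering loop, as a self-contained example. It assumes that every referenced object is a <Variable> element with <Id> and <Name> children, as in the snippet from the question; other object types in the export would need their own handling. The class name InlineReferences and the helper methods render and parse are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class InlineReferences {

    /** Renders the text of a <T> element, replacing each <O Id="..."/> child with the looked-up name. */
    static String render(Document doc, Element t) {
        StringBuilder out = new StringBuilder();
        NodeList children = t.getChildNodes(); // text nodes and elements, in document order
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (n.getNodeType() == Node.TEXT_NODE) {
                out.append(n.getTextContent());
            } else if (n.getNodeType() == Node.ELEMENT_NODE && n.getNodeName().equals("O")) {
                out.append(lookup(doc, ((Element) n).getAttribute("Id")));
            }
        }
        return out.toString();
    }

    /** Returns the <Name> of the <Variable> whose <Id> equals id, or a placeholder if none is found. */
    static String lookup(Document doc, String id) {
        NodeList variables = doc.getElementsByTagName("Variable");
        for (int i = 0; i < variables.getLength(); i++) {
            Element v = (Element) variables.item(i);
            NodeList ids = v.getElementsByTagName("Id");
            if (ids.getLength() > 0 && id.equals(ids.item(0).getTextContent().trim())) {
                NodeList names = v.getElementsByTagName("Name");
                return names.getLength() > 0 ? names.item(0).getTextContent().trim() : "";
            }
        }
        return "[unknown " + id + "]";
    }

    /** Parses an XML string; checked exceptions are wrapped to keep the example short. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```

Scanning all <Variable> elements on every lookup is fine for small files; for larger exports you could build a Map from Id to Name once and query that instead.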

Best way to put 20 elements in a coordinate system with neighbouring elements unique

This question is regarding libGDX, but I think it's in fact more Java/algorithm related.
Part of my game includes placing 20 elements, chosen from a predefined list of 30 elements, on a screen (so effectively a coordinate system) in 20 partially predefined places.
By partially predefined I mean that they are predefined for each screen, but there can be dozens of screens, so they might as well be treated as random.
The elements will be selected randomly, but elements close to each other must be unique. By close I mean within some arbitrarily defined distance X. Effectively, each place will have around 3 'close neighbours'.
The best way I can think of so far is as follows:
1. Calculate the distance between all places. If the distance between A and B is lower than X, put two entries in a map: one (A,B) and one (B,A).
2. Now start filling the places with elements.
3. For each place, create a list of all its neighbours using the map from point 1 (let's call it the N-list).
4. For each place, create a temporary list of all possible (30) elements (let's call it the E-list).
5. Get a random element from the E-list.
6. Iterate through the N-list. For each place in the list, get the element currently there (if there is any). This needs a (place, element) map, which is filled as the algorithm progresses.
7. If the found element is equal to the current random element, remove this element from the E-list and this place from the N-list, and go back to point 5.
8. Proceed until all places are filled.
Step 1 is in fact a separate algorithm that can probably be tweaked, e.g. if we have calculated the A->B distance we don't need to calculate B->A, but that needs an additional map to store calculation info, etc.
I would like to know what you think of this way and if you have any ideas for a better one.
Thanks in advance for your answers.
P.S. Perhaps the terms I used could be better chosen, but I'm not a native speaker and I don't know English math terms :-)
OK, I think I understood your solution, and this is what I thought of initially. But I think it can be slightly optimized by eliminating the extra pairs and maps (or maybe not :)
First, create a map of locations where the key is the location (or its position) and the value is a list of the location's parents that fall within the close range. Yes, a location can have multiple parents rather than children; the relation is really the same, but 'parents' fits better here, as we'll see.
ArrayList<Place> place_list; // your list of places here
ArrayList<Element> element_list; // your list of elements here
HashMap<Place,ArrayList<Place>> parent_map = new HashMap<Place,ArrayList<Place>>();
ArrayList<Place> a;
for (int i = 0; i < place_list.size() - 1; i++) {
    Place place1 = place_list.get(i);
    for (int j = i + 1; j < place_list.size(); j++) {
        Place place2 = place_list.get(j);
        int dist = getDistance(place1, place2);
        if (dist > DISTANCE_THRESHOLD) continue;
        // if this place is within range,
        // add the parent place to its list and put/update it in the map
        a = parent_map.get(place2);
        if (a == null) a = new ArrayList<Place>();
        a.add(place1);
        parent_map.put(place2, a);
    }
}
Now we have a map of all places that have parents. Next we do the following: if place does not have parents, it can choose any random element freely. If it does have parents, it checks what elements parents own and reduces the available set of elements. After the set was reduced, any random element can be chosen from it.
HashMap<Place,Element> used_place_map = new HashMap<Place,Element>(); // key is place, value is assigned element
ArrayList<Element> tmp_element_list;
Random random = new Random(); // java.util.Random; nextInt is an instance method
for (int i = 0; i < place_list.size(); i++) {
    Place place = place_list.get(i);
    a = parent_map.get(place);
    if (a == null) { // this place has no parents, use elements freely
        tmp_element_list = element_list;
    } else { // if it has parents, they have already registered their elements in used_place_map
        // create a copy of the available elements
        tmp_element_list = new ArrayList<Element>(element_list);
        // now reduce it by the elements the parents already use
        for (Place pl : a) {
            Element used_element = used_place_map.get(pl);
            tmp_element_list.remove(used_element); // removes the first equal element, if present
        }
    }
    // finally, get a random index into the (probably reduced) list
    int element_id = random.nextInt(tmp_element_list.size());
    Element element = tmp_element_list.get(element_id); // pick from the reduced list, not element_list
    // store our choice as future parent
    used_place_map.put(place, element);
}
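For experimenting with the idea, here is a compact self-contained sketch under simplifying assumptions: places are given as coordinate arrays, elements are plain integer ids from 0 to elementCount - 1, and the distance check is done inline instead of through a precomputed parent map. The class and method names (NeighbourPlacement, assign) are hypothetical. As in the code above, each place is only constrained by the places processed before it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class NeighbourPlacement {

    /**
     * Returns assignment[i] = element id chosen for place i, such that no two
     * places within the threshold distance of each other share an element.
     */
    static int[] assign(double[] xs, double[] ys, double threshold, int elementCount, Random rng) {
        int n = xs.length;
        int[] assignment = new int[n];
        for (int i = 0; i < n; i++) {
            // Start with every element available for this place.
            List<Integer> available = new ArrayList<>();
            for (int e = 0; e < elementCount; e++) {
                available.add(e);
            }
            // Drop the elements already used by earlier places that are in range.
            for (int p = 0; p < i; p++) {
                if (Math.hypot(xs[i] - xs[p], ys[i] - ys[p]) <= threshold) {
                    available.remove(Integer.valueOf(assignment[p]));
                }
            }
            if (available.isEmpty()) {
                throw new IllegalStateException("not enough elements for place " + i);
            }
            assignment[i] = available.get(rng.nextInt(available.size()));
        }
        return assignment;
    }
}
```

With 30 elements and about 3 close neighbours per place the available list never runs empty, so no backtracking is needed.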

Checking if ArrayList element exists or not

I'll try to explain this as best I can. I have an ArrayList of Strings. I am trying to implement server-side paging for a webapp. I am restricted in the number of items per page (6 in this case), which are read from this ArrayList. The ArrayList is, let's say, the entire catalog, and each page takes a section of it to populate the page. I can get this working just fine when there are enough elements to fill the particular page; the problem is when we hit the end of the ArrayList and there are fewer than 6 items remaining for that page's segment. How can I check whether the ArrayList is on its last element, or whether the next one doesn't exist? I have the following code (in pseudo-ish code):
int enterArrayListAtElement = (numberOfItemsPerPage * (requestedPageNumber - 1));
for (int i = 0; i < numberOfItemsPerPage; i++) {
if (!completeCatalog.get(enterArrayListAtElement + i).isEmpty() {
completeCatalog.get(enterArrayListAtElement + i);
}
}
The if in the code is the problem. Any suggestions will be greatly appreciated.
Thanks.
It sounds like you want:
if (enterArrayListAtElement + i < completeCatalog.size())
That will stop you from trying to fetch values beyond the end of the list.
If that's the case, you may want to change the bounds of the for loop to something like:
int actualCount = Math.min(numberOfItemsPerPage,
completeCatalog.size() - enterArrayListAtElement);
for (int i = 0; i < actualCount; i++) {
// Stuff
}
(You may find this somewhat easier to format if you use shorter names, e.g. firstIndex instead of enterArrayListAtElement and pageSize instead of numberOfItemsPerPage.)
Can't you just get
completeCatalog.size()
and compare it to i? i.e to answer the question "is there an ith element" you say
if (i<completeCatalog.size())
You just need to add a second expression to look whether the end of the list was reached already:
int enterArrayListAtElement = (numberOfItemsPerPage * (requestedPageNumber - 1));
for (int i = 0; i < numberOfItemsPerPage; i++) {
    if (enterArrayListAtElement + i < completeCatalog.size() && !completeCatalog.get(enterArrayListAtElement + i).isEmpty()) {
        completeCatalog.get(enterArrayListAtElement + i);
    }
}
An ArrayList has a size() method, which returns the number of elements in the list.
Therefore, you can use it in the if statement to check that you have not gone too far.
For example,
if(enterArrayListAtElement + i < completeCatalog.size()) {
...
}
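The size() checks discussed in the answers can also be folded into a small helper that returns the slice for a page directly. This is only a sketch: the class and method names (Paging, page) are made up, and it uses List.subList with the end index clamped to the list size, so the last page simply comes back shorter.

```java
import java.util.Collections;
import java.util.List;

class Paging {

    /** Returns the items for a 1-based page number, or an empty list past the end. */
    static <T> List<T> page(List<T> catalog, int pageNumber, int pageSize) {
        int from = pageSize * (pageNumber - 1);
        if (pageNumber < 1 || from >= catalog.size()) {
            return Collections.emptyList();
        }
        int to = Math.min(from + pageSize, catalog.size()); // clamp to the end of the list
        return catalog.subList(from, to);
    }
}
```

Note that subList returns a view of the original list, so copy it (for example with new ArrayList<>(...)) if the catalog may change while the page is in use.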
