I'm using a Map in Java to solve a problem, but I can't see how to compare one key's values to a previous key's values and then carry forward whatever is not being updated.
In the table below, we see loan swaps involving 4 companies that occur (in this example) from 1/1/2020 to 1/30/2020.
All in all, no matter the swaps, the participation percentages of all the companies need to add up to 100, like this:
Before Swap Date: company 1 (10) + company 2 (60) + company 3 (10) + company 4 (20) = 100 😊
1/15/2020 Swap Date: company 1 (10) + company 2 (30) + company 3 (40) + company 4 (20) = 100 😊
1/21/2020 Swap Date: company 1 (10) + company 2 (50) + company 3 (20) + company 4 (20) = 100 😊
1/30/2020 Swap Date: company 1 (0) + company 2 (50) + company 3 (20) + company 4 (30) = 100 😊
That is the goal of my code. This is my current thought process, but I'm stuck on how to pull down the values from a previous key-value pair when they are not changed by a swap -- in particular, how to accomplish the parts in red.
Any insight is greatly appreciated.
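One way to carry the unchanged values forward is to copy the previous snapshot and overwrite only the companies involved in the swap. Below is a minimal sketch of that idea, assuming the percentages are keyed by swap date in a TreeMap; the class name, dates, and values are just placeholders taken from the example above, not your actual code.

import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

public class ParticipationCarryForward {

    public static void main(String[] args) {
        // Starting participation before any swaps (company -> percent).
        Map<String, Integer> initial = new LinkedHashMap<>();
        initial.put("company 1", 10);
        initial.put("company 2", 60);
        initial.put("company 3", 10);
        initial.put("company 4", 20);

        // Swaps keyed by date; each swap lists only the companies whose
        // percentage actually changes on that date.
        TreeMap<LocalDate, Map<String, Integer>> swaps = new TreeMap<>();
        swaps.put(LocalDate.of(2020, 1, 15), Map.of("company 2", 30, "company 3", 40));
        swaps.put(LocalDate.of(2020, 1, 21), Map.of("company 2", 50, "company 3", 20));
        swaps.put(LocalDate.of(2020, 1, 30), Map.of("company 1", 0, "company 4", 30));

        // Carry forward: start from the previous snapshot and overwrite
        // only the companies mentioned in the current swap.
        TreeMap<LocalDate, Map<String, Integer>> snapshots = new TreeMap<>();
        Map<String, Integer> previous = initial;
        for (Map.Entry<LocalDate, Map<String, Integer>> swap : swaps.entrySet()) {
            Map<String, Integer> current = new LinkedHashMap<>(previous); // copy unchanged values
            current.putAll(swap.getValue());                              // apply the swap deltas
            snapshots.put(swap.getKey(), current);
            previous = current;
        }

        // Every snapshot should still add up to 100.
        snapshots.forEach((date, pcts) -> System.out.println(date + " -> " + pcts
                + " (total " + pcts.values().stream().mapToInt(Integer::intValue).sum() + ")"));
    }
}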
I have a dataset of transactions where each transaction represents a purchase of a single item. So, each order is recorded as 3 transactions if the order contained 3 items.
Example dataset:
User, Order, ItemCount, ItemPrice
1 1 1 10
1 1 1 10
1 2 1 30
1 2 1 30
2 3 1 20
2 3 1 20
3 4 1 15
3 4 1 15
3 4 1 15
To reduce the dataset I have grouped by order and user and aggregated ItemCount and ItemPrice to get a dataset like this:
User, Order, ItemCount, OrderAmount
1 1 2 20
1 2 2 60
2 3 2 40
3 4 3 45
Now I want to group the orders by user and do some analysis on the orders for each user. Is there a way in Spark to group the orders by user and end up with pairs of <User, Dataset<Order>>, where User is the user id and the Dataset contains that user's orders?
The only solution I see at the moment is to convert the Dataset to an RDD and use groupByKey to get a pair RDD of <User, List<Row>>, and then write some code to do my analysis on the list of rows.
I would prefer a solution where I can work with the orders as a Dataset and do my analysis using Dataset functionality. Can anyone point me in the right direction here? Is this possible?
I am new to Spark and have been using it with Java so far, as I have very limited experience with Scala, but examples in Scala would also help.
Just group by user and order and aggregate the ItemCount and ItemPrice columns. Then group by user and run all the aggregations on the appropriate columns.
df.groupBy($"User", $"Order").agg(sum($"ItemCount").as("count"),
sum($"ItemPrice").as("total"))
.groupBy($"User").agg(avg($"total").as("avg_amount"),
avg($"count").as("avg_count"),
count($"count").as("total_purchases"))
I have two tables.
table a:
name number noOfcol price color
john 1 4 2 green
phil 2 3 2 blue
harry 3 2 5 green
jack 4 5 6 red
jill 5 1 4 red
table b:
localName noOfcol price color
monster 2 4 blue
and I want table c to output:
localName name number
monster harry 3
monster jill 5
So what's happening here is that table c blacklists the rows whose color matches table b's color (blue) and keeps the rest; then it makes sure the price is at least the price listed in table b, and lastly it makes sure noOfcol is at most the noOfcol listed in table b. I'm having trouble creating a query that will do this for me. Any pointer would be greatly appreciated. (I also plan on implementing this in my Java app, using NetBeans, but for now a query is what I'm in need of.)
The query:
SELECT b.localName, a.name, a.number FROM a, b WHERE a.color != b.color AND a.price >= b.price AND a.noOfcol <= b.noOfcol;
gives that output.
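Since you plan to run this from a Java app later, here is a rough sketch of executing that query over JDBC. The connection URL, credentials, and driver are placeholders and not part of the original question.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TableCQuery {

    public static void main(String[] args) throws SQLException {
        // Placeholder connection details; replace with your own JDBC URL and credentials.
        String url = "jdbc:mysql://localhost:3306/mydb";
        String sql = "SELECT b.localName, a.name, a.number FROM a, b "
                   + "WHERE a.color != b.color AND a.price >= b.price "
                   + "AND a.noOfcol <= b.noOfcol";

        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(sql)) {
            while (rs.next()) {
                // Prints rows like: monster harry 3
                System.out.println(rs.getString("localName") + " "
                        + rs.getString("name") + " " + rs.getInt("number"));
            }
        }
    }
}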
This is a largely conceptual question, so I don't have any code to show. I'll try to explain it as best I can. I am writing a program that is supposed to find common sequences of numbers in a large table of random combinations.
So for example take this data:
1 5 3 9 6 3 8 8 3 3
6 7 5 5 5 4 9 2 0 1
6 4 4 3 7 8 3 9 5 6
2 4 2 4 5 5 3 4 7 7
1 5 6 3 4 9 9 3 3 2
0 2 7 9 4 5 3 9 8 3
These are random combinations of the digits 0-9. For every 3-digit (or longer) sequence found more than once, I need to put that sequence into another database. So the first row contains "5 3 9" and the 6th row also contains "5 3 9"; I would put that sequence in a separate table with the number of times it was found.
I'm still working out the algorithm for actually making these comparisons, but I figure I'll have to start with "1 5 3", compare that to every single 3-number trio found, then move on to "5 3 9", then "3 9 6", etc.
My main problem right now is that I don't know how to do this if the numbers are stored in a database. My database table has 11 columns: one column for each individual number, and one column for the 10-digit sequence as a whole. The columns are called Sequence, 1stNum, 2ndNum, 3rdNum...10thNum.
Visual: the first row in my database for the data above would be this:
| 1 5 3 9 6 3 8 8 3 3 | 1 | 5 | 3 | 9 | 6 | 3 | 8 | 8 | 3 | 3 |
("|" divide columns)
How do I make these comparisons efficiently in Java? I'm iterating over every row in the table many times: once for the initial sequence to be compared, and then, for each of those sequences, I go through every row again. Basically a for loop inside a for loop. This sounds like it's going to take a ton of queries and could take forever if the table gets to be massive (which it will).
Is it more computationally efficient to iterate through the database using queries, or to dump the database and iterate through a file?
I tried to explain this as best I could; it's a very confusing process for me. I can clarify anything you need me to. I just need guidance on what the best course of action would be.
Here is what I would do, assuming you have retrieved the sequences into a list:
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

List<String> sequences = Arrays.asList("1539638833", "6755549201", "6443783956",
        "2424553477", "1563499332", "0279453983");

// Count every 3-digit substring across all sequences.
Map<String, Integer> count = new HashMap<>();
for (String seq : sequences) {
    int length = seq.length();
    for (int i = 0; i < length - 2; i++) {
        String sub = seq.substring(i, i + 3);   // sliding window of size 3
        count.merge(sub, 1, Integer::sum);      // increment the count, starting at 1
    }
}
System.out.println(count);
Output:
{920=1, 783=1, 945=1, 332=1, 963=1, 644=1, 156=1, 983=1, 453=1, 153=1, 388=1, 534=1,
455=1, 245=1, 539=2, 554=1, 242=1, 555=1, 553=1, 437=1, 883=1, 349=1, 755=1, 675=1,
638=1, 395=1, 201=1, 956=1, 933=1, 499=1, 634=1, 839=1, 794=1, 027=1, 477=1, 833=1,
347=1, 492=1, 378=1, 279=1, 993=1, 443=1, 396=1, 398=1, 549=1, 563=1, 424=1}
You can then store these values in the database from the Map.
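For that last step, one option is a batched insert over JDBC. The sketch below assumes a hypothetical table sequence_counts(sequence, occurrences) and keeps only the substrings that were seen more than once; adapt the table and column names to your schema.

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

public class SequenceCountWriter {

    // Writes every substring that occurred more than once into the
    // (hypothetical) table sequence_counts(sequence, occurrences).
    static void store(Connection conn, Map<String, Integer> count) throws SQLException {
        String sql = "INSERT INTO sequence_counts (sequence, occurrences) VALUES (?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (Map.Entry<String, Integer> entry : count.entrySet()) {
                if (entry.getValue() > 1) {        // only sequences found more than once
                    ps.setString(1, entry.getKey());
                    ps.setInt(2, entry.getValue());
                    ps.addBatch();
                }
            }
            ps.executeBatch();                      // send all rows in one batch
        }
    }
}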
You can do it in SQL with a union clause:
select sum(c), sequence
from
(
    select count(*) as c, concat(col1, col2, col3) as sequence
    from t
    group by col1, col2, col3
    union all  -- union all, not union, so equal (c, sequence) pairs are not collapsed
    select count(*) as c, concat(col2, col3, col4) as sequence
    from t
    group by col2, col3, col4
    union all (... and so on, enumerating through the column combinations)
) as tt
group by sequence
I would imagine a pure Java implementation would be quicker and have less memory overhead. But if you already have it in the database, it may be quick enough.
I need to construct an SQLite query on Android. My DB looks similar to this:
ColA ColB ColC
1 Jim 16
2 Rob 14, 12
3 Tom 1, 4, 7
How do I run a query to match a number in a list of numbers?
I am trying to do this:
SELECT ColA, ColB FROM nametable WHERE 4 IN ColC
This should return row 3, but not row 2, so I can't just use "LIKE %4%".
I would prefer that you change your database design, since it is not normalised, as follows:
Remove column ColC from your table.
Create another table that has two columns:
a. an id column that contains the id of the record in the table you already have.
b. a column that contains one number.
Your tables will look like the following:
Table 1
ColA ColB
1 Jim
2 Rob
3 Tom
Table 2
ColA ColB
1 16
2 14
2 12
3 1
3 4
3 7
Your select statement will be something like:
Select a.* from table1 a, table2 b where a.ColA = b.ColA and b.ColB = 4
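On Android, that query against the normalised tables could be run with rawQuery. The sketch below assumes the tables are literally named table1 and table2 as above and that db is an already-open SQLiteDatabase; adjust the names to your schema.

import android.database.Cursor;
import android.database.sqlite.SQLiteDatabase;

public class NumberLookup {

    // Prints the rows from table1 whose id has the given number in table2,
    // using the normalised two-table design described above.
    static void printMatches(SQLiteDatabase db, int number) {
        String sql = "SELECT a.ColA, a.ColB FROM table1 a "
                   + "JOIN table2 b ON a.ColA = b.ColA "
                   + "WHERE b.ColB = ?";
        try (Cursor cursor = db.rawQuery(sql, new String[] { String.valueOf(number) })) {
            while (cursor.moveToNext()) {
                System.out.println(cursor.getInt(0) + " " + cursor.getString(1));
            }
        }
    }
}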
Good Luck
It should be ColC LIKE '% 4,%' OR ColC LIKE '% 4' OR ColC LIKE '4, %' OR ColC LIKE '4' to make it work for the first and last items as well.
But that's getting complicated. Maybe you should have a child table with a one-to-many relationship to store the numbers.
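If changing the schema is not an option, those LIKE patterns can be combined in one WHERE clause. Here is a sketch of that workaround on Android, assuming the list is stored comma-and-space separated as in the example and that db is an open SQLiteDatabase.

import android.database.Cursor;
import android.database.sqlite.SQLiteDatabase;

public class CsvColumnLookup {

    // Matches a number stored in the comma-separated ColC, covering the cases
    // where it is the only item, the first item, the last item, or in the middle.
    static Cursor findByNumber(SQLiteDatabase db, String number) {
        String sql = "SELECT ColA, ColB FROM nametable "
                   + "WHERE ColC = ? "                     // only item:   "4"
                   + "OR ColC LIKE ? || ', %' "            // first item:  "4, ..."
                   + "OR ColC LIKE '% ' || ? "             // last item:   "..., 4"
                   + "OR ColC LIKE '% ' || ? || ', %'";    // middle item: "..., 4, ..."
        return db.rawQuery(sql, new String[] { number, number, number, number });
    }
}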