for (Tweet tweet : tweets) {
for(long forId : idFromArray){
long tweetId = tweet.getId();
if(forId != tweetId){
String twitterString = tweet.getText();
db.insertTwitter(twitterString, tweetId);
}
}
}
My code won't get past the first for loop, because idFromArray is empty: I don't add anything to it until a tweet has been added to the database.
And even if there is something in the array, it loops over the whole array for every tweet (DUH! Since I have two loops), which bloats the database with copies of the same tweets.
It is not simply a matter of comparing the two tweet ids and ignoring the ones with the same id.
I'm pretty certain there is a really simple solution to this problem, but I still can't wrap my head around it. Anybody?
UPDATE:
What I want is for the code to ignore any tweetId that is already in the database.
And just insert the tweets that are not in the database.
I don't think I should have two for-loops, I think the second loop should be replaced with something? (or maybe I'm wrong?)
If I understand correctly, what you want to do, in pseudo-code is the following:
for (Tweet tweet : tweets) {
if (!db.containsTweet(tweet.getId())) {
db.insertTweet(tweet.getText(), tweet.getId());
}
}
I assume your db class actually uses an sqlite database as a backend? What you could do is implement containsTweet directly and just query the database each time, but that seems less than perfect. The easiest solution if we go by your base code is to just keep a Set around that indexes the tweets. Since I can't be sure what the equals() method of Tweet looks like, I'll just store the identifiers in there. Then you get:
Set<Long> tweetIds = new HashSet<Long>(); // Long, because tweet.getId() returns a long in the question
for (Tweet tweet : tweets) {
if (!tweetIds.contains(tweet.getId())) {
db.insertTweet(tweet.getText(), tweet.getId());
tweetIds.add(tweet.getId());
}
}
It would probably be better to save a tiny bit of this work, by sorting the list of tweets to begin with and then just filtering out duplicate tweets. You could use:
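// if tweets is a List (needs java.util.Collections and java.util.Comparator)
Collections.sort(tweets, new Comparator<Tweet>() {
public int compare(Tweet t1, Tweet t2) {
// ascending by id; Long.compare avoids the overflow that subtracting ids can cause
return Long.compare(t1.getId(), t2.getId());
}
});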
Then process it:
Long oldId = null; // no previous tweet yet
for (Tweet tweet : tweets) {
if (oldId == null || oldId != tweet.getId()) {
db.insertTweet(tweet.getText(), tweet.getId());
}
oldId = tweet.getId();
}
Yes, you could do this using a second for-loop, but you'll run into performance problems much more quickly than with this approach (although what we're doing here is spending extra memory to save time, of course).
Your syntax is not correct. It should look like this:
for (Tweet tweet : tweets) {
for(long forId : idFromArray){
long tweetId = tweet.getId();
if(forId != tweetId){
String twitterString = tweet.getText();
db.insertTwitter(twitterString);
}
}
}
EDIT
This answer no longer really answers the question since it was updated ;)
The simplest solution would be to set a boolean variable to true at the point where you currently do the insert, then check it in the outer loop and insert the tweet there if the boolean is true...
for (Tweet : tweets){ ...
should really be
for(Tweet tweet: tweets){...
So you really want:
for each tweet
unless tweet is in db
insert tweet
If so, just write it down in your programming language.
Hint: The loop over the array is to be done before the insert, which is done depending on the outcome.
What you want to test is that all array elements are not equal to the current one. But your for loop does not do that.
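The hint above can be sketched as follows: scan the whole collection of known ids first, and only then decide whether to insert. This is a minimal sketch, not the asker's actual code; the class name, the `contains` helper and the use of a `List<Long>` as a stand-in for the database contents are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class InsertUnlessPresentSketch {
    // Returns true if id occurs anywhere in knownIds.
    static boolean contains(List<Long> knownIds, long id) {
        for (long known : knownIds) {
            if (known == id) {
                return true;
            }
        }
        return false;
    }

    // knownIds stands in for the ids already stored in the database (assumption).
    // Returns the ids that were newly "inserted".
    static List<Long> insertNew(long[] incomingIds, List<Long> knownIds) {
        List<Long> inserted = new ArrayList<>();
        for (long id : incomingIds) {
            // The decision is made once per tweet, after the full scan,
            // not once per element of knownIds.
            if (!contains(knownIds, id)) {
                knownIds.add(id); // hypothetical stand-in for the real insert call
                inserted.add(id);
            }
        }
        return inserted;
    }
}
```

Because the insert sits outside the inner scan, an empty `knownIds` no longer prevents the first tweet from being stored, and a tweet is never inserted more than once.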
Related
It's difficult for me to think how to ask this so I will create an example to demonstrate what I am asking for:
Suppose I have my model:
public class UserEvaluation {
String name;
Date respondedAt;
}
public class Evaluation {
String name;
List<UserEvaluation> userEvaluations;
}
And then in my EvaluationService I need to know the number of userEvaluations that have been responded to (respondedAt != null).
Possible solutions:
1 By iterating through all the items:
Evaluation evaluation = evaluationRepository.get(1);
long count = 0; // a primitive long; "Long count = 0;" does not compile because 0 is an int
for(UserEvaluation userEvaluation : evaluation.getUserEvaluations()) {
if(userEvaluation.getRespondedAt() != null) {
count++;
}
}
2 By Lambda Expressions:
Evaluation evaluation = evaluationRepository.get(1);
Long count = evaluation.getUserEvaluations().stream()
.filter(ue -> ue.getRespondedAt() != null)
.count();
3 By querying the database:
Evaluation evaluation = evaluationRepository.get(1);
Long count = userEvaluationRepository.getRespondedCountByEvaluation(evaluation); //And implement this simple count query.
So this is the simplest case. Which should I pick? I use a lot of iterators and lambda stream expressions in my app, but I am worried that this might be a mistake and that I should be interacting more with the database. Should I? Should I not?
If you are not sure I would design it so you can change it as required. Have a DAO implementation which accesses the database, but one which could be changed to use in memory data if you determine using the database is not fast enough.
interface DatabaseDAO {
long getRespondedCountByEvaluation(Evaluation e);
}
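As a rough illustration of why the interface helps, here is a minimal sketch of an in-memory implementation that could be swapped for a database-backed one without touching the callers. The names `RespondedCountDao` and `InMemoryRespondedCountDao` are hypothetical, and the model classes are cut down to what the example needs.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Simplified versions of the model classes from the question.
class UserEvaluation {
    final String name;
    final Date respondedAt; // null means "not responded yet"

    UserEvaluation(String name, Date respondedAt) {
        this.name = name;
        this.respondedAt = respondedAt;
    }
}

class Evaluation {
    final List<UserEvaluation> userEvaluations = new ArrayList<>();
}

// Hypothetical name; plays the role of the DAO interface described in the answer.
interface RespondedCountDao {
    long getRespondedCountByEvaluation(Evaluation e);
}

// Counts over data that is already loaded; a database-backed class
// could implement the same interface with a count query instead.
class InMemoryRespondedCountDao implements RespondedCountDao {
    @Override
    public long getRespondedCountByEvaluation(Evaluation e) {
        return e.userEvaluations.stream()
                .filter(ue -> ue.respondedAt != null)
                .count();
    }
}
```

Callers depend only on the interface, so the choice between the two implementations can be made later, once it is known which one is faster for the real data.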
It depends on which approach would retrieve less data. If you grab the data once and extract information from that many times it can be faster, but if you are only getting a small portion each time and have an indexed database it can be much faster. I would design it
I have just written code to cache a table in memory (a simple Java HashMap). One piece of code I am now trying to replace finds objects based on criteria: it receives multiple field parameters, and each one that is not null and not empty was added to the Hibernate query criteria.
To replace this, what I am thinking of doing is:
For each valid param (not null and not empty) I will create a HashSet of the objects that satisfy that criterion.
Once I have made HashSets for all valid criteria, I will call Set.retainAll(second_set) across all of them, so that at the end I am left with a single set that is the intersection of all valid criteria.
Does it sound like the best approach or is there any better way to implement this ?
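The retainAll idea described above can be sketched as follows. This is only an illustration under assumptions: objects are represented by their `Long` ids, the caller passes in one set per active criterion, and the class and method names are made up for the example.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CriteriaIntersectionSketch {
    // Intersects the match sets of all active criteria.
    // Returns null when no criterion was active, so the caller can decide
    // whether that should mean "everything" or "nothing".
    static Set<Long> intersect(List<Set<Long>> activeCriteriaMatches) {
        Set<Long> result = null;
        for (Set<Long> matches : activeCriteriaMatches) {
            if (result == null) {
                // Copy the first set so that retainAll never modifies cached data.
                result = new HashSet<>(matches);
            } else {
                result.retainAll(matches);
            }
        }
        return result;
    }
}
```

Collecting only the active sets into a list and seeding the result from the first one avoids having to work out by hand which set to call retainAll on.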
EDIT
My original question still stands and I am still looking for an answer to it, but I ended up implementing this in the following way. The reason is that the set-based approach was rather cumbersome: after creating all the sets, I first had to figure out which set was non-empty so that retainAll could be called on it, which resulted in lots of if-else statements. My current implementation looks like this:
private List<MyObj> getCachedObjs(Long criteria1, String criteria2, String criteria3) {
List<MyObj> results = new ArrayList<>();
int totalActiveFilters = 0;
if (criteria1 != null){
totalActiveFilters++;
}
if (!StringUtil.isBlank(criteria2)){
totalActiveFilters++;
}
if (!StringUtil.isBlank(criteria3)){
totalActiveFilters++;
}
for (Map.Entry<Long, MyObj> objEntry : objCache.entrySet()){
MyObj obj = objEntry.getValue();
int matchedFilters = 0;
if (criteria1 != null) {
if (obj.getCriteria1().equals(criteria1)) {
matchedFilters++;
}
}
if (!StringUtil.isBlank(criteria2)){
if (obj.getCriteria2().equals(criteria2)){
matchedFilters++;
}
}
if (!StringUtil.isBlank(criteria3)){
if (obj.getCriteria3().equals(criteria3)){
matchedFilters++;
}
}
if (matchedFilters == totalActiveFilters){
results.add(obj);
}
}
return results;
}
I am looking for a way to retrieve an object from a HashSet in Java. I iterated over its elements like this:
for (Customer remainingNode : availableNodes) {
remainingNode.setMarginalGain(calculateMarginalGain(
remainingNode, seedSet, network, availableNodes,
churnNet));
}
Unfortunately, because of a ConcurrentModificationException, I have to change that to something like this:
for(int i=0;i<numberofRemainingNodes;i++){
Customer remainingNode=availableNodes.get(i);
remainingNode.setMarginalGain(calculateMarginalGain(
remainingNode, seedSet, network, availableNodes,
churnNet));
numberofRemainingNodes=availableNodes.size();
}
But I cannot do that because Java's HashSet has no get(index) method. Could you please help me handle this situation?
P.S.: I used a HashSet because I need union and intersection operations and do not want duplicate elements. Please bear in mind that this part of my program runs millions of times, so a little extra latency could be expensive for the whole program.
FYI:
private int calculateMarginalGain(Customer remainingNode,
HashSet<Customer> seedSet,
DirectedSparseGraph<Customer, Transaction> net,
Set<Customer> availableNodes, HashSet<Customer> churnNetwork) {
// Marginal gain for short-term campaign
HashSet<Customer> tmp = new HashSet<Customer>(); // seedset U
// {remainingNode}
tmp.add(remainingNode);
Set<Customer> tmpAvailableNodes = availableNodes;
HashSet<Customer> NeighborOfChurn = getNeighbors(churnNetwork, net);
// sigma function for calculating the expected number of influenced
// customers- seedSettmp=seedset U {u}
tmpAvailableNodes.removeAll(NeighborOfChurn);
Set<Customer> influencedNet = getNeighbors(tmp, net);
tmpAvailableNodes.retainAll(influencedNet);
return tmpAvailableNodes.size();
}
private HashSet<Customer> getNeighbors(HashSet<Customer> churnNetwork,
DirectedSparseGraph<Customer, Transaction> net) {
HashSet<Customer> churnNeighbors = churnNetwork;
Collection<Customer> neighbors = new HashSet<Customer>();
for (Customer node : churnNetwork) {
neighbors = net.getNeighbors(node);
for (Customer neighbor : neighbors) {
churnNeighbors.add(neighbor);
}
}
return churnNeighbors;
}
The problem in your code is that you change the structure of your HashSet during the iteration. It happens within the calculateMarginalGain() method, in this line:
tmpAvailableNodes.removeAll(NeighborOfChurn);
Think twice about whether this is really what you want! If it is, you can easily work around the problem by first making a copy of the set and iterating over the copy. E.g.:
Set<Customer> copy = new HashSet<Customer>();
copy.addAll(availableNodes);
for (Customer remainingNode : copy) {
....
}
Note that tmpAvailableNodes and availableNodes refer to the very same set object, because assigning one to the other does not copy anything. Maybe you can improve the code here in general.
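To make the aliasing point concrete, here is a minimal sketch in which the helper works on a real copy instead of a second reference to the caller's set. It uses `String` elements instead of `Customer` to stay self-contained, and the class and method names are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class DefensiveCopySketch {
    // Counts how many elements of 'available' remain after removing 'excluded',
    // without touching the caller's set.
    static int countRemaining(Set<String> available, Set<String> excluded) {
        Set<String> working = new HashSet<>(available); // a real copy, not an alias
        working.removeAll(excluded);
        return working.size();
    }

    public static void main(String[] args) {
        Set<String> available = new HashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> excluded = new HashSet<>(Arrays.asList("b"));
        for (String node : available) {
            // Safe: countRemaining never modifies 'available' while it is being iterated.
            System.out.println(node + " -> " + countRemaining(available, excluded));
        }
    }
}
```

The copy costs one extra allocation per call, which may matter if the method really runs millions of times, so it is worth measuring before settling on this.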
You have to use an Iterator:
Iterator<Customer> custIter = availableNodes.iterator();
while(custIter.hasNext()) {
Customer customer = custIter.next();
// do your work here
}
Using this, you can remove the current element through custIter.remove() without getting a ConcurrentModificationException. It is not clear why you get it, though. If you are modifying the HashSet from multiple threads, consider using a concurrent data structure instead.
If you modify availableNodes in setMarginalGain you will still get the exception though.
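A small sketch of the distinction: removing through the iterator is allowed during iteration, while changing the set by any other route is what triggers the exception. The example uses a set of integers and a made-up class name purely for illustration.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class IteratorRemoveSketch {
    // Removes every even number from the set while iterating over it.
    static void removeEvens(Set<Integer> numbers) {
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {
            int n = it.next();
            if (n % 2 == 0) {
                // Removal through the iterator keeps the iteration valid.
                // Calling numbers.remove(n) here instead may throw
                // ConcurrentModificationException on the next call to next().
                it.remove();
            }
        }
    }
}
```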
So I am supposed to write an add method for an array list which adds a new movie object to the list if it doesn't exist yet, or, if it finds a movie object with a similar title in the list, just increases the quantity property of that object. Here is what I've got so far.
public void add(String title, double rating, int releaseYear){
if(this.myMovies.size() < 1)
{
Movie mymovie = new Movie(title, rating, releaseYear);
this.myMovies.add(mymovie);
}
else
{
for(int i = 0; i < this.myMovies.size(); i++)
{
Movie temp = this.myMovies.get(i);
if(temp.Title.equals(title)){
this.myMovies.get(i).quantity++;
break;
}
else
{
Movie mymovie = new Movie(title, rating, releaseYear);
this.myMovies.add(mymovie);
break;
}
}
}
}
My problem is that this ends up not taking account of similar names and doesn't increase the quantity but just adds another object to the list. I have a strong feeling that the problem lies within my For loop but I just can't identify it. Can anyone see anything that I may be doing wrong? Thank you!
You're testing only for equality, not similarity here:
if(temp.Title.equals(title)){
Instead, you should write a helper method to test for similarity based on whatever criteria are appropriate. For example:
if (isSimilar(temp.Title, title)){
and the isSimilar method might look something like this (assuming you don't need any input validation):
private boolean isSimilar(String title1, String title2) {
return title1.equalsIgnoreCase(title2)
|| title1.toLowerCase().contains(title2.toLowerCase())
|| title2.toLowerCase().contains(title1.toLowerCase());
}
or, perhaps more appropriately, like this (if you implement it in the Movie class):
private boolean isSimilar(Movie otherMovie) {
return title.equalsIgnoreCase(otherMovie.title)
|| title.toLowerCase().contains(otherMovie.title.toLowerCase())
|| otherMovie.title.toLowerCase().contains(title.toLowerCase());
}
...in which case your if statement would also change slightly.
Keep in mind that I don't know what you consider 'similar'; only that the movies are considered similar if the names are similar.
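Here is one way the pieces could fit together, as a minimal sketch: a similarity test inside the Movie class, and an add method that inserts only after the whole list has been scanned without a match. The cut-down `Movie` class, the `MovieList` wrapper and the starting quantity of 1 are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class Movie {
    final String title;
    int quantity = 1; // assumption: a newly added movie starts at quantity 1

    Movie(String title) {
        this.title = title;
    }

    // "Similar" here means: equal ignoring case, or one title contains the other.
    boolean isSimilar(Movie other) {
        String a = title.toLowerCase(Locale.ROOT);
        String b = other.title.toLowerCase(Locale.ROOT);
        return a.contains(b) || b.contains(a);
    }
}

class MovieList {
    final List<Movie> myMovies = new ArrayList<>();

    void add(Movie candidate) {
        for (Movie existing : myMovies) {
            if (existing.isSimilar(candidate)) {
                existing.quantity++;
                return; // found a match, so nothing is inserted
            }
        }
        // Reached only when no movie in the list matched.
        myMovies.add(candidate);
    }
}
```

In the question's version the else branch adds the movie and breaks during the very first iteration, so only the first element is ever compared; moving the insert after the loop is what fixes that.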
A couple more comments:
Fields and method names generally start with a lowercase letter (so the field Movie.Title should instead be Movie.title).
It's usually preferable to loop over a Collection using an Iterator instead of using the raw index--partly because the Iterator should always know how to loop over the Collection efficiently.
Learn to use your IDE's debugger (it's probably very easy). Then you can step through each line of code to see exactly where your program is doing something unexpected.
I would do something like this:
public void add(String title, double rating, int releaseYear){
for(Movie m: myMovies)
{
if(m.Title.equals(title)){
m.quantity++;
return;
}
}
// movie with same title not found in the list -> insert
this.myMovies.add(new Movie(title, rating, releaseYear));
}
By the way: variable names should start with a lowercase character (Title -> title).
I'm addressing your "similarity" requirement. If you really want to do this properly it could be a lot of work. Essentially you have two strings and want to get a measure of the similarity. I am doing the same thing for figure captions and I plan to tackle it by:
splitting the title into words
lowercasing them
using them as features for classifier4J (http://classifier4j.sourceforge.net/)
That will go a long way based on simple word counts. But then you have the problem of stemming (words that differ by endings, such as "Alien" and "Aliens"). If you go down this road you'll need to read up on classification and natural language processing.
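As a much simpler stand-in for a classifier, the first two steps (splitting into words and lowercasing) can already give a rough score. The sketch below computes the Jaccard similarity of the two word sets; it does not use classifier4J, and the class name and scoring choice are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class TitleSimilaritySketch {
    // Splits a title on whitespace and lowercases the words.
    static Set<String> words(String title) {
        String[] parts = title.toLowerCase(Locale.ROOT).trim().split("\\s+");
        Set<String> result = new HashSet<>(Arrays.asList(parts));
        result.remove(""); // an empty title would otherwise yield one empty "word"
        return result;
    }

    // Jaccard similarity: shared words divided by all distinct words, from 0.0 to 1.0.
    static double similarity(String a, String b) {
        Set<String> wa = words(a);
        Set<String> wb = words(b);
        if (wa.isEmpty() && wb.isEmpty()) {
            return 1.0; // two empty titles are treated as identical
        }
        Set<String> shared = new HashSet<>(wa);
        shared.retainAll(wb);
        Set<String> all = new HashSet<>(wa);
        all.addAll(wb);
        return (double) shared.size() / all.size();
    }
}
```

Without stemming, "Alien" and "Aliens" share no words and score 0.0, which is exactly the limitation described above.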
I'm currently working on a multi-class assignment where I have to add a course depending on whether its prerequisites exist within the program.
I'm storing my courses within the program class using a HashMap (I thought it would come in handy). However, I'm having a bit of trouble ensuring that these preReqs exist.
Here is the code I've currently got:
public boolean checkForCourseFeasiblity(AbstractCourse c) throws ProgramException
{
AbstractCourse[] tempArray = new AbstractCourse[0];
tempArray= courses.keySet().toArray(tempArray);
String[] preReqsArray = new String[1];
preReqsArray = c.getPreReqs();
//gets all course values and stores them in tempArray
for(int i = 0; i < preReqsArray.length; i++)
{
if(courses.containsKey(preReqsArray[i]))
{
continue;
}
else if (!courses.containsKey(preReqsArray[i]))
{
throw new ProgramException("preReqs do not exist"); //?
}
}
return true;
}
OK, so basically, tempArray stores all the keys of the courses HashMap and I need to compare all of them with the preReqs (which is an array of Strings). If the preReqs exist within the key set, add the course; if they don't, do not add the course. Return true if the course is added, otherwise throw an exception. Keep in mind that my keys are Strings, e.g. a key could be "Programming1" and the required prerequisite for a course could be "programming1". If that is the case, add the course, as the prerequisite course exists in the key set.
I believe my error occurs when I initialize preReqsArray with c.getPreReqs() (note: getPreReqs is a getter with return type String[]).
It would be really great if someone could help me with my dilemma. I've tried to provide as much as possible; I feel like I've been going around in circles for the past 3 hours :(
-Thank you.
Try something like this, you don't need tempArray. The "for each" loop looks lots nicer too. If you want to throw an Exception I would put that logic in the place that calls this method.
public boolean checkForCourseFeasiblity(AbstractCourse c)
{
for(String each : c.getPreReqs())
{
if(! courses.containsKey(each))
{
return false;
}
}
return true;
}
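One thing the loop above does not cover is the question's example of "Programming1" versus "programming1": containsKey on a HashMap with String keys is case-sensitive. A minimal sketch of one way to handle that, assuming it is acceptable to store the keys lower-cased, could look like this. The class name, the helper methods and the String values standing in for course objects are all hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class ProgramSketch {
    // Keys are stored lower-cased so that lookups ignore case.
    // String values stand in for the real course objects (assumption).
    private final Map<String, String> courses = new HashMap<>();

    void addCourseUnchecked(String code) {
        courses.put(code.toLowerCase(Locale.ROOT), code);
    }

    // True only if every prerequisite is present, compared without regard to case.
    boolean prerequisitesExist(String[] preReqs) {
        for (String each : preReqs) {
            if (!courses.containsKey(each.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }
}
```

A course with no prerequisites passes the check trivially, because the loop body never runs.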