I am creating a mock Twitter project which loads user data from a somewhat large text file containing ~3.6 million lines formatted like this:
0 12
0 32
1 9
1 54
2 33
etc...
The first string token is the userId and the second is the followId.
The first half of this helper method takes in the current user's ID, checks to see if it exists and creates a new user if necessary. After that, the followId is added to this new or existing user's following list of type ArrayList<Integer>.
With ~3.6 million lines to read, this doesn't take long (9868 ms).
Now the second half creates or finds the followed user (followId) and adds the userId to their followers list, but this additional code increases the time to read the file dramatically, to 172744 ms (roughly 17 times longer).
I tried using the same TwitterUser object throughout the method. All of the adding methods (follow, addFollower) are simple ArrayList.add() methods. Is there anything I can do to make this method more efficient?
Please note: While this is school-related, I'm not asking for a solution to my assignment. My professor permitted this slow object initialization, but I'd like to understand how I can make it faster.
private Map<Integer, TwitterUser> twitterUsers = new HashMap<Integer, TwitterUser>();
private void AddUser(int userId, int followId){
TwitterUser user = getUser(userId);
if (user == null){
user = new TwitterUser(userId);
user.follow(followId);
twitterUsers.putIfAbsent(userId, user);
} else{
user.follow(followId);
}
//adding the code below, slows the whole process enormously
user = getUser(followId);
if (user == null){
user = new TwitterUser(followId);
user.addFollower(userId);
twitterUsers.putIfAbsent(followId, user);
} else{
user.addFollower(userId);
}
}
private TwitterUser getUser(int id){
if (twitterUsers.isEmpty()) return null;
return twitterUsers.get(id);
}
If putIfAbsent(int, User) does what you would expect it to do, that is: checking if it's there before inserting, why do you use it within an if block whose condition already checks if the user is there?
In other words, if fetching a user returned a null value you can safely assume that the user was not there.
Now I'm not too sure about the internal workings of the *putIfAbsent* method, but on a HashMap it should be a hashed lookup rather than a loop over the key set, so the saving from a plain put(int, User) is probably small. Still, it removes a redundant second check for every new user, and that is work you don't need to do while the input file gets scanned through.
Therefore I would suggest to try something like:
user = getUser(followId);
if (user == null){
user = new TwitterUser(followId);
user.addFollower(userId);
twitterUsers.put(followId, user);
} else{
user.addFollower(userId);
}
which would apply to the first half as well.
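As a minimal sketch of the same idea, both halves can be collapsed with Map.computeIfAbsent, which creates the user only when the key is missing and otherwise returns the existing one. The TwitterUser class below is a hypothetical stand-in for the one in the question, and FollowGraph is an illustrative name. Since a HashMap lookup is a hashed operation either way, this is mainly a tidiness change and may not by itself account for the slowdown.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the question's TwitterUser class.
class TwitterUser {
    final int id;
    final List<Integer> following = new ArrayList<>();
    final List<Integer> followers = new ArrayList<>();

    TwitterUser(int id) {
        this.id = id;
    }

    void follow(int followId) {
        following.add(followId);
    }

    void addFollower(int followerId) {
        followers.add(followerId);
    }
}

// Illustrative wrapper around the map from the question.
class FollowGraph {
    private final Map<Integer, TwitterUser> twitterUsers = new HashMap<>();

    // One lookup per user: create the user on first sight, otherwise reuse it.
    void addUser(int userId, int followId) {
        twitterUsers.computeIfAbsent(userId, TwitterUser::new).follow(followId);
        twitterUsers.computeIfAbsent(followId, TwitterUser::new).addFollower(userId);
    }

    TwitterUser getUser(int id) {
        return twitterUsers.get(id);
    }

    int size() {
        return twitterUsers.size();
    }
}
```

If the load is still slow after this, it is worth timing follow and addFollower separately, and checking memory pressure, before blaming the map.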
I wrote a function in Java to get the next turn from a database (PostgreSQL) table. After getting the next turn, the record is updated so no other user can get the same turn. If another user requests the next turn at the same time, there is a chance that both get the same turn. So my first idea is to synchronize the function so only one user can request a turn at a time. But there are several departments: two users from the same department cannot request a turn at the same time, but two users from different departments could without any issue.
This is a simplified version of the function:
private DailyTurns callTurnLocal(int userId)
{
try {
DailyTurns turn = null;
DailyTurns updateTurn = null;
//get next turn for user (runs a query to the database)
turn = getNextTurnForUser(userId);
//found turn for user
if (turn != null)
{
//reuse the fetched record (this assigns the same object, it does not copy it)
updateTurn = turn;
//change status to turn called
updateTurn.setTurnStatusId(TURN_STATUS_CALLED);
//add time for the event
updateTurn.setEventDate(new Date());
//update user that took the turn
updateTurn.setUserId(userId);
//save new record in the DB
updateTurn = save(updateTurn);
}
return updateTurn;
}
catch (Exception e)
{
logger.error( "Exception: " + e.getMessage(), e );
return null;
}
}
I'm aware that I can synchronize the entire function, but that would slow the process down if two or more threads from users in different departments want to get the next turn. How can I add synchronization per department? Or is this something that I can achieve with a function in the DB?
A more obvious solution would be to keep a cache such as a ConcurrentHashMap keyed by department, holding one lock object per department, and to synchronize on the entry for the caller's department.
This won't lock the entire object, so different threads can operate concurrently for different departments.
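A minimal sketch of that idea, assuming the department can be identified by an int ID (the question does not show how a user maps to a department, so departmentId and the DepartmentLocks class are hypothetical):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical helper: one lock object per department.
class DepartmentLocks {
    // Lock objects are created lazily; computeIfAbsent makes the creation atomic.
    private final ConcurrentMap<Integer, Object> locks = new ConcurrentHashMap<>();

    // Runs the action while holding only the lock of the given department.
    <T> T withDepartmentLock(int departmentId, Supplier<T> action) {
        Object lock = locks.computeIfAbsent(departmentId, id -> new Object());
        synchronized (lock) {
            return action.get();
        }
    }
}
```

It could be used as locks.withDepartmentLock(departmentId, () -> callTurnLocal(userId)). Note that this only serializes threads inside a single JVM; if several application instances share the database, the locking would have to happen at the database level instead.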
Every time before I place a new order with IB, I need to request the next valid orderId and call Thread.sleep(500) to wait 0.5 seconds for the IB API's callback function nextValidId to return the latest orderId. If I want to place multiple orders, I have to naively call Thread.sleep multiple times. This is not a good way to handle it: the orderId could have been updated earlier, so the new order could have been placed earlier. And if the orderId takes longer to update than the sleep time, this results in an error.
Is there a more efficient and elegant way to do this?
Ideally, I want the program to hold off running placeNewOrder until the latest available orderId has been updated, and then be notified to run placeNewOrder.
I do not know much about Java synchronization, but I reckon there might be a better solution using synchronized, wait/notify, locking, or blocking.
My code:
// place first order
ib_client.reqIds(-1);
Thread.sleep(500);
int currentOrderId = ib_wrapper.getCurrentOrderId();
placeNewOrder(currentOrderId, orderDetails); // my order placement method
// place 2nd order
ib_client.reqIds(-1);
Thread.sleep(500);
currentOrderId = ib_wrapper.getCurrentOrderId(); // reuse the variable declared above
placeNewOrder(currentOrderId, orderDetails); // my order placement method
IB EWrapper:
public class EWrapperImpl implements EWrapper {
...
protected int currentOrderId = -1;
...
public int getCurrentOrderId() {
return currentOrderId;
}
public void nextValidId(int orderId) {
System.out.println("Next Valid Id: ["+orderId+"]");
currentOrderId = orderId;
}
...
}
You never need to ask for IDs. Just increment by one for every order.
When you first connect, nextValidId is the first or second message to be received; just keep track of the ID and keep incrementing.
The only rules for orderId are to use an integer and to always increment by some amount. This is per clientId, so if you connect with a new clientId, the last orderId is something else.
I always use max(1000, nextValidId) to make sure my IDs start at 1000 or more, since I use IDs below 1000 for data requests. It just helps with errors that have IDs.
You can also reset the sequence somehow.
https://interactivebrokers.github.io/tws-api/order_submission.html
This means that if there is a single client application submitting
orders to an account, it does not have to obtain a new valid
identifier every time it needs to submit a new order. It is enough to
increase the last value received from the nextValidId method by one.
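A minimal sketch of that approach: seed a counter from the first nextValidId callback, block only until that first value has arrived, and hand out consecutive IDs afterwards. OrderIdTracker and its method names are hypothetical and not part of the IB API; the only assumption about the API is that your EWrapper's nextValidId(int) forwards to onNextValidId.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper; call onNextValidId from your EWrapper.nextValidId callback.
class OrderIdTracker {
    private final AtomicInteger nextId = new AtomicInteger(-1);
    private final CountDownLatch firstIdReceived = new CountDownLatch(1);

    // Seeds or bumps the counter; the counter never moves backwards.
    void onNextValidId(int orderId) {
        nextId.accumulateAndGet(orderId, Math::max);
        firstIdReceived.countDown();
    }

    // Waits until the first ID has arrived, then returns consecutive IDs.
    int takeNextOrderId(long timeout, TimeUnit unit) throws InterruptedException {
        if (!firstIdReceived.await(timeout, unit)) {
            throw new IllegalStateException("No nextValidId received within the timeout");
        }
        return nextId.getAndIncrement();
    }
}
```

With this, placeNewOrder(tracker.takeNextOrderId(5, TimeUnit.SECONDS), orderDetails) replaces the reqIds/sleep pair: the first call waits exactly as long as the callback takes, and later calls return immediately.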
You should not manage the order ID yourself if you use ApiController; it tracks and sets the ID automatically. Otherwise you will get the annoying "Duplicate order id" error (103). From the ApiController class:
public void placeOrModifyOrder(Contract contract, final Order order, final IOrderHandler handler) {
if (!checkConnection())
return;
// when placing new order, assign new order id
if (order.orderId() == 0) {
order.orderId( m_orderId++);
if (handler != null) {
m_orderHandlers.put( order.orderId(), handler);
}
}
m_client.placeOrder( contract, order);
sendEOM();
}
While implementing a database structure, my goal is to provide easy access to player data.
So, I have created the User class, which holds a Json instance and exposes the methods to take specific information from it.
public class User {
private Json data;
public User(OfflinePlayer player) {
File path = new File(player.getUniqueId() + ".json");
data = new Json(path);
}
public boolean isPremium() {
return data.getBoolean("premium");
}
}
The problem is that I have to create a new instance every time I need to know something about the same player from different parts of my code. That's very expensive!
So, is there a design pattern for this particular situation?
This is a simple cache. If you are using an ORM such as Hibernate, you could use its second-level cache for this.
You could also use a unique user identifier (UUID id) as the key, with the user data as the value in a Map.
So, when you get a request for user data, you first check whether a user with this UUID is in the cache (Map) and return the data if it is.
If it isn't, go to the database and fetch the data.
Try creating a Map like this:
User user = null;
Map<UUID, User> usermap = new HashMap<>();
//before creating a new User instance, check if it's present in the Map
if(usermap.containsKey(id)){
//get user from Map
user = usermap.get(id);
} else{
//not in map, so create a new User
user = new User(id);
usermap.put(id,user);
}
//use user object
But please be careful to discard the usermap instance, or the object containing it, once it is no longer required. You can also make several modifications, such as limiting its size.
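As one possible way to limit the size, here is a minimal sketch of a bounded cache built on LinkedHashMap in access order, which evicts the least recently used entry once a maximum is exceeded. UserCache and the loader function are hypothetical names; the loader stands in for whatever creates a User (for example, reading the JSON file).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;
import java.util.function.Function;

// Size-limited cache: evicts the least recently used entry when full.
class UserCache<V> {
    private final Map<UUID, V> entries;
    private final Function<UUID, V> loader;

    UserCache(int maxEntries, Function<UUID, V> loader) {
        this.loader = loader;
        // accessOrder = true orders entries from least to most recently used.
        this.entries = new LinkedHashMap<UUID, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<UUID, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the cached value, calling the loader only on a cache miss.
    synchronized V get(UUID id) {
        V value = entries.get(id);
        if (value == null) {
            value = loader.apply(id);
            entries.put(id, value);
        }
        return value;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

The methods are synchronized because LinkedHashMap is not thread-safe, and in access order even a get modifies the map.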
I have an Array of objects. Each object is a customer record, which is the customer ID (int), first name (String), last name(String), and balance (double).
My problem is that I am not supposed to have duplicate customer records, so if a customer appears in the file twice, I have to just update their balance. I cannot figure out how to search the array to find out whether I need to update the balance or make a new record in the array.
I feel like I should do this in the getters/setters, but I am not exactly sure.
Edit: to clarify "if they appear in the file twice, I have to just update their balance": I have a file I made in Notepad which is supposed to be a customer list with all of their information. If the same customer shows up twice, say the following day to buy more stuff, I am not supposed to create a new object for them, since they already have an object and a place in the array. Instead, I am supposed to take the amount they spent and add it to the existing balance in their existing object.
Edit 2: I thought I would give you the bit of code I already have, where I read the values into the array. I based this on the example we did in class, but there we didn't have to update anything, just store information in an array and print it if needed.
public CustomerList(){
data = new CustomerRecord[100]; //i'm only allowed 100 unique customers
try {
Scanner input = new Scanner(new File("Records.txt"));
for(int i = 0; i < 100; ++i){
data[i] = new CustomerRecord();
data[i].setcustomerNumber(input.nextInt());
data[i].setfirstName(input.next());
data[i].setlastName(input.next());
data[i].setTransactionAmount(input.nextDouble());
}
} catch (FileNotFoundException e) {
e.printStackTrace();
}
}
You shouldn't be using arrays in that case. A Set would be much more suitable as it, by definition, does not have duplicate entries.
What you need to do is to implement the equals() and hashCode() methods in your Customer class so they only use id (or id and name fields) but not balance.
If for some reason you need to use arrays you have two options:
Sort the array by customer id and use binary search to find out whether the customer is there. This is nice if the array doesn't change much but you're doing a lot of updates.
Simply do a linear scan of the array, checking each entry to see if a given customer is already there. If so, update the balance; otherwise add it as a new entry.
It would be something like:
public void updateOrAdd(Customer cst) {
boolean exists = false;
for(Customer existing : array) {
// !!! You need to implement your own equals method in the
// customer so it doesn't take into account the balance !!!
if(existing.equals(cst)) {
exists = true;
existing.updateBalance(cst.getBalance());
break;
}
}
if(!exists) {
// add the cst to the array
}
}
The difference is in runtime: the linear scan is O(n) per lookup and binary search is O(log n), while the set solution will be constant O(1) on average (unless you incorrectly implement your hashCode() method).
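A minimal sketch of the hash-based approach. It uses a Map keyed by customer ID rather than a Set, because updating the balance requires getting the existing record back, which a Map does directly. CustomerRecord here is a simplified, hypothetical stand-in for the class in the question, and CustomerLedger is an illustrative name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified, hypothetical stand-in for the question's CustomerRecord.
class CustomerRecord {
    final int customerNumber;
    final String firstName;
    final String lastName;
    double balance;

    CustomerRecord(int customerNumber, String firstName, String lastName, double balance) {
        this.customerNumber = customerNumber;
        this.firstName = firstName;
        this.lastName = lastName;
        this.balance = balance;
    }
}

class CustomerLedger {
    // Keyed by customer ID, so a duplicate is found in O(1) on average.
    private final Map<Integer, CustomerRecord> byId = new LinkedHashMap<>();

    // Adds a new customer, or adds the amount to an existing customer's balance.
    void record(int customerNumber, String firstName, String lastName, double amount) {
        CustomerRecord existing = byId.get(customerNumber);
        if (existing != null) {
            existing.balance += amount;
        } else {
            byId.put(customerNumber,
                    new CustomerRecord(customerNumber, firstName, lastName, amount));
        }
    }

    CustomerRecord find(int customerNumber) {
        return byId.get(customerNumber);
    }

    int size() {
        return byId.size();
    }
}
```

Keying on the ID also avoids having to override equals() and hashCode() on the record class, since Integer already implements both.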
Suppose you have a Customer array:
Customer[] customers = new Customer[size];
... // fill the array with data
Then you get a new customer object called newCustomer. You need to search for newCustomer in your array and update it if it is already there, or add it if it's not. So you can do something like this:
// Return, if it exists, a customer with id equal to newCustomer.getId();
// the null check skips array slots that have not been filled yet
Optional<Customer> existingCustomer =
Arrays.stream(customers)
.filter(c -> c != null && newCustomer.getId().equals(c.getId()))
.findFirst();
if (existingCustomer.isPresent()) {
// update the customer object with newCustomer information as appropriate
Customer customer = existingCustomer.get();
// assuming you have an updateBalance() method
customer.updateBalance(newCustomer.amountSpent());
} else {
// if the customer is not in the array already, add them at the current position
// and advance the index so the next new customer gets the following slot
customers[currentIndex++] = newCustomer;
}
I have a static Vector of users. Each user has one or more accounts, so for every User there is a Vector of accounts. Every User and every Account has a unique id.
Adding a new User is simple: I have a static Vector, so I can easily check the id of the last User and get the id for the new User with user.getId()+1.
After a new User is added, a problem arises when adding a new Account. An Account id must be unique, so I have to check for the largest id contained in every user's accounts Vector. Given that many processes can add/remove users and accounts, what is the best way to synchronize all the account Vectors and safely add/remove accounts?
Currently, a User is added as follows:
public boolean addNewUser(User user)
{
if(user!=null)
{
int id=getNewUserId();
if(id!=-1)
{
user.setId(id);
utenti.add(user);
return true; // the user was added successfully
}
else
return false;
}
else
return false;
}
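private int getNewUserId()
{
// lastElement() throws NoSuchElementException on an empty Vector,
// so handle that case first (starting at 1 is an assumption)
if(utenti.isEmpty())
return 1;
User user=utenti.lastElement();
if(user!=null)
return user.getId()+1;
else
return -1;
}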
Concerning the id, I recommend using a long value that you increment all the time (regardless of accounts being deleted).
Synchronizing will be harder and depends on what you want to do.
An easy rule would be, if you have a list object, use a synchronized block whenever you access it (read or write):
synchronized(myList) {
//update the list, do whatever you like, there is no concurrent modification
}
Depending on your exact usage, there might be better ways, but this should work.
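Both suggestions can be sketched together as follows, assuming a single shared counter for account ids. AccountRegistry is a hypothetical class that stores only ids to keep the example short; in the real program the list would hold Account objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical registry: ids come from one shared counter and are never reused.
class AccountRegistry {
    private final AtomicLong nextAccountId = new AtomicLong(1);
    private final List<Long> accountIds = new ArrayList<>();

    // Allocates a fresh id, then stores it while holding the list's lock.
    long addAccount() {
        long id = nextAccountId.getAndIncrement();
        synchronized (accountIds) {
            accountIds.add(id);
        }
        return id;
    }

    // Removing an account does not free its id for reuse.
    boolean removeAccount(long id) {
        synchronized (accountIds) {
            return accountIds.remove(Long.valueOf(id));
        }
    }

    int count() {
        synchronized (accountIds) {
            return accountIds.size();
        }
    }
}
```

Because the counter only ever grows, there is no need to scan every user's accounts for the largest id, and deleting accounts cannot lead to a duplicate.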
Probably not the most efficient way, but you could make an int and then go through the sizes of every list of accounts for every user.
int id = 0;
for (User user : AllUsers) {
id += user.listOfAccounts.size();
}
Now id will be the last id, assuming ids were assigned sequentially starting from 1. Simply add 1 when you create a new account.
Note that there will be problems if it's possible to delete accounts.