Running new thread from within a thread - java

I solved a puzzle; I will briefly describe it and then ask my question.
Map phone numbers to text. For example:
101 -> 101, but 501 -> J01, K01, L01 (AFAIK).
We get a 9-digit number and need to produce all combinations, even those that are not grammatically correct.
I created a program that basically grows like a tree. When a digit other than 0 or 1 is found, we create a new branch with the already translated text + one of the possible letters + the rest of the number.
I solved it by creating a new thread every time a new translatable digit is encountered.
Do you think this is bad practice? Can it be solved better?
Thank you.

Creating a thread for each possibility seems like overkill; what you really want is recursion. Try this:
// Needs java.util.ArrayList and java.util.List.
String[] lookup = { "0", "1", "ABC", "DEF", "GHI", "JKL", "MNO", "PQRS", "TUV", "WXYZ" };
List<String> results = new ArrayList<String>();
void translatePhone(String phoneNumber, int position) {
    // The digit at this position selects its candidate characters.
    int index = Integer.parseInt(phoneNumber.substring(position, position + 1));
    for (int i = 0; i < lookup[index].length(); i++) {
        // Replace the digit with one candidate and keep the rest unchanged.
        String xlated = phoneNumber.substring(0, position) + lookup[index].charAt(i) + phoneNumber.substring(position + 1);
        if (position + 1 == phoneNumber.length()) {
            results.add(xlated);
        } else {
            translatePhone(xlated, position + 1);
        }
    }
}
Call translatePhone(phoneString, 0) to start it off. Once that returns, results should hold all the translations.
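For comparison, here is a minimal self-contained sketch of the same recursive idea that builds each result in a StringBuilder instead of re-slicing the phone string. The class name PhoneWords and the method expand are illustrative names, not part of the answer above, and the sketch assumes the input contains only the digits 0 to 9.

```java
import java.util.ArrayList;
import java.util.List;

class PhoneWords {
    // Keypad letters per digit; 0 and 1 have no letters and map to themselves.
    private static final String[] LOOKUP =
        { "0", "1", "ABC", "DEF", "GHI", "JKL", "MNO", "PQRS", "TUV", "WXYZ" };

    // Returns every translation of the given digit string (assumes digits only).
    static List<String> expand(String digits) {
        List<String> results = new ArrayList<>();
        expand(digits, 0, new StringBuilder(), results);
        return results;
    }

    private static void expand(String digits, int pos, StringBuilder prefix, List<String> out) {
        if (pos == digits.length()) {
            out.add(prefix.toString());   // a complete translation
            return;
        }
        String choices = LOOKUP[digits.charAt(pos) - '0'];
        for (int i = 0; i < choices.length(); i++) {
            prefix.append(choices.charAt(i));
            expand(digits, pos + 1, prefix, out);
            prefix.setLength(prefix.length() - 1);   // undo the choice before trying the next one
        }
    }
}
```

Calling PhoneWords.expand("501") yields J01, K01 and L01, matching the example in the question. The recursion depth is only the number of digits, so no threads are needed.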

What is the least complex way to iterate a LinkedList in Java?

I have the simple task to print elements inside a LinkedList that has no duplicates with commas in between, but I have come up with two ways to do it and I'm unsure about how LinkedList iteration works so I don't know which way is best. I have some assumptions about both ways.
public static void main(String[] args) {
LinkedList<Integer> pres = new LinkedList<Integer>();
pres.add(112);
pres.add(114);
pres.add(326);
pres.add(433);
pres.add(119);
// ---------------------------------- METHOD 1 ---------------------------------
for (int i = 0; i < pres.size(); i++) {
System.out.print(pres.get(i) + (i == pres.size() - 1 ? "" : ","));
}
// ---------------------------------- METHOD 2 ---------------------------------
for (int courseIndex : pres) {
System.out.print(courseIndex + (courseIndex == pres.getLast() ? "" : ","));
}
}
For Method 1, I'm wondering if calling pres.get(i) in every iteration traverses the list from the beginning each time: (112) - (112 -> 114) - (112 -> 114 -> 326)...
Or does the pointer stay where it last was and just move to the next element?
For Method 2, it seems like the foreach loop avoids the possible problem that I'm assuming in Method 1, but I'm calling getLast on every iteration as well. Is it a doubly linked list? Is getLast an O(1) operation? If not, is calling getLast on each iteration even worse than Method 1, since it traverses the list all the way down each time?
String s = pres.stream()
.map(Integer::toString)
.collect(Collectors.joining(", "));
By the way:
List<Integer> pres = new LinkedList<>();
Collections.addAll(pres,
112, 114, 326, 433, 119);
or
List<Integer> pres = List.of(112, 114, 326, 433, 119);
Declaring the variable as List lets you switch to another implementation later:
List<Integer> pres = new ArrayList<>();
In a linked list, get-by-index (get(i)) is slow and can require iterating through much of the list if the index isn't near the ends. You ask if it just moves "the pointer", but there is no such pointer. Method 1 is therefore a bad idea.
Method 2 is the right idea. The for(var: list) will use an iterator, which is exactly the kind of pointer you are talking about.
Comparing against the last element is bad form, though, even though LinkedList is a doubly linked list (so getLast() is cheap). Instead of writing code that behaves oddly if the no-duplicates rule is broken, do something like this:
String sep = "";
for (int courseIndex : pres) {
System.out.print(sep + courseIndex);
sep = ",";
}
System.out.println();
Actually, I/O calls should usually be considered slow, so it's best to make just one per line:
StringBuilder buf = new StringBuilder();
for (int courseIndex : pres) {
if (buf.length() > 0) {
buf.append(",");
}
buf.append(courseIndex);
}
System.out.println(buf.toString());
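To make the "pointer" behind the for-each loop visible, here is a small sketch that drives the Iterator by hand. The class and method names (IterationSketch, joinWithIterator) are made up for illustration. Because it asks the iterator whether another element follows, it does not depend on the no-duplicates rule.

```java
import java.util.Iterator;
import java.util.List;

class IterationSketch {
    // Joins the elements with commas using the list's own iterator,
    // so each element is visited once without any get(i) calls.
    static String joinWithIterator(List<Integer> list) {
        StringBuilder sb = new StringBuilder();
        Iterator<Integer> it = list.iterator();
        while (it.hasNext()) {
            sb.append(it.next());
            if (it.hasNext()) {
                sb.append(',');   // separator only between elements
            }
        }
        return sb.toString();
    }
}
```

For the list in the question this produces 112,114,326,433,119 in a single pass over the list.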

Sliding window of 10 seconds?

You are given an infinite stream of words (no spaces). Each word also has an attached timestamp, starting at 0, in a format like 0, 1, 2, 3, 4, 5, ... , 6. We have these APIs:
public class StreamClass {
public void consumeNextString(String next, int timeStamp);
public String getStrings(); // joins all strings into a space-separated string using the constraint below
}
You are to implement both of these functions. getStrings, specifically, has the following behavior. Say you had a stream like
one : 4
the: 5
hello : 12
the : 14
menlo: 15
If getStrings were called now, it should print one hello the menlo instead of one the hello the menlo, since the is duplicated at timestamps 5 and 14 within the window (the current timestamp is 15). The oldest the, at timestamp 5, is disregarded.
Later on, after the stream looks like:
one : 4
the: 5
hello : 12
the : 14
menlo: 15
big: 123
getStrings should print one the hello the menlo big, because there are no duplicates in the last 10-second window (the current timestamp is 123).
Work: I am trying to think of an optimal way to do this; it is from an interview question.
The problem is, I don't see any good way of doing this other than brute force, i.e., storing every string and then manually looking at the 10-second window to take out the oldest string, but surely there must be SOMETHING more optimal?
Well, here is a possible solution.
I used two Lists to hold the words and their timestamps.
The field lastTimeStamp is updated as each entry is consumed. It is used to maintain the local window of seconds.
When the last window of time is reached, I simply iterate over the list of words, removing the oldest duplicates.
After getStrings() is called, all lists are cleared to start the process anew.
This works for the supplied data and other data I have tested.
public class SlidingWindow10seconds {
public static void main(String[] args) {
StreamClass sc = new StreamClass();
sc.consumeNextString("one", 4);
sc.consumeNextString("the", 5);
sc.consumeNextString("hello", 12);
sc.consumeNextString("the", 14);
sc.consumeNextString("menlo", 15);
System.out.println(sc.getStrings());
sc.consumeNextString("one", 4);
sc.consumeNextString("the", 5);
sc.consumeNextString("hello", 12);
sc.consumeNextString("the", 14);
sc.consumeNextString("menlo", 15);
sc.consumeNextString("big", 123);
System.out.println(sc.getStrings());
}
}
Prints
one hello the menlo
one the hello the menlo big
class StreamClass {
int lastTimeStamp = 0;
final int windowSize = 10;
List<Integer> timeStamps = new ArrayList<>();
List<String> words = new ArrayList<>();
public void consumeNextString(String next, int timeStamp) {
words.add(next);
timeStamps.add(timeStamp);
lastTimeStamp = timeStamp;
}
public String getStrings() {
StringBuilder sb = new StringBuilder();
for (int i = 0; i < words.size(); i++) {
int ts = timeStamps.get(i);
// append all words if outside the window
if (ts < lastTimeStamp - windowSize) {
sb.append(words.get(i) + " ");
} else {
// Now iterate thru the list removing the oldest
// duplicates by adding in reverse order
Set<String> distinct = new LinkedHashSet<>();
for (int k = words.size()-1; k >= i; k--) {
distinct.add(words.get(k));
}
// and now reverse that using stack.
Stack<String> stk = new Stack<>();
stk.addAll(distinct);
while(!stk.isEmpty()) {
sb.append(stk.pop()+" ");
}
break;
}
}
sb.setLength(sb.length()-1);
words.clear();
timeStamps.clear();
return sb.toString();
}
}
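A possible variation, sketched under a few assumptions: timestamps arrive in non-decreasing order, the window includes entries exactly 10 seconds old (as in the code above), and the stored history is kept rather than cleared, so getStrings can be called repeatedly as the stream grows. The class name WindowedStream is hypothetical. It walks backwards from the newest entry, so only the entries inside the window are checked for duplicates.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.StringJoiner;

class WindowedStream {
    private static final int WINDOW = 10;
    private final List<String> words = new ArrayList<>();
    private final List<Integer> stamps = new ArrayList<>();

    public void consumeNextString(String next, int timeStamp) {
        words.add(next);
        stamps.add(timeStamp);
    }

    public String getStrings() {
        if (words.isEmpty()) {
            return "";
        }
        int cutoff = stamps.get(stamps.size() - 1) - WINDOW;
        Deque<String> recent = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        // Walk backwards through the window; the first time a word is seen
        // is its newest occurrence, so older duplicates are skipped.
        int i = words.size() - 1;
        for (; i >= 0 && stamps.get(i) >= cutoff; i--) {
            if (seen.add(words.get(i))) {
                recent.addFirst(words.get(i));
            }
        }
        StringJoiner out = new StringJoiner(" ");
        for (int k = 0; k <= i; k++) {
            out.add(words.get(k));   // everything older than the window is kept as-is
        }
        for (String w : recent) {
            out.add(w);              // window contents, oldest first
        }
        return out.toString();
    }
}
```

With the question's data this returns one hello the menlo at timestamp 15 and, after big arrives at 123, one the hello the menlo big. The cost of a call is proportional to the window contents plus the output itself, not to a nested scan.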

Java - Return random index of specific character in string

So given a string such as: 0100101, I want to return a random single index of one of the positions of a 1 (1, 4, 6).
So far I'm using:
protected int getRandomBirthIndex(String s) {
ArrayList<Integer> birthIndicies = new ArrayList<Integer>();
for (int i = 0; i < s.length(); i++) {
if ((s.charAt(i) == '1')) {
birthIndicies.add(i);
}
}
return birthIndicies.get(Randomizer.nextInt(birthIndicies.size()));
}
However, it's causing a bottle-neck on my code (45% of CPU time is in this method), as the strings are over 4000 characters long. Can anyone think of a more efficient way to do this?
If you're interested in a single index of one of the positions with 1, and assuming there is at least one 1 in your input, you can just do this:
String input = "0100101";
final int n=input.length();
Random generator = new Random();
char c=0;
int i=0;
do{
i = generator.nextInt(n);
c=input.charAt(i);
}while(c!='1');
System.out.println(i);
This solution is fast and does not consume much memory when, for example, the 1s and 0s are distributed uniformly. As highlighted by @paxdiablo, it can perform poorly in some cases, for example when 1s are scarce.
You could use String.indexOf(int) to find each 1 (instead of iterating every character). I would also prefer to program to the List interface and to use the diamond operator <>. Something like,
private static Random rand = new Random();
protected int getRandomBirthIndex(String s) {
List<Integer> birthIndicies = new ArrayList<>();
int index = s.indexOf('1');
while (index > -1) {
birthIndicies.add(index);
index = s.indexOf('1', index + 1);
}
return birthIndicies.get(rand.nextInt(birthIndicies.size()));
}
Finally, if you need to do this many times, save the List as a field and re-use it (instead of calculating the indices every time). For example with memoization,
private static Random rand = new Random();
private static Map<String, List<Integer>> memo = new HashMap<>();
protected int getRandomBirthIndex(String s) {
List<Integer> birthIndicies;
if (!memo.containsKey(s)) {
birthIndicies = new ArrayList<>();
int index = s.indexOf('1');
while (index > -1) {
birthIndicies.add(index);
index = s.indexOf('1', index + 1);
}
memo.put(s, birthIndicies);
} else {
birthIndicies = memo.get(s);
}
return birthIndicies.get(rand.nextInt(birthIndicies.size()));
}
Well, one way would be to remove the creation of the list each time, by caching the list based on the string itself, assuming the strings are used more often than they're changed. If they're not, then caching methods won't help.
The caching method involves replacing the bare string with an object consisting of:
current string;
cached string; and
list based on the cached string.
You can provide a function to the clients to create such an object from a given string and it would set the string and the cached string to whatever was passed in, then calculate the list. Another function would be used to change the current string to something else.
The getRandomBirthIndex() function then receives this structure (rather than the string) and follows the rule set:
if the current and cached strings are different, set the cached string to be the same as the current string, then recalculate the list based on that.
in any case, return a random element from the list.
That way, if the list changes rarely, you avoid the expensive recalculation where it's not necessary.
In pseudo-code, something like this should suffice:
# Constructs fastie from string.
# Sets cached string to something other than
# that passed in (lazy list creation).
def fastie.constructor(string s):
    me.current = s
    me.cached = s + "!"
# Changes current string in fastie. No list update in
# case you change it again before needing an element.
def fastie.changeString(string s):
    me.current = s
# Get a random index, will recalculate list first but
# only if necessary. Empty list returns index of -1.
def fastie.getRandomBirthIndex():
    me.recalcListFromCached()
    if me.list.size() == 0:
        return -1
    return me.list[random(me.list.size())]
# Recalculates the list from the current string.
# Done on an as-needed basis.
def fastie.recalcListFromCached():
    if me.current != me.cached:
        me.cached = me.current
        me.list = empty
        for idx = 0 to me.cached.length() - 1 inclusive:
            if me.cached[idx] == '1':
                me.list.append(idx)
You also have the option of speeding up the actual search for the 1 characters by, for example, using indexOf() to locate them with the underlying Java libraries rather than checking each character individually in your own code (again, pseudo-code):
def fastie.recalcListFromCached():
    if me.current != me.cached:
        me.cached = me.current
        me.list = empty
        idx = me.cached.indexOf('1')
        while idx != -1:
            me.list.append(idx)
            idx = me.cached.indexOf('1', idx + 1)
This method can be used even if you don't cache the values. It's likely to be faster using Java's probably-optimised string search code than doing it yourself.
However, you should keep in mind that your supposed problem of spending 45% of time in that code may not be an issue at all. It's not so much the proportion of time spent there as it is the absolute amount of time.
By that, I mean it probably makes no difference what percentage of the time being spent in that function if it finishes in 0.001 seconds (and you're not wanting to process thousands of strings per second). You should only really become concerned if the effects become noticeable to the user of your software somehow. Otherwise, optimisation is pretty much wasted effort.
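The caching object described in the pseudo-code above could look roughly like this in Java. This is only a sketch: the class name Fastie mirrors the pseudo-code, and a null cached string is used here as the "not built yet" marker instead of the s + "!" trick, which is a deliberate deviation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class Fastie {
    private final Random random = new Random();
    private String current;
    private String cached;                       // null until the list is first built
    private List<Integer> ones = new ArrayList<>();

    Fastie(String s) {
        current = s;
    }

    // Changes the current string; the list is rebuilt lazily on the next lookup.
    void changeString(String s) {
        current = s;
    }

    // Returns a random index of a '1', or -1 if the string contains none.
    int getRandomBirthIndex() {
        recalcIfStale();
        if (ones.isEmpty()) {
            return -1;
        }
        return ones.get(random.nextInt(ones.size()));
    }

    private void recalcIfStale() {
        if (current.equals(cached)) {
            return;                              // cache still matches, nothing to do
        }
        cached = current;
        ones = new ArrayList<>();
        for (int idx = cached.indexOf('1'); idx != -1; idx = cached.indexOf('1', idx + 1)) {
            ones.add(idx);
        }
    }
}
```

Note that the staleness check compares two strings, which is itself linear in the string length in the worst case, so this only pays off when lookups are far more frequent than changes.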
You can also try this. Its best-case complexity is O(1) and its worst case might go to O(n); in the purely worst case it may never finish, as that depends entirely on the Randomizer function you are using.
private static Random rand = new Random();
protected int getRandomBirthIndex(String s) {
List<Integer> birthIndicies = new ArrayList<>();
int index = s.indexOf('1');
while (index > -1) {
birthIndicies.add(index);
index = s.indexOf('1', index + 1);
}
return birthIndicies.get(rand.nextInt(birthIndicies.size()));
}
If your Strings are very long and you're sure they contain a lot of 1s (or whatever String you're looking for), it's probably faster to randomly "poke around" in the String until you find what you are looking for. That way you save the time spent iterating over the String:
String s = "0100101";
int index = ThreadLocalRandom.current().nextInt(s.length());
while(s.charAt(index) != '1') {
System.out.println("got not a 1, trying again");
index = ThreadLocalRandom.current().nextInt(s.length());
}
System.out.println("found: " + index + " - " + s.charAt(index));
I'm not sure about the statistics, but in rare cases this solution might take much longer than the iterating solution. One such case is a long String with only very few occurrences of the search string.
If the source String doesn't contain the search String at all, this code will run forever!
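One way to keep the random probing but rule out the endless loop is to cap the number of probes and fall back to a full scan. The sketch below is an illustration under assumptions, not part of the answer above: the class BoundedProbe, the method randomIndexOf and the maxProbes parameter are all invented names, and a good cap would have to be chosen by measurement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

class BoundedProbe {
    // Probes random positions up to maxProbes times, then falls back to
    // collecting every match. Returns -1 if target does not occur in s.
    static int randomIndexOf(String s, char target, int maxProbes) {
        if (s.isEmpty()) {
            return -1;
        }
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        for (int attempt = 0; attempt < maxProbes; attempt++) {
            int i = rnd.nextInt(s.length());
            if (s.charAt(i) == target) {
                return i;
            }
        }
        // Fallback: list all matches and pick one of them uniformly.
        List<Integer> hits = new ArrayList<>();
        for (int i = s.indexOf(target); i != -1; i = s.indexOf(target, i + 1)) {
            hits.add(i);
        }
        return hits.isEmpty() ? -1 : hits.get(rnd.nextInt(hits.size()));
    }
}
```

Both phases choose uniformly among the matching positions, so the combined result stays uniform, and the method always terminates.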
One possibility is to use a short-circuited Fisher-Yates style shuffle. Create an array of the indices and start shuffling it. As soon as the next shuffled element points to a one, return that index. If you have iterated through all the indices without finding a one, then the string contains only zeros, so return -1.
If the length of the strings is always the same, the indices array can be static as shown below, and it doesn't need reinitializing on new invocations. If not, you'll have to move the declaration of indices into the method and initialize it each time with the correct index set. The code below was written for strings of length 7, such as your example of 0100101.
// delete this and uncomment below if string lengths vary
private static int[] indices = { 0, 1, 2, 3, 4, 5, 6 };
protected int getRandomBirthIndex(String s) {
int tmp;
/*
* int[] indices = new int[s.length()];
* for (int i = 0; i < s.length(); ++i) indices[i] = i;
*/
for (int i = 0; i < s.length(); i++) {
int j = randomizer.nextInt(indices.length - i) + i;
if (j != i) { // swap to shuffle
tmp = indices[i];
indices[i] = indices[j];
indices[j] = tmp;
}
if ((s.charAt(indices[i]) == '1')) {
return indices[i];
}
}
return -1;
}
This approach terminates quickly if 1's are dense, guarantees termination after s.length() iterations even if there aren't any 1's, and the locations returned are uniform across the set of 1's.

How to train data correctly using libsvm?

I want to use an SVM (support vector machine) in my program, but I could not get correct results.
I want to know how we should train an SVM on our data.
What I am doing:
Suppose we have 5 documents (the numbers are just an example); 3 of them are in the first category and the other 2 are in the second category. I merge the documents within each category (that is, the 3 documents in the first category are merged into one document). After that I make a training array like this:
double[][] train = new double[cat1.getDocument().getAttributes().size() + cat2.getDocument().getAttributes().size()][];
and I will fill the array like this:
int i = 0;
Iterator<String> iteraitor = cat1.getDocument().getAttributes().keySet().iterator();
Iterator<String> iteraitor2 = cat2.getDocument().getAttributes().keySet().iterator();
while (i < train.length) {
if (i < cat2.getDocument().getAttributes().size()) {
while (iteraitor2.hasNext()) {
String key = (String) iteraitor2.next();
Long value = cat2.getDocument().getAttributes().get(key);
double[] vals = { 0, value };
train[i] = vals;
i++;
System.out.println(vals[0] + "," + vals[1]);
}
} else {
while (iteraitor.hasNext()) {
String key = (String) iteraitor.next();
Long value = cat1.getDocument().getAttributes().get(key);
double[] vals = { 1, value };
train[i] = vals;
i++;
System.out.println(vals[0] + "," + vals[1]);
}
i++;
}
}
Then I continue like this to get the model:
svm_problem prob = new svm_problem();
int dataCount = train.length;
prob.y = new double[dataCount];
prob.l = dataCount;
prob.x = new svm_node[dataCount][];
for (int k = 0; k < dataCount; k++) {
double[] features = train[k];
prob.x[k] = new svm_node[features.length - 1];
for (int j = 1; j < features.length; j++) {
svm_node node = new svm_node();
node.index = j;
node.value = features[j];
prob.x[k][j - 1] = node;
}
prob.y[k] = features[0];
}
svm_parameter param = new svm_parameter();
param.probability = 1;
param.gamma = 0.5;
param.nu = 0.5;
param.C = 1;
param.svm_type = svm_parameter.C_SVC;
param.kernel_type = svm_parameter.LINEAR;
param.cache_size = 20000;
param.eps = 0.001;
svm_model model = svm.svm_train(prob, param);
Is this approach correct? If not, please help me to make it right.
Both of these answers are correct: answer one, answer two.
Even without examining the code, one can find conceptual errors:
Suppose we have 5 documents; 3 of them are in the first category and the other 2 are in the second category. I merge the documents within each category (that is, the 3 documents in the first category are merged into one document). After that I make a training array like this
So:
Training on 5 documents won't give any reasonable results with any machine learning model. These are statistical models, and there are no reasonable statistics in 5 points in R^n, where n ~ 10,000.
You should not merge anything. Such an approach can work for Naive Bayes, which does not really treat documents as a "whole" but rather as probabilistic dependencies between features and classes. In an SVM each document should be a separate point in the R^n space, where n can be the number of distinct words (for a bag-of-words or set-of-words representation).
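To illustrate the "one point per document" idea, here is a plain-Java sketch with no libsvm calls. It builds a shared vocabulary and then one row per document, with the label in position 0 and word counts after it, which is the same layout the train array in the question uses. The names BagOfWords, buildVocabulary and toTrainingRows are hypothetical, and splitting on non-word characters is a simplifying assumption that only handles ASCII words.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class BagOfWords {
    // Assigns each distinct word a column, in order of first appearance.
    static Map<String, Integer> buildVocabulary(List<String> documents) {
        Map<String, Integer> vocab = new LinkedHashMap<>();
        for (String doc : documents) {
            for (String word : doc.toLowerCase().split("\\W+")) {
                if (!word.isEmpty()) {
                    vocab.putIfAbsent(word, vocab.size());
                }
            }
        }
        return vocab;
    }

    // One row per document: row[0] is the label, row[1 + column] is the word count.
    static double[][] toTrainingRows(List<String> documents, int[] labels,
                                     Map<String, Integer> vocab) {
        double[][] rows = new double[documents.size()][];
        for (int d = 0; d < documents.size(); d++) {
            double[] row = new double[vocab.size() + 1];
            row[0] = labels[d];
            for (String word : documents.get(d).toLowerCase().split("\\W+")) {
                Integer column = vocab.get(word);
                if (column != null) {
                    row[column + 1] += 1;
                }
            }
            rows[d] = row;
        }
        return rows;
    }
}
```

Each row keeps its own label, so five documents give five training points rather than two merged ones.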
A problem might be that you do not terminate each set of features in a training example with an index of -1, which you should according to the README.
I.e., if you have one example with two features, I think you should do:
Index[0]: 0
Value[0]: 22
Index[1]: 1
Value[1]: 53
Index[2]: -1
Good luck!
Using SVMs to classify text is a common task. You can check out research papers by Joachims [1] regarding SVM text classification.
Basically you have to:
1. Tokenize your documents
2. Remove stopwords
3. Apply a stemming technique
4. Apply a feature selection technique (see [2])
5. Transform your documents using the features obtained in step 4 (the simplest weighting is binary, 0: feature is absent, 1: feature is present; other measures such as TFC are also possible)
6. Train your SVM and be happy :)
[1] T. Joachims: Text Categorization with Support Vector Machines: Learning with Many Relevant Features; Springer: Heidelberg, Germany, 1998, doi:10.1007/BFb0026683.
[2] Y. Yang, J. O. Pedersen: A Comparative Study on Feature Selection in Text Categorization. International Conference on Machine Learning, 1997, 412-420.
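As a rough illustration of the tokenizing, stopword-removal and binary-weighting steps listed above, here is a small plain-Java sketch. Stemming and feature selection are deliberately left out, the stopword list is a tiny made-up sample, and the names TextPreprocessing, tokenize and binaryVector are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TextPreprocessing {
    // Tiny illustrative stopword list (an assumption; real lists are much longer).
    private static final Set<String> STOPWORDS =
        new HashSet<>(Arrays.asList("a", "an", "the", "is", "of", "and"));

    // Tokenize: lower-case, split on non-word characters, drop stopwords.
    static List<String> tokenize(String document) {
        List<String> tokens = new ArrayList<>();
        for (String t : document.toLowerCase().split("\\W+")) {
            if (!t.isEmpty() && !STOPWORDS.contains(t)) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Binary weighting: 1.0 if the selected feature occurs in the document, else 0.0.
    static double[] binaryVector(List<String> tokens, List<String> features) {
        Set<String> present = new HashSet<>(tokens);
        double[] v = new double[features.size()];
        for (int i = 0; i < v.length; i++) {
            v[i] = present.contains(features.get(i)) ? 1.0 : 0.0;
        }
        return v;
    }
}
```

The features list passed to binaryVector stands in for whatever the feature selection step would produce; each document then becomes one such vector with its own class label.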

Java: How to write computing data in arraylist?

Hello guys, I'm a Java beginner and I've got some problems with arrays and ArrayList. My main problem is how to write computed, dynamic data into the array and how to read it back later. Here's my weird code:
public static void main(String[] args) {
int yil, bolum = 0, kalan;
Scanner klavye = new Scanner(System.in);
ArrayList liste = new ArrayList();
//or shall I use this? >> int[] liste = new int[10];
System.out.println("Yıl Girin: "); // prompts "Enter a year", e.g. 1453
yil = klavye.nextInt();
do{ // split 1453 into its digits and store them in the array or ArrayList: [1, 4, 5, 3]
kalan = yil % 10;
liste.add(kalan); // my problem starts here: how can I add "kalan" to "liste"?
bolum = yil / 10;
yil = bolum;
}while( bolum == 0 );
System.out.println("Sayının Basamak Sayısı: " + liste.size()); // prints "Number of digits:" followed by the element count of "liste"
klavye.close();
}
Edit:
//needs to be like that
while( bolum != 0 );
System.out.println("Sayının Basamak Sayısı: " + liste);
I think that you most likely want your loop stopping condition to be:
while( bolum != 0)
because bolum will only be 0 when there are no more digits left in your number to process. Also, as amit mentions above it could be the case that the user entered 0 when prompted for a number, so you should take that into account.
To obtain a string representation of your ArrayList (showing the elements it contains through their string representations), you can just use
System.out.println("Sayının Basamak Sayısı: " + liste);
No need to convert to an array. This works because it causes liste's toString method to be called (which is why we don't need to call it explicitly).
You must change this line:
}while( bolum == 0 );
To this:
}while( bolum > 0 );
If you want to print your elements in the ArrayList, update your last statement to print as below:
System.out.println("Sayının Basamak Sayısı: " + liste);
Or you can iterate your list and print as :
for(Object i: liste){
System.out.println(i);
}
This will print your individual list items in separate lines.
Also, please change your while condition to while (bolum != 0); with bolum == 0 the loop terminates after the very first iteration whenever bolum is non-zero (1, 2, ...).
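Putting the suggestions together, a minimal sketch of the digit extraction could look like the following. The class and method names (Digits, digitsOf) are invented for illustration, and the sketch assumes a non-negative input. It uses a typed List<Integer>, handles an input of 0, and reverses the list so the digits come out in reading order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Digits {
    // Returns the decimal digits of a non-negative number, most significant first.
    static List<Integer> digitsOf(int number) {
        if (number < 0) {
            throw new IllegalArgumentException("number must be non-negative");
        }
        List<Integer> digits = new ArrayList<>();
        int rest = number;
        do {
            digits.add(rest % 10);    // take the last digit
            rest /= 10;               // and drop it
        } while (rest != 0);          // do-while so that 0 still yields one digit
        Collections.reverse(digits);  // digits were collected last-to-first
        return digits;
    }
}
```

For example, Digits.digitsOf(1453) gives [1, 4, 5, 3], and its size() is the digit count the original program prints.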
