Find first occurrence of any symbol of string in other string - java

I need to find the first occurrence in string s1 of any character from string s2 (or from a char array).
Is there a standard function for this? If not, what is a good implementation? (Of course I can run indexOf for every char from s2, but that doesn't seem like a good algorithm: if only the last character of s2 occurs in s1, I have to scan s1 |s2|-1 times in vain before I get an answer.)
Thank you very much!

Put all characters from s2 into a constant-time lookup data structure (e.g. HashSet). Iterate over each character in s1 and see if your data structure contains that character.
Roughly (untested):
public int indexOfFirstContainedCharacter(String s1, String s2) {
    Set<Character> set = new HashSet<Character>();
    for (int i = 0; i < s2.length(); i++) {
        set.add(s2.charAt(i)); // Build a constant-time lookup table.
    }
    for (int i = 0; i < s1.length(); i++) {
        if (set.contains(s1.charAt(i))) {
            return i; // Found a character in s1 also in s2.
        }
    }
    return -1; // No matches.
}
This algorithm runs in O(|s1| + |s2|) expected time, as opposed to O(|s1| * |s2|) for the approach you describe.

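Using regex (needs java.util.regex.Pattern and java.util.regex.Matcher):
public static void main(final String[] args) {
    final String s1 = "Hello World";
    final String s2 = "log";
    // Pattern.quote escapes any regex metacharacters in s2.
    final Pattern pattern = Pattern.compile("[" + Pattern.quote(s2) + "]");
    final Matcher matcher = pattern.matcher(s1);
    if (matcher.find()) {
        System.out.println(matcher.start()); // index of the first match: 2
        System.out.println(matcher.group()); // the matched character: l
    }
}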

What you are looking for is StringUtils.indexOfAny from Apache Commons Lang.
It looks like the implementation is:
public static int indexOfAny(String str, char[] searchChars) {
if (isEmpty(str) || ArrayUtils.isEmpty(searchChars)) {
return -1;
}
for (int i = 0; i < str.length(); i++) {
char ch = str.charAt(i);
for (int j = 0; j < searchChars.length; j++) {
if (searchChars[j] == ch) {
return i;
}
}
}
return -1;
}

What is meant by symbol in this context? If it's just a 16-bit Java char, it's easy. Make a lookup table (array) for all possible values, indicating whether they appear in s2. Then step through s1 until either you've found a symbol from s2 or you've reached the end of s1. If a symbol is a Unicode code-point, it's more complicated, but the above gives a method to find out where you need to take a closer look.
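The lookup-table idea for 16-bit chars can be sketched roughly as follows. This is only a minimal illustration, not code from the answer: the class name CharTableSearch and the method name indexOfAny are made up here, and supplementary code points (surrogate pairs) are deliberately not handled.

```java
final class CharTableSearch {
    // Returns the index in s1 of the first char that also occurs in s2, or -1 if there is none.
    static int indexOfAny(String s1, String s2) {
        // One flag per possible 16-bit char value (65536 entries).
        boolean[] present = new boolean[Character.MAX_VALUE + 1];
        for (int i = 0; i < s2.length(); i++) {
            present[s2.charAt(i)] = true;
        }
        // Step through s1 until a flagged char is found or s1 is exhausted.
        for (int i = 0; i < s1.length(); i++) {
            if (present[s1.charAt(i)]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfAny("Hello World", "log")); // prints 2
        System.out.println(indexOfAny("Hello", "xyz"));       // prints -1
    }
}
```

The table costs 64K booleans per call, so for short inputs a HashSet or a plain nested loop may well be the cheaper choice.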

Related

Replace part of the string from reference string

I have a String ArrayList whose elements each consist of a letter followed by a digit as a suffix.
ArrayList <String> baseOctave = new ArrayList();
baseOctave.add("S1");
baseOctave.add("R2");
baseOctave.add("G4");
baseOctave.add("M2");
baseOctave.add("P3");
baseOctave.add("D1");
baseOctave.add("N1");
I pass the strings from this baseOctave and few other characters as input pattern for creating an object.
MyClass obj1 = new MyClass ("S1,,R2.,M2''-");
Since I frequently make use of these kind of input patterns during object instantiation, I would like to use simple characters S, R, G, M etc.
Ex:
MyClass obj1 = new MyClass ("S,,R.,M''-");
MyClass obj2 = new MyClass ("S1,G.,M,D1");
So the letters used during object creation may or may not have a digit as a suffix.
But inside the constructor (or in a separate method), I would like to replace these bare letters with the letters plus their suffix. The suffix is taken from baseOctave.
Ex: the two strings above in obj1 and obj2 should become "S1,,R2.,M2''-" and "S1,G4.,M2,D1"
I tried to do this, but could not complete the code below. I need some help with the replacement, please.
static void addSwaraSuffix(ArrayList<String> pattern) {
for (int index = 0; index < pattern.size(); index++) {
// Get the patterns one by one from the arrayList and verify and manipulate if necessary.
String str = pattern.get(index);
// First see if the second character in Array List element is digit or not.
// If digit, nothing should be done.
//If not, replace/insert the corresponding index from master list
if (Character.isDigit(str.charAt(1)) != true) {
// Replace from baseOctave.
str = str.replace(str.charAt(0), ?); // replace with appropriate alphabet having suffix from baseOctave.
// Finally put the str back to arrayList.
pattern.set(index, str);
}
}
}
Edited information is below:
Thanks for the answer. I found another solution that works fine. Below is the complete code that I found working. Let me know if there is any issue.
static void addSwaraSuffix(ArrayList<String> inputPattern, ArrayList<String> baseOctave) {
String temp = "";
String str;
for (int index = 0; index < inputPattern.size(); index++) {
str = inputPattern.get(index);
// First see if the second character in Array List is digit or not.
// If digit, nothing should be done. If not, replace/insert the corresponding index from master list
// Sometimes only one swara might be there. Ex: S,R,G,M,P,D,N
if (((str.length() == 1)) || (Character.isDigit(str.charAt(1)) != true)) {
// Append with index.
// first find the corresponsing element to be replaced from baseOctave.
for (int index2 = 0; index2 < baseOctave.size(); index2++) {
if (baseOctave.get(index2).startsWith(Character.toString(str.charAt(0)))) {
temp = baseOctave.get(index2);
break;
}
}
str = str.replace(Character.toString(str.charAt(0)), temp);
}
inputPattern.set(index, str);
}
}
I assume that an abbreviation is always a single character and that in a full pattern the second character is always a digit. The code below relies on these assumptions, so please tell me if they are wrong.
static String replace(String string, Collection<String> patterns) {
Map<Character, String> replacements = new HashMap<Character, String>(patterns.size());
for (String pattern : patterns) {
replacements.put(pattern.charAt(0), pattern);
}
StringBuilder result = new StringBuilder();
for (int i = 0; i < string.length(); i++) {
Character c = string.charAt(i);
char next = i < string.length() - 1 ? string.charAt(i + 1) : ' ';
String replacement = replacements.get(c);
if (replacement != null && (next < '0' || next > '9')) { // expand only when no digit follows
result.append(replacement);
} else {
result.append(c);
}
}
return result.toString();
}
public static void main(String[] args) {
ArrayList<String> baseOctave = new ArrayList<String>();
baseOctave.add("S1");
baseOctave.add("R2");
baseOctave.add("G4");
baseOctave.add("M2");
baseOctave.add("P3");
baseOctave.add("D1");
baseOctave.add("N1");
System.out.println(replace("S,,R.,M''-", baseOctave));
System.out.println(replace("S1,G.,M,D1", baseOctave));
System.out.println(replace("", baseOctave));
System.out.println(replace("S", baseOctave));
}
Results:
S1,,R2.,M2''-
S1,G4.,M2,D1
S1

Repetition of a word in String with out using collections

Can anyone suggest the logic for this without using collections?
I have a string s="This is the place called island and it is a nice place ";
The input string whose repetitions should be counted is: "is";
The output should be: 4
You can follow the logic below.
1. Split the String on whitespace.
2. Initialize an integer counter to 0.
3. Traverse the elements of the resulting array and, for each String element, do the following:
a) If stringElement.contains("is"), increment the counter created in step 2.
b) If !stringElement.contains("is"), do nothing and move on to the next element.
4. Do this until you have exhausted all the elements of the array.
5. Print the counter value.
Try to write the code for this on your own and get back here if you get stuck anywhere.
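For reference, the steps above might look roughly like this in code. It is a sketch only; the class and method names are invented for illustration. Note that it counts words that contain the target, so a word containing "is" twice is still counted once.

```java
final class WordContainsCount {
    // Counts the whitespace-separated words of text that contain target.
    static int countWordsContaining(String text, String target) {
        int counter = 0;                                   // step 2
        for (String word : text.trim().split("\\s+")) {    // steps 1 and 3
            if (word.contains(target)) {                   // step 3a
                counter++;
            }
        }
        return counter;
    }

    public static void main(String[] args) {
        String s = "This is the place called island and it is a nice place ";
        System.out.println(countWordsContaining(s, "is")); // prints 4
    }
}
```

Only an array from String.split is used, so no collection classes are involved.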
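As simple as
int count = 0;
// Start at the first match and resume one position past each match;
// resuming at the match itself would find it again and loop forever.
for (int startIndex = str.indexOf("is"); startIndex >= 0; startIndex = str.indexOf("is", startIndex + 1)) {
    count++;
}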
Use the following method, it should work:
public static int countSubStrings(String subString, String mainString){
return (mainString.length() - mainString.replace(subString, "").length()) / subString.length();
}
String s="This is the place called island and it is a nice place ";
s = s+" ";
System.out.println(s.split("is").length-1);
You can use this. Hope you are using Java.
public static void main(String[] args) {
String str = "This is the place called island and it is a nice place";
Pattern pattern = Pattern.compile("is");
Matcher matcher = pattern.matcher(str);
int count = 0;
while (matcher.find())
count++;
System.out.println(count); // prints 4
}
This method counts how many times s2 appears in s1:
public static int CountStr(String s1,String s2){
int cnt=0,temp=0;
if(s1.indexOf(s2)>=0){ // >= 0 so that a match at index 0 is counted too
temp=s1.indexOf(s2);
cnt++;
}
while(s1.indexOf(s2,temp+1)>0){
cnt++;
temp=s1.indexOf(s2,temp+1);
}
return cnt;
}

Removing duplicates from a string

I am trying to remove duplicates from a String in Java. Here is what I have tried:
public void unique(String s)
{
// put your code here
char[]newArray = s.toCharArray();
Set<Character> uniquUsers = new HashSet<Character>();
for (int i = 0; i < newArray.length; i++) {
if (!uniquUsers.add(newArray[i]))
newArray[i] =' ';
}
System.out.println(new String(newArray));
}
The problem is that when I find a duplicate I replace it with a space. I tried replacing the duplicate with '' but that cannot be done, and I can't set the duplicate's position to null. What is the best way to do this?
If you use regex, you only need one line!
public void unique(String s) {
System.out.println(s.replaceAll("(.)(?=.*\\1)", ""));
}
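This removes (by replacing with an empty string) every character that is found again later in the input, using a look-ahead with a back-reference to the captured character. Note that it is therefore the last occurrence of each character that survives, not the first.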
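To make the effect of the look-ahead concrete, here is a small sketch that compares the regex with a plain loop that keeps the first occurrence instead. The class and method names are made up for this illustration.

```java
final class DedupOrder {
    // The regex from the answer: drops every char that occurs again later.
    static String keepLast(String s) {
        return s.replaceAll("(.)(?=.*\\1)", "");
    }

    // Alternative for comparison: keeps a char only at its first position.
    static String keepFirst(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (s.indexOf(c) == i) { // true only for the first occurrence of c
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepLast("banana"));  // prints bna
        System.out.println(keepFirst("banana")); // prints ban
    }
}
```

Which of the two orders is wanted depends on the use case; the question does not say.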
If I understand your question correctly, perhaps you could try something like:
public static String unique(final String string){
final StringBuilder builder = new StringBuilder();
for(final char c : string.toCharArray())
if(builder.indexOf(Character.toString(c)) == -1)
builder.append(c);
return builder.toString();
}
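You can use a BitSet. The code below relies on the isSet method of the hand-rolled BitSet class shown further down (java.util.BitSet calls it get), only handles characters with codes below 256, and returns the characters in code order rather than input order.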
public String removeDuplicateChar(String str){
if(str==null || str.equals(""))throw new NullPointerException();
BitSet b = new BitSet(256);
for(int i=0;i<str.length();i++){
b.set(str.charAt(i));
}
StringBuilder s = new StringBuilder();
for(int i=0;i<256;i++){
if(b.isSet(i)){
s.append((char)i);
}
}
return s.toString();
}
You can roll your own BitSet like below:
class BitSet {
int[] numbers;
BitSet(int k){
numbers = new int[(k >> 5) + 1];
}
boolean isSet(int k){
int remender = k & 0x1F;
int devide = k >> 5;
return ((numbers[devide] & (1 << remender)) != 0); // comparing with == 1 would only work for bit 0
}
void set(int k){
int remender = k & 0x1F;
int devide = k >> 5;
numbers[devide] = numbers[devide] | (1 << remender);
}
}
This will work for what you are attempting (note that a HashSet does not keep the characters in their original order).
public static void unique(String s) {
// put your code here
char[] newArray = s.toCharArray();
Set<Character> uniqueUsers = new HashSet<>();
for (int i = 0; i < newArray.length; i++) {
uniqueUsers.add(newArray[i]);
}
newArray = new char[uniqueUsers.size()];
Iterator iterator = uniqueUsers.iterator();
int i = 0;
while (iterator.hasNext()) {
newArray[i] = (char)iterator.next();
i++;
}
System.out.println(new String(newArray));
}
Without changing almost anything in your code, change the line
System.out.println(new String(newArray));
to
System.out.println( new String(newArray).replaceAll(" ", ""));
The added replaceAll removes the blanks (including any spaces that were part of the original string).
import java.util.*;
class StrDup{
public static void main(String[] args){
String s = "abcdabacdabbbabbbaaaaaaaaaaaaaaaaaaabbbbbbbbbbdddddddddcccccc";
String dup = removeDupl(s);
}
public static String removeDupl(String s){
StringBuilder sb = new StringBuilder(s);
String ch = "";
for(int i = 0; i < sb.length(); i++){
ch = sb.substring(i,i+1);
int j = i+1;
int k = 0;
while(sb.indexOf(ch,j)!=-1){
k = sb.indexOf(ch,j);
sb.deleteCharAt(k);
j = k;
}
}
return sb.toString();
}
}
In the code above, I'm doing the following tasks.
I'm first converting the string to a StringBuilder. Strings in Java are immutable, which means they are like CDs. You can't do anything with them once they are created. The only thing they are vulnerable to is their departure, i.e. the end of their life cycle by the garbage collector, but that's a whole different thing. For example:
String s = "Tanish";
s.concat("is a good boy");
This will do nothing to s. String s is still Tanish. To make the second line of code have an effect, you will have to assign the result to some variable, like this:
s = s + "is a good boy";
And, make no mistake! I said strings are immutable, and here I am reassigning s with some new string. But, it's a NEW string. The original string Tanish is still there, somewhere in the pool of strings. Think of it like this: the string that you are creating is immutable. Tanish is immutable, but s is a reference variable. It can refer to anything in the course of its life. So, Tanish and Tanish is a good boy are 2 separate strings, but s now refers to the latter, instead of the former.
StringBuilder is another way of creating strings in Java, and they are mutable. You can change them. So, if Tanish is a StringBuilder, it is vulnerable to every kind of operation (append, insert, delete, etc.).
Now we have the StringBuilder sb, which is same as the String s.
I've used a built-in StringBuilder method, indexOf(). This method finds the index of the character I'm looking for. Once I have the index, I delete the character at that index.
Remember, StringBuilder is mutable. And that's the reason I can delete the characters.
indexOf is overloaded to accept 2 arguments (sb.indexOf(substr, index)). This returns the position of the first occurrence of substr within sb, starting from index.
In the example string, sb.indexOf("a", 1) will give me 4. All I'm trying to say to Java is, "Return me the index of 'a', but start looking for 'a' from index 1". So, this way I keep the very first a at 0, which I don't want to get rid of.
Now all I'm doing inside the for loop is extracting the character at the ith position. j represents the position from where to start looking for the extracted character. This is important, so that we don't lose the one character we need. k represents the result of indexOf("a", j), i.e. the first occurrence of a at or after index j.
That's pretty much it. Now, as long as the character ch still occurs later in the string as a duplicate (indexOf(...) returns -1 if it can't find the specified string), we obtain its position (k), delete it using deleteCharAt(k) and update j to k, i.e., the next duplicate a (if it exists) will appear at or after k, where it was last found.
DEMONSTRATION:
In the example I took, let's say we want to get rid of duplicate cs.
So, we will start looking for the next c after the very first c (which is at index 2), i.e. from index 3.
sb.indexOf("c",3) will give us 7, where a c is lying. So, k = 7. Delete it, and then set j to k. Now, j = 7. After deleting the character, the rest of the string shifts left by 1. So now at position 7 we have d, which was at 8 before. Now, k = indexOf("c",7) and the entire cycle repeats. Also, remember that indexOf("c",j) starts looking right at j, which means that if c is found at j, it will return j. That's why, when we extracted the first character, we started looking from the position one after that character's position.
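The inner loop of the demonstration can be isolated into a small helper to watch the indices move. This is an illustrative sketch: the class and method names are invented here, and the input is just the first nine characters of the example string.

```java
final class IndexOfDeleteDemo {
    // Removes every occurrence of the char at position i that appears after position i.
    static String removeLaterOccurrences(String s, int i) {
        StringBuilder sb = new StringBuilder(s);
        String ch = sb.substring(i, i + 1);
        int j = i + 1;                 // start searching one past the char we keep
        int k;
        while ((k = sb.indexOf(ch, j)) != -1) {
            sb.deleteCharAt(k);        // everything after k shifts left by one
            j = k;                     // so the next search resumes at k itself
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder("abcdabacd");
        System.out.println(sb.indexOf("c", 3));                    // prints 7
        System.out.println(removeLaterOccurrences("abcdabacd", 2)); // prints abcdabad
        System.out.println(removeLaterOccurrences("abcdabacd", 0)); // prints abcdbcd
    }
}
```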
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
public class Duplicates {
    public static void main(String[] args) {
        String str = "aabbccddeeff";
        String[] str1 = str.split(""); // one single-character string per element
        List<String> list = Arrays.asList(str1);
        List<String> newStr = list.stream().distinct().collect(Collectors.toList());
        System.out.print(newStr); // prints [a, b, c, d, e, f]
    }
}

check how many times string contains character 'g' in eligible string

How can we check how many times a string contains a given character?
Example:
"engineering" is a string; how many times does it contain 'g'?
I know this is an old question, but there is an option that wasn't mentioned, and it's a pretty simple one-liner:
int count = string.length() - string.replaceAll("g","").length();
Try this
int count = StringUtils.countMatches("engineering", "e");
More about StringUtils can be learned from the question: How do I use StringUtils in Java?
I would use a Pattern and Matcher:
String string = "engineering";
Pattern pattern = Pattern.compile("([gG])"); //case insensitive, use [g] for only lower
Matcher matcher = pattern.matcher(string);
int count = 0;
while (matcher.find()) count++;
Although regex will work fine, it is not really required here. You can simply use a for-loop that keeps a count of the character.
You could convert your string to a char array:
String str = "engineering";
char toCheck = 'g';
int count = 0;
for (char ch: str.toCharArray()) {
if (ch == toCheck) {
count++;
}
}
System.out.println(count);
or, you can also do it without converting to charArray: -
for (int i = 0; i < str.length(); i++) {
if (str.charAt(i) == toCheck) {
count++;
}
}
String s = "engineering";
char c = 'g';
s.replaceAll("[^"+ c +"]", "").length();
Use regex [g] to find the char and count the findings as below:
Pattern pattern = Pattern.compile("[g]");
Matcher matcher = pattern.matcher("engineering");
int countCharacter = 0;
while(matcher.find()) {
countCharacter++;
}
System.out.println(countCharacter);
If you want case insensitive count, use regex as [gG] in the Pattern.
Use the StringUtils class from the org.apache.commons.lang3 package.
Download the jar file and place it into the lib folder of your web application.
int count = StringUtils.countMatches("engineering", "e");
You can try Java-8 way. Easy, simple and more readable.
long countOfA = str.chars().filter(ch -> ch == 'g').count();
this is a very very old question but this might help someone ("_")
you can Just simply use this code
public static void main(String[] args){
String mainString = "This is and that is he is and she is";
//To find The "is" from the mainString
String whatToFind = "is";
int result = countMatches(mainString, whatToFind);
System.out.println(result);
}
public static int countMatches(String mainString, String whatToFind){
// replace (unlike replaceAll) treats whatToFind as literal text, not as a regex
String tempString = mainString.replace(whatToFind, "");
// this even works for a single letter
int times = (mainString.length()-tempString.length())/whatToFind.length();
// times should be 5 here ("This" also contains "is")
return times;
}
You can try following :
String str = "engineering";
int letterCount = 0;
int index = -1;
while((index = str.indexOf('g', index+1)) >= 0) // >= 0 so that a 'g' at index 0 is counted
letterCount++;
System.out.println("Letter Count = " + letterCount);
You can loop through it and keep a count of the letter you want.
public class Program {
public static int countAChars(String s) {
int count = 0;
for(char c : s.toCharArray()) {
if('a' == c) {
count++;
}
}
return count;
}
}
or you can use StringUtils to get a count.
int count = StringUtils.countMatches("engineering", "e");
This is an old question and it is in Java but I will answer it in Python. This might be helpful:
string = 'E75;Z;00001;'
a = string.split(';')
print(len(a)-1)

Compare strings in java and remove the part of string where they are identical

I have two strings with me:
s1="MICROSOFT"
s2="APPLESOFT"
I need to compare the strings and remove the duplicate part (always towards the end) from the second string. So I should get "MICROSOFT" and "APPLE" as output.
I have compared both the strings character by character.
String s1 = "MICROSOFT";
String s2 = "APPLESOFT";
for(int j=0; j<s1.length(); j++)
{
char c1 = s1.charAt(j);
char c2 = s2.charAt(j);
if(c1==c2)
System.out.println("Match found!!!");
else
System.out.println("No match found!");
}
It should check the strings and if the two strings have same characters until the end of string, then I need to remove that redundant part, SOFT in this case, from the second string. But I can't think of how to proceed from here.
There can be more duplicates, but we have to remove only those that are continuously identical up to the end. If I have APPWWSOFT and APPLESOFT, I should again get APPLE for the second string, since LE differs from WW in between.
Can you guys please help me out here?
Search and read about Longest Common Subsequence, you can find efficient algorithms to find out the LCS of two input strings. After finding the LCS of the input strings, it is easy to manipulate the inputs. For example, in your case an LCS algorithm will find "SOFT" as the LCS of these two strings, then you might check whether the LCS is in the final part of the 2nd input and then remove it easily. I hope this idea helps.
An example LCS code in Java is here, try it: http://introcs.cs.princeton.edu/java/96optimization/LCS.java.html
Example scenario (pseudocode):
input1: "MICROSOFT";
input2: "APPLESOFT";
execute LCS(input1, input2);
store the result in lcs, now lcs = "SOFT";
check whether input2 ends with lcs,
if it does, remove that trailing part from input2.
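Since the question only cares about the identical part at the end of both strings, a simpler alternative to a general LCS may be enough: compare the two strings from the back and cut the common suffix off the second one. This swaps in a different technique (longest common suffix rather than longest common subsequence), and the names below are made up for illustration.

```java
final class CommonSuffix {
    // Number of trailing characters that a and b have in common.
    static int commonSuffixLength(String a, String b) {
        int n = 0;
        while (n < a.length() && n < b.length()
                && a.charAt(a.length() - 1 - n) == b.charAt(b.length() - 1 - n)) {
            n++;
        }
        return n;
    }

    // Returns s2 without the tail it shares with s1.
    static String stripCommonSuffix(String s1, String s2) {
        return s2.substring(0, s2.length() - commonSuffixLength(s1, s2));
    }

    public static void main(String[] args) {
        System.out.println(stripCommonSuffix("MICROSOFT", "APPLESOFT")); // prints APPLE
        System.out.println(stripCommonSuffix("APPWWSOFT", "APPLESOFT")); // prints APPLE
    }
}
```

This runs in time proportional to the shorter string and also works when the two strings differ in length.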
As far as I understand, you want to remove any identical characters from the two strings. By identical I mean: same position and same character(code). I think the following linear complexity solution is the simplest:
StringBuilder sb1 = new StringBuilder();
StringBuilder sb2 = new StringBuilder(); //if you want to remove the identical char
//only from one string you don't need the 2nd sb
char c;
for(int i = 0; i<Math.min(s1.length(),s2.length());i++){
if((c = s1.charAt(i)) != s2.charAt(i)){
sb1.append(c);
}
}
return sb1.toString();
Try this algorithm: build character sequences (substrings) from your first string and look for them in the second string.
Performance:
Average case = (s1.length()-1)^2
public class SeqFind {
public static String searchReplace(String s1,String s2) {
String s3 = s2; // fall back to s2 unchanged when nothing matches
boolean brk=false;
for(int j=s1.length();j>0&&!brk;j--){
for (int i = j-4; i > 0; i--) {
String string = s1.substring( i,j);
if(s2.contains(string)){
s3 = s2.replace( string,"");
System.out.println(s2+" - "+string+" - "+s3);
brk=true;
break;
}
}
}
return s3;
}
public static void main(String[] args) {
String s1 = "MICROSOFT";
String s2 = "APPLESOFT";
String s3 = searchReplace(s1,s2);
}
}
Out put -
APPLESOFT - SOFT - APPLE
public class Match {
public static void main(String[] args)
{
String s1="MICROSOFT";
String s2="APPLESOFT";
String[] s=new String[10];
String s3;
int j=0,k=0;
for(int i=s2.length();i>0;i--)
{
s[j]=s2.substring(k,s2.length());
if(s1.contains(s[j]))
{
s3=s2.substring(0,j);
System.out.println(s1+""+s3);
System.exit(0);
}
else
{
System.out.println("");
}
j++;
k++;
}
}
}
I have edited the code; you can give it another try.
Try this (not tested, though):
String s1 = "MICROSOFT";
String s2 = "APPLESOFT";
String s3="";
for(int j=0; j<s1.length(); j++)
{
if(s1.charAt(j)==s2.charAt(j)){
s3+=s1.charAt(j);
}
}
System.out.println(s1.replace(s3, " ") + " \n"+ s2.replace(s3, " "));
You should rather use StringBuffer if you want your String to be modified..
And in this case, you can have one extra StringBuffer, in which you can keep on appending non-matching character: -
StringBuffer s1 = new StringBuffer("MICROSOFT");
StringBuffer s2 = new StringBuffer("APPLESOFT");
StringBuffer s3 = new StringBuffer();
for(int j=0; j<s1.length(); j++)
{
char c1 = s1.charAt(j);
char c2 = s2.charAt(j);
if(c1==c2) {
System.out.println("Match found!!!");
} else {
System.out.println("No match found!");
s3.append(c1);
}
}
s1 = s3;
System.out.println(s1); // Prints "MICRO"
I have solved my problem after racking some brains off. Please feel free to correct/improve/refine my code. The code not only works for "MICROSOFT" and "APPLESOFT" inputs, but also for inputs like "APPWWSOFT" and "APPLESOFT" (i needed to remove the continuous duplicates from the end - SOFT in both the above inputs). I'm in the learning stage and I'll appreciate any valuable inputs.
public class test
{
public static void main(String[] args)
{
String s1 = "MICROSOFT";
String s2 = "APPLESOFT";
int counter1=0;
int counter2=0;
String[] test = new String[100];
test[0]="";
for(int j=0; j<s1.length(); j++)
{
char c1 = s1.charAt(j);
char c2 = s2.charAt(j);
if(c1==c2)
{
if(counter1==counter2)
{
//System.out.println("Match found!!!");
test[0]=test[0]+c2;
counter2++;
//System.out.println("Counter 2: "+counter2);
}
else
test[0]="";
}
else
{
//System.out.print("No match found!");
//System.out.println("Counter 2: "+counter2);
counter2=counter1+1;
test[0]="";
}
counter1++;
//System.out.println("Counter 1: "+counter1);
}
System.out.println(test[0]);
System.out.println(s2.replaceAll(test[0]," "));
}
}
