The characters of a given string must be sorted according to the order defined by another pattern string. The required complexity is O(n + m), where n is the length of the string and m is the length of the pattern.
Example:
Pattern: 1234567890AaBbCcDdEeFfGgHh
String: dH7ee2D6a341Fb9Ea20dhC1g7ca32Ba2Gac5f76A2g
Result: 112222233456677790AaaaaaBbCccDddEeeFfGggHh
The pattern contains every character of the string, and each character appears in the pattern only once.
My code:
// Instances of possible values for input:
String pattern = "1234567890AaBbCcDdEeFfGgHh";
String string = "dH7ee2D6a341Fb9Ea20dhC1g7ca32Ba2Gac5f76A2g";
// Builder to collect characters for sorted result:
StringBuilder result = new StringBuilder();
// Hash table based on characters from pattern to count occurrence of each character in string:
Map<Character, Integer> characterCount = new LinkedHashMap<>();
for (int i = 0; i < pattern.length(); i++) {
// Put each character from pattern and initialize its counter with initial value of 0:
characterCount.put(pattern.charAt(i), 0);
}
// Traverse string and increment counter at each occurrence of character
for (int i = 0; i < string.length(); i++) {
char ch = string.charAt(i);
Integer count = characterCount.get(ch);
characterCount.put(ch, ++count);
}
// Traverse completed dictionary and collect sequentially all characters collected from string
for (Map.Entry<Character, Integer> entry : characterCount.entrySet()) {
Integer count = entry.getValue();
if (count > 0) {
Character ch = entry.getKey();
// Append each character as many times as it appeared in string
for (int i = 0; i < count; i++) {
result.append(ch);
}
}
}
// Get final result from builder
return result.toString();
Is this code optimal? Is there any way to improve this algorithm? Do I understand correctly that it satisfies the given complexity O(n + m)?
I'm not sure whether your version or mine is faster, but here's an alternative:
import java.math.BigDecimal;
class Playground {
public static void main(String[ ] args) {
String pattern = "1234567890AaBbCcDdEeFfGgHh";
String s = "dH7ee2D6a341Fb9Ea20dhC1g7ca32Ba2Gac5f76A2g";
long startTime = System.nanoTime();
StringBuilder sb = new StringBuilder();
for (char c : pattern.toCharArray()) {
sb.append(s.replaceAll("[^" + c + "]", ""));
}
System.out.println(sb.toString());
BigDecimal elapsedTime =
new BigDecimal( String.valueOf(System.nanoTime() - startTime)
)
.divide(
new BigDecimal( String.valueOf(1_000_000_000)
)
);
System.out.println(elapsedTime + " seconds");
}
}
Explanation:
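For each character in the pattern, use String's regex-based replaceAll method to replace every character except the current one with an empty string, then repeat for the next pattern character. Each call leaves only the occurrences of that one character, so appending the results in pattern order yields the sorted string. (A pattern character that is special inside a regex character class, such as \, would need escaping first.)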
Outputs:
112222233456677790AaaaaaBbCccDddEeeFfGggHh
0.021151652 seconds
(The timing is somewhat subjective. It came from the Sololearn Java Playground. It obviously depends on the current load on their servers)
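For comparison: each replaceAll call scans the whole string, so the regex version makes m passes over n characters. A counting variant that stays within the requested O(n + m) might look like the sketch below. The names PatternSort and sortByPattern are made up for illustration, and the sketch assumes, as the question states, that every character of the string occurs in the pattern (anything else is silently dropped).

```java
final class PatternSort {
    /** Orders the characters of s by their position in pattern. */
    static String sortByPattern(String pattern, String s) {
        // One counter per possible char value; indexing by char avoids boxing and hashing.
        int[] counts = new int[Character.MAX_VALUE + 1];
        for (int i = 0; i < s.length(); i++) {
            counts[s.charAt(i)]++;
        }
        StringBuilder result = new StringBuilder(s.length());
        // Walk the pattern once and emit each character as often as it was counted.
        for (int i = 0; i < pattern.length(); i++) {
            char ch = pattern.charAt(i);
            for (int k = 0; k < counts[ch]; k++) {
                result.append(ch);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(sortByPattern(
                "1234567890AaBbCcDdEeFfGgHh",
                "dH7ee2D6a341Fb9Ea20dhC1g7ca32Ba2Gac5f76A2g"));
        // prints 112222233456677790AaaaaaBbCccDddEeeFfGggHh
    }
}
```

The fixed-size array costs a constant 65,536 counters regardless of input, which is a reasonable trade for long strings but arguably wasteful for very short ones.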
Let's say I have a string like this:
String str = "~asdfl;kjx,~rgadfaeg,dsafnewgfljka;ldfjsfa;dlkjfa;lvjvbnaber;fwelfjadfafa";
int character = 12;
What I want to do is delete every 12th character in the string: the 12th, then the 24th, then the 36th, and so on until the end of the string.
The interval (every 12th, or every 2nd) has to come from the character variable, since that variable changes.
I tried doing this with regex:
System.out.println(s.replaceAll(".(.)", "$12"));
But it didn't work. Any help?
Sometimes, a simple for loop is all you need:
public class Test {
public static void main(String[] args) {
String str = "~asdfl;kjx,~rgadfaeg,dsafnewgfljka;ldfjsfa;dlkjfa;lvjvbnaber;fwelfjadfafa";
int character = 12;
StringBuilder sb = new StringBuilder();
for (int i = 0; i < str.length(); i++) {
if ((i + 1) % character != 0) {
sb.append(str.charAt(i));
}
}
String result = sb.toString();
System.out.println(result);
}
}
If you insist on using regular expressions, you can interpolate the character variable into the expression as follows:
public class Test {
public static void main(String[] args) {
String str = "~asdfl;kjx,~rgadfaeg,dsafnewgfljka;ldfjsfa;dlkjfa;lvjvbnaber;fwelfjadfafa";
int character = 12;
System.out.println(str.replaceAll("(.{" + (character - 1) + "}).", "$1"));
}
}
To delete every 12th character using regex, use this pattern:
(.{11}).
And then replace with just the captured $1.
Sample Java code:
String str = "~asdfl;kjx,~rgadfaeg,dsafnewgfljka;ldfjsfa;dlkjfa;lvjvbnaber;fwelfjadfafa";
String output = str.replaceAll("(.{11}).", "$1");
System.out.println(output);
This prints:
~asdfl;kjx,rgadfaeg,dsfnewgfljka;dfjsfa;dlkja;lvjvbnabe;fwelfjadfaa
Edit:
To make the width configurable, build the pattern from a variable (a width of 11 deletes every 12th character):
String str = "~asdfl;kjx,~rgadfaeg,dsafnewgfljka;ldfjsfa;dlkjfa;lvjvbnaber;fwelfjadfafa";
int width = 11;
String output = str.replaceAll("(.{" + width + "}).", "$1");
System.out.println(output);
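One caveat with the regex versions: by default `.` does not match line terminators, so a string containing newlines would throw the counting off. A possible way around this, sketched below, is the embedded (?s) flag (DOTALL). The helper name removeEveryNth is hypothetical.

```java
final class RemoveEveryNth {
    /** Removes every n-th char (counting from 1) from s; n must be at least 1. */
    static String removeEveryNth(String s, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        // (?s) turns on DOTALL so that '.' also matches '\n' and other line terminators.
        return s.replaceAll("(?s)(.{" + (n - 1) + "}).", "$1");
    }

    public static void main(String[] args) {
        System.out.println(removeEveryNth("abcdef", 2));      // prints ace
        System.out.println(removeEveryNth("a\nb\nc\n", 2));   // prints abc
    }
}
```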
Avoid char
The char type in Java is legacy, essentially broken. As a 16-bit value, a char cannot represent any character outside the Basic Multilingual Plane, such as most emoji; those are stored as two chars (a surrogate pair).
Code points
Instead, use code point integers.
Make a list of each character's code point, boxing the int values so they can go into a List< Integer >.
List< Integer > codePoints = input.codePoints().boxed().toList() ;
Make an IntStream of the indexes we need to access each element of that list. Use each index to pull out a code point, and filter out every 12th element (the indexes are zero-based, so test n + 1). Collect into a StringBuilder.
String result =
    IntStream
    .range( 0 , codePoints.size() )
    .filter( n -> ( n + 1 ) % 12 != 0 )
    .mapToObj( codePoints :: get )
    .collect( StringBuilder :: new , StringBuilder :: appendCodePoint , StringBuilder :: append )
    .toString()
    ;
That is untested code, but should be close to what you need.
My code here is based on what I saw on this similar Question.
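A plain loop over code points is another option. The sketch below assumes that "every nth character" means every nth code point; the names CodePointFilter and removeEveryNthCodePoint are invented for this example.

```java
final class CodePointFilter {
    /** Removes every n-th code point (counting from 1) from input. */
    static String removeEveryNthCodePoint(String input, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        StringBuilder sb = new StringBuilder(input.length());
        int position = 0; // 1-based position of the current code point
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            position++;
            if (position % n != 0) {
                sb.appendCodePoint(cp);
            }
            // Advance by 1 char for BMP code points and by 2 for surrogate pairs.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\uD83D\uDE00" is a single emoji (one code point) stored as two chars.
        String s = "ab\uD83D\uDE00cd";
        System.out.println(removeEveryNthCodePoint(s, 3)); // prints abcd
    }
}
```

A char-based loop over the same input would count the emoji as two positions and could split the surrogate pair.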
I'm trying to search for patterns of letters in a file. The pattern is entered by the user and arrives as a String. So far I've got it to find the first letter, but I'm unsure how to make it test whether the next letters also match the pattern.
This is the loop I currently have. Any help would be appreciated.
public void exactSearch(){
if (pattern==null){UI.println("No pattern");return;}
UI.println("===================\nExact searching for "+patternString);
int j = 0 ;
for(int i=0; i<data.size(); i++){
if(patternString.charAt(i) == data.get(i) )
j++;
UI.println( "found at " + j) ;
}
}
You need to iterate over the first string until you find the first character of the other string. From there, you can run an inner loop that advances through both simultaneously, as you did.
Hint: be sure to watch the boundaries, as the strings might not be the same size.
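The outer-loop/inner-loop idea from the hint can be sketched as follows. This is a naive search written for illustration, not the asker's actual class: it works on two plain Strings, and the names NaiveSearch and findAll are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

final class NaiveSearch {
    /** Returns every index at which pattern occurs in data (overlaps included). */
    static List<Integer> findAll(String data, String pattern) {
        List<Integer> positions = new ArrayList<>();
        if (pattern.isEmpty()) {
            return positions; // treating an empty pattern as "no matches" is a choice made for this sketch
        }
        // Boundary check: only start where the whole pattern still fits.
        for (int i = 0; i + pattern.length() <= data.length(); i++) {
            int j = 0;
            // Inner loop: advance through both strings while the characters agree.
            while (j < pattern.length() && data.charAt(i + j) == pattern.charAt(j)) {
                j++;
            }
            if (j == pattern.length()) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        System.out.println(findAll("foo-bar-baz-bar-", "bar")); // prints [4, 12]
    }
}
```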
You can try this:
String a1 = "foo-bar-baz-bar-";
String pattern = "bar";
int foundIndex = 0;
while(foundIndex != -1) {
foundIndex = a1.indexOf(pattern,foundIndex);
if(foundIndex != -1)
{
System.out.println(foundIndex);
foundIndex += 1;
}
}
indexOf: the first parameter is the pattern string, and the second is the index from which to start searching.
If the pattern is found, indexOf returns the index at which the match starts.
If the pattern is not found, indexOf returns -1.
String data = "foo-bar-baz-bar-";
String pattern = "bar";
int foundIndex = data.indexOf(pattern);
while (foundIndex > -1) {
System.out.println("Match found at: " + foundIndex);
foundIndex = data.indexOf(pattern, foundIndex + pattern.length());
}
Based on your request, you can use this algorithm to search for the positions:
1) At each position we check whether the remaining part of the string is shorter than the pattern. If it is, we stop, which avoids a StringIndexOutOfBoundsException from substring.
2) Otherwise we take the substring of the pattern's length starting at that position and compare it with the pattern.
List<Integer> positionList = new LinkedList<>();
String inputString = "AAACABCCCABC";
String pattern = "ABC";
for (int i = 0 ; i < inputString.length(); i++) {
if (inputString.length() - i < pattern.length()){
break;
}
String currentSubString = inputString.substring(i, i + pattern.length());
if (currentSubString.equals(pattern)){
positionList.add(i);
}
}
for (Integer pos : positionList) {
System.out.println(pos); // Positions : 4 and 9
}
EDIT:
This could probably be optimized, for instance by not using a collection at all for such a simple task; I used a LinkedList only because it was quick to write.
Two functions are defined below. They do exactly the same thing: each takes a template (in which some substrings should be replaced) and an array of string values (key/value pairs to replace, e.g. [subStrToReplace1, value1, subStrToReplace2, value2, ...]) and returns the populated string.
In the second function I iterate over the words of the template, look each one up in the HashMap, and then move on to the next word. If a word is replaced by a substring that itself contains another key from values, I need to iterate over the template twice, and that is what I did.
I would like to know which one I should use, and why. Any better alternatives are also welcome.
1st function
public static String populateTemplate1(String template, String... values) {
String populatedTemplate = template;
for (int i = 0; i < values.length; i += 2) {
populatedTemplate = populatedTemplate.replace(values[i], values[i + 1]);
}
return populatedTemplate;
}
2nd function
public static String populateTemplate2(String template, String... values) {
HashMap<String, String> map = new HashMap<>();
for (int i = 0; i < values.length; i += 2) {
map.put(values[i],values[i+1]);
}
StringBuilder regex = new StringBuilder();
boolean first = true;
for (String word : map.keySet()) {
if (first) {
first = false;
} else {
regex.append('|');
}
regex.append(Pattern.quote(word));
}
Pattern pattern = Pattern.compile(regex.toString());
int N0OfIterationOverTemplate =2;
// Pattern allowing to extract only the words
// Pattern pattern = Pattern.compile("\\w+");
StringBuilder populatedTemplate = new StringBuilder();
String temp_template=template;
while(N0OfIterationOverTemplate!=0){
populatedTemplate = new StringBuilder();
Matcher matcher = pattern.matcher(temp_template);
int fromIndex = 0;
while (matcher.find(fromIndex)) {
// The start index of the current word
int startIdx = matcher.start();
if (fromIndex < startIdx) {
// Add what we have between two words
populatedTemplate.append(temp_template, fromIndex, startIdx);
}
// The current word
String word = matcher.group();
// Replace the word by itself or what we have in the map
// populatedTemplate.append(map.getOrDefault(word, word));
if (map.get(word) == null) {
populatedTemplate.append(word);
}
else {
populatedTemplate.append(map.get(word));
}
// Start the next find from the end index of the current word
fromIndex = matcher.end();
}
if (fromIndex < temp_template.length()) {
// Add the remaining sub String
populatedTemplate.append(temp_template, fromIndex, temp_template.length());
}
N0OfIterationOverTemplate--;
temp_template=populatedTemplate.toString();
}
return populatedTemplate.toString();
}
Definitely the first one, for at least two reasons:
1) It is shorter and easier to read, so it is easier to maintain and much less error-prone.
2) It doesn't rely on a regular expression, so it should be faster.
The first function is much clearer and easier to understand. I would prefer it unless a profiler shows that it takes a considerable amount of time and slows your application down. Then you can figure out how to optimize it.
Why make things complicated when you can keep them simple?
Keep in mind that simple solutions tend to be the best.
FYI, if the number of elements is odd, you will get an ArrayIndexOutOfBoundsException.
I propose this improvement:
public static String populateTemplate(String template, String... values) {
String populatedTemplate = template;
int nextTarget = 2;
int lastTarget = values.length - nextTarget;
for (int i = 0; i <= lastTarget; i += nextTarget) {
String target = values[i];
String replacement = values[i + 1];
populatedTemplate = populatedTemplate.replace(target, replacement);
}
return populatedTemplate;
}
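One behavioural difference is worth keeping in mind when choosing: chained replace calls can rewrite text that an earlier replacement just inserted, whereas a single regex pass never revisits its own output. If the single-pass behaviour is what you want, a shorter way to get it is sketched below using Matcher.replaceAll with a function argument (Java 9 or later). The names TemplateSketch and populate are hypothetical, and the sketch takes a Map rather than varargs.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

final class TemplateSketch {
    /** Replaces every key of values found in template, in one pass over the template. */
    static String populate(String template, Map<String, String> values) {
        if (values.isEmpty()) {
            return template;
        }
        // Quote each key so it is matched literally, then join the keys into one alternation.
        String alternation = values.keySet().stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Matcher matcher = Pattern.compile(alternation).matcher(template);
        // quoteReplacement keeps '$' and '\' in the values from being read as group references.
        return matcher.replaceAll(match -> Matcher.quoteReplacement(values.get(match.group())));
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("A", "B");
        values.put("B", "C");
        System.out.println(populate("A B", values)); // prints B C
    }
}
```

With the chained replace approach the same input would give "C C", because the B produced by the first replacement is replaced again. If one key is a prefix of another, the alternation picks whichever comes first, so the key order matters in that case.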
"Good programmers write code that humans can understand". Martin Fowler
So I was developing an algorithm to count the number of repetitions of each character in a given word. I am using a HashMap and I add each unique character to the HashMap as the key and the value is the number of repetitions. I would like to know what the run time of my solution is and if there is a more efficient way to solve the problem.
Here is the code:
public static void getCount(String name){
HashMap<String, Integer> names = new HashMap<String, Integer>();
for(int i =0; i<name.length(); i++){
if(names.containsKey(name.substring(i, i+1))){
names.put(name.substring(i, i+1), names.get(name.substring(i, i+1)) +1);
}
else{
names.put(name.substring(i, i+1), 1);
}
}
Set<String> a = names.keySet();
Iterator i = a.iterator();
while(i.hasNext()){
String t = (String) i.next();
System.out.println(t + " Occurred " + names.get(t) + " times");
}
}
The algorithm has a time complexity of O(n), but I'd change some parts of your implementation, namely:
1) Use a single get() instead of containsKey() + get().
2) Use charAt() instead of substring(), which creates a new String object.
3) Use a Map<Character, Integer> instead of a Map<String, Integer>, since you only care about a single character, not an entire String.
In other words:
public static void getCount(String name) {
Map<Character, Integer> names = new HashMap<Character, Integer>();
for(int i = 0; i < name.length(); i++) {
char c = name.charAt(i);
Integer count = names.get(c);
if (count == null) {
count = 0;
}
names.put(c, count + 1);
}
Set<Character> a = names.keySet();
for (Character t : a) {
System.out.println(t + " Occurred " + names.get(t) + " times");
}
}
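On Java 8 or later, the get/null-check/put sequence can be collapsed with Map.merge. The following is a minimal sketch; the class name CharCount is made up, and a LinkedHashMap is swapped in only so that the output follows first-occurrence order.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CharCount {
    /** Counts how often each char occurs in name. */
    static Map<Character, Integer> count(String name) {
        Map<Character, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i < name.length(); i++) {
            // Inserts 1 for a new key, otherwise adds 1 to the existing count.
            counts.merge(name.charAt(i), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("good")); // prints {g=1, o=2, d=1}
    }
}
```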
Your solution is O(n) from an algorithmic perspective, which is already optimal (at a minimum you have to inspect each character of the string once, which is O(n)).
However, there are a couple of ways you could speed it up by reducing the constant overhead, e.g.:
1) Use a HashMap<Character,Integer>. Characters are much more efficient than Strings of length 1.
2) Use charAt(i) instead of substring(i,i+1). This avoids creating a new String on every call and is probably the biggest single improvement you can make.
3) If the string is going to be long (e.g. thousands of characters or more), consider counting in an int[] array rather than a HashMap, with the character's ASCII value used as the index into the array. This isn't a good idea if your Strings are short, though.
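The array-based counting from the last suggestion could look like the sketch below. It assumes the input really is pure ASCII and rejects anything else; the names AsciiCount and countAscii are illustrative only.

```java
final class AsciiCount {
    /** Returns an array in which index c holds the number of occurrences of ASCII char c. */
    static int[] countAscii(String s) {
        int[] counts = new int[128]; // one slot per ASCII code
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 128) {
                throw new IllegalArgumentException("non-ASCII character at index " + i);
            }
            counts[c]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] counts = countAscii("good");
        for (int c = 0; c < counts.length; c++) {
            if (counts[c] > 0) {
                System.out.println((char) c + " occurred " + counts[c] + " times");
            }
        }
    }
}
```

Note that the report comes out in character-code order rather than in order of first appearance.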
Store the start time in a variable, like so:
long start = System.currentTimeMillis();
Then at the end, when you finish, print the current time minus the start time to see how long it took:
System.out.println((System.currentTimeMillis() - start) + "ms taken");
As far as I can tell, yours is the most efficient way to do it, but there may be another good method. Also, use char rather than String for each individual character (char/Character is the right type for a single character; String is for a sequence of chars): call name.charAt(i) rather than name.substring(i, i+1), and change your map to HashMap<Character, Integer>.
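A variation on the same idea uses System.nanoTime(), which is intended for measuring elapsed time, whereas currentTimeMillis() follows the wall clock and can jump if the clock is adjusted. The helper below is a rough sketch with an invented name (TimingSketch.elapsedMillis); a single run like this is only indicative, since JIT warm-up and system load affect the figure.

```java
import java.util.concurrent.TimeUnit;

final class TimingSketch {
    /** Runs task once and returns the elapsed time in whole milliseconds. */
    static long elapsedMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        long ms = elapsedMillis(() -> {
            // Stand-in workload; replace with the code being measured.
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 100_000; i++) {
                sb.append(i % 10);
            }
        });
        System.out.println(ms + "ms taken");
    }
}
```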
String s="good";
//collect different unique characters
ArrayList<String> temp=new ArrayList<>();
for (int i = 0; i < s.length(); i++) {
char c=s.charAt(i);
if(!temp.contains(""+c))
{
temp.add(""+s.charAt(i));
}
}
System.out.println(temp);
//get count of each occurrence in the string
for (int i = 0; i < temp.size(); i++) {
int count=0;
for (int j = 0; j < s.length(); j++) {
if(temp.get(i).equals(s.charAt(j)+"")){
count++;
}
}
System.out.println("Occurrence of "+ temp.get(i) + " is "+ count+ " times" );
}
As input I have a string such as "today snowing know ". It contains 3 words, and I must process them as follows: compare every character with all the other characters and count how many times the same character occurs across these words. For example, the count for the letter "o" will be 2 (from "today" and "snowing"), and for the letter "w" it will be 2 (from "know" and "snowing"). After that I must replace each character with its count (converted to a char). The result should be "13111 133211 1332".
What did I do?
First I type in some words:
public void inputStringsForThreads () {
boolean flag;
do {
// will invite to input
stringToParse = Input.value();
try {
flag = true;
// in case that found nothing , space , number and other special character , throws an exception
if (stringToParse.equals("") | stringToParse.startsWith(" ") | stringToParse.matches(".*[0-9].*") | stringToParse.matches(".*[~`!##$%^&*()-+={};:',.<>?/'_].*"))
throw new MyStringException(stringToParse);
else analizeString(stringToParse);
}
catch (MyStringException exception) {
stringToParse = null;
flag = false;
exception.AnalizeException();
}
}
while (!flag);
}
Then I remove the spaces between the words and join the words into a single string:
static void analizeString (String someString) {
// + sign treat many spaces as one
String delimitator = " +";
// words is a String Array
words = someString.split(delimitator);
// temp is a string , will contain a single word
temp = someString.replaceAll("[^a-z^A-Z]","");
System.out.println("=============== Words are : ===============");
for (int i=0;i<words.length;i++)
System.out.println((i+1)+")"+words[i]);
}
Then I try to compare each word in turn (every word is split into letters) with all the letters from all the words, but I don't know how to count the occurrences of each letter and then replace the letters with the right counts. Any ideas?
// this will containt characters for every word in part
char[] motot = words[id].toCharArray();
// this will containt all characters from all words
char[] notot = temp.toCharArray();
for (int i =0;i<words[i].length();i++)
for (int j=0;j<temp.length ;j++)
{
if (i == j) {
System.out.println("Same word");
}
else if (motot[i] == notot[j] ) {
System.out.println("Found equal :"+lol[i]+" "+lol1[j]);
}}
For counting you might want to use a Map<Character, Integer> counter such as a java.util.HashMap. If looking up a character in the counter returns a non-null Integer, increment that value and put it back (autoboxing helps here). Otherwise put a new entry (char, 1) in the counter.
Replacing the letters with the numbers should be fairly easy then.
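Put together, the counting and replacement steps might look like this. It is only a sketch: the names LetterCounts and replaceWithCounts are invented, spaces are the only separator it knows about, and a count of 10 or more would be written as several digits.

```java
import java.util.HashMap;
import java.util.Map;

final class LetterCounts {
    /** Replaces every non-space char with the number of times it occurs in the whole input. */
    static String replaceWithCounts(String input) {
        Map<Character, Integer> counter = new HashMap<>();
        // First pass: count every char except spaces.
        for (char c : input.toCharArray()) {
            if (c == ' ') {
                continue;
            }
            Integer seen = counter.get(c);
            counter.put(c, seen == null ? 1 : seen + 1);
        }
        // Second pass: write the count in place of each char, keeping the spaces.
        StringBuilder out = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (c == ' ') {
                out.append(' ');
            } else {
                out.append(counter.get(c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceWithCounts("ab ba c")); // prints 22 22 1
    }
}
```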
It is better to use pattern matching like this.
Initially:
private Matcher matcher;
Pattern regexPattern = Pattern.compile( pattern );
matcher = regexPattern.matcher("");
For multiple patterns to match:
private final String[] patterns = new String[] {/* instantiate patterns here.. */};
private Matcher[] matchers = new Matcher[patterns.length];
for ( int i = 0; i < patterns.length; i++ ) {
    Pattern regexPattern = Pattern.compile( patterns[i] );
    matchers[i] = regexPattern.matcher("");
}
Then, to match a pattern, do this:
if ( matcher.reset(charBuffer).find() ) { /* matching pattern */ }
For the multiple-matcher check:
for ( int i = 0; i < matchers.length; i++ ) if ( matchers[i].reset(charBuffer).find() ) { /* matching pattern */ }
Avoid the String matching methods such as String.matches here; they are not efficient, because they recompile the pattern on every call.
Prefer a CharBuffer over a String as the input.
Here is some C# code (which is reasonably similar to Java):
string replace(string s){
    Dictionary<char, int> counts = new Dictionary<char, int>();
    foreach(char c in s){
        // skip spaces
        if(c == ' ') continue;
        // update count for char c
        if(!counts.ContainsKey(c)) counts.Add(c, 1);
        else counts[c]++;
    }
    // build the result: each non-space character becomes its count
    StringBuilder result = new StringBuilder(s.Length);
    for(int i = 0; i < s.Length; i++){
        if(s[i] == ' ') result.Append(' ');
        else result.Append(counts[s[i]]);
    }
    return result.ToString();
}
Strings are immutable, so the second loop appends to a StringBuilder (from System.Text) instead of assigning into s.
Here is a solution that works for lower case strings only. Horrible horrible code, but I was trying to see how few lines I could write a solution in.
public static String letterCount(String in) {
StringBuilder out = new StringBuilder(in.length() * 2);
int[] count = new int[26];
for (int t = 1; t >= 0; t--)
for (int i = 0; i < in.length(); i++) {
if (in.charAt(i) != ' ') count[in.charAt(i) - 'a'] += t;
out.append((in.charAt(i) != ' ') ? "" + count[in.charAt(i) - 'a'] : " ");
}
return out.substring(in.length());
}