Longest Common Substring in a big text - java

I have this assignment for school which asks us to write code to find the longest common substring. I have done that, but it only works with texts that are not very big, and we are asked to find the longest common substring of Moby Dick and War and Peace. If you could point me in the right direction about what I'm doing wrong, I would appreciate it. At runtime the JVM reports the error in the substring method of the MyString class, when I call it to build the SuffixArray, but I don't know why it says the data is too big and throws an OutOfMemoryError.
package datastructuresone;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Scanner;
class SuffixArray
{
private final MyString[] suffixes;
private final int N;
public SuffixArray(String s)
{
N = s.length();
MyString snew = new MyString(s);
suffixes = new MyString[N];
for (int i = 0; i < N; i++)
{
suffixes[i] = snew.substring(i);
}
Arrays.sort(suffixes);
}
public int length()
{
return N;
}
public int index(int i)
{
return N - suffixes[i].length();
}
public MyString select(int i)
{
return suffixes[i];
}
// length of longest common prefix of s and t
private static int lcp(MyString s, MyString t)
{
int N = Math.min(s.length(), t.length());
for (int i = 0; i < N; i++)
{
if (s.charAt(i) != t.charAt(i))
{
return i;
}
}
return N;
}
// longest common prefix of suffixes(i) and suffixes(i-1)
public int lcp(int i)
{
return lcp(suffixes[i], suffixes[i - 1]);
}
// longest common prefix of suffixes(i) and suffixes(j)
public int lcp(int i, int j)
{
return lcp(suffixes[i], suffixes[j]);
}
}
public class DataStructuresOne
{
public static void main(String[] args) throws FileNotFoundException
{
Scanner in1 = new Scanner(new File("./build/classes/WarAndPeace.txt"));
Scanner in2 = new Scanner(new File("./build/classes/MobyDick.txt"));
StringBuilder sb = new StringBuilder();
StringBuilder sb1 = new StringBuilder();
while (in1.hasNextLine())
{
sb.append(in1.nextLine());
}
while (in2.hasNextLine())
{
sb1.append(in2.nextLine());
}
String text1 = sb.toString().replaceAll("\\s+", " ");
String text2 = sb1.toString().replaceAll("\\s+", " ");
int N1 = text1.length();
int N2 = text2.length();
SuffixArray sa = new SuffixArray(text1 + "#" + text2);
int N = sa.length();
String substring = "";
for (int i = 1; i < N; i++)
{
// adjacent suffixes both from second text string
if (sa.select(i).length() <= N2 && sa.select(i - 1).length() <= N2)
{
continue;
}
// adjacent suffixes both from first text string
if (sa.select(i).length() > N2 + 1 && sa.select(i - 1).length() > N2 + 1)
{
continue;
}
// check if adjacent suffixes longer common substring
int length = sa.lcp(i);
if (length > substring.length())
{
substring = sa.select(i).toString().substring(0, length);
System.out.println(substring + " ");
}
}
System.out.println("The length of the substring " + substring.length() + "length on first N " + N1 + " length of Second N " + N2
+ "The length of the array sa: " + N);
System.out.println("'" + substring + "'");
}
}
final class MyString implements Comparable<MyString>
{
public MyString(String str)
{
offset = 0;
len = str.length();
arr = str.toCharArray();
}
public int length()
{
return len;
}
public char charAt(int idx)
{
return arr[ idx + offset];
}
public int compareTo(MyString other)
{
int myEnd = offset + len;
int yourEnd = other.offset + other.len;
int i = offset, j = other.offset;
for (; i < myEnd && j < yourEnd; i++, j++)
{
if (arr[ i] != arr[ j])
{
return arr[ i] - arr[ j];
}
}
// reached end. Who got there first?
if (i == myEnd && j == yourEnd)
{
return 0; // identical strings
}
if (i == myEnd)
{
return -1;
} else
{
return +1;
}
}
public MyString substring(int beginIndex, int endIndex)
{
return new MyString(arr, beginIndex + offset, endIndex - beginIndex);
}
public MyString substring(int beginIndex)
{
// endIndex is relative to this MyString, so the suffix ends at len (not offset + len)
return substring(beginIndex, len);
}
public boolean equals(Object other)
{
return (other instanceof MyString) && compareTo((MyString) other) == 0;
}
public String toString()
{
return new String(arr, offset, len);
}
private MyString(char[] a, int of, int ln)
{
arr = a;
offset = of;
len = ln;
}
private char[] arr;
private int offset;
private int len;
}

Here:
for (int i = 0; i < N; i++)
{
suffixes[i] = snew.substring(i);
}
You are trying to store not only the entire long string, but also the entire string minus 1 letter, the entire string minus 2 letters, and so on. If each suffix holds its own copy of its characters, all of these are stored separately.
If your String were only 10 letters long, you would be storing a total of 55 characters in 10 different strings.
At 1000 characters, you are storing 500500 characters in total.
More generally, you have to handle length*(length+1)/2 characters.
Just for fun: I don't know how many characters are in War and Peace, but with a page count of around 1250, a typical estimate of 250 words per page, and an average word length of about 5 characters, that comes to:
(1250 * 250 * 5) * (1250 * 250 * 5 + 1) / 2 = 1.2207039 * 10^12 characters.
A char takes 2 bytes in memory, so you're looking at about 2.44 * 10^12 bytes, roughly 2.22 TiB (compared to about 1.49 MiB for just the text of the novel at one byte per character).

I count at least 3 copies of both texts in the first few lines of the code. Here are a few ideas:
Convert the whitespace as you read each line in, not after the texts have become huge strings. Don't forget the case of spaces at the start and end of lines.
Build your MyString class using StringBuilder as the base instead of String. Do all the searching inside the StringBuilder with its native methods, if you can.
Don't extract strings any more than you have to.
Look up the -Xmx Java runtime option and set the heap size larger than the default. Note that the value follows the flag directly, with no equals sign, and needs a unit suffix such as M: -Xmx1024M. (Look at the file sizes to see how big the two books are.)
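To check whether a larger heap setting actually took effect, the program can ask the JVM for its limit. This is only a small sketch: the class name HeapCheck is made up for illustration, while Runtime.maxMemory() is a standard JDK call.

```java
// Hypothetical helper class (not part of the original code).
class HeapCheck {

    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Started as java -Xmx1024M HeapCheck, it should report a figure in the region of 1024 MiB; the exact number can be somewhat lower depending on the JVM and garbage collector.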

When you construct MyString, you call arr = str.toCharArray(); which makes a new copy of the string's character data. But in Java, a string is immutable - so why not store a reference to the string instead of a copy of its data?
You construct every suffix at once, but you only refer to one (well, two) at a time. If you recode your solution to only reference the suffixes it currently cares about, and construct them only when it needs them (and lose a reference to them afterwards), they can be garbage collected by Java. This will make running out of memory less likely. Compare the memory overhead of storing 2 strings to storing hundreds of thousands of strings :)
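One way to picture "store a reference instead of a copy" is to sort suffix start positions rather than suffix objects, so the only large object is the text itself. The sketch below is an illustration under that assumption, not the asker's code; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: every suffix is represented only by its start index.
class SuffixIndexSketch {

    // Compares the suffixes of text starting at a and b without copying characters.
    static int compareSuffixes(String text, int a, int b) {
        int n = text.length();
        while (a < n && b < n) {
            char ca = text.charAt(a);
            char cb = text.charAt(b);
            if (ca != cb) {
                return Character.compare(ca, cb);
            }
            a++;
            b++;
        }
        // Whichever suffix reached the end first is a prefix of the other and sorts first.
        return b - a;
    }

    // Returns the start positions of all suffixes in sorted suffix order.
    static Integer[] sortedSuffixStarts(String text) {
        Integer[] starts = new Integer[text.length()];
        for (int i = 0; i < starts.length; i++) {
            starts[i] = i;
        }
        Arrays.sort(starts, (x, y) -> compareSuffixes(text, x, y));
        return starts;
    }

    public static void main(String[] args) {
        // prints [5, 3, 1, 0, 4, 2]
        System.out.println(Arrays.toString(sortedSuffixStarts("banana")));
    }
}
```

Memory then grows linearly with the text length (one boxed index per suffix) rather than quadratically. Sorting can still be slow on texts with long repeated passages, since each comparison may scan many characters.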

I wrote this program in Scala. Maybe you can translate it to Java.
class MyString private (private val string: String, startIndex: Int, endIndex: Int) extends Comparable[MyString] {
def this(string: String) = this(string, 0, string.length)
def length() = endIndex-startIndex
def charAt(i: Int) = {
if(i >= length) throw new IndexOutOfBoundsException
string.charAt(startIndex + i)
}
def substring(start: Int, end: Int): MyString = {
if(start < 0 || end > length || end < start) throw new IndexOutOfBoundsException
new MyString(string, startIndex + start, startIndex + end)
}
def substring(start: Int): MyString = substring(start, length)
def longestCommonSubstring(other: MyString): MyString = {
var index = 0
val len = math.min(length, other.length)
while(index < len && charAt(index) == other.charAt(index)) index += 1
substring(0, index)
}
def compareTo(other: MyString): Int = {
val len = math.min(length, other.length)
for(i <- 0 until len) {
if(charAt(i) > other.charAt(i)) return 1
if(charAt(i) < other.charAt(i)) return -1
}
length-other.length
}
def >(other: MyString) = compareTo(other) > 0
def <(other: MyString) = compareTo(other) < 0
override def equals(other: Any) = other.isInstanceOf[MyString] && compareTo(other.asInstanceOf[MyString]) == 0
override def toString() = "\"" + string.substring(startIndex, endIndex) + "\""
}
def readFile(name: String) = new MyString(io.Source.fromFile(name).getLines.mkString(" ").replaceAll("\\s+", " "))
def makeList(str: MyString) = (0 until str.length).map(i => str.substring(i)).toIndexedSeq
val string1 = readFile("WarAndPeace.txt")
val string2 = readFile("MobyDick.txt")
val (list1, list2) = (makeList(string1).sorted, makeList(string2).sorted)
var longestMatch = new MyString("")
var (index1, index2) = (0,0)
while(index1 < list1.size && index2 < list2.size) {
val lcs = list1(index1).longestCommonSubstring(list2(index2))
if(lcs.length > longestMatch.length) longestMatch = lcs
if(list1(index1) < list2(index2)) index1 += 1
else index2 += 1
}
println(longestMatch)

Related

Modifying String in Java

I have a String: String a = newNumber + "*" + nn + "+" + difference;
where newNumber = 106, nn = 3 and difference = 3,
so the output is as follows:
Output :
106*3+3
I would like to modify the String so that the output becomes (35*3+1)*3+3, and then modify this new String again so that it becomes ((11*3+2)*3+1)*3+3.
Basically, I just need to replace newNumber, which started as 106 and keeps changing until it reaches 11. As you can see, I'm trying to modify only newNumber, replacing it with another expression while leaving the rest of the String untouched. I'm only replacing and adding to it. How can this be achieved?
The output should look like this:
Output :
106*3+3
(35*3+1)*3+3
((11*3+2)*3+1)*3+3
I'm solving an equation in steps. The formulas don't matter; I'm just trying to figure out how to modify the String by replacing newNumber with another number and adding new brackets to the equation.
I hope I described my problem in a way you can understand. I'd really appreciate the help.
I could not reproduce your exact output, but here is code that attempts the problem; it may help you get to a solution.
It breaks each number down until only prime numbers are left and adds those primes to the result. Since we are replacing and appending strings, it is better to use StringBuilder.
import java.io.PrintStream;
import java.util.Arrays;
public class StringSimplification {
public static PrintStream out = System.out;
public static final boolean prime[];
public static final int SIZE = 1000000;
static {
prime = new boolean[SIZE];
Arrays.fill(prime, true);
prime[0] = prime[1] = false;
//Sieve of Eratosthenes algorithm to find whether a number is prime
for (int i = 2; i < SIZE; i++)
if (prime[i])
for (int j = i * 2; j < SIZE; j += i)
prime[j] = false;
}
//simplifies your String expression
public static String simplify(final String expression) {
StringBuilder result = new StringBuilder("");
String exp = "";
for (char ch : expression.toCharArray()) {
if (Character.isDigit(ch))
exp += ch;
else {
if (isNumber(exp)) {
String simplified = getExpression(Integer.parseInt(exp));
result.append(simplified+ch);
exp = "";//clearing exp
}
}
}
result.append(exp);
return result.toString();
}
//returns whether the number is prime or not
static boolean isPrime(final int val) {
return prime[val];
}
static String getExpression(final int val) {
if (val == 0 || val == 1 || prime[val])
return "(" + val + ")";
int prev = 1;
int div = 1;
for (int i = 1; i < val; i++) {
if (val % i == 0) {
prev = i;
div = val / i;
}
}
return getExpression(prev) + "*" + getExpression(div);
}
//Checks whether the expression is a number
public static boolean isNumber(final String s) {
for (var c : s.toCharArray())
if (!Character.isDigit(c))
return false;
return s.length() > 0;
}
public static void main(final String... $) {
out.println(simplify("106*3+3"));
out.println(simplify("1024*3+3"));
}
}
Output:
(53)*(2)*(3)+3
(2)*(2)*(2)*(2)*(2)*(2)*(2)*(2)*(2)*(2)*(3)+3
You can’t actually modify Strings, but you can use replaceFirst() like this:
s = s.replaceFirst("106", "(35*3+1)");
s = s.replaceFirst("35", "(11*3+2)");
etc
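The repeated replacement can be wrapped in a small helper. The sketch below assumes, based only on the example in the question (106 = 35*3+1, 35 = 11*3+2), that each step rewrites a number as quotient*divisor+remainder; the class and method names are hypothetical.

```java
// Hypothetical helper illustrating the step-by-step replacement.
class ExpandStep {

    // Replaces the first textual occurrence of number in expr with "(q*divisor+r)",
    // where q and r are the quotient and remainder of number / divisor.
    static String expand(String expr, int number, int divisor) {
        String target = Integer.toString(number);
        int at = expr.indexOf(target);
        if (at < 0) {
            return expr; // nothing to replace
        }
        String replacement = "(" + (number / divisor) + "*" + divisor + "+" + (number % divisor) + ")";
        // Strings are immutable, so build the new text in a StringBuilder.
        return new StringBuilder(expr).replace(at, at + target.length(), replacement).toString();
    }

    public static void main(String[] args) {
        String s = "106*3+3";
        System.out.println(s);   // 106*3+3
        s = expand(s, 106, 3);
        System.out.println(s);   // (35*3+1)*3+3
        s = expand(s, 35, 3);
        System.out.println(s);   // ((11*3+2)*3+1)*3+3
    }
}
```

Because indexOf matches text rather than whole numbers, the helper could hit digits inside a longer number (for example "35" inside "135"); that case is not handled here.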
Strings in Java are immutable. You will have to use StringBuilder or StringBuffer.
However, if you insist, you may try this (based on what I understood of the pattern):
int num = 106;
String rep = "";
String S = "106*3+3";
String target;
int b = 1;
int largestfactor = 1;
System.out.println(S);
for (int i = num; i > 0; i--) {
for (int j = 1; j < (num - b); j++) {
if ((num - b) % j == 0)
largestfactor = j;
}
target = "" + num;
rep = "(" + largestfactor + "*" + (num - b) / largestfactor + ")" + "+" + b;
S = S.replace(target,rep);
System.out.println(S);
num = largestfactor;
b++;
if(b>num)
break;
}

Get the length of longest substring with distinct characters in any given input string

I have implemented the logic for this problem, but it succeeds only for smaller strings; for larger inputs (such as the one below) it exceeds the time limit and uses a lot of memory. I would like to keep the same approach as below, with whatever enhancements are possible. Could someone tell me what I am doing wrong?
Input :
"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabcde"
Output :
5
Code :
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
class Solution {
public static int lengthOfLongestSubstring(String s) {
int len=0;
char[] cstr = null;
LinkedHashSet<Character> lhs = new LinkedHashSet<Character>();
HashMap<String,Integer> hm = new HashMap<String,Integer>();
String newstr = "";
//Generating all possible substrings using for loops as below and storing them in a hashmap.
for( int i=0;i<s.length();i++ )
{
for( int j=i+1;j<=s.length();j++ )
{
//System.out.println(s.substring(i,j));
if( hm.containsKey( s.substring(i,j) ) )
hm.put( s.substring(i,j), hm.get( s.substring(i,j) ) + 1 );
else
hm.put( s.substring(i,j), 1 );
}
}
//Iterating through hashmap, get the distinct characters and store them in a linked hashset( for every iteration )
for( Map.Entry<String,Integer> m : hm.entrySet() ) {
//System.out.println( m.getKey()+"-"+m.getValue() );
cstr = m.getKey().toCharArray();
//System.out.println(cstr);
for( int i=0;i<cstr.length;i++ )
lhs.add(cstr[i]);
//System.out.println(lhs);
for( Character c : lhs.toArray( new Character[lhs.size()]) )
newstr += c.toString();
//System.out.println(newstr);
if( s.contains(newstr) && newstr.length() > len )
len = newstr.length();
cstr = null;
lhs.clear();
newstr = "";
}
return len;
}
public static void main(String[] args) {
System.out.println(lengthOfLongestSubstring("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabcde"));
}
}
I'll give a solution in Python that runs in linear time. It assumes the input is non-empty and contains only the lowercase letters a-z. You can translate it to Java if need be.
def fun(s):
    last_index_of = [-1]*26
    last_index_of[ord(s[0])-ord('a')]=0
    length, max = 1,1
    for i in range(1,len(s)):
        index = ord(s[i])-ord('a')
        if last_index_of[index]==-1 or i-last_index_of[index]>length:
            length = length+1
            if length>max:
                max=length
        else:
            length = i - last_index_of[index]
        last_index_of[index]=i
    return max
Some results:
fun("abccab")
3
fun("abacc")
3
fun("abcdeabcdeff")
6
fun("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabcde")
5
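A possible Java version of the same linear-time idea is sketched below. It is not a line-by-line translation: it tracks the start of the current window instead of its length, and it indexes by char so it is not limited to a-z. The class name is hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch of a single-pass sliding window.
class LongestDistinct {

    static int lengthOfLongestSubstring(String s) {
        // Last index at which each UTF-16 code unit was seen, or -1 if not seen yet.
        int[] lastIndexOf = new int[Character.MAX_VALUE + 1];
        Arrays.fill(lastIndexOf, -1);
        int best = 0;
        int windowStart = 0; // first index of the current all-distinct window
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (lastIndexOf[c] >= windowStart) {
                // c already occurs inside the window: move the start just past it.
                windowStart = lastIndexOf[c] + 1;
            }
            lastIndexOf[c] = i;
            best = Math.max(best, i - windowStart + 1);
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(lengthOfLongestSubstring("abccab"));       // 3
        System.out.println(lengthOfLongestSubstring("abcdeabcdeff")); // 6
    }
}
```

Each character is visited once and no substrings are created, so both time and extra memory stay independent of how many substrings the input has.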
This is my naive brute-force solution, which uses plain indices and avoids creating additional strings.
public static String findLongestSubstringWithDistinctChars(String str)
{
Objects.requireNonNull(str);
int maxFrameBeginIndex = 0;
int maxFrameLength = 0;
for (int frameBeginIndex = 0; frameBeginIndex < str.length(); frameBeginIndex++)
{
for (int frameLength = 1; frameBeginIndex + frameLength <= str.length(); frameLength++)
{
// Check that last char in frame is not contained in the rest of the frame.
char lastChar = str.charAt(frameBeginIndex + frameLength - 1);
int frameEndIndexExclusive = frameBeginIndex + frameLength - 1;
if (containsChar(str, frameBeginIndex, frameEndIndexExclusive, lastChar))
{
break;
}
if (frameLength > maxFrameLength)
{
maxFrameBeginIndex = frameBeginIndex;
maxFrameLength = frameLength;
}
}
}
return str.substring(maxFrameBeginIndex, maxFrameBeginIndex + maxFrameLength);
}
/**
* Checks whether str's substring in the range [beginIndex, endIndexExclusive)
* contains the character c.
*
* @param str
* the string
* @param beginIndex
* the begin index
* @param endIndexExclusive
* the end index, exclusive
* @param c
* the character to check
* @return whether c is contained in the substring
*/
private static boolean containsChar(String str, int beginIndex, int endIndexExclusive, char c)
{
for (int i = beginIndex; i < endIndexExclusive; i++)
{
if (str.charAt(i) == c)
{
return true;
}
}
return false;
}
Your sample input needs ~1sec on coderpad.io (interesting benchmarking ^_^).
I might have missed something in the OP, but a simple solution might just use substring, indexOf etc. on the string itself. My solution looks like this:
public class LengthOfLongestSubstringWithUniqueCharacters {
public static void main(String[] args) {
String input = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabcde";
int maxLength = -1;
int startPos = -1;
for (int i = 0; i < input.length(); i++) {
int len = getSubstringWithUniqueCharacters(input, i);
if (len > maxLength) {
maxLength = len;
startPos = i;
}
}
System.out.println("String: " + input.substring(startPos, startPos + maxLength));
System.out.println("Max: " + maxLength);
}
public static int getSubstringWithUniqueCharacters(String s, int pos) {
int i = 0;
for (; i < s.length() - pos; i++) {
if (s.substring(pos, pos + i).indexOf(s.charAt(pos + i)) != -1) {
return i;
}
}
return i;
}
}
/**
Output:
String: abcde
Max: 5
*/

Java: Find the longest substring without any number and at least one upper case character

Came across a programming exercise and was stuck. The problem is:
You need to define a valid password for an email but the only
restrictions are:
The password must contain one uppercase character
The password should not have numeric digit
Now, given a String, find the length of the longest substring which
is a valid password. For example, for the input Str = "a0Ba", the output should
be 2, as "Ba" is the valid substring.
I used the concept of longest substring without repeating characters which I already did before but was unable to modify it to find the solution to above problem. My code for longest substring without repeating characters is:
public int lengthOfLongestSubstring(String s) {
int n = s.length();
Set<Character> set = new HashSet<>();
int ans = 0, i = 0, j = 0;
while (i < n && j < n) {
// try to extend the range [i, j]
if (!set.contains(s.charAt(j))){
set.add(s.charAt(j++));
ans = Math.max(ans, j - i);
}
else {
set.remove(s.charAt(i++));
}
}
return ans;
}
How about
final String input = "a0Ba";
final int answer = Arrays.stream(input.split("[0-9]+"))
.filter(s -> s.matches("(.+)?[A-Z](.+)?"))
.sorted((s1, s2) -> s2.length() - s1.length())
.findFirst()
.orElse("")
.length();
System.out.println(answer);
Arrays.stream(input.split("[0-9]+")) splits the original string into an array of strings. The separator is any sequence of digits (digits aren't allowed, so they serve as separators). Then a stream is created so I can apply functional operations and transformations.
.filter(s -> s.matches("(.+)?[A-Z](.+)?")) keeps in the stream only the strings that have at least one upper-case letter.
.sorted((s1, s2) -> s2.length() - s1.length()) sorts the stream by length (desc).
.findFirst() tries to get the first string of the stream.
.orElse("") returns an empty string if no string was found.
.length(); gets the length of the string.
I suggest that you split your String to get an array of strings without digits:
yourString.split("[0-9]")
Then iterate over this array (say, array a) to find the longest string that contains at least one upper-case character:
a[i].matches(".*[A-Z].*");
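Put together, the split-and-check idea might look like the sketch below. It treats "valid" as "no digits and at least one letter A-Z", which is an assumption about the exercise; the class and method names are hypothetical.

```java
// Hypothetical sketch of the split-then-filter approach.
class SplitOnDigits {

    static int longestValid(String input) {
        int best = 0;
        // Runs of digits act as separators, so every part is digit-free.
        for (String part : input.split("[0-9]+")) {
            // (?s) lets '.' match line breaks too; the pattern needs one A-Z somewhere.
            if (part.matches("(?s).*[A-Z].*") && part.length() > best) {
                best = part.length();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(longestValid("a0Ba")); // 2
    }
}
```

This creates one String per digit-free run, which is fine for short inputs; an index-based scan avoids those allocations.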
You can use a simple array. The algorithm to use is a dynamic sliding window, that is, a window whose size changes as it moves, unlike a static (fixed-size) sliding window.
The algorithm should be as follows:
Keep track of 2 indexes of the array of char. These 2 indexes will be referred to as front and back here, representing the front and back of the array.
Have an int (I'll name it up here) to keep track of the number of upper case char.
Set all to 0.
Use a while loop that terminates if front > N where N is the number of char given.
If the next char is not a number, add 1 to front. Then check if that char is upper case. If so, add 1 to up.
If up is at least 1, update the maximum length if necessary.
If the next char is a number, continue checking the following char if they are also numbers. Set front to the first index where the char is not a number and back to front-1.
Output the maximum length.
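The steps above can be sketched roughly as follows. The variable names front, back and up follow the description; here back is the first index of the current window (a slight deviation from "front-1"), and the class name is hypothetical.

```java
// Hypothetical sketch of the dynamic sliding window described above.
class SlidingWindowPassword {

    static int longestValid(String s) {
        char[] chars = s.toCharArray();
        int back = 0;  // first index of the current digit-free window
        int front = 0; // one past the last character examined
        int up = 0;    // number of upper-case characters in the window
        int max = 0;
        while (front < chars.length) {
            char c = chars[front];
            front++;
            if (Character.isDigit(c)) {
                // A digit ends the window: restart right after it.
                back = front;
                up = 0;
            } else {
                if (Character.isUpperCase(c)) {
                    up++;
                }
                if (up >= 1) {
                    max = Math.max(max, front - back);
                }
            }
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(longestValid("a0Ba")); // 2
    }
}
```

The string is scanned once, and apart from the char array no extra objects are created.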
You can use my solution which runs in O(n) time and finds the longest part without any digit and with a capital letter:
String testString = "skjssldfkjsakdfjlskdssfkjslakdfiop7adfaijsldifjasdjfil8klsasdfŞdijpfjapodifjpoaidjfpoaidjpfi9a";
int startIndex = 0;
int longestStartIndex = 0;
int endIndex = 0;
int index = 0;
int longestLength = Integer.MIN_VALUE;
boolean foundUpperCase = false;
while(index <= testString.length()) {
if (index == testString.length() || Character.isDigit(testString.charAt(index))) {
if (foundUpperCase && index > startIndex && index - startIndex > longestLength) {
longestLength = index - startIndex;
endIndex = index;
longestStartIndex = startIndex;
}
startIndex = index + 1;
foundUpperCase = false;
} else if (Character.isUpperCase(testString.charAt(index))) {
foundUpperCase = true;
}
index++;
}
System.out.println(testString.substring(longestStartIndex, endIndex));
You don't need regular expressions. Just use a few integers to act as index pointers into the string:
int i = 0;
int longestStart = 0;
int longestEnd = 0;
while (i < s.length()) {
// Skip past all the digits.
while (i < s.length() && Character.isDigit(s.charAt(i))) {
++i;
}
// i now points to the start of a substring
// or one past the end of the string.
int start = i;
// Keep a flag to record if there is an uppercase character.
boolean hasUppercase = false;
// Increment i until you hit another digit or the end of the string.
while (i < s.length() && !Character.isDigit(s.charAt(i))) {
hasUppercase |= Character.isUpperCase(s.charAt(i));
++i;
}
// Check if this is longer than the longest so far.
if (hasUppercase && i - start > longestEnd - longestStart) {
longestEnd = i;
longestStart = start;
}
}
String longest = s.substring(longestStart, longestEnd);
Whilst more verbose than regular expressions, this has the advantage of not creating any unnecessary objects: the only object created is the longest string, right at the end.
I am using a modification of Kadane's algorithm to find the required password length. You may use the isNumeric() and isCaps() functions or inline the checks as if statements. I have shown it below with functions.
public boolean isNumeric(char x){
return (x>='0'&&x<='9');
}
public boolean isCaps(char x){
return (x>='A'&&x<='Z');
}
public int maxValidPassLen(String a)
{
int max_ending_here = 0;
boolean cFlag = false;
int max_len = 0;
for (int i = 0; i < a.length(); i++)
{
if (isNumeric(a.charAt(i))){
// a digit ends the current run, so start over
max_ending_here = 0;
cFlag = false;
continue;
}
max_ending_here = max_ending_here + 1;
if (isCaps(a.charAt(i))){
cFlag = true;
}
// only the current run counts, and only once it holds a capital
if(cFlag&&max_len<max_ending_here){
max_len = max_ending_here;
}
}
return max_len;
}
Hope this helps.
There are plenty of good answers here but thought it might be of interest to add one that uses Java 8 streams:
IntStream.range(0, s.length()).boxed()
.flatMap(b -> IntStream.rangeClosed(b + 1, s.length())
.mapToObj(e -> s.substring(b, e)))
.filter(t -> t.codePoints().noneMatch(Character::isDigit))
.filter(t -> t.codePoints().filter(Character::isUpperCase).count() == 1)
.mapToInt(String::length).max();
If you wanted the string (rather than just the length), then the last line can be replaced with:
.max(Comparator.comparingInt(String::length));
Which returns an Optional<String>.
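A minimal usage sketch of the pipeline above, wrapped in a method so the Optional can be inspected. The class and method names are hypothetical. It keeps the answer's filters as written (no digits, exactly one uppercase code point) and assumes rangeClosed for the end index so that substrings reaching the end of the input are generated too. Every substring is materialised, so this is only sensible for short inputs.

```java
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.IntStream;

final class StreamPassword {
    // Hypothetical wrapper: returns the longest matching substring, if any.
    static Optional<String> longest(String s) {
        return IntStream.range(0, s.length()).boxed()
                // every substring s[b, e) with b < e <= s.length()
                .flatMap(b -> IntStream.rangeClosed(b + 1, s.length())
                        .mapToObj(e -> s.substring(b, e)))
                .filter(t -> t.codePoints().noneMatch(Character::isDigit))
                .filter(t -> t.codePoints().filter(Character::isUpperCase).count() == 1)
                .max(Comparator.comparingInt(String::length));
    }

    public static void main(String[] args) {
        // An empty Optional means no substring passed both filters.
        System.out.println(longest("a0Ba").orElse("(none)"));
        System.out.println(longest("a0bb").orElse("(none)"));
    }
}
```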
I'd use Streams and Optionals:
public static String getBestPassword(String password) throws Exception {
if (password == null) {
throw new Exception("Invalid password");
}
Optional<String> bestPassword = Stream.of(password.split("[0-9]"))
.filter(TypeErasure::containsCapital)
.max(Comparator.comparingInt(String::length));
if (bestPassword.isPresent()) {
return bestPassword.get();
} else {
throw new Exception("No valid password");
}
}
/**
* Returns true if word contains capital
*/
private static boolean containsCapital(String word) {
return word.chars().anyMatch(Character::isUpperCase);
}
Be sure to write some unit tests
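As a rough illustration of that advice, here is a sketch of a few plain checks that need no test framework. The class, the check helper and bestSegment are hypothetical; bestSegment is only a stand-in for the method under test and returns null instead of throwing when nothing qualifies.

```java
import java.util.Arrays;
import java.util.Comparator;

final class BestPasswordChecks {
    // Hypothetical stand-in for the method under test: the longest digit-free
    // segment that contains an uppercase letter, or null when none exists.
    static String bestSegment(String password) {
        return Arrays.stream(password.split("[0-9]"))
                .filter(w -> w.chars().anyMatch(Character::isUpperCase))
                .max(Comparator.comparingInt(String::length))
                .orElse(null);
    }

    // Fails loudly with the name of the case that broke.
    static void check(boolean condition, String name) {
        if (!condition) throw new AssertionError("failed: " + name);
    }

    public static void main(String[] args) {
        check("Ba".equals(bestSegment("a0Ba")), "segment after a digit");
        check(bestSegment("a0bb") == null, "no uppercase anywhere");
        check("hgKil".equals(bestSegment("a0Ba12hgKil8oPlk")), "longest of several");
        check(bestSegment("12345") == null, "digits only");
        System.out.println("all checks passed");
    }
}
```

The cases worth covering are the ones where answers tend to go wrong: a valid segment at the very end of the input, no uppercase at all, and input made only of digits.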
public String pass(String str){
int length = 0;
boolean uppercase = false;
String s= "";
String d= "";
for(int i=0;i<str.length();i++){
if(Character.isDigit(str.charAt(i))){
// a digit closes the current candidate; keep it if it is the best so far
if(uppercase && s.length()>length){
d = s;
length = s.length();
}
s = "";
uppercase = false;
}else{
if(Character.isUpperCase(str.charAt(i))){
uppercase = true;
}
s = s+str.charAt(i);
}
}
// the last candidate is not followed by a digit, so check it here
if(uppercase && s.length()>length){
d = s;
}
return d;}
Here is a simple solution with Scala
def solution(str: String): Int = {
val strNoDigit = str.replaceAll("[0-9]", "-")
val strAlphas = strNoDigit.split("-")
Try(strAlphas.filter(_.trim.find(_.isUpper).isDefined).maxBy(_.size))
.toOption
.map(_.length)
.getOrElse(-1)
}
Another solution using tail recursion in Scala
def solution2(str: String): Int = {
val subSt = new ListBuffer[Char]
def checker(str: String): Unit = {
if (str.nonEmpty) {
val s = str.head
if (!s.isDigit) {
subSt += s
} else {
subSt += '-'
}
checker(str.tail)
}
}
checker(str)
if (subSt.nonEmpty) {
val noDigitStr = subSt.mkString.split("-")
Try(noDigitStr.filter(s => s.nonEmpty && s.find(_.isUpper).isDefined).maxBy(_.size))
.toOption
.map(_.length)
.getOrElse(-1)
} else {
-1
}
}
This is a dynamic programming problem. You can solve this yourself using a matrix. It is easy enough. Just give it a try. Take the characters of the password as the rows and columns of the matrix. Add the diagonals if the current character appended to the last character forms a valid password. Start with the smallest valid password as the initial condition.
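One possible reading of the matrix idea, sketched under assumptions: row i is the start index, column j is the end index, and each cell (i, j) is derived from cell (i, j - 1) plus the character at j. The sketch keeps only the state of the current cell rather than storing the whole matrix, and it assumes "valid" means no digit and at least one uppercase letter. The class and method names are hypothetical.

```java
final class PasswordTable {
    // Returns the length of the longest substring with no digit and at least
    // one uppercase letter, or -1 when there is none.
    static int longestValid(String s) {
        int n = s.length();
        int best = -1;
        for (int i = 0; i < n; i++) {                  // row: start index
            boolean digitFree = true;                  // state carried from cell (i, j - 1)
            boolean hasUpper = false;
            for (int j = i; j < n && digitFree; j++) { // column: end index
                char c = s.charAt(j);
                digitFree = !Character.isDigit(c);
                hasUpper |= Character.isUpperCase(c);
                if (digitFree && hasUpper) {
                    best = Math.max(best, j - i + 1);
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(longestValid("a0Ba"));
    }
}
```

This is quadratic in the worst case, so the single-pass answers above remain the better choice for long inputs.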
String[] s = testString.split("[0-9]");
int length = 0;
int index = -1;
for(int i=0; i< s.length; i++){
if(s[i].matches("[a-z]*.*[A-Z].*[a-z]*")){
if(length <= s[i].length()){
length = s[i].length();
index = i;
}
}
}
if(index >= 0){
System.out.println(s[index]);
}
//easiest way to do it:
String str = "a0Ba12hgKil8oPlk";
String[] str1 = str.split("[0-9]+");
List<Integer> in = new ArrayList<Integer>();
for (int i = 0; i < str1.length; i++) {
// keep only the pieces that contain an uppercase letter
if (str1[i].matches(".*[A-Z].*")) {
in.add(str1[i].length());
}
}
if (in.isEmpty()) {
// no piece qualified
System.out.println(-1);
} else {
System.out.println("longest length : " + Collections.max(in));
}
This is my solution in C#. I tested a range of strings and it gave me the correct value. It uses Split, with no Regex or Substring calls. Let me know if it works; I am open to improvements and corrections.
public static int validPassword(string str)
{
List<int> strLength = new List<int>();
if (!(str.All(Char.IsDigit)))
{
//string str = "a0Bb";
string[] splitStrs = str.Split(new char[] { '0', '1', '2', '3', '4', '5', '6', '7', '8', '9' });
//check if each string contains a upper case
foreach (string s in splitStrs)
{
//Console.WriteLine(s);
if (s.Any(char.IsUpper))
{
strLength.Add(s.Length);
}
}
if (strLength.Count == 0)
{
return -1;
}
return strLength.Max();
}
else
{
return -1;
}
}
I think this solution takes care of all the possible corner cases. It passed all the test cases in an Online Judge. It is a dynamic sliding window O(n) solution.
public class LongestString {
public static void main(String[] args) {
// String testString = "AabcdDefghIjKL0";
String testString = "a0bb";
int startIndex = 0, endIndex = 0;
int previousUpperCaseIndex = -1;
int maxLen = 0;
for (; endIndex < testString.length(); endIndex++) {
if (Character.isUpperCase(testString.charAt(endIndex))) {
if (previousUpperCaseIndex > -1) {
maxLen = Math.max(maxLen, endIndex - startIndex);
startIndex = previousUpperCaseIndex + 1;
}
previousUpperCaseIndex = endIndex;
} else if (Character.isDigit(testString.charAt(endIndex))) {
if (previousUpperCaseIndex > -1) {
maxLen = Math.max(maxLen, endIndex - startIndex);
}
startIndex = endIndex + 1;
previousUpperCaseIndex = -1;
}
}
if (previousUpperCaseIndex > -1)
maxLen = Math.max(maxLen, endIndex - startIndex);
System.out.println(maxLen);
}}
function ValidatePassword(password){
var doesContainNumber = false;
var hasUpperCase = false;
for(var i=0;i<password.length;i++){
if(!isNaN(password[i]))
doesContainNumber = true;
if(password[i] != password[i].toLowerCase())
hasUpperCase = true;
}
if(!doesContainNumber && hasUpperCase)
return true;
else
return false;
}
function GetLongestPassword(inputString){
var longestPassword = "";
for(var i=0;i<inputString.length;i++)
{
for (var j=i;j<inputString.length;j++)
{
var substring = inputString.substring(i,j+1);
var isValid = ValidatePassword(substring);
if(isValid){
if(substring.length > longestPassword.length)
{
longestPassword = substring;
}
}
}
}
if(longestPassword == "")
{
return "No Valid Password found";
}
else
{
return longestPassword;
}
}

Java: Find the longest sequential same character array

I am new to Java. I want to find the longest run of the same consecutive character in an input character array. For example, in the character array bddfDDDffkl the longest is DDD, and in rttttDDddjkl the longest is tttt.
I use the following code to deal with this problem, but I want to improve it. For example, if there are two runs of the same length (in rtttgHHH there are two longest: ttt and HHH), how do I handle that?
Thanks in advance.
My following code:
public class SeqSameChar {
public static void main (String[] args) {
int subLength = 0;
Scanner sc = new Scanner(System.in);
String[] num = null;
num = sc.nextLine().split(" ");
String[] number = new String[num.length];
for(int i = 0; i< number.length;i++) {
number[i] = String.valueOf(num[i]);
}
subLength =length(number,num.length);
System.out.println(subLength);
for(int i = index; i < index+subLength; i++) {
System.out.print(number[i]);
}
System.out.println(c==c1);
}
public static int index;
//to calculate the longest run of equal consecutive elements
public static int length(String[] A,int size){
if(size<=0)return 0;
int res=1;
int current=1;
for(int i=1;i<size;i++){
if(A[i].equals(A[i-1])){
current++;
}
else{
if(current>res){
index=i-current;
res=current;
}
current=1;
}
}
return res;
}
}
This algorithm should work for what you want to develop:
Before that, let me make it clear: if you want to detect two different characters that repeat the same number of times, you have to run a second for loop in reverse to identify the second character. If the second character differs from the first one identified and its number of repetitions is the same, you print both characters; otherwise, you print just the single character found by the first for loop, because both loops found the same character.
public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
System.out.println("Enter String 1: ");
String A1 = sc.nextLine();
MaxRepeat(A1);
}
public static void MaxRepeat(String A) {
int count = 1;
int max1 = 1;
char mostrepeated1 = ' ';
for(int i = 0; i < A.length()-1;i++) {
char number = A.charAt(i);
if(number == A.charAt(i+1)) {
count++;
if(count>max1) {
max1 = count;
mostrepeated1 = number;
}
continue;
}
count = 1;
}
count = 1;
int max2 = 1;
char mostrepeated2 = ' ';
for(int i = A.length()-1; i>0; i--) {
char number = A.charAt(i);
if(number == A.charAt(i-1)) {
count++;
if(count>max2) {
max2 = count;
mostrepeated2 = number;
}
continue;
}
count = 1;
}
if((max1==max2) && (mostrepeated1==mostrepeated2)) {
System.out.println("Most Consecutively repeated character is: " + mostrepeated1 + " and is repeated " + max1 + " times.");
}
else if((max1==max2) && (mostrepeated1!=mostrepeated2)) {
System.out.println("Most continuously repeated characters are: " + mostrepeated1 + " and " + mostrepeated2 + " and they are repeated " + max1 + " times");
}
}
I'll give you a Scala implementation for that problem.
Here is the automated test (in BDD style with ScalaTest):
import org.scalatest._
class RichStringSpec extends FlatSpec with MustMatchers {
"A rich string" should "find the longest run of consecutive characters" in {
import Example._
"abceedd".longestRun mustBe Set("ee", "dd")
"aeebceeedd".longestRun mustBe Set("eee")
"aaaaaaa".longestRun mustBe Set("aaaaaaa")
"abcdefgh".longestRun mustBe empty
}
}
Following is the imperative style implementation, with nested loops and mutable variables as you would normally choose to do in Java or C++:
import scala.collection.mutable
object Example {
implicit class RichString(string: String) {
def longestRun: Set[String] = {
val chunks = mutable.Set.empty[String]
val ilen = string.length
var gmax = 0
for ((ch, curr) <- string.zipWithIndex) {
val chunk = mutable.ListBuffer(ch)
var next = curr + 1
while (next < ilen && string(next) == ch) {
chunk += string(next)
next = next + 1
}
gmax = chunk.length max gmax
if (gmax > 1) chunks += chunk.mkString
}
chunks.toSet.filter( _.length == gmax )
}
}
}
Following is a functional-style implementation, hence no variables, no loops but tail recursion with result accumulators and pattern matching to compare each character with the next one (Crazy! Isn't it?):
import scala.collection.mutable
object Example {
implicit class RichString(string: String) {
def longestRun: Set[String] = {
def recurse(chars: String, chunk: mutable.ListBuffer[Char], chunks: mutable.Set[String]): Set[String] = {
chars.toList match {
case List(x, y, _*) if (x == y) =>
recurse(
chars.tail,
if (chunk.isEmpty) chunk ++= List(x, y) else chunk += y,
chunks
)
case Nil =>
// terminate recursion
chunks.toSet
case _ => // x != y
recurse(
chars.tail,
chunk = mutable.ListBuffer(),
chunks += chunk.mkString
)
}
}
val chunks = recurse(string, mutable.ListBuffer(), mutable.Set.empty[String])
val max = chunks.map(_.length).max
if (max > 0) chunks.filter( _.length == max ) else Set()
}
}
}
For example, for the given "aeebceeedd" string, both implementations above will build the following set of chunks (repeating characters)
Set("ee", "eee", "dd")
and they will keep only the chunks having the maximum length (resulting in "eee").
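Since the question is about Java, the same idea can be sketched there in a single pass. The class and method names are hypothetical, and the sketch assumes, like the ScalaTest cases above, that runs of length 1 do not count.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class LongestRuns {
    // Collects every run of equal consecutive characters that has the
    // maximum length; runs shorter than 2 are ignored.
    static Set<String> longestRuns(String s) {
        Set<String> best = new LinkedHashSet<>();
        int bestLen = 2;   // minimum length a run needs to be reported
        int start = 0;     // start index of the current run
        for (int i = 1; i <= s.length(); i++) {
            // a run ends at the end of the input or when the character changes
            if (i == s.length() || s.charAt(i) != s.charAt(start)) {
                int len = i - start;
                if (len > bestLen) {       // strictly longer: drop earlier ties
                    best.clear();
                    bestLen = len;
                }
                if (len == bestLen) {      // ties are kept side by side
                    best.add(s.substring(start, i));
                }
                start = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(longestRuns("rtttgHHH"));
    }
}
```

Treating the end of the input as the end of a run avoids the usual mistake of missing a longest run that sits at the very end of the string.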

implement basic string compression

I am working on question 1.5 from the book Cracking The Coding interview. The problem is to take a string "aabcccccaaa" and turn it into a2b1c5a3.
If the compressed string is not smaller than the original string, then return the original string.
My code is below. I used an ArrayList because I would not know how long the compressed string would be.
My output is [a, 2, b, 1, c, 5], aabc, []. When the program gets to the end of the string, it doesn't have a character to compare the last character to.
import java.util.*;
import java.io.*;
public class stringCompression {
public static void main(String[] args) {
String a = "aabcccccaaa";
String b = "aabc";
String v = "aaaa";
check(a);
System.out.println("");
check(b);
System.out.println("");
check(v);
}
public static void check(String g){
ArrayList<Character> c = new ArrayList<Character>();
int count = 1;
int i = 0;
int h = g.length();
for(int j = i + 1; j < g.length(); j++)
{
if(g.charAt(i) == g.charAt(j)){
count++;
}
else {
c.add(g.charAt(i));
c.add((char)( '0' + count));
i = j;
count = 1;
}
}
if(c.size() == g.length()){
System.out.print(g);
}
else{
System.out.print(c);
}
}
}
In the last loop iteration you never add the result to the list. When j reaches g.length(), the current char and its count still need to be added. So you could check the next value of j before incrementing it:
for(int j = i + 1; j < g.length(); j++)
{
if(g.charAt(i) == g.charAt(j)){
count++;
}
else {
c.add(g.charAt(i));
c.add((char)( '0' + count));
i = j;
count = 1;
}
if((j + 1) == g.length()){
c.add(g.charAt(i));
c.add((char)( '0' + count));
}
}
I would use a StringBuilder rather than an ArrayList to build your compressed String. When you start compressing, the first character should already be added to the result. The count of the character will be added once you've encountered a different character. When you've reached the end of the String you should just be appending the remaining count to the result for the last letter.
public static void main(String[] args) throws Exception {
String[] data = new String[] {
"aabcccccaaa",
"aabc",
"aaaa"
};
for (String d : data) {
System.out.println(compress(d));
}
}
public static String compress(String str) {
StringBuilder compressed = new StringBuilder();
// Add first character to compressed result
char currentChar = str.charAt(0);
compressed.append(currentChar);
// Always have a count of 1
int count = 1;
for (int i = 1; i < str.length(); i++) {
char nextChar = str.charAt(i);
if (currentChar == nextChar) {
count++;
} else {
// Append the count of the current character
compressed.append(count);
// Set the current character and count
currentChar = nextChar;
count = 1;
// Append the new current character
compressed.append(currentChar);
}
}
// Append the count of the last character
compressed.append(count);
// If the compressed string is not smaller than the original string, then return the original string
return (compressed.length() < str.length() ? compressed.toString() : str);
}
Results:
a2b1c5a3
aabc
a4
You have two errors:
one that Typo just mentioned, because your last character was not added;
and another one, if the original string is shorter like "abc" with only three chars: "a1b1c1" has six chars (the task is "If the compressed string is not smaller than the original string, then return the original string.")
You have to change your if statement, ask for >= instead of ==
if(c.size() >= g.length()){
System.out.print(g);
} else {
System.out.print(c);
}
Use StringBuilder and then iterate on the input string.
private static string CompressString(string inputString)
{
var count = 1;
var compressedSb = new StringBuilder();
for (var i = 0; i < inputString.Length; i++)
{
// Check if we are at the end
if(i == inputString.Length - 1)
{
compressedSb.Append(inputString[i] + count.ToString());
break;
}
if (inputString[i] == inputString[i + 1])
count++;
else
{
compressedSb.Append(inputString[i] + count.ToString());
count = 1;
}
}
var compressedString = compressedSb.ToString();
return compressedString.Length >= inputString.Length ? inputString : compressedString;
}
