Preserving the punctuation after converting it to pig latin - java

Currently, I am able to translate an English word to pig latin. My lab assignment says that punctuation occurring before the word should be removed, stored, and prepended to the piglatinized word. Punctuation occurring after the word should be removed, stored, and appended to the piglatinized word. Any punctuation that is in the middle of the word is to be treated as a regular letter.
For example:
what? -> atwhay?
Oh!!! -> Ohway!!!
"hello" -> "ellohay"
don't -> on'tday
"pell-mell" -> "ell-mellpay"
This is what I have right now to find and store the punctuation:
public static final String punct = ",./;:'\"?<>[]{}|`~!@#$%^&*()";
String startPunct = "";
String endPunct = "";
for (int c = 0; c < s.length(); c++) {
for (int i = 0; i < punct.length(); i++) {
if (s.charAt(c) == punct.charAt(i)) {
startPunct = startPunct + s.charAt(c);
}
}
}
If needed, this is the basic idea of how I print my translated word:
s = s.substring(i) + s.substring(0, i) + "ay";
return s;
So the question is, how do I preserve the punctuation so that it appears in the beginning and at the end of the translated word (recursion preferably but regex is fine)?
Any help is much appreciated. Thanks in advance.

Some problems lend themselves to recursion but your task is not one of them, in my opinion. Hence the below code uses regular expressions.
Notes after the code.
import java.util.regex.Matcher;
import java.util.regex.Pattern;
/**
* Converts an English word into pig latin. Algorithm follows.
* <ol>
* <li>All initial consonants are moved to the end of the word and <i>ay</i> is appended, for
* example <i>what</i> becomes <i>atwhay</i></li>
* <li>For words that begin with a vowel, <i>way</i> is appended to the word for example <i>oh</i>
* becomes <i>ohway</i>.</li>
* </ol>
* Additional stipulations include the following.
* <ol>
* <li>Initial punctuation and terminal punctuation are unchanged in the converted word, for
* example if the original word ends with a question mark then the converted word also ends with a
* question mark meaning that <i>what?</i> becomes <i>atwhay?</i></li>
* <li>Case sensitivity is preserved.</li>
* </ol>
*/
public class PigLatin {
private static final String VOWELS = "aeiou";
private static int getIndexOfFirstVowelInWord(String word) {
int index = -1;
if (word != null && !word.isBlank()) {
word = word.strip();
word = word.toLowerCase();
char[] letters = word.toCharArray();
for (int i = 0; i < letters.length; i++) {
if (VOWELS.indexOf(letters[i]) >= 0) {
index = i;
break;
}
}
}
return index;
}
/**
* First method invoked when this class launched via <tt>java</tt> command. Recognizes a single
* command argument which is the word to be converted into pig latin.
*
* @param args - <tt>java</tt> command arguments.
*/
public static void main(String[] args) {
if (args.length == 0) {
System.out.println("ARGS: word");
}
else {
System.out.printf("Word: ^%s^%n", args[0]);
Pattern pattern = Pattern.compile("^([!?\"'():;,.-]*)(\\w+[!?\"'():;,.-]*\\w+)([!?\"'():;,.-]*)$");
Matcher matcher = pattern.matcher(args[0]);
if (matcher.matches()) {
String initial = matcher.group(1);
String word = matcher.group(2);
word = word.strip();
String terminal = matcher.group(3);
int index = getIndexOfFirstVowelInWord(word);
if (index <= 0) { // starts with a vowel; a word with no vowel (index -1) is assumed to be handled the same way
word += "way";
}
else {
String suffix = word.substring(0, index);
word = word.substring(index);
word += suffix;
word += "ay";
}
String result = initial + word + terminal;
System.out.println("Result: " + result);
}
else {
System.out.println("No match.");
}
}
}
}
I test for punctuation which is commonly found in prose, including the following.
exclamation mark
question mark
double quote
single quote
parentheses
colon
semicolon
comma
period
dash
The regular expression contains three groups.
First group is leading punctuation.
Second group is actual word, which may contain embedded punctuation.
Third group is trailing punctuation.
We only need to handle the second group. The handling algorithm is described in the class comments in the above code.
I tested the code for all the example words in your question and got your expected result for each word.
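Note that the second group of the pattern above needs at least two word characters, so a one-letter word such as "a" falls through to "No match." For comparison, here is a minimal regex-free sketch of the same idea: scan inward from both ends to peel off the punctuation, translate what is left, then glue the pieces back together. The class and method names (PigLatinSketch, translate) are made up for this illustration, and treating a word with no vowel like a vowel-initial word is my assumption, not something the assignment states.

```java
class PigLatinSketch {
    private static final String VOWELS = "aeiouAEIOU";

    /** Translates one token, keeping leading and trailing punctuation in place. */
    static String translate(String token) {
        int start = 0;
        int end = token.length();
        // Skip punctuation at the front of the token.
        while (start < end && !Character.isLetterOrDigit(token.charAt(start))) {
            start++;
        }
        // Skip punctuation at the back of the token.
        while (end > start && !Character.isLetterOrDigit(token.charAt(end - 1))) {
            end--;
        }
        if (start == end) {
            return token; // nothing but punctuation, leave it alone
        }
        String leading = token.substring(0, start);
        String word = token.substring(start, end);
        String trailing = token.substring(end);

        // Find the first vowel; punctuation inside the word is treated like a letter.
        int firstVowel = -1;
        for (int i = 0; i < word.length(); i++) {
            if (VOWELS.indexOf(word.charAt(i)) >= 0) {
                firstVowel = i;
                break;
            }
        }
        String body;
        if (firstVowel <= 0) {
            // Vowel-initial word, or no vowel at all (assumption: same rule for both).
            body = word + "way";
        } else {
            body = word.substring(firstVowel) + word.substring(0, firstVowel) + "ay";
        }
        return leading + body + trailing;
    }

    public static void main(String[] args) {
        String[] samples = {"what?", "Oh!!!", "\"hello\"", "don't", "\"pell-mell\""};
        for (String s : samples) {
            System.out.println(s + " -> " + translate(s));
        }
    }
}
```

Because only the outermost runs of non-alphanumeric characters are stripped, the apostrophe in don't and the hyphen in pell-mell stay inside the word and move with the letters around them.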

Related

Find words in String consisting of all distinct characters without using Java Collection Framework

I need your help. I am stuck on one problem, solving it for several hours.
1. Find the first word that consists only of distinct characters. If several words qualify, return the first one.
2. @param words Input array of words
3. @return First word that consists only of distinct characters
public String findWordConsistingOfVariousCharacters(String[] words) {
throw new UnsupportedOperationException("You need to implement this method");
}
@Test
public void testFindWordConsistingOfVariousCharacters() {
String[] input = new String[] {"aaaaaaawe", "qwer", "128883", "4321"};
String expectedResult = "qwer";
StringProcessor stringProcessor = new StringProcessor();
String result = stringProcessor.findWordConsistingOfVariousCharacters(input);
assertThat(String.format("Wrong result of method findWordConsistingOfVariousCharacters (input is %s)", Arrays.toString(input)), result, is(expectedResult));
}
Thank you in advance
Just go through the data and check whether each string is made up of only distinct characters:
public static boolean repeat(String str) {
char[] chars = str.toCharArray();
Arrays.sort(chars);//The same character will only appear in groups
for(int i = 1;i<chars.length;i++) {
if(chars[i] == chars[i - 1]) {
return false;//Same character appeared twice
}
}
return true;//There is no repeating character
}
The method above checks whether a string is made up of distinct characters (despite its name, it returns true when no character repeats). Now loop through the data:
for(int i = 0;i<input.length;i++){
if(repeat(input[i])){
System.out.println("The answer is " + input[i] + " at index " + i);
break;//you find it! Now break the loop
}
}
Assuming the strings are all ASCII characters, use a boolean[] to mark if you have encountered that character in the word already:
boolean [] encountered = new boolean[256];
for (char c : word.toCharArray()) {
if (encountered[(int)c]) {
// not unique
} else {
encountered[(int)c] = true;
}
}
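To tie the boolean[] idea back to the method in the question, here is a minimal sketch of a complete lookup that uses no collection classes. The names DistinctWordSketch, hasAllDistinctChars and findFirstDistinctWord are invented for this example. The array is sized to cover every possible char value rather than only 256, so the ASCII assumption is not needed, and returning null when no word qualifies is my assumption since the task does not say what to do in that case.

```java
class DistinctWordSketch {
    /** Returns true when no char value occurs more than once in the word. */
    static boolean hasAllDistinctChars(String word) {
        // One flag per possible char value (65536 entries).
        boolean[] encountered = new boolean[Character.MAX_VALUE + 1];
        for (char c : word.toCharArray()) {
            if (encountered[c]) {
                return false; // this character was already seen
            }
            encountered[c] = true;
        }
        return true;
    }

    /** Returns the first word made of distinct characters, or null if there is none (assumption). */
    static String findFirstDistinctWord(String[] words) {
        for (String word : words) {
            if (hasAllDistinctChars(word)) {
                return word;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String[] input = {"aaaaaaawe", "qwer", "128883", "4321"};
        System.out.println(findFirstDistinctWord(input));
    }
}
```

Returning as soon as a repeat is found avoids scanning the rest of the word, and no sorting is needed.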

Count all substring that make a valid command using java

I am trying to make an assembly language parser, and from the given string I have to find all the valid commands. For a command to be valid, the following conditions have to be met:
the first letter is a lowercase English letter
next, it contains a sequence of zero or more of the following
characters: lowercase English letters, digits, and colons.
Next, it contains a forward slash /
Next, it contains a sequence of zero or more of the following
characters: lowercase English letters, digits.
Next, it contains a backward slash \
Next, it contains a sequence of one or more lowercase English letters.
e.g. given a command abc:/b1c\xy
there are six valid commands:
abc:/b1c\xy
bc:/b1c\xy
c:/b1c\xy
abc:/b1c\x
bc:/b1c\x
c:/b1c\x
I don't know anything about regular expression can someone please help me with it.
Follow these steps to solve your problem.
Step 1: Generate every substring (contiguous run of characters) of the given string, in order of increasing length.
For example, abc --> {a, b, c, ab, bc, abc}.
Step 2: Check whether each substring, taken as a whole, matches the regex pattern.
Credits: I am going to use the regex pattern given by Varun Chaudhary.
Step 3: If it matches, return 1, and add up these results over all substrings to get the count of valid commands.
Step 4: Print the result.
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
class Main
{
static int regexWork(String command) {
int count =0;
Pattern COMMAND_PATTERN = Pattern.compile("[a-z]{1}[a-z0-9:]*\\/[a-z0-9]*\\\\[a-z]+");
Matcher matcher = COMMAND_PATTERN.matcher(command);
// Return 1 only if the whole substring matches; find() would also accept a substring that merely contains a command
if (matcher.matches())
count=1;
return count;
}
// Finding all subsets of given set[]
static void printSubsets(char set[])
{
int count =0;
int n = set.length;
// Pick starting point
for (int len = 1; len <= n; len++)
{
// Pick ending point
for (int i = 0; i <= n - len; i++)
{
StringBuffer sb = new StringBuffer();
// Print characters from current
// starting point to current ending
// point.
int j = i + len - 1;
for (int k = i; k <= j; k++)
sb.append(set[k]+"");
count+=Main.regexWork(sb.toString());
}
}
System.out.println(count);
}
// Driver code
public static void main(String[] args)
{
String set ="abc:/b1c\\xy"; //Can be any string for which you are checking
printSubsets(set.toCharArray()); //Passing char array
}
}
I suppose you can use the following regex, which satisfies your conditions:
[a-z]{1}[a-z0-9:]*\/{1}[a-z0-9]*\\[a-z]+
Java code:
String command = "abc:/b1c\\xy";
Pattern COMMAND_PATTERN = Pattern.compile("[a-z]{1}[a-z0-9:]*\\/[a-z0-9]*\\\\[a-z]+");
Matcher matcher = COMMAND_PATTERN.matcher(command);
int count = 0; // find() reports non-overlapping matches only
while (matcher.find())
count++;
System.out.println("MATCH COUNT = " + count);
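Since find() only reports non-overlapping matches, the snippet above yields 1 for the sample input rather than 6. A minimal sketch of one way to count every valid command is to test each substring in full with matches(). The class and method names (CommandCounterSketch, countValidCommands) are made up for this example, and the pattern is the same one as above with the redundant {1} and the escaped forward slash dropped.

```java
import java.util.regex.Pattern;

class CommandCounterSketch {
    // letter, then [a-z0-9:]*, then '/', then [a-z0-9]*, then '\', then one or more letters
    private static final Pattern COMMAND =
            Pattern.compile("[a-z][a-z0-9:]*/[a-z0-9]*\\\\[a-z]+");

    /** Counts the substrings of text that are, in their entirety, valid commands. */
    static int countValidCommands(String text) {
        int count = 0;
        for (int start = 0; start < text.length(); start++) {
            for (int end = start + 1; end <= text.length(); end++) {
                // matches() requires the whole substring to fit the pattern.
                if (COMMAND.matcher(text.substring(start, end)).matches()) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countValidCommands("abc:/b1c\\xy")); // prints 6
    }
}
```

This tries every (start, end) pair, so it does quadratically many regex checks; that is fine for short inputs but would need a smarter scan for long ones.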

Java Loops -- Strings of First and last characters of every word

I am currently in an Intro to Programming class at San Jose and as part of our assignment we were to create a class with methods that either return a string with the first letters of each word or the last letters of each word.
The instance variable "phrase" holds the phrase that is accessed in the methods.
Here are the rules:
The words are separated by spaces,
It starts with a letter,
It does not end with a space,
There are never 2 consecutive spaces,
There are never 2 consecutive digits or punctuation.
Both the firstLetter() and lastLetter() methods must return an empty string if the phrase is empty.
My question is: What is a more efficient solution to this problem? I am new to algorithms so I would appreciate a more seasoned approach to this simple problem. In the firstLetter() and the lastLetter() method, would I check the status of two characters at a time within the for loop or just one?
Here is my code:
/**
* Processes first and last letters of words
* @author (Adrian DeRose)
*/
public class StartToFinish
{
private String phrase;
/**
* Constructs a StartToFinish object
* @param myString the phrase for this object
*/
public StartToFinish(String myString)
{
this.phrase = myString;
}
/**
* Gets first letter of every word in string.
*
* @return first letter of every word in string
*/
public String firstLetters()
{
String firstLetters = "";
if (this.phrase.isEmpty())
{
return "";
}
if (Character.isLetter(this.phrase.charAt(0)))
{
firstLetters += this.phrase.substring(0,1);
}
for (int i = 1; i < this.phrase.length(); i++)
{
char currentCharacter = this.phrase.charAt(i);
String previousCharacter = Character.toString(this.phrase.charAt(i-1));
if (Character.isLetter(currentCharacter) && previousCharacter.equals(" "))
{
String characterString = Character.toString(currentCharacter);
firstLetters += characterString;
}
}
return firstLetters;
}
/**
* Gets last letter of every word in string.
*
* @return last letter of every word in string
*/
public String lastLetters()
{
String lastLetters = "";
if (this.phrase.length() == 0)
{
return "";
}
char lastCharacter = this.phrase.charAt(this.phrase.length() - 1);
for (int i = 1; i < this.phrase.length(); i++)
{
char currentCharacter = this.phrase.charAt(i);
char previousCharacter = this.phrase.charAt(i-1);
if (Character.isLetter(previousCharacter) && !Character.isLetter(currentCharacter))
{
String previousCharacterString = Character.toString(previousCharacter);
lastLetters += previousCharacterString;
}
}
if (Character.isLetter(lastCharacter))
{
lastLetters += Character.toString(lastCharacter);
}
return lastLetters;
}
}
Thank you!
I don't know if this is what you are looking for, but this is a much simpler way to write the same thing:
String a="john snow winter is comming";
String[] parts = a.split(" ");
for(String word:parts){
System.out.println("first letter "+word.charAt(0)+ " last letter "+word.charAt(word.length()-1));
}
I don't think you have to write all that code; just use Java's built-in String methods:
String a = "Hello";
System.out.println("First:"+a.charAt(0));
System.out.println("Last:"+a.charAt(a.length()-1));
Output:
First:H
Last:o
One solution is:
1. Check whether the phrase is empty in the constructor.
2. Split the phrase into words, then check each word.
In the constructor (this isn't strictly needed in your case, by the way):
splitedPhrase = phrase.split(" ");
In the dedicated function:
public String firstLetters() {
String result = "";
for(String word : splitedPhrase) {
if (Character.isLetter(word.charAt(0)))
result+=word.charAt(0);
}
return result;
}
For the lastLetters function you only have to change the charAt call, like this:
word.charAt(word.length() - 1)
Hope this helps. Even though other people have already posted answers, I think this one is closer to what your algorithm needs.
If I understand your question correctly, I think this is what you're looking for:-
public class StartToFinish {
private String phrase;
private String[] words;
private String firstLetters = "";
private String lastLetters = "";
/**
* Constructs a StartToFinish object
*
* @param myString
* the phrase for this object
*/
public StartToFinish(String myString) {
this.phrase = myString;
words = phrase.split(" ");
for (String string : words) {
if (string.length() == 0)
continue;
if (Character.isLetter(string.charAt(0))) {
firstLetters += string.charAt(0);
}
if (Character.isLetter(string.charAt(string.length() - 1))) {
lastLetters += string.charAt(string.length() - 1);
}
}
}
/**
* Gets first letter of every word in string.
*
* @return first letter of every word in string
*/
public String firstLetters() {
return firstLetters;
}
/**
* Gets last letter of every word in string.
*
* @return last letter of every word in string
*/
public String lastLetters() {
return lastLetters;
}
}
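On the question of whether to look at one or two characters at a time: a single pass that looks at the current character and one neighbour is enough, and a StringBuilder avoids creating a new String on every append. Below is a minimal sketch under the stated rules (words separated by single spaces). The class name FirstLastLettersSketch and the static-method layout are made up for this example, and it follows the split-based answers in keeping only a word's final character when that character is a letter, which differs slightly from the original lastLetters() for words ending in punctuation.

```java
class FirstLastLettersSketch {
    /** Collects the first character of each word, if that character is a letter. */
    static String firstLetters(String phrase) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < phrase.length(); i++) {
            char current = phrase.charAt(i);
            // A word starts at index 0 or right after a space.
            boolean startsWord = (i == 0) || phrase.charAt(i - 1) == ' ';
            if (startsWord && Character.isLetter(current)) {
                result.append(current);
            }
        }
        return result.toString();
    }

    /** Collects the last character of each word, if that character is a letter. */
    static String lastLetters(String phrase) {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < phrase.length(); i++) {
            char current = phrase.charAt(i);
            // A word ends at the final index or right before a space.
            boolean endsWord = (i == phrase.length() - 1) || phrase.charAt(i + 1) == ' ';
            if (endsWord && Character.isLetter(current)) {
                result.append(current);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(firstLetters("john snow winter"));
        System.out.println(lastLetters("john snow winter"));
    }
}
```

Both loops simply do not execute for an empty phrase, so the empty-string rule needs no special case.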

Java - Get first letter of string

I am trying to extract the first letters of each word in a sentence the user has spoken into my app. Currently if the user speaks "Hello World 2015" it inserts that into the text field. I wish to split this so if the user speaks "Hello World 2015" only "HW2015" is inserted into the text field.
final ArrayList<String> matches = data.getStringArrayListExtra(
RecognizerIntent.EXTRA_RESULTS);
The matches variable stores the user's input in a list. I have looked into using split but am not sure exactly how it works.
How would I achieve this?
Thank You
Pass this regex and your list into applyRegexToList.
It reads:
(the first character) or (any run of consecutive digits) or (any letter that follows a whitespace character)
(^.{0,1})|(\\d+)|((?<=\\s)[a-zA-Z])
public static ArrayList<String> applyRegexToList(ArrayList<String> yourList, String regex){
ArrayList<String> matches = new ArrayList<String>();
// Create a Pattern object
Pattern r = Pattern.compile(regex);
for (String sentence:yourList) {
// Now create matcher object.
Matcher m = r.matcher(sentence);
String temp = "";
//while patterns are still being found, concat
while(m.find())
{
temp += m.group(0);
}
matches.add(temp);
}
return matches;
}
You can split a string into an array of string by doing this:
String[] result = my_string.split("\\s+"); // This is a regex for matching spaces
You could then loop over your array, taking the first character of each string:
// The string we'll create
String abbrev = "";
// Loop over the results from the string splitting
for (int i = 0; i < result.length; i++){
// Grab the first character of this entry
char c = result[i].charAt(0);
// If its a number, add the whole number
if (c >= '0' && c <= '9'){
abbrev += result[i];
}
// If its not a number, just append the character
else{
abbrev += c;
}
}
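The split-and-loop approach above can be wrapped into one method, sketched below. The names AbbreviationSketch and abbreviate are hypothetical. Trimming the input first and skipping empty pieces are my additions, there to guard against the empty leading element that split produces when the sentence starts with whitespace.

```java
class AbbreviationSketch {
    /** Keeps the first character of each word, but keeps numbers whole. */
    static String abbreviate(String sentence) {
        StringBuilder abbrev = new StringBuilder();
        // "\\s+" splits on any run of whitespace.
        for (String word : sentence.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens when the input is empty or all whitespace
            }
            char first = word.charAt(0);
            if (Character.isDigit(first)) {
                abbrev.append(word);  // a number: keep every digit
            } else {
                abbrev.append(first); // a word: keep only the first character
            }
        }
        return abbrev.toString();
    }

    public static void main(String[] args) {
        System.out.println(abbreviate("Hello World 2015")); // prints HW2015
    }
}
```

With the speech results from the question, this would be called once per entry of the matches list, for example on matches.get(0).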

Compressing a string in Java

Not sure why my code isn't working. If I input qwwwwwwwwweeeeerrtyyyyyqqqqwEErTTT, I get qw9w5e2ry5y4qE2ET3T when I should be getting q9w5e2rt5y4qw2Er3T.
Run-length encoding (RLE) is a simple "compression algorithm" (an algorithm which takes a block of data and reduces its size, producing a block that contains the same information in less space). It works by replacing repetitive sequences of identical data items with short "tokens" that represent entire sequences. Applying RLE to a string involves finding sequences in the string where the same character repeats. Each such sequence should be replaced by a "token" consisting of:
the number of characters in the sequence
the repeating character
If a character does not repeat, it should be left alone.
For example, consider the following string:
qwwwwwwwwweeeeerrtyyyyyqqqqwEErTTT
After applying the RLE algorithm, this string is converted into:
q9w5e2rt5y4qw2Er3T
In the compressed string, "9w" represents a sequence of 9 consecutive lowercase "w" characters. "5e" represents 5 consecutive lowercase "e" characters, etc.
Write a program that takes a string as input, compresses it using RLE, and outputs the compressed string. Case matters - uppercase and lowercase characters should be considered distinct. You may assume that there are no digit characters in the input string. There are no other restrictions on the input - it may contain spaces or punctuation. There is no need to treat non-letter characters any differently from letters.
public class Compress{
public static void main(String[] args){
System.out.println("Enter a string: ");
String str = IO.readString();
int count = 0;
String result = "";
for (int i=1; i<=str.length(); i++) {
char a = str.charAt(i-1);
count = 1;
if (i-2 >= 0) {
while (i<=str.length() && str.charAt(i-1) == str.charAt(i-2)) {
count++;
i++;
}
}
if (count==1) {
result = result.concat(Character.toString(a));
}
else {
result = result.concat(Integer.toString(count).concat(Character.toString(a)));
}
}
IO.outputStringAnswer(result);
}
}
I would start at zero, and look forward:
public static void main(String[] args){
System.out.println("Enter a string: ");
String str = IO.readString();
int count = 0;
String result = "";
for (int i=0; i < str.length(); i++) {
char a = str.charAt(i);
count = 1;
while (i + 1 < str.length() && str.charAt(i) == str.charAt(i+1)) {
count++;
i++;
}
if (count == 1) {
result = result.concat(Character.toString(a));
} else {
result = result.concat(Integer.toString(count).concat(Character.toString(a)));
}
}
IO.outputStringAnswer(result);
}
Some outputs:
qwwwwwwwwweeeeerrtyyyyyqqqqwEErTTT => q9w5e2rt5y4qw2Er3T
qqwwwwwwwweeeeerrtyyyyyqqqqwEErTTT => 2q8w5e2rt5y4qw2Er3T
qqwwwwwwwweeeeerrtyyyyyqqqqwEErTXZ => 2q8w5e2rt5y4qw2ErTXZ
aaa => 3a
abc => abc
a => a
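The same look-forward idea can be written as a reusable method that is easy to test without the IO helper class. This is a minimal sketch; the names RunLengthSketch and compress are made up for illustration, and it uses a StringBuilder instead of repeated concat calls.

```java
class RunLengthSketch {
    /** Run-length encodes the input; runs of length 1 are copied unchanged. */
    static String compress(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char current = input.charAt(i);
            // Advance runEnd to the first index that no longer holds the same character.
            int runEnd = i;
            while (runEnd < input.length() && input.charAt(runEnd) == current) {
                runEnd++;
            }
            int count = runEnd - i;
            if (count > 1) {
                out.append(count); // only repeated characters get a count prefix
            }
            out.append(current);
            i = runEnd; // jump to the start of the next run
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(compress("qwwwwwwwwweeeeerrtyyyyyqqqqwEErTTT"));
    }
}
```

Keeping the run boundaries in two explicit indices (i and runEnd) avoids modifying the loop counter inside a nested loop, which is where the off-by-one in the original code came from.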
