How to convert an Arabic char into a hex string in Java

The following code prints ????? as output when str contains an Arabic string:
String str = "مرحبا", str2 = "";
for (int i = 0; i < str.length(); ++i) {
    str2 += displayChar(str.charAt(reorder[i]));
    System.out.print(reorder[i]);
}
System.out.println(str2); // output is: ?????
and:
String displayChar(char c) {
    if (c < '\u0010') {
        return "0x0" + Integer.toHexString(c);
    } else if (c < '\u0020' || c >= '\u007f') {
        return "0x" + Integer.toHexString(c);
    } else {
        return c + "";
    }
}
Here, reorder is an integer array that holds only the new index (order) of each character in the given str.
Here is the complete code; I hope it helps you understand the problem:
/*
* (C) Copyright IBM Corp. 1999, All Rights Reserved
*
* version 1.0
*/
import java.io.*;
/**
* A simple command-line interface to the BidiReference class.
* <p>
* This prompts the user for an ASCII string, runs the reference
* algorithm on the string, and displays the results to the terminal.
* An empty return to the prompt exits the program.
* <p>
* ASCII characters are preassigned various bidi direction types.
* These types can be displayed by the user for reference by
* typing <code>-display</code> at the prompt. More help can be
* obtained by typing <code>-help</code> at the prompt.
*/
public class BidiReferenceTest {
BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
PrintWriter writer = new PrintWriter(new BufferedOutputStream(System.out));
BidiReferenceTestCharmap charmap = BidiReferenceTestCharmap.TEST_ARABIC;
byte baseDirection = -1;
/**
* Run the interactive test.
*/
public static void main(String args[]) {
new BidiReferenceTest().run();
}
void run() {
//printHelp();
while (true) {
writer.print("> ");
writer.flush();
String input;
try {
input = reader.readLine();
}
catch (Exception e) {
writer.println(e);
continue;
}
if (input.length() == 0) {
writer.println("Bye!");
writer.flush();
return;
}
if (input.charAt(0) == '-') { // command
int limit = input.indexOf(' ');
if (limit == -1) {
limit = input.length();
}
String cmd = input.substring(0, limit);
if (cmd.equals("-display")) {
charmap.dumpInfo(writer);
} else if (cmd.equals("-english")) {
charmap = BidiReferenceTestCharmap.TEST_ENGLISH;
charmap.dumpInfo(writer);
} else if (cmd.equals("-hebrew")) {
charmap = BidiReferenceTestCharmap.TEST_HEBREW;
charmap.dumpInfo(writer);
} else if (cmd.equals("-arabic")) {
charmap = BidiReferenceTestCharmap.TEST_ARABIC;
charmap.dumpInfo(writer);
} else if (cmd.equals("-mixed")) {
charmap = BidiReferenceTestCharmap.TEST_MIXED;
charmap.dumpInfo(writer);
} else if (cmd.equals("-baseLTR")) {
baseDirection = 0;
} else if (cmd.equals("-baseRTL")) {
baseDirection = 1;
} else if (cmd.equals("-baseDefault")) {
baseDirection = -1;
} else {
}
} else {
String ss= runSample(input);
System.out.println(ss);
Character.UnicodeBlock block = Character.UnicodeBlock.of(Character.codePointAt(ss, 0));
}
}
}
String runSample(String str) {
String str2 = "";
try {
charmap = BidiReferenceTestCharmap.TEST_ARABIC;
byte[] codes = charmap.getCodes(str);
baseDirection = 1;
BidiReference bidi = new BidiReference(codes, baseDirection); // baseDirection = 1
int[] reorder = bidi.getReordering(new int[] { codes.length });
/*
writer.println("base level: " + bidi.getBaseLevel() + (baseDirection != -1 ? " (forced)" : ""));
// output original text
for (int i = 0; i < str.length(); ++i) {
displayChar(str.charAt(i));
}
writer.println();
*/
// output visually ordered text
for (int i = 0; i < str.length(); ++i) {
str2 += displayChar(str.charAt(reorder[i]));
System.out.print(reorder[i]);
}
return str2;
}
catch (Exception e) {
return "";
}
}
String displayChar(char c) {
if (c < '\u0010') {
return "0x0" + Integer.toHexString(c);
} else if (c < '\u0020' || c >= '\u007f') {
return "0x" + Integer.toHexString(c);
} else {
return c+"";
}
}
}

If I were to guess I'd say you run under Windows with the default console settings (i.e. Raster fonts) and you run the Java program from the console and not within Eclipse.
If that is the case, then just change the console settings to use a TrueType font (Lucida Console or Consolas) and you should see boxes instead of question marks. Those won't look right either, but at least it's the actual text instead of question marks.
Side note: Question marks are a common occurrence if something does support Unicode but converts it into another encoding somewhere, e.g. Latin 1.

One problem is that your terminal probably does not support Unicode characters correctly (this might not be the only problem).
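If the goal is the hex values rather than the glyphs, the conversion itself can be checked independently of the console. The following is a minimal sketch, not the asker's code: the class name HexDumpSketch and the method toHex are made up for illustration. The string is written with \u escapes so the result does not depend on the source-file encoding, and System.out is wrapped in a UTF-8 PrintStream; whether the terminal can render that output is a separate matter.
```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper class, for illustration only.
class HexDumpSketch {
    /** Returns every UTF-16 code unit of s as zero-padded hex, separated by spaces. */
    static String toHex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%04x", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // The Arabic word from the question, spelled with Unicode escapes.
        String str = "\u0645\u0631\u062d\u0628\u0627";
        // Encode the output explicitly instead of relying on the platform default.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(toHex(str));
        out.println(str);
    }
}
```
The hex line contains only ASCII, so it should be readable even on a console that cannot display the Arabic text itself.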

Related

Decode String in Java

I am trying to convert this Python solution to Java. For some reason, my Java solution is not working. How can this be done correctly?
https://leetcode.com/problems/decode-string/description/
Given an encoded string, return its decoded string. The encoding rule is: k[encoded_string], where the encoded_string inside the square brackets is being repeated exactly k times. Note that k is guaranteed to be a positive integer.
You may assume that the input string is always valid; there are no extra white spaces, square brackets are well-formed, etc. Furthermore, you may assume that the original data does not contain any digits and that digits are only for those repeat numbers, k. For example, there will not be input like 3a or 2[4].
The test cases are generated so that the length of the output will never exceed 10^5.
Example 1:
Input: s = "3[a]2[bc]"
Output: "aaabcbc"
Example 2:
Input: s = "3[a2[c]]"
Output: "accaccacc"
Python Solution:
class Solution:
    def decodeString(self, s: str) -> str:
        stack = []
        for char in s:
            # compare strings with != rather than "is not" (identity)
            if char != "]":
                stack.append(char)
            else:
                sub_str = ""
                while stack[-1] != "[":
                    sub_str = stack.pop() + sub_str
                stack.pop()
                multiplier = ""
                while stack and stack[-1].isdigit():
                    multiplier = stack.pop() + multiplier
                stack.append(int(multiplier) * sub_str)
        return "".join(stack)
Java Attempt:
class Solution {
public String decodeString(String s) {
Deque<String> list = new ArrayDeque<String>();
String subword = "";
String number = "";
for (int i = 0; i < s.length(); i++) {
if (s.charAt(i) != ']' ) {
list.add(String.valueOf(s.charAt(i)));
}
else {
subword = "";
while (list.size() > 0 && !list.getLast().equals("[") ) {
subword = list.pop() + subword;
}
if (list.size() > 0) list.pop();
number = "";
while (list.size() > 0 && isNumeric(list.getLast())){
number = list.pop() + number;
}
for (int j = 1; (isNumeric(number) && j <= Integer.parseInt(number)); j++) list.add(subword);
}
}
return String.join("", list);
}
public static boolean isNumeric(String str) {
try {
Double.parseDouble(str);
return true;
} catch(NumberFormatException e){
return false;
}
}
}

Your posted code is not working because the pop() method of a Python list removes the last element by default.
In Java, however, the ArrayDeque class's pop() method removes the first element.
To emulate the Python code with an ArrayDeque, you need to use the removeLast() method of the ArrayDeque instance instead.
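The point above can be sketched as a direct port of the Python solution that only ever touches the tail of the deque (addLast, peekLast, removeLast). The class name DecodeStringSketch is made up for illustration, and the input is assumed to be valid as the problem statement guarantees.
```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative port of the Python solution; not the asker's class.
class DecodeStringSketch {
    static String decodeString(String s) {
        Deque<String> stack = new ArrayDeque<>();
        for (char ch : s.toCharArray()) {
            if (ch != ']') {
                stack.addLast(String.valueOf(ch));
                continue;
            }
            // Collect everything back to the matching "[" from the tail.
            StringBuilder sub = new StringBuilder();
            while (!stack.peekLast().equals("[")) {
                sub.insert(0, stack.removeLast());
            }
            stack.removeLast(); // discard the "["
            // Collect the digits of the repeat count, also from the tail.
            StringBuilder digits = new StringBuilder();
            while (!stack.isEmpty() && isDigit(stack.peekLast())) {
                digits.insert(0, stack.removeLast());
            }
            int times = Integer.parseInt(digits.toString());
            StringBuilder repeated = new StringBuilder();
            for (int i = 0; i < times; i++) {
                repeated.append(sub);
            }
            stack.addLast(repeated.toString());
        }
        // ArrayDeque iterates from first to last, i.e. bottom to top of the stack.
        return String.join("", stack);
    }

    private static boolean isDigit(String token) {
        return token.length() == 1 && Character.isDigit(token.charAt(0));
    }
}
```
With this sketch, decodeString("3[a2[c]]") returns "accaccacc" and decodeString("3[a]2[bc]") returns "aaabcbc".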

public class Solution {
    public static String decodeString(String s) {
        StringBuilder stack = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c != ']') {
                stack.append(c);
            } else {
                StringBuilder sub_str = new StringBuilder();
                while (stack.charAt(stack.length() - 1) != '[') {
                    sub_str.insert(0, stack.charAt(stack.length() - 1));
                    stack.deleteCharAt(stack.length() - 1);
                }
                stack.deleteCharAt(stack.length() - 1);
                StringBuilder multiplier = new StringBuilder();
                while (stack.length() > 0 && Character.isDigit(stack.charAt(stack.length() - 1))) {
                    multiplier.insert(0, stack.charAt(stack.length() - 1));
                    stack.deleteCharAt(stack.length() - 1);
                }
                for (int i = 0; i < Integer.parseInt(multiplier.toString()); i++) {
                    stack.append(sub_str);
                }
            }
        }
        return stack.toString();
    }
    public static void main(String[] args) {
        System.out.println(decodeString("3[a2[c]]"));
        // Output: "accaccacc"
        System.out.println(decodeString("3[a]2[bc]"));
        // Output: "aaabcbc"
    }
}

Abbreviation expander for a given lexicon

I am trying to write a program that allows users to make short blog entries by typing abbreviations for common words. When the input is complete, the program expands the abbreviations according to the defined lexicon.
Conditions
A substituted word must be the shortest word that can be formed by adding zero or more letters (or punctuation symbols) to the abbreviation.
If two or more unique words can be formed by adding the same number of letters, then the abbreviation should be printed as it is.
Input
The input is divided into two sections.
The first section is the lexicon itself, and the second section is a user's blog entry that needs to be expanded. The sections are divided by a single | character.
For example:-
cream chocolate every ever does do ice is fried friend friends lick like floor favor flavor flower best but probably poorly say says that what white our you your strawberry storyboard the | wht flvr ic crm ds yr bst fnd lke? ur frds lk stbry, bt choc s prly th bs flr vr!
Output
what flavor ice cream does your best friend like? our friends lk strawberry, but chocolate is poorly the best floor ever!
I have written a program for this and tested it locally with many different test cases with success, but it fails on submission to the test server.
An automated test suite validates the program's output when it is submitted to the test server. In case of failure, details of the failing test cases are not visible.
Below is the program:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.StringTokenizer;
public class BlogEntry {
/**
* @param args
*/
public static void main(String[] args) {
// TODO Auto-generated method stub
String[][] info = readInput();
String[] output = inputExpander(info[0],info[1]);
//System.out.println();
for(int i = 0; i < output.length; ++i) {
if(i!=0)
System.out.print(" ");
System.out.print(output[i]);
}
}
public static String[][] readInput() {
BufferedReader bufferReader = new BufferedReader(new InputStreamReader(
System.in));
String input = null;
String[][] info = new String[2][];
String[] text;
String[] abbr;
try {
input = bufferReader.readLine();
StringTokenizer st1 = new StringTokenizer(input, "|");
String first = "", second = "";
int count = 0;
while (st1.hasMoreTokens()) {
++count;
if(count == 1)
first = st1.nextToken();
if(count == 2)
second = st1.nextToken();
}
st1 = new StringTokenizer(first, " ");
count = st1.countTokens();
text = new String[count];
count = 0;
while (st1.hasMoreTokens()) {
text[count] = st1.nextToken();
count++;
}
st1 = new StringTokenizer(second, " ");
count = st1.countTokens();
abbr = new String[count];
count = 0;
while (st1.hasMoreTokens()) {
abbr[count] = st1.nextToken();
count++;
}
info[0] = text;
info[1] = abbr;
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
return info;
}
public static String[] inputExpander(String[] text, String[] abbr) {
String[] output = new String[abbr.length];
boolean result;
for (int i = 0; i < abbr.length; ++i) {
String abbrToken = abbr[i];
char[] char_abbr_token = abbrToken.toCharArray();
for (int j = 0; j < text.length; ++j) {
String textToken = text[j];
boolean flag2 = false;
if ((char_abbr_token[char_abbr_token.length - 1] == '!')
|| (char_abbr_token[char_abbr_token.length - 1] == '?')
|| (char_abbr_token[char_abbr_token.length - 1] == ',')
|| (char_abbr_token[char_abbr_token.length - 1] == ';')) {
flag2 = true;
}
char[] char_text_token = textToken.toCharArray();
result = ifcontains(char_text_token, char_abbr_token);
if (result) {
int currentCount = textToken.length();
int alreadyStoredCount = 0;
if (flag2)
textToken = textToken
+ char_abbr_token[char_abbr_token.length - 1];
if (output[i] == null)
output[i] = textToken;
else {
alreadyStoredCount = output[i].length();
char[] char_stored_token = output[i].toCharArray();
if ((char_stored_token[char_stored_token.length - 1] == '!')
|| (char_stored_token[char_stored_token.length - 1] == '?')
|| (char_stored_token[char_stored_token.length - 1] == ',')
|| (char_stored_token[char_stored_token.length - 1] == ';')) {
alreadyStoredCount -= 1;
}
if (alreadyStoredCount > currentCount) {
output[i] = textToken;
} else if (alreadyStoredCount == currentCount) {
output[i] = abbrToken;
}
}
}
}
if(output[i] == null)
output[i] = abbrToken;
}
return output;
}
public static boolean ifcontains(char[] char_text_token,
char[] char_abbr_token) {
int j = 0;
boolean flag = false;
for (int i = 0; i < char_abbr_token.length; ++i) {
flag = false;
for (; j < char_text_token.length; ++j) {
if ((char_abbr_token[i] == '!') || (char_abbr_token[i] == '?')
|| (char_abbr_token[i] == ',')
|| (char_abbr_token[i] == ';')) {
flag = true;
break;
}
if (char_abbr_token[i] == char_text_token[j]) {
flag = true;
break;
}
}
if (!flag)
return flag;
}
//System.out.println("match found" + flag);
return flag;
}
}
Can someone give me a hint about a possible use case that I may have missed in the implementation? Thanks in advance.

I ran your program with a duplicate word in the input (lexicon). When a word is repeated in the lexicon, the abbreviation is not expanded, because the check (line no. 112) looks only at the length of the stored word, not at its content.
I think you need to check:
If the same word appears more than once, then expand.
If two or more unique words of the same length appear, then keep the abbreviation.

Here is how I would approach solving this:
Parse the input, tokenize the lexicon and the text.
For each (possibly abbreviated) token like choc convert it to a regular expression like .*c.*h.*o.*c.*.
Search for shortest lexicon words matching this regular expression. Replace the text token if exactly one is found, otherwise leave it alone.
It is quite hard to say what's wrong with your code without careful debugging. It is hard to understand what each part of the code does; it is not self-evident.
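The regular-expression approach described above can be sketched as follows. This is only an illustration under stated assumptions: the names AbbreviationSketch and expand are made up, and the token is assumed to have had any trailing punctuation (such as ? or ,) stripped before the call. Matching words are collected in a Set, so a word that appears twice in the lexicon still counts as one unique candidate.
```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class AbbreviationSketch {
    /** Expands token to the unique shortest lexicon word containing it as a subsequence. */
    static String expand(String token, Collection<String> lexicon) {
        // "choc" becomes .*c.*h.*o.*c.* with every character quoted literally.
        StringBuilder regex = new StringBuilder(".*");
        for (char c : token.toCharArray()) {
            regex.append(Pattern.quote(String.valueOf(c))).append(".*");
        }
        Pattern pattern = Pattern.compile(regex.toString());

        Set<String> shortest = new HashSet<>();
        int shortestLength = Integer.MAX_VALUE;
        for (String word : lexicon) {
            if (!pattern.matcher(word).matches()) {
                continue;
            }
            if (word.length() < shortestLength) {
                shortest.clear();
                shortestLength = word.length();
            }
            if (word.length() == shortestLength) {
                shortest.add(word);
            }
        }
        // Replace only when exactly one unique shortest word was found.
        return shortest.size() == 1 ? shortest.iterator().next() : token;
    }
}
```
For the lexicon in the question, this leaves lk unchanged (lick and like tie at four letters) and expands choc to chocolate.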

How could I handle the '\n' character in my console formatter, and why does it print the text 'randomly'?

I'm trying to make a console formatter for my application, and so far it works. But there are two big problems that I want to fix.
First: how do I handle the \n char in it?
Second: why does it print the text 'randomly'?
So this is my code:
public synchronized void formattedLog(PrintStream stream, String message, String prefix, int consoleWidth, int fontWidth) {
int maxCharsOnOneLine = consoleWidth / fontWidth;
int maxCharsWithPrefix = maxCharsOnOneLine - prefix.length();
StringBuilder builder = new StringBuilder();
String toAppend = "";
int charAt = 0;
while (true) {
String word = "";
int charWordIndex = charAt + 1;
if (message.charAt(charAt) == ' ' || charAt == 0) {
if (charAt == 0) {
word += message.charAt(0);
}
while (charWordIndex < message.length()) {
char nextChar = message.charAt(charWordIndex);
if (nextChar != ' ' && nextChar != '\n') {
word += nextChar;
} else {
break;
}
charWordIndex++;
}
}
if (word != "") {
if (word.length() <= maxCharsWithPrefix) {
if ((toAppend + word).length() <= maxCharsWithPrefix) {
toAppend += word + " ";
} else {
builder.append(prefix);
builder.append(toAppend);
builder.append("\n");
toAppend = "";
toAppend += word + " ";
}
} else {
int wordChar = 0;
toAppend += "";
while (true) {
if (toAppend.length() < maxCharsWithPrefix) {
if (wordChar < word.length()) toAppend += word.charAt(wordChar);
} else {
builder.append(prefix);
builder.append(toAppend);
builder.append("\n");
toAppend = "";
toAppend += word.charAt(wordChar);
}
wordChar++;
if (wordChar >= word.length()) {
toAppend += " ";
break;
}
}
}
}
charAt++;
if (charAt >= message.length()) {
builder.append(prefix);
builder.append(toAppend);
builder.append("\n");
break;
}
}
stream.println(builder.toString());
}
First: right now it does not 'parse' strings with \n in them, but if it did, the output would look like this:
LOG> Text (\n)
Other text
LOG> ...
And I don't want that. I want it to print
LOG> Text (\n)
LOG> Other Text
LOG> ...
How could I achieve this?
And second by random I mean this:
I have this simple code to try the thing out; here it is:
public static void main(String[] args) {
//executorService = Executors.newFixedThreadPool(10);
consoleFormatter = new ConsoleFormatter();
consoleFormatter.logDefault("This is a long text to try the functionality of the method. This is literally ", "LOG> ", 600);
consoleFormatter.logDefault("eeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeee test words words words", "LOG> ", 200);
try {
throwFakeExecption(new NullPointerException("Fake null pointer exception."));
} catch (Exception e) {
consoleFormatter.logError(e.toString(), "ERROR> ", 1000);
consoleFormatter.logError(e.getStackTrace()[0].toString(), "ERROR> ", 1000);
}
}
public static void throwFakeExecption(Exception ex) throws Exception {
throw ex;
}
Sometimes it prints it like this:
Other times like this:
And even other times like this:
I want the text to be in order and I don't know why it is doing that.
How could I fix these two problems?
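One possible way to get the prefix on every line is to split the message at line breaks first and prefix each piece separately. The sketch below is only an illustration: PrefixLinesSketch and prefixLines are made-up names, and the word wrapping done by formattedLog is left out. As for the ordering, logDefault and logError are not shown; if they write to System.out and System.err respectively (an assumption), the two streams are independent and many consoles interleave them differently from run to run.
```java
// Hypothetical helper, for illustration only; it does not wrap long lines.
class PrefixLinesSketch {
    /** Puts prefix in front of every line of message, keeping one line per input line. */
    static String prefixLines(String message, String prefix) {
        // Split on \r\n, \r or \n; the limit -1 keeps trailing empty lines.
        String[] lines = message.split("\\r\\n|\\r|\\n", -1);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                sb.append('\n');
            }
            sb.append(prefix).append(lines[i]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(prefixLines("Text\nOther text", "LOG> "));
    }
}
```
Each resulting line could then be passed through the existing wrapping logic on its own.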

Programmatically remove comments from Java File [duplicate]

I have a Java project and I have used comments in many locations in various Java files in the project. Now I need to remove all types of comments: single-line and multi-line comments.
Please suggest a way to automate removing comments, using tools, Eclipse, etc.
Currently I am removing all the comments manually.

You can remove all single- or multi-line block comments (but not line comments with //) by searching for the following regular expression in your project(s)/file(s) and replacing by $1:
^([^"\r\n]*?(?:(?<=')"[^"\r\n]*?|(?<!')"[^"\r\n]*?"[^"\r\n]*?)*?)(?<!/)/\*[^\*]*(?:\*+[^/][^\*]*)*?\*+/
It's possible that you have to execute it more than once.
This regular expression avoids the following pitfalls:
Code between two comments /* Comment 1 */ foo(); /* Comment 2 */
Line comments starting with an asterisk: //***NOTE***
Comment delimiters inside string literals: stringbuilder.append("/*");; also if there is a double quote inside single quotes before the comment
To remove all single-line comments, search for the following regular expression in your project(s)/file(s) and replace by $1:
^([^"\r\n]*?(?:(?<=')"[^"\r\n]*?|(?<!')"[^"\r\n]*?"[^"\r\n]*?)*?)\s*//[^\r\n]*
This regular expression also avoids comment delimiters inside double quotes, but does NOT check for multi-line comments, so /* // */ will be incorrectly removed.
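Outside an IDE, the line-comment expression above could be applied from Java roughly as follows. This is a sketch rather than a tested tool: the class and method names are made up, and Pattern.MULTILINE is assumed so that ^ matches at the start of every line, as it would in a line-oriented search-and-replace.
```java
import java.util.regex.Pattern;

// Illustrative wrapper around the line-comment expression quoted above.
class LineCommentRegexSketch {
    // Same expression, escaped for a Java string literal.
    private static final Pattern LINE_COMMENT = Pattern.compile(
        "^([^\"\\r\\n]*?(?:(?<=')\"[^\"\\r\\n]*?|(?<!')\"[^\"\\r\\n]*?\"[^\"\\r\\n]*?)*?)\\s*//[^\\r\\n]*",
        Pattern.MULTILINE);

    /** Replaces every match by group 1, i.e. the code in front of the comment. */
    static String stripLineComments(String source) {
        return LINE_COMMENT.matcher(source).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(stripLineComments("String s = \"http://x\"; // trailing comment"));
    }
}
```
The // inside the string literal is kept because the expression only skips over complete quoted sections before it looks for the comment start.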

I had to write something to do this a few weeks ago. This should handle all comments, nested or otherwise. It is long, but I haven't seen a regex version that handled nested comments properly. I didn't have to preserve Javadoc, but I presume you do, so I added some code that I believe should handle that. I also added code to support the \r\n and \r line separators. The new code is marked as such.
public static String removeComments(String code) {
StringBuilder newCode = new StringBuilder();
try (StringReader sr = new StringReader(code)) {
boolean inBlockComment = false;
boolean inLineComment = false;
boolean out = true;
int prev = sr.read();
int cur;
for(cur = sr.read(); cur != -1; cur = sr.read()) {
if(inBlockComment) {
if (prev == '*' && cur == '/') {
inBlockComment = false;
out = false;
}
} else if (inLineComment) {
if (cur == '\r') { // start untested block
sr.mark(1);
int next = sr.read();
if (next != '\n') {
sr.reset();
}
inLineComment = false;
out = false; // end untested block
} else if (cur == '\n') {
inLineComment = false;
out = false;
}
} else {
if (prev == '/' && cur == '*') {
sr.mark(1); // start untested block
int next = sr.read();
if (next != '*') {
inBlockComment = true; // tested line (without rest of block)
}
sr.reset(); // end untested block
} else if (prev == '/' && cur == '/') {
inLineComment = true;
} else if (out){
newCode.append((char)prev);
} else {
out = true;
}
}
prev = cur;
}
if (prev != -1 && out && !inLineComment) {
newCode.append((char)prev);
}
} catch (IOException e) {
e.printStackTrace();
}
return newCode.toString();
}

You can try it with the java-comment-preprocessor:
java -jar ./jcp-6.0.0.jar --i:/sourceFolder --o:/resultFolder -ef:none --r

I made an open source library and uploaded it to GitHub. It is called CommentRemover, and it can remove single-line and multi-line Java comments.
It supports either removing or keeping TODOs.
It also supports JavaScript, HTML, CSS, Properties, JSP and XML comments.
Here is a small code snippet showing how to use it (there are two types of usage):
First way InternalPath
public static void main(String[] args) throws CommentRemoverException {
// root dir is: /Users/user/Projects/MyProject
// example for startInternalPath
CommentRemover commentRemover = new CommentRemover.CommentRemoverBuilder()
.removeJava(true) // Remove Java file Comments....
.removeJavaScript(true) // Remove JavaScript file Comments....
.removeJSP(true) // etc.. goes like that
.removeTodos(false) // Do Not Touch Todos (leave them alone)
.removeSingleLines(true) // Remove single line type comments
.removeMultiLines(true) // Remove multiple type comments
.startInternalPath("src.main.app") // Starts from {rootDir}/src/main/app , leave it empty string when you want to start from root dir
.setExcludePackages(new String[]{"src.main.java.app.pattern"}) // Refers to {rootDir}/src/main/java/app/pattern and skips this directory
.build();
CommentProcessor commentProcessor = new CommentProcessor(commentRemover);
commentProcessor.start();
}
Second way ExternalPath
public static void main(String[] args) throws CommentRemoverException {
// example for externalInternalPath
CommentRemover commentRemover = new CommentRemover.CommentRemoverBuilder()
.removeJava(true) // Remove Java file Comments....
.removeJavaScript(true) // Remove JavaScript file Comments....
.removeJSP(true) // etc..
.removeTodos(true) // Remove todos
.removeSingleLines(false) // Do not remove single line type comments
.removeMultiLines(true) // Remove multiple type comments
.startExternalPath("/Users/user/Projects/MyOtherProject")// Give it full path for external directories
.setExcludePackages(new String[]{"src.main.java.model"}) // Refers to /Users/user/Projects/MyOtherProject/src/main/java/model and skips this directory.
.build();
CommentProcessor commentProcessor = new CommentProcessor(commentRemover);
commentProcessor.start();
}

This is an old post but this may help someone who enjoys working on command line like myself:
The perl one-liner below will remove all comments:
perl -0pe 's|//.*?\n|\n|g; s#/\*(.|\n)*?\*/##g;' test.java
Example:
cat test.java
this is a test
/**
*This should be removed
*This should be removed
*/
this should not be removed
//this should be removed
this should not be removed
this should not be removed //this should be removed
Output:
perl -0pe 's#/\*\*(.|\n)*?\*/##g; s|//.*?\n|\n|g' test.java
this is a test
this should not be removed
this should not be removed
this should not be removed
If you want to get rid of multiple blank lines as well:
perl -0pe 's|//.*?\n|\n|g; s#/\*(.|\n)*?\*/##g; s/\n\n+/\n\n/g' test.java
this is a test
this should not be removed
this should not be removed
this should not be removed
EDIT: Corrected regex

Dealing with source code is hard unless you know more about how the comments are written.
In the more general case, you could have // or /* in text constants. So you really need to parse the file at a syntactic level, not only a lexical one. IMHO the only bulletproof solution would be to start, for example, with the Java parser from OpenJDK.
If you know that your comments are never deeply mixed with the code (in my example, comments MUST be full lines), a Python script could help:
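multiple = False
for line in text:
    stripped = line.strip()
    if multiple:
        # inside a block comment: drop the line, and leave the block at */
        if stripped.endswith('*/'):
            multiple = False
        continue
    elif stripped.startswith('/*'):
        # a one-line /* ... */ comment opens and closes on the same line
        multiple = not stripped.endswith('*/')
    elif stripped.startswith('//'):
        pass
    else:
        print(line)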

If you are using Eclipse IDE, you could make regex do the work for you.
Open the search window (Ctrl+F), and check 'Regular Expression'.
Provide the expression as
/\*\*(?s:(?!\*/).)*\*/
Prasanth Bhate has explained it in Tool to remove JavaDoc comments?

public class TestForStrings {
/**
* The main method.
*
* @param args
* the arguments
* @throws Exception
* the exception
*/
public static void main(String args[]) throws Exception {
String[] imports = new String[100];
String fileName = "Menu.java";
// This will reference one API at a time
String line = null;
try {
FileReader fileReader = new FileReader(fileName);
// Always wrap FileReader in BufferedReader.
BufferedReader bufferedReader = new BufferedReader(fileReader);
int startingOffset = 0;
// This will reference one API at a time
List<String> lines = Files.readAllLines(Paths.get(fileName),
Charset.forName("ISO-8859-1"));
// remove single line comments
for (int count = 0; count < lines.size(); count++) {
String tempString = lines.get(count);
lines.set(count, removeSingleLineComment(tempString));
}
// remove multiple lines comment
for (int count = 0; count < lines.size(); count++) {
String tempString = lines.get(count);
removeMultipleLineComment(tempString, count, lines);
}
for (int count = 0; count < lines.size(); count++) {
System.out.println(lines.get(count));
}
} catch (FileNotFoundException ex) {
System.out.println("Unable to open file '" + fileName + "'");
} catch (IOException ex) {
System.out.println("Error reading file '" + fileName + "'");
} catch (Exception e) {
}
}
/**
* Removes the multiple line comment.
*
* @param tempString
* the temp string
* @param count
* the count
* @param lines
* the lines
* @return the string
*/
private static List<String> removeMultipleLineComment(String tempString,
int count, List<String> lines) {
try {
if (tempString.contains("/**") || (tempString.contains("/*"))) {
int StartIndex = count;
while (!(lines.get(count).contains("*/") || lines.get(count)
.contains("**/"))) {
count++;
}
int endIndex = ++count;
if (StartIndex != endIndex) {
while (StartIndex != endIndex) {
lines.set(StartIndex, "");
StartIndex++;
}
}
}
} catch (Exception e) {
// Do Nothing
}
return lines;
}
/**
* Remove single line comments .
*
* @param line
* the line
* @return the string
* @throws Exception
* the exception
*/
private static String removeSingleLineComment(String line) throws Exception {
try {
if (line.contains(("//"))) {
int startIndex = line.indexOf("//");
int endIndex = line.length();
String tempoString = line.substring(startIndex, endIndex);
line = line.replace(tempoString, "");
}
if ((line.contains("/*") || line.contains("/**"))
&& (line.contains("**/") || line.contains("*/"))) {
int startIndex = line.indexOf("/**");
int endIndex = line.length();
String tempoString = line.substring(startIndex, endIndex);
line = line.replace(tempoString, "");
}
} catch (Exception e) {
// Do Nothing
}
return line;
}
}

This is what I came up with yesterday.
This is actually homework I got from school so if anybody reads this and finds a bug before I turn it in, please leave a comment =)
P.S. 'FilterState' is an enum class.
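The enum itself is not shown in the post. Judging only from the case labels used in the method, it presumably looks something like this reconstruction:
```java
// Reconstructed from the case labels in deleteComments; the original declaration is not shown.
enum FilterState {
    IN_CODE,
    CAN_BE_COMMENT_START,
    ON_COMMENT_LINE,
    IN_COMMENT_BLOCK,
    CAN_BE_COMMENT_END,
    INSIDE_STRING
}
```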
public static String deleteComments(String javaCode) {
FilterState state = FilterState.IN_CODE;
StringBuilder strB = new StringBuilder();
char prevC=' ';
for(int i = 0; i<javaCode.length(); i++){
char c = javaCode.charAt(i);
switch(state){
case IN_CODE:
if(c=='/')
state = FilterState.CAN_BE_COMMENT_START;
else {
if (c == '"')
state = FilterState.INSIDE_STRING;
strB.append(c);
}
break;
case CAN_BE_COMMENT_START:
if(c=='*'){
state = FilterState.IN_COMMENT_BLOCK;
}
else if(c=='/'){
state = FilterState.ON_COMMENT_LINE;
}
else {
state = FilterState.IN_CODE;
strB.append(prevC).append(c); // prevC+c would add the two chars as ints and append a number
}
break;
case ON_COMMENT_LINE:
if(c=='\n' || c=='\r') {
state = FilterState.IN_CODE;
strB.append(c);
}
break;
case IN_COMMENT_BLOCK:
if(c=='*')
state=FilterState.CAN_BE_COMMENT_END;
break;
case CAN_BE_COMMENT_END:
if(c=='/')
state = FilterState.IN_CODE;
else if(c!='*')
state = FilterState.IN_COMMENT_BLOCK;
break;
case INSIDE_STRING:
if(c == '"' && prevC!='\\')
state = FilterState.IN_CODE;
strB.append(c);
break;
default:
System.out.println("unknown case");
return null;
}
prevC = c;
}
return strB.toString();
}

private static int find(String s, String t, int start) {
int ret = s.indexOf(t, start);
return ret < 0 ? Integer.MAX_VALUE : ret;
}
private static int findSkipEsc(String s, String t, int start) {
while(true) {
int ret = find(s, t, start);
if( ret == Integer.MAX_VALUE) return -1;
int esc = find(s, "\\", start);
if( esc > ret) return ret;
start += 2;
}
}
private static String removeLineCommnt(String s) {
int i, start = 0;
while (0 <= (i = find(s, "//", start))) { //Speed it up
int j = find(s, "'", start);
int k = find(s, "\"", start);
int first = min(i, min(j, k));
if (first == Integer.MAX_VALUE) return s;
if (i == first) return s.substring(0, i);
//skipp quoted string
start = first+1;
if (k == first) { // " asdas\"dasd "
start = findSkipEsc(s,"\"",start);
if (start < 0) return s;
start++;
continue;
}
//if j == first ' asda\'sasd ' --- not in JSON
start = findSkipEsc(s,"'\"'",start);
if (start < 0) return s;
start++;
}
return s;
}
static String removeLineCommnts(String s) {
if (!s.contains("//")) return s; //Speed it up
return Arrays.stream(s.split("[\\n\\r]+")).
map(Common::removeLineCommnt).
collect(Collectors.joining("\n"));
}

arrayListOutOfBoundsException

This is my class Debugger. Can anyone try to run it and see what's wrong? I've spent hours on it already. :(
public class Debugger {
private String codeToDebug = "";
public Debugger(String code) {
codeToDebug = code;
}
/**
* This method itterates over a css file and adds all the properties to an arraylist
*/
public void searchDuplicates() {
boolean isInside = false;
ArrayList<String> methodStorage = new ArrayList();
int stored = 0;
String[] codeArray = codeToDebug.split("");
try {
int i = 0;
while(i<codeArray.length) {
if(codeArray[i].equals("}")) {
isInside = false;
}
if(isInside && !codeArray[i].equals(" ")) {
boolean methodFound = false;
String method = "";
int c = i;
while(!methodFound) {
method += codeArray[c];
if(codeArray[c+1].equals(":")) {
methodFound = true;
} else {
c++;
}
}
methodStorage.add(stored, method);
System.out.println(methodStorage.get(stored));
stored++;
boolean stillInside = true;
int skip = i;
while(stillInside) {
if(codeArray[skip].equals(";")) {
stillInside = false;
} else {
skip++;
}
}
i = skip;
}
if(codeArray[i].equals("{")) {
isInside = true;
}
i++;
}
} catch(ArrayIndexOutOfBoundsException ar) {
System.out.println("------- array out of bounds exception -------");
}
}
/**
* Takes in String and outputs the number of characters it contains
* @param input
* @return Number of characters
*/
public static int countString(String input) {
String[] words = input.split("");
int counter = -1;
for(int i = 0; i<words.length; i++){
counter++;
}
return counter;
}
public static void main(String[] args) {
Debugger h = new Debugger("body {margin:;\n}");
h.searchDuplicates();
}
}

Any place where an element of an array is obtained without a bounds check after the index has been manipulated is a candidate for an ArrayIndexOutOfBoundsException.
In the above code, there are at least two instances where the index is being manipulated without being subject to a bounds check.
The while loop checking the !methodFound condition
The while loop checking the stillInside condition
In those two cases, the index is being manipulated by incrementing or adding a value to the index, but there are no bound checks before an element is being obtained from the String[], therefore there is no guarantee that the index being specified is not outside the bounds of the array.
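One way to make such scans safe is to move them into a helper that stops at the end of the array and reports failure instead of reading past it. The sketch below is an illustration only; BoundsCheckSketch and indexOfToken are made-up names, not part of the asker's class.
```java
// Hypothetical helper, for illustration only.
class BoundsCheckSketch {
    /**
     * Returns the index of the first element equal to target at or after from,
     * or -1 if the end of the array is reached first.
     */
    static int indexOfToken(String[] tokens, String target, int from) {
        for (int i = Math.max(from, 0); i < tokens.length; i++) {
            if (tokens[i].equals(target)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[] tokens = {"m", ":", ";", "\n", "}"};
        System.out.println(indexOfToken(tokens, ":", 0)); // prints 1
        System.out.println(indexOfToken(tokens, ":", 3)); // prints -1
    }
}
```
The caller can then treat -1 as "no ':' or ';' left" and leave the loop, rather than relying on the exception.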

I think this block of code causes your problem:
int c = i;
while(!methodFound) {
method += codeArray[c];
if(codeArray[c+1].equals(":")) {
methodFound = true;
} else {
c++;
}
}
int skip = i;
while(stillInside) {
if(codeArray[skip].equals(";")) {
stillInside = false;
} else {
skip++;
}
}
i = skip;
The reason is that if the loop condition is still true when c = codeArray.length - 1, the access codeArray[c + 1] throws an ArrayIndexOutOfBoundsException.

Try checking whether your index still exists in the array before you use it, by adding these conditions:
while (!methodFound && c < codeArray.length) {
while (stillInside && skip < codeArray.length) {
if (i < codeArray.length && codeArray[i].equals("{")) {
so, your code looks like:
import java.util.ArrayList;
import java.util.List;
public class Debugger {
private String codeToDebug = "";
public Debugger(String code) {
codeToDebug = code;
}
/**
* This method iterates over a CSS file and adds all the properties to an
* ArrayList
*/
public void searchDuplicates() {
boolean isInside = false;
List<String> methodStorage = new ArrayList<String>();
int stored = 0;
String[] codeArray = codeToDebug.split("");
try {
int i = 0;
while (i < codeArray.length) {
if (codeArray[i].equals("}")) {
isInside = false;
}
if (isInside && !codeArray[i].equals(" ")) {
boolean methodFound = false;
String method = "";
int c = i;
while (!methodFound && c < codeArray.length) {
method += codeArray[c];
if (codeArray[c].equals(":")) {
methodFound = true;
} else {
c++;
}
}
methodStorage.add(stored, method);
System.out.println(methodStorage.get(stored));
stored++;
boolean stillInside = true;
int skip = i;
while (stillInside && skip < codeArray.length) {
if (codeArray[skip].equals(";")) {
stillInside = false;
} else {
skip++;
}
}
i = skip;
}
if (i < codeArray.length && codeArray[i].equals("{")) {
isInside = true;
}
i++;
}
} catch (ArrayIndexOutOfBoundsException ar) {
System.out.println("------- array out of bounds exception -------");
ar.printStackTrace();
}
}
/**
* Takes in String and outputs the number of characters it contains
*
* @param input
* @return Number of characters
*/
public static int countString(String input) {
String[] words = input.split("");
int counter = -1;
for (int i = 0; i < words.length; i++) {
counter++;
}
return counter;
}
public static void main(String[] args) {
Debugger h = new Debugger("body {margin:prueba;\n}");
h.searchDuplicates();
}
}
Also, declaring variables with their implementation type is bad practice. That is why, in the code above, I changed ArrayList variable = new ArrayList() to List variable = new ArrayList().
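A short sketch of why declaring the interface type can help. The class InterfaceTypeDemo and the method joinAll are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class InterfaceTypeDemo {
    /** Accepts any List implementation, so callers are free to choose one. */
    public static String joinAll(List<String> items) {
        StringBuilder sb = new StringBuilder();
        for (String item : items) {
            sb.append(item);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Declared as List: only the right-hand side names the implementation.
        List<String> methodStorage = new ArrayList<String>();
        methodStorage.add("margin");
        methodStorage.add("padding");

        // Switching the implementation does not affect code written against List.
        List<String> linked = new LinkedList<String>(methodStorage);

        System.out.println(joinAll(methodStorage));
        System.out.println(joinAll(linked));
    }
}
```

Because joinAll only depends on List, it works unchanged with either implementation.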

I couldn't resist implementing this task of writing a CSS parser in a completely different way. I have split the task of parsing into many small ones.
The smallest is called skipWhitespace, since you will need it everywhere when parsing text files.
The next one is parseProperty, which reads one property of the form name:value;.
Based on that, parseSelector reads a complete CSS selector, starting with the selector name, an opening brace, possibly many properties, and finishing with the closing brace.
Still based on that, parseFile reads a complete file, consisting of possibly many selectors.
Note how carefully I checked whether the index is small enough. I did that before every access to the chars array.
I used LinkedHashMaps to save the properties and the selectors, because these kinds of maps remember in which order the things have been inserted. Normal HashMaps don't do that.
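As a small aside before the parser itself, this sketch shows the ordering behaviour just mentioned. The class InsertionOrderDemo and the method keysInIterationOrder are names invented for this illustration, and the property names are sample data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class InsertionOrderDemo {
    /** Copies the keys of a map into a list, in the map's iteration order. */
    public static List<String> keysInIterationOrder(Map<String, String> map) {
        return new ArrayList<String>(map.keySet());
    }

    public static void main(String[] args) {
        Map<String, String> properties = new LinkedHashMap<String, String>();
        properties.put("margin", "0");
        properties.put("color", "red");
        properties.put("border", "none");

        // A LinkedHashMap iterates in insertion order.
        System.out.println(keysInIterationOrder(properties)); // [margin, color, border]
    }
}
```

With a plain HashMap the iteration order is unspecified, so the same three keys could come back in any order.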
The task of parsing a text file is generally quite complex, and this program only attempts to handle the basics of CSS. If you need a full CSS parser, you should definitely look for a ready-made one. This one cannot handle @media or similar things where you have nested blocks. But it shouldn't be too difficult to add that to the existing code.
This parser will not handle CSS comments very well. It only expects them at a few places. If comments appear in other places, the parser will not treat them as comments.
import java.util.LinkedHashMap;
import java.util.Map;
public class CssParser {
private final char[] chars;
private int index;
public CssParser(String code) {
this.chars = code.toCharArray();
this.index = 0;
}
private void skipWhitespace() {
/*
* Here you should also skip comments in the CSS file. CSS comments look
* like this comment; standard CSS has no comments that start with // and
* go until the end of line.
*/
while (index < chars.length && Character.isWhitespace(chars[index]))
index++;
}
private void parseProperty(String selector, Map<String, String> properties) {
skipWhitespace();
// get the CSS property name
StringBuilder sb = new StringBuilder();
while (index < chars.length && chars[index] != ':')
sb.append(chars[index++]);
String propertyName = sb.toString().trim();
if (index == chars.length)
throw new IllegalArgumentException("Expected a colon at index " + index + ".");
// skip the colon
index++;
// get the CSS property value
sb.setLength(0);
while (index < chars.length && chars[index] != ';' && chars[index] != '}')
sb.append(chars[index++]);
String propertyValue = sb.toString().trim();
/*
* Here is the check for duplicate property definitions. The method
* Map.put(Object, Object) always returns the value that had been stored
* under the given name before.
*/
String previousValue = properties.put(propertyName, propertyValue);
if (previousValue != null)
throw new IllegalArgumentException("Duplicate property \"" + propertyName + "\" in selector \"" + selector + "\".");
if (index < chars.length && chars[index] == ';')
index++;
skipWhitespace();
}
private void parseSelector(Map<String, Map<String, String>> selectors) {
skipWhitespace();
// get the CSS selector
StringBuilder sb = new StringBuilder();
while (index < chars.length && chars[index] != '{')
sb.append(chars[index++]);
String selector = sb.toString().trim();
if (index == chars.length)
throw new IllegalArgumentException("CSS Selector name \"" + selector + "\" without content.");
// skip the opening brace
index++;
skipWhitespace();
Map<String, String> properties = new LinkedHashMap<String, String>();
selectors.put(selector, properties);
while (index < chars.length && chars[index] != '}') {
parseProperty(selector, properties);
skipWhitespace();
}
// skip the closing brace
index++;
}
private Map<String, Map<String, String>> parseFile() {
Map<String, Map<String, String>> selectors = new LinkedHashMap<String, Map<String, String>>();
while (index < chars.length) {
parseSelector(selectors);
skipWhitespace();
}
return selectors;
}
public static void main(String[] args) {
CssParser parser = new CssParser("body {margin:prueba;A:B;a:Arial, Courier New, \"monospace\";\n}");
Map<String, Map<String, String>> selectors = parser.parseFile();
System.out.println("There are " + selectors.size() + " selectors.");
for (Map.Entry<String, Map<String, String>> entry : selectors.entrySet()) {
String selector = entry.getKey();
Map<String, String> properties = entry.getValue();
System.out.println("Selector " + selector + ":");
for (Map.Entry<String, String> property : properties.entrySet()) {
String name = property.getKey();
String value = property.getValue();
System.out.println(" Property name \"" + name + "\" value \"" + value + "\"");
}
}
}
}
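The comment inside skipWhitespace suggests skipping CSS comments as well. One possible way to do that is sketched below, under the assumption that only block comments need to be handled. The class CommentSkipper and its static method are hypothetical and work on an explicit index instead of the parser's index field.

```java
public class CommentSkipper {
    /**
     * Returns the first index at or after start that is neither whitespace
     * nor inside a CSS block comment (slash-star ... star-slash).
     */
    public static int skipWhitespaceAndComments(char[] chars, int start) {
        int index = start;
        while (index < chars.length) {
            if (Character.isWhitespace(chars[index])) {
                index++;
            } else if (index + 1 < chars.length && chars[index] == '/' && chars[index + 1] == '*') {
                index += 2; // skip the opening delimiter
                // advance until the closing delimiter or the end of the input
                while (index + 1 < chars.length && !(chars[index] == '*' && chars[index + 1] == '/')) {
                    index++;
                }
                // skip the closing delimiter; clamp for an unterminated comment
                index = Math.min(index + 2, chars.length);
            } else {
                break; // real content starts here
            }
        }
        return index;
    }

    public static void main(String[] args) {
        char[] css = "  /* reset */ body {margin:0;}".toCharArray();
        int index = skipWhitespaceAndComments(css, 0);
        System.out.println(new String(css, index, css.length - index)); // body {margin:0;}
    }
}
```

As in the parser above, every array access is guarded by a comparison against chars.length, including the one-character lookahead for the comment delimiters.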

How to convert arabic char into hexstring in java - java

One problem is that your terminal probably does not support Unicode characters correctly (this might not be the only problem).
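One way to sidestep the terminal is to print only ASCII output. The following sketch formats every char of the string as a hexadecimal code unit. The class ArabicHexDemo and the method toHex are hypothetical names. The string is written with Unicode escapes on the assumption that the encoding of the source file may also be part of the problem.

```java
public class ArabicHexDemo {
    /** Formats every UTF-16 code unit of the input as 0x followed by lowercase hex digits. */
    public static String toHex(String text) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append("0x").append(Integer.toHexString(text.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The Arabic word from the question, written with Unicode escapes
        // so that the source file itself stays ASCII-only.
        String str = "\u0645\u0631\u062d\u0628\u0627";
        System.out.println(toHex(str)); // 0x645 0x631 0x62d 0x628 0x627
    }
}
```

Since the output contains only ASCII characters, it should display the same way whether or not the terminal can render Arabic text.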
