A friend and I are programming our own console in Java, but we have trouble aligning the lines correctly because the display width of Unicode characters cannot be determined exactly. As a result, not only the line containing the Unicode characters is shifted, but the following lines as well.
Is there a way to determine the display width of Unicode characters?
Screenshots of the problem can be found below.
This is how it should look: https://abload.de/img/richtigslkmg.jpeg
This is an example in Terminal: https://abload.de/img/terminal7dj5o.jpeg
This is an example in PowerShell: https://abload.de/img/powershelln7je0.jpeg
This is an example in Visual Studio Code: https://abload.de/img/visualstudiocode4xkuo.jpeg
This is an example in Putty: https://abload.de/img/putty0ujsk.png
EDIT:
I am sorry that the question was unclear.
It is about the display width: in the example I try to determine the display width of each string so that every line has the same length.
The function real_length is meant to calculate and return the display width.
Here is the example code:
public static void main(String[] args) {
String[] tests = {
"Peter",
"SHGAMI",
"Marcel №1",
"💏",
"👨❤️👨",
"👩❤️💋👩",
"👨👩👦"
};
for(String test : tests) test(test);
}
public static void test(String text) {
int max = 20;
for(int i = 0; i < max;i++) System.out.print("#");
System.out.println();
System.out.print(text);
int length = real_length(text);
for(int i = 0; i < max - length;i++) System.out.print("#");
System.out.println();
}
public static int real_length(String text) {
return text.length();
}
Unfortunately there is no easy solution to your deceptively simple question, for several reasons:
The width of the characters being rendered on the console might (and probably will) vary, based on the font being used. So the code would need to determine, or assume, the target font in order to calculate widths.
System.out is just a PrintStream that does not know or care about fonts and character width, so any solution has to be independent of that.
Even if you could determine the font being used on the console, and you had a way to determine the width of each character you were trying to render in that specific font, how would that help you? Knowing the variation in widths might conceivably allow you to cleverly tweak the lines being rendered so that they were aligned, but it's just as likely that it wouldn't be practicable.
A potential solution is to leave your code as it stands, and use a monospaced font on the console that println() is writing to, but there are still some major problems with that approach. First, you need to identify a font that is monospaced, but will also support all of the characters you want to render. This can be problematic when including emojis. Second, even if you identify such a font, you may find that all the glyphs for that font are not monospaced! Such a font will ensure that (say) a lowercase i and an uppercase W have the same width, but you can't also make that assumption for emojis, and you can't even assume that the "monospaced" emojis will all have the same non-standard width! Third, the font you identify (if it exists at all) would have to be available in your target environments (your PowerShell, your friend's PuTTY shell, etc.). That is not a major obstacle, but it is one more thing to worry about.
You may find that the rendered text varies by operating system. Your output may look aligned in a Linux terminal window, but that same output, using the same font, might be misaligned in a PowerShell window.
Given all that, a better approach might be to use Swing or JavaFX, where you have finer control over the output being rendered. Even if you are unfamiliar with those technologies, it wouldn't take too long to get something working, just by tweaking some sample code obtained through a search. And even allowing for the learning curve, it would still take less time than coming up with a robust solution for aligning arbitrary characters written to an arbitrary console, because that is a hard problem to solve.
Notes:
Your real_length() method merely returns the number of char values (UTF-16 code units) in the supplied Java String, which is not even the number of code points for characters outside the Basic Multilingual Plane. That relates to its internal representation, and has no direct correlation with the width of the rendered characters, which is determined by the font being used.
See Emoji exceed monospace character width, breaking column alignment #100730 where Microsoft have declined to address the issue for VS Code.
For SO question Java: how to align UTF Miscellaneous Symbols in plain text, see this answer which solved a similar but simpler problem, but only for the Command Prompt window on Windows.
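To illustrate the first note above, here is a minimal sketch of how String#length and String#codePointCount differ for the "💏" test string from the question. The class and method names are made up for this illustration, and neither count says anything about the rendered width.

```java
class LengthDemo {
    // Number of UTF-16 code units, which is what String#length reports.
    static int codeUnits(String text) {
        return text.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        // U+1F48F is the kiss emoji used in the question. It is built from its
        // code point so the example does not depend on the source-file encoding.
        String kiss = new String(Character.toChars(0x1F48F));

        System.out.println(codeUnits(kiss));  // prints 2 (a surrogate pair)
        System.out.println(codePoints(kiss)); // prints 1
    }
}
```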
tl;dr
Use code points rather than char. Avoid calling String#length.
input
+
"#".repeat( targetLength - input.codePoints().toArray().length )
Details
Your Question neglected to show any code. So I can only guess what you are doing and what might be the problem.
Avoid char
I am guessing that your goal is to append a certain number of NUMBER SIGN characters as needed to make a fixed-length row of text.
I am guessing the problem is that you are using the legacy char type, or its wrapper class Character. The char type has been essentially broken since Java 2. As a 16-bit value, char is physically incapable of representing most characters.
Use code point numbers
Instead, use code point integer numbers when working with individual characters. A code point is the number permanently assigned to each of the over 140,000 characters defined in Unicode.
A variety of code point related methods have been added to various classes in Java 5+: String, StringBuilder, Character, etc.
Here we use String#codePoints to get an IntStream of code points, one element for each character in the source. And we use StringBuilder#appendCodePoint to collect the code points for our final result string.
final int targetLength = 10;
final int fillerCodePoint = "#".codePointAt( 0 ); // Annoying zero-based index counting.
String input = "😷🤠🤡";
int[] codePoints = input.codePoints().toArray();
StringBuilder stringBuilder = new StringBuilder();
for ( int index = 0 ; index < targetLength ; index++ )
{
if ( index < codePoints.length )
{
stringBuilder.appendCodePoint( codePoints[ index ] );
} else
{
stringBuilder.appendCodePoint( fillerCodePoint );
}
}
Or, shorten that for loop with the use of a ternary operator.
for ( int index = 0 ; index < targetLength ; index++ )
{
int codePoint = ( index < codePoints.length ) ? codePoints[ index ] : fillerCodePoint;
stringBuilder.appendCodePoint( codePoint );
}
Report result.
System.out.println( Arrays.toString( codePoints ) );
String output = stringBuilder.toString();
System.out.println( "output = " + output );
[128567, 129312, 129313]
output = 😷🤠🤡#######
There is likely a clever way to write that code more briefly with streams and lambdas, but I cannot think of one at the moment.
And, one could cleverly use the String#repeat method in Java 11+.
String output = input + "#".repeat( targetLength - input.codePoints().toArray().length ) ;
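One caveat with the String#repeat one-liner: repeat throws an IllegalArgumentException when its argument is negative, which happens as soon as the input has more code points than the target length. A small sketch of a guarded helper follows; the class name CodePointPadding and the method padRight are hypothetical, and it requires Java 11+ for String#repeat. Like the code above, it pads by code point count, not by rendered width.

```java
class CodePointPadding {
    // Pads the input on the right with the filler until it contains
    // targetLength code points. Longer inputs are returned unchanged.
    static String padRight(String input, int targetLength, String filler) {
        int count = input.codePointCount(0, input.length());
        int missing = Math.max(0, targetLength - count); // never negative
        return input + filler.repeat(missing);
    }

    public static void main(String[] args) {
        // The three emoji from the example above, built from their code points.
        String input = new StringBuilder()
                .appendCodePoint(128567)
                .appendCodePoint(129312)
                .appendCodePoint(129313)
                .toString();
        System.out.println(padRight(input, 10, "#"));          // input followed by 7 '#'
        System.out.println(padRight("abcdefghijkl", 10, "#")); // unchanged, no exception
    }
}
```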
Note: This answer is distinct and qualitatively different from my earlier one (which I still stand by).
There is a simple way for a Java application (even one not using a graphical user interface) to obtain the width of a String being rendered in a given font with a given font size. It requires the use of some AWT classes which are supported even in a non-AWT environment. Here's a demo using the data provided in the question:
package fixedwidth;
import java.awt.Canvas;
import java.awt.Font;
import java.awt.FontMetrics;
public class FixedWidth {
static String[] tests = {
"Peter", "SHGAMI", "Marcel №1", "💏", "👨❤️👨", "👩❤️💋👩", "👨👩👦"
};
static Font smallFont = new Font("Monospaced", Font.PLAIN, 10);
static Font bigFont = new Font("Monospaced", Font.BOLD, 24);
/**
* This code is based on an answer by SO user Lonzak.
* See SO Answer https://stackoverflow.com/a/18123024/2985643
*/
public static void main(String[] args) {
FontMetrics fm1 = new Canvas().getFontMetrics(FixedWidth.smallFont);
FixedWidth.demo(tests, fm1);
FontMetrics fm2 = new Canvas().getFontMetrics(FixedWidth.bigFont);
FixedWidth.demo(tests, fm2);
}
static void demo(String[] tests, FontMetrics fm) {
Font f = fm.getFont();
System.out.println("\nFont name:" + f.getName() + ", font size:" +
f.getSize() + ", font style:" + f.getStyle());
for (String test : tests) {
int width = fm.stringWidth(test);
System.out.println("width=" + width + ", data=" + test);
}
}
}
The code above is based on this old answer by user Lonzak to the question Java - FontMetrics without Graphics. Those AWT classes allow you to create a Font with defined characteristics (i.e. name, size, style), and then use a FontMetrics instance to obtain the width of an arbitrary String when using that font.
Here is the output from running the code shown above:
Font name:Monospaced, font size:10, font style:0
width=30, data=Peter
width=60, data=SHGAMI
width=59, data=Marcel №1
width=10, data=💏
width=30, data=👨❤️👨
width=40, data=👩❤️💋👩
width=30, data=👨👩👦
Font name:Monospaced, font size:24, font style:1
width=70, data=Peter
width=149, data=SHGAMI
width=140, data=Marcel №1
width=25, data=💏
width=73, data=👨❤️👨
width=98, data=👩❤️💋👩
width=74, data=👨👩👦
Notes:
The first set of results shows the widths of the sample data in the question when using plain Monospaced 10 point font. The second set of results shows the widths of those same strings when using bold Monospaced 24 point font.
The widths don't look correct for some of the emojis, but that is because when the source code and output results are pasted into SO some emoji representations are changed, presumably because of the different font being used in the browser. (I was using Monospaced for both the source and the output.) Here's a screen shot of the original output, showing that the widths at least look plausible:
Even though the widths are being calculated and rendered for a fixed width font (Monospaced), it's clear that the width of the emojis cannot be predicted from the widths of normal keyboard characters.
Sounds like you're looking for a Java implementation of the POSIX wcwidth and wcswidth functions, which implement the rules defined in Unicode Technical Report #11 (which exclusively focuses on display widths for Unicode codepoints when rendered to fixed width devices - terminals and the like). The only such Java implementation that I'm aware of is in the JLine3 library, which is a lot of code to bring in for just this one class, but that may be your best bet.
Note however that that code appears to be incomplete. Unicode codepoint 0x26AA (⚪️), for example, is reported as having a width of 1 by the JLine3 code, but on every platform I've tested on (including here in the StackOverflow editor, which is a fixed width "device") that codepoint is displayed over two columns.
Good luck - this stuff is a lot more complex than it looks. The JVM's unfortunate UCS-2 history (not Sun's fault - it was bad timing wrt the Unicode standard) only makes matters worse, and as others have said here, avoid the char and Character data types like the plague - they do not work the way you expect, and the instant code that uses those types encounters data including codepoints from the Unicode supplemental planes, it is almost certain to function incorrectly (unless the author has been especially careful - do you feel lucky? 😉).
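To give a feel for what a wcwidth-style function does, here is a deliberately rough sketch. It is not the JLine3 code and not a complete implementation of the Unicode rules: the class name ColumnWidth and the hard-coded ranges are assumptions chosen for illustration, loosely following the ranges commonly treated as "wide". It ignores emoji presentation selectors and treats each emoji in a zero-width-joiner sequence as a separate wide character, so real terminals may well disagree with its results.

```java
class ColumnWidth {
    // Rough column count for a single code point: 0, 1, or 2.
    static int ofCodePoint(int cp) {
        int type = Character.getType(cp);
        if (type == Character.CONTROL
                || type == Character.FORMAT            // e.g. ZERO WIDTH JOINER (U+200D)
                || type == Character.NON_SPACING_MARK  // e.g. combining accents, U+FE0F
                || type == Character.ENCLOSING_MARK) {
            return 0;
        }
        return isWide(cp) ? 2 : 1;
    }

    // Hand-picked ranges that are usually rendered two columns wide.
    // This list is an approximation, not the full East Asian Width data.
    static boolean isWide(int cp) {
        return (cp >= 0x1100 && cp <= 0x115F)                   // Hangul Jamo
            || (cp >= 0x2E80 && cp <= 0xA4CF && cp != 0x303F)   // CJK blocks
            || (cp >= 0xAC00 && cp <= 0xD7A3)                   // Hangul syllables
            || (cp >= 0xF900 && cp <= 0xFAFF)                   // CJK compatibility ideographs
            || (cp >= 0xFE30 && cp <= 0xFE6F)                   // CJK compatibility forms
            || (cp >= 0xFF00 && cp <= 0xFF60)                   // fullwidth forms
            || (cp >= 0xFFE0 && cp <= 0xFFE6)
            || (cp >= 0x1F300 && cp <= 0x1F64F)                 // many pictographs and emoticons
            || (cp >= 0x1F900 && cp <= 0x1F9FF)                 // supplemental symbols
            || (cp >= 0x20000 && cp <= 0x3FFFD);                // supplementary ideographs
    }

    // Sum of the per-code-point estimates for a whole string.
    static int ofString(String text) {
        return text.codePoints().map(ColumnWidth::ofCodePoint).sum();
    }

    public static void main(String[] args) {
        System.out.println(ofString("Peter")); // prints 5
        System.out.println(ofString(new String(Character.toChars(0x1F48F)))); // prints 2
    }
}
```

A function like this could replace real_length in the question's test code, with the understanding that the result is only an estimate of what a given terminal will actually draw.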
For a programming project in Calculus we were instructed to write a program that models Simpson's 1/3 and 3/8 rules.
We are supposed to take in a polynomial (e.g. 5x^2+7x+10), but I am struggling to conceptualize this. I have begun by using Scanner, but is there a better way to read the polynomial correctly?
Any examples or reference materials will be greatly appreciated.
I'd suggest that you start with a Function interface that takes in an input value and returns an output value:
public interface Function {
double evaluate(double x);
}
Write a polynomial implementation:
public class Poly {
public static double evaluate(double x, double [] coeffs) {
double value = 0.0;
if (coeffs != null) {
// Use Horner's method to evaluate.
for (int i = coeffs.length-1; i >= 0; --i) {
value = coeffs[i] + (x*value);
}
}
return value;
}
}
Pass that to your integrator and let it do its thing.
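As a sketch of what "pass that to your integrator" could look like, the following wires a polynomial into composite Simpson's 1/3 and 3/8 rules. The Function interface and the Horner evaluation mirror the snippets above; the class name Simpson and the method names oneThird, threeEighths and horner are assumptions made for this example, and the coefficient array is ordered from the constant term upward.

```java
interface Function {
    double evaluate(double x);
}

class Simpson {
    // Horner's method; coeffs[i] is the coefficient of x^i.
    static double horner(double x, double[] coeffs) {
        double value = 0.0;
        for (int i = coeffs.length - 1; i >= 0; --i) {
            value = coeffs[i] + x * value;
        }
        return value;
    }

    // Composite Simpson's 1/3 rule; n must be a positive even number.
    static double oneThird(Function f, double a, double b, int n) {
        if (n <= 0 || n % 2 != 0) {
            throw new IllegalArgumentException("n must be positive and even");
        }
        double h = (b - a) / n;
        double sum = f.evaluate(a) + f.evaluate(b);
        for (int i = 1; i < n; i++) {
            // Interior points alternate between weights 4 and 2.
            sum += (i % 2 == 1 ? 4 : 2) * f.evaluate(a + i * h);
        }
        return sum * h / 3;
    }

    // Composite Simpson's 3/8 rule; n must be a positive multiple of 3.
    static double threeEighths(Function f, double a, double b, int n) {
        if (n <= 0 || n % 3 != 0) {
            throw new IllegalArgumentException("n must be a positive multiple of 3");
        }
        double h = (b - a) / n;
        double sum = f.evaluate(a) + f.evaluate(b);
        for (int i = 1; i < n; i++) {
            // Panel boundaries get weight 2, all other interior points weight 3.
            sum += (i % 3 == 0 ? 2 : 3) * f.evaluate(a + i * h);
        }
        return sum * 3 * h / 8;
    }

    public static void main(String[] args) {
        double[] coeffs = {10, 7, 5}; // 5x^2 + 7x + 10
        Function poly = x -> horner(x, coeffs);
        // The exact integral over [0, 1] is 5/3 + 7/2 + 10, roughly 15.1667.
        System.out.println(oneThird(poly, 0, 1, 10));
        System.out.println(threeEighths(poly, 0, 1, 9));
    }
}
```

Both rules integrate polynomials up to degree 3 exactly apart from floating-point rounding, so a quadratic like this one makes a convenient test case.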
A simple way (to get you started) is to use an array.
In your example: 5x^2 + 7x + 10 would be:
{10,7,5}
I.e. at index 0 is the factor 10 for x^0, at index 1 is 7 for x^1, and at index 2 is 5 for x^2.
Of course this is not the best approach. To see why, consider how you would represent x^20.
In Java it would be easiest to pre-format your input and just ask for the constants, as in, "Please enter the X^2 term" (and then the X term, and then the constant).
If that's not acceptable, you are going to be quite vulnerable to input style differences. You can separate the terms by String.split[ting] on + and -, that will leave you something like:
[5x^2], [7x], [10]
You could then search for strings containing "x^2" and "x" to differentiate your terms
Remove spaces and .toLowerCase() first to counter user variances, of course.
When you split your string you will need to identify the - cases so you can negate those constants.
You could do two splits, one on + the other on -. You could also use StringTokenizer with the option to keep the "Tokens" which might be more straight-forward but StringTokenizer makes some people a little uncomfortable, so go with whatever works for you.
Note that this will succeed even if the user types "5x^2 + 10 + 7 x", which can be handy.
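A minimal sketch of the split-based approach described above follows. It assumes terms of the form 5x^2, 7x or 10 with optional signs, and returns the coefficients ordered from the constant term upward. The class name PolynomialParser and its parse method are hypothetical. Instead of splitting twice, it rewrites every "-" as "+-" so that a single split on "+" keeps the sign attached to each term.

```java
import java.util.Arrays;
import java.util.TreeMap;

class PolynomialParser {
    // Parses input such as "5x^2 + 7x + 10" into {10.0, 7.0, 5.0}.
    static double[] parse(String input) {
        // Remove spaces and lower-case first to counter user variances.
        String normalized = input.replaceAll("\\s+", "").toLowerCase();
        // Turn "a-b" into "a+-b" so one split on '+' keeps the minus signs.
        String[] terms = normalized.replace("-", "+-").split("\\+");

        TreeMap<Integer, Double> byExponent = new TreeMap<>();
        for (String term : terms) {
            if (term.isEmpty()) {
                continue; // produced by a leading sign
            }
            int xPos = term.indexOf('x');
            double coefficient;
            int exponent;
            if (xPos < 0) {
                coefficient = Double.parseDouble(term); // constant term
                exponent = 0;
            } else {
                String c = term.substring(0, xPos);
                coefficient = c.isEmpty() ? 1 : c.equals("-") ? -1 : Double.parseDouble(c);
                String rest = term.substring(xPos + 1);
                if (rest.isEmpty()) {
                    exponent = 1;
                } else if (rest.startsWith("^")) {
                    exponent = Integer.parseInt(rest.substring(1));
                } else {
                    throw new IllegalArgumentException("Unexpected term: " + term);
                }
            }
            // Terms with the same exponent are added together.
            byExponent.merge(exponent, coefficient, Double::sum);
        }

        int degree = byExponent.isEmpty() ? 0 : byExponent.lastKey();
        double[] coeffs = new double[degree + 1];
        byExponent.forEach((e, c) -> coeffs[e] = c);
        return coeffs;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("5x^2 + 7x + 10")));  // [10.0, 7.0, 5.0]
        System.out.println(Arrays.toString(parse("5x^2 + 10 + 7 x"))); // [10.0, 7.0, 5.0]
    }
}
```

Input that does not follow the assumed shape, such as "5*x", ends in a NumberFormatException, which is the kind of fragility that makes a real parser attractive for anything more general.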
I believe parsing is my problem. I am somewhat new to Java, so this is troubling me.
You should use a parser generator.
A parser generator is a tool that reads a grammar specification and converts it to a Java program that can recognize matches to the grammar. In addition to the parser generator itself, JavaCC provides other standard capabilities related to parser generation such as tree building (via a tool called JJTree included with JavaCC), actions, debugging, etc.
JavaCC's FAQ answers How do I parse arithmetic expressions?
See the examples that come with JavaCC.
See any text on compiling.
See Parsing Expressions by Recursive Descent and a tutorial by Theodore Norvell.
Also, see JavaCC - Parse math expressions into a class structure
Is there a way to dynamically change output in Java? For instance, in a terminal window if I have:
System.out.print("H");
and then I have:
System.out.print("I");
The output will be:
HI
Is there a way to assign a position to outputs that allows you to replace characters dynamically? For instance (and I know this would not output what I want, I merely want to demonstrate my thinking) this:
System.out.print("H");
Thread.sleep(1000);
System.out.print("I");
And it would first print out
H
and then after a second, replace the H with an I?
I'm sure this sounds stupid, I am just interested in dynamically changing content without GUIs. Can someone point me in the direction for this technique? Thank you very much in advance.
You might want to take a look at
System.out.printf
Look at the example shown here: http://masterex.github.com/archive/2011/10/23/java-cli-progress-bar.html
edit:
printf displays formatted strings, which means you can adapt that format and change it for your needs.
for example you could do something like:
String[] planets = {"Mars", "Earth", "Jupiter"};
String format = "\r%s says Hello";
for(String planet : planets) {
System.out.printf(format, planet);
try {
Thread.sleep(1000);
}catch(Exception e) {
//... oh dear
}
}
Using the formatted string syntax found here: http://docs.oracle.com/javase/6/docs/api/java/util/Formatter.html#syntax
As the comment says, this solution is limited to a single line; however, depending on your needs, this might be enough.
If you require a solution for the whole screen, then a possible (although quite dirty) solution would be to hook the operating system using JNA, get a handle on the console window, find its height, and then loop println() to "clear" the window before redrawing your output.
If you would like to read more then I can answer more questions or here is a link: https://github.com/twall/jna
You can use \b to backspace and erase the previous character.
$ cat T.java
import java.lang.Thread;
public class T {
public static void main(String... args) throws Exception {
System.out.print("H");
System.out.flush();
Thread.sleep(1000);
System.out.print("\bI\n");
System.out.flush();
}
}
$ javac T.java && java T
I
It will output H, then replace it with I after one second.
Sadly, it doesn't work in the Eclipse console, but it does in a normal console.
This is what you need (uses carriage return '\r' to overwrite the previous output):
System.out.print("H");
Thread.sleep(1000);
System.out.print("\rI");
The C library that is usually used to do this sort of thing is called curses. (Also used from scripting languages that rely on bindings to C libraries, like Python.) You can use a Java binding to it, like JCurses. Google also tells me a pure-Java equivalent is available, called lanterna.
Basically I have a bunch of large strings that I want to remove spaces/punctuation/numbers from, I just want the words.
This is my code:
String str = "hughes/conserdyne corp, unit <hughes capital corp> made bear stearns <bsc> exclusive investment banker develop market 2,188,933 financing design installation micro-utility systems municipalities. company systems self-contained electrical generating facilities alternate power sources, photovoltaic cells, replace public utility power sources.";
String[] arr = str.split("[\\p{P}\\s\\t\\n\\r<>\\d]");
for (int i = 0; i < arr.length; i++) {
if(arr[i] != null)
System.out.println(arr[i]);
}
This is the output I get:
hughes
conserdyne
corp
unit
lt
hughes
capital
corp
made
bear
stearns
lt
bsc
exclusive
investment
banker
develop
market
financing
design
installation
micro
utility
systems
municipalities
company
systems
self
contained
electrical
generating
facilities
alternate
power
sources
photovoltaic
cells
replace
public
utility
power
sources
So as you can see, there's a lot of white space and such appearing where commas and numbers used to be. I get this with or without that if condition on printing.
Yet, if I concatenate all of arr's contents into a new string, and then split that with regex "\s+" it works and produces the correct output.
So what's wrong with my current regex? Any help would be appreciated.
You should just be able to throw a + on the end of your regex:
String[] arr = str.split("[\\p{P}\\s\\t\\n\\r<>\\d]");
To:
String[] arr = str.split("[\\p{P}\\s\\t\\n\\r<>\\d]+");
// ^-- This guy
Adding the + means to match 1 or more of the previous element, so if you have multiple "break characters" in a row, they will be treated as a single delimiter and you won't get empty Strings in your result.
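A small sketch comparing the two patterns on a fragment of the question's input. The class name SplitDemo and the helper words are made up for this example. Note one remaining edge case: if the text starts with a delimiter, split still yields an empty first element even with the +, so the helper filters empty strings out.

```java
import java.util.Arrays;

class SplitDemo {
    // The character class from the question, unchanged.
    static final String DELIMITERS = "[\\p{P}\\s\\t\\n\\r<>\\d]";

    // Splits on runs of delimiters and drops any empty leading element.
    static String[] words(String text) {
        return Arrays.stream(text.split(DELIMITERS + "+"))
                .filter(word -> !word.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        String sample = "market 2,188,933 financing";

        // Without '+', every digit, comma and space is its own delimiter,
        // which leaves empty strings between "market" and "financing".
        System.out.println(Arrays.toString(sample.split(DELIMITERS)));

        // With '+', the whole run counts as a single delimiter.
        System.out.println(Arrays.toString(words(sample))); // [market, financing]
    }
}
```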
I'm trying to figure out the following issue related to BigIntegers in Java 7 x64. I am attempting to calculate a number to an extremely high power. Code is below, followed by a description of the problem.
import java.math.BigInteger;
public class main {
public static void main(String[] args) {
// Demo calculation; Desired calculation: BigInteger("4096").pow(800*600)
BigInteger images = new BigInteger("2").pow(15544);
System.out.println(
"The number of possible 16 bpc color 800x600 images is: "
+ images.toString());
}
}
I am encountering issues printing the result of this operation. When this code executes it prints the message but not the value of images.toString().
To isolate the problem I started calculating powers of two instead of the desired calculation listed in the comment on that line. On the two systems I have tested this on, 2^15544 is the smallest calculation that triggers the problem; 2^15543 works fine.
I'm nowhere close to hitting the memory limit on the host systems and I don't believe that I am even close to the VM limit (at any rate running with the VM arguments -Xmx1024M -Xms1024M has no effect).
After poking around the internet looking for answers I have come to suspect that I am hitting a limit in either BigInteger or String related to the maximum size of an array (Integer.MAX_VALUE) that those types use for internal data storage. If the problem is in String I think it would be possible to extend BigInteger and write a print method that spews out a few chars at a time until the entire BigInteger is printed, but I rather suspect that the problem lies elsewhere.
Thank you for taking the time to read my question.
The problem is a bug in the Console view in Eclipse.
On my setup, Eclipse (Helios and Juno) can't show a single line longer than 4095 characters without CRLF. The maximum length can vary depending on your font choice - see below.
Therefore, even the following code will show the problem - there's no need for a BigInteger.
StringBuilder str = new StringBuilder();
for (int i = 0; i < 4096; i++) {
str.append('?');
}
System.out.println(str);
That said, the string is actually printed in the console - you can for instance copy it out of it. It is just not shown.
As a workaround, you can enable the Fixed width console setting in the Console preferences, and the string will immediately appear.
The corresponding bugs on Eclipse's bugzilla are:
Display problem in console when a line reaches 4096 characters
Texteditor can't show a line with more than 4095 chars. Limit at 4096 chars.
Long lines are not displayed by editor
According to those, it's a Windows/GTK bug and Eclipse's developers can't do anything about it.
The bug is related to the length of the text is pixels, use a smaller
font and you will be able to get more characters in the text before it
breaks.
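Since the limit applies to a single line without a line break, another possible workaround is to wrap long output yourself before printing it. The following is only a sketch: the class name WrappedPrint, the method wrap and the width of 80 are arbitrary choices for illustration, and the inserted line breaks are of course not part of the number itself.

```java
import java.math.BigInteger;

class WrappedPrint {
    // Breaks text into lines of at most 'width' chars, each ending
    // with the platform line separator.
    static String wrap(String text, int width) {
        if (width <= 0) {
            throw new IllegalArgumentException("width must be positive");
        }
        StringBuilder out = new StringBuilder();
        for (int start = 0; start < text.length(); start += width) {
            int end = Math.min(start + width, text.length());
            out.append(text, start, end).append(System.lineSeparator());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        BigInteger images = BigInteger.valueOf(2).pow(15544);
        System.out.println("The number of possible 16 bpc color 800x600 images is:");
        System.out.print(wrap(images.toString(), 80));
    }
}
```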