I have a Russian string which I have encoded as UTF-8:
String str = "\u041E\u041A";
System.out.println("String str : " + str);
When I print the string in the Eclipse console I get ??. Can anyone suggest how to print Russian strings to the console, or what I am doing wrong here?
I have tried converting it to bytes using byte myArr[] = str.getBytes("UTF-8") and then new String(myArr, "UTF-8"), but I still get the same problem :-(
Try this:
String myString = "some cyrillic text";
byte bytes[] = myString.getBytes("ISO-8859-1");
String value = new String(bytes, "UTF-8");
Or this:
String myString = "some cyrillic text";
byte bytes[] = myString.getBytes("UTF-8");
String value = new String(bytes, "UTF-8");
The main problem with Russian text is setting the UTF-8 encoding correctly.
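A round trip through getBytes and new String cannot repair console output on its own, because a Java String is already Unicode. The snippet below is only a small sketch (the class and method names are made up for illustration): it shows that the UTF-8 round trip hands back the same string, while encoding Cyrillic text as ISO-8859-1 replaces every letter with a question mark, which is one way to end up with ??.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RoundTripDemo {
    // Encodes text with one charset and decodes the resulting bytes with another.
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        return new String(bytes, decodeWith);
    }

    public static void main(String[] args) {
        String str = "\u041E\u041A"; // the two Cyrillic letters from the question

        // Same charset both ways: the string comes back unchanged.
        System.out.println(str.equals(roundTrip(str, StandardCharsets.UTF_8, StandardCharsets.UTF_8)));

        // ISO-8859-1 has no Cyrillic letters, so getBytes substitutes '?' for each one.
        System.out.println(roundTrip(str, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));
    }
}
```

So the string itself is fine; what matters is the encoding used at the point where characters become bytes (console, file, or socket).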
In Eclipse, go to Run > Run Configurations > Common and change the console encoding to UTF-8. You will then be able to see the Russian characters in the console.
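The console setting only tells Eclipse how to decode the bytes it receives; the program still has to send UTF-8 bytes. A minimal sketch, assuming the console is set to UTF-8, is to wrap System.out in a PrintStream with an explicit encoding. The class name Utf8ConsoleDemo and the helper utf8Stream are hypothetical.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8ConsoleDemo {
    // Wraps any output stream in a PrintStream that always encodes text as UTF-8.
    static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The bytes sent to the console are UTF-8 regardless of the platform default encoding.
        PrintStream out = utf8Stream(System.out);
        out.println("String str : " + "\u041E\u041A");
    }
}
```

If the console is configured for another encoding such as CP1251, the name passed to the PrintStream constructor would need to match it instead.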
My Eclipse prints it correctly
String str : ОК
Try changing the Run Configurations encoding to UTF-8 or CP1251.
The display font of the console will most likely not cope with non-ASCII characters.
You could try printing to a file rather than System.out.
It's an old topic, but nevertheless maybe below would help someone.
If you're reading data that contains Cyrillic symbols using InputStream / InputStreamReader (for example from some API) and you get gibberish like ������ ��� or ?????? ???, try passing the encoding as a Charset in the second parameter of the InputStreamReader constructor.
EXAMPLE:
Let's use the Russian Central Bank API to get the US dollar and euro prices in Russian rubles. The code below gets the data for the day on which the request is made. The API returns XML, so we also need to parse it.
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import java.io.*;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.charset.Charset;
public class CBRFApi {
public static void main(String[] args) throws UnsupportedEncodingException {
String output = getAndReadData("http://www.cbr.ru/scripts/XML_daily.asp");
Document document = loadXMLFromString(output);
// getting root element
Node root = document.getDocumentElement();
NodeList currencies = root.getChildNodes();
// just for further reference
Node usDollar;
Node euro;
for (int i = 0; i < currencies.getLength(); i++) {
Node currency = currencies.item(i);
String key = currency.getAttributes().getNamedItem("ID").getNodeValue();
if (key.equals("R01235") || key.equals("R01239")) {
if (key.equals("R01235")) // US dollar ID
usDollar = currency;
else if (key.equals("R01239")) // Euro ID
euro = currency;
NodeList currencySpecs = currency.getChildNodes();
System.out.print(currencySpecs.item(1).getTextContent());
System.out.print(" " + currencySpecs.item(3).getTextContent());
System.out.print(" " + currencySpecs.item(4).getTextContent());
System.out.println();
}
}
}
public static String getAndReadData(String link) {
String output = "";
try {
URL url = new URL(link);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod("GET");
conn.setRequestProperty("Accept", "application/xml");
conn.setRequestProperty("User-Agent", "Mozilla/5.0 (Linux; Android 4.2.2; en-us; SAMSUNG GT-I9505 Build/JDQ39) " +
"AppleWebKit/535.19 (KHTML, like Gecko) Version/1.0 Chrome/18.0.1025.308 Mobile Safari/535.19.");
if (conn.getResponseCode() != 200) {
throw new RuntimeException("Failed : HTTP error code : "
+ conn.getResponseCode());
}
// below is the key line,
// without second parameter - Charset.forName("CP1251") -
// data in Cyrillic will be returned like ������ ���
InputStreamReader inputStreamReader = new InputStreamReader(conn.getInputStream(), Charset.forName("CP1251"));
BufferedReader bufferedReader = new BufferedReader(inputStreamReader);
String line;
while ((line = bufferedReader.readLine()) != null) {
output += line;
}
conn.disconnect();
return output;
} catch (MalformedURLException e) {
e.printStackTrace();
return null;
} catch (IOException e) {
e.printStackTrace();
return null;
}
}
public static Document loadXMLFromString(String xml)
{
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = null;
try {
builder = factory.newDocumentBuilder();
InputSource inputSource = new InputSource(new StringReader(xml));
return builder.parse(inputSource);
} catch (ParserConfigurationException | SAXException | IOException e) {
e.printStackTrace();
return null;
}
}
}
So proper output is:
USD Доллар США 63,3791
EUR Евро 70,5980
And without indicating Charset.forName("CP1251"):
USD ������ ��� 63,3791
EUR ���� 70,5980
Of course, the actual encoding in your case may differ from CP1251, so if this one doesn't work, try another encoding.
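The effect can also be reproduced without a network call. The following is a small sketch (CharsetDecodeDemo and readFirstLine are made-up names) that decodes the same windows-1251 bytes once with the matching charset and once with UTF-8; the second attempt produces U+FFFD replacement characters like the ones shown above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDecodeDemo {
    // Reads the first line from the stream, decoding its bytes with the given charset.
    static String readFirstLine(InputStream in, Charset charset) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Charset cp1251 = Charset.forName("windows-1251");
        String word = "\u0415\u0432\u0440\u043E"; // the Russian word for "euro"
        byte[] payload = word.getBytes(cp1251);   // stands in for the bytes a server would send

        // Matching charset: the text is recovered exactly.
        System.out.println(word.equals(readFirstLine(new ByteArrayInputStream(payload), cp1251)));

        // Wrong charset: the bytes are not valid UTF-8, so replacement characters appear.
        String garbled = readFirstLine(new ByteArrayInputStream(payload), StandardCharsets.UTF_8);
        System.out.println(garbled.indexOf('\uFFFD') >= 0);
    }
}
```

When the server sends a Content-Type header or the XML declaration names an encoding, that value is a better source for the charset than trial and error.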
This helped me in a similar situation:
String myString = "some cyrillic text";
byte bytes[] = myString.getBytes("windows-1251");
String value = URLEncoder.encode(new String(bytes, "UTF-8"), "UTF-8");
I have found a solution: use the right charset!
String path = "E:\\java\\test.txt";
File file = new File(path);
Scanner scan = new Scanner(file, "CP1251");
System.out.println(scan.nextLine());
The output was in Russian!
In Xcode you can print values with the LLDB command po [SomeClass returnAnObject].
To do this automatically, add the command as an action on a breakpoint. You also need to tick the "Automatically continue after evaluating actions" checkbox.
I had the same problem when reading a file, "MyFile.txt", that contains Russian letters. Maybe this helps someone. The solution is:
package j;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
public class J4 {
public static void Read_TXT_File(String fileName) throws
FileNotFoundException {
try{int i=0;
Scanner scanner = new Scanner(new File(fileName), "utf-8");
while (scanner.hasNext()) {
String line = scanner.nextLine();
//byte bytes[] = line.getBytes("UTF-8");
//line = new String(bytes, "UTF-8");
if (line.isEmpty()) {
System.out.println(i+": Empty line");
}
else {
System.out.println(i+": "+ line);
// here is your code for example String MyString = line
}
i++;
}
}catch(Exception ex){ex.printStackTrace();}
}
public static void main(String[] args) throws
FileNotFoundException {
Read_TXT_File("MyFile.txt");
}
}
My current code creates a file in the Windows-1252 code page, and I want to force it to create a UTF-8 file.
Can anyone help me with this code? It currently works, but I need to force the save to UTF-8. Can I pass a parameter or something?
This is what I have; any help is really appreciated:
var out = new java.io.FileWriter( new java.io.File( path )),
text = new java.lang.String( src || "" );
out.write( text, 0, text.length() );
out.flush();
out.close();
Instead of using FileWriter, create a FileOutputStream. You can then wrap this in an OutputStreamWriter, which allows you to pass an encoding in the constructor. Then you can write your data to that inside a try-with-resources statement:
try (OutputStreamWriter writer =
        new OutputStreamWriter(new FileOutputStream(PROPERTIES_FILE), StandardCharsets.UTF_8)) {
    // do stuff
}
Try this
Writer out = new BufferedWriter(new OutputStreamWriter(
new FileOutputStream("outfilename"), "UTF-8"));
try {
out.write(aString);
} finally {
out.close();
}
Try using FileUtils.write from Apache Commons.
You should be able to do something like:
File f = new File("output.txt");
FileUtils.writeStringToFile(f, document.outerHtml(), "UTF-8");
This will create the file if it does not exist.
Since Java 7 you can do the same with Files.newBufferedWriter a little more succinctly:
Path logFile = Paths.get("/tmp/example.txt");
try (BufferedWriter writer = Files.newBufferedWriter(logFile, StandardCharsets.UTF_8)) {
writer.write("Hello World!");
// ...
}
None of the answers given here will work, since Java's UTF-8 writing is bugged:
http://tripoverit.blogspot.com/2007/04/javas-utf-8-and-unicode-writing-is.html
var out = new java.io.PrintWriter(new java.io.File(path), "UTF-8");
text = new java.lang.String( src || "" );
out.print(text);
out.flush();
out.close();
The Java 7 Files utility type is useful for working with files:
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.io.IOException;
import java.util.*;
public class WriteReadUtf8 {
public static void main(String[] args) throws IOException {
List<String> lines = Arrays.asList("These", "are", "lines");
Path textFile = Paths.get("foo.txt");
Files.write(textFile, lines, StandardCharsets.UTF_8);
List<String> read = Files.readAllLines(textFile, StandardCharsets.UTF_8);
System.out.println(lines.equals(read));
}
}
The Java 8 version allows you to omit the Charset argument - the methods default to UTF-8.
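As a quick check of that default, here is a sketch (the class DefaultUtf8Demo and its method are made up for illustration) that writes lines with the Java 8 overload that takes no Charset and then reads them back with an explicit UTF-8 charset. It uses a temporary file so nothing is left behind.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class DefaultUtf8Demo {
    // Writes the lines without naming a charset, then decodes the file explicitly as UTF-8
    // and reports whether the text survived the round trip.
    static boolean writtenAsUtf8(List<String> lines) {
        try {
            Path file = Files.createTempFile("utf8-demo", ".txt");
            try {
                Files.write(file, lines); // Java 8+: no Charset argument, UTF-8 is used
                List<String> read = Files.readAllLines(file, StandardCharsets.UTF_8);
                return lines.equals(read);
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writtenAsUtf8(Arrays.asList("\u041E\u041A", "plain ASCII")));
    }
}
```

Note that this default applies to these java.nio.file.Files methods; classes such as FileWriter historically used the platform default encoding instead.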
You can write a UTF-8 encoded file in Java using PrintWriter, for example to write UTF-8 encoded XML:
PrintWriter out1 = new PrintWriter(new File("C:\\abc.xml"), "UTF-8");
The sample code below reads a file line by line and writes a new file in UTF-8. The input encoding, Cp1252, is specified explicitly.
public static void main(String args[]) throws IOException {
BufferedReader br = new BufferedReader(new InputStreamReader(
new FileInputStream("c:\\filenonUTF.txt"),
"Cp1252"));
String line;
Writer out = new BufferedWriter(
new OutputStreamWriter(new FileOutputStream(
"c:\\fileUTF.txt"), "UTF-8"));
try {
while ((line = br.readLine()) != null) {
out.write(line);
out.write("\n");
}
} finally {
br.close();
out.close();
}
}
Here is an example of writing UTF-8 characters in the Eclipse IDE and to a File.
For Eclipse, simply set the Encoding to UTF-8 from Run -> Run Configurations -> Common.
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
public class UTF_8_Example {
/**
* Example of printing UTF-8 characters inside Eclipse IDE and a File.
* <p>
* For eclipse, you must go to Run -> Run Configurations -> Common
* and set Encoding to UTF-8.
* <p>
* @param args
*/
public static void main(String[] args) {
BufferedWriter writer = null;
try {
///////////////////////////////////////////////////////////////////
// WRITE UTF-8 WITHIN ECLIPSE EDITOR
///////////////////////////////////////////////////////////////////
char character = '►';
int code = character;
char hex = '\u25ba';
String value = "[" + Integer.toHexString(code) + "][\u25ba][" + character + "][" + (char)code + "][" + hex + "]";
System.out.println(value);
///////////////////////////////////////////////////////////////////
// WRITE UTF-8 TO A FILE
///////////////////////////////////////////////////////////////////
File file = new File("UTF_8_EXAMPLE.TXT");
FileOutputStream fileOutputStream = new FileOutputStream(file);
OutputStreamWriter outputStreamWriter = new OutputStreamWriter(fileOutputStream, "UTF-8");
writer = new BufferedWriter(outputStreamWriter);
writer.write(value);
}
catch(Throwable e) {
throw new RuntimeException(e);
}
finally {
try {
if(writer != null) { writer.close(); }
}
catch(Throwable e) {
throw new RuntimeException(e);
}
}
}
}
I have JSON data that I'm getting with a Java GET request.
I need to count how many age values in the JSON object are greater than 50.
Right now I am just reading the whole JSON data line by line using a BufferedReader, but how do I get each individual age element in the JSON object and compare it with the number 50?
package problem;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.net.URL;
import java.net.URLConnection;
public class Test {
public static void main(String args[])
{
BufferedReader rd;
OutputStreamWriter wr;
try
{
URL url = new URL("https://coderbyte.com/api/challenges/json/age-counting");
URLConnection conn = url.openConnection();
conn.setDoOutput(true);
wr = new OutputStreamWriter(conn.getOutputStream());
wr.flush();
// Get the response
rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
String line;
while ((line = rd.readLine()) != null) {
System.out.println(line);
}
}
catch (Exception e) {
System.out.println(e.toString());
}
}
}
Sample response for JSON data, I need to get the age value as an integer:
{
"data":
"key=IAfpK,
age=58,
key=WNVdi,
age=64,
key=jp9zt,
age=47"
}
You are going to need a JSON library such as Jackson:
ObjectMapper mapper = new ObjectMapper();
try {
    BufferedReader br = new BufferedReader(
            new InputStreamReader(conn.getInputStream()));
    // convert the JSON string back to an object
    Data cdataObj = mapper.readValue(br, Data.class);
    if (cdataObj.age > 50) {
        // count this entry
    }
} catch (IOException e) {
    e.printStackTrace();
}
You would have to map the JSON to a class (Data here), or use a more rudimentary JSON API that reads nodes.
For reference:
https://java2blog.com/jackson-example-read-and-write-json/
You can use any of the popular JSON libraries, such as Jackson or Gson, to parse the value of the key data. The problem then narrows down to: how do you count the ages whose value is greater than 50 in the value string?
Code snippet
String valueStr = "key=IAfpK,age=58,key=WNVdi,age=64,key=jp9zt,age=47";
int count = 0;
for (String part : valueStr.split(",")) {
String[] subparts = part.split("=", 2);
if ("age".equals(subparts[0]) && Integer.valueOf(subparts[1]) > 50) {
count++;
}
}
System.out.print(count);
Console output
2
Without any JSON libraries, try the following code (note that it is C#, not Java):
WebRequest request = WebRequest.Create("https://coderbyte.com/api/challenges/json/age-counting");
WebResponse response = request.GetResponse();
//Console.WriteLine("Content length is {0}", response.ContentLength);
//Console.WriteLine("Content type is {0}", response.ContentType);
// Get the stream associated with the response.
Stream receiveStream = response.GetResponseStream();
// Pipes the stream to a higher level stream reader with the required encoding format.
StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF8);
string _data = readStream.ReadToEnd();
var numbers = _data.Split(',');
var age = numbers.Where(c => c.Contains("age="));
int _total = 0;
foreach (var item in age)
{
string _item = item.Replace("\"}", "");
if (int.Parse(_item.Replace("age=", "")) >= 50){
_total++;
}
}
//Console.WriteLine(readStream.ReadToEnd());
Console.WriteLine(_total);
response.Close();
readStream.Close();
Console.ReadLine();
//final output will be 128
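For readers who want the same library-free idea in Java, here is a rough sketch. It assumes the response looks like the sample shown in the question (age=N entries inside one string) and it counts values strictly greater than the threshold, as the question asks. The class AgeCounter and method countAgesOver are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AgeCounter {
    // Matches entries such as "age=58" and captures the digits.
    private static final Pattern AGE = Pattern.compile("age=(\\d+)");

    // Counts the age=N entries whose value is strictly greater than the threshold.
    static int countAgesOver(String response, int threshold) {
        int count = 0;
        Matcher matcher = AGE.matcher(response);
        while (matcher.find()) {
            if (Integer.parseInt(matcher.group(1)) > threshold) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Sample response from the question, flattened onto one line.
        String sample = "{\"data\":\"key=IAfpK, age=58, key=WNVdi, age=64, key=jp9zt, age=47\"}";
        System.out.println(countAgesOver(sample, 50)); // prints 2
    }
}
```

In the question's program, the lines read from the BufferedReader would first be appended to a single String and then passed to countAgesOver.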
So I'm supposed to write information to a file with UTF-16 encoding, then do some operations (count lines, words, etc.). The problem is that if I choose the UTF-16 encoding, an exception is thrown, but UTF-8 works fine.
import java.io.*;
import java.util.Scanner;
public final class Q4 {
public static void main(String[ ] args)throws FileNotFoundException{
final String ENCODING = "UTF-16";
final String FILE = " testcount";
PrintWriter out = null;
// Given code – do not modify(!) This will create the UTF-16 test file on your drive.
try {
out = new PrintWriter(FILE, ENCODING);
out.write("Test file for UTF-16\n" + "(contains surrogate pairs:\n" +
"Musical symbols in the range 1D100–1D1FF)\n\n");
out.write("F-clef (1D122): \uD834\uDD22\tCrotchet (1D15F): \uD834\uDD5F\n");
out.write("G-clef (1D120): \uD834\uDD20\tSemiquaver (1D161): \uD834\uDD61\n");
out.write("\n(? lines, ?? words, ??? chars but ??? code points)\n");
} catch (IOException e) { System.out.println("uh? cannot write to file!");
} finally { if (out != null) out.close();
}
// Your code – scan the test file and count lines, words, characters, and code points.
Scanner fin = new Scanner(new File(FILE));
String s = "";
//get the data in file
while (fin.hasNext()){
s = s + fin.next();
System.out.println(s);
}
fin.close();
//count words and lines
}
}
My only guess, a far-fetched one, is that it has something to do with the OS (Windows 8.1) not being able to save a UTF-16 file, but that sounds like a silly guess.
Specify the encoding when you read the file:
Scanner fin = new Scanner(new File(FILE), ENCODING);
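Once the same encoding is used for writing and reading, the counting part of the exercise could be sketched roughly as follows. This is not the assignment's reference solution: Utf16Counts and count are made-up names, a "word" is assumed to be any run of non-whitespace characters, and line terminators are not counted as characters. The difference between chars and code points comes from surrogate pairs, which take two chars each.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

class Utf16Counts {
    // Returns {lines, words, chars, codePoints} for the given lines of text.
    static int[] count(List<String> lines) {
        int words = 0;
        int chars = 0;
        int codePoints = 0;
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                words += trimmed.split("\\s+").length;
            }
            chars += line.length();                              // UTF-16 code units
            codePoints += line.codePointCount(0, line.length()); // a surrogate pair counts once
        }
        return new int[] {lines.size(), words, chars, codePoints};
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("testcount", ".txt");
        try {
            // Write and read with the same charset, as in the fix above.
            Files.write(file, Arrays.asList("F-clef \uD834\uDD22", "", "a b c"), StandardCharsets.UTF_16);
            int[] c = count(Files.readAllLines(file, StandardCharsets.UTF_16));
            System.out.println(c[0] + " lines, " + c[1] + " words, " + c[2] + " chars, " + c[3] + " code points");
        } finally {
            Files.delete(file);
        }
    }
}
```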
I've got a text file called log.txt.
It's got the following data
1,,Mon May 05 00:05:45 WST 2014,textFiles/a.txt,images/download.jpg
2,,Mon May 05 00:05:45 WST 2014,textFiles/a.txt,images/download.jpg
The numbers before the first comma are indexes that specify each item.
What I want to do is to read the file and then replace one part of the string(e.g. textFiles/a.txt) in a given line with another value(e.g. something/bob.txt).
This is what I have so far:
File log= new File("log.txt");
String search = "1,,Mon May 05 00:05:45 WST 2014,textFiles/a.txt,images/download.jpg";
//file reading
FileReader fr = new FileReader(log);
String s;
try (BufferedReader br = new BufferedReader(fr)) {
while ((s = br.readLine()) != null) {
if (s.equals(search)) {
//not sure what to do here
}
}
}
You could build a string with the whole file content, replace every occurrence in that string, and write the result back to the file.
You could do something like this:
File log= new File("log.txt");
String search = "textFiles/a.txt";
String replace = "replaceText/b.txt";
try{
FileReader fr = new FileReader(log);
String s;
String totalStr = "";
try (BufferedReader br = new BufferedReader(fr)) {
while ((s = br.readLine()) != null) {
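totalStr += s + System.lineSeparator(); // readLine() strips the line terminator, so add it back
}
totalStr = totalStr.replace(search, replace); // literal match; replaceAll would treat search as a regex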
FileWriter fw = new FileWriter(log);
fw.write(totalStr);
fw.close();
}
}catch(Exception e){
e.printStackTrace();
}
One approach would be to use String.replaceAll():
File log= new File("log.txt");
String search = "textFiles/a\\.txt"; // <- changed to work with String.replaceAll()
String replacement = "something/bob.txt";
//file reading
FileReader fr = new FileReader(log);
String s;
try (BufferedReader br = new BufferedReader(fr)) {
    while ((s = br.readLine()) != null) {
        s = s.replaceAll(search, replacement); // Strings are immutable, so keep the returned value
        // do something with the resulting line
    }
}
You could also use regular expressions, or String.indexOf() to find where in a line your search string appears.
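Since the question asks to change one particular line, here is a sketch that only touches the line whose leading index matches. It swaps in String.startsWith and the literal String.replace rather than a regular expression or indexOf. LogLineReplacer and replaceInLine are hypothetical names, and the index is assumed to be the text before the first comma.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LogLineReplacer {
    // Replaces search with replacement only in lines whose first comma-separated field equals index.
    static List<String> replaceInLine(List<String> lines, String index, String search, String replacement) {
        List<String> result = new ArrayList<>();
        for (String line : lines) {
            if (line.startsWith(index + ",")) {
                result.add(line.replace(search, replacement)); // literal match, no regex escaping needed
            } else {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> log = Arrays.asList(
                "1,,Mon May 05 00:05:45 WST 2014,textFiles/a.txt,images/download.jpg",
                "2,,Mon May 05 00:05:45 WST 2014,textFiles/a.txt,images/download.jpg");
        for (String line : replaceInLine(log, "2", "textFiles/a.txt", "something/bob.txt")) {
            System.out.println(line);
        }
    }
}
```

The resulting list can then be written back to log.txt, for example with Files.write.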
Solution with Java Files and Stream
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
private static void replaceAll(String filePath, String text, String replacement) {
    Path path = Paths.get(filePath);
    // Get all the lines
    try (Stream<String> stream = Files.lines(path, StandardCharsets.UTF_8)) {
        // Do the replace operation
        List<String> list = stream.map(line -> line.replace(text, replacement)).collect(Collectors.toList());
        // Write the content back
        Files.write(path, list, StandardCharsets.UTF_8);
    } catch (IOException e) {
        System.err.println("IOException for : " + path);
        e.printStackTrace();
    }
}
Usage
replaceAll("test.txt", "original text", "new text");
A very simple solution would be to use:
s = s.replace( "textFiles/a.txt", "something/bob.txt" );
Note that replace already substitutes every occurrence of the literal text. Use replaceAll, shown in another answer, only when you need a regular expression, and take care to escape all special characters, as indicated there.