Torrent, query URL encoding using java - java

I am writing my own torrent client and am stuck at the point where I need to send a GET request containing the info hash. The request has to be formatted correctly. As it turned out, URLEncoder does not do what its name suggests here, and the other approaches I know do not get me to the goal either. (Sorry for the bad English.)
I am trying to do it without third-party libraries.
As far as I can tell, I need a "Conversion from hexadecimal representation to the bytestring value of the hash", but my attempts to do so do not give the desired result.
I found these answers and a few others, but they were all in other programming languages, and I could not understand them or reproduce them in my code.
link vb.net
link rust
I also found the BitTorrent library, but even using its encoding method did not get my program working.
UPD 1:
Info hash that I get when bencoding: 0a85522a2f09e42f3d63a89a0d45e4589f8b904c
Here's what I see in Wireshark:
https://bt.toloka.to/announce/h=IT5FwgeUF1& (The tracker blocks most countries, so if you want to check, use a VPN (I recommend the Netherlands))
&info_hash=%0A%85R%2A%2F%09%E4%2F%3Dc%A8%9A%0DE%E4X%9F%8B%90L
&peer_id=-UT360W-%FE%B5%95%1A%88%0A%DF%97K%E9%FD%23
&port=19708
&uploaded=0
&downloaded=0
&left=16421367202
&corrupt=0
&key=A36E3AE9
&event=stopped
&numwant=0
&compact=1
&no_peer_id=1
It encodes info hash as follows: %0A%85R%2A%2F%09%E4%2F%3Dc%A8%9A%0DE%E4X%9F%8B%90L
UPD 2:
My problem is that I can't implement the URL encoding. I need to convert
this:
0a85522a2f09e42f3d63a89a0d45e4589f8b904c
Into this:
%0a%85R%2a%2f%09%e4%2f%3dc%a8%9a%0dE%e4X%9f%8b%90L
I tried to adapt code from other answers on Stack Overflow, but I did not get anything sensible to work.
String a = "0a85522a2f09e42f3d63a89a0d45e4589f8b904c";
byte[] hash = a.getBytes(StandardCharsets.UTF_8);
StringBuilder res = new StringBuilder();
for(char element : a.toCharArray()){
if(Character.getNumericValue(element) <= 127){
char[] result = URLEncoder.encode(String.valueOf(element), String.valueOf(StandardCharsets.UTF_8)).toCharArray();
if(result[0] == '%'){
res.append(toLowerCase(result));
}else{
char[] reinfo = new char[result.length + 1];
reinfo[0] = '%';
for(int i = 0; i < result.length; i++){
reinfo[i + 1] = result[i];
}
res.append(toLowerCase(reinfo));
}
}
}

An infohash is 20 bytes of binary data; the hex string (40 characters) is already an encoded form of it. So the simplest approach is to decode the hex into bytes and percent-encode all of them, whether or not they could be represented as ASCII.
For example, the bytes 0x85 0x52 can be encoded as %85R or as %85%52. The latter is less space-efficient but simpler to implement as a starting point.
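A minimal sketch of the percent-encode-every-byte approach, assuming the info hash arrives as a 40-character hex string. The class and method names (InfoHashEncoder, percentEncodeAll) are made up for illustration and do not come from any library.

```java
class InfoHashEncoder {
    // Hypothetical helper: percent-encodes every byte of a hex-encoded hash.
    static String percentEncodeAll(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        StringBuilder out = new StringBuilder(hex.length() / 2 * 3);
        for (int i = 0; i < hex.length(); i += 2) {
            // Two hex digits make one byte value (0..255); parseInt also rejects non-hex input.
            int b = Integer.parseInt(hex.substring(i, i + 2), 16);
            out.append('%').append(String.format("%02x", b));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(percentEncodeAll("0a85522a2f09e42f3d63a89a0d45e4589f8b904c"));
    }
}
```

For the hash in the question this yields %0a%85%52%2a%2f%09%e4%2f%3d%63%a8%9a%0d%45%e4%58%9f%8b%90%4c, which percent-decodes to the same 20 bytes as the shorter form.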

Update 1:
Okay, I realized another problem: URLEncoder only encodes the hash correctly if the byte array is converted to a String as ISO8859_1 and then encoded with that same charset. With UTF-8 or ASCII, bytes above 0x7F are not preserved, so URLEncoder cannot encode them correctly. Therefore, if we have the original byte array, we can write the following:
URLEncoder.encode(new String( <your byte array> , "ISO8859_1"), "ISO8859_1");
Input data:
array = [10, -123, 82, 42, 47, 9, -28, 47, 61, 99, -88, -102, 13, 69, -28, 88, -97, -117, -112, 76].
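Result (URLEncoder writes uppercase hex digits and leaves * (0x2A) unescaped; the string still decodes to the same bytes):
%0A%85R*%2F%09%E4%2F%3Dc%A8%9A%0DE%E4X%9F%8B%90L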
If the hash is only available as a hex string, the encoding logic has to be written by hand. There are several ways to do this, and I show one of them below. If you need more compact code, you can use a "for" or "while" loop; an example of how to do this is here
Primitive implementation of url encoding:
I did manage to do it in Java. As I noticed, only the characters a-z, A-Z and 0-9 appear unescaped in info_hash, so all you have to do is convert each hex pair that stands for a letter into that letter and put % in front of every other pair. (there is necessary information here)
Here's the code I got:
public static String encodeURL(char[] element){
StringBuilder result = new StringBuilder();
// Consume two hex digits (one byte) per iteration.
for(int i = 0; i + 1 < element.length; i += 2){
result.append(encode("" + element[i] + element[i + 1]));
}
return result.toString();
}
private static String encode(String sumChar){
switch (sumChar){
case "41": return "A";
case "42": return "B";
case "43": return "C";
case "44": return "D";
case "45": return "E";
case "46": return "F";
case "47": return "G";
case "48": return "H";
case "49": return "I";
case "4A":
case "4a": return "J";
case "4B":
case "4b": return "K";
case "4C":
case "4c": return "L";
case "4D":
case "4d": return "M";
case "4E":
case "4e": return "N";
case "4F":
case "4f": return "O";
case "50": return "P";
case "51": return "Q";
case "52": return "R";
case "53": return "S";
case "54": return "T";
case "55": return "U";
case "56": return "V";
case "57": return "W";
case "58": return "X";
case "59": return "Y";
case "5A":
case "5a": return "Z";
case "61": return "a";
case "62": return "b";
case "63": return "c";
case "64": return "d";
case "65": return "e";
case "66": return "f";
case "67": return "g";
case "68": return "h";
case "69": return "i";
case "6A":
case "6a": return "j";
case "6B":
case "6b": return "k";
case "6C":
case "6c": return "l";
case "6D":
case "6d": return "m";
case "6E":
case "6e": return "n";
case "6F":
case "6f": return "o";
case "70": return "p";
case "71": return "q";
case "72": return "r";
case "73": return "s";
case "74": return "t";
case "75": return "u";
case "76": return "v";
case "77": return "w";
case "78": return "x";
case "79": return "y";
case "7A":
case "7a": return "z";
default: return "%" + sumChar;
}
}
Input data:
0a85522a2f09e42f3d63a89a0d45e4589f8b904c
Outcome:
%0a%85R%2a%2f%09%e4%2f%3dc%a8%9a%0dE%e4X%9f%8b%90L
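The same idea can be written without the long switch. This is only a sketch under the assumption that the input is a valid hex string of even length; CompactInfoHashEncoder and encodeHex are hypothetical names. Unlike the switch, it also leaves the digits 0-9 unescaped.

```java
class CompactInfoHashEncoder {
    // Hypothetical helper: turns a hex-encoded hash into its percent-encoded form,
    // writing ASCII letters and digits literally and escaping every other byte.
    static String encodeHex(String hex) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i + 1 < hex.length(); i += 2) {
            String pair = hex.substring(i, i + 2);
            int b = Integer.parseInt(pair, 16); // value of this byte, 0..255
            boolean alnum = (b >= 'a' && b <= 'z')
                    || (b >= 'A' && b <= 'Z')
                    || (b >= '0' && b <= '9');
            if (alnum) {
                out.append((char) b);          // e.g. "52" becomes R
            } else {
                out.append('%').append(pair);  // e.g. "0a" becomes %0a
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeHex("0a85522a2f09e42f3d63a89a0d45e4589f8b904c"));
    }
}
```

For the input above this prints %0a%85R%2a%2f%09%e4%2f%3dc%a8%9a%0dE%e4X%9f%8b%90L.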
Thank you for the suggestions and advice to those who wanted to help.

Related

'!' is accepted in an int method and has two different values java

Can someone explain to me how my method here in Java, which expects an int, accepts '!' as an argument? Furthermore, how can it interpret it as 33 when I debug it, while System.out.println(Character.getNumericValue('!')); prints -1?
Here is the code:
public abstract class Stuff {
public static char getCharacterFromNumber(int number) throws InvalidCharException {
if(number>=20) {
if (number <= 45) {
switch (number) {
case 20:
return 'a';
case 21:
return 'b';
case 22:
return 'c';
case 23:
return 'd';
case 24:
return 'e';
case 25:
return 'f';
case 26:
return 'g';
case 27:
return 'h';
case 28:
return 'i';
case 29:
return 'j';
case 30:
return 'k';
case 31:
return 'l';
case 32:
return 'm';
case 33:
return 'n';
case 34:
return 'o';
case 35:
return 'p';
case 36:
return 'q';
case 37:
return 'r';
case 38:
return 's';
case 39:
return 't';
case 40:
return 'u';
case 41:
return 'v';
case 42:
return 'w';
case 43:
return 'x';
case 44:
return 'y';
case 45:
return 'z';
}
}
}
throw new InvalidCharException();
}
public static void main(String [] args){
try {
System.out.println(Stuff.getCharacterFromNumber('!'));
} catch (InvalidCharException e) {
e.printStackTrace();
}
System.out.println(Character.getNumericValue('!'));
}
}
I have searched but haven't found anything similar to my problem, and if someone has a better idea for the title I'd appreciate it :)
Character literals are nothing more than pure values, but with different types. 48 represents exactly the same value as '0' (though the first one is of a type int and the second one is of a type char). 49 also represents the same value as '1'. That's why 'a' + 1 will result in a value equal to 'b'.
What you are confused about is why this:
System.out.println(Character.getNumericValue('!'));
prints -1. Well, to find out why, let's look at the Javadoc. We can see that the method Character.getNumericValue:
Returns the int value that the specified character (Unicode code point) represents.
It does not return the Unicode value of a character. If you want to see what number represents the character literal '!' you might want to do it like so:
System.out.println((int)'!');
Which will print: 33.
The character '!' has an ASCII value of 33. Java allows for a char value to be widened to an int, which explains why you can pass a char into a method expecting an int.
However, Character.getNumericValue does something different.
Returns the int value that the specified Unicode character represents.
(bold emphasis mine)
That is '1' returns 1, whereas the ASCII code is 49. If the character doesn't represent a numeric value:
If the character does not have a numeric value, then -1 is returned.
You get 2 different values because the ASCII code and the numeric value are two different concepts.
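The difference between the two concepts can be seen side by side in a small sketch. The class name and the echo method are hypothetical stand-ins for a method that takes an int, such as getCharacterFromNumber in the question.

```java
class CharVersusNumericValue {
    // Hypothetical method with an int parameter; a char argument is widened automatically.
    static int echo(int number) {
        return number;
    }

    public static void main(String[] args) {
        System.out.println(echo('!'));                       // 33, the character code
        System.out.println((int) '1');                       // 49, the character code
        System.out.println(Character.getNumericValue('1'));  // 1, the digit it represents
        System.out.println(Character.getNumericValue('a'));  // 10, letters count as digits 10..35
        System.out.println(Character.getNumericValue('!'));  // -1, no numeric value
    }
}
```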

Why does this Java Switch-Case not work?

So, all variables in the conditions are static strings. type itself is a string in fact.
switch(type) {
case (INT || TINYINT):
preparedStatement = setInteger(preparedStatement, value, index);
break;
case (BIGINT || LONG):
preparedStatement = setLong(preparedStatement, value, index);
break;
case (DATETIME || TIMESTAMP):
preparedStatement = setTimestamp(preparedStatement, value, index);
break;
case (MEDIUMTEXT || ENUM || TEXT || LONGTEXT || VARCHAR):
preparedStatement = setString(preparedStatement, value, index);
break;
}
First, switch statements on strings are supported in Java 7+, but not in Java 6 and before.
Next, the || operator (the logical-OR operator) only works on boolean values, not String values. But you can get the same code to be run on multiple cases by listing the cases and not breaking until past the relevant code:
switch(type) {
case INT:
case TINYINT:
// This code will run for INT and TINYINT only.
preparedStatement = setInteger(preparedStatement, value, index);
break;
case BIGINT:
case LONG:
// This code will run for BIGINT and LONG only.
preparedStatement = setLong(preparedStatement, value, index);
break;
// etc.
Java 7 example:
public String getTypeOfDayWithSwitchStatement(String dayOfWeekArg) {
String typeOfDay;
switch (dayOfWeekArg) {
case "Monday":
typeOfDay = "Start of work week";
break;
case "Tuesday":
case "Wednesday":
case "Thursday":
typeOfDay = "Midweek";
break;
case "Friday":
typeOfDay = "End of work week";
break;
case "Saturday":
case "Sunday":
typeOfDay = "Weekend";
break;
default:
throw new IllegalArgumentException("Invalid day of the week: " + dayOfWeekArg);
}
return typeOfDay;
}
Further I have never seen an OR statement inside of a switch like that. I would highly recommend not doing that.
Assuming you are using Java SE 7 (or later) and the constants are static final Strings, then case (INT || TINYINT) is not valid Java syntax. The valid form is:
case INT: case TINYINT:
What does this expression evaluate to?
INT || TINYINT
What are the datatypes for INT and TINYINT?
I've only ever seen switch used with some primitives (and new in Java 7, string) literals or variables declared as final.
If this isn't throwing a compile error, then the || operator must be defined for whatever datatype those are. But unless that's somehow being resolved at compile time, that operator is not going to be allowed. (Again, this might be something new in Java 7 I'm not aware of.)
If you are trying to do "or" logic, the normative pattern (in pre-7 versions of Java at least), is:
switch(type) {
case INT:
case TINYINT:
preparedStatement = setInteger(preparedStatement, value, index);
break;
case BIGINT:
case LONG:
preparedStatement =
break;
Switching on strings is supported in Java 7 and later.
You cannot use logical operators in switch statements, even with Strings. You can only test one case at a time.
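A self-contained sketch of the grouped-label pattern for String constants, assuming Java 7 or later. TypeDispatch and setterFor are hypothetical names, the returned strings merely stand in for the setInteger/setLong/setString calls from the question, and sending everything else to "setString" is an assumption made to keep the example short.

```java
class TypeDispatch {
    // Case labels must be compile-time constants, hence static final Strings.
    static final String INT = "INT";
    static final String TINYINT = "TINYINT";
    static final String BIGINT = "BIGINT";
    static final String LONG = "LONG";

    // Hypothetical stand-in: returns the name of the setter that would be called.
    static String setterFor(String type) {
        switch (type) {
            case INT:
            case TINYINT:
                // Reached for INT and TINYINT: the first label falls through to the second.
                return "setInteger";
            case BIGINT:
            case LONG:
                return "setLong";
            default:
                return "setString";
        }
    }

    public static void main(String[] args) {
        System.out.println(setterFor("TINYINT")); // setInteger
        System.out.println(setterFor("LONG"));    // setLong
    }
}
```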

Java Switch Statement - Is "or"/"and" possible?

I implemented a font system that finds out which letter to use via char switch statements. There are only capital letters in my font image. I need to make it so that, for example, 'a' and 'A' both have the same output. Instead of having 2x the amount of cases, could it be something like the following:
char c;
switch(c){
case 'a' & 'A': /*get the 'A' image*/; break;
case 'b' & 'B': /*get the 'B' image*/; break;
...
case 'z' & 'Z': /*get the 'Z' image*/; break;
}
Is this possible in java?
You can use switch-case fall through by omitting the break; statement.
char c = /* whatever */;
switch(c) {
case 'a':
case 'A':
//get the 'A' image;
break;
case 'b':
case 'B':
//get the 'B' image;
break;
// (...)
case 'z':
case 'Z':
//get the 'Z' image;
break;
}
...or you could just normalize to lower case or upper case before switching.
char c = Character.toUpperCase(/* whatever */);
switch(c) {
case 'A':
//get the 'A' image;
break;
case 'B':
//get the 'B' image;
break;
// (...)
case 'Z':
//get the 'Z' image;
break;
}
Above, you mean OR not AND. Example of AND: 110 & 011 == 010 which is neither of the things you're looking for.
For OR, just have 2 cases without the break on the 1st. Eg:
case 'a':
case 'A':
// do stuff
break;
The above are all excellent answers. I just wanted to add that when there are multiple characters to check against, an if-else might turn out better since you could instead write the following.
// switch on vowels, digits, punctuation, or consonants
char c; // assign some character to 'c'
if ("aeiouAEIOU".indexOf(c) != -1) {
// handle vowel case
} else if ("!##$%,.".indexOf(c) != -1) {
// handle punctuation case
} else if ("0123456789".indexOf(c) != -1) {
// handle digit case
} else {
// handle consonant case, assuming other characters are not possible
}
Of course, if this gets any more complicated, I'd recommend a regex matcher.
Observations on an interesting Switch case trap --> fall through of switch
"The break statements are necessary because without them, statements in switch blocks fall through:"
Java Doc's example
Snippet of consecutive case without break:
char c = 'A';/* switch with lower case */;
switch(c) {
case 'a':
System.out.println("a");
case 'A':
System.out.println("A");
break;
}
O/P for this case is:
A
But if you change the value of c, i.e., char c = 'a';, then this gets interesting.
O/P for this case is:
a
A
Even though the 2nd case test fails, program goes onto print A, due to missing break which causes switch to treat the rest of the code as a block. All statements after the matching case label are executed in sequence.
From what I understand about your question, before passing the character into the switch statement, you can convert it to lowercase. So you don't have to worry about upper cases because they are automatically converted to lower case.
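For that, use the function below and assign its result back to the variable, because it returns the converted character instead of changing c:
c = Character.toLowerCase(c);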
Enhanced switch with arrow syntax and several labels per case (a preview feature in Java 12 and 13, standard since Java 14):
char c;
switch (c) {
case 'A', 'a' -> {} // c is either 'A' or 'a'.
case ...
}
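A runnable sketch of the arrow form, assuming Java 14 or later, where switch expressions are a standard feature. FontLookup and glyphFor are hypothetical names, and the returned strings stand in for the font images from the question.

```java
class FontLookup {
    // Hypothetical helper: maps a letter of either case to the name of its glyph image.
    static String glyphFor(char c) {
        return switch (c) {
            case 'a', 'A' -> "A";   // several labels share one branch, no fall-through needed
            case 'b', 'B' -> "B";
            case 'z', 'Z' -> "Z";
            default -> "?";         // letters omitted from this sketch end up here
        };
    }

    public static void main(String[] args) {
        System.out.println(glyphFor('a')); // A
        System.out.println(glyphFor('B')); // B
    }
}
```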

Converting the integer representation of a word back to a String

This is a homework assignment and I'm having trouble with my output. Everything works as expected except the Integer.toString() isn't giving me the result I want. It is still outputting just a bunch of numbers when I want them to be converted to the actual word. Here's the code and output:
import java.io.*;
public class NumStream extends OutputStream
{
public void write(int c) throws IOException
{
StringBuffer sb = new StringBuffer();
switch(c)
{
case ' ': sb.append(" ");
break;
case '1': sb.append("One");
break;
case '2': sb.append("Two");
break;
case '3': sb.append("Three");
break;
case '4': sb.append("Four");
break;
case '5': sb.append("Five");
break;
case '6': sb.append("Six");
break;
case '7': sb.append("Seven");
break;
case '8': sb.append("Eight");
break;
case '9': sb.append("Nine");
break;
case '0': sb.append("Zero");
break;
default: sb.append(Integer.toString(c));
break;
}
System.out.print(sb);
}
public static void main(String[] args)
{
NumStream ns = new NumStream();
PrintWriter pw = new PrintWriter(new OutputStreamWriter(ns));
pw.println("123456789 and ! and # ");
pw.flush();
}
}
the output is: OneTwoThreeFourFiveSixSevenEightNine 97110100 33 97110100 35 1310
can somebody please tell me how to format code easier in this forum? I had to manually 8 space indent each line and there's got to be an easier way!
For characters that aren't digits, you're taking the character code and converting it to a number. So 97 110 and 100 are the character codes for 'a', 'n', and 'd' while 33 and 35 are ! and #.
What you probably want for your default case is just:
default: sb.append((char)c); break;
Note that creating a new StringBuffer each time the write routine is called is extremely wasteful and inefficient. Since you're only ever appending one string/char to it, you might as well just print that string/char directly rather than copying through a StringBuffer.
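Both suggestions (casting to char in the default case and printing directly instead of going through a StringBuffer) could be combined roughly as follows. This is a sketch, not the assignment's reference solution; WordStream, WORDS and translate are hypothetical names, and the input is assumed to be ASCII.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;

class WordStream extends OutputStream {
    private static final String[] WORDS = {
        "Zero", "One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine"
    };

    // Hypothetical helper: the text to print for one incoming byte.
    static String translate(int c) {
        if (c >= '0' && c <= '9') {
            return WORDS[c - '0'];               // digit: spell it out
        }
        return String.valueOf((char) (c & 0xFF)); // anything else: keep the character itself
    }

    @Override
    public void write(int c) throws IOException {
        System.out.print(translate(c));           // print directly, no StringBuffer
    }

    public static void main(String[] args) {
        PrintWriter pw = new PrintWriter(new OutputStreamWriter(new WordStream()));
        pw.println("123456789 and ! and # ");
        pw.flush();
    }
}
```

With this change the words "and", "!" and "#" and the line break come out as characters instead of character codes.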
You are outputting the ASCII code of characters which are not digits in sb.append(Integer.toString(c)).

is there a utility method for this in commons-lang?

I searched for some time but I couldn't find any.
boolean isAlpha(final char character)
{
char c = Character.toUpperCase(character);
switch (c)
{
case 'A':
case 'B':
case 'C':
case 'D':
case 'E':
case 'F':
case 'G':
case 'H':
case 'I':
case 'J':
case 'K':
case 'L':
case 'M':
case 'N':
case 'O':
case 'P':
case 'Q':
case 'R':
case 'S':
case 'T':
case 'U':
case 'V':
case 'W':
case 'X':
case 'Y':
case 'Z':
return true;
default:
return false;
}
}
Commons Lang has CharUtils.isAsciiAlpha, but perhaps you could just use java.lang.Character.isLetter(char) (javadoc). Not quite the same (the latter matches more than just A-Z ASCII), but may be enough for your needs.
I know this is not from lang, but how about return (c >= 'A' && c <= 'Z')?
You could use StringUtils.isAlpha
That switch is pretty verbose, if I had to write it myself I'd make something like:
boolean isAlpha(final char c) {
return "abcdefghijklmnopqrstuvwxyz".indexOf(Character.toLowerCase(c)) != -1;
}
You want CharUtils.isAsciiAlpha.
It should be faster than StringUtils.isAlpha(String) because you're not creating a new String object.
You avoid the cost of converting to an uppercase char in your original method.
It's more readable than range checks (which is how it's implemented).
java.lang.Character.isLetter(char) will return true for certain non-Latin characters for which your method returns false.
How about Character.isLetter()?
If you simply want to check whether the given character is somewhere between A-Z, an easier way to do this would be to use regular expressions:
Pattern.matches("[A-Z]", input)
Where input is a CharSequence. More information on the Java Pattern class: http://download.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html
Don't know how this would compare performance wise to the other options though.
The Character class provides many useful APIs. You need not convert the character. A few examples are:
Character.isLetter(char ch)
Character.isLowerCase(char ch)
Character.isUpperCase(char ch)
Character.isDigit(char ch)
Character.isLetterOrDigit(char ch)
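To illustrate how an ASCII-only check differs from Character.isLetter, here is a small sketch. AlphaCheck and its isAsciiAlpha method are hypothetical and written by hand as a plain range check; they are not the Commons Lang implementation.

```java
class AlphaCheck {
    // Hypothetical helper: true only for the ASCII letters A-Z and a-z.
    static boolean isAsciiAlpha(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    public static void main(String[] args) {
        char eAcute = '\u00e9'; // e with an acute accent, a letter outside ASCII
        System.out.println(isAsciiAlpha('q'));          // true
        System.out.println(isAsciiAlpha('7'));          // false
        System.out.println(isAsciiAlpha(eAcute));       // false
        System.out.println(Character.isLetter(eAcute)); // true
    }
}
```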
