Fatal Error :1:40: Content is not allowed in prolog - java
I have a super simple XML document encoded in UTF-16 LE.
<?xml version="1.0" encoding="utf-16"?><X id="1" />
I'm loading it in as such (using jcabi-xml):
BOMInputStream bomIn = new BOMInputStream(Main.class.getResourceAsStream("resources/test.xml"), ByteOrderMark.UTF_16LE);
String firstNonBomCharacter = Character.toString((char)bomIn.read());
Reader reader = new InputStreamReader(bomIn, "UTF-16");
String xmlString = IOUtils.toString(reader);
xmlString = xmlString.trim();
xmlString = firstNonBomCharacter + xmlString;
bomIn.close();
reader.close();
final XML xml = new XMLDocument(xmlString);
I have checked that there are no extra BOM/junk symbols (leading or anywhere) by saving out the file and inspecting it with a hex editor. The XML is properly formatted.
However, I still get the following error:
[Fatal Error] :1:40: Content is not allowed in prolog.
Exception in thread "main" java.lang.IllegalArgumentException: Invalid XML: "<?xml version="1.0" encoding="utf-16"?><X id="1" />"
at com.jcabi.xml.DomParser.document(DomParser.java:115)
at com.jcabi.xml.XMLDocument.<init>(XMLDocument.java:155)
at Main.getTransformedString(Main.java:47)
at Main.main(Main.java:26)
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 40; Content is not allowed in prolog.
at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
at javax.xml.parsers.DocumentBuilder.parse(Unknown Source)
at com.jcabi.xml.DomParser.document(DomParser.java:105)
... 3 more
I have googled up and down for this error but they all say that it's the BOM's fault, which I have confirmed (to the best of my knowledge) to not be the case. What else could be wrong?
The following works for me:
try (InputStream stream = Test.class.getResourceAsStream("/Test.xml")) {
    StreamSource source = new StreamSource(stream);
    final XML xml = new XMLDocument(source);
}
With the input file's hex dump:
FF FE 3C 00 3F 00 78 00 6D 00 6C 00 20 00 76 00 65 00 72 00 73 00 69 00
6F 00 6E 00 3D 00 27 00 31 00 2E 00 30 00 27 00 20 00 65 00 6E 00 63 00
6F 00 64 00 69 00 6E 00 67 00 3D 00 27 00 55 00 54 00 46 00 2D 00 31 00
36 00 27 00 3F 00 3E 00 3C 00 58 00 20 00 69 00 64 00 3D 00 22 00 31 00
22 00 2F 00 3E 00
As far as I can tell, your example converts the contents of the file to a String. That is problematic because the encoding is thrown away when the bytes are converted to a String. When the SAX parser converts the String back to a byte array, it decides on UTF-8, but the prolog states UTF-16, so the declared encoding and the actual bytes disagree.
When I use the StreamSource instead, the parser automatically detects from the BOM that the file is encoded in UTF-16 LE.
If you are not on Java 7 or later and cannot use try-with-resources, call stream.close() as before.
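The point about letting the parser see the raw bytes can be reproduced with the JDK's own DOM parser alone. What follows is a minimal sketch, not jcabi-xml code: the class name Utf16XmlDemo and its helper methods are made up for illustration, and the document is built in memory instead of being loaded from a resource.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class; only JDK APIs are used.
class Utf16XmlDemo {

    // Encodes the text as UTF-16LE and puts the FF FE byte order mark in front.
    static byte[] utf16LeWithBom(String xml) throws Exception {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0xFF);
        out.write(0xFE);
        out.write(xml.getBytes(StandardCharsets.UTF_16LE));
        return out.toByteArray();
    }

    // Hands the raw bytes to the parser so it can work out the encoding itself.
    static Document parseBytes(byte[] bytes) throws Exception {
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(in);
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"utf-16\"?><X id=\"1\" />";
        Document doc = parseBytes(utf16LeWithBom(xml));
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(doc.getDocumentElement().getAttribute("id"));
    }
}
```

The design choice is simply to never turn the bytes into a String before parsing, so the BOM and the encoding declaration are interpreted by the same component.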
Byte conversion
I have two img file. Origin (2GB) and Destination (4GB), They are the result of some sort of encoding wich I'm trying to identify and revert. So in order to successfully revert encoding I have to see if I'm able to obtain again Origin from the Destionation file I've built a table that show that Origin has 256 types of bytes and Destination has 256 types of bytes-pair. Here is the list of the bytes converted in Hex of Origin with occurrency. FF=24575615 FE=3242667 FD=3009202 FC=3063146 FB=3003652 FA=3025947 F9=3005543 F8=7684326 F7=4554041 F6=2933185 F5=3373967 F4=5597006 F3=2906784 F2=3789554 9F=3102630 9E=3005388 F1=3557574 F0=4365911 9D=3078506 9C=2840242 9B=2763692 9A=2804976 EF=2941117 EE=3025616 99=2877085 ED=2902961 98=3028895 EC=2817617 97=2752245 EB=3333926 96=2789702 EA=2850121 95=2989513 94=3031653 93=2911830 92=2658657 91=2728002 90=3419534 E9=2887403 E8=3208952 E7=3285198 E6=2644790 E5=4609467 E4=2650016 E3=4372245 8F=2991145 E2=3368100 E1=5113630 8E=2575537 E0=9155599 8D=3578967 8C=3038052 8B=2921954 8A=2675041 DF=2917213 DE=2560516 89=2736502 DD=2625394 88=3270888 DC=2599744 87=3366265 DB=2698959 86=2899131 DA=2673989 85=3330569 84=3367665 83=3421457 82=3444192 81=3864339 80=6354686 D9=2792340 D8=3572281 D7=2917209 D6=2502705 D5=2726792 D4=2599407 D3=2526731 7F=3667154 D2=2594634 D1=3798179 7E=2752138 D0=5792504 7D=2931975 7C=2876880 7B=3192909 7A=3348958 CF=2842460 CE=2904295 79=4933142 CD=2468499 78=4201043 CC=2551223 77=4251200 CB=2410778 76=5307097 CA=2417649 75=7217741 74=15428931 73=12268233 72=14409973 71=4741548 70=9798438 C9=2359024 C8=2549326 C7=2608153 C6=2524731 C5=2483222 C4=2848155 C3=3696683 6F=15455489 C2=2971749 6E=14311776 C1=2383297 6D=8538221 C0=3270606 6C=10639469 6B=4601490 6A=3337833 BF=3527482 BE=3305589 69=15717960 BD=3364649 68=6544569 BC=2989446 67=7873918 BB=2867947 66=5310067 BA=2996525 65=22005763 64=10819109 63=10271386 62=5649243 61=17118578 60=3714590 B9=2931805 B8=3617901 B7=2980605 B6=2841578 B5=3470008 B4=3329220 
B3=2808383 5F=7462619 B2=3022737 5E=2545337 B1=3328536 B0=4808034 5D=3011851 5C=2786455 5B=3763489 5A=3363499 AF=3138318 AE=3058472 59=3023985 AD=2753771 58=3200666 AC=2718493 57=3198750 AB=2727749 56=3157681 AA=3016716 55=3625987 54=7058037 53=6318637 52=5403634 51=2927288 50=5225038 A9=2758574 A8=3190446 A7=2891160 A6=2873612 A5=3024935 A4=3732070 A3=2715548 4F=4252264 A2=2423484 4E=4458144 A1=2799897 4D=4589889 A0=4347937 4C=5262566 4B=4257717 4A=3099467 49=5937076 48=3346052 47=3830489 46=6790552 45=6137365 44=5804764 43=5414206 42=4114199 41=5409554 40=4442287 3F=3156472 3E=3225065 3D=4457800 3C=3929336 3B=4066190 3A=9022387 39=6277213 38=8240388 37=6495438 36=5451005 35=6141671 34=7080579 33=7806046 32=9798066 31=11882632 30=15283799 2F=6985857 2E=8044627 2D=6636208 2C=4805977 2B=3220182 2A=3167464 29=4090111 28=5709938 27=3502804 26=2929070 25=3358752 24=3916999 23=4057819 22=5124209 21=5277533 20=42872703 1F=3987784 1E=3484472 1D=3643916 1C=4174216 1B=3662986 1A=4933323 19=3677299 18=4216614 17=4043968 16=3582845 15=3683685 14=4540186 13=4812066 12=6464885 11=6488640 10=12415842 0F=4932667 0E=6787886 0D=4760047 0C=8731063 0B=7069143 0A=12241413 09=10858120 08=13149164 07=8219751 06=6926974 05=7701026 04=12557557 03=14887136 02=20154437 01=29508103 00=835691837 and here is the list of bytes couple in the destination file 0E,00=6791835 2C,00=4806159 4A,00=3099823 FF,00=3030567 80,25=2915869 B3,00=3061678 D1,00=3024917 7E,00=2752043 14,00=4543724 32,00=9800493 50,00=5226411 C9,00=3419606 E7,00=3367141 48,00=3344687 66,00=5308554 BA,00=2890612 1B,00=3662605 EE,00=3039868 51,25=2996741 4F,00=4251746 A2,00=3364659 6D,00=8535725 C0,00=2980676 03,00=14884374 21,00=5277874 B8,00=4554035 1C,25=3697411 D6,00=2878193 F4,00=2911302 19,00=3677995 37,00=6496436 55,00=3621900 73,00=12268699 0A,00=16849664 BF,00=3191022 DD,00=2901038 FB,00=2790679 3E,00=3226874 5C,00=2785989 7A,00=3348851 10,00=12415134 92,25=3328216 A7,00=3374104 C5,00=2992633 E3,00=2524591 FF,FE=1 
08,00=13152284 26,00=2927651 44,00=5803368 62,00=5647266 F9,00=2750935 5D,25=2990402 AE,00=2758502 78,00=4211254 CC,00=2560487 EA,00=3271925 0F,00=4934398 2D,00=6635518 4B,00=4257690 63,25=2931269 B4,00=2940342 D2,00=4371679 7F,00=3667613 F0,00=5791943 15,00=3684778 33,00=7806422 51,00=2927325 E8,00=2675786 49,00=5937427 67,00=7873266 BB,00=3134171 00,25=2849189 1C,00=4180501 3A,00=9021107 34,25=2382740 EF,00=2921132 A3,00=2840826 6E,00=14310898 C1,00=3469735 04,00=12561200 22,00=5125096 40,00=4443771 B9,00=3002742 D7,00=3005298 F5,00=2649810 38,00=8247159 56,00=3158613 AA,00=2874152 74,00=15429251 92,01=3102012 0B,00=7069820 DE,00=3208144 FC,00=3865021 3F,00=3156189 B0,00=7683525 5D,00=3011343 7B,00=3193376 57,25=2867868 11,00=6490582 93,25=3022263 A8,00=3006278 0C,25=2674557 C6,00=2658527 E4,00=3366420 09,00=10858792 27,00=3506948 45,00=6136951 63,00=10272001 AF,00=3026422 79,00=4934274 CD,00=2502816 EB,00=2734596 2E,00=8048888 4C,00=5263799 6A,00=3337574 00,00=835408324 B5,00=2644329 D3,00=9153408 F1,00=3732278 16,00=3583727 34,00=7080805 52,00=5404524 70,00=9797831 E9,00=3443955 A0,25=3241460 68,00=6544647 BC,00=2721172 DA,00=2887297 1D,00=3644022 3B,00=4065122 17,20=3790204 A4,00=2842361 6F,00=15455236 C2,00=2841458 E0,00=3329927 05,00=7700764 69,25=2417913 23,00=4057297 41,00=5410631 D8,00=3078746 F6,00=3032474 3C,25=2483865 5A,25=2550298 39,00=6276359 AB,00=3058994 57,00=3198862 75,00=7216338 0C,00=8731202 2A,00=3167488 DF,00=5114400 24,25=3329540 FD,00=2819394 60,25=2551483 B1,00=3556946 5E,00=2545159 7C,00=2883942 12,00=6465795 30,00=15283965 A9,00=3617885 C7,00=6356230 E5,00=2898862 28,00=5709988 46,00=6790071 64,00=10820537 CE,00=2917633 EC,00=3579490 2F,00=6986708 A0,00=24806546 4D,00=4589771 6B,00=4601028 01,00=29501552 B6,00=5596464 D4,00=3367061 F2,00=2990103 17,00=4044395 35,00=6142063 53,00=6317913 71,00=4740237 6C,25=2904343 69,00=15721818 BD,00=2728255 02,25=2808648 DB,00=2849348 1E,00=3485468 3C,00=3929076 5A,00=3363435 18,25=2793271 
54,25=2359093 A5,00=3305528 C3,00=2608966 E1,00=4348666 06,00=6927361 24,00=3917870 88,25=2699985 42,00=4114151 60,00=3715031 D9,00=3334096 F7,00=2933573 AC,00=3016682 58,00=3203765 76,00=5306969 CA,00=2594516 0D,00=16849664 2B,00=3219425 FE,00=3284535 5F,00=7459837 B2,00=3008654 D0,00=3798597 7D,00=2932025 13,00=4813318 31,01=2726914 31,00=11882336 C8,00=2599419 E6,00=2728324 2C,25=2971956 29,00=4090622 47,00=3830335 65,00=22005256 1A,00=4933435 CF,00=3572248 14,25=3268735 ED,00=2800139 50,25=2468708 4E,00=4457687 A1,00=2753982 6C,00=10638547 02,00=20156417 66,25=2411484 20,00=42875581 84,25=2599935 B7,00=3025065 D5,00=4608195 F3,00=2423749 18,00=4217479 36,00=5451070 54,00=7055997 72,00=14410154 BE,00=2907783 DC,00=2804396 FA,00=2715354 1F,00=3988798 3D,00=4458111 5B,00=3762930 91,25=4806083 A6,00=2624919 C4,00=2576697 E2,00=3421114 07,00=8216863 25,00=3358496 43,00=5414386 61,00=17120598 F8,00=2763409 AD,00=4364881 59,00=3024233 77,00=4249782 CB,00=2526456 10,25=3526973 at the beginning of the Destination file I have FF,FE, which is the little endian BOM, followed by lots of zeroes. I have tried reading the destination file with UTF-16 encoding and saving the result as UTF-8 but the latter is 2,5 GB and has unwanted transformation for example the sequence. ORIGIN CD 7C 78 38 81 7C 78 7C 38 06 00 FF FF 53 EF ORIGIN REBUILD 7C 78 38 C3 BC 7C 78 38 06 00 C2 A0 C2 A0 I have later tried to read the stream as UTF-16 and the convert it to IBM850. I have found that this conversion looks promising (the reverted file resemble a little more the origin) but the copy have some addition and some unexplicable conversion, that will be (correctly) converted in the reverted file, making it unreadable. 
For example, in the original file I have:
7F 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 6C 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 01 20
In the file copied with netcat I have:
7F 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 08 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 F5 00 FE 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 02 00 01 00 20 00
The first impression is a byte to UTF16LE conversion. But I'm wondering why the byte 6c should be converted to the three byte pairs F5 00 FE 00 01 00. What kind of conversion has happened? Have you ever seen it?
I did the import with dd over netcat. I do not have the original file anymore. That's why I want to revert the conversion. The netcat commands I used:
SHELL1
adb forward tcp:9999 tcp:9999
adb shell
su
dd if=/dev/block/nandxxx | nc -l -p 9999
SHELL2
nc localhost 9999 >nandxxx.img
To be clearer about what I mean by the words "copy" and "reverted":
original physical image -> copy (or copied image) -> reverted
What I would like to obtain is of course original physical = reverted.
If you need any more information please tell me in the comments: I will update this question with the requested details. Thanks in advance.
UPDATE 29-03-2017: I have reproduced the steps on a known ISO image which is not the original one (that's where the statistics and the observations come from). I still observe the byte manipulation/insertion. Obviously, applying the IBM850 encoding to the copy does not lead to a reverted version of the original Android image that is good enough to perform any recovery operation (carved images are far too blurry, too few system files are found, etc.)
Although this is an interesting problem, no bounty-worthy answer presents itself upon initial examination. The following commands were used to analyze the possibility of a byte-to-byte-pair cipher:
sed -rn "s/^.*=//p" <originByteOccurance |sort -u >/tmp/qqq1
sed -rn "s/^.*=//p" <destianationBytePairOccurance |sort -u >/tmp/qqq2
wc -l /tmp/qqq1 /tmp/qqq2
# /tmp/qqq1 256
# /tmp/qqq2 256
cat /tmp/qqq1 /tmp/qqq2 |sort |uniq -d
sed -rn "s/,.*//p" </tmp/qq2 |sort -u |wc -l
# 230
sed -rn "s/^..,//; s/=.*//p" </tmp/qq2 |sort -u
# 00
# 01
# 20
# 25
# FE
Although wc -l reveals that there are indeed 256 unique counts, both for the byte occurrences in the origin and for the byte pair occurrences in the destination, there is no overlap between the two sets of counts. Therefore the transformation is not an encoding of bytes to byte pairs, and an inverse transformation using a reverse map is not possible. A block encryption is unlikely because of the sparse coverage of the most significant byte in the destination byte pair set.
If the origin images entering the transform can be controlled, then creating an experimental fixture that sends a series of 2 x 2 pixel images with pastel colors (sparse 1s among 0s) through the transform may help reveal more about the transform (and improve the likelihood of SO assistance). These may be a good first set of pixel colors to try in these micro-images:
#000000 #00000f #000f00 #0f0000
Upon examination of the simpler results, several hypotheses may come to mind. Hypothetical models that reproduce the experimental results perfectly can then be tested with 16 x 16 pixel images to gain evidence for those hypotheses. The ideas that pass at 16 x 16 can then be tried with 1600 x 900 HD images until a high level of confidence is established.
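The comparison of occurrence counts described above can also be sketched in Java. This is only an illustration under stated assumptions: the class and method names are invented here, the data is held in small in-memory arrays (files of several gigabytes would have to be streamed), and pairs are counted on aligned two-byte boundaries.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; none of these names come from the post.
class OccurrenceSets {

    // Counts how often each of the 256 byte values occurs.
    static long[] byteCounts(byte[] data) {
        long[] counts = new long[256];
        for (byte b : data) {
            counts[b & 0xFF]++;
        }
        return counts;
    }

    // Counts how often each aligned two-byte pair occurs (index = first * 256 + second).
    static long[] pairCounts(byte[] data) {
        long[] counts = new long[65536];
        for (int i = 0; i + 1 < data.length; i += 2) {
            counts[((data[i] & 0xFF) << 8) | (data[i + 1] & 0xFF)]++;
        }
        return counts;
    }

    // Returns the non-zero count values that appear in both histograms.
    static Set<Long> sharedCounts(long[] a, long[] b) {
        Set<Long> seen = new HashSet<>();
        for (long c : a) {
            if (c > 0) seen.add(c);
        }
        Set<Long> shared = new HashSet<>();
        for (long c : b) {
            if (c > 0 && seen.contains(c)) shared.add(c);
        }
        return shared;
    }

    public static void main(String[] args) {
        byte[] origin = {0x41, 0x41, 0x42};
        byte[] destination = {0x41, 0x00, 0x41, 0x00, 0x42, 0x00};
        System.out.println(sharedCounts(byteCounts(origin), pairCounts(destination)));
    }
}
```

In this toy input every origin byte is simply widened to a pair, so both count values are shared between the two histograms; an empty result corresponds to the "no overlap" observation made with uniq -d.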
Extract .gz files in java
I'm trying to unzip some .gz files in Java. After some research I wrote this method:
public static void gunzipIt(String name){
    byte[] buffer = new byte[1024];
    try{
        GZIPInputStream gzis = new GZIPInputStream(new FileInputStream("/var/www/html/grepobot/API/"+ name + ".txt.gz"));
        FileOutputStream out = new FileOutputStream("/var/www/html/grepobot/API/"+ name + ".txt");
        int len;
        while ((len = gzis.read(buffer)) > 0) {
            out.write(buffer, 0, len);
        }
        gzis.close();
        out.close();
        System.out.println("Extracted " + name);
    } catch(IOException ex){
        ex.printStackTrace();
    }
}
When I try to execute it I get this error:
java.util.zip.ZipException: Not in GZIP format
How can I solve it? Thanks in advance for your help.
Test a sample, known-good gzipped file to see whether the problem lies in your code or not. There are many possible ways to build a (g)zip file. Your file may have been built differently from what Java's built-in support expects, and the fact that one decompressor understands a compression variant is no guarantee that Java will also recognize that variant. Please verify the exact file type with file and/or other decompression utilities that can tell you which options were used when compressing it. You may also have a look at the file itself with a tool such as hexdump. This is the output of the following command:
$ hexdump -C lgpl-2.1.txt.gz | head
00000000 1f 8b 08 08 ed 4f a9 4b 00 03 6c 67 70 6c 2d 32 |.....O.K..lgpl-2|
00000010 2e 31 2e 74 78 74 00 a5 5d 6d 73 1b 37 92 fe 8e |.1.txt..]ms.7...|
00000020 ba 1f 81 d3 97 48 55 34 13 7b 77 73 97 78 2b 55 |.....HU4.{ws.x+U|
00000030 b4 44 d9 bc 95 25 2d 29 c5 eb ba ba aa 1b 92 20 |.D...%-)....... |
00000040 39 f1 70 86 99 17 29 bc 5f 7f fd 74 37 30 98 21 |9.p...)._..t70.!|
00000050 29 7b ef 52 9b da 58 c2 00 8d 46 bf 3c fd 02 d8 |){.R..X...F.<...|
00000060 da fe 3f ef 6f 1f ed cd 78 36 1b 4f ed fb f1 ed |..?.o...x6.O....|
00000070 78 3a ba b1 f7 8f ef 6e 26 97 96 fe 1d df ce c6 |x:.....n&.......|
00000080 e6 e0 13 f9 e7 57 57 56 69 91 db 37 c3 d7 03 7b |.....WWVi..7...{|
00000090 ed e6 65 93 94 7b fb fa a7 9f 7e 32 c6 5e 16 bb |..e..{....~2.^..|
In this case, I used standard gzip on this license text. The first few bytes are unique to gzipped files (although they do not specify variants): if your file does not start with 1f 8b, Java will complain, regardless of the remaining contents. If the problem is due to the file, it is possible that other decompression libraries available for Java may deal with the format correctly; for example, see Commons Compress.
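The 1f 8b check can also be done from Java before the data is handed to GZIPInputStream. The following is a minimal sketch that works on in-memory byte arrays; the class name GzipMagicCheck and its methods are made up for this example, and the read loop mirrors the one in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration.
class GzipMagicCheck {

    // True when the data starts with the gzip magic number 1f 8b.
    static boolean looksLikeGzip(byte[] data) {
        return data.length >= 2
                && (data[0] & 0xFF) == 0x1f
                && (data[1] & 0xFF) == 0x8b;
    }

    // Compresses a byte array with the JDK's own gzip support.
    static byte[] gzip(byte[] plain) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(plain);
        }
        return out.toByteArray();
    }

    // Decompresses, but reports a missing magic number up front.
    static byte[] gunzip(byte[] compressed) throws IOException {
        if (!looksLikeGzip(compressed)) {
            throw new IOException("Data does not start with 1f 8b");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            int len;
            while ((len = gz.read(buffer)) > 0) {
                out.write(buffer, 0, len);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] packed = gzip("hello gzip".getBytes(StandardCharsets.UTF_8));
        System.out.println(looksLikeGzip(packed));
        System.out.println(new String(gunzip(packed), StandardCharsets.UTF_8));
    }
}
```

If looksLikeGzip returns false for the downloaded file, the file is the thing to investigate (for example, it may be an error page or an already decompressed response), not the extraction loop.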
In case someone needs it:
import com.horsefly.utils.GZIP;
import org.apache.commons.io.FileUtils;
....
String content = new String(new GZIP().decompresGzipToBytes(FileUtils.readFileToByteArray(fileName)), "UTF-8");
Java serialization with empty and substrings
Had a look at the implementation and haven't been able to think of an explanation to this but maybe someone here will know. public static void main(String[] args) throws Exception { List<String> emptyStrings = new ArrayList<String>(); List<String> emptySubStrings = new ArrayList<String>(); for (int i = 0; i < 20000; i++) { String actuallyEmpty = ""; String subStringedEmpty = " "; subStringedEmpty = subStringedEmpty.substring(0, 0); emptyStrings.add(actuallyEmpty); emptySubStrings.add(subStringedEmpty); } System.out.println("Substring test"); // Write to files long time = System.currentTimeMillis(); writeObjectToFile(emptyStrings, "empty.list"); System.out.println("Time taken to write empty list " + (System.currentTimeMillis() - time)); time = System.currentTimeMillis(); writeObjectToFile(emptySubStrings, "substring.list"); System.out.println("Time taken to write substring list " + (System.currentTimeMillis() - time)); //Read from files time = System.currentTimeMillis(); List<String> readEmptyString = readObjectFromFile("empty.list"); System.out.println("Time taken to read empty list " + (System.currentTimeMillis() - time)); time = System.currentTimeMillis(); List<String> readEmptySubStrings = readObjectFromFile("substring.list"); System.out.println("Time taken to read substring list " + (System.currentTimeMillis() - time)); } private static void writeObjectToFile(Object o, String file) throws Exception { FileOutputStream out = new FileOutputStream(file); ObjectOutputStream oout = new ObjectOutputStream(out); oout.writeObject(o); oout.flush(); oout.close(); } private static <T> T readObjectFromFile(String file) throws Exception { ObjectInputStream ois = null; try { ois = new ObjectInputStream(new FileInputStream(file)); return (T) ois.readObject(); } finally { ois.close(); } } Ultimately these 2 lists contain 20,000 empty strings (one list contains "" empty strings and the other contains empty strings generated by substring(0,0)). 
But if you check the sizes of the generated serialized files (empty.list and substring.list), you will notice that empty.list contains substantially more data. I have also noticed that callers of remote EJBs which deserialize these substring objects seem to have severe performance issues.
The sizes of the lists differ because Java uses a mechanism to store multiple references to the same object, as described in the ObjectOutputStream documentation:
References to other objects (except in transient or static fields) cause those objects to be written also. Multiple references to a single object are encoded using a reference sharing mechanism so that graphs of objects can be restored to the same shape as when the original was written.
If you look at the generated serialized files, you will see the following. With one empty String inside:
empty.list:
ac ed 00 05 73 72 00 13 6a 61 76 61 2e 75 74 69 6c 2e 41 72 72 61 79 4c 69 73 74 78 81 d2 1d 99 c7 61 9d 03 00 01 49 00 04 73 69 7a 65 78 70 00 00 00 01 77 04 00 00 00 01 74 00 00 78
The string "" corresponds to the 74 00 00 just before the final byte: a string marker followed by a two-byte length of zero. The trailing 78 closes the list's data.
substring.list:
ac ed 00 05 73 72 00 13 6a 61 76 61 2e 75 74 69 6c 2e 41 72 72 61 79 4c 69 73 74 78 81 d2 1d 99 c7 61 9d 03 00 01 49 00 04 73 69 7a 65 78 70 00 00 00 01 77 04 00 00 00 01 74 00 00 78
Note that with one element the resulting files are identical. But if we add the same object more than once, the behavior changes. Look at the respective files with that string added twice.
empty.list:
ac ed 00 05 73 72 00 13 6a 61 76 61 2e 75 74 69 6c 2e 41 72 72 61 79 4c 69 73 74 78 81 d2 1d 99 c7 61 9d 03 00 01 49 00 04 73 69 7a 65 78 70 00 00 00 02 77 04 00 00 00 02 74 00 00 71 00 7e 00 02 78
substring.list:
ac ed 00 05 73 72 00 13 6a 61 76 61 2e 75 74 69 6c 2e 41 72 72 61 79 4c 69 73 74 78 81 d2 1d 99 c7 61 9d 03 00 01 49 00 04 73 69 7a 65 78 70 00 00 00 02 77 04 00 00 00 02 74 00 00 74 00 00 78
Note that substring.list continues "normally": two unrelated strings with different references. But empty.list has some extra bytes to record that the second element is the same reference as the first. The two elements take six bytes in substring.list (74 00 00 74 00 00) versus eight bytes in empty.list (74 00 00 71 00 7e 00 02). The file grows because each repeated reference costs more bytes than a fresh empty string does.
So by the time your ArrayList is full there are a lot of extra bytes, which are needed to reconstruct the list in its original shape. If you want to know why that sharing mechanism exists, I suggest you take a look at this question: What is the meaning of reference sharing in Serialization? How Enums are Serialized?
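The size difference can be measured without writing any files. Below is a minimal sketch under two assumptions: the names SharedReferenceSizes, sharedEmpty and distinctEmpty are invented for this example, and new String("") is used in place of substring(0, 0) to force distinct empty String objects, because whether substring(0, 0) returns a fresh object may depend on the JDK version.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for illustration.
class SharedReferenceSizes {

    // Serializes the object in memory and returns the number of bytes written.
    static int serializedSize(Object o) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        }
        return bytes.size();
    }

    // n references to the one "" literal: written once, then as back-references.
    static List<String> sharedEmpty(int n) {
        List<String> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add("");
        }
        return list;
    }

    // n distinct empty String objects: each one is written out in full.
    static List<String> distinctEmpty(int n) {
        List<String> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(new String(""));
        }
        return list;
    }

    public static void main(String[] args) throws IOException {
        int n = 20000;
        System.out.println("shared:   " + serializedSize(sharedEmpty(n)));
        System.out.println("distinct: " + serializedSize(distinctEmpty(n)));
    }
}
```

Going by the hex dumps above, a back-reference takes five bytes (71 00 7e 00 02) while an empty string takes three (74 00 00), so the shared list should come out two bytes larger for every element after the first.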
empty.list contains one String object and lots of references to it. substring.list contains 20,000 String objects, all of them equal in content. You could "fix" this by intern()ing the strings.
private void verify(String name, Supplier<String> stringSupplier) throws IOException, ClassNotFoundException {
    List<String> inputStrings = new ArrayList<String>();
    inputStrings.add(stringSupplier.get());
    inputStrings.add(stringSupplier.get());
    ByteArrayOutputStream boas = new ByteArrayOutputStream();
    ObjectOutputStream emptyOut = new ObjectOutputStream(boas);
    emptyOut.writeObject(inputStrings);
    emptyOut.flush();
    ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(boas.toByteArray()));
    List<String> returnedStrings = (List<String>)ois.readObject();
    if(returnedStrings.get(0) == returnedStrings.get(1)) {
        System.out.println(name + " contains the same object");
    } else {
        System.out.println(name + " contains DIFFERENT objects");
    }
}

@Test
public void test() throws IOException, ClassNotFoundException {
    verify("empty string", new Supplier<String>() {
        @Override
        public String get() {
            return "";
        }
    });
    verify("sub string", new Supplier<String>() {
        @Override
        public String get() {
            String data = " ";
            return data.substring(0, 0);
        }
    });
    verify("intern()ed substring", new Supplier<String>() {
        @Override
        public String get() {
            String data = " ";
            return data.substring(0, 0).intern();
        }
    });
}
servlet request parameter character encoding
I have a Java servlet that receives data from an upstream system via an HTTP GET request. This request includes a parameter named "text". If the upstream system sets this parameter to:
TEST3 please ignore:
it appears in the logs of the upstream system as:
00 54 00 45 00 53 00 54 00 33 00 20 00 70 00 6c //TEST3 pl
00 65 00 61 00 73 00 65 00 20 00 69 00 67 00 6e //ease ign
00 6f 00 72 00 65 00 3a //ore:
(The // comments do not actually appear in the logs.)
In my servlet I read this parameter with:
String text = request.getParameter("text");
If I print the value of text to the console, it appears as:
T E S T 3 p l e a s e i g n o r e :
If I inspect the value of text in the debugger, it appears as:
\u000T\u000E\u000S\u000T\u0003\u0000 \u000p\u000l\u000e\u000a\u000s\u000e\u0000 \u000i\u000g\u000n\u000o\u000r\u000e\u000:
So it seems that there's a problem with the character encoding. The upstream system is supposed to use UTF-16. My guess is that the servlet is assuming UTF-8 and is therefore reading twice the number of characters it should. For the message "TEST3 please ignore:" the first byte of each character is 00. This is being interpreted as a space when read by the servlet, which explains the space that appears before each character when the message is logged by the servlet.
Obviously my goal is simply to get the message "TEST3 please ignore:" when I read the text request parameter. My guess is that I could achieve this by specifying the character encoding of the request parameter, but I don't know how to do this.
Use it like this:
new String(req.getParameter("<my request value>").getBytes("ISO-8859-1"), "UTF-8")
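The expression above recovers the original bytes by re-encoding the parameter as ISO-8859-1 and then decodes those bytes as UTF-8. It only helps when the bytes really were UTF-8 and were decoded as ISO-8859-1 along the way; whether that applies to the UTF-16 data in the question is a separate matter. A minimal sketch of the round trip without the servlet API, with a made-up class name and a simulated mis-decoding step:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the container's behavior is only simulated here.
class RedecodeDemo {

    // Simulates a container that decoded UTF-8 bytes as ISO-8859-1.
    static String misdecode(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses the mistake: back to the raw bytes, then decode them as UTF-8.
    static String repair(String misdecoded) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Gr\u00fc\u00dfe";
        String broken = misdecode(original);
        System.out.println(broken.equals(original));
        System.out.println(repair(broken).equals(original));
    }
}
```

The trick is lossless because ISO-8859-1 maps every byte value to exactly one character and back.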
Try to use a Filter for this:
public class CustomCharacterEncodingFilter implements Filter {
    public void init(FilterConfig config) throws ServletException { }
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) throws IOException, ServletException {
        request.setCharacterEncoding("UTF-8");
        response.setCharacterEncoding("UTF-8");
        chain.doFilter(request, response);
    }
    public void destroy() { }
}
This should set the encoding correctly for the whole application.
Looks like it was encoded with UTF-16LE (Little Endian) encoding. Here is a class that successfully prints your string:
import java.io.UnsupportedEncodingException;
import java.math.BigInteger;

public class Test {
    public static void main(String[] args) throws UnsupportedEncodingException {
        String hex = "00 54 00 45 00 53 00 54 00 33 00 20 00 70 00 6c"
                + "00 65 00 61 00 73 00 65 00 20 00 69 00 67 00 6e"
                + "00 6f 00 72 00 65 00 3a"; // + " 00";
        System.out.println(new String(new BigInteger(hex.replaceAll(" ", ""), 16).toByteArray(), "UTF-16LE"));
    }
}
Output:
TEST3 please ignore?
Output with two zeros added to the input:
TEST3 please ignore:
UPDATE To get this working with your servlet you can try:
String value = request.getParameter("text");
try {
    value = new String(value.getBytes(), "UTF-16LE");
} catch(java.io.UnsupportedEncodingException ex) {}
UPDATE See the following link; it verifies that the hex produced is in fact UTF-16LE
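One detail is worth checking against the dump in the question: it starts with 00 54, so the high byte of each character comes first, which is the big-endian (UTF-16BE) layout. BigInteger.toByteArray() returns the minimal two's-complement representation and therefore drops the leading 00 byte, which would explain why the snippet above needs UTF-16LE and an extra trailing 00 to print the colon. A minimal sketch that keeps every byte and decodes the same dump as UTF-16BE; the class name and the hex parser are made up for this example:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration.
class Utf16BeDump {

    // Parses space-separated hex pairs into bytes, keeping leading zero bytes.
    static byte[] parseHex(String dump) {
        String[] parts = dump.trim().split("\\s+");
        byte[] out = new byte[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = (byte) Integer.parseInt(parts[i], 16);
        }
        return out;
    }

    // Decodes the dump with the high byte first, as it appears in the log.
    static String decode(String dump) {
        return new String(parseHex(dump), StandardCharsets.UTF_16BE);
    }

    public static void main(String[] args) {
        String hex = "00 54 00 45 00 53 00 54 00 33 00 20 00 70 00 6c "
                + "00 65 00 61 00 73 00 65 00 20 00 69 00 67 00 6e "
                + "00 6f 00 72 00 65 00 3a";
        System.out.println(decode(hex)); // prints "TEST3 please ignore:"
    }
}
```

This only shows how the logged bytes decode; how the servlet container should be told about the encoding is not covered by the sketch.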
H.323, How to make a simple ring without media. This script was following Q.931 setup but still not working
Can anyone please help me to solve this? When I send this request, I can see in Wireshark that packets are going to SJPhone on TCP port 1720, but SJPhone still does not ring. I want to make it ring (media does not matter). I would really appreciate your support. I must be missing some of the message protocol details needed to implement this. Please give me some pointers.
FYI: I have used this trace: http://www.vconsole.com/usermanuals/sample_isdn_trace.pdf
import java.io.*;
import java.net.*;

public class test {
    public static void main(String[] args) throws UnknownHostException, IOException {
        /* Step 1: simulate the Q.931 packets exchange */
        byte st[]=new byte[256];
        st[0]=0x08; // protocol discriminator
        st[1]=0x02; // length (bytes) of call reference
        st[2]=0x02; // call reference (1-15 bytes)
        // message type
        st[3]=0x05;
        // information Elements
        st[4]=0x6C; // calling party number
        st[5]=0; // unknown
        st[6]=0; // unknown
        st[7]=1; // "1"
        // information elements
        st[8]=0x70; // called party number
        st[9]=0; // unknown
        st[10]=0; // unknown
        st[11]=5; // "5"
        System.out.println(st);
        /* Step 2: by-pass it for testing with tcpdump */
        Socket clientSocket = new Socket("localhost", 1720);
        DataInputStream input = new DataInputStream(clientSocket.getInputStream());
        DataOutputStream outToServer = new DataOutputStream(clientSocket.getOutputStream());
        BufferedReader inFromServer = new BufferedReader(new InputStreamReader(clientSocket.getInputStream()));
        outToServer.write(st);
        String get;
        get = inFromServer.readLine();
        System.out.println("FROM SERVER: " + get);
        clientSocket.close();
    }
}
More info:
When SJPhone communicates with SJPhone I see these logs:
0000 00 00 03 04 00 06 00 00 00 00 00 00 00 00 08 00 ................
0010 45 00 00 3c 88 6c 40 00 40 06 b4 4d 7f 00 00 01 E..<.l@.@..M....
0020 7f 00 00 01 b4 a8 06 b8 56 f6 c4 b3 00 00 00 00 ........V.......
0030 a0 02 80 18 fe 30 00 00 02 04 40 0c 04 02 08 0a .....0....@.....
0040 02 a1 4c df 00 00 00 00 01 03 03 06 ..L.........
When this test.java communicates with SJPhone I see these logs:
0000 00 00 03 04 00 06 00 00 00 00 00 00 00 00 08 00 ........ ........
0010 45 00 00 3c 1f ba 40 00 40 06 1d 00 7f 00 00 01 E..<..@. @.......
0020 7f 00 00 01 d6 ca 06 b8 8c e7 41 15 00 00 00 00 ........ ..A.....
0030 a0 02 80 18 fe 30 00 00 02 04 40 0c 04 02 08 0a .....0.. ..@.....
0040 02 a6 5e af 00 00 00 00 01 03 03 06 ..^..... ....
Note: a friendly dump can be made with this command, which shows the traffic in real time and saves what you have seen:
tcpdump -XX -s 0 -i lo | tee /tmp/log.log
So, the question was only to send a ring, without an actual call. Here is some code that does the job. It was a lot of fun to see when it started working: import java.io.DataInputStream; import java.io.DataOutputStream; import java.io.IOException; import java.io.OutputStream; import java.net.Socket; /** * * #author martijncourteaux */ public class SJPhoneRinger { private static String setupOpenLogicChannel = "03|00|02|32|08|02|10|00|05|04|03|88|c0|a5|28|07|75|6e|6b|6e|6f|77|6e|6c|05|81|32|30|30|35|7e|02|11|05|20|88|06|00|08|91|4a|00|04|22|c0|00|00|00|00|0f|53|4a|20|4c|61|62|73|ae|20|53|4a|70|68|6f|6e|65|08|31|2e|36|30|2e|32|39|39|61|00|7f|00|00|01|06|b8|00|c4|a1|56|3c|d2|1d|b2|11|a8|ed|b7|8a|c2|c1|98|8f|00|c1|1d|80|04|11|00|f6|a1|56|3c|d2|1d|b2|11|a8|ed|b7|8a|c2|c1|98|8f|81|3a|0a|21|40|00|00|06|04|01|00|4e|0c|03|00|20|00|80|12|1e|00|01|00|7f|00|00|01|c0|04|00|7f|00|00|01|c0|05|00|2b|40|00|00|06|04|01|00|4c|10|00|00|00|00|09|69|4c|42|43|2d|31|33|6b|33|80|12|1e|40|01|00|7f|00|00|01|c0|04|00|7f|00|00|01|c0|05|04|2b|40|00|00|06|04|01|00|4c|10|00|00|00|00|09|69|4c|42|43|2d|31|35|6b|32|80|12|1e|40|01|00|7f|00|00|01|c0|04|00|7f|00|00|01|c0|05|08|1e|40|00|00|06|04|01|00|4c|20|13|80|12|1e|00|01|00|7f|00|00|01|c0|04|00|7f|00|00|01|c0|05|00|1e|40|00|00|06|04|01|00|4c|60|13|80|12|1e|00|01|00|7f|00|00|01|c0|04|00|7f|00|00|01|c0|05|00|16|00|00|0d|0e|0c|03|00|20|00|80|0b|06|00|01|00|7f|00|00|01|c0|05|00|20|00|00|0e|0c|10|00|00|00|00|09|69|4c|42|43|2d|31|33|6b|33|80|0b|06|40|01|00|7f|00|00|01|c0|05|04|20|00|00|0f|0c|10|00|00|00|00|09|69|4c|42|43|2d|31|35|6b|32|80|0b|06|40|01|00|7f|00|00|01|c0|05|08|13|00|00|10|0c|20|13|80|0b|06|00|01|00|7f|00|00|01|c0|05|00|13|00|00|11|0c|60|13|80|0b|06|00|01|00|7f|00|00|01|c0|05|00|01|00|01|00|01|00|01|00|6e|02|64|02|70|01|06|00|08|81|75|00|08|80|0d|00|00|3c|00|01|00|00|01|00|00|01|00|00|04|80|00|00|24|18|03|00|20|00|80|00|01|20|20|00|00|00|00|09|69|4c|42|43|2d|31|33|6b|33|80|00|02|20|20|00|00|00|00|09|69|4c|42|43|2d|31|35|6b|32|80|00|03|20|40|13|
80|00|04|20|c0|13|00|80|06|00|04|00|00|00|01|00|02|00|03|00|04|07|01|00|32|80|d8|50|5f|02|80|01|80"; private static String alertingTSCA = "03|00|00|d8|08|02|90|00|01|28|07|75|6e|6b|6e|6f|77|6e|7e|00|c3|05|23|c0|06|00|08|91|4a|00|04|22|c0|00|00|00|00|0f|53|4a|20|4c|61|62|73|ae|20|53|4a|70|68|6f|6e|65|08|31|2e|36|30|2e|32|38|39|61|00|5e|e0|cc|44|98|ff|0d|0c|11|00|f6|a1|56|3c|d2|1d|b2|11|a8|ed|b7|8a|c2|c1|98|8f|01|00|01|00|04|c0|01|80|74|04|03|21|80|01|64|02|70|01|06|00|08|81|75|00|08|80|0d|00|00|3c|00|01|00|00|01|00|00|01|00|00|04|80|00|00|24|18|03|00|20|00|80|00|01|20|20|00|00|00|00|09|69|4c|42|43|2d|31|33|6b|33|80|00|02|20|20|00|00|00|00|09|69|4c|42|43|2d|31|35|6b|32|80|00|03|20|40|13|80|00|04|20|c0|13|00|80|06|00|04|00|00|00|01|00|02|00|03|00|04|06|01|00|32|40|1b|33|02|20|80"; private static String facilityTSCA = "03|00|00|62|08|02|10|00|62|1c|00|28|07|75|6e|6b|6e|6f|77|6e|7e|00|4b|05|26|90|06|00|08|91|4a|00|04|c4|a1|56|3c|d2|1d|b2|11|a8|ed|b7|8a|c2|c1|98|8f|86|01|00|13|05|80|11|00|f6|a1|56|3c|d2|1d|b2|11|a8|ed|b7|8a|c2|c1|98|8f|07|00|5e|e0|cc|44|ab|62|01|00|01|00|04|c0|01|80|08|02|03|21|80|01|02|20|a0"; private static String connectOpenLogicChannel = "03|00|00|b6|08|02|90|00|07|28|07|75|6e|6b|6e|6f|77|6e|7e|00|a1|05|22|c0|06|00|08|91|4a|00|04|00|5e|e0|cc|44|98|ff|22|c0|00|00|00|00|0f|53|4a|20|4c|61|62|73|ae|20|53|4a|70|68|6f|6e|65|08|31|2e|36|30|2e|32|38|39|61|00|c4|a1|56|3c|d2|1d|b2|11|a8|ed|b7|8a|c2|c1|98|8f|0d|1c|11|00|f6|a1|56|3c|d2|1d|b2|11|a8|ed|b7|8a|c2|c1|98|8f|41|02|21|40|00|03|06|04|01|00|4e|0c|03|00|20|00|80|12|1e|00|01|00|7f|00|00|01|c0|04|00|5e|e0|cc|44|c0|07|00|1d|00|00|0d|0e|0c|03|00|20|00|80|12|1e|00|01|00|5e|e0|cc|44|c0|06|00|5e|e0|cc|44|c0|07|00|01|00|01|00|02|80|01|80"; /** * #param args the command line arguments */ public static void main(String[] args) throws IOException { Socket socket = new Socket("localhost", 1720); System.out.println("Connected to SJPhone!"); DataOutputStream dos = new DataOutputStream(socket.getOutputStream()); 
DataInputStream dis = new DataInputStream(socket.getInputStream()); sendBytes(dos, setupOpenLogicChannel); /* Following three packages were not needed. It worked without them. * Of course, because it is only a sample ring request, SJPhone will tell you * That the call is corrupted */ // sendBytes(dos, alertingTSCA); // sendBytes(dos, facilityTSCA); // sendBytes(dos, connectOpenLogicChannel); try { Thread.sleep(10000); // Let it ring for ten seconds. } catch (Exception e) { } } public static void sendBytes(OutputStream os, String bytes) throws IOException { String byteArrayStr[] = bytes.split("\\|"); byte bytesArray[] = new byte[byteArrayStr.length]; for (int i = 0; i < byteArrayStr.length; ++i) { bytesArray[i] = (byte) (Integer.parseInt(byteArrayStr[i],16)); } os.write(bytesArray); os.flush(); } }
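One thing the captured messages above have in common is their first four bytes: each string starts with 03|00 followed by a two-byte length, for example 03|00|00|62 in front of the 98-byte facility message. That matches the TPKT header (RFC 1006) used to frame messages on a TCP stream, with the Q.931 message (08|02|...) following it. The hand-built array in the question starts directly with 08. Below is a minimal sketch of that framing only; the class name TpktFrame is made up, and the header alone is presumably not enough to make a phone ring, since the working Setup message also carries a large user-user element.

```java
// Hypothetical helper for illustration; it only adds the 4-byte TPKT header.
class TpktFrame {

    // Prefixes the payload with: version 3, reserved 0, then the total length
    // (header included) as a 16-bit big-endian value.
    static byte[] frame(byte[] payload) {
        int total = payload.length + 4;
        if (total > 0xFFFF) {
            throw new IllegalArgumentException("payload too large for one TPKT frame");
        }
        byte[] out = new byte[total];
        out[0] = 0x03;
        out[1] = 0x00;
        out[2] = (byte) (total >> 8);
        out[3] = (byte) total;
        System.arraycopy(payload, 0, out, 4, payload.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] framed = frame(new byte[] {0x08, 0x02});
        StringBuilder sb = new StringBuilder();
        for (byte b : framed) {
            sb.append(String.format("%02x ", b & 0xFF));
        }
        System.out.println(sb.toString().trim());
    }
}
```

A framed message would be written with the same os.write call that sendBytes uses for the captured strings.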