I have a String that contains a huge amount of data:
String string = "afsa fd fdsfdsfdsfdsfdsfds fdsfds fdsf dsfds fdsfds fdsf dsfdsf dsfdsfsdfsdf JsonStr [{\"apk_name\":\"Android System\",\"apk_package\":\"android\",\"apk_versioncode\":17},{\"apk_name\":\"Bubble\",\"apk_package\":\"bz.ktk.bubble\",\"apk_versioncode\":21},{\"apk_name\":\"Kingsoft Office\",\"apk_package\":\"cn.wps.moffice_eng\",\"apk_versioncode\":74},{\"apk_name\":\"Math Workout\",\"apk_package\":\"com.akbur.mathsworkout\",\"apk_versioncode\":118},{\"apk_name\":\"Apollo\",\"apk_package\":\"com.andrew.apollo\",\"apk_versioncode\":2},{\"apk_name\":\"Tags\",\"apk_package\":\"com.android.apps.tag\",\"apk_versioncode\":101},{\"apk_name\":\"com.android.backupconfirm\",\"apk_package\":\"com.android.backupconfirm\",\"apk_versioncode\":17},{\"apk_name\":\"Bluetooth Share\",\"apk_package\":\"com.android.bluetooth\",\"apk_versioncode\":17},{\"apk_name\":\"Browser\",\"apk_package\":\"com.android.browser\",\"apk_versioncode\":17},{\"apk_name\":\"Calculator\",\"apk_package\":\"com.android.calculator2\",\"apk_versioncode\":17},{\"apk_name\":\"Calendar\",\"apk_package\":\"com.android.calendar\",\"apk_versioncode\":17},{\"apk_name\":\"Cell Broadcasts\",\"apk_package\":\"com.android.cellbroadcastreceiver\",\"apk_versioncode\":17},{\"apk_name\":\"Certificate Installer\",\"apk_package\":\"com.android.certinstaller\",\"apk_versioncode\":17},{\"apk_name\":\"Chrome\",\"apk_package\":\"com.android.chrome\",\"apk_versioncode\":1916122},{\"apk_name\":\"Contacts\",\"apk_package\":\"com.android.contacts\",\"apk_versioncode\":17},{\"apk_name\":\"Package Access Helper\",\"apk_package\":\"com.android.defcontainer\",\"apk_versioncode\":17},{\"apk_name\":\"Clock\",\"apk_package\":\"com.android.deskclock\",\"apk_versioncode\":203},{\"apk_name\":\"Dev Tools\",\"apk_package\":\"com.android.development\",\"apk_versioncode\":1},{\"apk_name\":\"Basic Daydreams\",\"apk_package\":\"com.android.dreams.basic\",\"apk_versioncode\":17},{\"apk_name\":\"Photo Screensavers\",\"apk_package\":\"com.android.dreams.phototable\",\"apk_versioncode\":17},{\"apk_name\":\"Email\",\"apk_package\":\"com.android.email\",\"apk_versioncode\":410000},{\"apk_name\":\"Exchange Services\",\"apk_package\":\"com.android.exchange\",\"apk_versioncode\":500000},{\"apk_name\":\"Face Unlock\",\"apk_package\":\"com.android.facelock\",\"apk_versioncode\":17},{\"apk_name\":\"Black Hole\",\"apk_package\":\"com.android.galaxy4\",\"apk_versioncode\":1},{\"apk_name\":\"Gallery\",\"apk_package\":\"com.android.gallery3d\",\"apk_versioncode\":40001},{\"apk_name\":\"HTML Viewer\",\"apk_package\":\"com.android.htmlviewer\",\"apk_versioncode\":17},{\"apk_name\":\"Input Devices\",\"apk_package\":\"com.android.inputdevices\",\"apk_versioncode\":17},{\"apk_name\":\"Android keyboard (AOSP)\",\"apk_package\":\"com.android.inputmethod.latin\",\"apk_versioncode\":17},{\"apk_name\":\"Key Chain\",\"apk_package\":\"com.android.keychain\",\"apk_versioncode\":17},{\"apk_name\":\"Fused Location\",\"apk_package\":\"com.android.location.fused\",\"apk_versioncode\":17},{\"apk_name\":\"Magic Smoke Wallpapers\",\"apk_package\":\"com.android.magicsmoke\",\"apk_versioncode\":17},{\"apk_name\":\"Messaging\",\"apk_package\":\"com.android.mms\",\"apk_versioncode\":17},{\"apk_name\":\"Music Visualization Wallpapers\",\"apk_package\":\"com.android.musicvis\",\"apk_versioncode\":17},{\"apk_name\":\"Nfc Service\",\"apk_package\":\"com.android.nfc\",\"apk_versioncode\":17},{\"apk_name\":\"Bubbles\",\"apk_package\":\"com.android.noisefield\",\"apk_versioncode\":1},{\"apk_name\":\"Package 
installer\",\"apk_package\":\"com.android.packageinstaller\",\"apk_versioncode\":17},{\"apk_name\":\"Phase Beam\",\"apk_package\":\"com.android.phasebeam\",\"apk_versioncode\":1},{\"apk_name\":\"Phone\",\"apk_package\":\"com.android.phone\",\"apk_versioncode\":17},{\"apk_name\":\"Search Applications Provider\",\"apk_package\":\"com.android.providers.applications\",\"apk_versioncode\":17},{\"apk_name\":\"Calendar Storage\",\"apk_package\":\"com.android.providers.calendar\",\"apk_versioncode\":17},{\"apk_name\":\"Contacts Storage\",\"apk_package\":\"com.android.providers.contacts\",\"apk_versioncode\":17},{\"apk_name\":\"Download Manager\",\"apk_package\":\"com.android.providers.downloads\",\"apk_versioncode\":17},{\"apk_name\":\"Downloads\",\"apk_package\":\"com.android.providers.downloads.ui\",\"apk_versioncode\":17},{\"apk_name\":\"DRM Protected Content Storage\",\"apk_package\":\"com.android.providers.drm\",\"apk_versioncode\":17},{\"apk_name\":\"Media Storage\",\"apk_package\":\"com.android.providers.media\",\"apk_versioncode\":511},i am still testing this :) ";
When I print this string like
System.out.println(TAG + string);
the string is truncated on the console. Why is that?
On Android, System.out is redirected to logcat. Logcat entries have a fixed maximum length (roughly 4,000 characters of payload per entry), and anything beyond that is truncated.
If you need to log longer messages, split them into chunks or use your own logging/file-writing solution.
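If you do need to see the full string while debugging, a common workaround is to split it into chunks below the limit and log each piece. A minimal sketch (the 3000-character chunk size is just a conservative guess under the cap):
import android.util.Log;

public final class LongLog {
    private static final int CHUNK = 3000; // conservatively under logcat's per-entry limit

    public static void d(String tag, String message) {
        for (int i = 0; i < message.length(); i += CHUNK) {
            Log.d(tag, message.substring(i, Math.min(message.length(), i + CHUNK)));
        }
    }
}
Calling LongLog.d(TAG, string) then logs the whole string across several logcat entries.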
I am trying to develop an application to read and write RF tags. Reading is flawless, but I'm having issues with writing, specifically the error "GetStatus Write RFID_API_UNKNOWN_ERROR data(x)- Field can Only Take Word values".
I have tried reverse-engineering the Zebra RFID API Mobile app by obtaining the .apk and decoding it, but the code is obfuscated and I am not able to decipher why that application's write works and mine doesn't.
I see the error on page 185 of https://www.ptsmobile.com/rfd8500/rfd8500-rfid-developer-guide.pdf, but I have no idea what's causing it.
I've tried forcibly converting the writeData to hex before I realized that the API does that on its own, and I've tried changing the length of the writeData as well, but it just gets a null value. I'm lost.
public boolean WriteTag(String sourceEPC, long Password, MEMORY_BANK memory_bank, String targetData, int offset) {
    Log.d(TAG, "WriteTag " + targetData);
    try {
        TagData tagData = null;
        String tagId = sourceEPC;
        TagAccess tagAccess = new TagAccess();
        tagAccess.getClass();
        TagAccess.WriteAccessParams writeAccessParams = tagAccess.new WriteAccessParams();
        String writeData = targetData; // write data as a string
        writeAccessParams.setAccessPassword(Password);
        writeAccessParams.setMemoryBank(MEMORY_BANK.MEMORY_BANK_USER);
        writeAccessParams.setOffset(offset); // start writing from word offset 0
        writeAccessParams.setWriteData(writeData);
        // set retries in case a partial write happens
        writeAccessParams.setWriteRetries(3);
        // data length in words
        System.out.println("length: " + writeData.length() / 4);
        System.out.println("length: " + writeData.length());
        writeAccessParams.setWriteDataLength(writeData.length() / 4);
        // 5th parameter, the bPrefilter flag, is true, which means the API will apply a pre-filter internally
        // 6th parameter should be true when changing the EPC ID itself, i.e. source and target are both EPC
        boolean useTIDfilter = memory_bank == MEMORY_BANK.MEMORY_BANK_EPC;
        reader.Actions.TagAccess.writeWait(tagId, writeAccessParams, null, tagData, true, useTIDfilter);
    } catch (InvalidUsageException e) {
        System.out.println("INVALID USAGE EXCEPTION: " + e.getInfo());
        e.printStackTrace();
        return false;
    } catch (OperationFailureException e) {
        //System.out.println("OPERATION FAILURE EXCEPTION");
        System.out.println("OPERATION FAILURE EXCEPTION: " + e.getResults().toString());
        e.printStackTrace();
        return false;
    }
    return true;
}
With
Password being 00
sourceEPC being the Tag ID obtained after reading
Memory Bank being MEMORY_BANK.MEMORY_BANK_USER
target data being "8426017056458"
offset being 0
It just keeps giving me "GetStatus Write RFID_API_UNKNOWN_ERROR data(x)- Field can Only Take Word values" and I have no idea why, nor do I know what a "word value" is, and I've searched for it. This all happens under the OperationFailureException as well. Any help would be appreciated, as there are almost no resources online for this kind of thing.
Even though this question is a bit older, I had the same problem, so as far as I know this should be the answer.
Your target data "8426017056458" has length 13, and at writeAccessParams.setWriteDataLength(writeData.length()/4)
you divide it by four (integer division, so the declared length is 3 words). The target data you then try to write is longer than the declared WriteDataLength, and that throws the error.
One 'word' is 4 hex digits, i.e. 16 bits. So your data first has to be converted to hex and padded up to a whole number of words.
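For illustration, a helper along these lines could do that preparation (a hedged sketch, not part of the Zebra API; it assumes the payload should be the hex encoding of the ASCII characters, which depends on how you intend to read the tag back):
import java.nio.charset.StandardCharsets;

public final class WordData {
    /** Hex-encode ASCII data, padded with trailing zeros to a whole number of 16-bit words. */
    public static String toWordAlignedHex(String data) {
        StringBuilder hex = new StringBuilder();
        for (byte b : data.getBytes(StandardCharsets.US_ASCII)) {
            hex.append(String.format("%02X", b));
        }
        while (hex.length() % 4 != 0) {
            hex.append('0'); // pad to a 4-hex-digit (one word) boundary
        }
        return hex.toString();
    }

    /** One word = 4 hex digits = 16 bits. */
    public static int lengthInWords(String wordAlignedHex) {
        return wordAlignedHex.length() / 4;
    }
}
With that, setWriteData(toWordAlignedHex(targetData)) and setWriteDataLength(lengthInWords(...)) would at least agree with each other.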
I am new to Spark and it seems very confusing to me. I have gone through the Spark documentation for the Java API but couldn't figure out how to solve my problem.
I have to process a log file in Spark/Java and have very little time left to do it. Below is the log file, which contains device records (device id, description, ip address, status) spanning multiple lines.
It also contains other log information that I am not concerned with.
How can I get the device information out of this huge log file?
Any help is much appreciated.
Input Log Data:
!
!
!
device AGHK75
description "Optical Line Terminal"
ip address 1.11.111.12/10
status "FAILED"
!
device AGHK78
description "Optical Line Terminal"
ip address 1.11.111.12/10
status "ACTIVE"
!
!
context local
!
no ip domain-lookup
!
interface IPA1_A2P_1_OAM
description To_A2P_1_OAM
ip address 1.11.111.12/10
propagate qos from ip class-map ip-to-pd
!
interface IPA1_OAM_loopback loopback
description SE1200_IPA-1_OAM_loopback
ip address 1.11.111.12/10
ip source-address telnet snmp ssh radius tacacs+ syslog dhcp-server tftp ftp icmp-dest-unreachable icmp-time-exceed netop flow-ip
What I have done so far is:
Java Code
JavaRDD<String> logData = sc.textFile("logFile").cache();
List<String> deviceRDD = logData.filter(new Function<String, Boolean>() {
    Boolean check = false;

    public Boolean call(String s) {
        if (s.contains("device") || (check == true && (s.contains("description") || s.contains("ip address"))))
            check = true;
        else if (check == true && s.contains("status")) {
            check = false;
            return true;
        } else
            check = false;
        return check;
    }
}).collect();
Current Output:
device AGHK75
description "Optical Line Terminal"
ip address 1.11.111.12/10
status "FAILED"
device AGHK78
description "Optical Line Terminal"
ip address 1.11.111.12/10
status "ACTIVE"
Expected Output is:
AGHK75,"Optical Line Terminal",1.11.111.12/10,"FAILED"
AGHK78,"Optical Line Terminal",1.11.111.12/10,"ACTIVE"
You can use sc.wholeTextFiles("logFile") to get the data as key/value pairs, where the key is the file name and the value is that file's contents.
Then you can use some string operations to split the data on the "!" delimiter that starts and ends a single log record, filter first to check that the record's first word is device, and then flatMap the result into an RDD of single-record texts.
Then extract the fields from each record using a map.
Please try it and let me know whether this logic works for you.
Added code in Spark Scala:
val ipData = sc.wholeTextFiles("abc.log")
val ipSingleLog = ipData.flatMap(x => x._2.split("!")).filter(x => x.trim.startsWith("device"))
val logData = ipSingleLog.map(x => {
  val rowData = x.split("\n")
  var device = ""
  var description = ""
  var ipAddress = ""
  var status = ""
  for (data <- rowData) {
    if (data.trim().startsWith("device")) {
      device = data.split("device")(1)
    } else if (data.trim().startsWith("description")) {
      description = data.split("description")(1)
    } else if (data.trim().startsWith("ip address")) {
      ipAddress = data.split("ip address")(1)
    } else if (data.trim().startsWith("status")) {
      status = data.split("status")(1)
    }
  }
  (device, description, ipAddress, status)
})
logData.foreach(println)
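Since the question asks for Java, here is a rough Java equivalent of the same approach (a sketch, untested; note that on Spark 1.x flatMap expects an Iterable instead of an Iterator, so drop the .iterator() call there):
import java.util.Arrays;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class DeviceLogParser {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(new SparkConf().setAppName("DeviceLogParser"));
        JavaRDD<String> logData = sc.wholeTextFiles("abc.log")
                // split each file body on "!" and keep only the blocks that describe a device
                .flatMap(file -> Arrays.asList(file._2().split("!")).iterator())
                .filter(block -> block.trim().startsWith("device"))
                .map(block -> {
                    String device = "", description = "", ipAddress = "", status = "";
                    for (String row : block.split("\n")) {
                        String line = row.trim();
                        if (line.startsWith("device")) device = line.substring("device".length()).trim();
                        else if (line.startsWith("description")) description = line.substring("description".length()).trim();
                        else if (line.startsWith("ip address")) ipAddress = line.substring("ip address".length()).trim();
                        else if (line.startsWith("status")) status = line.substring("status".length()).trim();
                    }
                    return device + "," + description + "," + ipAddress + "," + status;
                });
        logData.collect().forEach(System.out::println);
        sc.stop();
    }
}
This should print lines in the expected AGHK75,"Optical Line Terminal",1.11.111.12/10,"FAILED" shape, since the quotes are part of the log text itself.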
Spark will take each line as a separate item with sc.textFile. You can get it to split on a different character using sc.hadoopConfiguration().set("textinputformat.record.delimiter", "!").
@Test
public void test() throws ParseException, IOException {
    hadoop.write("/test.txt", "line 1\nline 2\n!\nline 3\nline 4");
    JavaSparkContext sc = spark.getContext();
    sc.hadoopConfiguration().set("textinputformat.record.delimiter", "!");
    System.out.println(sc.textFile(hadoop.getMfs().getUri() + "/test.txt").collect());
    assertThat(sc.textFile(hadoop.getMfs().getUri() + "/test.txt").count(), is(2L));
}
I believe the only correct way that works everywhere is
Configuration hadoopConf = new Configuration();
hadoopConf.set("textinputformat.record.delimiter", "delimiter");
JavaPairRDD<LongWritable, Text> input = jsc.newAPIHadoopFile(path,
TextInputFormat.class, LongWritable.class, Text.class, hadoopConf);
There are issues in the Hadoop-related code: depending on the size of the input file, it can produce additional records (MAPREDUCE-6549, MAPREDUCE-5948). It certainly works starting with Hadoop 2.7.2.
Even though, as mlk suggests, using the Spark context's configuration works perfectly, it will fail if you try to read another file with a different delimiter using the same Spark context. By default the delimiter is the newline character, and it is changed as soon as this option is applied.
The reason is that the Spark context shares its hadoopConfiguration object, and it's hard to reason about where exactly this value will be needed. As a workaround one might materialize and cache the RDD, but the same RDD might still get recomputed.
The approach given above works everywhere, because it uses a new Configuration every time.
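As a small follow-up (a sketch): once you have that JavaPairRDD, you can turn the Text values into plain strings before any further processing. Hadoop reuses the Text objects between records, so copy them out before caching:
// Copy each reused Text record out into an immutable String.
JavaRDD<String> records = input.values().map(text -> text.toString());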
I need to collect a subset of information from log files that reside on one or more log file servers. I have the following Java code that does the initial data collection/filtering:
public String getLogServerInfo(String userName, String password, String hostNames, String id) throws Exception {
    int timeout = 5;
    String results = "";
    String[] hostNameArray = hostNames.split("\\s*,\\s*");
    for (String hostName : hostNameArray) {
        SSHClient ssh = new SSHClient();
        ssh.addHostKeyVerifier(new PromiscuousVerifier());
        try {
            Utils.writeStdOut("Parsing server: " + hostName);
            ssh.connect(hostName);
            ssh.authPassword(userName, password);
            Session s = ssh.startSession();
            try {
                String sh1 = "cat /logs/en/event/event*.log | grep \"" + id + "\" | grep TYPE=ERROR";
                Command cmd = s.exec(sh1);
                results += IOUtils.readFully(cmd.getInputStream()).toString();
                cmd.join(timeout, TimeUnit.SECONDS);
                Utils.writeStdOut("\n** exit status: " + cmd.getExitStatus());
            } finally {
                s.close();
            }
        } finally {
            ssh.disconnect();
            ssh.close();
        }
    }
    return results;
}
The results string variable looks something like this:
TYPE=ERROR, TIMESTAMP=10/03/2015 07:14:31 253 AM, HOST=server1, APPLICATION=app1, FUNCTION=function1, STATUS=null, GUID=null, etc. etc.
TYPE=ERROR, TIMESTAMP=10/03/2015 07:14:59 123 AM, HOST=server1, APPLICATION=app1, FUNCTION=function1, STATUS=null, GUID=null, etc. etc.
TYPE=ERROR, TIMESTAMP=10/03/2015 07:14:28 956 AM, HOST=server2, APPLICATION=app1, FUNCTION=function2, STATUS=null, GUID=null, etc. etc.
I need to accomplish the following:
What do I need to do to be able to sort the results by TIMESTAMP? They are unsorted right now, because I am enumerating one or more files and appending results to the end of a string.
I only want a subset of "columns" returned, such as TYPE, TIMESTAMP, and FUNCTION. I thought I could regex it in the grep, but maybe arrays would be better?
The results are simply being printed to the console/report; this output only appears for failed tests and is there for troubleshooting purposes only.
I took the list of output that you provided and put it in a file named test.txt, making sure that each "TYPE=ERROR etc. etc." was on its own line (I guess that's how your output looks, but it isn't clear).
Then I used cat test.txt | cut -d',' -f1,2,5 | sort -k2 to do what you want.
cut -d',' -f1,2,5 splits by comma and only reports tokens 1, 2, and 5 (TYPE, TIMESTAMP, FUNCTION). If you want more, you can add more numbers depending on which tokens you want
sort -k2 sorts according to the 2nd column (TIMESTAMP)
The output I get is:
TYPE=ERROR, TIMESTAMP=10/03/2015 07:14:28 956 AM, FUNCTION=function2
TYPE=ERROR, TIMESTAMP=10/03/2015 07:14:31 253 AM, FUNCTION=function1
TYPE=ERROR, TIMESTAMP=10/03/2015 07:14:59 123 AM, FUNCTION=function1
So what you should try to do is further pipe your command into | cut -d',' -f1,2,5 | sort -k2
I hope it helps.
After working on this some more, I came to find that one of the key/value pairs allows commas in its values, so cut will not work. Here is the finished product:
My grep command stays the same, collecting data from all servers:
String sh1 = "cat /logs/en/event/event*.log | grep \"" + id + "\" | grep TYPE=ERROR";
Command cmd = s.exec(sh1);
results += IOUtils.readFully(cmd.getInputStream()).toString();
Put the string into an array so I can process it line by line:
String lines[] = results.split("\r?\n");
I then used regex to get the data I needed, repeating the below for each line in the array and for as many columns as needed. It's a bit of a hack; I probably could have done it better by simply replacing the comma in the offending key/value pair, then using split() with a comma delimiter and looping over the fields I want.
lines2[i] = "";
Pattern p = Pattern.compile("TYPE=(.*?), APPLICATION=.*");
Matcher m = p.matcher(lines[i]);
if (m.find()) {
    lines2[i] += ("TYPE=" + m.group(1));
}
Finally, this will sort by Timestamp, since it is 2nd column:
Arrays.sort(lines2);
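One caveat (a sketch follows): Arrays.sort compares the lines lexically, which happens to order correctly here because every line starts with the same TYPE=ERROR, TIMESTAMP= prefix and the sample timestamps share a date, but MM/dd/yyyy strings do not sort correctly across years or AM/PM boundaries. A comparator that actually parses the timestamp is safer; the TIMESTAMP pattern below is assumed from the sample output above:
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Comparator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class LogSorter {
    // Matches e.g. "TIMESTAMP=10/03/2015 07:14:31 253 AM"
    private static final Pattern TS = Pattern.compile("TIMESTAMP=([0-9/]+ [0-9:]+ [0-9]+ [AP]M)");

    private static long timestampOf(String line) {
        Matcher m = TS.matcher(line);
        try {
            // SimpleDateFormat is not thread-safe, so this sketch creates one per call
            return m.find()
                    ? new SimpleDateFormat("MM/dd/yyyy hh:mm:ss SSS a").parse(m.group(1)).getTime()
                    : Long.MIN_VALUE;
        } catch (ParseException e) {
            return Long.MIN_VALUE; // unparsable lines sort first
        }
    }

    public static void sortByTimestamp(String[] lines) {
        Arrays.sort(lines, Comparator.comparingLong(LogSorter::timestampOf));
    }
}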
I am getting a very weird error. My program reads a CSV file.
Whenever it comes to this line:
"275081";"cernusco astreet, milan, italy";NULL
I get an error:
In the debug screen, I see that the BufferedReader has read only
"275081";"cernusco as
which is only part of the line, but it should read the whole line.
What bugs me the most is that when I simply remove that line from the CSV file, the bug disappears! The program then runs without any problem. I could just remove the line, since maybe it is bad input or whatever, but I want to understand why I am having this problem.
For better understanding, I will include a part of my code here:
reader = new BufferedReader(new FileReader(userFile));
reader.readLine(); // skip first line
while ((line = reader.readLine()) != null) {
    String[] fields = line.split("\";\"");
    int id = Integer.parseInt(stripPunctionMark(fields[0]));
    String location = fields[1];
    if (location.contains("\";")) {
        // When there is no age, the data is represented as "location";NULL.
        // We cannot split on ";" here, so check for "; and split on that.
        location = location.split("\";")[0];
        System.out.printf("Added %d at %s\n", id, location);
        people.put(id, new Person(id, location));
        numberOfPeople++;
    } else {
        int age = Integer.parseInt(stripPunctionMark(fields[2]));
        people.put(id, new Person(id, location, age));
        System.out.printf("Added %d at: %s age: %d \n", id, location, age);
        numberOfPeople++;
    }
}
Also, you can find the CSV file here, or here is a short version of the part where I encountered the error:
"275078";"el paso, texas, usa";"62"
"275079";"istanbul, eurasia, turkey";"26"
"275080";"madrid, n/a, spain";"29"
"275081";"cernusco astreet, milan, italy";NULL
"275082";"hacienda heights, california, usa";"16"
"275083";"cedar rapids, iowa, usa";"22"
This has nothing whatsoever to do with BufferedReader. It doesn't even appear in the stack trace.
It has to do with your failure to check the result and length of the array returned by String.split(). Instead you are just assuming the input is well-formed, with at least three columns in each row, and you have no defences if it isn't.
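For illustration, a defensive version of the loop might look like this (a sketch built around the sample rows above; the asker's stripPunctionMark helper is replaced here by a plain quote strip):
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class SafeCsvReader {
    public static void main(String[] args) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader("users.csv"))) {
            reader.readLine(); // skip the header line
            String line;
            while ((line = reader.readLine()) != null) {
                String[] fields = line.split("\";\"");
                if (fields.length < 2) {
                    System.err.println("Skipping malformed row: " + line);
                    continue; // never index into an array whose length you haven't checked
                }
                int id = Integer.parseInt(fields[0].replace("\"", ""));
                if (fields.length >= 3) {
                    // "id";"location";"age"
                    String location = fields[1];
                    int age = Integer.parseInt(fields[2].replace("\"", ""));
                    System.out.printf("Added %d at %s, age %d%n", id, location, age);
                } else {
                    // "id";"location";NULL -- the trailing ;NULL stays glued to the location field
                    String location = fields[1].split("\";")[0];
                    System.out.printf("Added %d at %s, age unknown%n", id, location);
                }
            }
        }
    }
}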
I am getting this string via a message broker (Stomp):
João
and this is how it is supposed to be:
João
Is there a way to reverse this in Java?
Thanks!
U+00C3 Ã c3 83 LATIN CAPITAL LETTER A WITH TILDE
U+00C2 Â c3 82 LATIN CAPITAL LETTER A WITH CIRCUMFLEX
U+00A3 £ c2 a3 POUND SIGN
U+00E3 ã c3 a3 LATIN SMALL LETTER A WITH TILDE
I'm having trouble determining how this could be a data (encoding) conversion problem. Is it possible the data is just bad?
If the data isn't bad, then we have to assume you are misinterpreting the encoding. We don't know the original encoding, and unless you're doing something unusual, Java strings are UTF-16 internally. I don't see how João encoded in any common encoding could be interpreted as João in UTF-16.
Just to be sure, I whipped up this Python script, and no match was found. I'm not entirely sure it covers all encodings or that I'm not missing a corner case, FWIW.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import pkgutil
import encodings

good = u'João'
bad = u'João'
false_positives = set(["aliases"])
found = set(name for imp, name, ispkg in pkgutil.iter_modules(encodings.__path__) if not ispkg)
found.difference_update(false_positives)
print found
for x in found:
    for y in found:
        res = None
        try:
            res = good.encode(x).decode(y)
            print res, x, y
        except:
            pass
        if not res is None:
            if res == bad:
                print "FOUND"
                exit(1)
In some cases a hack works, but it's best to prevent the problem from ever happening.
I had this problem before with a servlet that printed the correct headers, HTTP content type, and encoding on the page, but IE would submit forms encoded with latin1 instead of the correct encoding. So I created a quick, dirty hack (involving a request wrapper that detects IE and converts if needed) to fix it for new data, which worked fine. For the data in the database that was already messed up, I used the following hack.
Unfortunately my hack doesn't work perfectly for your example string, but it looks very close (just an extra à in your broken string compared to my 'theoretical cause' reproduced broken string). So perhaps my guess of "latin1" is wrong, and you should try others (such as those in that other link posted by Tomas).
package peter.test;

import java.io.UnsupportedEncodingException;

/**
 * User: peter
 * Date: 2012-04-12
 * Time: 11:02 AM
 */
public class TestEncoding {
    public static void main(String args[]) throws UnsupportedEncodingException {
        // In some cases a hack works. But best is to prevent it from ever happening.
        String good = "João";
        String bad = "João";
        // this line demonstrates what the "broken" string should look like if it is reversible
        String broken = breakString(good, bad);
        // here we show that it is fixable if broken like breakString() does it
        fixString(good, broken);
        // this line attempts to fix the string, but it is not fixable unless broken in the same way as breakString()
        fixString(good, bad);
    }

    private static String fixString(String good, String bad) throws UnsupportedEncodingException {
        byte[] bytes = bad.getBytes("latin1"); // read the Java chars as if they were latin1 (if this works, it should result in the same number of bytes as Java characters; with UTF8 it would be more bytes)
        String fixed = new String(bytes, "UTF8"); // take the raw bytes, and try to convert them to a string as if they were UTF8
        System.out.println("Good: " + good);
        System.out.println("Bad: " + bad);
        System.out.println("bytes1.length: " + bytes.length);
        System.out.println("fixed: " + fixed);
        System.out.println();
        return fixed;
    }

    private static String breakString(String good, String bad) throws UnsupportedEncodingException {
        byte[] bytes = good.getBytes("UTF8");
        String broken = new String(bytes, "latin1");
        System.out.println("Good: " + good);
        System.out.println("Bad: " + bad);
        System.out.println("bytes1.length: " + bytes.length);
        System.out.println("broken: " + broken);
        System.out.println();
        return broken;
    }
}
And the result (with Sun jdk 1.7.0_03):
Good: João
Bad: João
bytes1.length: 5
broken: João
Good: João
Bad: João
bytes1.length: 5
fixed: João
Good: João
Bad: João
bytes1.length: 6
fixed: Jo�£o