Suppose I have a java.util.Set<String> containing "200Y2Z", "20012Y", "200829", "200T2K", which all follow the same pattern "200$2$", where "$" is a placeholder. What is the most efficient way to get a Set of just the unique codes from such strings in Java?
Input: java.util.Set<String> of "200Y2Z", "20012Y", "200829", "200T2K"
Expected output: java.util.Set<String> of "YZ", "1Y", "89", "TK"
My attempt:
// needs java.util.* and java.text.StringCharacterIterator
public static void getOutPut()
{
    String pattern = "200$2$";
    Set<String> input = new HashSet<String>(Arrays.asList("200Y2Z", "20012Y", "200829", "200T2K"));
    Set<String> output = new HashSet<String>();
    StringBuffer out = null;
    for(String in : input)
    {
        out = new StringBuffer();
        // walk the pattern, not the input: the input strings never contain '$'
        StringCharacterIterator sci = new StringCharacterIterator(pattern);
        while (sci.current() != StringCharacterIterator.DONE){
            if (sci.current() == '$')
            {
                out.append(in.charAt(sci.getIndex()));
            }
            sci.next();
        }
        output.add(out.toString());
    }
    System.out.println(output);
}
It works, but is there a more efficient way to achieve this? I need to do it for more than 1000K codes.
Get the indexes of the placeholders in the pattern:
int i = pattern.indexOf('$');
You'll have to iterate to obtain all the indexes:
i = pattern.indexOf('$', lastIndex + 1);
The loop and the checks are up to you.
Then use charAt with those indexes on each element of the set.
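A minimal sketch of the approach described above, assuming every code is at least as long as the pattern. The class name PlaceholderCodes and its method names are made up for illustration; the point is that the placeholder positions are computed once and then reused for every code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PlaceholderCodes {

    // Collect the position of every placeholder character in the pattern, once.
    public static int[] placeholderIndexes(String pattern, char placeholder) {
        List<Integer> found = new ArrayList<>();
        for (int i = pattern.indexOf(placeholder); i >= 0; i = pattern.indexOf(placeholder, i + 1)) {
            found.add(i);
        }
        int[] indexes = new int[found.size()];
        for (int k = 0; k < indexes.length; k++) {
            indexes[k] = found.get(k);
        }
        return indexes;
    }

    // Assumption: each input string is at least as long as the pattern.
    public static Set<String> extract(Set<String> input, String pattern, char placeholder) {
        int[] indexes = placeholderIndexes(pattern, placeholder);
        Set<String> output = new HashSet<>();
        char[] buffer = new char[indexes.length];
        for (String code : input) {
            // Only the placeholder positions are read; the rest of the string is skipped.
            for (int k = 0; k < indexes.length; k++) {
                buffer[k] = code.charAt(indexes[k]);
            }
            output.add(new String(buffer));
        }
        return output;
    }

    public static void main(String[] args) {
        Set<String> input = new HashSet<>(Arrays.asList("200Y2Z", "20012Y", "200829", "200T2K"));
        System.out.println(extract(input, "200$2$", '$'));
    }
}
```

The result is a HashSet, so the order in which the codes print is not guaranteed.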
I am working on a project where I will be given two files; one with jumbled up words and the other with real words. I then need to print out the list of jumbled up words in alphabetical order with its matching real word(s) next to it. The catch is that there can be multiple real words per jumbled up word.
For example:
cta cat
ezrba zebra
psot post stop
I completed the program without accounting for multiple real words per jumbled-up word, so in my HashMap I had to change <String, String> to <String, List<String>>, but after doing this I ran into some errors in the .get and .put methods. How can I store multiple words per key for each jumbled-up word? Thank you for your help.
My code is below:
import java.io.*;
import java.util.*;
public class Project5
{
public static void main (String[] args) throws Exception
{
BufferedReader dictionaryList = new BufferedReader( new FileReader( args[0] ) );
BufferedReader scrambleList = new BufferedReader( new FileReader( args[1] ) );
HashMap<String, List<String>> dWordMap = new HashMap<String, List<String>>();
ArrayList<String> scrambled = new ArrayList<String>();
while (dictionaryList.ready())
{
String word = dictionaryList.readLine();
//throw in an if statement to account for multiple words
dWordMap.put(createKey(word), word);
}
dictionaryList.close();
ArrayList<String> scrambledList = new ArrayList<String>();
while (scrambleList.ready())
{
String scrambledWord = scrambleList.readLine();
scrambledList.add(scrambledWord);
}
scrambleList.close();
Collections.sort(scrambledList);
for (String words : scrambledList)
{
String dictionaryWord = dWordMap.get(createKey(words));
System.out.println(words + " " + dictionaryWord);
}
}
private static String createKey(String word)
{
char[] characterWord = word.toCharArray();
Arrays.sort(characterWord);
return new String(characterWord);
}
}
You could do something like this.
Replace the line:
dWordMap.put(createKey(word), word);
with:
String key = createKey(word);
List<String> scrambled = dWordMap.get(key);
//make sure that scrambled words list is initialized in the map for the sorted key.
if(scrambled == null){
scrambled = new ArrayList<String>();
dWordMap.put(key, scrambled);
}
//add the word to the list
scrambled.add(word);
In the line
dWordMap.put(createKey(word), word);
dWordMap is of type HashMap<String, List<String>>, so the value passed to put must be a List<String>, not word, which is a String.
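A minimal sketch of a list-per-key map covering both the put side and the get side. It assumes Java 8 for computeIfAbsent, and the class name AnagramLookup and the in-memory word list are illustrative stand-ins for the two input files in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class AnagramLookup {

    // Same key function as in the question: the word's letters in sorted order.
    public static String createKey(String word) {
        char[] letters = word.toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }

    // Group the dictionary words by their sorted-letter key.
    public static Map<String, List<String>> buildIndex(List<String> dictionary) {
        Map<String, List<String>> index = new HashMap<>();
        for (String word : dictionary) {
            // Creates the list the first time a key is seen, then appends to it.
            index.computeIfAbsent(createKey(word), k -> new ArrayList<>()).add(word);
        }
        return index;
    }

    // Build one output line: the scrambled word followed by all of its matches.
    public static String describe(String scrambled, Map<String, List<String>> index) {
        List<String> matches = index.get(createKey(scrambled)); // a List<String>, not a String
        if (matches == null) {
            return scrambled; // no dictionary word uses exactly these letters
        }
        return scrambled + " " + String.join(" ", matches);
    }

    public static void main(String[] args) {
        Map<String, List<String>> index = buildIndex(Arrays.asList("cat", "zebra", "post", "stop"));
        for (String scrambled : Arrays.asList("cta", "ezrba", "psot")) {
            System.out.println(describe(scrambled, index));
        }
    }
}
```

Matches appear in the order the dictionary words were read, because each list keeps insertion order.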
I have a set of strings like this
A_2007-04, A_2007-09, A_Agent, A_Daily, A_Execute, A_Exec, B_Action, B_HealthCheck
I want output as:
Key = A, Value = [2007-04,2007-09,Agent,Execute,Exec]
Key = B, Value = [Action,HealthCheck]
I'm using HashMap to do this
pckg:{A,B}
count:total no of strings
reports:set of strings
Logic I used is nested loop:
for (String l : reports[i]) {
for (String r : pckg) {
String[] g = l.split("_");
if (g[0].equalsIgnoreCase(r)) {
report.add(g[1]);
dirFiles.put(g[0], report);
} else {
break;
}
}
}
I'm getting output as
Key = A, Value = [2007-04,2007-09,Agent,Execute,Exec]
How to get second key?
Can someone suggest logic for this?
Assuming that you use Java 8, it can be done using computeIfAbsent to initialize the List of values when a key is new, as follows:
List<String> tokens = Arrays.asList(
"A_2007-04", "A_2007-09", "A_Agent", "A_Daily", "A_Execute",
"A_Exec", "P_Action", "P_HealthCheck"
);
Map<String, List<String>> map = new HashMap<>();
for (String token : tokens) {
String[] g = token.split("_");
map.computeIfAbsent(g[0], key -> new ArrayList<>()).add(g[1]);
}
In terms of raw code this should do what I think you are trying to achieve:
// Create a collection of String any way you like, but for testing
// I've simply split a flat string into an array.
String flatString = "A_2007-04,A_2007-09,A_Agent,A_Daily,A_Execute,A_Exec,"
+ "P_Action,P_HealthCheck";
String[] reports = flatString.split(",");
Map<String, List<String>> mapFromReportKeyToValues = new HashMap<>();
for (String report : reports) {
int underscoreIndex = report.indexOf("_");
String key = report.substring(0, underscoreIndex);
String newValue = report.substring(underscoreIndex + 1);
List<String> existingValues = mapFromReportKeyToValues.get(key);
if (existingValues == null) {
// This key hasn't been seen before, so create a new list
// to contain values which belong under this key.
existingValues = new ArrayList<>();
mapFromReportKeyToValues.put(key, existingValues);
}
existingValues.add(newValue);
}
System.out.println("Generated map:\n" + mapFromReportKeyToValues);
Though I recommend tidying it up and organising it into a method or methods as fits your project code.
Doing this with Map<String, ArrayList<String>> will be another good approach I think:
String reports[] = {"A_2007-04", "A_2007-09", "A_Agent", "A_Daily",
"A_Execute", "A_Exec", "P_Action", "P_HealthCheck"};
Map<String, ArrayList<String>> map = new HashMap<>();
for (String rep : reports) {
String s[] = rep.split("_");
String prefix = s[0], suffix = s[1];
ArrayList<String> list = new ArrayList<>();
if (map.containsKey(prefix)) {
list = map.get(prefix);
}
list.add(suffix);
map.put(prefix, list);
}
// Print
for (Map.Entry<String, ArrayList<String>> entry : map.entrySet()) {
String key = entry.getKey();
ArrayList<String> valueList = entry.getValue();
System.out.println(key + " " + valueList);
}
for (String l : reports[i]) {
String[] g = l.split("_");
for (String r : pckg) {
if (g[0].equalsIgnoreCase(r)) {
report = dirFiles.get(g[0]);
if(report == null){ report = new ArrayList<String>(); } //create new report
report.add(g[1]);
dirFiles.put(g[0], report);
}
}
}
Removed the else part of the if condition. You were using break there, which exits the inner loop, so you never got to evaluate any key beyond the first one.
Added a check for existing values, as suggested by Orin2005.
Also, I moved the statement String[] g = l.split("_"); outside the inner loop so that it doesn't get executed multiple times.
Hello people of the internet,
We're having the following problem with the Stanford NLP API:
We have a String that we want to transform into a list of sentences.
First, we used String sentenceString = Sentence.listToString(sentence); but listToString does not return the original text because of the tokenization. Now we tried to use listToOriginalTextString in the following way:
private static List<String> getSentences(String text) {
Reader reader = new StringReader(text);
DocumentPreprocessor dp = new DocumentPreprocessor(reader);
List<String> sentenceList = new ArrayList<String>();
for (List<HasWord> sentence : dp) {
String sentenceString = Sentence.listToOriginalTextString(sentence);
sentenceList.add(sentenceString.toString());
}
return sentenceList;
}
This does not work. Apparently we have to set an attribute "invertible" to true, but we don't know how. How can we do this?
In general, how do you use listToOriginalTextString properly? What preparations do you need?
sincerely,
Khayet
If I understand correctly, you want to get the mapping of tokens to the original input text after tokenization. You can do it like this:
//split via PTBTokenizer (PTBLexer)
List<CoreLabel> tokens = PTBTokenizer.coreLabelFactory().getTokenizer(new StringReader(text)).tokenize();
//do the processing using stanford sentence splitter (WordToSentenceProcessor)
WordToSentenceProcessor<CoreLabel> processor = new WordToSentenceProcessor<CoreLabel>();
List<List<CoreLabel>> splitSentences = processor.process(tokens);
//for each sentence
for (List<CoreLabel> s : splitSentences) {
//for each word
for (CoreLabel token : s) {
//here you can get the token value and position like;
//token.value(), token.beginPosition(), token.endPosition()
}
}
String sentenceStr = sentence.get(CoreAnnotations.TextAnnotation.class);
This gives you the original text. An example from the JSONOutputter.java file:
l2.set("id", sentence.get(CoreAnnotations.SentenceIDAnnotation.class));
l2.set("index", sentence.get(CoreAnnotations.SentenceIndexAnnotation.class));
l2.set("sentenceOriginal",sentence.get(CoreAnnotations.TextAnnotation.class));
l2.set("line", sentence.get(CoreAnnotations.LineNumberAnnotation.class));
I have a redirect URI of the form https://stackexchange.com/oauth/login_success#access_token=token&expires=5678. I am trying to get the access token from this URL. I tried the following methods:
uri.getQueryParameter("access_token"); // returns null, since it is not a query param
uri.getFragment(); // returns "access_token=token&expires=5678", so I need to separate it again
Is there a direct method? Please help.
Someone might find this helpful:
String queryAfterFragment = uri.getFragment();
String dummy_url = "http://localhost?" + queryAfterFragment;
Uri dummy_uri = Uri.parse(dummy_url);
String access_token = dummy_uri.getQueryParameter("access_token");
Works like a charm and easy to use, thank me later :-)
Simple and elegant solution which can get the values which you want:
public static Map<String, String> parseUrlFragment (String url) {
Map<String, String> output = new LinkedHashMap<> ();
String[] keys = url.split ("&");
for (String key : keys) {
String[] values = key.split ("=", 2); // limit 2 keeps any '=' that appears inside the value
output.put (values[0], (values.length > 1 ? values[1] : ""));
}
return output;
}
It uses a LinkedHashMap to hold the values, so it can be used like this:
Map<String, String> data = parseUrlFragment (uri.getFragment ());
data.get ("access_token") // token
data.get ("expires") // 5678
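For a variant that does not depend on Android's Uri class, here is a hedged sketch using java.net.URI from the standard library. The class name FragmentParams is hypothetical, and it assumes the fragment is formatted like a query string (key=value pairs joined by "&") with percent-encoded values.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class FragmentParams {

    // Parse "#key=value&key2=value2" from a URL into an insertion-ordered map.
    public static Map<String, String> parse(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String fragment = URI.create(url).getRawFragment();
        if (fragment == null || fragment.isEmpty()) {
            return params; // no fragment, nothing to parse
        }
        for (String pair : fragment.split("&")) {
            // Limit 2: only the first '=' separates key from value.
            String[] keyValue = pair.split("=", 2);
            String value = keyValue.length > 1 ? decode(keyValue[1]) : "";
            params.put(decode(keyValue[0]), value);
        }
        return params;
    }

    // Form-style decoding: "%3D" becomes "=", and "+" becomes a space.
    private static String decode(String raw) {
        try {
            return URLDecoder.decode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> data = parse(
            "https://stackexchange.com/oauth/login_success#access_token=token&expires=5678");
        System.out.println(data.get("access_token"));
        System.out.println(data.get("expires"));
    }
}
```

Decoding happens after splitting, so an encoded "&" or "=" inside a value does not break the parse.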
You can try in this way
String str = "https://stackexchange.com/oauth/login_success#access_token=token&expires=5678";
int indexOfHash = str.indexOf("#");
// now you can substring from this
String subStr = str.substring(indexOfHash+1, str.length());
System.out.println(subStr);
// now you can substring from &
String sStr=subStr.substring(0,subStr.indexOf("&"));
System.out.println(sStr);
// now you can get token
String[] arr=sStr.split("=");
System.out.println(arr[0]);
System.out.println(arr[1]);
Output:
access_token=token&expires=5678
access_token=token
access_token
token
You could use the String method split(String) with a regex:
str.split("#|&|=")
This splits the string on any of the three characters and returns an array of all the parts.
String s =
"https://stackexchange.com/oauth/login_success#access_token=token&expires=5678";
final String[] split = s.split("#|&|=");
for (String s1 : split) {
System.out.println(s1);
}
Output:
https://stackexchange.com/oauth/login_success
access_token
token
expires
5678
I need to obfuscate values of parameters matching
password, tokenID
Sample query string:
visitorNo=89&password=demo&tokenID=yxr56
Should be obfuscated to:
visitorNo=89&password=$$&tokenID=$$
What I did:
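String[] parameters = queryString.split("&");
StringBuffer qS = new StringBuffer();
for(String param : parameters) {
    String[] keyValue = param.split("=", 2);
    qS.append(keyValue[0]);
    qS.append("=");
    // decide once per parameter; appending inside the loop would write one value per list entry
    boolean obfuscate = false;
    for(String paramToObfuscate : paramsToObfuscate) {
        if(paramToObfuscate.equals(keyValue[0])) {
            obfuscate = true;
            break;
        }
    }
    if(obfuscate) {
        qS.append("$$");
    }
    else if(keyValue.length > 1) {
        qS.append(keyValue[1]);
    }
    qS.append("&");
}
// length() is a method; this drops the trailing "&"
String queryStr = qS.substring(0, qS.length() - 1);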
For only two parameters, there's no faster way. If there were tens of them, you could use
Set<String> paramsToObfuscate = new HashSet<String>();
and Set.contains, which is surely faster than tens of tests.
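A minimal sketch of the Set-based variant, using the parameter names from the question. The class name QueryObfuscator and the choice to leave parameters without an "=" untouched are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class QueryObfuscator {

    // Names of the parameters whose values must be hidden.
    private static final Set<String> PARAMS_TO_OBFUSCATE =
        new HashSet<>(Arrays.asList("password", "tokenID"));

    public static String obfuscate(String queryString) {
        StringBuilder result = new StringBuilder(queryString.length());
        for (String param : queryString.split("&")) {
            if (result.length() > 0) {
                result.append('&'); // separator between parameters, never trailing
            }
            int eq = param.indexOf('=');
            String key = eq >= 0 ? param.substring(0, eq) : param;
            // One hash lookup per parameter instead of a loop over all names.
            if (eq >= 0 && PARAMS_TO_OBFUSCATE.contains(key)) {
                result.append(key).append("=$$");
            } else {
                result.append(param);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(obfuscate("visitorNo=89&password=demo&tokenID=yxr56"));
        // prints visitorNo=89&password=$$&tokenID=$$
    }
}
```

Appending the separator before each parameter after the first avoids having to trim a trailing "&" at the end.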