Make link inside of string clickable: Java - java

I am creating a messaging system in Java using Android Studio.
People can send messages back and forth, but if they send a link, it just shows up as regular text. I want the part that is the link to show up as a clickable link and the rest as plain text.
I searched all day on this site and others, but no one seems to do this the way I'm trying to. Most of the answers I see use a TextView to accomplish their goal. I'm using a String. Can someone please help me figure this out?
private void showMessages(){
DatabaseReference userMessageKeyRef = RootRef.child("Messages").child(messageSenderID).child(messageReceiverID);
userMessageKeyRef.addValueEventListener(new ValueEventListener() {
@Override
public void onDataChange(@NonNull DataSnapshot snapshot) {
for (DataSnapshot snapshot1 : snapshot.getChildren()) {
Messages messages = new Messages();
String strMessage = snapshot1.child("message").getValue().toString();
String strFrom = snapshot1.child("from").getValue().toString();
String strType = snapshot1.child("type").getValue().toString();
messages.setMessage(strMessage);
messages.setFrom(strFrom);
messages.setType(strType);
messagesList.add(messages);
// Pattern for recognizing a URL, based off RFC 3986
final Pattern urlPattern = Pattern.compile(
"(?:^|[\\W])((ht|f)tp(s?):\\/\\/|www\\.)"
+ "(([\\w\\-]+\\.){1,}?([\\w\\-.~]+\\/?)*"
+ "[\\p{Alnum}.,%_=?&#\\-+()\\[\\]\\*$~@!:/{};']*)",
Pattern.CASE_INSENSITIVE | Pattern.MULTILINE | Pattern.DOTALL);
// separate input by spaces ( URLs don't have spaces )
String [] parts = strMessage.split("\\s+");
// get every part
for( String item : parts ) {
if(urlPattern.matcher(item).matches()) {
//it's a good url
System.out.print("<a href='" + item + "'>" + item + "</a> ");
} else {
// it isn't a url
System.out.print(item + " ");
}
}
}
messageAdapter = new MessageAdapter(ChatActivity.this,messagesList);
userMessagesList.setAdapter(messageAdapter);
}
@Override
public void onCancelled(@NonNull DatabaseError error) {
}
});
}

There are two common ways to do this. One, like you have done, is to add html to the string. The second is to use the TextView's auto link mask feature.
Using HTML
Once you have identified URLs in your incoming string and added the appropriate html tags to turn them into links, you just need to use HtmlCompat when you go to actually display it in the TextView. You also need to make sure to call setMovementMethod or you won't be able to click the link. The advantage of using HTML is that you can have the link text be a readable phrase instead of a URL.
String txt = "This is <a href=\"http://www.google.com\">www.google.com</a>";
TextView link = findViewById(R.id.link);
link.setMovementMethod(LinkMovementMethod.getInstance());
link.setText(HtmlCompat.fromHtml(txt,HtmlCompat.FROM_HTML_MODE_LEGACY));
If you choose to go this route, your existing code just needs to be modified a bit to save the HTML string in the messages list passed to the adapter, then add the TextView calls above inside the adapter when you set the text.
String [] parts = strMessage.split("\\s+");
// replace URL parts with html links
for( int i = 0; i < parts.length; ++i ) {
if(urlPattern.matcher(parts[i]).matches()) {
parts[i] = "<a href=\"" + parts[i] + "\">" + parts[i] + "</a>";
}
}
// re-join parts back into a single string
String htmlMessage = String.join(" ", parts);
// save a list of html strings to pass to your adapter
htmlMessageStrings.add(htmlMessage);
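The string-building step can also be written without splitting on whitespace. The following is a minimal sketch in plain Java, not the code from the question: the class name MessageLinker and its deliberately simple URL pattern are assumptions made for illustration. It wraps every match in an anchor tag, adds a scheme to bare www. links, and leaves the rest of the message untouched. It does not escape other HTML in the message and will include trailing punctuation in the link, so treat it as a starting point only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MessageLinker {
    // Assumed, simplified pattern (not the question's regex):
    // an http(s):// or www. prefix followed by non-whitespace characters.
    private static final Pattern URL =
            Pattern.compile("(?i)\\b(?:https?://|www\\.)\\S+");

    /** Returns the message with every detected URL wrapped in an HTML anchor. */
    static String toHtml(String message) {
        Matcher m = URL.matcher(message);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String url = m.group();
            // A link without a scheme needs one, otherwise the href is relative.
            String href = url.regionMatches(true, 0, "www.", 0, 4) ? "http://" + url : url;
            String anchor = "<a href=\"" + href + "\">" + url + "</a>";
            // quoteReplacement keeps '$' and '\' in the URL from being read as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(anchor));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHtml("This is www.google.com"));
    }
}
```

The returned string is what would be stored in the message list and later passed to HtmlCompat.fromHtml inside the adapter.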
Using Link Mask
This method doesn't require you to edit the string at all. If you use Linkify.ALL it also recognizes things like web links, emails, phone numbers, and physical addresses - not just web links. If you only want it to recognize web links use Linkify.WEB_URLS instead. This requires a lot less code on your part - you no longer have to try to parse the string for links.
String txt = "This is www.google.com"; // no need to modify the string
TextView link = findViewById(R.id.link);
link.setAutoLinkMask(Linkify.ALL); // or Linkify.WEB_URLS
link.setText(txt);
You can also add android:autoLink="all" to the TextView XML definition instead of calling it in-code.
Both methods produce the same output: the URL in the text is rendered as a clickable link.

Related

Java regex for google maps url?

I want to parse all google map links inside a String. The format is as follows :
1st example
https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z/data=!3m1!4b1!4m5!3m4!1s0x89b7b7bcdecbb1df:0x715969d86d0b76bf!8m2!3d38.8976763!4d-77.0365298
https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z
https://www.google.com/maps/place//@38.8976763,-77.0387185,17z
https://maps.google.com/maps/place//@38.8976763,-77.0387185,17z
https://www.google.com/maps/place/@38.8976763,-77.0387185,17z
https://google.com/maps/place/@38.8976763,-77.0387185,17z
http://google.com/maps/place/@38.8976763,-77.0387185,17z
https://www.google.com.tw/maps/place/@38.8976763,-77.0387185,17z
These are all valid google map URLs (linking to White House)
Here is what I tried
String gmapLinkRegex = "(http|https)://(www\\.)?google\\.com(\\.\\w*)?/maps/(place/.*)?@(.*z)[^ ]*";
Pattern patternGmapLink = Pattern.compile(gmapLinkRegex , Pattern.CASE_INSENSITIVE);
Matcher m = patternGmapLink.matcher(s);
while (m.find()) {
logger.info("group0 = {}" , m.group(0));
String place = m.group(4);
place = StringUtils.stripEnd(place , "/"); // remove trailing '/'
place = StringUtils.stripStart(place , "place/"); // remove leading 'place/'
logger.info("place = '{}'" , place);
String latLngZ = m.group(5);
logger.info("latLngZ = '{}'" , latLngZ);
}
It works in simple situations, but it is still buggy. For example:
It needs post-processing to grab the optional place information.
It cannot handle a line containing two URLs, such as:
s = "https://www.google.com/maps/place//#38.8976763,-77.0387185,17z " +
" and http://google.com/maps/place/#38.8976763,-77.0387185,17z";
That should yield two URLs, but the regex matches the whole line.
The requirements:
The whole URL should be matched in group(0), including the trailing data part in the 1st example.
In the 1st example, if the zoom level (17z) is removed, it is still a valid Google Maps URL, but my regex cannot match it.
Extracting the optional place info should be easier.
Lat/lng extraction is a must; the zoom level is optional.
It should be able to parse multiple URLs in one line.
It should be able to process maps.google.com(.xx)/maps. I tried (www|maps\.)? but it still seems buggy.
Any suggestions to improve this regex? Thanks a lot!
The dot-asterisk
.*
will always match anything up to the end of the last URL.
You need "tighter" regexes, which match a single URL but not several with anything in between.
The "[^ ]*" might include the next URL if it is separated by something other than " ", which includes line break, tab, shift-space...
I propose (sorry, not tested in Java) to use "anything but @", "digit, minus, comma or dot", and "an optional special string followed by a tailored charset, many times":
"(http|https)://(www\.)?google\.com(\.\w*)?/maps/(place/[^@]*)?@([0123456789\.,-]*z)(\/data=[\!:\.\-0123456789abcdefmsx]+)?"
I tested the one above on a Perl-compatible regex engine (Notepad++).
Please adapt it yourself if I guessed anything wrong. The explicit list of digits can probably be replaced by "\d"; I tried to minimise assumptions about the regex flavor.
In order to match "URL" or "URL and URL", store the regex in a variable, then build "(URL and )*URL", replacing "URL" with the regex variable (assuming this is possible in Java). If the question is how to then retrieve the multiple matches: that is Java, and I cannot help. Let me know and I will delete this answer, so as not to provoke deserved downvotes ;-)
(Edited to catch the data part in, previously not seen, first example, first line; and the multi URLs in one line.)
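The "tighter regex" idea, and the open point about retrieving several matches in Java, can be sketched as follows. This is an illustration under stated assumptions, not the answerer's tested pattern: the class name GmapLinkParser and the pattern itself are hypothetical, the pattern only covers the /maps/place/ form, and it assumes the coordinates are introduced by '@' as in real Google Maps URLs. Calling Matcher.find() in a loop returns each URL separately because no part of the pattern can run across whitespace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GmapLinkParser {
    // Hypothetical pattern for illustration; handles only /maps/place/...@lat,lng URLs.
    private static final Pattern GMAP = Pattern.compile(
            "https?://(?:www\\.|maps\\.)?google(?:\\.[a-z]{2,3}){1,2}/maps/place/"
            + "([^@\\s]*)"                 // group 1: optional place text
            + "@(-?\\d+(?:\\.\\d+)?)"      // group 2: latitude
            + ",(-?\\d+(?:\\.\\d+)?)"      // group 3: longitude
            + "(?:,(\\d+(?:\\.\\d+)?)z)?"  // group 4: optional zoom level
            + "(/data=\\S*)?",             // group 5: optional trailing data part
            Pattern.CASE_INSENSITIVE);

    /** Returns one "place|lat|lng|zoom" entry per URL found in the input. */
    static List<String> parse(String s) {
        List<String> found = new ArrayList<>();
        Matcher m = GMAP.matcher(s);
        while (m.find()) {
            // Trim the stray slashes around the place text.
            String place = m.group(1).replaceAll("^/+|/+$", "");
            String zoom = m.group(4) == null ? "" : m.group(4);
            found.add(place + "|" + m.group(2) + "|" + m.group(3) + "|" + zoom);
        }
        return found;
    }

    public static void main(String[] args) {
        String s = "https://www.google.com/maps/place//@38.8976763,-77.0387185,17z "
                + " and http://google.com/maps/place/@38.8976763,-77.0387185,17z";
        parse(s).forEach(System.out::println);
    }
}
```

In this sketch m.group(0) is the whole URL including the data part, and the zoom group is simply null when the zoom level is absent.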
I wrote this regex to validate Google Maps links:
"(http:|https:)?\\/\\/(www\\.)?(maps.)?google\\.[a-z.]+\\/maps/?([\\?]|place/*[^@]*)?/*@?(ll=)?(q=)?(([\\?=]?[a-zA-Z]*[+]?)*/?@{0,1})?([0-9]{1,3}\\.[0-9]+(,|&[a-zA-Z]+=)-?[0-9]{1,3}\\.[0-9]+(,?[0-9]+(z|m))?)?(\\/?data=[\\!:\\.\\-0123456789abcdefmsx]+)?"
I tested with the following list of google maps links:
String location1 = "http://www.google.com/maps/place/21.01196755,105.86306012";
String location2 = "https://www.google.com.tw/maps/place/@38.8976763,-77.0387185,17z";
String location3 = "http://www.google.com/maps/place/21.01196755,105.86306012";
String location4 = "https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z/data=!3m1!4b1!4m5!3m4!1s0x89b7b7bcdecbb1df:0x715969d86d0b76bf!8m2!3d38.8976763!4d-77.0365298";
String location5 = "https://www.google.com/maps/place/white+house/@38.8976763,-77.0387185,17z";
String location6 = "https://www.google.com/maps/place//@38.8976763,-77.0387185,17z";
String location7 = "https://maps.google.com/maps/place//@38.8976763,-77.0387185,17z";
String location8 = "https://www.google.com/maps/place/@38.8976763,-77.0387185,17z";
String location9 = "https://google.com/maps/place/@38.8976763,-77.0387185,17z";
String location10 = "http://google.com/maps/place/@38.8976763,-77.0387185,17z";
String location11 = "https://www.google.com/maps/place/@/data=!4m2!3m1!1s0x3135abf74b040853:0x6ff9dfeb960ec979";
String location12 = "https://maps.google.com/maps?q=New+York,+NY,+USA&hl=no&sll=19.808054,-63.720703&sspn=54.337928,93.076172&oq=n&hnear=New+York&t=m&z=10";
String location13 = "https://www.google.com/maps";
String location14 = "https://www.google.fr/maps";
String location15 = "https://google.fr/maps";
String location16 = "http://google.fr/maps";
String location17 = "https://www.google.de/maps";
String location18 = "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4";
String location19 = "https://www.google.de/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4";
String location20 = "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4&layer=t&lci=com.panoramio.all,com.google.webcams,weather";
String location21 = "https://www.google.com/maps?ll=37.370157,0.615234&spn=45.047033,93.076172&t=m&z=4&layer=t";
String location22 = "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4";
String location23 = "https://www.google.de/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4";
String location24 = "https://www.google.com/maps?ll=37.0625,-95.677068&spn=45.197878,93.076172&t=h&z=4&layer=t&lci=com.panoramio.all,com.google.webcams,weather";
String location25 = "https://www.google.com/maps?ll=37.370157,0.615234&spn=45.047033,93.076172&t=m&z=4&layer=t";
String location26 = "http://www.google.com/maps/place/21.01196755,105.86306012";
String location27 = "http://google.com/maps/bylatlng?lat=21.01196022&lng=105.86298748";
String location28 = "https://www.google.com/maps/place/C%C3%B4ng+vi%C3%AAn+Th%E1%BB%91ng+Nh%E1%BA%A5t,+354A+%C4%90%C6%B0%E1%BB%9Dng+L%C3%AA+Du%E1%BA%A9n,+L%C3%AA+%C4%90%E1%BA%A1i+H%C3%A0nh,+%C4%90%E1%BB%91ng+%C4%90a,+H%C3%A0+N%E1%BB%99i+100000,+Vi%E1%BB%87t+Nam/@21.0121535,105.8443773,13z/data=!4m2!3m1!1s0x3135ab8ee6df247f:0xe6183d662696d2e9";

Android: TextView with url end with -/\n Linkify wrongly

Currently, I have a String like
http://www.example.com/defg-/\nletters
I put this String into a TextView and make the URL clickable with setAutoLinkMask(Linkify.WEB_URLS) and setMovementMethod(LinkMovementMethod.getInstance()).
However, the link is recognized incorrectly: only
http://www.example.com/defg   <--missing "-/"
is highlighted but not
http://www.example.com/defg-/  <--I want this
which results in a wrong URL.
What should I do such that the url can be recognized correctly?
The Sample Result (2nd link is wrongly recognized)
Code Implementation
txtNorm = (TextView) findViewById(R.id.txtNorm);
txtNorm.setText("http://www.example.com/defg-/");
txtNorm.setAutoLinkMask(Linkify.WEB_URLS);
txtNorm.setMovementMethod(LinkMovementMethod.getInstance());
txtCustom = (TextView) findViewById(R.id.txtCustom);
txtCustom.setText("http://www.example.com/defg-/\nletters");
txtCustom.setAutoLinkMask(Linkify.WEB_URLS);
txtCustom.setMovementMethod(LinkMovementMethod.getInstance());
I found a way you can try. First, note that a URL ending in -/ is not a common format for a standard web URL, so I made a custom pattern:
String urlRegex="[://.a-zA-Z_-]+-/"; // carefully set your pattern.
Pattern pattern = Pattern.compile(urlRegex);
String url1="press http://www.example.com/defg-/\\ or on Android& to search it on google";
text.setText(url1);
Matcher matcher1=Pattern.compile(urlRegex).matcher(url1);
while (matcher1.find()) {
final String tag = matcher1.group(0);
Linkify.addLinks(text, pattern, tag);
}
text.setMovementMethod(LinkMovementMethod.getInstance());

Jsoup get hidden email

I am parsing pages for email data. How would I get a hidden email that is generated using JavaScript? This is the page I am parsing.
If you take a look at the HTML source (using Firebug or something else), you will see that it is a link tag generated inside a div named sobi2Details_field_email and set to display:none.
This is my code for now, but the problem is with the email:
doc = Jsoup.connect(strLine).get();
Element e5=doc.getElementById("sobi2Details_field_email");
if(e5!=null)
{
emaildata=e5.child(1).absUrl("href").toString();
}
System.out.println (emaildata);
You need to do several steps because Jsoup doesn't allow you to execute JavaScript.
I reverse engineered it and this is what came out:
public static void main(final String[] args) throws IOException
{
final String url = "http://poslovno.com/kategorije.html?sobi2Task=sobi2Details&catid=71&sobi2Id=20001";
final Document doc = Jsoup.connect(url).get();
final Element e5 = doc.getElementById("sobi2Details_field_email");
System.out.println("--- this is how we start");
System.out.println(e5 + "\n\n\n\n");
// remove the xml encoding
System.out.println("---Remove XML encoding\n");
String email = org.jsoup.parser.Parser.unescapeEntities(e5.toString(), false);
System.out.println(email + "\n\n\n\n");
// remove the concatenation with ' + '
System.out.println("--- Remove concatenation (all: ' + ')");
email = email.replaceAll("' \\+ '", "");
System.out.println(email + "\n\n\n\n");
// extract the email address variables
System.out.println("--- Remove useless lines");
Matcher matcher = Pattern.compile("var addy.*var addy", Pattern.MULTILINE + Pattern.DOTALL).matcher(email);
matcher.find();
email = matcher.group();
System.out.println(email + "\n\n\n\n");
// get the two strings enclosed by '' and concatenate them
System.out.println("--- Extract the email address");
matcher = Pattern.compile("'(.*)'.*'(.*)'", Pattern.MULTILINE + Pattern.DOTALL).matcher(email);
matcher.find();
email = matcher.group(1) + matcher.group(2);
System.out.println(email);
}
If something is generated dynamically with JavaScript on the client side after the response from the server is complete, then there is no other way than:
Reverse engineering: figure out what the script does, and try to implement the same behaviour.
Download the JavaScript from the processed page, and use Java's JavaScript processor to execute the script and get the result (yes, it is possible, and I was forced to do such a thing). Java's scripting API (javax.script) is the usual starting point for evaluating JavaScript from Java.

How to get h2 Tag of a table using Jsoup

I need some help scraping a webpage with Jsoup. I want to parse player profiles from the hcfactions webpage and gather their kills and deaths. The problem I'm running into is that each profile page is dynamically created and will only have those tables if the player has kills or deaths. So in order to tell which table I'm parsing, I need to get the header text that's set after the call.
example web page: http://www.hcfactions.net/index.php?action=playerinfo&player=Djmaddox.
Below is a html segment from the web page I'm scraping:
<table class='table-bordered'><h2 style='text-align:center'>Deaths</h2>
<tr><td>Date</td><td>Reason</td><td>Details</td></tr><tr><td>Dec 11 5:27pm CST</td>.....
I have this code that pulls the tables and counts entries, but it won't pull the h2 tags with it for me to select.
public void getPlayerDetails(String name) {
String data = "";
Avatar temp = _db.getPlayer(name);
playerUrl = "http://www.hcfactions.net/index.php?action=playersearch&player=" + name;
try {
// data = Jsoup.connect(url)
// .url(url).get().html();
playerDoc = Jsoup.connect(playerUrl).get();
} catch (IOException ex) {
Logger.getLogger(JParser.class.getName()).log(Level.SEVERE, null, ex);
}
if (playerDoc.select("table").size() == 1) {
return;
} else if (playerDoc.select("table").size() >= 2) {
for (int x = 1; x < playerDoc.select("table").size(); x++) {
System.out.println("deaths");
Element table = playerDoc.select("table").get(x);
Iterator<Element> ite = table.select("tr").iterator();
int count = 0;
while (ite.hasNext()) {
data = ite.next().text();
count++;
}
if (count > 0) {
temp.setDeaths(count - 1);
}
}
}
}
The <h2> tag is in an invalid position (directly inside the <table>). I think that's why Jsoup cannot find it. You have to extract it yourself with regular expressions. You can get the content of the <h2> with the following code:
String tableToString = "<table class='table-bordered'><h2 style='text-align:center'>Deaths</h2>" + "<tr>" + "<td>Date</td>" + "<td>Reason</td>" + "<td>Details</td>" + "</tr>" + "</table>";
String regex = "<h2[^>]*>(.*?)</h2>"; // reluctant match, so a second <h2> in the input is not swallowed
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(tableToString);
if (matcher.find()) {
System.out.println(matcher.group(1));
}
You can init tableToString with table.toString() from your code.
As ka3ak says, the <h2> is mispositioned. But you don't have to abandon your parser and resort to regex for that. Assuming Jsoup is a decent HTML parser (I have never used it myself), the <h2> element should be the element immediately preceding the <table> element. Get your 'select' statement to look for it there.
Elements headers=playerDoc.select("div.span10.offset1 h2");
IMHO your selectors seem a little overcomplicated, but maybe it has to be like that. Anyway, the snippet above will get you every h2 tag present in the proper container.
Later on you can select the required tables like this: Elements tables=playerDoc.select("div.span10.offset1 table"); and dig the data out of them. The headers will be in the same order as the tables, of course. I think my job is done here :)

Java : replacing text URL with clickable HTML link

I am trying to replace the URLs inside a String with browser-compatible HTML links.
My initial String looks like this :
"hello, i'm some text with an url like http://www.the-url.com/ and I need to have an hypertext link !"
What I want to get is a String looking like :
"hello, i'm some text with an url like <a href="http://www.the-url.com/">http://www.the-url.com/</a> and I need to have an hypertext link !"
I can catch URL with this code line :
String withUrlString = myString.replaceAll(".*://[^<>[:space:]]+[[:alnum:]/]", "HereWasAnURL");
Maybe the regexp needs some correction, but it seems to work fine; I still need to test it further.
So the question is how to keep the expression caught by the regexp and just add what's needed to create the link: <a href="caught string">caught string</a>
Thanks in advance for your interest and responses !
Try to use:
myString.replaceAll("(.*://[^<>[:space:]]+[[:alnum:]/])", "<a href=\"$1\">HereWasAnURL</a>");
I didn't check your regex.
By using () you can create groups. The $1 indicates the group index.
$1 will replace the url.
I asked a similar question.
Some examples: Capturing Text in a Group in a regular expression
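The group reference described above can be tried in isolation. The following is a minimal sketch, assuming a simplified URL pattern (a scheme, "://", then characters other than whitespace and angle brackets) rather than the regex from the question; the class name UrlToAnchor is made up for illustration.

```java
final class UrlToAnchor {
    // Assumed, simplified URL pattern: scheme + "://" + a run of
    // characters that are neither whitespace nor angle brackets.
    private static final String URL_REGEX = "([a-zA-Z][a-zA-Z0-9+.-]*://[^\\s<>]+)";

    static String link(String text) {
        // $1 in the replacement stands for whatever capture group 1 matched.
        return text.replaceAll(URL_REGEX, "<a href=\"$1\">$1</a>");
    }

    public static void main(String[] args) {
        System.out.println(link("some text with an url like http://www.the-url.com/ and more"));
    }
}
```

Because the matched text is reused through $1, the same call works for any number of URLs in the input.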
public static String textToHtmlConvertingURLsToLinks(String text) {
if (text == null) {
return text;
}
String escapedText = HtmlUtils.htmlEscape(text);
return escapedText.replaceAll("(\\A|\\s)((http|https|ftp|mailto):\\S+)(\\s|\\z)",
"$1<a href=\"$2\">$2</a>$4");
}
There may be better REGEXs out there, but this does the trick as long as there is white space after the end of the URL or the URL is at the end of the text. This particular implementation also uses org.springframework.web.util.HtmlUtils to escape any other HTML that may have been entered.
For anybody who is searching for a more robust solution, I can suggest the Twitter Text Libraries.
Replacing the URLs with this library works like this:
new Autolink().autolink(plainText)
The code below replaces links starting with "http" or "https", then links starting with just "www.", and finally email addresses.
Pattern httpLinkPattern = Pattern.compile("(http[s]?)://(www\\.)?([\\S&&[^.@]]+)(\\.[\\S&&[^@]]+)");
Pattern wwwLinkPattern = Pattern.compile("(?<!http[s]?://)(www\\.+)([\\S&&[^.@]]+)(\\.[\\S&&[^@]]+)");
Pattern mailAddressPattern = Pattern.compile("[\\S&&[^@]]+@([\\S&&[^.@]]+)(\\.[\\S&&[^@]]+)");
String textWithHttpLinksEnabled =
"ajdhkas www.dasda.pl/asdsad?asd=sd www.absda.pl maiandrze@asdsa.pl klajdld http://dsds.pl httpsda http://www.onet.pl https://www.onsdas.plad/dasda";
if (Objects.nonNull(textWithHttpLinksEnabled)) {
Matcher httpLinksMatcher = httpLinkPattern.matcher(textWithHttpLinksEnabled);
textWithHttpLinksEnabled = httpLinksMatcher.replaceAll("<a href=\"$0\" target=\"_blank\">$0</a>");
final Matcher wwwLinksMatcher = wwwLinkPattern.matcher(textWithHttpLinksEnabled);
textWithHttpLinksEnabled = wwwLinksMatcher.replaceAll("<a href=\"http://$0\" target=\"_blank\">$0</a>");
final Matcher mailLinksMatcher = mailAddressPattern.matcher(textWithHttpLinksEnabled);
textWithHttpLinksEnabled = mailLinksMatcher.replaceAll("<a href=\"mailto:$0\">$0</a>");
System.out.println(textWithHttpLinksEnabled);
}
Prints:
ajdhkas <a href="http://www.dasda.pl/asdsad?asd=sd" target="_blank">www.dasda.pl/asdsad?asd=sd</a> <a href="http://www.absda.pl" target="_blank">www.absda.pl</a> <a href="mailto:maiandrze@asdsa.pl">maiandrze@asdsa.pl</a> klajdld <a href="http://dsds.pl" target="_blank">http://dsds.pl</a> httpsda <a href="http://www.onet.pl" target="_blank">http://www.onet.pl</a> <a href="https://www.onsdas.plad/dasda" target="_blank">https://www.onsdas.plad/dasda</a>
Assuming your regex captures the correct info, you can use backreferences in your substitution. See the Java regexp tutorial.
In that case, you'd do the following (in Java, the replacement string refers to group 1 as $1, not \1):
myString.replaceAll(....., "<a href=\"$1\">$1</a>")
In case of multiline text you can use this:
text.replaceAll("(\\s|\\^|\\A)((http|https|ftp|mailto):\\S+)(\\s|\\$|\\z)",
"$1<a href='$2'>$2</a>$4");
And here is full example of my code where I need to show user's posts with urls in it:
private static final Pattern urlPattern = Pattern.compile(
"(\\s|\\^|\\A)((http|https|ftp|mailto):\\S+)(\\s|\\$|\\z)");
String userText = ""; // user content from db
String replacedValue = HtmlUtils.htmlEscape(userText);
replacedValue = urlPattern.matcher(replacedValue).replaceAll("$1<a href='$2'>$2</a>$4");
replacedValue = StringUtils.replace(replacedValue, "\n", "<br>");
System.out.println(replacedValue);
