I know that better methods of URL validation exist, and worse methods than this example are probably common. But can someone tell me what is wrong with the following URL validation code when url = "Some random english sentence"?
I see that the validation fails. I don't know why.
/**
 * Checks if the url is ok.
 * THIS METHOD DOESN'T SEEM TO WORK WELL
 *
 * @param url the string to check
 * @return true if the url is ok, false otherwise
 */
static public boolean isUrlOk(String url) {
    try {
        URL urlObject = new URL(url);
        String host = urlObject.getHost(); // never used; the constructor alone decides the outcome
        return true;
    } catch (Exception e) {
        return false;
    }
}
The problem: it returns false for plain sentences like this, and I don't see why.
Modify the catch block to add e.printStackTrace() to get the details of why it fails.
If you try it with "Some random english sentence", it will fail with a MalformedURLException because no protocol is specified.
According to the java.net.URL API doc at http://docs.oracle.com/javase/7/docs/api/java/net/URL.html#URL(java.lang.String):
MalformedURLException - if no protocol is specified, or an unknown protocol is found, or spec is null.
Since no scheme was specified, the exception was thrown.
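For illustration, a minimal sketch of adding the stack trace (the message text below is what the JDK typically produces; the snippet assumes the usual java.net imports):
try {
    URL urlObject = new URL("Some random english sentence");
} catch (MalformedURLException e) {
    // typically prints: java.net.MalformedURLException: no protocol: Some random english sentence
    e.printStackTrace();
}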
Related
I'm using a Java program to get NSE share price data from NSE's website, for example:
url = new URL("https://archives.nseindia.com/archives/equities/bhavcopy/pr/PR071122.zip");
f = new File("NSEData.zip");
try {
    FileUtils.copyURLToFile(url, f);
} catch (Exception e) {
    // exceptions are silently ignored here
}
The above code works for dates where market data exists, like 07/11/22. However, where data does not exist, like on 08/11/22, the URL is broken and the copyURLToFile line gets stuck indefinitely at runtime (replacing 071122 with 081122 in the URL above reproduces this). Is there an easy way for the program to recognize that the URL for a certain date is broken (e.g. https://archives.nseindia.com/archives/equities/bhavcopy/pr/PR081122.zip) and therefore skip it and continue past the try block without getting stuck?
My current workaround is to check whether a certain date is a market holiday using a DayOfWeek check as well as a HashSet containing a list of public holidays, but this is not perfect.
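For reference, a minimal sketch of that kind of check (the holiday set below is a hypothetical example entry, not the real NSE calendar):
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.util.Set;

class MarketCalendar {
    // Hypothetical holiday list; the real one has to be maintained by hand, which is why it is imperfect.
    private static final Set<LocalDate> HOLIDAYS = Set.of(
            LocalDate.of(2022, 11, 8) // example entry only
    );

    static boolean isTradingDay(LocalDate date) {
        DayOfWeek day = date.getDayOfWeek();
        boolean weekend = (day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY);
        return !weekend && !HOLIDAYS.contains(date);
    }
}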
So, basically, your URL returns a 500 error when an invalid date is requested. You can simply use another overload of copyURLToFile available in FileUtils, the one that takes connection and read timeouts:
https://commons.apache.org/proper/commons-io/javadocs/api-2.5/org/apache/commons/io/FileUtils.html#copyURLToFile(java.net.URL,%20java.io.File,%20int,%20int)
Example code (adjust the timeouts to your requirements):
var url = new URL("https://archives.nseindia.com/archives/equities/bhavcopy/pr/PR081122.zip");
var f = new File("NSEData.zip");
try {
    // connection timeout and read timeout of 5 seconds each,
    // so a broken URL fails fast instead of hanging forever
    FileUtils.copyURLToFile(url, f, 5000, 5000);
} catch (Exception e) {
    // reached when the download fails or times out
}
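An alternative sketch (my own suggestion, not part of the answer above, exceptions omitted for brevity): ask the server for the response code first with a HEAD request and only download when the file actually exists.
URL url = new URL("https://archives.nseindia.com/archives/equities/bhavcopy/pr/PR081122.zip");
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod("HEAD");      // fetch headers only, not the body
conn.setConnectTimeout(5000);
conn.setReadTimeout(5000);
int code = conn.getResponseCode();  // e.g. 200 when data exists, 404/500 when it does not
conn.disconnect();
if (code == HttpURLConnection.HTTP_OK) {
    FileUtils.copyURLToFile(url, new File("NSEData.zip"), 5000, 5000);
} else {
    System.out.println("No data for this date (HTTP " + code + "), skipping.");
}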
I am new to the Stack Overflow forum. I have a question about remediating Fortify scan issues.
An HP Fortify scan reports a Resource Injection issue for the following code.
String testUrl = "http://google.com";
URL url = null;
try {
    url = new URL(testUrl);
} catch (MalformedURLException mue) {
    log.error("MalformedUrlException URL " + testUrl + " Exception : " + mue);
}
In the above code, Fortify flags Resource Injection on the line url = new URL(testUrl);
To remediate this, I changed the code to validate the URL with ESAPI:
String testUrl = "http://google.com";
URL url = null;
String canonURL = null;
try {
    canonURL = ESAPI.encoder().canonicalize(testUrl, false, false);
    if (ESAPI.validator().isValidInput("URLContext", canonURL, "URL", canonURL.length(), false)) {
        url = new URL(canonURL);
    } else {
        log.error("Invalid URL passed: " + canonURL);
    }
} catch (MalformedURLException mue) {
    log.error("MalformedUrlException URL " + canonURL + " Exception : " + mue);
}
However, the Fortify scan still reports the error; the change does not remediate the issue. Am I doing anything wrong?
Any solution will help a lot.
Thanks,
Marimuthu.M
I think that the real issue here is not that the URL may be somehow malformed, but that the URL may not reference a valid site. More specifically, if I, the bad guy, am able to cause your URL to point to my web site, then you obtain data from my location that is not tested, and I can return data that may be used to compromise your system. I might use that to, say, return a record for "bob the bad guy" that makes Bob look like a good guy.
I suspect that in your code you do not set a hard-coded value in a string, since this issue is usually described with words such as:
When an application permits a user input to define a resource, like a
file name or port number, this data can be manipulated to execute or
access different resources.
(see https://www.owasp.org/index.php/Resource_Injection)
I think that the proper response will be some combination of:
Do not take the value directly from the user; instead, use the input to choose from your own internal list (a sketch follows below).
Argue that the value came from a trusted source. For example, read from a strictly controlled database or configuration file.
You do not need to remove the warnings, you need to demonstrate that you understand the risk and indicate why it is OK to use the value in your case.
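For the first option, a minimal sketch of choosing from an internal list instead of passing the raw input to new URL (the map contents, userInput and log are placeholders, not from the original code):
// Hypothetical allow-list: the user supplies a key, the application supplies the URL.
Map<String, String> allowedUrls = new HashMap<>();
allowedUrls.put("search", "http://google.com");
allowedUrls.put("docs", "http://docs.oracle.com");

String resolved = allowedUrls.get(userInput);  // never feed userInput to new URL() directly
if (resolved == null) {
    log.error("Unknown resource key: " + userInput);
} else {
    try {
        URL url = new URL(resolved);           // the value is now one that you control
    } catch (MalformedURLException mue) {
        log.error("MalformedUrlException URL " + resolved + " Exception : " + mue);
    }
}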
boolean isValidInput(java.lang.String context,
                     java.lang.String input,
                     java.lang.String type,
                     int maxLength,
                     boolean allowNull)
              throws IntrusionException
The type field in the isValidInput function selects a regular expression or pattern to match your testUrl against. For example:
try {
    ESAPI.validator().getValidInput("URI_VALIDATION", requestUri, "URL", 80, false);
} catch (ValidationException e) {
    System.out.println("Validation exception");
    e.printStackTrace();
} catch (IntrusionException e) {
    System.out.println("Intrusion exception");
    e.printStackTrace();
}
It will pass if requestUri matches the pattern defined in validation.properties under Validator.URL and its length is less than 80.
Validator.URL=^(ht|f)tp(s?)\:\/\/[0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*(:(0-9)*)*(\/?)([a-zA-Z0-9\-\.\?\,\:\'\/\\\+=&%\$#_]*)?$
This is piggybacking on Andrew's answer, but the problem Fortify is warning you about is user control of a URL. If your application later decides to make connections to that website, and it is untrusted, this is an issue.
If this is an application where you care more about sharing public URIs, then you'll have to accept the risk, make sure users are properly trained on the inherent risk, and make sure that if you redisplay those URLs, someone doesn't try to embed malicious data.
String[] schemes = {"http","https"};
UrlValidator urlValidator = new UrlValidator(schemes, UrlValidator.ALLOW_ALL_SCHEMES);
System.out.println(urlValidator.isValid(myUrl));
The following URL is reported as invalid. Does anyone know why? localnet is a local network, but validation works for any other public network (it seems).
http://aunt.localnet/songs/barnbeat.ogg
The class you're using is deprecated. The replacement is
org.apache.commons.validator.routines.UrlValidator
which is more flexible. You can pass the flag ALLOW_LOCAL_URLS to the constructor, which allows most addresses like the one you are using. In our case, we had authentication data preceding the address, so we had to use the even more flexible UrlValidator(RegexValidator authorityValidator, long options) constructor.
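A minimal sketch of the replacement class with that flag (whether a host like aunt.localnet actually passes may also depend on the validator version, see the 1.4.1 fix mentioned in a later answer):
import org.apache.commons.validator.routines.UrlValidator;

String[] schemes = {"http", "https"};
// ALLOW_LOCAL_URLS relaxes the hostname rules for local/intranet names.
UrlValidator urlValidator = new UrlValidator(schemes, UrlValidator.ALLOW_LOCAL_URLS);
System.out.println(urlValidator.isValid("http://aunt.localnet/songs/barnbeat.ogg"));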
As I thought, it's failing on the top-level domain check:
String topLevel = domainSegment[segmentCount - 1];
if (topLevel.length() < 2 || topLevel.length() > 4) {
    return false;
}
Your top level is localnet, which is eight characters long, so the length check rejects it.
This is fixed in the 1.4.1 release of the Apache Validator:
https://issues.apache.org/jira/browse/VALIDATOR-342
https://issues.apache.org/jira/browse/VALIDATOR/fixforversion/12320156
A simple upgrade to the latest version of the validator should fix things nicely.
Check line 2; it should be
new UrlValidator(schemes);
or, if you want to allow two slashes and disallow fragments,
new UrlValidator(schemes, UrlValidator.ALLOW_2_SLASHES + UrlValidator.NO_FRAGMENTS);
Look at the source code for the isValid(String) method; you can check the result at each step with a manual call to understand where it fails.
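For example, a small subclass can expose the per-step checks, assuming the protected isValidScheme/isValidAuthority/isValidPath methods of UrlValidator (check your version's source for the exact names):
// Hypothetical debugging helper: surface which step of isValid() rejects the URL.
class DebuggableUrlValidator extends org.apache.commons.validator.routines.UrlValidator {
    void explain(String scheme, String authority, String path) {
        System.out.println("scheme ok?    " + isValidScheme(scheme));
        System.out.println("authority ok? " + isValidAuthority(authority));
        System.out.println("path ok?      " + isValidPath(path));
    }
}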
You can use the following:
UrlValidator urlValidator = new UrlValidator(schemes, new RegexValidator("^((?!-)[A-Za-z0-9-]{1,63}(?<!-)\\.)+[A-Za-z]{2,6}$"), 0L);
The library method fails on this URL:
"http://en.wikipedia.org/wiki/3,2,1..._Frankie_Go_Boom"
which is a perfectly legal (and existing) URL.
I found by trial and error that the code below is more accurate:
public static boolean isValidURL(String url) {
    URL u;
    try {
        u = new URL(url);   // rejects missing or unknown protocols
    } catch (MalformedURLException e) {
        return false;
    }
    try {
        u.toURI();          // rejects strings that violate RFC 2396 syntax
    } catch (URISyntaxException e) {
        return false;
    }
    return true;
}
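For example, with the Wikipedia URL from above:
System.out.println(isValidURL("http://en.wikipedia.org/wiki/3,2,1..._Frankie_Go_Boom")); // true
System.out.println(isValidURL("Some random english sentence"));                          // false: no protocol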
Given an Android application's id/package name, how can I check programatically if the application is available on the Android Market?
For example:
com.rovio.angrybirds is available, where as com.random.app.ibuilt is not
I am planning on having this check be performed either from an Android application or from a Java Servlet.
Thank you,
PS: I took a look at http://code.google.com/p/android-market-api/, but I was wondering if there was a simpler way of checking.
You could try to open the details page for the app, e.g. https://market.android.com/details?id=com.rovio.angrybirds.
If the app doesn't exist, you get an error page back instead of the app's details.
It's perhaps not ideal, but you should be able to parse the returned HTML to determine that the app doesn't exist.
Given an Android application's id/package name, how can I check programatically if the application is available on the Android Market?
There is no documented and supported means to do this.
While the HTML-parsing solution by @RivieeaKid works, I found that this might be a more durable and correct solution. Please make sure to use the 'https' prefix (not plain 'http') to avoid redirects.
/**
 * Checks if an app with the specified package name is available on Google Play.
 * Must be invoked from a separate thread in Android.
 *
 * @param packageName the name of the package, e.g. "com.domain.random_app"
 * @return {@code true} if available, {@code false} otherwise
 * @throws IOException if a network exception occurs
 */
private boolean availableOnGooglePlay(final String packageName) throws IOException {
    final URL url = new URL("https://play.google.com/store/apps/details?id=" + packageName);
    HttpURLConnection httpURLConnection = (HttpURLConnection) url.openConnection();
    httpURLConnection.setRequestMethod("GET");
    httpURLConnection.connect();
    final int responseCode = httpURLConnection.getResponseCode();
    Log.d(TAG, "responseCode for " + packageName + ": " + responseCode);
    if (responseCode == HttpURLConnection.HTTP_OK) { // code 200
        return true;
    } else { // HttpURLConnection.HTTP_NOT_FOUND (404) if the package is not found
        return false;
    }
}
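A minimal usage sketch (assuming the method lives in the same class as TAG), run off the UI thread as the Javadoc requires:
new Thread(() -> {
    try {
        boolean available = availableOnGooglePlay("com.rovio.angrybirds");
        Log.d(TAG, "available on Google Play: " + available);
    } catch (IOException e) {
        Log.e(TAG, "network error while checking Google Play", e);
    }
}).start();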
My application processes URLs entered manually by users. I have discovered that some malformed URLs (like 'http:/not-valid') result in a NullPointerException being thrown when the connection is opened. As I learned from this Java bug report, the issue is known and will not be fixed. The suggestion is to use java.net.URI, which is "more RFC 2396-conformant".
The question is: how can I use URI to work around the problem? The only thing I can do with URI is use it to parse the string and generate a URL. I have prepared the following program:
import java.net.*;

public class Test {
    public static void main(String[] args) {
        try {
            URI uri = URI.create(args[0]);
            Object o = uri.toURL().getContent(); // try to get content
        } catch (Throwable e) {
            e.printStackTrace();
        }
    }
}
Here are the results of my tests (with Java 1.6.0_20); they are not much different from what I get with java.net.URL:
sh-3.2$ java Test url-not-valid
java.lang.IllegalArgumentException: URI is not absolute
at java.net.URI.toURL(URI.java:1080)
at Test.main(Test.java:9)
sh-3.2$ java Test http:/url-not-valid
java.lang.NullPointerException
at sun.net.www.ParseUtil.toURI(ParseUtil.java:261)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:795)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:726)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1049)
at java.net.URLConnection.getContent(URLConnection.java:688)
at java.net.URL.getContent(URL.java:1024)
at Test.main(Test.java:9)
sh-3.2$ java Test http:///url-not-valid
java.lang.IllegalArgumentException: protocol = http host = null
at sun.net.spi.DefaultProxySelector.select(DefaultProxySelector.java:151)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:796)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:726)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1049)
at java.net.URLConnection.getContent(URLConnection.java:688)
at java.net.URL.getContent(URL.java:1024)
at Test.main(Test.java:9)
sh-3.2$ java Test http:////url-not-valid
java.lang.NullPointerException
at sun.net.www.ParseUtil.toURI(ParseUtil.java:261)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:795)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:726)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1049)
at java.net.URLConnection.getContent(URLConnection.java:688)
at java.net.URL.getContent(URL.java:1024)
at Test.main(Test.java:9)
You can use Apache Commons Validator:
UrlValidator urlValidator = new UrlValidator();
urlValidator.isValid("http://google.com");
http://commons.apache.org/validator/
http://commons.apache.org/validator/api-1.3.1/
If I run your code with the type of malformed URI in the bug report then it throws URISyntaxException. So the suggested fix fixes the reported error.
$ java -cp bin UriTest http:\\\\www.google.com\\
java.lang.IllegalArgumentException
at java.net.URI.create(URI.java:842)
at UriTest.main(UriTest.java:8)
Caused by: java.net.URISyntaxException: Illegal character in opaque part at index 5: http:\\www.google.com\
at java.net.URI$Parser.fail(URI.java:2809)
at java.net.URI$Parser.checkChars(URI.java:2982)
at java.net.URI$Parser.parse(URI.java:3019)
at java.net.URI.<init>(URI.java:578)
at java.net.URI.create(URI.java:840)
Your type of malformed URI is different, and does not appear to be a syntax error.
Instead, catch the null pointer exception and recover with a suitable message.
You could try and be friendly and check whether the URI starts with a single slash "http:/" and suggest that to the user, or you can check whether the hostname of the URL is non-empty:
import java.net.*;

public class UriTest {
    public static void main(String[] args) {
        try {
            URI uri = URI.create(args[0]);
            // avoid the null pointer exception
            if (uri.getHost() == null)
                throw new MalformedURLException("no hostname");
            URL url = uri.toURL();
            URLConnection s = url.openConnection();
            s.getInputStream();
        } catch (Throwable e) {
            e.printStackTrace();
        }
    }
}
Note that even with the approaches proposed in the other answers, you wouldn't get validation right, since java.net.URI adheres to RFC 2396, which is notably outdated. By using java.net.URI, you'll get exceptions for URLs that today are valid for all web browsers.
In order to solve these issues, I wrote a library for URL parsing in Java: galimatias. It performs URL parsing the same way web browsers do (adhering to the WHATWG URL Specification).
In your case, you can write:
try {
    URL url = io.mola.galimatias.URL.parse(urlString).toJavaURL(); // urlString is your input string
} catch (GalimatiasParseException e) {
    // If this exception is thrown, the given URL contains an unrecoverable error. That is, it's completely invalid.
}
As a nice side-effect, you get a lot of sanitization that you won't get with java.net.URI. For example, http:/example.com will be correctly parsed as http://example.com/.
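For instance, a small sketch of that sanitization (the printed output is what I'd expect from the library, not something verified here):
try {
    io.mola.galimatias.URL parsed = io.mola.galimatias.URL.parse("http:/example.com");
    System.out.println(parsed); // expected: http://example.com/
} catch (io.mola.galimatias.GalimatiasParseException e) {
    e.printStackTrace();
}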