CSS validation with AntiSamy - java

I have a String, and I want to validate whether it is a valid CSS value or not. In the documentation of AntiSamy, I found that I might be able to use CssValidator.isValidProperty (http://javadox.com/org.owasp/antisamy/1.4/org/owasp/validator/css/CssValidator) to do so. However, its second parameter must be a LexicalUnit, not a String.
Is there another way to validate a String with AntiSamy?

I think what you want is the CssScanner.
/****** pull out style tag from html *****/
Pattern p = Pattern.compile("<style>([\\s\\S]+?)</style>");
Matcher m = p.matcher(validHTML);
// if we find a match, get the group
if (m.find()) {
// get the matching group
codeGroup = m.group(1);
}
/****** block for checking all css for validity *****/
InternalPolicy policy = null;
try {
policy = (InternalPolicy) InternalPolicy.getInstance("antisamy-ebay.xml");
} catch (PolicyException e) {
e.printStackTrace();
}
ResourceBundle messages = ResourceBundle.getBundle("AntiSamy", Locale.getDefault());
CssScanner scanner = new CssScanner(policy, messages);
CleanResults results = scanner.scanStyleSheet(codeGroup, Integer.MAX_VALUE);
validCSS = results.getCleanHTML().toString();
That is the part of the code that worked for me; let me know if any of it does not work for you. Some variables (such as validHTML, codeGroup and validCSS) are declared earlier in my code, because the same class also handles HTML validation, so their declarations are not shown here. It should still point you in the right direction. You also need a policy in place. I chose the eBay policy (antisamy-ebay.xml); the policy defines the whitelist of what CSS is allowed in the resulting output. I have not used CssValidator, so I am not sure how the two compare, but CssScanner does a great job of giving back clean CSS.

Related

Obtaining census block groups from shapefile based on latlong inputs - Java

I am new to shapefile processing. Kindly guide me on how to achieve the following.
I am using the shapefile tl_2018_us_aiannh.shp from census.gov (TIGER/Line). I need to obtain census block group entities such as Block, Tract, County Subdivision and County details from the shapefile, based on the latitude and longitude provided by the user.
My requirement is to achieve this with the shapefile alone and not through any APIs.
Can someone tell me which framework I can use to achieve this?
What I've tried/using so far:
I have used GeoTools to read the shapefile. Can I continue using it? Will my requirement be achievable with this tool?
I have gone through a documentation from census.gov which states:
The Census Bureau assigns a code and these appear in fields such as
“TRACTCE”, where “CE” stands for Census. Finally, state-submitted
codes end in “ST”, such as “SLDLST”, and local education agency codes
end in “LEA”, as in “ELSDLEA”.
Which I tried in my code by:
File file = new File("D:\\tl_2018_us_aiannh.shp");
try {
Map<String, String> connect = new HashMap();
connect.put("url", file.toURI().toString());
DataStore dataStore = DataStoreFinder.getDataStore(connect);
String[] typeNames = dataStore.getTypeNames();
String typeName = typeNames[0];
System.out.println("Reading content " + typeName);
SimpleFeatureSource featureSource = dataStore
.getFeatureSource(typeName);
SimpleFeatureCollection collection = featureSource.getFeatures();
SimpleFeatureIterator iterator = collection.features();
try {
while (iterator.hasNext()) {
SimpleFeature feature = iterator.next();
GeometryAttribute sourceGeometry = feature
.getDefaultGeometryProperty();
String name = (String) (feature).getAttribute("TRACTCE");
Property property = feature.getProperty("TRACTCE");
System.out.println(property);
}
} finally {
iterator.close();
}
} catch (Throwable e) {
e.getMessage();
}
But I am receiving null as the value.
Any help would be much appreciated.
I have found the solution to this. Hope this would be helpful to someone in need.
Each SimpleFeature carries the attributes of one record in the shapefile; you can see which attributes are available by inspecting the feature in a debugger or printing it at runtime. Read an attribute from the SimpleFeature like this:
try {
while (iterator.hasNext()) {
SimpleFeature feature = iterator.next();
// getProperty returns null when the layer has no attribute with this name
Property tract = feature.getProperty("TRACTCE");
System.out.println(tract);
}
} finally {
iterator.close();
}
Make sure you choose Block Groups as the layer type when you download the shapefile from TIGER/Line (or whichever site you use). A layer that has no TRACTCE attribute returns null for it.

Using Java Desktop mailto for two recipients

I am trying to add a feedback button to a Java program for work. I want this button to actually send an email to myself and one other person. All employees have the same default email application so using the Desktop mail method works fine.
I managed to get this working with 1 email addressee. It properly opens the email client, starts a new email and puts the addressee in the address line. The problem is when I try to add two email addresses.
int result = JOptionPane.showOptionDialog(null, panel, "Feedback", JOptionPane.YES_NO_OPTION,
JOptionPane.INFORMATION_MESSAGE, null, options1, null);
if(result == JOptionPane.NO_OPTION){
try {
Desktop.getDesktop().mail(new URI("mailto:Chuck.Norris@yahoo.com"));
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (URISyntaxException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
}
So doing it like that works perfectly.
I've tried simply separating the addresses with a comma like this:
Desktop.getDesktop().mail(new URI("mailto:Chuck.Norris@yahoo.com","Bill.Clinton@gmail.com"));
but this gives me an error and the only option is to actually remove the second argument.
Finally I've tried using a String[] like this:
String[] mailAddressTo = {"Chuck.Norris@yahoo.com","Bill.Clinton@gmail.com"};
and then inserting that into the mailto method like this:
Desktop.getDesktop().mail(new URI("mailto:"+mailAddressTo));
but the email address comes out being
[Ljava.lang.String;@5e9394f7
once the email client is opened.
I've tried searching online and while I did find some solutions in regards to sending mail using Java through other methods than Desktop.mail - I found nothing related to how to accomplish this with Desktop.
If anyone can let me know how to make this work I would greatly appreciate it!
It helps to look at the documentation instead of guessing.
The list of URI constructors shows that there is no URI constructor which takes two Strings. That is why your first approach failed.
In Java, all arrays extend Object and inherit the default toString method of Object. Concatenating objects with + automatically invokes each object’s toString method, which is why your second approach yielded the results it did.
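The array behaviour described above can be reproduced in isolation. This is only a minimal sketch; the class name ArrayConcatDemo and the example.com addresses are made up for illustration.

```java
class ArrayConcatDemo {
    // Concatenating an array with + calls Object.toString() on the array,
    // which yields a type tag plus a hash code, not the elements.
    static String naive(String[] addresses) {
        return "mailto:" + addresses;
    }

    // String.join (Java 8+) concatenates the elements themselves.
    static String joined(String[] addresses) {
        return "mailto:" + String.join(",", addresses);
    }

    public static void main(String[] args) {
        String[] to = {"first@example.com", "second@example.com"};
        System.out.println(naive(to));  // starts with mailto:[Ljava.lang.String;@ (the hash varies)
        System.out.println(joined(to)); // mailto:first@example.com,second@example.com
    }
}
```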
The official definition of the format of mailto: URLs is RFC 2368, which states that multiple recipients can be specified by separating with commas. So, you were on the right track.
As of Java 8, you can simply join your addresses with String.join:
String[] mailAddressTo = {"Chuck.Norris@yahoo.com","Bill.Clinton@gmail.com"};
Desktop.getDesktop().mail(new URI("mailto:" + String.join(",", mailAddressTo)));
However, the documentation of the URI class states that the single-argument constructor assumes its String argument is already properly escaped. While it’s true that the example e-mail addresses you’ve provided don’t need to be escaped, it’s not safe to make that assumption with all possible addresses. To deal with this, you can use a multiple-argument URI constructor that will do the correct URI-escaping for you:
String[] mailAddressTo = { "Chuck.Norris@yahoo.com", "Bill.Clinton@gmail.com" };
Desktop.getDesktop().mail(new URI("mailto", String.join(",", mailAddressTo), null));
If you’re using a version of Java older than 8, you can build the string yourself:
String[] mailAddressTo = { "Chuck.Norris@yahoo.com", "Bill.Clinton@gmail.com" };
StringBuilder addressList = new StringBuilder();
String separator = "";
for (String address : mailAddressTo) {
addressList.append(separator).append(address);
separator = ",";
}
Desktop.getDesktop().mail(new URI("mailto", addressList.toString(), null));
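Putting the pieces together, a small helper might look like the sketch below. The class name MailtoUriDemo, the method name mailto and the example.com addresses are assumptions made for illustration; the call to Desktop is left commented out so the sketch also runs on a machine without a desktop session.

```java
import java.net.URI;
import java.net.URISyntaxException;

class MailtoUriDemo {
    // Builds a mailto: URI for any number of recipients. The three-argument
    // URI constructor quotes characters that are not legal in a URI.
    static URI mailto(String... addresses) {
        try {
            return new URI("mailto", String.join(",", addresses), null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build mailto URI", e);
        }
    }

    public static void main(String[] args) {
        URI uri = mailto("first@example.com", "second@example.com");
        System.out.println(uri); // mailto:first@example.com,second@example.com
        // To open the default mail client with both recipients:
        // java.awt.Desktop.getDesktop().mail(uri);
    }
}
```

Wrapping the checked URISyntaxException keeps the call site short; whether that is appropriate depends on how the surrounding code reports errors.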

Get all HTTP url from a webpage

I am creating a simple utility to retrieve all HTTP URLs from a webpage.
Initially I had planned to use an HTML parsing library to parse out the HREF attributes, but I found out that I also need to retrieve URLs contained inside scripts (example script below). So I started trying a regular expression to get all the HTTP URLs from the web page, but for some reason my regular expression is not working properly.
The URL can be inside a script:
<script>
if(jQuery.browser.msie)
{
var v= 'http://test.com/test/test';
}
</script>
My program:
try {
BufferedReader in=new BufferedReader(new FileReader("c:\\sample\\sample.html"));
while ((inputLine = in.readLine()) != null) {
System.out.println(inputLine);
String pattern = "http?://([-\\w\\.]+)+(:\\d+)?(/([\\w/_\\.]*(\\?\\S+)?)?)?";
// Create a Pattern object
Pattern r = Pattern.compile(pattern);
// Now create matcher object.
Matcher m = r.matcher(inputLine.replaceAll("http://", "\nhttp://"));
while (!m.hitEnd()) {
if (m.find()) {
System.out.println("Found value: " + m.group(0));
} else {
//System.out.println("NO MATCH");
}
}
}
in.close();
} catch (Exception e) {
e.printStackTrace();
}
Can someone help me fix this issue or let me know the best way to retrieve all URLs from a web page?
Description
Your expression has a typo: http? makes the p optional, but it is the s that should be optional.
https?://([-\\w\\.]+)+(:\\d+)?(/([\\w/_\\.]*(\\?\\S+)?)?)?
    ^
Also I recommend:
replacing the (...) capture groups with non-capture groups like (?:...)
not escaping the . inside a character class such as [.], since it is not needed there
adding a test to ensure you're not capturing the closing quote surrounding your URL
rewriting the part that looks for /folder/subfolder sections as a repeating non-capture group that looks for the initial slash followed by the folder name
regex: https?:\/\/(?:[\w-]+.)+(?::\d+)?(?:\/[\w\/_.]*)*?(?:\?\S+)?(?=['"\s])
as a Java string: "https?:\\/\\/(?:[\\w-]+.)+(?::\\d+)?(?:\\/[\\w\\/_.]*)*?(?:\\?\\S+)?(?=['\"\\s])"
Example
Sample Text
<script>
if(jQuery.browser.msie)
{
var v= 'http://test.com/test/test';
}
</script>
<a class="test" href="http://blablablablabla.com">Third Link</a>
Matches
[0] => http://test.com/test/test
[1] => http://blablablablabla.com
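For reference, a complete extraction loop could look like the sketch below. It deliberately uses a simplified pattern (an assumption, not the regex from this answer) that stops at whitespace, quotes and angle brackets, and it iterates with while (m.find()) instead of testing hitEnd(). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UrlExtractor {
    // Simplified pattern: http or https, then everything up to
    // whitespace, a quote, or an angle bracket.
    private static final Pattern URL = Pattern.compile("https?://[^\\s'\"<>]+");

    static List<String> extract(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(text);
        // find() advances through the input and returns false when no match is left
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String html = "<script>\nvar v= 'http://test.com/test/test';\n</script>\n"
                + "<a class=\"test\" href=\"http://blablablablabla.com\">Third Link</a>";
        System.out.println(extract(html)); // [http://test.com/test/test, http://blablablablabla.com]
    }
}
```

Compiling the pattern once as a constant avoids recompiling it for every line of input, which the original loop did.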
Try using this:
\A'http:\/\/[\w\W]+'\z
This requires the input to start with a single quote followed by http:// and to end with a single quote. Since almost anything can appear inside a URL nowadays (special characters such as ?:,-_/\ as well as digits), [\w\W]+ allows every character in between.
Because of the \A and \z anchors, it matches only when the entire input is a single quoted URL, so apply it to each quoted string individually rather than to the whole file.

How to fix the HTTP Response splitting vulnerability with ESAPI

After a recent FindBugs (FB) run, it complains about "Security - HTTP Response splitting vulnerability". The following code triggers it:
String referrer = req.getParameter("referrer");
if (referrer != null) {
launchURL += "&referrer="+(referrer);
}
resp.sendRedirect(launchURL);
Basically, the 'referrer' HTTP parameter contains a URL that the browser returns to when the user clicks the back button in our application. It is appended to the launch URL as a parameter. After a bit of research I know that I need to sanitize the referrer URL. After a bit more research I found the ESAPI project, which seems to offer this kind of functionality:
//1st canonicalize
import org.owasp.esapi.Encoder;
import org.owasp.esapi.Validator;
import org.owasp.esapi.reference.DefaultEncoder;
import org.owasp.esapi.reference.DefaultValidator;
[...]
Encoder encoder = new DefaultEncoder(new ArrayList<String>());
String cReferrer = encoder.canonicalize(referrer);
However I didn't figure out how to detect e.g. jscript code or other stuff which doesn't belong to a referrer url. So how can I achieve that with esapi?
I tried:
Validator validator = new DefaultValidator(encoder);
validator.isValidInput("Redirect URL",referrer,"HTTPParameterValue",512,false);
however this doesn't work. What I need is a function which results in:
http://www.google.com (ok)
http://www.google.com/login?dest=http://google.com/%0D%0ALocation: javascript:%0D%0A%0D%0Aalert(document.cookie) (not ok)
Or is it enough to call the following statement?
encoder.encodeForHTMLAttribute(referrer);
Any help appreciated.
Here's my final solution if anyone is interested. First I canonicalize and then URL-decode the string. If a CR or LF exists (\r or \n), I just cut off the rest of that potential 'attack' string, starting at the \r or \n.
String sanitize(String url) throws EncodingException{
Encoder encoder = new DefaultEncoder(new ArrayList<String>());
//first canonicalize
String clean = encoder.canonicalize(url).trim();
//then url decode
clean = encoder.decodeFromURL(clean);
//detect and remove any existent \r\n == %0D%0A == CRLF to prevent HTTP Response Splitting
int idxR = clean.indexOf('\r');
int idxN = clean.indexOf('\n');
if(idxN >= 0 || idxR>=0){
if(idxN<idxR){
//just cut off the part after the LF
clean = clean.substring(0,idxN);
}
else{
//just cut off the part after the CR
clean = clean.substring(0,idxR);
}
}
//re-encode again
return encoder.encodeForURL(clean);
}
Theoretically I could have verified the value afterwards against the 'HTTPParameterValue' regex defined in ESAPI.properties; however, it didn't like the colon in http:// and I didn't investigate further.
One more remark after testing: most modern browsers (Firefox > 3.6, Chrome, IE10, etc.) detect this kind of vulnerability and do not execute the code.
I think you have the right idea, but are using an inappropriate encoder. The Referer [sic] header value is really a URL, not an HTML attribute, so you really want to use:
encoder.encodeForURL(referrer);
-kevin
I would suggest a whitelisting approach in which you check that the referrer string contains only permissible characters. A regex would be a good option.
EDIT:
The class org.owasp.esapi.reference.DefaultEncoder that you are using is not really encoding anything. Look at the source code of the method encodeForHTMLAttribute(referrer) here at grepcode. A typical URL encoding (encoding carriage return and line feed) won't help either.
So the way forward would be to devise some validation logic that checks for a valid set of characters. Here is another insightful article.
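A whitelist check of this kind can be sketched with the JDK alone. This is not ESAPI code: the class name RedirectCheck, the method name isSafeRedirect and the exact set of allowed characters are assumptions chosen for illustration, and a real application would probably also restrict the allowed hosts.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.regex.Pattern;

class RedirectCheck {
    // Hypothetical whitelist: an http(s) scheme followed only by characters
    // that may legally appear in a URL (raw spaces, CR and LF are excluded).
    private static final Pattern ALLOWED =
            Pattern.compile("^https?://[A-Za-z0-9\\-._~:/?#\\[\\]@!$&'()*+,;=%]+$");

    static boolean isSafeRedirect(String url) {
        if (url == null || !ALLOWED.matcher(url).matches()) {
            return false;
        }
        String decoded;
        try {
            decoded = URLDecoder.decode(url, "UTF-8");
        } catch (UnsupportedEncodingException | IllegalArgumentException e) {
            return false; // malformed percent-escape: reject
        }
        // Reject encoded control characters such as %0D and %0A (CR and LF).
        for (int i = 0; i < decoded.length(); i++) {
            if (Character.isISOControl(decoded.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isSafeRedirect("http://www.google.com")); // true
        System.out.println(isSafeRedirect(
                "http://www.google.com/login?dest=http://google.com/%0D%0ALocation: javascript:%0D%0A%0D%0Aalert(document.cookie)")); // false
    }
}
```

The sketch decodes only once, so a double-encoded CR/LF (for example %250D) would pass; canonicalizing first, as in the accepted answer, is one way to cover that case.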
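The accepted answer does not handle every case. String.indexOf returns -1 when the character is absent, so if the string contains only a CR or only an LF, the index comparison picks the -1 index and substring(0, -1) throws a StringIndexOutOfBoundsException.
Example:
For the string "This is str\rstr", idxN is -1 and idxR is 11, so idxN<idxR is true and the code calls substring(0, -1).
A rectified version that cuts at whichever of CR or LF occurs first:
String sanitizeCarriageReturns(String value) {
int idxR = value.indexOf('\r');
int idxN = value.indexOf('\n');
// indexOf returns -1 when the character is absent, so only compare indexes that exist
int cut;
if (idxR < 0) {
cut = idxN;
} else if (idxN < 0) {
cut = idxR;
} else {
cut = Math.min(idxR, idxN);
}
if (cut >= 0) {
value = value.substring(0, cut);
}
return value;
}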

Java : replacing text URL with clickable HTML link

I am trying to replace the URLs in a String with clickable HTML links that a browser can render.
My initial String looks like this :
"hello, i'm some text with an url like http://www.the-url.com/ and I need to have an hypertext link !"
What I want to get is a String looking like :
"hello, i'm some text with an url like <a href="http://www.the-url.com/">http://www.the-url.com/</a> and I need to have an hypertext link !"
I can catch URL with this code line :
String withUrlString = myString.replaceAll(".*://[^<>[:space:]]+[[:alnum:]/]", "HereWasAnURL");
Maybe the regular expression needs some correction, but it seems to work so far; I still need to test it further.
So the question is how to keep the text caught by the regexp and just add what is needed to create the link: <a href="caught string">caught string</a>
Thanks in advance for your interest and responses !
Try to use:
myString.replaceAll("(.*://[^<>[:space:]]+[[:alnum:]/])", "<a href=\"$1\">HereWasAnURL</a>");
I didn't check your regex.
By using () you can create groups. The $1 indicates the group index.
$1 will replace the url.
I asked a similar question: my question
Some examples: Capturing Text in a Group in a regular expression
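As a self-contained illustration of the group reference, here is a minimal sketch. It uses a deliberately simple pattern (http or https followed by non-whitespace characters), which is an assumption and not the regex from the question; the class name Linkifier is hypothetical, and the input is not HTML-escaped.

```java
class Linkifier {
    // Wraps every http/https URL in an anchor tag.
    // $1 in the replacement string refers to capture group 1 of the pattern.
    static String linkify(String text) {
        return text.replaceAll("(https?://\\S+)", "<a href=\"$1\">$1</a>");
    }

    public static void main(String[] args) {
        String in = "some text with an url like http://www.the-url.com/ and more";
        System.out.println(linkify(in));
        // some text with an url like <a href="http://www.the-url.com/">http://www.the-url.com/</a> and more
    }
}
```

Because \S+ runs up to the next whitespace, punctuation that directly follows a URL (such as a trailing full stop) ends up inside the link; a stricter pattern is needed if that matters.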
public static String textToHtmlConvertingURLsToLinks(String text) {
if (text == null) {
return text;
}
String escapedText = HtmlUtils.htmlEscape(text);
return escapedText.replaceAll("(\\A|\\s)((http|https|ftp|mailto):\\S+)(\\s|\\z)",
"$1<a href=\"$2\">$2</a>$4");
}
There may be better REGEXs out there, but this does the trick as long as there is white space after the end of the URL or the URL is at the end of the text. This particular implementation also uses org.springframework.web.util.HtmlUtils to escape any other HTML that may have been entered.
For anybody who is searching a more robust solution I can suggest the Twitter Text Libraries.
Replacing the URLs with this library works like this:
new Autolink().autolink(plainText)
The code below replaces links starting with "http" or "https", then links starting with just "www.", and finally e-mail addresses.
Pattern httpLinkPattern = Pattern.compile("(http[s]?)://(www\\.)?([\\S&&[^.@]]+)(\\.[\\S&&[^@]]+)");
Pattern wwwLinkPattern = Pattern.compile("(?<!http[s]?://)(www\\.+)([\\S&&[^.@]]+)(\\.[\\S&&[^@]]+)");
Pattern mailAddressPattern = Pattern.compile("[\\S&&[^@]]+@([\\S&&[^.@]]+)(\\.[\\S&&[^@]]+)");
String textWithHttpLinksEnabled =
"ajdhkas www.dasda.pl/asdsad?asd=sd www.absda.pl maiandrze@asdsa.pl klajdld http://dsds.pl httpsda http://www.onet.pl https://www.onsdas.plad/dasda";
if (Objects.nonNull(textWithHttpLinksEnabled)) {
Matcher httpLinksMatcher = httpLinkPattern.matcher(textWithHttpLinksEnabled);
textWithHttpLinksEnabled = httpLinksMatcher.replaceAll("<a href=\"$0\">$0</a>");
final Matcher wwwLinksMatcher = wwwLinkPattern.matcher(textWithHttpLinksEnabled);
textWithHttpLinksEnabled = wwwLinksMatcher.replaceAll("<a href=\"http://$0\">$0</a>");
final Matcher mailLinksMatcher = mailAddressPattern.matcher(textWithHttpLinksEnabled);
textWithHttpLinksEnabled = mailLinksMatcher.replaceAll("<a href=\"mailto:$0\">$0</a>");
System.out.println(textWithHttpLinksEnabled);
}
Prints:
ajdhkas <a href="http://www.dasda.pl/asdsad?asd=sd">www.dasda.pl/asdsad?asd=sd</a> <a href="http://www.absda.pl">www.absda.pl</a> <a href="mailto:maiandrze@asdsa.pl">maiandrze@asdsa.pl</a> klajdld <a href="http://dsds.pl">http://dsds.pl</a> httpsda <a href="http://www.onet.pl">http://www.onet.pl</a> <a href="https://www.onsdas.plad/dasda">https://www.onsdas.plad/dasda</a>
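Assuming your regex works to capture the correct info, you can use backreferences in your substitution. See the Java regexp tutorial.
In that case, wrap the pattern in a capturing group and refer to it as $1 in the replacement string (Java uses $1 there, not \1):
myString.replaceAll(....., "<a href=\"$1\">$1</a>")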
In case of multiline text you can use this:
text.replaceAll("(\\s|\\^|\\A)((http|https|ftp|mailto):\\S+)(\\s|\\$|\\z)",
"$1<a href='$2'>$2</a>$4");
And here is full example of my code where I need to show user's posts with urls in it:
private static final Pattern urlPattern = Pattern.compile(
"(\\s|\\^|\\A)((http|https|ftp|mailto):\\S+)(\\s|\\$|\\z)");
String userText = ""; // user content from db
String replacedValue = HtmlUtils.htmlEscape(userText);
replacedValue = urlPattern.matcher(replacedValue).replaceAll("$1<a href='$2'>$2</a>$4");
replacedValue = StringUtils.replace(replacedValue, "\n", "<br>");
System.out.println(replacedValue);
