Domain name interpretation utility for java - java

I need a Java utility that takes a fully-qualified hostname and produces the domain name from it.
In the simple case, that means turning host.company.com into company.com, but this rapidly gets more complicated with host.subdomain.company.com, for example, or host.company.co.uk, where the meaning of "domain name" gets a bit fuzzy. Throw in complications with the definition of SLD and ccSLD, and it gets messy.
So my question is whether there's a third-party library out there that understands these things and can give me sensible interpretations.

Mozilla regularly maintains the rules that it uses in its browser for cookie security in a format that can be parsed and used by others:
http://publicsuffix.org/
A Google search will probably turn up Java libraries that can parse the list, but I can't vouch for the quality of any of them.

I don't think such a thing exists, since it's an administrative rather than a technical issue, and a very multilateral one at that.
If you end up rolling your own, this page on the Mozilla wiki looks like a good starting point, with lots of references. Looks like a major headache though. Just look at the rules for Japan. Ouch.

I'm not sure if it's for the same purpose, but I do something similar in my code. When I set cookies, I want to set the domain as close to the top as possible, so I need to find the domain one level below a public suffix. For example, the highest domain for which you can set a cookie from host.div.example.com is .example.com; from host.div.example.co.jp it is .example.co.jp.
Unfortunately, the code is not in the public domain, but it's very easy to do. I basically use the following two classes from Apache HttpClient 4:
org.apache.http.impl.cookie.PublicSuffixFilter
org.apache.http.impl.cookie.PublicSuffixListParser
I forget the exact reason, but we had to make some very minor tweaks. You just walk the domain from top to bottom; the first valid cookie domain is what you need.
You need to download the public suffix list from here and include it in your JAR:
http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/src/effective_tld_names.dat?raw=1
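The walk described above can be sketched without HttpClient. This is only a minimal illustration, not the answerer's code: the class name, the method name, and the tiny hard-coded suffix set are assumptions standing in for the real public suffix list, and the list's wildcard and exception rules are ignored.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

public class RegistrableDomain {
    // Hypothetical stand-in for the parsed public suffix list (assumption).
    private static final Set<String> SUFFIXES =
            new HashSet<>(Arrays.asList("com", "uk", "co.uk", "jp", "co.jp"));

    /**
     * Returns the longest matching public suffix plus one more label,
     * or null if the host is itself a suffix or no suffix matches.
     */
    static String registrableDomain(String host) {
        String[] labels = host.toLowerCase(Locale.ROOT).split("\\.");
        // Try the longest candidate suffix first, then drop one label at a time.
        for (int i = 0; i < labels.length; i++) {
            String candidate =
                    String.join(".", Arrays.copyOfRange(labels, i, labels.length));
            if (SUFFIXES.contains(candidate)) {
                if (i == 0) {
                    return null; // the host is a bare public suffix
                }
                // Keep exactly one label to the left of the suffix.
                return String.join(".",
                        Arrays.copyOfRange(labels, i - 1, labels.length));
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(registrableDomain("host.company.com"));
        System.out.println(registrableDomain("host.div.example.co.jp"));
    }
}
```

For cookie purposes the result would be prefixed with a dot, as in the .example.co.jp example above.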

Minimum required properties in ESAPI.properties

My web application uses only the following ESAPI encode methods:
ESAPI.encoder().encodeForLDAP()
ESAPI.encoder().encodeForHTML()
In this case, what are the minimum required properties in ESAPI.properties?
I'm currently using ESAPI 2.1.0.1 with this properties file.
If you are just using the encoder() function, the 3 lines in the encoder section are all you need: lines 99-119 (between all the comments).
Edit
Plus you must specify a default encoder. Example:
ESAPI.Encoder=org.owasp.esapi.reference.DefaultEncoder
Encoder.AllowMultipleEncoding=false
Encoder.AllowMixedEncoding=false
Encoder.DefaultCodecList=HTMLEntityCodec,PercentCodec,JavaScriptCodec
I think I answered a previous question.
Again you're the victim of some bad design choices back at the beginning of the ESAPI project between 2009-2011. Namely, the Singleton-based monolith.
ESAPI's current design is monolithic, meaning it tries to be everything to everyone. As you're well aware, this isn't the best design strategy, but that's where we're at.
We have several proposals for breaking various functions out into separate libraries, but that's future work towards building ESAPI 3.0.
For your current dilemma, too much of the library depends on functionality that it sounds like you don't need and don't intend to use. Unfortunately, that is simply the current fact of life. No one has ever seemed to use our authentication interface--but it's there for everybody, even if they don't need it. Most users use our encoding/decoding capability first, followed by the validation API and then crypto. The last couple are log injection and the WAF.
Most users of ESAPI take the non-prod test file, and leave it at that. (This is a really bad idea.)
The others take the one you reference and work through the exceptions, asking us questions on the mailing list.
This is not an ideal path to walk either, but it's the path we're in right now.
The danger, from my perspective, is that you implement happy-path configurations for the parts ESAPI is throwing exceptions about, with the goal of JUST making it happy so you can get to your two narrow use cases.
Then you get promoted and another developer on your app is faced with a problem that she thinks is solved because you handled all the integration with ESAPI.
PAY ATTENTION TO THE PARTS OF ESAPI THAT DON'T PERTAIN TO YOUR USE CASE. This isn't ideal, but it's where we're at in 2017. Ask us questions on the user list.
Failure to do so--especially in the crypto portion, will leave your application vulnerable in the future.
RegEx used in ESAPI.validator().getValidInput(..) calls
Validator.COMPANY_ID_PTRN=[a-zA-Z0-9]+
Validator.USER_DN_PTRN=[a-zA-Z0-9=,]+
Validator.ROLE_DN_PTRN=[a-zA-Z0-9=,^\- ]+
Minimum default settings
ESAPI.Encoder=org.owasp.encoder.esapi.ESAPIEncoder
ESAPI.Logger=org.owasp.esapi.logging.slf4j.Slf4JLogFactory
Logger.ApplicationName=TrianzApp
Logger.LogEncodingRequired=false
Logger.LogApplicationName=false
Logger.LogServerIP=false
Logger.UserInfo=false
Logger.ClientInfo=false
IntrusionDetector.Disable=true
ESAPI.Validator=org.owasp.esapi.reference.DefaultValidator
Encoder.AllowMixedEncoding=false
Encoder.AllowMultipleEncoding=false
ESAPI.printProperties=false

How to split a Java library source into two blocks, keeping one package?

We are creating a library for use with Android. That means an Eclipse-like IDE and an Ant-like build process.
The nature of the library is that it has two distinct parts, representing different levels of abstraction - let's say 'upper' and 'lower'.
Assume, for the purposes of this question, that we need to call methods in one part from the other, but would like to keep those methods hidden from the library user. I've scoured the usual references but they all stop at the point of explaining package name conventions and scope rules. I've failed to find anything that answers this on SO, though this was useful.
The immediate solution is to simply have everything in one package and for those methods to be package-private. However, for reasons of maintainability, clarity, and not-having-100-files-in-one-folder we'd prefer to split the parts into different folders.
The obvious splitting point is to split the (let's say 'wibble') package into com.me.wibble.upper and com.me.wibble.lower packages/folders, but that makes any interconnecting methods undesirably public. In mitigation they could be hidden from the javadoc with #hide.
Another thought is whether we could split the parts at the top level and, instead of the classic /main and /test folders, have /upper, /lower and /test, with all parts sharing the same com.me.wibble namespace. I'm unsure if/how Eclipse would cope with that.
Is there a conventional way of doing this, or is it just not done? If there are ways, what are the pros and cons?
hmmm......Instead of asking for the solution, sometimes it is better to give the question. WHY you want library users to have a restricted view may generate a better answer than the HOWTO. There are a few answers I thought of but didn't give because I don't know the motivation behind the question (I don't want to waste your time with an answer that is not applicable).
/upper, /lower, /test doesn't make your situation any nicer. It just makes the project more organized. Whether they are all in the same folder or in separate ones doesn't affect much.
It sounds like you need public 'interfaces' for library users while having private 'interfaces' for your own use. This is possible with some hacking, but it can be painful if this is a large pre-existing collection of code.

JSP internationalization RTL/LTR

I want to create a web site which can be viewed with two languages, one LTR and one RTL. This means that all content should be shown in either of the two languages.
My framework is Spring, and I'm using Tiles2, but I think this question is not framework specific.
The obvious solution to supporting two languages is having everything doubled (all JSP's, fragments, etc.), and you get the part of the tree which fits the language you chose. But this causes problems when changing the web site (you might forget to update the other JSP's), and is not scalable (try doing this for 5 or 10 languages).
I know I can use properties files to host strings for the different languages, but then my web site would be a huge collection of spring:message tags and will be a lot harder to maintain (what happens if I have a paragraph of 100 lines, does this all go into a single properties line?)
Is there any kind of framework, plugin, other, which solves this problem? Has anyone come across a clever solution to this problem?
I've never completed a full project like this, just some tests. I think this problem is not as big as it seems if you follow some simple rules. Here is what I would try to do:
Specify the direction with <body dir='ltr'> or <body dir='rtl'>. This is preferred over the CSS direction property.
Avoid different left/right margins or paddings in all CSS. If you must break this rule, you'll probably need two different files (ltr.css and rtl.css) containing all the elements that differ.
Sometimes you'll need to move some elements from left to right or vice versa. For example, in LTR you want a menu on the left, but in RTL you want it on the right. You can achieve this using CSS, but this can be complicated if you are not an expert, and you must test it in all browsers. Another option is to use an IF depending on the case. This last option fits very well if you use a grid-based CSS library, like Bootstrap.
Choose carefully which CSS and JS libraries you'll use. Obviously, pick the ones that offer RTL/LTR support.
Don't worry too much about the images. If you must change an image depending on the language, it is probably because it has some text in it, so you must use different images anyway. This is a problem related to i18n, not a text direction issue.
Don't let your customer be too fussy about it. I think that with these rules (and maybe some more) you can get a good result. But if your customer starts complaining about one pixel here and another one there, you'll need to complicate all this, probably unnecessarily.
About your language properties files: yes, use them. Always. This is good practice even when you are only using one language: HTML structure is separated from content, it is very easy to correct or translate, and words or sentences live in only one file...
Usually, web frameworks are used to build web applications rather than web sites, and there are relatively few long static paragraphs. Most of the content is dynamic and comes from a database. But yes, the usual way of doing it is to externalize everything to resource bundles, usually in the form of properties files.
Putting a long paragraph in a properties file doesn't cause much problem, because you can break long paragraphs into multiple lines by ending each line by a backslash:
home.welcomeParagraph=This is a long \
paragraph split into several lines \
thanks to backslashes.
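To check how the backslash continuation behaves, here is a small sketch that loads such an entry with java.util.Properties from an in-memory string. The class and method names are made up for illustration; a real application would normally load the bundle from a file or through Spring's message source.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class MultiLineProperty {
    /** Parses properties-format text and returns the value stored under key. */
    static String load(String text, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty(key);
    }

    public static void main(String[] args) {
        // "\\\n" is a literal backslash followed by a newline, as in the file.
        String text = "home.welcomeParagraph=This is a long \\\n"
                + "    paragraph split into several lines \\\n"
                + "    thanks to backslashes.\n";
        // The continuation lines are joined; their leading whitespace is dropped.
        System.out.println(load(text, "home.welcomeParagraph"));
    }
}
```

Note that the space before each backslash is kept, which is what separates the words where the lines are joined.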
RTL versus LTR is one of the harder i18n problems.
Basically it's a problem of the view layer in the MVC model. It may also involve pictures and emotional differences, like the skin color of the people shown. In this case you are better off relying on the solution HTML and CSS give you.
For example:
<style type="text/css">
*:lang(ar) { direction:rtl }
*:lang(de) { direction:ltr }
</style>
The best practice is to ask members of the target audience what effect the web pages have on them.
I agree with most of the solutions provided here. Your problem is more about design (architecture) than about technology. You need to choose whether to keep the internationalization logic on the server (Java) side or in static files.
If you go for the Java side (the preferable solution), you need to keep two properties files and use JSTL tags. This minimizes your work if you want to add another language in the future. This is a maintainable solution. I have seen applications supporting more than 15 languages and time zones. In fact, the release process gets pretty easy.
If you go for keeping multiple CSS and other static files, you will find things getting out of hand pretty soon. I don't think this is a maintainable solution.
Having said all this, I will leave the choice to the architect of the application, who will be able to judge which way to go based on the nature of the application and the given constraints.
You don't want to use spring:message tags everywhere. That's a pity, because that is just the way you should do it. It is bad practice to keep hard-coded text in a JSP if you need internationalization.
Furthermore, most modern IDEs allow you to go to the variable declaration with ctrl+left click (or by hovering over the key), so having a lot of variables in your code should not be a maintenance problem.
First, you must distinguish, for each text element, whether it is a user interface element (e.g. a button label) or editorial content.
User interface element labels will be stored in properties files that will have to be translated for each supported language (with a default value provided as a fallback).
Editorial content will be stored in a content management system that you organize so that a localized version of your content is easy to find.

"rename" FileItem

From the business perspective, here's the problem.
We have a number of shared folders that people use, let's call one //shared/the/drive. However, our server might know this shared drive by some other name, perhaps //ir83pn3br8mwhonamesthesethingsanyway/the/drive, since the networking group insists on having incredibly messed up server names. For most of the servers, it works just fine to use the simple name, but on this one, it's just not working right. So the band-aid for our problem is, in our code, to just be like "Oh, you're using shared - we'll replace that with the stupid name from networking."
Okay - now on to the more technical side of things:
I have a FileItem (Apache commons FileUpload module) object that might have a name //shared/the/drive/stuff/plans.doc. I need to create a FileItem that references //stupidname/the/drive/stuff/plans.doc. What should I do?
Should I edit the request object in the JSP? That sounds like a bad idea.
Should I use reflection to edit the FileItem object? That sounds like an even worse idea.
I'm not a front end guy (note which tags I have votes in... haha), really... more of a server dude... this just got dropped onto my plate. Is it possible to intercept the text box before it gets to the request, moving the change to the client side?
I can't possibly have been the first person to come across this problem. I'm not looking for code necessarily (would I mind? No I wouldn't.) but a general approach of both what will work, and/or how this sort of thing (changing what a user inputs) is handled in a 'best practicey' kind of way is most welcome.
It's not uncommon when dealing with distributed file systems to have a "fake path" that the user sees and deals with, and a backend path that represents the actual node and lets you manipulate the file in the context of the request you receive.
Not every page you hit on the web is represented by the physical URL you type into the browser. Files live on CDNs, in CMS systems, are dynamically created out of databases... whatever.
There's no need to hack on any objects. You just wrap them in another object that contains their transient properties, such as where I'm going to access that file THIS time.
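A minimal sketch of that wrapping idea follows. To stay self-contained it does not depend on Commons FileUpload: NamedItem is a hypothetical stand-in for the single FileItem method used here (getName()), and ResolvedItem and its prefixes are illustrative names, not part of any library.

```java
public class ResolvedPathDemo {
    /** Hypothetical stand-in for the one FileItem method this sketch needs. */
    interface NamedItem {
        String getName();
    }

    /** Wraps an item and derives the backend path without mutating the original. */
    static final class ResolvedItem {
        private final NamedItem item;
        private final String userPrefix;
        private final String backendPrefix;

        ResolvedItem(NamedItem item, String userPrefix, String backendPrefix) {
            this.item = item;
            this.userPrefix = userPrefix;
            this.backendPrefix = backendPrefix;
        }

        /** The path exactly as the user supplied it. */
        String userPath() {
            return item.getName();
        }

        /** The path to use on this server; unchanged if the prefix does not match. */
        String backendPath() {
            String name = item.getName();
            return name.startsWith(userPrefix)
                    ? backendPrefix + name.substring(userPrefix.length())
                    : name;
        }
    }

    public static void main(String[] args) {
        NamedItem uploaded = () -> "//shared/the/drive/stuff/plans.doc";
        ResolvedItem resolved = new ResolvedItem(uploaded, "//shared/", "//stupidname/");
        System.out.println(resolved.userPath());
        System.out.println(resolved.backendPath());
    }
}
```

With a real FileItem, the wrapper would hold the FileItem itself and delegate to it, so neither the request nor the upload object needs to be edited.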

Why do some APIs (like JCE, JSSE, etc) provide their configurable properties through singleton maps?

For example:
Security.setProperty("ocsp.enable", "true");
And this is used only when a CertPathValidator is used. I see two options for improvement:
a singleton again, but with a getter and setter for each property
an object containing the properties relevant to the current context:
CertPathValidator.setValidatorProperties(..) (it already has a setter for PKIXParameters, which is a good start, but that does not include everything)
Some reasons might be:
setting the properties from the command line - a simple transformer from command-line to default values in the classes suggested above would be trivial
allowing additional custom properties by different providers - they can have public Map getProviderProperties(), or even public Object .. with casting.
I'm curious, because these properties are not always in the most visible place, and instead of seeing them while using the API, you have to go through dozens of Google results before (if lucky) finding them. Because, in the first place, you don't always know exactly what you are looking for.
Another fatal drawback I just observed is that this is not thread-safe. For example, if two threads want to check a revocation via OCSP, they both have to set the ocsp.responderURL property, and may overwrite each other's settings.
This is actually a great question that forces you to think about design decisions you may have made in the past. Thanks for asking a question that should have occurred to me years ago!
It sounds like the objection is not so much the singleton aspect of this (although an entirely different discussion could occur about that) - but the use of string keys.
I've worked on APIs that used this sort of scheme, and the reasons you outline above were definitely the driving factors - it makes it crazy simple to parse a command line or properties file, and it allows for 3rd party extensibility without impact to the official API.
In our library, we actually had a class with a bunch of static final String entries for each of the official parameters. This gave us the best of both worlds - the developer could still use code completion where it made sense to do so. It also becomes possible to construct hierarchies of related settings using inner classes.
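As a rough illustration of that constants-class pattern (not the answerer's actual library), the sketch below groups string keys in nested classes. ConfigKeys and Ocsp are hypothetical names; the two key strings are the ones mentioned in the question. A plain Properties object is used instead of Security.setProperty to avoid touching global state.

```java
import java.util.Properties;

public class ConfigKeysDemo {
    /** Hypothetical holder for the officially supported keys. */
    static final class ConfigKeys {
        private ConfigKeys() { }

        /** Nested class grouping related settings into a small hierarchy. */
        static final class Ocsp {
            private Ocsp() { }
            static final String ENABLE = "ocsp.enable";
            static final String RESPONDER_URL = "ocsp.responderURL";
        }
    }

    /** Builds settings through the constants rather than raw string literals. */
    static Properties exampleSettings() {
        Properties props = new Properties();
        props.setProperty(ConfigKeys.Ocsp.ENABLE, "true");
        // Third-party extensions can still add keys the constants don't cover.
        props.setProperty("vendor.custom.flag", "on");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(exampleSettings().getProperty(ConfigKeys.Ocsp.ENABLE));
    }
}
```

The keys stay plain strings, so command-line and properties-file parsing keep working, while code completion covers the official ones.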
All that said, I think that the first reason (easy parsing of command line) doesn't really cut it. Creating a reflection driven mechanism for pushing settings into a bunch of setters would be fairly easy, and it would prevent the cruft of String->object transformation from drifting into the main application classes.
Extensibility is a bit trickier, but I think it could still be handled using a reflection driven system. The idea would be to have the main configuration object (the one with all the setters in it) also have a registerExtensionConfiguration(xxx) method. A standard notation (probably dot separated) could be used to dive into the resultant acyclic graph of configuration objects to determine where the setter should be called.
The advantage of the above approach is that it puts all of the command line argument/properties file parsing exception handling in one place. There isn't a risk of a mis-formatted argument floating around for weeks before it gets hit.
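The reflection-driven idea could look roughly like the sketch below. All names here (ReflectiveConfig, ValidatorSettings, apply) are hypothetical, only String, boolean and int setters are handled, and the dotted-path and registerExtensionConfiguration parts are left out for brevity. It does show the stated advantage: conversion errors surface in one place, at load time.

```java
import java.lang.reflect.Method;
import java.util.Map;

public class ReflectiveConfig {
    /** Hypothetical configuration object with ordinary typed setters. */
    public static class ValidatorSettings {
        private boolean ocspEnabled;
        private String responderUrl;

        public void setOcspEnabled(boolean value) { this.ocspEnabled = value; }
        public void setResponderUrl(String value) { this.responderUrl = value; }
        public boolean isOcspEnabled() { return ocspEnabled; }
        public String getResponderUrl() { return responderUrl; }
    }

    /** Pushes each key/value pair into the matching setXxx method on target. */
    static void apply(Object target, Map<String, String> settings) {
        for (Map.Entry<String, String> entry : settings.entrySet()) {
            String key = entry.getKey();
            String setter = "set" + Character.toUpperCase(key.charAt(0)) + key.substring(1);
            Method method = findSetter(target.getClass(), setter);
            if (method == null) {
                throw new IllegalArgumentException("Unknown setting: " + key);
            }
            Object value = convert(entry.getValue(), method.getParameterTypes()[0]);
            try {
                method.invoke(target, value);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot apply " + key, e);
            }
        }
    }

    private static Method findSetter(Class<?> type, String name) {
        for (Method m : type.getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == 1) {
                return m;
            }
        }
        return null;
    }

    /** All String-to-object conversion, and its error handling, lives here. */
    private static Object convert(String raw, Class<?> type) {
        if (type == String.class) {
            return raw;
        }
        if (type == boolean.class) {
            if (raw.equals("true") || raw.equals("false")) {
                return Boolean.valueOf(raw);
            }
            throw new IllegalArgumentException("Not a boolean: " + raw);
        }
        if (type == int.class) {
            return Integer.valueOf(raw); // throws NumberFormatException if malformed
        }
        throw new IllegalArgumentException("Unsupported setter type: " + type);
    }
}
```

A caller would build the map from command-line arguments or a properties file and call apply once at startup, so a malformed value fails immediately rather than weeks later.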
