I'm in charge of updating an existing java app for an embedded device (a copier).
One of the things I want to do is create a servlet which allows the download of all the files in our sandboxed directory on the device (which will include the application log files, local caches, etc). At the moment these files are all in a single directory with no subdirectories.
Basically what I'd like to do is as follows:
Log.log
Log.log.1
Log.log.2
SomeLocalCache.txt
AnotherLocalCache.txt
where each line is a clickable link allowing download of the file.
However, my HTML experience is basically nil, and my familiarity with the Java API is still fairly rudimentary, so I'm looking for some advice on the proper way to go about it.
I've read through all the samples provided, and here's what I'm thinking.
I can create a servlet at a specified URL on the device which will call into my code. Let's call this /MyApp.
I add another link below that, let's call it /MyApp/Download.
When this address is reached in a browser, it displays the list of files.
This list will have to be created on the fly. I can create an HTML template file and put it in the res folder (this seems to be the recommended method for the device in question), but the whole list of files/links will need to be substituted in at run time. Here's an example I found using <ol>+<li> tags for the list and <a> tags for the links. I can generate that on the fly pretty easily. Is that a reasonable way to go?
e.g.
<ol>
<li>
Log.log
</li>
<!--more <li> elements-->
</ol>
Clicking on an individual file will link to /MyApp/Download/File.ext which will then trigger the file download via my servlet (I've found this code which looks promising for the actual download).
The device will require users to log in before they are allowed to access the /MyApp link or any sub-links, and I can additionally require that the logged-in user be an admin before allowing file download, which together seems like sufficient security in this case (heavy security is not required for these files).
So am I missing anything big or is this a reasonable plan of engagement?
EDIT
Judging by this link (when to use UL or OL in html?), many people are going to hammer the answer and comment below because they say it is important to put semantic information into the HTML.
My point is simply this -- the only difference is browsers will display one with bullet points (as OP seems to want) and one with numbers (as the OP does not want.) I suggest he change the HTML to the way he wants it to render, or leave it as is, and make some CSS changes.
Yes, there is a semantic difference between the two... they will both still render in order, as defined here: http://www.w3.org/TR/html401/struct/lists.html. Is the HTML the place to put semantic information? I think not; the code that generates the HTML is the correct place. Your cohesion may vary.
I won't change my original comment for the sake of history.
END EDIT
Seems fine to me -- however <ol> is not really used any more, I'd go with <ul>. Don't worry, it is still ordered as you would expect.
The reason for this is the only difference between the two was browsers would automatically number (render with a number before) ordered lists. However, with CSS all the rendering control can be in the CSS (including numbering) and everyone is happy.
Hardly anyone uses the automatic numbering anymore. In fact, via CSS, lists can be (and are) used for all sorts of crazy things, including CSS menu systems.
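For generating the list on the fly, a minimal sketch might look like the following. It assumes a flat directory (as described in the question) and that downloads live under /MyApp/Download; the class and method names (FileListHtml, render, listFileNames) are made up for illustration and are not part of any device SDK. It sticks to plain loops and java.io so it should also suit an older embedded JVM.

```java
import java.io.File;
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that builds the HTML fragment for a flat directory listing. */
class FileListHtml {

    /** Escapes the characters that are significant in HTML text and attribute values. */
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Percent-encodes a file name so it can be used as one URL path segment. */
    static String encodeSegment(String name) {
        try {
            // URLEncoder targets form encoding, so turn '+' (an encoded space) into '%20'.
            return URLEncoder.encode(name, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    /** Returns the names of the regular files directly inside dir. */
    static String[] listFileNames(File dir) {
        List<String> names = new ArrayList<String>();
        File[] files = dir.listFiles(); // null if dir is missing or unreadable
        if (files != null) {
            for (File f : files) {
                if (f.isFile()) {
                    names.add(f.getName());
                }
            }
        }
        return names.toArray(new String[0]);
    }

    /** Renders one <li><a> entry per file name, sorted, linking to baseUrl/name. */
    static String render(String[] fileNames, String baseUrl) {
        String[] sorted = fileNames.clone();
        Arrays.sort(sorted);
        StringBuilder html = new StringBuilder("<ul>\n");
        for (String name : sorted) {
            html.append("<li><a href=\"").append(baseUrl).append('/')
                .append(encodeSegment(name)).append("\">")
                .append(escapeHtml(name)).append("</a></li>\n");
        }
        return html.append("</ul>\n").toString();
    }
}
```

The resulting string can be substituted into a placeholder in the HTML template, e.g. render(listFileNames(sandboxDir), "/MyApp/Download"), where sandboxDir is whatever File points at the sandboxed directory.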
Here's a summary of what you need to do:
You can use File#listFiles() to get a File[].
You can use JSTL c:forEach to iterate over an array.
You can use HTML <ol>, <ul> or <dl> elements to display a list.
You can use HTML <a> element to display a link.
You can use a Servlet to write an InputStream of a local file to OutputStream of the response. Remember to pass at least Content-Type, Content-Length and Content-Disposition along.
You can make use of request pathinfo to pass file identifier safely. E.g. map servlet on /files/* and let link point to http://example.com/files/path/to/file.ext and in the servlet you can get /path/to/file.ext by request.getPathInfo().
A basic and solid servlet example can be found here: FileServlet. If you want to add resume and compressing capabilities, then you may find the improved FileServlet more useful.
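As a rough illustration of the pathinfo and streaming points above (this is not the linked FileServlet), the sketch below shows the three pieces a download servlet typically needs: mapping the path info to a file inside the base directory without allowing "../" escapes, building a Content-Disposition header value, and copying the stream. The class name DownloadSupport and its methods are hypothetical, and it assumes java.nio.file (Java 7+) is available on the target JVM.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

/** Hypothetical helpers for a download servlet mapped on a path such as /files/*. */
class DownloadSupport {

    /**
     * Maps request.getPathInfo() (e.g. "/Log.log") to a path inside baseDir.
     * Returns null when the path info is empty or would escape baseDir.
     */
    static Path resolve(Path baseDir, String pathInfo) {
        if (pathInfo == null || pathInfo.length() <= 1) {
            return null;
        }
        try {
            Path base = baseDir.toAbsolutePath().normalize();
            Path target = base.resolve(pathInfo.substring(1)).normalize();
            boolean inside = target.startsWith(base) && !target.equals(base);
            return inside ? target : null;
        } catch (InvalidPathException e) {
            return null; // e.g. illegal characters in the requested name
        }
    }

    /** Builds a Content-Disposition value, replacing quotes, backslashes and control chars. */
    static String contentDisposition(String fileName) {
        StringBuilder safe = new StringBuilder();
        for (char c : fileName.toCharArray()) {
            safe.append(c == '"' || c == '\\' || c < 0x20 ? '_' : c);
        }
        return "attachment; filename=\"" + safe + "\"";
    }

    /** Copies everything from in to out and returns the number of bytes written. */
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
            total += n;
        }
        return total;
    }

    // Inside doGet(request, response) this might be wired up roughly as:
    //   Path file = DownloadSupport.resolve(baseDir, request.getPathInfo());
    //   if (file == null || !Files.isRegularFile(file)) { response.sendError(404); return; }
    //   response.setContentType(getServletContext().getMimeType(file.getFileName().toString()));
    //   response.setHeader("Content-Length", String.valueOf(Files.size(file)));
    //   response.setHeader("Content-Disposition",
    //           DownloadSupport.contentDisposition(file.getFileName().toString()));
    //   copy(Files.newInputStream(file), response.getOutputStream());
}
```

Checking the normalized path against the base directory is what keeps a request like /files/../secret.txt from reaching files outside the sandbox.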
That said, most appservers also support directory listing out of the box. Tomcat, for example, supports it through its default servlet (depending on the Tomcat version, you may need to enable the listings init-param in conf/web.xml). You can just define another <Context> in Tomcat's server.xml with a docBase of C:/path/to/all/files and a context path of /files (so that it's accessible at http://example.com/files).
<Context docBase="/path/to/all/files" path="/files" />
That's basically all. No homegrown code/html/servlet needed.
Related
I want modified js and css files to be downloaded again by the client when a page is accessed. I am considering these approaches:
Manually add the modified timestamp to the URL in each page.
Write scriptlet code in all the JSP pages that reads the modified timestamp of each js and css file and appends it to the URL in the page.
Add the modified timestamp while building the war file using ANT.
I have the following questions.
Can any one let me know which would be a better solution of the above approaches? I am open to any other solutions also.
I went through this answer on SO and using it I can get the modified date but how to change the jsp file?
Is there anything similar to this in java?
The best solution here depends on your exact requirements. Some simple questions to start with: will your static resources live on the same server, will they be bundled with your app on that server, and so on. There may be other, better ways.
I don't have Ant experience, so I can't speak to that option, but you can do this in plain Java. I just want to share an idea or two. A filter (which intercepts .css or .js requests, looks up the resource, and returns its last-modified date or checksum as the version in the response) or a custom JSP tag will meet your requirements. For example, write a custom JSP tag such as <resource:static path="app.js"/>. It can look up the specific file's last-modified date, assuming the file is under the same document root, and produce a result like <script type="text/javascript" src="app.js?version=8637">, so the changed URL busts the cache.
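The core of such a tag could be sketched as below. This is only the URL-building logic, not a complete JSP tag handler; the class name StaticResourceTag and its methods are hypothetical, and it assumes the resources sit on disk under a known document root.

```java
import java.io.File;

/** Hypothetical core logic for a cache-busting <resource:static> style tag. */
class StaticResourceTag {

    /** Appends the file's last-modified time (epoch millis) as a version parameter. */
    static String versionedUrl(File docRoot, String path) {
        File resource = new File(docRoot, path);
        // lastModified() returns 0 when the file does not exist.
        long version = resource.lastModified();
        return path + "?version=" + version;
    }

    /** Renders a script element pointing at the versioned URL. */
    static String scriptTag(File docRoot, String path) {
        return "<script type=\"text/javascript\" src=\""
                + versionedUrl(docRoot, path) + "\"></script>";
    }

    /** Renders a stylesheet link pointing at the versioned URL. */
    static String stylesheetTag(File docRoot, String path) {
        return "<link rel=\"stylesheet\" type=\"text/css\" href=\""
                + versionedUrl(docRoot, path) + "\">";
    }
}
```

Because the version changes only when the file changes, browsers keep using their cached copy until a new build or edit touches the file.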
I need to convert a web page [which has not public access] to PDF or Image [preferably to PNG].
Web page contains set of charts and image. Most of the charts are populated through Ajax calls so there is a delay between page load and chart load.
I am looking answer for any of these questions:
1- I found set of snapshot api's but none of them support accessing my internal page. Since the web page I am trying to export is not public I need to be authenticated. Biggest problem is I cannot send request headers [such as session-id, cookie or other variables] along with these API's. It seems they don't support this kind of functionality.
2- I am not sure if I can do the following: log in to my web page with an HTTP client, add HTTP headers, send a GET request and get the HTML string, then use one of the converters to convert it to PDF. What I am not sure about is whether it's possible to get a proper PDF from the HTML string I got from the HTTP client, since resources [css, js, etc.] will be missing. I want my PDF/image to look exactly as it does on the web site.
I would really appreciate it if you could help.
Thanks in advance,
ED
You're probably best off using wkhtmltopdf, which is a server-side tool and is easily installed.
There are two parameters you can use to wait for your Ajax to finish, try:
--javascript-delay to set how long (in milliseconds) the program waits for the JavaScript to finish
--window-status to wait until window.status equals a given string before rendering
See the extensive manual for this program here
wkhtmltopdf generates a PDF and wkhtmltoimage generates an image, which is PNG (as you requested) by default.
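From Java, one way to drive this is to shell out to the tool. The sketch below assumes wkhtmltopdf is on the PATH and that the internal page accepts a session cookie passed with wkhtmltopdf's --cookie option; the class name, the cookie name and the URL in the usage note are placeholders, not anything from the question.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical wrapper that builds and runs a wkhtmltopdf command line. */
class WkhtmltopdfCommand {

    /** Builds the argument list: session cookie, JavaScript delay, input URL, output file. */
    static List<String> build(String url, String outputFile,
                              String cookieName, String cookieValue, int jsDelayMillis) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("wkhtmltopdf");
        cmd.add("--cookie");            // forwards the authenticated session
        cmd.add(cookieName);
        cmd.add(cookieValue);
        cmd.add("--javascript-delay");  // gives the Ajax-loaded charts time to render
        cmd.add(String.valueOf(jsDelayMillis));
        cmd.add(url);
        cmd.add(outputFile);
        return cmd;
    }

    /** Runs the command, forwarding its output to this process, and returns the exit code. */
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }
}
```

A call such as run(build("http://intranet.example/charts", "charts.pdf", "JSESSIONID", sessionId, 5000)) would then produce the PDF, with sessionId taken from a prior login. A fixed delay is a guess; if the page can set window.status once the charts are drawn, --window-status is the more reliable of the two options.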
Authentication is difficult because it involves security. Because the operation you are describing is unusual it is likely to result in all kinds of alarm bells going off. It is entirely possible to do but it is fraught, easy to get wrong and fragile in the face of security updates and code changes.
As such I'm going to suggest an alternate method which is one we often recommend for ABCpdf (on which I work). Yes we support standard authentication methods but the beauty of this approach is that it is robust and is applicable to other solutions (eg Java based) and novel authentication methods.
Typically you just want a PDF of the current page. The easiest way to do this is snaffle the HTML. The way you do this rather depends on your environment. For example under ASP.NET you can obtain the HTML of the current page using the HttpResponse.Filter property or by overriding the Render method of the page. The way you do it will depend on what you're coding in.
Then you need to save this HTML to a file and present it to your solution via a 'file://' protocol URL. Now obviously at this point any relative links will be broken but this is easily fixed by dropping in a BASE tag that references the place they are located.
Generally the types of resources referenced by a server-side page are static. So if you can create a tag that references the actual files rather than a web site, you will bypass any authentication for access to these resources.
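Dropping in the BASE tag can be as small as the following sketch. It assumes the captured HTML has a <head> element and that baseHref (here a file:// URL in the test-style usage) points at the directory holding the static resources; BaseTagInjector is a made-up name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that inserts a <base> element into captured HTML. */
class BaseTagInjector {

    // Matches <head> or <head ...attributes>, but not <header>.
    private static final Pattern HEAD_OPEN =
            Pattern.compile("<head(\\s[^>]*)?>", Pattern.CASE_INSENSITIVE);

    /** Inserts <base href="..."> right after the opening head tag. */
    static String insertBase(String html, String baseHref) {
        Matcher m = HEAD_OPEN.matcher(html);
        if (!m.find()) {
            throw new IllegalArgumentException("no <head> element found");
        }
        String tag = "<base href=\"" + baseHref + "\">";
        return html.substring(0, m.end()) + tag + html.substring(m.end());
    }
}
```

Placing the tag first inside the head matters because relative URLs that appear before a base element are resolved without it.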
That still leaves the AJAX based problems which are another can of worms. The render delay method is something we have supported for many years (from before AJAX was around) however it is not terribly reliable because you just don't know how long to wait.
Much better is a tighter link into the JavaScript via a callback you can use to determine if the page is loaded. I don't think ABCpdf is going to be appropriate for you since it is .NET but I would certainly encourage you to look for a Java based solution that uses this type of more sophisticated approach.
I would like to have a tree/ folder structure for my content but would like all pages to be served as a flat URL. E.g.
the page located at /cat1/subcat2/tulips.html would be served at:
http://example.com/tulips.html
and the page located at /cat5/roses.html would be served at:
http://example.com/roses.html
I would need all links to be automatically calculated and ensure that there are no conflicts.
Is this possible with opencms?
Thanks,
Assaf
A rough outline of how I'd approach this:
You would first get the list of all the resources via <cms:contentload> (http://www.bng-galiza.org/opencms/opencms/alkacon-documentation/documentation_taglib/docu_tag_contentload.html), taglib or the respective java API in java code as you need some coding anyway, and then create new resources of type 'external link' in your OpenCms root folder, pointing to your targets; probably using something like
getCms().createResource(newFileName, templateFile.getTypeId());
or similar method (as external link isn't structured content) for it.
You could wrap this logic up into a java class and schedule it as a scheduled job, I guess it's sufficient, as long as you don't need it right away and some delay is acceptable. Otherwise you'd need to hook it into the publishing flow.
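The conflict check itself does not depend on OpenCms and could be sketched in plain Java as below. The input is assumed to be the list of resource paths obtained from the content load step; FlatUrlMapper and its methods are hypothetical names, and no OpenCms API is used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper that maps nested resource paths to flat names and finds clashes. */
class FlatUrlMapper {

    /** Returns the last path segment, e.g. "/cat1/subcat2/tulips.html" -> "tulips.html". */
    static String flatName(String path) {
        return path.substring(path.lastIndexOf('/') + 1);
    }

    /** Groups the source paths by the flat name they would be served under. */
    static Map<String, List<String>> group(List<String> paths) {
        Map<String, List<String>> byName = new TreeMap<String, List<String>>();
        for (String path : paths) {
            String name = flatName(path);
            List<String> sources = byName.get(name);
            if (sources == null) {
                sources = new ArrayList<String>();
                byName.put(name, sources);
            }
            sources.add(path);
        }
        return byName;
    }

    /** Returns only the flat names claimed by more than one source path. */
    static Map<String, List<String>> conflicts(List<String> paths) {
        Map<String, List<String>> result = new TreeMap<String, List<String>>();
        for (Map.Entry<String, List<String>> e : group(paths).entrySet()) {
            if (e.getValue().size() > 1) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }
}
```

The scheduled job would create a link only for names that are absent from conflicts(...) and report the rest for manual renaming.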
What I am trying to do is a take a list of URL's and download each URL's content (for indexing). The biggest problem is that if I encounter a link that is something like a facebook event that simply redirects to the login page I need to be able to detect and skip that URL. It seems as though the robots.txt file is there for this purpose. I looked into heritrix, but this seems like way more than I need. Is there a simpler tool that will provide information about robots.txt and scrape site accordingly?
(Also, I don't need to follow additional links and build up a deep index, I just need to index the individual pages in the list.)
You could just take the class you are interested in, i.e. http://crawler.archive.org/xref/org/archive/crawler/datamodel/Robotstxt.html
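If pulling in the Heritrix class is still too much, a very reduced check can be sketched by hand. The class below is not the Heritrix Robotstxt class: it is a simplified, hypothetical SimpleRobotsTxt that only honours Disallow prefixes in the "User-agent: *" groups and ignores Allow lines, wildcards and crawl delays.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Simplified robots.txt check: Disallow prefixes for "User-agent: *" only. */
class SimpleRobotsTxt {

    private final List<String> disallowed = new ArrayList<String>();

    /** Parses the text of a robots.txt file that has already been downloaded. */
    SimpleRobotsTxt(String content) {
        boolean applies = false;      // does the current group name "*"?
        boolean inAgentLines = false; // was the previous field a User-agent line?
        for (String raw : content.split("\\r?\\n")) {
            String line = raw;
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash); // strip comments
            }
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // blank or malformed line
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                if (!inAgentLines) {
                    applies = false; // a new group starts here
                }
                if (value.equals("*")) {
                    applies = true;
                }
                inAgentLines = true;
            } else {
                inAgentLines = false;
                if (applies && field.equals("disallow") && !value.isEmpty()) {
                    disallowed.add(value);
                }
            }
        }
    }

    /** Returns false when the URL path starts with any collected Disallow prefix. */
    boolean isAllowed(String path) {
        for (String prefix : disallowed) {
            if (path.startsWith(prefix)) {
                return false;
            }
        }
        return true;
    }
}
```

Note that robots.txt only states what a site asks crawlers to skip. Detecting a redirect to a login page is a separate check, for example by disabling automatic redirects on the HTTP connection and inspecting the response code.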
For several reasons, a lot of "webmaster guides" (like Google and Yahoo!'s webmaster guides/guidelines) repeat several times that it is better to always set the width and height attributes of the img tag.
One of the most obvious reasons is that the elements in the page won't seem to be "jumping around" to a new location after every picture is loaded (always setting the correct width/height sure gets rid of this behavior). And there are other reasons to follow these guidelines / best practices.
So:
if we consider that these are indeed good practices
if there are a lot of pictures and they are changing often
if pictures aren't changing between two .war re-deploys (that is: there are no user-contributed pictures)
if we don't want to manually edit all these width/height attributes
How do we automatically/programmatically serve HTML pages where every img tag has its width/height attributes correctly set, as the best practice recommends?
We have a more complicated build process, where all the .jsp, .css, .html, etc. are optimized.
.css files with lots of includes are collapsed in a single file (another major one if you're into website optimization, just use Chrome's developer tools or YSlow! and check the difference with and without .css collapsing)
.jsp and .html files have all their image width/height set at build time
images are given unique names and are set as forever cacheable (not unlike what GWT is doing for the JavaScript it generates, where unique identifiers are used). On the next build, they'll get new unique names and will be once again forever cacheable.
etc.
There are a lot of reasons to have a build process more advanced than just "compile and zip all your files".
To answer your question: we do a lot of Un*x shell scripting in our build process. We process the .jsp, .css, .html etc. using some Un*x shell power. And I can tell you that people would have a very hard time replicating what we're doing without being able to combine the power of all those shiny shell commands :)
When the webapp starts up, we're recursively crawling the entire exploded .war looking for every single picture file and determining their width/height.
We save this information in a "file-to-width/height" map. Later on, every single time we generate an img tag, we call a method that gives us back that picture's width/height.
The only "drawback" is that the map lookup and width/height attributes generation are performed at runtime.
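A minimal sketch of this startup scan, assuming the exploded .war is a directory on disk and that javax.imageio can decode the image formats in use. ImageDimensions, scan and imgTag are hypothetical names, not the code described above.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import javax.imageio.ImageIO;

/** Hypothetical "file-to-width/height" map filled once at webapp startup. */
class ImageDimensions {

    // Key: path relative to the scanned root, e.g. "/pics/logo.png"; value: {width, height}.
    private final Map<String, int[]> sizes = new HashMap<String, int[]>();

    /** Recursively records the dimensions of every decodable image under root. */
    void scan(File root) throws IOException {
        scan(root, "");
    }

    private void scan(File dir, String prefix) throws IOException {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or not readable
        }
        for (File child : children) {
            String key = prefix + "/" + child.getName();
            if (child.isDirectory()) {
                scan(child, key);
            } else {
                // ImageIO.read returns null for files it cannot decode (.jsp, .css, ...).
                BufferedImage image = ImageIO.read(child);
                if (image != null) {
                    sizes.put(key, new int[] {image.getWidth(), image.getHeight()});
                }
            }
        }
    }

    /** Renders an img tag, adding width/height when the picture was found by scan(). */
    String imgTag(String path) {
        int[] wh = sizes.get(path);
        if (wh == null) {
            return "<img src=\"" + path + "\">";
        }
        return "<img src=\"" + path + "\" width=\"" + wh[0]
                + "\" height=\"" + wh[1] + "\">";
    }
}
```

Decoding each image fully just to learn its size is wasteful for large picture sets; reading only the header through an ImageReader would be the natural refinement if startup time becomes a concern.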