Passing obtained values to other templates

I have a main template that captures a string:
@(captured: String)
.... other templating stuff
I have a sub template that wants to use @captured:
.... somewhere in this templating stuff we have:
@subTemplate(@captured) <- wants to use @captured
I try this and I get nothing but errors. I'm sure this MUST be possible, so what am I doing wrong? I'm sorry if this question is simple, I just don't know how to phrase it succinctly for Google.

You need to remove the leading @ symbol on captured when it is passed in as an argument.
e.g.
@subTemplate(@captured) --> @subTemplate(captured)
This is because @ is a special symbol that tells Play the template engine is about to do some computation, rather than just output HTML. In the case above, by calling the sub template you have already started a computation (i.e. used the @ symbol), so you do not use it again inside the parentheses, because the compiler is already in computation mode.
This was exactly the same in the Play 1.x template engine.

Remove the leading 'at' in @captured. For some odd reason, Play didn't want to pick up on this and make it work until now. Seeing if I can reproduce the problem.

Related

Displaying a string, from an array within an array with JasperReports

I currently use iReport 5.1.0, Tomcat 7, Java and Spring MVC, among other things.
I believe my current problem lies within iReport itself, although it may be more complex.
I currently have an object (Guide).
My Guide has a list of objects (itemsGuide).
My itemsGuide has a list of objects (itemsTeam).
My itemsTeam has a list of objects (professional).
My professional has a name.
My jasper file receives the Guide as a parameter, and in iReport I must assign the fields and variables. I need to print, on a single line, a couple of strings from the itemsTeam object together with the professional's name. In this specific case there will always be exactly one professional, but it still sits inside a list, so I must access the first object in that list to display its attribute.
I keep getting the most unusual errors as I try to do this rather simple thing.
iReport either doesn't find the variables in itemsTeam or the ones in professional. When, for some unknown reason, it "works", my PDF preview says "The document has no pages". To avoid that error message I changed the attribute in my report properties, but now it simply shows the report without any field or variable, only the graphical elements.
I am utterly confused, would anyone be so kind as to lend a hand to this poor newcomer?
The JRXML file has over 37k characters and I can't post it as an edit, so I uploaded it as a txt file instead.
Thanks in advance.
No more need for the link to the xml ~
-- CATURDAY EDIT --
All right, I seem to be getting closer to my solution.
At this point I've tried quite a few different approaches to my problem.
Although I couldn't completely solve it, I managed to get closer, or so I believe.
Now, all I have to find out (it seems) is:
How to access an array within an array.
I managed my way around the guide and the itemsGuide, but I can't access the array within itemsGuide. I need to access a variable named "team" within that array (itemsGuide), then print the value of a few strings in it (it is of a specific class, itemsTeam) and a string in an object (professional).
All this, without subreport.
Anyone ?
All right, what I spent about 4 days trying to find out, my co-worker with more experience found out in about 20 minutes.
The issue wasn't within iReport itself: it was necessary to create methods in the class that return the desired object/list, something I did not even consider given how little I know about how iReport (and Java, honestly) works.
From what I understood of what was done, it has something to do with reflection and a couple of methods in the Java class.
Thanks guys. :>
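For anyone landing here later, a minimal sketch of what "creating methods in the class" might look like, assuming plain Java beans. The class names mirror the objects in the question; the role attribute and the getFirstProfessionalName helper are hypothetical, not the co-worker's actual code. The idea is that bean-based data sources read values through getters, so a getter that returns the single value lets the report use a plain field instead of indexing into a list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical beans mirroring the object graph from the question.
class Professional {
    private final String name;

    Professional(String name) { this.name = name; }

    public String getName() { return name; }
}

class ItemsTeam {
    private final String role; // hypothetical string attribute
    private final List<Professional> professionals = new ArrayList<>();

    ItemsTeam(String role) { this.role = role; }

    public String getRole() { return role; }

    public List<Professional> getProfessionals() { return professionals; }

    // Convenience getter: exposes the single professional's name as a
    // plain bean property, so the report needs no list indexing.
    public String getFirstProfessionalName() {
        return professionals.isEmpty() ? "" : professionals.get(0).getName();
    }
}

class TeamGetterDemo {
    public static void main(String[] args) {
        ItemsTeam team = new ItemsTeam("Surgeon");
        team.getProfessionals().add(new Professional("Ana"));
        System.out.println(team.getRole() + " - " + team.getFirstProfessionalName());
    }
}
```

With a getter like this, a report field named firstProfessionalName would be the natural counterpart, assuming the report is fed from a bean collection.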

docx Template Docx4j replacing text in Java

I'm new to docx4j and my task is to replace some text in a docx template.
I read the Getting Started guide of docx4j, but I don't think I fully understood the whole concept.
Anyway, I already tried the unmarshalling-template sample of docx4j,
which worked fine with the given docx, but then I got the same problem when I tried it on my own template.
The exceptions say that the HashMap doesn't contain valid keys or values, and therefore it doesn't replace the placeholders.
I got rid of the
<w:proofErr w:type="spellEnd"/>
by disabling the spellchecking, but it still didn't work... And it also takes quite some time to run the app.
I didn't understand the databound example in Getting_Started.pdf, so I'm running out of options...
How can I simply replace some text strings in a docx?
EDIT:
I found out that if I add some text to unmarshallFromTemplate.docx and save it, it won't replace anything in the new lines of text.
The text is somehow split across multiple run tags:
<w:p w:rsidR="002512F8" w:rsidRDefault="002512F8" w:rsidP="002512F8"><w:r><w:t>My</w:t></w:r><w:r w:rsidR="001A5174"><w:t xml:space="preserve"> favourite ice cream is ${DEGREE</w:t></w:r><w:r><w:t>}.</w:t></w:r><w:bookmarkStart w:id="0" w:name="_GoBack"/><w:bookmarkEnd w:id="0"/></w:p>
Editing the text in document.xml and adding the missing information didn't help much.
Anyway, here is the document.xml of the Template.docx that I'm using:
http://uploaded.net/file/vz4qr23o
EDIT 2:
Well guys, I found a suitable workaround for myself and don't know why it took so long to figure it out.
As I was saying, the runs were split up, and in my opinion the reason was the ${}. So I simply used a # before my placeholders and rewrote every placeholder so that each one sits in a single run.
I had to switch to the document.xml a couple of times and rewrite the passages, but then it worked. Then I simply used a replace(placeholder, xml) on the text of the marshalled document.xml and unmarshalled it again.
Worked. End of story, no need for the nightly build or the mappings. Thanks.
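A rough sketch of the replace step described above, assuming the document.xml has already been marshalled to a String and every #-prefixed placeholder sits inside a single run. The class, the method names and the #FLAVOUR placeholder are made up for illustration; getting the XML out of and back into docx4j is not shown.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PlaceholderReplace {
    // Escape characters that are special in XML text content, so the
    // inserted value cannot break the document.xml markup.
    static String escapeXml(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Replace each "#NAME" placeholder in the XML string with its escaped value.
    // Note: a name that is a prefix of another name (A and AB) would clash.
    static String fill(String xml, Map<String, String> values) {
        String result = xml;
        for (Map.Entry<String, String> e : values.entrySet()) {
            result = result.replace("#" + e.getKey(), escapeXml(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<w:r><w:t>My favourite ice cream is #FLAVOUR.</w:t></w:r>";
        Map<String, String> values = new LinkedHashMap<>();
        values.put("FLAVOUR", "mint & lime");
        System.out.println(fill(xml, values));
    }
}
```

Escaping matters here because the replacement happens on raw XML: an unescaped & or < in a value would make the unmarshalling step fail.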

docx4j source code has been on GitHub for a while now; that svn repository is obsolete.
The equivalent sample is now called VariableReplace. That code is a bit more efficient, but you need to build it yourself, or use a current nightly build.
You'll probably find running VariablePrepare addresses your issue.

The placeholder search and replace code built in to docx4j works just fine, but if you're having issues with placeholders getting broken up by rsid entities, you need to ensure that you have grammar and spell-checking disabled when saving your "template" (i.e. source) document. This will help prevent your text runs becoming fragmented (note that you might want to disable proof-reading too, as that inserts bookmark tags here there and everywhere).
Once you've done the search and replace and have a new / updated document, you can re-enable spell-checking easily enough. This thread has more on RSIDs: turnoff rsid's spell check & grammar check in generated xml

Extracting webpage information based on a template in Java

Right now I use Jsoup to periodically extract certain information (not all the text) from some third-party webpages. This works fine until the HTML of a webpage changes, which then forces a change in the existing Java code. This is tedious because these webpages change very frequently, and it requires a programmer to fix the Java code. Here is an example of the HTML I'm interested in on a webpage:
<div>
<p><strong>Score:</strong>2.5/5</p>
<p><strong>Director:</strong> Bryan Singer</p>
</div>
<div>some other info which I dont need</div>
Now here is what I want to do, I want to save this webpage (an HTML file) locally and create a template out of it, like:
<div>
<p><strong>Score:</strong>{MOVIE_RATING}</p>
<p><strong>Director:</strong>{MOVIE_DIRECTOR}</p>
</div>
<div>some other info which I dont need</div>
Along with the actual URLs of the webpages, these HTML templates will be the input to the Java program, which will find the location of the predefined keywords (e.g. {MOVIE_RATING}, {MOVIE_DIRECTOR}) and extract the corresponding values from the actual webpages.
This way I wouldn't have to modify the Java program every time a webpage changes. I would just save the webpage's HTML and replace the data with these keywords, and the rest would be taken care of by the program. For example, in future the actual HTML code may look like this:
<div>
<div><b>Rating:</b>**1/2</div>
<div><i>Director:</i>Singer, Bryan</div>
</div>
and the corresponding template will look like this:
<div>
<div><b>Rating:</b>{MOVIE_RATING}</div>
<div><i>Director:</i>{MOVIE_DIRECTOR}</div>
</div>
Also, creating these kinds of templates can be done by a non-programmer, anyone who can edit a file.
Now the question is, how can I achieve this in Java and is there any existing and better approach to this problem?
Note: While googling I found some research papers, but most of them require some prior learning data and accuracy is also a matter of concern.

The approach you gave is pretty much similar to Gilbert's except the regex part. I don't want to step into the ugly regex world; I am planning to use the template approach for many other areas apart from movie info, e.g. prices, product specs extraction etc.
1. The template you describe is not actually a "template" in the normal sense of the word: a set of static content that is dumped to the output with a bunch of dynamic content inserted within it. Instead, it is the "reverse" of a template: a parsing pattern that is slurped up and discarded, leaving the desired parameters to be found.
2. Because your web pages change regularly, you don't want to hard-code the content to be parsed too precisely, but want to "zoom in" on its essential features, making the minimum of assumptions, i.e. you want to commit to literally matching key text such as "Rating:" and treat interleaving markup such as "<b/>" in a much more flexible manner, ignoring it and allowing it to change without breaking.
3. When you combine (1) and (2), you can give the result any name you like, but IT IS parsing using regular expressions, i.e. the template approach IS the parsing approach using a regular expression. They are one and the same. The question is: what form should the regular expression take?
3A. If you use Java hand-coding to do the parsing, then the obvious answer is that the regular expression format should just be the java.util.regex format. Anything else is a development burden, is "non-standard" and will be hard to maintain.
3B. If you want to use an HTML-aware parser, then jsoup is a good solution. The problem is that you need more text/regular expression handling and flexibility than jsoup seems to provide. It seems too locked into specific HTML tags and structures, and so breaks when pages change.
3C. You can use a much more powerful grammar-controlled general text parser such as ANTLR: a form of Backus-Naur inspired grammar is used to control the parsing, and generator code is inserted to process parsed data. Here, the parsing grammar expressions can be very powerful indeed, with complex rules for how text is ordered on the page and how text fields and values relate to each other. The power is beyond your requirements because you are not processing a language. And there's no escaping the fact that you still need to describe the ugly bits to skip, such as markup tags etc. And wrestling with ANTLR for the first time involves educational investment before you get productivity payback.
3D. Is there a java tool that just uses a simple template type approach to give a simple answer? Well a google search doesn't give too much hope https://www.google.com/search?q=java+template+based+parser&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-GB:official&client=firefox-a. I believe that any attempt to create such a beast will degenerate into either basic regex parsing or more advanced grammar-controlled parsing because the basic requirements for matching/ignoring/replacing text drive the solution in those directions. Anything else would be too simple to actually work. Sorry for the negative view - it just reflects the problem space.
My vote is for (3A) as the simplest, most powerful and flexible solution to your needs.
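To make (3A) a bit more concrete, here is a small sketch with java.util.regex that commits only to the label text and skips whatever markup sits between the label and the value. The class name, the method name and the "Score|Rating" alternatives are my own illustrative choices; the HTML snippets are the ones from the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LabelValueRegex {
    // Finds a label such as "Score" or "Rating", then a colon, skips any
    // whitespace and markup tags, and captures the text up to the next tag.
    // Returns null when the label is not present.
    static String extract(String html, String labelAlternatives) {
        Pattern p = Pattern.compile(
                "(?:" + labelAlternatives + ")\\s*:(?:\\s|<[^>]*>)*([^<]+)");
        Matcher m = p.matcher(html);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String oldPage = "<p><strong>Score:</strong>2.5/5</p>";
        String newPage = "<div><b>Rating:</b>**1/2</div>";
        System.out.println(extract(oldPage, "Score|Rating"));
        System.out.println(extract(newPage, "Score|Rating"));
    }
}
```

The same pattern survives the change from strong to b tags because the tags are skipped rather than matched; a renamed label still has to be added to the alternatives by hand.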

Not really a template-based approach here, but jsoup can still be a workable solution if you just externalize your Selector queries to a configuration file.
Your non-programmer doesn't even have to see HTML, just update the selectors in the configuration file. Something like SelectorGadget will make it easier to pick out what selector to actually use.

How can I achieve this in Java and is there any existing and better approach to this problem?
The template approach is a good approach. You gave all of the reasons why in your question.
Your templates would consist of just the HTML you want to process, and nothing else. Here's my example based on your example.
<div>
<p><strong>Score:</strong>{MOVIE_RATING}</p>
<p><strong>Director:</strong>{MOVIE_DIRECTOR}</p>
</div>
Basically, you would use Jsoup to process your templates. Then, as you use Jsoup to process the web pages, you check all of your processed templates to see if there's a match.
On a template match, you find the keywords in the processed template, then you find the corresponding values in the processed web page.
Yes, this would be a lot of coding, and more difficult than my description indicates. Your Java programmer will have to break this description down into simpler and simpler tasks until she or he can code the tasks.

If the web page changes frequently, then you'll probably want to confine your search for the fields like MOVIE_RATING to the smallest possible part of the page, and ignore everything else. There are two possibilities: you could either use a regular expression for each field, or you could use some kind of CSS selector. I think either would work and either "template" can consist of a simple list of search expressions, regex or css, that you would apply. Just roll through the list and extract what you can, and fail if some particular field isn't found because the page changed.
For example, the regex could look like this:
Score:.*[0-9]\.[0-9]/[0-9]
(I haven't tested this.)
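A sketch of the "roll through the list and fail if a field isn't found" idea, assuming one regex per field with the value in capture group 1. The FieldRules class and the two patterns are illustrative only and are tuned to the sample HTML from the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FieldRules {
    // Applies one regex per field; group 1 of each pattern is the value.
    // Throws if any field is missing, which signals that the page changed.
    static Map<String, String> extractAll(String html, Map<String, Pattern> rules) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> rule : rules.entrySet()) {
            Matcher m = rule.getValue().matcher(html);
            if (!m.find()) {
                throw new IllegalStateException("Field not found: " + rule.getKey());
            }
            out.put(rule.getKey(), m.group(1).trim());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Pattern> rules = new LinkedHashMap<>();
        rules.put("MOVIE_RATING", Pattern.compile("Score:.*?([0-9]\\.[0-9]/[0-9])"));
        rules.put("MOVIE_DIRECTOR", Pattern.compile("Director:</strong>([^<]+)"));
        String html = "<p><strong>Score:</strong>2.5/5</p>"
                + "<p><strong>Director:</strong> Bryan Singer</p>";
        System.out.println(extractAll(html, rules));
    }
}
```

Since the rules are just strings, they could live in a properties file and be edited without touching the Java code.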

Or you can try a different approach, using what I would call 'rules' instead of templates: for each piece of information that you need from the page, you define jQuery expression(s) that extract the text. Often, when a page change is small, the same well-written jQuery expressions still give the same results.
Then you can use Jerry (jQuery in Java), with almost the same expressions, to fetch the text you are looking for. So it's not only about selectors; you also have other jQuery methods for walking/filtering the DOM tree.
For example, a rule for some Director text would be (in sort of pseudo-Java-Jerry code):
$.find("div#movie").find("div:nth-child(2)")....text();
There could be more (and more complex) expressions in the rule, spread across several lines, that for example iterate some nodes etc.
If you are an OO person, each rule may be defined in its own implementation. If you are a Groovy person, you can even rewrite rules when needed, without recompiling your project, while still being in Java. Etc.
As you see, the core idea here is to define rules for how to find your text, and not to match against patterns, as that may be fragile to minor changes - imagine if just a space has been added between two divs :). In this example of mine, I've used jQuery-like syntax (actually, Jerry-like syntax, since we are in Java) to define rules. This is only because jQuery is popular and simple, and known by your web developer too; in the end you can define your own syntax (depending on the parsing tool you are using): for example, you may parse HTML into a DOM tree and then write rules, using your own helper methods, for how to traverse it to the place of interest. Jerry also gives you access to the underlying DOM tree.
Hope this helps.

I used the following approach to do something similar in a personal project of mine that generates an RSS feed out of the leading real estate website in Spain.
Using this tool I found the rented place I'm currently living in ;-)
Get the HTML code from the page
Transform the HTML into XHTML. I used a library for this; I guess there might be better options available today
Use XPath to navigate the XHTML to the information you're interested in
Of course, every time they change the original page you will have to change the XPath expression. The other approach I can think of - semantic analysis of the original HTML source - is far, far beyond my humble skills ;-)
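The last step can be sketched with the XPath support that ships with the JDK. This assumes the page has already been converted to well-formed XHTML and, for simplicity, carries no DOCTYPE or namespace declaration; the XPathExtract class and the sample markup are only for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class XPathExtract {
    // Parses well-formed XHTML and returns the trimmed string value of the
    // first node selected by the XPath expression ("" if nothing matches).
    static String evaluate(String xhtml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            return XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc).trim();
        } catch (Exception e) {
            // Wrap checked parser/XPath exceptions to keep call sites simple.
            throw new IllegalArgumentException("Cannot evaluate " + expression, e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<div><p><strong>Score:</strong>2.5/5</p>"
                + "<p><strong>Director:</strong> Bryan Singer</p></div>";
        System.out.println(evaluate(xhtml, "//p[strong='Director:']/text()"));
    }
}
```

Anchoring the expression on the label text rather than on element positions makes it a little less sensitive to layout changes, though it still breaks when the tags themselves change.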

Search for commented-out code across files in Eclipse

Is there a quick way to find all the commented-out code across Java files in Eclipse?
Any option in Search, perhaps, or any add-on that can do this?
It should be able to find only code which is commented out, but not ordinary comments.

In Eclipse, I just do a file search with the regular expression checkbox turned on:
(/\*.*;.*\*/)|(//.*;)
It will find semicolons in
// These;
and /* these; */
Works for me.
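If you want to run the same check outside Eclipse, the expression carries over to java.util.regex unchanged. A small sketch that tests it line by line (the CommentedCodeGrep class is just an illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class CommentedCodeGrep {
    // Same expression as above; '.' does not cross line breaks by default,
    // so block comments are only matched when they sit on one line.
    static final Pattern SUSPECT = Pattern.compile("(/\\*.*;.*\\*/)|(//.*;)");

    // Returns the 1-based numbers of lines that look like commented-out code.
    static List<Integer> suspectLines(String source) {
        List<Integer> hits = new ArrayList<>();
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (SUSPECT.matcher(lines[i]).find()) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String source = "int a = 1; // plain remark\n"
                + "// int b = 2;\n"
                + "/* foo(); */\n";
        System.out.println(suspectLines(source)); // prints [2, 3]
    }
}
```

Line 1 is not reported because the semicolon comes before the //, not after it.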

Sonar can do it: http://www.sonarsource.org/commented-out-code-eradication-with-sonar/

You can mark your own commented code with a task tag. You can create your own task tags in Eclipse.
From the menu, go to Window -> Preferences. In the Preferences dialog, go to General -> Editors -> Structured Text Editors -> Task Tags.
Add an appropriate task tag, like COMMENTED. Set the priority to Low.
Then, any code you comment out, you can mark with the COMMENTED task tag. A list of these task tags, along with their locations, appears in the Tasks view.

@Jorn said:
I think [the OP] wants to find code that is commented out, not code that has a comment.
If the intention is to find commented out code, then I don't think it is possible in general. The problem is that it is impossible to distinguish between comments that were written as code or pseudo-code, and code that is commented out. Making that distinction requires human intelligence.
Now IDEs typically have a "toggle comments" function that comments out code in a particular way. It would be feasible to write a tool / plugin that matches the style produced by a particular IDE. But that's probably not good enough, especially since reformatting the code typically gets rid of the characteristics that made the commented out code recognizable.

If the problem is to find commented-out code, what is needed is a way to find comments, and a way to decide if a comment might contain code.
A simple way to do this is to search for comments that contain code-like things. I'd be tempted to hunt for comments containing a ";" character (or some other rare indicator such as "="); it will be pretty hard to have any interesting commented code that doesn't contain this, and in my experience with comments, I don't see many that people write that contain this. A regexp search for this should be pretty straightforward, even if it picks up a few additional false positives (e.g. // in a string literal).
A more sophisticated way to accomplish this is to use a Java lexer or parser. If you have a lexer that returns comments as tokens (not all of them do; Java compilers aren't interested in comments), then you can simply scan the lexemes for a comment and do the semicolon check I described above. You won't get any false positive hits for comment-like things in string literals with this approach.
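To illustrate the lexer idea without pulling in a real Java lexer, here is a deliberately tiny hand-written scanner. It only knows about string literals, char literals and the two comment forms, and it ignores things like text blocks and Unicode escapes, so treat it as a sketch of the technique rather than a substitute for a proper lexer. All names in it are made up.

```java
import java.util.ArrayList;
import java.util.List;

class CommentScanner {
    // Collects the text of every comment, skipping string and char literals
    // so that "//" or "/*" inside a literal is not mistaken for a comment.
    static List<String> comments(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        int n = src.length();
        while (i < n) {
            char c = src.charAt(i);
            if (c == '"' || c == '\'') {
                // Skip the literal, honouring backslash escapes.
                int j = i + 1;
                while (j < n && src.charAt(j) != c) {
                    j += (src.charAt(j) == '\\') ? 2 : 1;
                }
                i = j + 1;
            } else if (src.startsWith("//", i)) {
                int end = src.indexOf('\n', i);
                if (end < 0) {
                    end = n;
                }
                out.add(src.substring(i, end));
                i = end;
            } else if (src.startsWith("/*", i)) {
                int end = src.indexOf("*/", i + 2);
                end = (end < 0) ? n : end + 2;
                out.add(src.substring(i, end));
                i = end;
            } else {
                i++;
            }
        }
        return out;
    }

    // The semicolon heuristic from above, applied to real comments only.
    static List<String> suspects(String src) {
        List<String> out = new ArrayList<>();
        for (String comment : comments(src)) {
            if (comment.contains(";")) {
                out.add(comment);
            }
        }
        return out;
    }
}
```

For a line such as String s = "// not; a comment"; // int x = 1; only the trailing comment is reported, which is exactly the false positive the plain regex search cannot avoid.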
If you have a re-engineering parser that captures comments as part of the AST (such as our SD Java Front End), you can mechanically scan the parse tree for comments, feed the comment content back to the parser to see if the content is code-like, and report any that pass that test modulo some size-dependent error rate (10 errors in 15 characters implies "really is a comment"). Now the "code-like" test requires that the reengineering parser be willing to recognize any substring of the (Java) language. Our DMS Software Reengineering Toolkit underlying the Java Front End can actually do that, using access to the grammar buried in the front end, as it is willing to start a parse for any language (non)terminal, and this question is "can you find a sequence of (non)terminals that consumes the string?".
The lexer and parser approaches are small and big sledgehammers respectively. If OP is going to do this just once, he can stick to the manual regex search. If the problem is to vet the code base repeatedly (needed in big organizations), he'd want a tool that can be run on a regular basis.

You can do a search in Eclipse.
All you need to search for is /* and //
However, you will only find the files which contain that expression, and not the actual content which I believe you are after.
However, if you are using Linux you can easily get all the comments with a one liner.

Validating a Postscript without trying to print it?

Saving data to Postscript in my app results in a Postscript file which I can view without issues in GhostView, but when I try to print it, the printer isn't able to print it because it seems to be invalid.
Is there a way to validate / find errors in Postscript files without actually sending it to a printer? Preferred would be some kind of Java API/library, but a program which does the same would be fine as well.
Edit #1 : no I don't know why it's invalid, nor even necessarily if it's invalid, but would like to be able to validate it outside of ghostview, or figure out what's going on when it can't print.
Answer : Well using the ps2ps trick I was able to see the output that Postscript does and there check the difference. The difference was that I am not allowed to have a decimal number for the width or height of images in the Postscript, but rather only integers. So I still didn't find a way to validate, but this way was good enough for my problem. Thanks.

Whenever I need to validate a PostScript file using Ghostscript without having to actually look at its rendered page images I use the "nullpage" device:
gswin32c ^
-sDEVICE=nullpage ^
-dNOPAUSE ^
-dBATCH ^
c:/path/to/file/to/be/validated.pdf-or-ps ^
1>validated.stdout ^
2>validated.stderr
In case of a problem, there will be a non-zero %errorlevel% set, and the validated.stderr logfile will contain all the messages Ghostscript spit out during rendering.
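Since the question asked for something usable from Java, the same call can be wrapped with ProcessBuilder. This is only a sketch: it assumes Ghostscript is installed and that you pass the right executable name (gswin32c as above on Windows, usually gs elsewhere); the GhostscriptCheck class and its method names are made up.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

class GhostscriptCheck {
    // Builds the same command line as above for a given executable and file.
    static List<String> command(String gsExecutable, String file) {
        return Arrays.asList(gsExecutable, "-sDEVICE=nullpage", "-dNOPAUSE", "-dBATCH", file);
    }

    // Runs Ghostscript and returns true when it exits with code 0.
    // Everything Ghostscript prints ends up in the given log file.
    static boolean interprets(String gsExecutable, String file, File log)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command(gsExecutable, file));
        pb.redirectErrorStream(true); // merge stderr into stdout
        pb.redirectOutput(log);
        return pb.start().waitFor() == 0;
    }
}
```

A clean exit only tells you that Ghostscript could interpret the file; a printer with a different interpreter may still reject it.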

Do you know why it's invalid?
My suggestion would have been to feed it to Ghostscript/Ghostview, but given Ghostview can view it, it would seem that at least some interpreters think it is valid Postscript.
So it may be something specific to your printer - either it's picky about something in the PS that Ghostscript allows, or it's accessing something that doesn't exist on your printer (filesystem, perhaps) or exceeding some limit of memory, or...
The point being that it may not be an erroneous PS program, and so a library/API to validate it might not help.
Edit: Does any of it print? Have you tried a printer from a different manufacturer (or vendor of Postscript interpreter, anyway). Does Ghostview give/log any warnings or errors?
Where (what application) does the document originate from?
Can you generate other instances of the document? (e.g. a really simple/empty one to see if that also gives errors)
Unless there's an API providing access to the specific interpreter that's used in your printer, I think you are validating it against another PS interpreter (Ghostscript).
Since there aren't that many PS clones in the world, getting access to another non-GS based one probably isn't going to be easy
Edit2: This link (if quite old information) gives information about how to get more details from your printer on the error: http://www.quite.com/ps/errors.htm

If you can see it on ghostview, it means ghostscript can parse it.
So, one trick you could try using to print (but not to actually validate) your file would be to use ghostscript's postscript output mode (there is a wrapper called ps2ps for it, which mainly adds -sDEVICE=pswrite; there is also ps2ps2 which uses -sDEVICE=ps2write).
