Recently we had to upgrade from iText version 5 to version 5.2, since the latter supports Chinese languages. However, one major change in the newer version is that the constructor of the PdfTextExtractor class no longer accepts a TextProvidingRenderListener. We had customized a class to use this feature:
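public class CustomLocationAwarePdfRenderListener
        implements TextProvidingRenderListener
{
    // Threshold supplied by the caller (see the description below)
    private int lineAlignErrorAllowed;

    public CustomLocationAwarePdfRenderListener(int lineAlignErrorAllowed)
    {
        this.lineAlignErrorAllowed = lineAlignErrorAllowed;
        reset();
    }
    // ... rest of the class not shown
}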
As a result, there is now no way to use this class or the lineAlignErrorAllowed attribute that is passed to the CustomLocationAwarePdfRenderListener constructor. lineAlignErrorAllowed basically sets the minimum number of lines to scan before the source is considered too complex.
Any help on this topic would really be appreciated.
Managed to fix this issue after realizing that the LocationTextExtractionStrategy class in the latest iText version is a worthy replacement for TextProvidingRenderListener, although this time we had to extend the class rather than implement an interface. The only other change needed was to alter the code to use the now static PdfTextExtractor class by passing an instance of the LocationTextExtractionStrategy subclass to getTextFromPage. I had a bit of a struggle finding the latest iText API reference docs, but found them eventually. (They are formatted somewhat differently from regular Java API docs, but one can live with that.)
I have an old application I am upgrading from Saxon-HE 9.2 to 9.5 (and hopefully to 9.8). The application implements the Saxon Debugger interface. After upgrading to 9.5, the Debugger no longer fires any events. I thought this might be due to the byte code optimization, and set GENERATE_BYTE_CODE to false. However, I still receive no debugging events. I believe I found the issue, but don't know the fix. In 9.2, InstructionInfo had a number of subclasses, including StyleElement. My code expects some InstructionInfos to be StyleElements, which they no longer are.
public class Saxon2TraceListener implements TraceListener {
    // implement interface
    public void enter(InstructionInfo instruction, XPathContext context)
    {
        if (!(instruction instanceof StyleElement))
            return;
        // do logic with StyleElement
    }
}
How can I get a StyleElement from InstructionInfo?
I'm afraid when you're working with low-level interfaces like this, there's no substitute for reading and understanding the source code. 9.2 and 9.5 are both unsupported releases, and in any case this is the kind of support that we only really offer to paying customers - we have to draw the line somewhere.
I think the actual Debugger interface has been obsolete for some time. Its original idea was to allow you to annotate a stack frame with the names of the variables occupying each slot, but that's now a standard product feature and doesn't require a custom debugger.
You seem to be talking instead about the TraceListener interface, which has certainly undergone changes over successive releases, inevitably since it gives you access to the internal representation of a compiled stylesheet which is something we are always tweaking.
I'm not sure what the situation was in 9.5 or 9.8, but in 10.x the argument to TraceListener.enter() and TraceListener.leave() has changed from an InstructionInfo to a Traceable, and every expression and instruction is a Traceable.
A StyleElement is a node in the tree representation of the source stylesheet, and the source stylesheet no longer exists at runtime, whether you're tracing/debugging or not.
I'm migrating from iText v5 to v7 and found that the PdfSignatureAppearance method setSignDate() has changed from public to protected. I can't find the reason why this was necessary. (I know that a protected method can only be used in a subclass or in the same package.)
Am I missing some good Java design pattern?
Should I create a class, say IpdfSignatureAppearance, that inherits from PdfSignatureAppearance and calls the actual method?
v5 https://api.itextpdf.com/iText5/5.5.13/
v7 https://api.itextpdf.com/iText7/java/7.0.4/
iText 5 to iText 7 has been a major overhaul, and even if a number of classes in iText 7 still have names known from iText 5, the functionality may have changed considerably or moved between classes.
For example, in the case at hand, that method became protected on 2015-10-29 09:05:58 in commit ba907ff8e40de9457ac08a2138a9a9732b6c7d68 with the comment:
Refactored signatures module.
Moved the code related to the actual signing into separate class (PdfSigner). Removed unused methods.
Indeed, if you need to set the signing time in iText 7, you now do so in the associated PdfSigner instance using its public setSignDate method; that method in turn calls PdfSignatureAppearance.setSignDate among other things.
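The shape of that refactoring can be sketched without iText on the classpath. The classes below are hypothetical stand-ins (SketchAppearance, SketchSigner, SignDateDemo), not the real PdfSignatureAppearance and PdfSigner; they only illustrate how a public setter on a signer can forward to a protected setter on an appearance class in the same package.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical stand-in for the appearance class; NOT the iText class.
class SketchAppearance {
    private Calendar signDate;

    // protected: callable from subclasses and from classes in the same package
    protected SketchAppearance setSignDate(Calendar signDate) {
        this.signDate = signDate;
        return this;
    }

    public Calendar getSignDate() {
        return signDate;
    }
}

// Hypothetical stand-in for the signer class that now owns the signing logic.
class SketchSigner {
    private final SketchAppearance appearance = new SketchAppearance();
    private Calendar signDate = new GregorianCalendar();

    // Public entry point: stores the date and forwards it to the appearance.
    public void setSignDate(Calendar signDate) {
        this.signDate = signDate;
        appearance.setSignDate(signDate); // allowed: same package
    }

    public Calendar getSignDate() {
        return signDate;
    }

    public SketchAppearance getSignatureAppearance() {
        return appearance;
    }
}

class SignDateDemo {
    public static void main(String[] args) {
        SketchSigner signer = new SketchSigner();
        Calendar when = new GregorianCalendar(2020, Calendar.JANUARY, 15);
        signer.setSignDate(when);
        // Both objects now see the same date instance.
        System.out.println(signer.getSignatureAppearance().getSignDate() == when); // prints "true"
    }
}
```

With this layout, client code outside the package has a single place to set the date, so the signer and the appearance cannot drift apart. Subclassing the appearance just to re-expose the setter would bypass that.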
In a (now) relatively old book, "Java Puzzlers", the authors talk about the inlining problems that can occur in static final fields (Puzzle 93: Class Warfare discussed here).
Essentially Java used to have a problem where due to how classes load, you could run into the issue that if a library class (class A) is recompiled with a changed static final field, then a class (class B) which uses that field might not function properly. This could happen because the class A field might have been inlined into class B during compilation, so that when the library is updated, class B does not incorporate the change in the class A field.
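The inlining described above can be observed in a single file, without recompiling anything. This is a minimal sketch with hypothetical class names (InitLog, Library, InliningDemo). It relies on two rules from the Java Language Specification: a static final field initialized with a constant expression is copied into the calling class by the compiler, and reading such a constant does not trigger initialization of the declaring class. A static final field whose initializer is not a constant expression is read from the declaring class at run time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that records class-initialization events.
final class InitLog {
    static final List<String> EVENTS = new ArrayList<>();

    private InitLog() {}
}

// Hypothetical stand-in for "class A", the library class.
final class Library {
    static {
        InitLog.EVENTS.add("Library initialized");
    }

    // Compile-time constant: the compiler copies 42 into every caller.
    static final int INLINED = 42;

    // Not a constant expression: callers read the field at run time.
    static final int NOT_INLINED = Integer.parseInt("42");

    private Library() {}
}

// Hypothetical stand-in for "class B", the client class.
final class InliningDemo {
    static int readInlined() {
        return Library.INLINED;
    }

    static int readNotInlined() {
        return Library.NOT_INLINED;
    }

    public static void main(String[] args) {
        // Library has not been touched yet: the value came from InliningDemo itself.
        System.out.println(readInlined() + " " + InitLog.EVENTS);    // prints "42 []"
        // This read goes through Library and forces its initialization.
        System.out.println(readNotInlined() + " " + InitLog.EVENTS); // prints "42 [Library initialized]"
    }
}
```

Because the first read never consults Library, replacing Library with a recompiled version that changes INLINED would not change what an already compiled InliningDemo returns. A field declared like NOT_INLINED, or a value exposed through a method, does not have that property.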
My simple question is... Is this still a problem? Do the newer versions of Java redefine the class loading procedure so that we do not have to worry about such issues?
I can only find relatively old posts touching on this issue (pre-2014), which makes me think that this issue has somehow been addressed, but I can find no definitive source.
Also, if it makes any difference, I am particularly interested if this will be a problem in Android.
In Lucene 4.3.1 there was an interface, StandardTokenizerInterface, implemented by a number of classes such as StandardTokenizerImpl and ..... This interface doesn't exist in Solr 5.3.1. What is its replacement in Solr 5.3.1?
The interface was not replaced, it was removed entirely, as it was deemed to no longer serve a useful purpose, due to the changes in how backwards compatibility is handled (instead of passing in a version arg, you would just use StandardTokenizer40, for instance). Ticket here: LUCENE-6000
The calls specified in the interface are still used in pretty much the same way by the current StandardAnalyzerImpl though, as far as I can tell.
I am doing an OCR project. getInstance() in tess4j is deprecated, and I can't even use Tesseract.Tesseract(), which gives an error. How can I solve this?
Code with Tesseract.getInstance()
Code with Tesseract.Tesseract()
This is what was displayed when I compiled the program after inserting
Tesseract tess = new Tesseract();
Deprecated methods can still be used. The @Deprecated annotation just means that the library developer plans to stop supporting this method (or remove it from the library) in a future release.
More precisely, from the @Deprecated documentation,
A program element annotated @Deprecated is one that programmers are discouraged from using, typically because it is dangerous, or because a better alternative exists.
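As a small illustration of the point that deprecated elements still work, here is a minimal sketch with hypothetical names (LegacyApi, DeprecationDemo); it is unrelated to tess4j. The deprecated method runs normally, and since @Deprecated is retained at run time, reflection can confirm that the annotation is present.

```java
import java.lang.reflect.Method;

// Hypothetical library class with one deprecated and one current method.
class LegacyApi {
    /** @deprecated use {@link #greet(String)} instead. */
    @Deprecated
    static String oldGreet(String name) {
        return greet(name);
    }

    static String greet(String name) {
        return "Hello, " + name;
    }
}

class DeprecationDemo {
    // Calling a deprecated method compiles and runs; at most the compiler warns.
    @SuppressWarnings("deprecation")
    static String callDeprecated() {
        return LegacyApi.oldGreet("world");
    }

    // Checks via reflection whether a one-argument LegacyApi method is marked @Deprecated.
    static boolean isDeprecated(String methodName) {
        try {
            Method m = LegacyApi.class.getDeclaredMethod(methodName, String.class);
            return m.isAnnotationPresent(Deprecated.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callDeprecated());          // prints "Hello, world"
        System.out.println(isDeprecated("oldGreet"));  // prints "true"
        System.out.println(isDeprecated("greet"));     // prints "false"
    }
}
```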
You may want to check these out:
What does it mean for a method to be deprecated?
The constructor Date(...) is deprecated. What does it mean? (Java)
Is it wrong to use Deprecated methods or classes in Java?
How and When To Deprecate APIs (Oracle official documentation)
What does the deprecated API warning mean?
It is not a good practice, however, to use deprecated methods and classes, as they may lead to future bugs and compilation problems in your system if the methods or classes are removed and you update the library versions.
However, in your case, Tesseract() is a class constructor. You are making the wrong call, as the correct one would be
Tesseract instance = new Tesseract();
Have a look at the Tess4j documentation to learn more about the Tesseract class.
Tesseract() is a constructor, so you need to use new Tesseract() to get one.