Referring to Build text callout with PDF Clown - Is there a possibility to change the font color of the text within the callout note?
I haven't found a suitable method yet, can someone please give me a hint?
There is no explicit PDF Clown method to set the text color. This might be related to the fact that there is no explicit entry in the PDF annotation dictionary for it either.
There are two options, though:
There is a default appearance (DA) entry for variable text in annotations in general. As PDF Clown does not hide generic object methods, you can extend the original callout sample like this:
// Callout.
composer.showText("Callout note annotation:", new Point(35, 85));
new StaticNote(
    page,
    new Rectangle(250, 90, 150, 70),
    "Text of the Callout note annotation"
  ).withLine(
    new StaticNote.CalloutLine(
      page,
      new Point(250,125),
      new Point(150,125),
      new Point(100,100)
    )
  )
  .withLineEndStyle(LineEndStyleEnum.OpenArrow)
  .withBorder(new Border(1))
  .withColor(DeviceRGBColor.get(Color.YELLOW))
  .getBaseDataObject().put(PdfName.DA, new PdfString("1 0 1 rg /Ti 12 Tf"));
You have to use plain PDF instructions there, though: rg sets the RGB fill color defined by the three preceding values, and Tf sets the font and size according to the two preceding values. The result of the above is:
As you see, the text now is purple (red 100%, green 0%, blue 100%). A side effect is, though, that the callout line and the frame around the callout box also are purple.
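To avoid hand-writing such strings, the DA value can be assembled from its parts. The following is only a minimal sketch: the class DefaultAppearance and its methods are hypothetical helpers, not part of PDF Clown, and the font name passed in is assumed to be a font resource name that the viewer can resolve (like Ti above).

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of PDF Clown: builds a default appearance (DA) string.
final class DefaultAppearance {

    // Writes a number without trailing zeros, e.g. 1.0 -> "1", 0.5 -> "0.5".
    static String num(double value) {
        return new BigDecimal(Double.toString(value)).stripTrailingZeros().toPlainString();
    }

    // red, green, blue are PDF color components in the range 0..1.
    static String build(double red, double green, double blue, String fontResource, double size) {
        for (double c : new double[] { red, green, blue }) {
            if (c < 0 || c > 1) {
                throw new IllegalArgumentException("color component out of range 0..1: " + c);
            }
        }
        // "r g b rg" selects the fill color, "/Name size Tf" selects font and size.
        return num(red) + " " + num(green) + " " + num(blue) + " rg /"
                + fontResource + " " + num(size) + " Tf";
    }

    public static void main(String[] args) {
        System.out.println(build(1, 0, 1, "Ti", 12));
    }
}
```

The resulting string could then be passed to new PdfString(...) in the put call shown above.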
Alternatively, an annotation can bring along its own appearance stream defining the whole appearance of the annotation in question. This means, though, that you really have to draw everything yourself, including lines, frames, backgrounds, and text.
PDF Clown allows you to set the appearance of an annotation using the setAppearance and withAppearance methods.
I am using Apache PDFBox to configure PDTextFields on a PDF document, where I load Lato onto the document using:
font = PDType0Font.load(
#j_pd_document,
java.io.FileInputStream.new('/path/to/Lato-Regular.ttf')
) # => Lato-Regular
font_name = pd_default_resources.add(font).get_name # => F4
I then pass the font_name into a default_appearance_string for the PDTextField like so:
j_text_field.set_default_appearance("/#{font_name} 0 Tf 0 g") # where font_name is
# passed in from above
The issue now occurs when I proceed to invoke setValue on the PDTextField. Because I set the font_size in the defaultAppearanceString to 0, according to the library's example, the text should scale itself to fit in the text box's given area. However, the behaviour of this 'scale-to-fit' is inconsistent for certain fields: it does not always choose the largest font size to fit in the PDTextField. Might there be any further configuration that might allow for this to happen? Below are the PDFs where I've noticed this problem occurring.
Unfilled, with fonts loaded:
http://www.filedropper.com/0postfontload
Filled, with inconsistent textbox text sizing:
http://www.filedropper.com/file_327
Side Note: I am using PDFBox through JRuby, which is just an integration layer that allows Ruby to invoke Java libraries. All Java methods of the library are available; a Java method like thisExampleMethod has a one-to-one translation into the Ruby method this_example_method.
Updates
In response to comments, the appearances that are incorrect in the second uploaded file example are:
1st page Resident Name field (two text fields that have text that is too small for the given input field size)
2nd page Phone fields (four text fields that have text that overflows the given input field size)
Especially the appearances of the Resident Name fields, the Phone fields, and the Care Providers Address fields appear conspicuous. Only the former two are mentioned by the OP.
Let's inspect these fields; all screen shots are made using Adobe Reader DC on MS Windows:
The Resident Name fields
The filled in Resident Name fields look like this
While the height is appropriate, the glyphs are narrower than they should be. Actually this effect can already be seen in the original PDF:
This horizontal compression is caused by the field widget rectangles having a different aspect ratio than the respectively matching normal appearance stream bounding box:
The widget rectangles: [ 45.72 601.44 118.924 615.24 ] and [ 119.282 601.127 192.486 614.927 ], i.e. 73.204*13.8 in both cases.
The appearance bounding box: [ 0 0 147.24 13.8 ], i.e. 147.24*13.8.
So they have the same height but the appearance bounding box is approximately twice as wide as the widget rectangle. Thus, the text drawn normally in the appearance stream gets compressed to half its width when the appearance is displayed in the widget rectangle.
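The compression factor follows directly from the two rectangles. As a rough sketch (the class AppearanceScale is a hypothetical helper, and it assumes the appearance stream has no additional Matrix entry and the widget is not rotated), the scaling a viewer applies can be estimated like this:

```java
// Hypothetical helper: estimates how an appearance bounding box is scaled into a widget rectangle.
final class AppearanceScale {

    // Rectangles are given as [llx, lly, urx, ury], as in the PDF.
    static double width(double[] rect) {
        return rect[2] - rect[0];
    }

    static double height(double[] rect) {
        return rect[3] - rect[1];
    }

    // Returns { horizontal scale, vertical scale }; 1.0 means no distortion on that axis.
    static double[] scale(double[] widgetRect, double[] bbox) {
        return new double[] {
            width(widgetRect) / width(bbox),
            height(widgetRect) / height(bbox)
        };
    }

    public static void main(String[] args) {
        double[] widget = { 45.72, 601.44, 118.924, 615.24 };
        double[] bbox = { 0, 0, 147.24, 13.8 };
        double[] s = scale(widget, bbox);
        System.out.printf("horizontal %.3f, vertical %.3f%n", s[0], s[1]);
    }
}
```

For the first Resident Name widget the horizontal factor comes out at roughly one half while the vertical factor stays at about one, which matches the narrowed glyphs.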
When setting the value of a field PDFBox unfortunately re-uses the appearance stream as is and only updates details from the default appearance, i.e. font name, font size, and color, and the actual text value, apparently assuming the other properties of the appearance are as they are for a reason. Thus, the PDFBox output also shows this horizontal compression
To make PDFBox create a proper appearance, it is necessary to remove the old appearances before setting the new value.
The Phone fields
The filled in Phone fields look like this
and again there is a similar display in the original file
That only the first two letters are shown even though there is enough space for the whole word, is due to the configuration of these fields: They are configured as comb fields with a maximum length of 2 characters.
To have a value here set with PDFBox displayed completely and not so spaced out, you have to remove the maximum length (or at least have to make it no less than the length of your value) and unset the comb flag.
The Care Providers Address fields
Filled in they look like this:
Originally they look similar:
This vertical compression is again caused by the field widget rectangles having a different aspect ratio than the respectively matching normal appearance stream bounding box:
A widget rectangle: [ 278.6 642.928 458.36 657.96 ], i.e. 179.76*15.032.
The appearance bounding box: [ 0 0 179.76 58.56 ], i.e. 179.76*58.56.
Just like in the case of the Resident Name fields above it is necessary to remove the old appearances before setting the new value to make PDFBox create a proper appearance.
A complication
Actually there is an additional issue when filling in the Care Providers Address fields, after removing the old appearances they look like this:
This is due to a shortcoming of PDFBox: These fields are configured as multi line text fields. For single line text fields PDFBox properly calculates the font size based on the content and later makes sure that the text fits quite well vertically. For multi line fields, however, it proceeds very crudely: it selects a hard coded font size of 12 and does not fine tune the vertical position; see the code of the AppearanceGeneratorHelper methods calculateFontSize(PDFont, PDRectangle) and insertGeneratedAppearance(PDAnnotationWidget, PDAppearanceStream, OutputStream).
As in your form these address fields anyways are only one line high, an obvious solution would be to make these fields single line fields, i.e. clear the Multiline flag.
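Clearing a flag here means clearing a bit in the integer value of the field's Ff entry. The following sketch only illustrates the bit arithmetic; the class FieldFlags is a hypothetical helper, and the masks assume the PDF specification's bit positions 13 (Multiline) and 25 (Comb), counted from 1.

```java
// Hypothetical helper illustrating the bit arithmetic on the /Ff field flags value.
final class FieldFlags {

    static final int MULTILINE = 1 << 12; // bit position 13
    static final int COMB = 1 << 24;      // bit position 25

    // Returns the flags value with all bits of the mask cleared.
    static int clear(int flags, int mask) {
        return flags & ~mask;
    }

    // True if every bit of the mask is set in the flags value.
    static boolean isSet(int flags, int mask) {
        return (flags & mask) == mask;
    }

    public static void main(String[] args) {
        int flags = MULTILINE | COMB | 2; // 2 is the Required bit, left untouched
        int cleared = clear(flags, MULTILINE | COMB);
        System.out.println(isSet(cleared, MULTILINE) + " " + isSet(cleared, COMB) + " " + cleared);
    }
}
```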
Example code
Using Java one can implement the solutions explained above like this:
final int FLAG_MULTILINE = 1 << 12;  // /Ff bit position 13
final int FLAG_COMB = 1 << 24;       // /Ff bit position 25

PDDocument doc = PDDocument.load(originalStream);
PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();

// Embed the font without subsetting and register it in the form's default resources.
PDType0Font font = PDType0Font.load(doc, fontStream, false);
String font_name = acroForm.getDefaultResources().add(font).getName();

for (PDField field : acroForm.getFieldTree()) {
    if (field instanceof PDTextField) {
        PDTextField textField = (PDTextField) field;
        // Drop the maximum length and clear the comb and multiline flags.
        textField.getCOSObject().removeItem(COSName.MAX_LEN);
        textField.getCOSObject().setFlag(COSName.FF, FLAG_COMB | FLAG_MULTILINE, false);
        textField.setDefaultAppearance(String.format("/%s 0 Tf 0 g", font_name));
        // Remove the old normal appearances so that PDFBox generates new ones.
        textField.getWidgets().forEach(w -> w.getAppearance().setNormalAppearance((PDAppearanceEntry) null));
        textField.setValue("Test");
    }
}
(FillInForm test testFill0DropOldAppearanceNoCombNoMaxNoMultiLine)
Screen shots of the output of the example code
The Resident Name field value now is not horizontally compressed anymore:
The Phone and Care Providers Address fields also look appropriate now:
File example: file.
Problem: when extracting text using PdfTextStripper, the tokens "9/1/2017" and "387986" appear after "ASSETS" at the start of the page; they should be removed, as should some other hidden tokens.
I have already applied this solution (I do not copy-paste it here because the problem is actually exactly the same), and that hidden text still appears on the page. Could it be hidden by something else except clip path?
thanks!
Could it be hidden by something else except clip path?
Yes. In case of your new document the text is written in white on white, e.g. the 387986 after ASSETS is drawn like this:
1 1 1 rg
/TT0 16 Tf
-1011.938 115.993 Td
(#A,BAC)Tj
The initial 1 1 1 rg sets the fill color to RGB WHITE. (Additionally that text is quite tiny but would still be visible if drawn in e.g. BLACK.)
The solution you refer to was implemented for documents like the sample document presented in that issue in which the invisible text is made invisible by defining clip paths (outside the bounds of which the text is) and by filling paths (hiding the text underneath). Thus, your white text won't be recognized by it as hidden.
Unfortunately, recognizing WHITE on WHITE text as invisible is more difficult than recognizing clipped or covered text: one not only needs to know a property of the current graphics state (like the clip path) or remove all text inside a given path, one also needs to know the color of that part of the page right before the text is drawn (to check the on WHITE detail).
If, on the other hand, you assume the page background to be essentially WHITE, it is fairly simple to ignore all white text: Simply also detect the current fill color in processTextPosition:
PDColor fillColor = gs.getNonStrokingColor();
and compare it to the flavors of WHITE you want to consider invisible. (Usually it should suffice to compare with RGB, CMYK, and Grayscale WHITE; in rare cases you'll also have to correctly interpret more complex color spaces. Additionally you might also consider nearly WHITE colors invisible; (.99, .99, .99) RGB can hardly be distinguished from WHITE.)
If you find the current color to be WHITE, ignore the current TextPosition.
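The comparison itself could look roughly like the following sketch. The class WhiteCheck and its method are hypothetical; it assumes you pass in the color space name and the component values of the fill color (with PDFBox presumably obtainable via fillColor.getColorSpace().getName() and fillColor.getComponents()), and it deliberately treats every color space other than the three device spaces as "not white".

```java
// Hypothetical helper: decides whether a fill color counts as (nearly) WHITE.
final class WhiteCheck {

    static boolean isEffectivelyWhite(String colorSpaceName, float[] components, float tolerance) {
        switch (colorSpaceName) {
            case "DeviceGray":
                // 1 is white in grayscale.
                return components.length == 1 && components[0] >= 1 - tolerance;
            case "DeviceRGB":
                // All three components at (nearly) full intensity.
                return components.length == 3 && allAtLeast(components, 1 - tolerance);
            case "DeviceCMYK":
                // No ink at all means white paper.
                return components.length == 4 && allAtMost(components, tolerance);
            default:
                // More complex color spaces are not interpreted in this sketch.
                return false;
        }
    }

    private static boolean allAtLeast(float[] values, float limit) {
        for (float v : values) {
            if (v < limit) return false;
        }
        return true;
    }

    private static boolean allAtMost(float[] values, float limit) {
        for (float v : values) {
            if (v > limit) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isEffectivelyWhite("DeviceRGB", new float[] { 1, 1, 1 }, 0f));
    }
}
```

A tolerance of 0 only accepts exact WHITE; a small positive tolerance such as 0.02 also catches the nearly WHITE colors mentioned above.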
Be aware, though, just like the solution you referenced this is not yet the final solution recognizing all WHITE text: For that you'll also have to check the text rendering mode: If it is just filling (the default), the above holds, but if it is (also) stroking, you'll (also) have to consider the stroking color; if it is rendered invisible, there is no color to consider; and if the text rendering mode includes adding to path for clipping, you'll have to wait and determine what will be later drawn in this part of the page as long as the clip path holds, definitely not trivial!
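Which colors matter for which text rendering mode can be summarized in a small lookup. This is only a sketch with a hypothetical class name; it assumes the eight rendering modes 0 to 7 of the PDF specification (0 fill, 1 stroke, 2 fill and stroke, 3 invisible, 4 to 6 the same as 0 to 2 plus clipping, 7 clipping only).

```java
// Hypothetical helper: tells which colors are relevant for a PDF text rendering mode (0..7).
final class TextRenderingModes {

    private static void check(int mode) {
        if (mode < 0 || mode > 7) {
            throw new IllegalArgumentException("text rendering mode must be 0..7: " + mode);
        }
    }

    // The fill (non-stroking) color is used.
    static boolean fills(int mode) {
        check(mode);
        return mode == 0 || mode == 2 || mode == 4 || mode == 6;
    }

    // The stroking color is used.
    static boolean strokes(int mode) {
        check(mode);
        return mode == 1 || mode == 2 || mode == 5 || mode == 6;
    }

    // The glyph outlines are added to the clip path, so later drawing matters.
    static boolean clips(int mode) {
        check(mode);
        return mode >= 4;
    }

    // Neither filled nor stroked nor clipped: there is no color to consider.
    static boolean isInvisible(int mode) {
        check(mode);
        return mode == 3;
    }

    public static void main(String[] args) {
        for (int mode = 0; mode <= 7; mode++) {
            System.out.println(mode + ": fill=" + fills(mode) + " stroke=" + strokes(mode)
                    + " clip=" + clips(mode));
        }
    }
}
```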
I am new to the PDFBox API. I would like to apply a style to a text annotation (AirPassengers) like the one marked by the red box below.
I am creating the text annotation as shown below.
PDAnnotationTextMarkup txtMark = new PDAnnotationTextMarkup(PDAnnotationTextMarkup.SUB_TYPE_FREETEXT);
This results in a simple text annotation without any style or background color. I would like to achieve the style shown in the screenshot. Does anybody have an idea how to achieve this?
Do this:
txtMark.setColor(new PDColor(new float[] { 0, 1, 1 }, PDDeviceRGB.INSTANCE));
This sets the color you mentioned (#00FFFF). In PDF, color components are between 0 and 1 and not between 0 and 255. Be aware that the annotation will be visible in Adobe Reader, but at this time not in PDFBox rendering or PDF.js rendering because the Appearance Stream is missing (see my comment in your previous question).
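If your colors are given as hex strings, the conversion to the 0 to 1 range is a division by 255 per channel. A minimal sketch, using a hypothetical helper class PdfRgb:

```java
// Hypothetical helper: converts a "#RRGGBB" string to PDF color components in the range 0..1.
final class PdfRgb {

    static float[] fromHex(String hex) {
        String digits = hex.startsWith("#") ? hex.substring(1) : hex;
        if (digits.length() != 6) {
            throw new IllegalArgumentException("expected six hex digits: " + hex);
        }
        int rgb = Integer.parseInt(digits, 16);
        // Each channel is one byte (0..255); dividing by 255 maps it to 0..1.
        return new float[] {
            ((rgb >> 16) & 0xFF) / 255f,
            ((rgb >> 8) & 0xFF) / 255f,
            (rgb & 0xFF) / 255f
        };
    }

    public static void main(String[] args) {
        float[] c = fromHex("#00FFFF");
        System.out.println(c[0] + " " + c[1] + " " + c[2]);
    }
}
```

The returned array can be used as the component array of the PDColor constructor shown above.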
I am developing a simple devotional app, which has a Kannada (a language in India) sentence to be displayed. I am successful in using typeface and displaying the content.
In a few places I have a word which has a line on top of or below it, as shown below. I tried with a spannable image but I am still not able to achieve it properly.
This is a sample of the code which I am referring to. Here I am using a small icon to display it in between the string.
Spannable span1 = new SpannableString("The imageplace");
Drawable android = TestImageActivity.this.getResources().getDrawable(R.drawable.end);
android.setBounds(5, 0, 20, 5);
ImageSpan image = new ImageSpan(android, ImageSpan.ALIGN_BASELINE);
span1.setSpan(image, 3, 4, Spannable.SPAN_INCLUSIVE_EXCLUSIVE);
tvTextImage3.setText(span1);
ImageSpan extends ReplacementSpan so any characters you are spanning won't get rendered, as the TextLayout is expecting that the span itself will be doing all the rendering.
What I would recommend is implementing your own ReplacementSpan subclass. Since it looks like your graphics are associated with one character, you would wrap the single character.
In the getSize override, you would use start and end to index into text and get the character(s) you are spanning, then use paint.getTextBounds() to measure the width of the text and return that value. You want the width calculation to work in a way that the width of the span doesn't affect the default spacing of the text.
Another thing this method might need to do is change the FontMetrics by increasing the ascent and descent in order to give you some space to draw the lines.
In the draw override, you use the paint to render the text that isn't being rendered within the span. The paint and font metrics should already have the proper values so that your text render looks like the surrounding text. Of course, you'll also render the line graphics you want.
For some sample code, take a look at my answer to a similar question. This has all the pieces I just discussed.
If you want me to write some code for this, you'll need to provide some code that gives me a starting point with some actual Kannada text along with what the lines are and where they go. I don't even know if Kannada text is LTR or RTL; that might affect how the span subclass is coded. Preferably the text would correspond to the image you posted so I can see how it should look when it's working.
I am using iText and want to make my AcroFields curved, say, a text field with rounded corners,
and apply the same to buttons and image fields (pushButtonField).
Is this possible in iText or by using some other API?
Thanks in advance for everybody's valuable replies...
Depending on what you're actually asking, there are three possible answers to this question:
1. ISO-32000 only allows you to define a rectangle as the clickable area of AcroForm fields. This is the area that is highlighted when you select highlight fields. You can define a border for this rectangle consisting of an array containing at least 3 values: the horizontal corner radius, the vertical corner radius and the border width. An optional fourth value allows you to define the dash pattern.
2. Apart from this you can create any appearance for a widget annotation that corresponds with an AcroForm field. The appearance is stored in the /AP entry of the annotation dictionary. This is quite common for button fields (see for instance the createAppearance() method in the Calculator example). This is not done for text fields, as the appearance will disappear the moment somebody changes the value of the text field.
3. Maybe you are asking to create a rectangular border that is part of the content stream of the page as opposed to a shape that is defined at the AcroField level (see for instance how Open Office adds border to form fields: these borders don't disappear when you remove the field dictionary).
If I had to guess, I'd say you're looking for answer 2 regarding buttons and for answer 3 regarding text fields.
Update: thank you for accepting the answer even though I misunderstood the question. You were asking for a field where you define a path (not necessarily a straight line) that will be used to position the value of the field (for instance: a word written in a way that the characters form a circle). That's not possible with AcroForm fields.