I am developing a simple devotional app that displays sentences in Kannada (a language of India). I have successfully applied a custom typeface and the content displays correctly.
In a few places a word has a line above or below it, as shown below. I tried using an ImageSpan in a Spannable, but I still cannot get it to render properly.
Here is a sample of the code I am using. It places a small icon inside the string.
Spannable span1 = new SpannableString("The imageplace");
Drawable android = TestImageActivity.this.getResources().getDrawable(R.drawable.end);
android.setBounds(5, 0, 20, 5);
ImageSpan image = new ImageSpan(android, ImageSpan.ALIGN_BASELINE);
span1.setSpan(image, 3, 4, Spannable.SPAN_INCLUSIVE_EXCLUSIVE);
tvTextImage3.setText(span1);
ImageSpan extends ReplacementSpan (through DynamicDrawableSpan), so any characters you span won't get rendered: the text layout expects the span itself to do all the rendering.
What I would recommend is implementing your own ReplacementSpan subclass. Since it looks like your graphics are associated with one character, you would wrap the single character.
In the getSize override, you would use start and end to index into text and get the character(s) you are spanning, then use paint.getTextBounds() to measure the width of the text and return that value. You want the width calculation to work in a way that the width of the span doesn't affect the default spacing of the text.
Another thing this method might need to do is change the FontMetrics by increasing the ascent and descent in order to give you some space to draw the lines.
In the draw override, you use the paint to render the text that isn't being rendered within the span. The paint and font metrics should already have the proper values so that your text render looks like the surrounding text. Of course, you'll also render the line graphics you want.
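The metric adjustment and line placement described above come down to a little arithmetic. The following is a minimal sketch in plain Java, not Android code: `LineMetrics`, `pad`, and `gap` are hypothetical names, and it assumes Android's convention that ascent is negative (above the baseline) and descent is positive, with `baselineY` standing in for the `y` argument that draw receives.

```java
/** Hypothetical helper modelling the font-metric arithmetic for an over/underline span. */
final class LineMetrics {
    final int ascent;   // negative: distance above the baseline
    final int descent;  // positive: distance below the baseline

    LineMetrics(int ascent, int descent) {
        this.ascent = ascent;
        this.descent = descent;
    }

    /** Returns metrics enlarged by pad pixels above and below, making room for the lines. */
    LineMetrics expandedBy(int pad) {
        return new LineMetrics(ascent - pad, descent + pad);
    }

    /** Y coordinate of a line drawn gap pixels above the original ascent. */
    int overlineY(int baselineY, int gap) {
        return baselineY + ascent - gap;
    }

    /** Y coordinate of a line drawn gap pixels below the original descent. */
    int underlineY(int baselineY, int gap) {
        return baselineY + descent + gap;
    }
}
```

In a real span, the expansion would presumably be applied to the font metrics object passed to getSize (when it is not null), and the two y values would be computed in draw from the paint's unmodified metrics. Keeping `gap` no larger than `pad` keeps the lines inside the space that was reserved.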
For some sample code, take a look at my answer to a similar question. This has all the pieces I just discussed.
If you want me to write some code for this, you'll need to provide some code that gives me a starting point with some actual Kannada text along with what the lines are and where they go. I don't even know if Kannada text is LTR or RTL; that might affect how the span subclass is coded. Preferably the text would correspond to the image you posted so I can see how it should look when it's working.
The first page of this PDF displays the following white decorated text on top of an image.
When using the PDFBox utility PrintImageLocations, this graphic is not extracted as an image; only the background image is extracted, without the white decorated text. When the PDF is converted to a Word document, the decorated text is extracted as a shape with properties that can be modified, such as fill color, border color, and much more.
Is it possible to extract that shape from the PDF, using PDFBox? How?
The simplest way to extract such graphics is to convert those that can be converted into Scalable Vector Graphics (SVG), as here. I had to change the colour from white to magenta, otherwise it would look like a snowscape.
I don't use PDFBox, so I can't say how easy that would be there. I simply exported page 1 as SVG using
MuPDF\mutool.exe convert -o page1.svg -O no-reuse-images Xcel_Energy-AR2018.pdf 1
However, you will get all of the page as SVG output, such as the lower text. Note the extra header text in the top left corner and the page number in the lower left corner, which were not visible behind the pixel graphics.
Note that everything is converted to SVG objects, including conventional text and image pixels; there is no easier way to extract all the PostScript-style moveto and lineto operations. So yes, it is overkill, as the output needs parsing to get just the object of interest (more easily done in a GUI such as Inkscape, or in InDesign, where it was constructed). It is not a good method for shape recognition, since the x and y values are described as rectangles and will have positions and scale factors that most likely vary from page to page, so there are no constants other than the filled appearance. The filled object would best be "seen" by regenerating it as pixels for visual symbol recognition (much like OCR).
File example: file.
Problem: when extracting text using PDFTextStripper, the tokens "9/1/2017" and "387986" appear after "ASSETS" at the start of the page. They should be removed, along with some other hidden tokens.
I have already applied this solution (I am not copying it here because the problem is exactly the same), and the hidden text still appears on the page. Could it be hidden by something else except clip path?
thanks!
Could it be hidden by something else except clip path?
Yes. In the case of your new document the text is written in white on white; e.g. the 387986 after ASSETS is drawn like this:
1 1 1 rg
/TT0 16 Tf
-1011.938 115.993 Td
(#A,BAC)Tj
The initial 1 1 1 rg sets the fill color to RGB WHITE. (Additionally that text is quite tiny but would still be visible if drawn in e.g. BLACK.)
The solution you refer to was implemented for documents like the sample document presented in that issue in which the invisible text is made invisible by defining clip paths (outside the bounds of which the text is) and by filling paths (hiding the text underneath). Thus, your white text won't be recognized by it as hidden.
Unfortunately, the invisibility of WHITE on WHITE text is more difficult to determine than that of clipped or covered text: one not only needs to know a property of the current graphics state (like the clip path) or remove all text inside a given path, one also needs to know the color of that part of the page right before the text is drawn (to check the "on WHITE" detail).
If, on the other hand, you assume the page background to be essentially WHITE, it is fairly simple to ignore all white text: Simply also detect the current fill color in processTextPosition:
PDColor fillColor = gs.getNonStrokingColor();
and compare it to the flavors of WHITE you want to consider invisible. (Usually it should suffice to compare with RGB, CMYK, and Grayscale WHITE; in rare cases you'll also have to correctly interpret more complex color spaces. Additionally, you might also consider nearly WHITE colors invisible: (.99, .99, .99) RGB can hardly be distinguished from WHITE.)
If you find the current color to be WHITE, ignore the current TextPosition.
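The comparison itself can be sketched in plain Java, independent of PDFBox. This is a minimal sketch under assumptions: `WhiteCheck` and `isNearlyWhite` are hypothetical names, the components are assumed to be the float values of the fill color (PDFBox's `PDColor.getComponents()` returns such an array), and guessing the color space from the number of components is a simplification; a real implementation should look at the actual color space.

```java
/** Hypothetical helper: decides whether color components describe (nearly) white. */
final class WhiteCheck {
    private WhiteCheck() {}

    /**
     * components: 1 value = gray, 3 = RGB, 4 = CMYK, all in the range 0..1.
     * tolerance: how far from pure white still counts as white.
     * Any other component count is treated as "not white" (unknown color space).
     */
    static boolean isNearlyWhite(float[] components, float tolerance) {
        if (components == null) {
            return false;
        }
        switch (components.length) {
            case 1: // gray: 1 is white
            case 3: // RGB: (1, 1, 1) is white
                for (float c : components) {
                    if (c < 1f - tolerance) {
                        return false;
                    }
                }
                return true;
            case 4: // CMYK: (0, 0, 0, 0) is white, i.e. no ink
                for (float c : components) {
                    if (c > tolerance) {
                        return false;
                    }
                }
                return true;
            default:
                return false;
        }
    }
}
```

A text position would then be skipped when `isNearlyWhite(fillColor.getComponents(), 0.02f)` is true, where the tolerance value is only an example.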
Be aware, though, just like the solution you referenced this is not yet the final solution recognizing all WHITE text: For that you'll also have to check the text rendering mode: If it is just filling (the default), the above holds, but if it is (also) stroking, you'll (also) have to consider the stroking color; if it is rendered invisible, there is no color to consider; and if the text rendering mode includes adding to path for clipping, you'll have to wait and determine what will be later drawn in this part of the page as long as the clip path holds, definitely not trivial!
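The case distinction by text rendering mode can be written down as a small table. This is a minimal sketch in plain Java using the mode numbers 0 to 7 from the PDF specification; `TextRenderingModes` and its methods are hypothetical names, and the two boolean flags are assumed to come from a white check on the fill and stroke colors.

```java
/** Hypothetical helper mapping PDF text rendering modes (0..7) to the colors that matter. */
final class TextRenderingModes {
    private TextRenderingModes() {}

    // 0 fill, 1 stroke, 2 fill+stroke, 3 invisible,
    // 4 fill+clip, 5 stroke+clip, 6 fill+stroke+clip, 7 clip only
    static boolean fills(int mode)   { return mode == 0 || mode == 2 || mode == 4 || mode == 6; }
    static boolean strokes(int mode) { return mode == 1 || mode == 2 || mode == 5 || mode == 6; }
    static boolean clips(int mode)   { return mode >= 4 && mode <= 7; }

    /**
     * True if the glyphs themselves paint nothing visible on a white background.
     * For clipping modes this does not settle the question: content drawn later
     * through the clip path may still show the glyph shapes.
     */
    static boolean paintsNothingVisible(int mode, boolean fillIsWhite, boolean strokeIsWhite) {
        if (mode < 0 || mode > 7) {
            throw new IllegalArgumentException("mode must be 0..7: " + mode);
        }
        boolean visibleFill = fills(mode) && !fillIsWhite;
        boolean visibleStroke = strokes(mode) && !strokeIsWhite;
        return !visibleFill && !visibleStroke;
    }
}
```

When `clips(mode)` is true, a positive result only says the glyphs were not painted directly; the later-drawn content still has to be examined, which is the non-trivial part mentioned above.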
Referring to "Build text callout with PDF Clown": is it possible to change the font color of the text within the callout note?
I haven't found a suitable method yet. Can someone please give me a hint?
There is no explicit PDF Clown method to set the text color. This might be related to the fact that there is no explicit entry in the PDF annotation dictionary for it either.
There are two options, though:
There is a default appearance (DA) entry for variable text in annotations in general. As PDF Clown does not hide generic object methods, you can extend the original callout sample like this:
// Callout.
composer.showText("Callout note annotation:", new Point(35, 85));
new StaticNote(
    page,
    new Rectangle(250, 90, 150, 70),
    "Text of the Callout note annotation"
    ).withLine(
        new StaticNote.CalloutLine(
            page,
            new Point(250,125),
            new Point(150,125),
            new Point(100,100)
            )
        )
    .withLineEndStyle(LineEndStyleEnum.OpenArrow)
    .withBorder(new Border(1))
    .withColor(DeviceRGBColor.get(Color.YELLOW))
    .getBaseDataObject().put(PdfName.DA, new PdfString("1 0 1 rg /Ti 12 Tf"));
You have to use plain PDF instructions there, though: rg sets an RGB fill color defined by the three preceding values, and Tf sets the font and size according to the two preceding values. The result of the above is:
As you see, the text now is purple (red 100%, green 0%, blue 100%). A side effect is, though, that the callout line and the frame around the callout box also are purple.
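If the color or font size has to vary, the DA string can be assembled rather than typed by hand. This is a minimal sketch in plain Java, not part of PDF Clown: `DefaultAppearance.build` is a hypothetical helper, and `fontName` is assumed to be a font resource name the viewer can resolve, like the `Ti` used above.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

/** Hypothetical helper that assembles a default appearance (DA) string. */
final class DefaultAppearance {
    private DefaultAppearance() {}

    /** red, green, blue are in the range 0..1; fontSize is in points. */
    static String build(double red, double green, double blue, String fontName, double fontSize) {
        // Locale.ROOT guarantees '.' as the decimal separator, which PDF syntax requires.
        DecimalFormat num = new DecimalFormat("0.###", DecimalFormatSymbols.getInstance(Locale.ROOT));
        return num.format(red) + " " + num.format(green) + " " + num.format(blue)
                + " rg /" + fontName + " " + num.format(fontSize) + " Tf";
    }
}
```

The result of `DefaultAppearance.build(1, 0, 1, "Ti", 12)` is the same string as in the sample, so it could be wrapped in a PdfString and stored under PdfName.DA in the same way.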
Alternatively, a PDF can bring along its own appearance stream defining the whole appearance of the annotation in question. This means, though, that you really have to draw everything yourself, including lines, frames, backgrounds, and text.
PDF Clown allows you to set the appearance of an annotation using the setAppearance and withAppearance methods.
I found out there is a new component in the LibGDX nightly builds: TextArea, which is part of the scene2d.ui package. It's nice to have a component like this, and it is very easy to use, but what I'm missing is support for multi-colored text.
I want to highlight some keywords in a text with a different color, but I don't know how to do it with the current API. There is one method in the BitmapFontCache class:
public void setColors (Color tint, int start, int end)
The Javadoc for this method says the following:
Sets the color of the specified characters. This may only be called after setText(CharSequence, float, float) and is reset every time setText is called.
But I don't know how to use it through a TextArea object, or whether it is even possible to do it that way. Has anyone figured this out? Any hint will be appreciated.
Libgdx offers color markup, which must first be enabled on the BitmapFont with
font.getData().markupEnabled = true;
Text rendered with that font will look for color markup, where colors are surrounded in brackets. Each used color is pushed onto a stack.
Named colors (case sensitive): [RED]red [ORANGE]orange
Hex colors with optional alpha: [#FF0000]red [#FF000033]transparent
A set of empty brackets pops a color off the stack: [BLUE]Blue text[RED]Red text[]Blue text
A double bracket [[ represents an escaped bracket character, however it will not work as expected when followed by a closing bracket.
Named colors are defined in the class com.badlogic.gdx.graphics.Colors, and can be added with Colors.put("NAME", color);.
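Building the marked-up string for keyword highlighting needs no libGDX calls at all. This is a minimal sketch in plain Java: `ColorMarkup` and its methods are hypothetical names, and it assumes the resulting string is shown by something that renders through a markup-enabled BitmapFont. Whether TextArea honors the markup is not confirmed by anything above.

```java
/** Hypothetical helper that wraps keywords in color markup tags. */
final class ColorMarkup {
    private ColorMarkup() {}

    /** Escapes literal '[' as "[[" so it is not read as the start of a color tag. */
    static String escape(String text) {
        return text.replace("[", "[[");
    }

    /** Wraps every occurrence of keyword in [colorName]...[] and escapes the rest. */
    static String highlight(String text, String keyword, String colorName) {
        if (keyword.isEmpty()) {
            return escape(text);
        }
        StringBuilder out = new StringBuilder();
        int from = 0;
        int hit;
        while ((hit = text.indexOf(keyword, from)) >= 0) {
            out.append(escape(text.substring(from, hit)));
            out.append('[').append(colorName).append(']')
               .append(escape(keyword))
               .append("[]"); // pop the color again
            from = hit + keyword.length();
        }
        out.append(escape(text.substring(from)));
        return out.toString();
    }
}
```

For example, `ColorMarkup.highlight("if x then y", "then", "RED")` produces `if x [RED]then[] y`. The escaping caveat mentioned above still applies: an escaped bracket directly followed by a closing bracket may not render as expected.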
Hopefully this isn't super late.
I haven't tried it your way, but I bet you would have to override the setText method and then set the colors for the specific ranges you want. start and end are the indices of the piece of text you want in that particular color.
I have implemented a MulticolorTextArea here: https://github.com/AnEmortalKid/MulticolorTextArea/tree/mta-release
Hopefully this helps out.
I am writing content to a PdfContentByte object directly using PdfContentByte.showTextAligned. I'd like to know how I can stop the text from overflowing a given region when writing.
If possible it would be great if iText could also place an ellipsis character where the text does not fit.
I can't find any method on ColumnText that will help either. I do not wish the content to wrap when writing.
Use this:
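ColumnText ct = new ColumnText(cb);
// Rectangle just high enough for a single line; it marks the region the text must stay within.
ct.setSimpleColumn(rectangle);
// Content to draw; "text" is a placeholder for your own string.
ct.addText(new Phrase(text));
// go() renders what fits and returns a status telling whether anything was left over.
int status = ct.go();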
Make sure that you define rectangle so that only one line fits, then use ColumnText.hasMoreText(status) to find out whether you need to add an ellipsis character.
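To place the ellipsis, a different technique from the status check can be used: measure the text before drawing and shorten it until it fits. This is a minimal sketch in plain Java: `Ellipsizer.fit` is a hypothetical helper, and the width function is an assumption that has to be supplied by the caller. With iText it could be backed by `BaseFont.getWidthPoint(String, float)` for the font and size in use.

```java
import java.util.function.ToDoubleFunction;

/** Hypothetical helper that shortens text to a maximum width and appends an ellipsis. */
final class Ellipsizer {
    private Ellipsizer() {}

    static String fit(String text, double maxWidth, ToDoubleFunction<String> width) {
        if (width.applyAsDouble(text) <= maxWidth) {
            return text; // fits as it is
        }
        String ellipsis = "\u2026";
        // Drop characters from the end until the text plus ellipsis fits.
        for (int end = text.length() - 1; end > 0; end--) {
            if (Character.isHighSurrogate(text.charAt(end - 1))) {
                continue; // never cut a surrogate pair in half
            }
            String candidate = text.substring(0, end) + ellipsis;
            if (width.applyAsDouble(candidate) <= maxWidth) {
                return candidate;
            }
        }
        // Nothing but the ellipsis fits, or not even that.
        return width.applyAsDouble(ellipsis) <= maxWidth ? ellipsis : "";
    }
}
```

The shortened string can then be drawn on a single line as before. The linear search is fine for short labels; for very long strings a binary search over the cut position would do less measuring.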