OCR with Asprise library in Java

I am making an Android app that captures a photo and extracts the text from it using OCR. This is my code using the Asprise library, but something is wrong with the call to the "recognize" method:
Ocr.setUp();
Ocr ocr = new Ocr();
ocr.startEngine("eng", Ocr.SPEED_FASTEST);
String s = ocr.recognize(theImage, Ocr.RECOGNIZE_TYPE_ALL, Ocr.OUTPUT_FORMAT_PLAINTEXT);
ocr.stopEngine();
"theImage" is a Bitmap, but the method expects a "RenderedImage" there (I thought a Bitmap would count as a rendered image too). Also, the fourth parameter of the "recognize" method is "Object... propSpec", but the sample on the official Asprise site passes only 3 arguments. The arguments in the "recognize" line are now underlined in red. What should I change so that my code works properly?
P.S. Of course, I've heard about the tess-two library, but it's a bit complicated for me to add it in Android Studio (I don't know why it can't be added with a single line in build.gradle).

I've implemented the same thing you want to do with the following code, and it works as I wanted. Other issues may come from the file readers on your PC: for example, if you want to OCR a PDF file, a PDF reader should be installed.
Ocr.setUp();
Ocr ocr = new Ocr();
ocr.startEngine("eng", Ocr.SPEED_FASTEST);
String s = ocr.recognize(new File[] {new File(path)},
Ocr.RECOGNIZE_TYPE_ALL, Ocr.OUTPUT_FORMAT_PLAINTEXT);
System.out.println("Result: \n" + s);
ocr.stopEngine();
System.out.println("---END---");

Related

Ocr could not convert some characters properly in java

I am trying to get text from elements that are inside a canvas. I capture the canvas portion as an image and then extract the text from that image using an OCR library. The problem is that some characters are not converted to the exact text.
I am using Selenium webdriver, Java and Maven.
Code to extract text from image:
Ocr.setUp();
Ocr ocr = new Ocr(); // create a new OCR engine
ocr.startEngine("eng", Ocr.SPEED_FASTEST); // English
textFromImage = ocr.recognize(new File[]{new File("E:\\Device.png")},
Ocr.RECOGNIZE_TYPE_TEXT, Ocr.OUTPUT_FORMAT_PLAINTEXT);
System.out.println(textFromImage);
ocr.stopEngine();
Currently I am trying to read the text in the Google search box from the image below:
I expect the output: What is swift in iOS
But OCR returns: Whatls swlflln IOS, as shown in the following screenshot:
Somehow the OCR could not convert some characters to match what is in the image. Is there any other solution for this?
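One thing that sometimes helps with small on-screen text is enlarging the screenshot before handing it to the OCR engine. The sketch below is only an assumption about what may improve results here, not something the Asprise documentation is known to prescribe. The class name OcrPreprocess and the helper upscaleToGray are made up for illustration, and only standard java.awt classes are used.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.image.BufferedImage;

class OcrPreprocess {

    /**
     * Returns a grayscale copy of src enlarged by an integer factor.
     * Hypothetical helper: the name and the choice of bicubic scaling
     * are illustrative assumptions, not part of any OCR library.
     */
    static BufferedImage upscaleToGray(BufferedImage src, int factor) {
        int w = src.getWidth() * factor;
        int h = src.getHeight() * factor;
        // TYPE_BYTE_GRAY drops colour, leaving one 8-bit channel
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        Graphics2D g = out.createGraphics();
        try {
            // Bicubic interpolation keeps glyph edges smoother than nearest-neighbour
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                    RenderingHints.VALUE_INTERPOLATION_BICUBIC);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose();
        }
        return out;
    }
}
```

The enlarged image could then be saved with ImageIO.write(scaled, "png", someFile) and that file passed to ocr.recognize exactly as in the code above. Whether a factor of 2 or 3 actually fixes the "l" versus "i" confusion depends on the image and has to be tried.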

How to write XMP metadata to a PSD? (JAVA)

I have a Photoshop file in which I want to change 2 text values with a Java program. Opening the PSD with a text editor, I can find the text that I want to change: LayerText Eighty, LayerText Nine
I hid some content with blue for privacy reasons. When I open the file in the exiftool GUI, the values appear under TextLayerText, so I assumed that is where they live. In Photoshop they are text layers. I did some research and heard about Sanselan in Apache Commons. With it I can read the same data that I found in my text editor.
File imageFile = new File(fileField.getText());
File outputFile = new File(fileField.getText().split("\\.")[0] + ".png");
BufferedImage image = Sanselan.getBufferedImage(imageFile);
logArea.append("--- XMP Metadata ----\n");
logArea.append(Sanselan.getXmpXml(imageFile));
Map params = new HashMap();
params.put("TextLayerText", "");
Sanselan.writeImage(image, outputFile, ImageFormat.IMAGE_FORMAT_PNG, params);
This is the code I currently have. It declares 2 files: the first is the input and the second is the output. It gets the XMP and prints it out. I create a params Map, but I get this error:
org.apache.sanselan.ImageWriteException: Unknown parameter: TextLayerText
The goal of this program is to modify the 2 text layers and render the PNG from the result. It renders the PNG file if I leave the params empty, and I can read the metadata with Sanselan.getXmpXml. I am struggling to find a way to change it, though. I put all pictures in one because with my reputation I can't post more than 2 links.

How do I generate a PDF417 Barcode with Java4Less?

For our standard PDF & Barcode generation, we have the Java4Less library (java4less-1.0rel.jar) so that our customers can print tickets sold to/by them. We use this library to create CODE128(C), Aztec, QR barcodes and so on.
Right now we're looking into PDF417 barcodes. The library claims to support generating them, but something isn't going right. Have a look at the following code from a small NetBeans project:
BarCode bc= new BarCode();
bc.setSize(400 , 200);
bc.barType = BarCode.PDF417;
bc.resolution=1;
bc.leftMarginCM= 50;
bc.topMarginCM= 50;
bc.checkCharacter =true;
bc.code = "THISISJUSTATESTTEXT";
bc.barColor = Color.black;
bc.backColor= Color.red;
bc.fontColor = Color.blue;
bc.textFont = new Font("Arial",Font.BOLD,14);
bc.X = 1;
bc.N = 3;
bc.paint(region);
ImageIO.write(img, "PNG", new File("barcode.png"));
This piece of code generates a .png image with the requested barcode-type. All barcodes are generated, except for the PDF417.
Here's an image that shows a CODE128 and a PDF417 generation:
As you can see, the CODE128 generates its barcode, but the PDF417 doesn't. The only thing changed in the code is the following:
bc.barType = BarCode.CODE128; --> bc.barType = BarCode.PDF417;
I've looked up the documentation, examples; I even downloaded the demo from the official Java4Less website, and in a war/Java project, it generates a PDF417 normally.
So what is going wrong here? Is it a bug in the library that anyone knows of, or am I missing a step?
It would seem that our current library, despite claiming to support PDF417 creation, was outdated. When using the library from the demo, I managed to successfully create a PDF417 barcode with the previously mentioned code.

QR code decoding without using camera in java

I am developing a Java application for reading (decoding) QR codes without using the laptop's camera. I am using the ZXing JAR to generate the QR code.
I am doing some manipulation on that QR code. Now I want to check whether the QR code is still fine, without using a camera.
Is there any way it can be done?
ZXing has a JavaSE module which provides the crucial BufferedImageLuminanceSource for decoding a regular Java BufferedImage.
The bare minimum, extracted from ZXing's JavaSE DecodeThread:
BufferedImage image = ...
LuminanceSource source = new BufferedImageLuminanceSource(image);
BinaryBitmap bitmap = new BinaryBitmap(new HybridBinarizer(source));
Result result = new MultiFormatReader().decode(bitmap);
If decode() doesn't throw an exception, ZXing was able to decode the barcode (and you can check the contents of the bar code).
http://zxing.org/w/docs/javadoc/com/google/zxing/Reader.html#decode(com.google.zxing.BinaryBitmap, java.util.Map)
You can configure the MultiFormatReader, e.g. to only parse QR codes, by using the decode(BinaryBitmap, Map<DecodeHintType,?> hints) overload, allowing you to specify any number of decoding hints. Alternatively, if you really only want QR codes, use a QRCodeReader instead of MultiFormatReader.

How to get favicon.ico from a website using Java?

So I'm making an application to store shortcuts to all the user's favorite applications, acting kind of like a hub. I can have support for actual files and I have a .lnk parser for shortcuts. I thought it would be pretty good for the application to support Internet shortcuts, too. This is what I'm doing:
Suppose I'm trying to get Google's icon (http://www.google.com/favicon.ico).
I start out by getting rid of the extra pages (e.g. www.google.com/anotherpage would become www.google.com).
Then, I use ImageIO.read(java.net.URL) to get the Image.
The problem is that ImageIO never returns an Image when I call this method:
String trimmed = getBaseURL(page); //This removes the extra pages
Image icon = null;
try {
    String fullURLString = trimmed + "/favicon.ico";
    URL faviconURL = new URL(fullURLString);
    icon = ImageIO.read(faviconURL);
} catch (IOException e) {
    e.printStackTrace();
}
return icon;
Now I have two questions:
Does Java support the ICO format even though it is from Microsoft?
Why does ImageIO fail to read from the URL?
Thank you in advance!
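The trimming step described in the question (getBaseURL followed by appending "/favicon.ico") might look roughly like the sketch below. getBaseURL itself is not shown in the post, so FaviconUrls.faviconUrlFor is a hypothetical stand-in, and treating a missing scheme as http is an assumption.

```java
import java.net.URI;
import java.net.URISyntaxException;

class FaviconUrls {

    /**
     * Hypothetical stand-in for getBaseURL(page) + "/favicon.ico".
     * Keeps the scheme and authority (host and optional port) and
     * replaces path, query and fragment with /favicon.ico.
     */
    static String faviconUrlFor(String page) {
        // Assumption: an address typed without a scheme is meant as http
        String withScheme = page.contains("://") ? page : "http://" + page;
        try {
            URI uri = new URI(withScheme);
            URI favicon = new URI(uri.getScheme(), uri.getAuthority(),
                    "/favicon.ico", null, null);
            return favicon.toString();
        } catch (URISyntaxException e) {
            // Rethrow unchecked so callers need not declare it
            throw new IllegalArgumentException("Not a valid page address: " + page, e);
        }
    }
}
```

For example, faviconUrlFor("www.google.com/anotherpage") yields http://www.google.com/favicon.ico. Parsing with java.net.URI rather than cutting the string at the first slash avoids mistakes with ports and query strings.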
Try Image4J.
As this quick Scala REPL session shows (paste-able as Java code):
> net.sf.image4j.codec.ico.ICODecoder.read(new java.net.URL("http://www.google.com/favicon.ico").openStream())
res1: java.util.List[java.awt.image.BufferedImage] = [BufferedImage#65712a80: type = 2 DirectColorModel: rmask=ff0000 gmask=ff00 bmask=ff amask=ff000000 IntegerInterleavedRaster: width = 16 height = 16 #Bands = 4 xOff = 0 yOff = 0 dataOffset[0] 0]
UPDATE
To answer your questions: Does Java support ICO? Doesn't seem like it:
> javax.imageio.ImageIO.read(new java.net.URL("http://www.google.com/favicon.ico"))
java.lang.IllegalArgumentException: Empty region!
Why does ImageIO fail to read from the URL? Well, the URL itself seems to work for me, so you may have a proxy/firewall issue, or it could be the problem above.
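To see which formats ImageIO can read on a particular JVM, the registered reader names can be listed. A minimal sketch, with the class name ImageIoFormats chosen only for illustration:

```java
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;
import javax.imageio.ImageIO;

class ImageIoFormats {

    /** Lower-cased, sorted names of every format with a registered ImageIO reader. */
    static Set<String> readableFormats() {
        Set<String> names = new TreeSet<>();
        for (String name : ImageIO.getReaderFormatNames()) {
            names.add(name.toLowerCase(Locale.ROOT));
        }
        return names;
    }

    /** True if some registered reader claims the given format name, e.g. "png" or "ico". */
    static boolean canRead(String format) {
        return readableFormats().contains(format.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println("Readable formats: " + readableFormats());
        System.out.println("ico supported: " + canRead("ico"));
    }
}
```

On a stock JDK the list typically contains png, jpg, gif and bmp but no ico. ImageIO.read is documented to return null when no registered reader accepts the stream, which is consistent with the question's "never returns an Image".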
Old post, but for future reference:
I've written a plugin for ImageIO that adds support for .ICO (MS Windows Icon) and .CUR (MS Windows Cursor) formats.
You can get it from GitHub here: https://github.com/haraldk/TwelveMonkeys/
After you have installed the plugin, you should be able to read the icon, using the code in the original post without any modifications.
You don't need ImageIO for this. Just copy the bytes, same as for any other static resource.
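Copying the bytes could look like the sketch below. FaviconDownload.save is a hypothetical helper name. It relies only on java.nio.file.Files.copy, and the caller decides where the stream comes from.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

class FaviconDownload {

    /**
     * Writes every byte from in to target, replacing an existing file,
     * and returns the number of bytes copied. No image decoding happens.
     */
    static long save(InputStream in, Path target) {
        try {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // Rethrow unchecked to keep the call site short in this sketch
            throw new UncheckedIOException(e);
        }
    }
}
```

For the question's case the stream would be new URL("http://www.google.com/favicon.ico").openStream(), opened in a try-with-resources block. This stores the icon file as-is. Turning it into a java.awt.Image still needs a decoder that understands ICO.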
There is Apache Commons Imaging for reading ico files and others: https://commons.apache.org/proper/commons-imaging/index.html
Reading an ico file works like this:
List<BufferedImage> images = org.apache.commons.imaging.Imaging.getAllBufferedImages(yourIcoFile);
In your case you have to download it first, I guess.
