Google Vision API JSON Response in English only - java

I have spent a lot of time exploring the Google Vision API, and I am trying to get the Vision API response in English only. Below is my request object to the API, which includes language hints:
{
"requests": [
{
"features": [
{
"type": "IMAGE_PROPERTIES"
},
{
"type": "LANDMARK_DETECTION"
},
{
"type": "LABEL_DETECTION"
},
{
"type": "WEB_DETECTION"
},
{
"type": "FACE_DETECTION"
},
{
"type": "SAFE_SEARCH_DETECTION"
},
{
"type": "TEXT_DETECTION"
},
{
"type": "LOGO_DETECTION"
}
],
"image": {
"source": {
"imageUri": "https://images.dreamstream.com/prodds/prddsimg/OM_pasteIt22_12_2017_2_34_7806303.jpeg"
}
},
"imageContext": {
"languageHints": [
"en"
]
}
}
]
}
Even with this request object, the response from the Vision API still contains multiple languages.
If there are any steps to get the response in English only, please let me know. As of now the response contains multiple languages, like below:
{
"url": "https://www.tummyummi.com/food/menu-aryaas-restaurant",
"pageTitle": "Aryaas India Restaurant - مطعم ارياس لبهند - TummYummi Restaurants",
"fullMatchingImages": [
{
"url": "https://www.tummyummi.com/food/upload/1509868727-Curd-Vada.jpg"
}
]
},

If I'm understanding correctly, the Vision API looked at your image and determined that it has seen a similar image at https://www.tummyummi.com/food/menu-aryaas-restaurant.
The title of this website is Aryaas India Restaurant - مطعم ارياس لبهند - TummYummi Restaurants.
It is not a bug that this non-English text is being sent to you, because you asked the API to use WEB_DETECTION.
It found a website that has that image, and gave you a link to it and its title.
From the docs, the ImageContext parameter languageHints allows you to set the expected language for text in the image. It returns an error only when a hint itself is not a supported language, not when text in another language is detected:
Text detection returns an error if one or more of the specified languages is not one of the supported languages.
It's important to note that this language setting only affects text detection.
The hint does not restrict text detection to English elements either. For that case, the same document says the following:
For languages based on the Latin alphabet, setting languageHints is not needed. In rare cases, when the language of the text in the image is known, setting a hint will help get better results (although it will be a significant hindrance if the hint is wrong)
To filter out any text that is not English, look at the TextAnnotation's locale field instead, and drop anything that isn't en on the client side.
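A minimal sketch of that client-side filter, assuming the response has already been parsed into simple objects. The TextEntry class and keepEnglish method are hypothetical names, not part of the Vision client library; TextEntry stands in for one entry of the textAnnotations array. In this sketch entries with no locale are dropped as well, which may or may not be what you want:

```java
import java.util.List;
import java.util.stream.Collectors;

public class LocaleFilter {
    // Hypothetical stand-in for one parsed entry of the textAnnotations array.
    static final class TextEntry {
        final String locale;
        final String description;

        TextEntry(String locale, String description) {
            this.locale = locale;
            this.description = description;
        }
    }

    // Keeps only entries whose locale is exactly "en"; null locales are dropped.
    static List<TextEntry> keepEnglish(List<TextEntry> entries) {
        return entries.stream()
                .filter(e -> "en".equals(e.locale))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<TextEntry> entries = List.of(
                new TextEntry("en", "Curd Vada"),
                new TextEntry("ar", "\u0645\u0637\u0639\u0645"),
                new TextEntry(null, "Menu"));
        for (TextEntry e : keepEnglish(entries)) {
            System.out.println(e.description);
        }
    }
}
```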
As far as detecting the language of the website title returned by WEB_DETECTION is concerned, I think that is out of scope for the Google Vision API, but you could try the language detection feature of the Cloud Translation API.

Thanks for the useful answer @dustinroepsch. Rather than relying on the Cloud Translation API, we can use a regex, because the only feature that returns non-English text is WEB_DETECTION (although this may vary).
In WEB_DETECTION, a few objects such as pagesWithMatchingImages and webEntities may contain non-English text. After parsing the JSON, we can use the following regex character class, which matches the characters we want to keep, to remove non-English text.
String regex = "[a-z,A-Z,0-9,($&+,:;=?@#|'<>.^*()%!-)\\s]";
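As a rough sketch of how such a filter could be applied, here is a variant that swaps in a simpler, negated class: it removes every run of characters outside printable ASCII and then collapses the leftover whitespace. The class and method names are made up for illustration, and note the limitation that accented Latin letters are removed too:

```java
import java.util.regex.Pattern;

public class AsciiTextFilter {
    // Matches runs of characters that are NOT printable ASCII (space through tilde).
    private static final Pattern NON_ASCII = Pattern.compile("[^\\x20-\\x7E]+");

    // Replaces non-ASCII runs with a space, then collapses repeated whitespace.
    static String stripNonAscii(String s) {
        String stripped = NON_ASCII.matcher(s).replaceAll(" ");
        return stripped.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // The Arabic word is written with Unicode escapes to avoid source-encoding issues.
        String title = "Aryaas India Restaurant - \u0645\u0637\u0639\u0645 - TummYummi Restaurants";
        // prints: Aryaas India Restaurant - - TummYummi Restaurants
        System.out.println(stripNonAscii(title));
    }
}
```

Leftover separators such as the doubled dash are kept as-is here; whether to clean those up depends on how the title is used afterwards.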

Related

Storing XACML file in JSON using MongoDB for Authzforce

I would like to implement a PDP engine using the authzforce-ce-core-pdp-engine jar file as mentioned in the README, except that the XML policy files should be dynamic. The main idea is similar to a file sharing system: one user could share multiple files with another user, and each file may have a different policy. I was thinking of storing the policy files in some sort of DB like MySQL or MongoDB, so that the PDP refers to it and decides to grant or deny access based on the request.
I found that the pdp core engine supports MongoDB as mentioned here.
Here is my pdp configuration file:
<?xml version="1.0" encoding="UTF-8"?>
<!-- Testing parameter 'maxPolicySetRefDepth' -->
<pdp xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://authzforce.github.io/core/xmlns/pdp/6.0" xmlns:ext="http://authzforce.github.io/core/xmlns/test/3" version="6.0.0">
<refPolicyProvider id="refPolicyProvider" xsi:type="ext:MongoDBBasedPolicyProvider" serverHost="localhost" serverPort="27017" dbName="testXACML" collectionName="policies" />
<rootPolicyProvider id="rootPolicyProvider" xsi:type="StaticRefBasedRootPolicyProvider">
<policyRef>root-rbac-policyset</policyRef>
</rootPolicyProvider>
</pdp>
So now the question is: how can I store the policy XML files, given that they need to be stored as JSON in MongoDB? I tried to convert XML to JSON using the JSON Maven dependency, but I have a problem converting back to XML. For example, for a policy XML file like this, it creates a JSON file something like this:
{"Policy": {
"xmlns": "urn:oasis:names:tc:xacml:3.0:core:schema:wd-17",
"Target": "",
"Description": "Policy for Conformance Test IIA001.",
"Version": 1,
"xmlns:xsi": "http://www.w3.org/2001/XMLSchema-instance",
"RuleCombiningAlgId": "urn:oasis:names:tc:xacml:3.0:rule-combining-algorithm:deny-overrides",
"Rule": {
"Target": {"AnyOf": [
{"AllOf": {"Match": {
"AttributeValue": {
"DataType": "http://www.w3.org/2001/XMLSchema#string",
"content": "Julius Hibbert"
},
"AttributeDesignator": {
"Category": "urn:oasis:names:tc:xacml:1.0:subject-category:access-subject",
"AttributeId": "urn:oasis:names:tc:xacml:1.0:subject:subject-id",
"MustBePresent": false,
"DataType": "http://www.w3.org/2001/XMLSchema#string"
},
"MatchId": "urn:oasis:names:tc:xacml:1.0:function:string-equal"
}}},
{"AllOf": {"Match": {
"AttributeValue": {
"DataType": "http://www.w3.org/2001/XMLSchema#anyURI",
"content": "http://medico.com/record/patient/BartSimpson"
},
"AttributeDesignator": {
"Category": "urn:oasis:names:tc:xacml:3.0:attribute-category:resource",
"AttributeId": "urn:oasis:names:tc:xacml:1.0:resource:resource-id",
"MustBePresent": false,
"DataType": "http://www.w3.org/2001/XMLSchema#anyURI"
},
"MatchId": "urn:oasis:names:tc:xacml:1.0:function:anyURI-equal"
}}},
{"AllOf": [
{"Match": {
"AttributeValue": {
"DataType": "http://www.w3.org/2001/XMLSchema#string",
"content": "read"
},
"AttributeDesignator": {
"Category": "urn:oasis:names:tc:xacml:3.0:attribute-category:action",
"AttributeId": "urn:oasis:names:tc:xacml:1.0:action:action-id",
"MustBePresent": false,
"DataType": "http://www.w3.org/2001/XMLSchema#string"
},
"MatchId": "urn:oasis:names:tc:xacml:1.0:function:string-equal"
}},
{"Match": {
"AttributeValue": {
"DataType": "http://www.w3.org/2001/XMLSchema#string",
"content": "write"
},
"AttributeDesignator": {
"Category": "urn:oasis:names:tc:xacml:3.0:attribute-category:action",
"AttributeId": "urn:oasis:names:tc:xacml:1.0:action:action-id",
"MustBePresent": false,
"DataType": "http://www.w3.org/2001/XMLSchema#string"
},
"MatchId": "urn:oasis:names:tc:xacml:1.0:function:string-equal"
}}
]}
]},
"Description": "Julius Hibbert can read or write Bart Simpson's medical record.",
"RuleId": "urn:oasis:names:tc:xacml:2.0:conformance-test:IIA1:rule",
"Effect": "Permit"
},
"PolicyId": "urn:oasis:names:tc:xacml:2.0:conformance-test:IIA1:policy"
}}
But when I try to convert it back to XML, it becomes an entirely different XML file. So how can I store the XML file in MongoDB? Also, how do I ensure that the PDP core engine can find the correct policy to evaluate? I saw a mention of the JSON adapter in the README like this, but I am not sure how to implement it.
I answered this question on AuthzForce's GitHub. In a nutshell, David is mostly right about the format (XML content stored as a JSON string). More precisely, for the AuthzForce MongoDB policy provider, you have to store policies as shown by the part of the unit test class's setupBeforeClass method that populates the database with test policies. You'll see that we use the Jongo library (which uses Jackson object mapping behind the curtains) to map PolicyPOJO Java objects to JSON in the MongoDB collection. So from the PolicyPOJO class, you can pretty much guess the storage format of policies in JSON: it is a JSON object with the following fields (key-value pairs):
"id" (string): the Policy(Set) ID
"version" (string): the Policy(Set) version
"type" (string): the Policy(Set) type, i.e. '{urn:oasis:names:tc:xacml:3.0:core:schema:wd-17}Policy' (resp. '{urn:oasis:names:tc:xacml:3.0:core:schema:wd-17}PolicySet') for XACML 3.0 Policy (resp. PolicySet)
"content" (string): the actual Policy(Set)'s XML document as string (plain text)
The xml content is automatically escaped properly by the Java library (Jongo/Jackson) to fit in a JSON string. But if you use another library/language, make sure it is the case as well.
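If you are not using Jongo/Jackson, the escaping has to be done by whatever builds the document. The following is only a sketch of the resulting shape, using hand-written escaping and the four field names listed above; the PolicyDocument class, its methods, and the example values in main are made up for illustration and are not part of AuthzForce:

```java
public class PolicyDocument {
    // Escapes a string so that it can be placed inside a JSON string literal.
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters become \\uXXXX escapes.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds a JSON object with the fields id, version, type and content.
    static String toJson(String id, String version, String type, String xml) {
        return "{\"id\":\"" + jsonEscape(id) + "\","
                + "\"version\":\"" + jsonEscape(version) + "\","
                + "\"type\":\"" + jsonEscape(type) + "\","
                + "\"content\":\"" + jsonEscape(xml) + "\"}";
    }

    public static void main(String[] args) {
        // Example values only; the XML here is a placeholder, not a valid policy.
        String xml = "<Policy PolicyId=\"example-policy\" Version=\"1.0\"/>";
        System.out.println(toJson("example-policy", "1.0",
                "{urn:oasis:names:tc:xacml:3.0:core:schema:wd-17}Policy", xml));
    }
}
```

In practice a JSON library or the MongoDB driver's own document type would handle the escaping; the manual version is here only to make the stored format visible.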
There currently isn't a JSON format for XACML policies. That's currently under consideration by the OASIS XACML Technical Committee. Bernard Butler at Waterford Institute of Technology did do some initial translation which might be of value to you.
The only other option I could think of for the time being is to create a JSON wrapper around the policies e.g.
{
"policy":"the xml policy contents escaped as valid json value or in base64"
}
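A small sketch of the base64 variant of that wrapper, using only the JDK. The PolicyWrapper class and its method names are hypothetical, and this wrapper is the generic workaround described here, not the format the AuthzForce MongoDB provider expects:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class PolicyWrapper {
    // Encodes the policy XML as base64 so it needs no JSON escaping.
    static String encode(String policyXml) {
        return Base64.getEncoder().encodeToString(policyXml.getBytes(StandardCharsets.UTF_8));
    }

    // Reverses encode(): base64 text back to the original XML string.
    static String decode(String base64) {
        return new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
    }

    // Wraps the encoded policy in the {"policy": "..."} object shown above.
    static String wrap(String policyXml) {
        return "{\"policy\":\"" + encode(policyXml) + "\"}";
    }

    public static void main(String[] args) {
        String xml = "<Policy PolicyId=\"example-policy\"/>";
        System.out.println(wrap(xml));
        // The round trip returns the original XML unchanged.
        System.out.println(decode(encode(xml)).equals(xml));
    }
}
```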

Google Sheets: Column width

I am creating spreadsheets via the Java API, and there seems to be no method for setting the column width. According to this document: https://developers.google.com/sheets/samples/rowcolumn - there seems to be a way via JSON:
{
"requests": [
{
"updateDimensionProperties": {
"range": {
"sheetId": sheetId,
"dimension": "COLUMNS",
"startIndex": 0,
"endIndex": 1
},
"properties": {
"pixelSize": 160
},
"fields": "pixelSize"
}
}
]
}
Is there a way to set these via SheetProperties or GridProperties?
I don't think there is a way to set it using those properties, so the request shown in the docs is the only available option as of now.
I checked the SpreadsheetProperties reference and GridProperties as well, and neither mentions what you're asking for.
If you plan to call
POST https://sheets.googleapis.com/v4/spreadsheets/spreadsheetId:batchUpdate
from Java, you can always send it as a plain HTTP request.
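A rough sketch of building that request with the JDK's own HTTP client (Java 11 or later). The ColumnWidthRequest class, its method names, and the placeholder spreadsheet id and access token are assumptions for illustration; obtaining a valid OAuth token is not covered here:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ColumnWidthRequest {
    // Builds the batchUpdate body from the question for a range of columns.
    static String body(int sheetId, int startIndex, int endIndex, int pixelSize) {
        return String.format(
                "{\"requests\":[{\"updateDimensionProperties\":{"
                        + "\"range\":{\"sheetId\":%d,\"dimension\":\"COLUMNS\","
                        + "\"startIndex\":%d,\"endIndex\":%d},"
                        + "\"properties\":{\"pixelSize\":%d},"
                        + "\"fields\":\"pixelSize\"}}]}",
                sheetId, startIndex, endIndex, pixelSize);
    }

    // Builds (but does not send) the POST request to the batchUpdate endpoint.
    static HttpRequest build(String spreadsheetId, String accessToken, String json) {
        return HttpRequest.newBuilder()
                .uri(URI.create("https://sheets.googleapis.com/v4/spreadsheets/"
                        + spreadsheetId + ":batchUpdate"))
                .header("Authorization", "Bearer " + accessToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        // "my-spreadsheet-id" and "my-access-token" are placeholders.
        HttpRequest request = build("my-spreadsheet-id", "my-access-token", body(0, 0, 1, 160));
        System.out.println(request.method() + " " + request.uri());
        // To send it: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```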

Decipher JSON response googles topic api

I am using Google's search API to get a topic id, which is then used to get a JSON response from the topic API. The returned response looks like this:
{
"id":"/m/01d5g",
"property":{
"/amusement_parks/ride_theme/rides":{...},
"/award/ranked_item/appears_in_ranked_lists":{...},
"/book/book_character/appears_in_book":{
"valuetype":"object",
"values":[
{
"text":"Inferno",
"lang":"en",
"id":"/m/0g5qs3",
"creator":"/user/duck1123",
"timestamp":"2010-02-11T04:00:59.000Z"
},
{
"text":"Batman: Year One",
"lang":"en",
"id":"/m/0hzz_1h",
"creator":"/user/anasay",
"timestamp":"2012-01-25T11:05:03.000Z"
},
{
"text":"Batman: The Dark Knight Returns",
"lang":"en",
"id":"/m/0hzz_sb",
"creator":"/user/anasay",
"timestamp":"2012-01-25T11:22:17.001Z"
},
{
"text":"Batman: Son of the Demon",
"lang":"en",
"id":"/m/071l77",
"creator":"/user/wikimapper",
"timestamp":"2013-07-11T15:20:32.000Z"
},
{
"text":"Joker",
"lang":"en",
"id":"/m/04zxvhs",
"creator":"/user/wikimapper",
"timestamp":"2013-07-11T16:58:37.000Z"
},
{
"text":"Arkham Asylum: A Serious House on Serious Earth",
"lang":"en",
"id":"/m/0b7hyw",
"creator":"/user/wikimapper",
"timestamp":"2013-07-11T19:26:54.000Z"
}
],
"count":6.0
},
"/book/book_subject/works":{...},
"/comic_books/comic_book_character/cover_appearances":{...},
...
}
}
I want to decipher this so that I can get the relevant information. For example, "/book/book_character/appears_in_book" is itself a property of the response, and the only values I need from it are "text" and "id", e.g. "text":"Batman: Year One" and "id":"/m/0hzz_1h".
The response does not have fixed properties; they vary according to the response id. How can I convert this JSON response into a Java class where "/book/book_character/appears_in_book" is stored as one serialized class containing a collection of values such as id and text, with appears_in_book as the name variable for the class?
I considered Gson for this, but since the property names are not constant, I cannot use it to convert the JSON to a Java object. Currently I am iterating over each property by hard-coding it and filling Java variables.
If someone can suggest an efficient way to do this, I would appreciate the help.
You could do this dynamically using reflection in Java but this is an advanced feature of Java and it may make your code more complicated than it needs to be.
See: Dynamically create an object in java from a class name and set class fields by using a List with data
A simpler alternative would be to just parse the JSON into a bunch of nested Maps and Lists exactly as they're given in the JSON data.
See: How to parse JSON in Java
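A sketch of the nested Maps and Lists approach, assuming the JSON has already been parsed into a Map<String, Object> (a library such as Gson or Jackson can produce that shape). The TopicProperties class and textAndId method are made-up names, and the map in main is built by hand so that the example runs without any JSON library:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class TopicProperties {
    // Returns {text, id} pairs for every value under the given property key.
    // Missing keys yield an empty list instead of an exception.
    @SuppressWarnings("unchecked")
    static List<String[]> textAndId(Map<String, Object> topic, String propertyKey) {
        List<String[]> result = new ArrayList<>();
        Map<String, Object> properties = (Map<String, Object>) topic.get("property");
        if (properties == null) return result;
        Map<String, Object> property = (Map<String, Object>) properties.get(propertyKey);
        if (property == null) return result;
        List<Map<String, Object>> values = (List<Map<String, Object>>) property.get("values");
        if (values == null) return result;
        for (Map<String, Object> value : values) {
            result.add(new String[] { (String) value.get("text"), (String) value.get("id") });
        }
        return result;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the parsed response from the question.
        Map<String, Object> topic = Map.of(
                "id", "/m/01d5g",
                "property", Map.of(
                        "/book/book_character/appears_in_book", Map.of(
                                "valuetype", "object",
                                "values", List.of(
                                        Map.of("text", "Inferno", "id", "/m/0g5qs3"),
                                        Map.of("text", "Batman: Year One", "id", "/m/0hzz_1h")))));
        for (String[] pair : textAndId(topic, "/book/book_character/appears_in_book")) {
            System.out.println(pair[0] + " -> " + pair[1]);
        }
    }
}
```

Because the property name is passed in as a plain string key, nothing has to be hard-coded per property; iterating over the keys of the "property" map would cover all of them.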

RestKit in iOS - How to send in iOS & receive in java?

I'm new to RESTful web services in iOS and in Java. I have read a lot, but I haven't found a definitive answer on how to SEND & RECEIVE a POST request in iOS and handle it in Java.
My situation is this:
I want to create a user on the server side. On the client side I have 3 objects that hold the information for that (User, Usersdevice & Usersmembership). I have read a lot about object mapping, but I can't relate it to a practical example.
The second question is how to handle that POST on the server side with Java (Jersey) as RE.
I know these are two questions, but I really need to know both.
I strongly recommend that you post a snippet of your code and clarify which RestKit version you use; it seems a bit outdated.
Also read the newest Object Mapping guide in order to leverage RestKit, and decide whether or not you want KVC mapping. It can be tricky!
https://github.com/RestKit/RestKit/wiki/Object-mapping
First of all you have to determine your base URL:
RKObjectManager* manager = [RKObjectManager managerWithBaseURL:[NSURL URLWithString:@"http://www.yourURL.com"]];
A method to download articles using RK object mapping:
- (void)loadArticles {
    // Map JSON keys (left) to Article property names (right).
    RKObjectMapping *articleMapping = [RKObjectMapping mappingForClass:[Article class]];
    [articleMapping addAttributeMappingsFromDictionary:@{
        @"title": @"title",
        @"body": @"body",
        @"author": @"author",
        @"publication_date": @"publicationDate"
    }];
    RKResponseDescriptor *responseDescriptor = [RKResponseDescriptor responseDescriptorWithMapping:articleMapping pathPattern:nil keyPath:@"articles" statusCodes:RKStatusCodeIndexSetForClass(RKStatusCodeClassSuccessful)];
    // Register the descriptor so the shared manager can map the response.
    [RKObjectManager.sharedManager addResponseDescriptor:responseDescriptor];
    NSString *stringURL = @"/articles/";
    // getObjectsAtPath: enqueues the request itself, so no explicit start call is needed.
    [RKObjectManager.sharedManager getObjectsAtPath:stringURL parameters:nil success:^(RKObjectRequestOperation *operation, RKMappingResult *mappingResult) {
        RKLogInfo(@"Load collection of Articles: %@", mappingResult.array);
    } failure:^(RKObjectRequestOperation *operation, NSError *error) {
        RKLogError(@"Operation failed with error: %@", error);
    }];
}
In order to map this JSON
{ "articles": [
{ "title": "RestKit Object Mapping Intro",
"body": "This article details how to use RestKit object mapping...",
"author": "Blake Watters",
"publication_date": "7/4/2011"
},
{ "title": "RestKit 1.0 Released",
"body": "RestKit 1.0 has been released to much fanfare across the galaxy...",
"author": "Blake Watters",
"publication_date": "9/4/2011"
}]
}

Create MongoDB river in Elasticsearch using Java API

I am trying to create a new river between MongoDB and Elasticsearch using the Java API. With the REST API it is pretty easy: make a PUT request with the following JSON
{
"type": "mongodb",
"mongodb": {
"servers": [
{ "host": "127.0.0.1", "port": 27017 }
],
"options": { "secondary_read_preference": true },
"db": "test",
"collection": "collectionTest"
},
"index": {
"name": "testIndex",
"type": "default"
}
}
But I am having several problems with the Java API. I am trying to use the CreateIndexRequestBuilder class but I don't know how to specify the params.
Are they custom params? What about source? I'm pretty lost...
Thank you in advance!
You need to add a document with id _meta to the _river index. The type is the name that you want to give to your river. The document to send is a JSON document containing the configuration needed for your river. Beyond the custom configuration that depends on the river, the JSON document needs to contain the property type, which holds the name used within the river itself to register the RiverModule. For the MongoDB river it's mongodb. The JSON that you posted is exactly the source that you have to send.
Here is the code that you need:
client.index(Requests.indexRequest("_river").type("my_river").id("_meta").source(source)).actionGet();
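The source variable in that call is not defined in the snippet. One way to build it, sketched here as a plain JSON string with the values from the question; the RiverSource class and mongoRiverSource method are made-up names, and the sketch assumes the arguments contain no characters that need JSON escaping:

```java
public class RiverSource {
    // Builds the _meta document for a MongoDB river as a JSON string.
    // Host, port and the option flag are copied from the question's example.
    static String mongoRiverSource(String db, String collection, String indexName) {
        return "{"
                + "\"type\":\"mongodb\","
                + "\"mongodb\":{"
                + "\"servers\":[{\"host\":\"127.0.0.1\",\"port\":27017}],"
                + "\"options\":{\"secondary_read_preference\":true},"
                + "\"db\":\"" + db + "\","
                + "\"collection\":\"" + collection + "\""
                + "},"
                + "\"index\":{\"name\":\"" + indexName + "\",\"type\":\"default\"}"
                + "}";
    }

    public static void main(String[] args) {
        String source = mongoRiverSource("test", "collectionTest", "testIndex");
        // This string is what would be passed as source to the index request.
        System.out.println(source);
    }
}
```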
