We have Apache NiFi set up to write files to a local drive, then run a program that processes these files and writes its response to the flowfile's "response" attribute. This is a JSON string that we then deliver to an API to update records.
However, while we can successfully write, read and process the files, NiFi fails to handle non-English characters in the response text. This leads to names being corrupted when we send the response back. It only affects the JSON string we receive from the program.
NiFi is running in a Windows 10 environment. When we run the program manually on the files NiFi has written, we get correct output; the issue only happens inside NiFi.
To give an example, the input JSON is:
{
"player" : "mörkö",
"target" : "goal",
"didhitin" : ""
}
This is stored in our program's work folder, and we call the program using ExecuteStreamCommand, passing our input JSON file as the parameter. The JSON is processed and the program outputs the following JSON, which is then stored in the "response" attribute of the flowfile:
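For context, a minimal ExecuteStreamCommand configuration for this kind of flow might look like the sketch below. The executable name and paths are hypothetical; absolute.path and filename are the standard attributes set by GetFile:

```
Command Path:                  C:\tools\ProcessJson.exe
Command Arguments:             ${absolute.path}/${filename}
Working Directory:             C:\tools\work
Output Destination Attribute:  response
```

With Output Destination Attribute set, the command's stdout is stored in the named attribute instead of replacing the flowfile content, which is why the encoding of the program's stdout matters here.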
{
"player" : "mörkö",
"target" : "goal",
"didhitin" : "true"
}
However, when this is read by NiFi into the "response" attribute of the flowfile, it becomes:
{
"player" : "m¤rk¤",
"target" : "goal",
"didhitin" : "true"
}
(Not the actual process, but close enough to demonstrate the issue)
When we feed that into the API, it either fails or corrupts the original name (in this case, the value of player); neither of which is a desirable outcome.
So far we have figured out that this is most likely an encoding issue, but we have not found a way to change NiFi's encoding to fix the incorrectly read characters.
We managed to fix this issue by adding the following line to the start of the program:
Console.OutputEncoding = Encoding.UTF8;
This forces the program to output UTF-8, in line with the rest of the flow.
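As a side note for anyone debugging a similar problem: the corruption pattern tells you which side has the wrong charset. Below is a minimal, standalone Java sketch of the general failure mode. The exact replacement characters depend on which code pages are involved, so this illustrates the class of bug rather than the exact ¤ output above:

```java
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {
    public static void main(String[] args) {
        String original = "mörkö";

        // The program writes UTF-8, but the reader decodes the bytes
        // with a single-byte charset such as ISO-8859-1:
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        String misread = new String(utf8Bytes, StandardCharsets.ISO_8859_1);

        // Each two-byte UTF-8 sequence for 'ö' becomes two wrong chars.
        System.out.println(misread); // mÃ¶rkÃ¶

        // Decoding with the matching charset preserves the text.
        String ok = new String(utf8Bytes, StandardCharsets.UTF_8);
        assert ok.equals(original);
    }
}
```

Once both sides agree on UTF-8 (as the Console.OutputEncoding fix enforces), the round trip is lossless.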
Related
I am reading an application.yml file in my Java Spring application and fetching a property called body to send in a request (it is a very, very long JSON). Sometimes it contains names or values like the one in the example below, which breaks the YAML. Is there any way to solve this so that it accepts the JSON properly? Here is a small example of the kind of data that breaks my application.yml (it comes inside that very big JSON):
data:
body: '{"name":"O'brien"}'
The problem is the ' in the person's name.
I tried using <%='putting the very big json here'%> but then I get "Nested mappings are not allowed in compact mappings"; I also tried <%=very big json%> and get the same error.
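In YAML, a single-quoted scalar escapes a literal ' by doubling it; alternatively, a block scalar needs no quote escaping at all. A minimal sketch of both options, reusing the keys from the example above (body2 is just an illustrative second key name):

```yaml
data:
  # Option 1: double the apostrophe inside a single-quoted scalar
  body: '{"name":"O''brien"}'

  # Option 2: block scalar - neither ' nor " needs escaping
  body2: |-
    {"name":"O'brien"}
```

Both forms parse to the same string, {"name":"O'brien"}, so Spring will see the JSON with the apostrophe intact. The block-scalar form is usually easier to maintain for a very large JSON value.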
I have a JSON array file with the content below:
[ {
"MemberId" : "1234",
"Date" : "2017-07-03",
"Interactions" : [ {
"Number" : "1327",
"DwellTime" : "00:03:05"
} ]
}, {
"MemberId" : "5678",
"Date" : "2017-07-03",
"Interactions" : [ {
"Number" : "1172",
"DwellTime" : "00:01:26"
} ]
} ]
I want to create a PCollection of Java objects, one mapped from each JSON object in the array.
JSON formatted like this (records spread over multiple lines instead of one per line) is hard for a data processing tool like Beam/Dataflow to process in parallel - from a random point in the file, you cannot be sure where the next record begins. You can do it by reading from the beginning of the file, but then you're not really reading in parallel.
If possible, reformat it so that there is one record per line; then you can use something like TextIO to read the file.
If not, you'll need to read the file in one go.
I would suggest a couple possible approaches:
Write a ParDo that reads the file using the GCS API
This is pretty straightforward. You'll do all the reading in one ParDo, and you'll need to implement the connection code inside that ParDo. Within it you would write the same code as if you were reading the file in a normal Java program, and the ParDo will emit each Java object as a record.
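As a sketch of the record-splitting logic such a ParDo would need (standard-library Java only; the GCS client calls are omitted, and splitTopLevel is a name I'm introducing for illustration), read the whole array and then emit one compact record per top-level object:

```java
import java.util.ArrayList;
import java.util.List;

public class JsonArraySplitter {

    // Splits a JSON array into its top-level objects by tracking brace
    // depth; string literals are skipped so braces inside values are safe.
    static List<String> splitTopLevel(String json) {
        List<String> records = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        boolean inString = false;
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (inString) {
                current.append(c);
                if (c == '\\' && i + 1 < json.length()) {
                    current.append(json.charAt(++i)); // keep escaped char
                } else if (c == '"') {
                    inString = false;
                }
                continue;
            }
            if (c == '"') { inString = true; current.append(c); continue; }
            if (c == '{') depth++;
            if (depth > 0) current.append(c);
            if (c == '}') {
                depth--;
                if (depth == 0) {
                    records.add(current.toString());
                    current.setLength(0);
                }
            }
        }
        return records;
    }

    public static void main(String[] args) {
        String array = "[ {\"MemberId\":\"1234\"}, {\"MemberId\":\"5678\"} ]";
        for (String record : splitTopLevel(array)) {
            System.out.println(record); // one record per line, TextIO-friendly
        }
    }
}
```

Each emitted string can then be deserialized to your Java object with whatever JSON library you already use (Jackson, Gson, etc.) before the ParDo outputs it.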
Implement a FileBasedSource
File-based sources will work - when the fileOrPatternSpec is "gs://..." it knows how to read from GCS. You'll need to make sure to set fileMetadata.isReadSeekEfficient to false so that it won't try to split the file. I haven't tried it, but I believe the correct way to do that is inside the single-file constructor of your FileBasedSource subclass (i.e., your class's override of FileBasedSource(MetaData, long, long)).
TextSource/XmlSource (and their accompanying wrappers TextIO/XmlIO) are examples of this, except that they try to implement splitting - yours will be much simpler since it won't.
I am trying to follow the Autodesk Forge viewer tutorial:
https://developer.autodesk.com/en/docs/model-derivative/v2/tutorials/prepare-file-for-viewer/
I have successfully uploaded and downloaded a DWG file.
On the step where I convert it to SVF, it never seems to process; it fails with:
{"input":{"urn":"Safe Base64 encoded value of the output of the upload result"},"output":{"formats":[{"type":"svf","views":["2d","3d"]}]}}
HTTP/1.1 400 Bad Request
Result{"diagnostic":"Failed to trigger translation for this file."}
First question: do I need to remove the urn: prefix before Base64 encoding?
Second: is there any more verbose error output that I can see?
Note I have also tried with an RVT file, and tried with "type":"thumbnail"; nothing seems to work.
I feel my encoded URN is incorrect, but I am not sure why it would be.
On the tutorial page they seem to have a much longer, raw URN; I'm not sure if I should be appending something else to it before encoding. They have a version and some other number.
from tutorial
raw
"urn:adsk.a360betadev:fs.file:business.lmvtest.DS5a730QTbf1122d0751814909a776d191611?version=12"
mine
raw
"urn:adsk.objects:os.object:gregbimbucket/XXX"
EDIT:
This is what I get back from the upload of a DWG file:
HTTP/1.1 200 OK
Result{
"bucketKey" : "gregbimbucket",
"objectId" : "urn:adsk.objects:os.object:gregbimbucket/XXX",
"objectKey" : "XXX",
"sha1" : "xxxx",
"size" : 57544,
"contentType" : "application/octet-stream",
"location" : "https://developer.api.autodesk.com/oss/v2/buckets/gregbimbucket/objects/XXX"
}
This is what I send to convert the file:
{"input":{"urn":"dXJuOmFkc2sub2JqZWN0czpvcy5vYmplY3Q6Z3JlZ2JpbWJ1Y2tldC9YWFg"},"output":{"formats":[{"type":"svf","views":["2d","3d"]}]}}
This is the error I get back
HTTP/1.1 400 Bad Request
Result{"diagnostic":"Failed to trigger translation for this file."}
EDIT 2: SOLUTION
It looks like the objectId used when uploading a file has to include the file extension, rather than ending in a GUID or a random set of characters, so that Forge can tell what the file type is and convert it:
"objectId" : "urn:adsk.objects:os.object:gregbimbucket/Floor_sm.dwg",
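On the first question above: the request body shown earlier does use the full objectId (including the urn:adsk.objects prefix), encoded with URL-safe Base64 and with the trailing = padding dropped. A quick standard-library Java way to produce and sanity-check that value:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class UrnEncoder {
    public static void main(String[] args) {
        String objectId = "urn:adsk.objects:os.object:gregbimbucket/XXX";

        // URL-safe Base64 without '=' padding, as used in the job request.
        String encoded = Base64.getUrlEncoder().withoutPadding()
                .encodeToString(objectId.getBytes(StandardCharsets.UTF_8));
        System.out.println(encoded);

        // Decoding it back confirms nothing was lost in the encoding step.
        String decoded = new String(Base64.getUrlDecoder().decode(encoded),
                StandardCharsets.UTF_8);
        assert decoded.equals(objectId);
    }
}
```

So the encoding itself was fine; as the solution notes, it was the missing file extension on the objectId that made the translation fail.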
Using jQuery and Spring's ModelAndView in the controller.
I'm trying to code a process in which the user clicks an icon and if a certain criteria on the DB is met, a zip file will be produced on the server containing a bunch of files, then this zip file should be sent to the browser for saving.
If the criteria isn't met, then an error message should be sent to the browser telling there isn't any file to be created and produced.
However, if I use jQuery's .post method, I can receive the error message (when that is the case) but never the zip binary.
If I use a regular href link I can receive the file (when that is the case), but I don't know how to receive the message when the file cannot be produced.
Is there an alternative or a standard way to do this?
Thanks for your support!
-Gabriel.
You should probably split your server-side method in two:
the first one validates the criteria. If unsuccessful, it notifies of an exception, otherwise it returns a URL to the method in next point
the second one actually returns the zip file
In your frontend, the code will look something like this:
$.post(urlToPoint1, data, function(response) {
    if (response.success) {
        // download the file using the url provided
        // (pointing to method described in point 2)
        window.location.href = response.url;
    } else {
        alert('whatever');
    }
});
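On the server side, the first endpoint only needs to return a small JSON payload matching what that callback reads (response.success and response.url). A framework-agnostic sketch of building that payload (the field names mirror the snippet above; criteriaMet and downloadUrl are illustrative names, and a real controller would return this string as the response body):

```java
public class ValidationResponse {

    // Builds the JSON body the $.post callback above expects.
    static String toJson(boolean criteriaMet, String downloadUrl) {
        if (criteriaMet) {
            return String.format("{\"success\":true,\"url\":\"%s\"}",
                    downloadUrl);
        }
        return "{\"success\":false,\"message\":\"No file to produce\"}";
    }

    public static void main(String[] args) {
        System.out.println(toJson(true, "/files/export.zip"));
        System.out.println(toJson(false, null));
    }
}
```

The second endpoint (the one behind response.url) then streams the zip with a Content-Disposition: attachment header, as in any ordinary file-download controller.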
We are in the process of converting to the XSLT compiler for page generation. I have a Xalan Java extension that uses the CSSDK to capture some metadata we have stored in Extended Attributes, for output to the page. I have no problems getting the EAs rendered to the output file.
The problem is that I don't know how to dynamically capture the file path and name of the output file.
So, just as a proof of concept, I have the CSVPath hard-coded to the output file in my Java extension. Here's a code sample:
CSSimpleFile sourceFile = (CSSimpleFile)client.getFile(new CSVPath("/some-path-to-the-output.jsp"));
Can someone point me in the CSSDK to where I could capture the output file?
I found the answer.
First, get or create your CSClient. You can use the examples provided in cssdk/samples; I tweaked one so that I capture the CSClient in the method getClientForCurrentUser(). Watch out for SOAP vs. Java connections: in development I was using a SOAP connection, but for the make_toolkit build the Java connection was required for our purposes.
Check the following snippet. The request CSClient is captured in the static variable client.
CSVPath outputPath = new CSVPath(
        XSLTExtensionContext.getContext().getOutputDirectory().toString()
        + "/" + XSLTExtensionContext.getContext().getOutputFileName());
CSSimpleFile sourceFile = (CSSimpleFile) client.getFile(outputPath);