BaseX: best practice to hold XQuery functions - Java

Is the following possible for the BaseX database?
insert one or more XQuery functions into the database
call BaseX from Java, specifying which function to call, and receive a response
Or perhaps, in the worst case:
Have a single file with all the XQuery functions I wish to define (this is certainly possible)
Somehow select a single function from that file and execute the query
At the moment I have a number of files, each of which contains a single XQuery function. This is a poor solution; I would like to find a more elegant one.
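For reference, a minimal sketch of the single-file variant from Java, assuming the functions live in one module file and using BaseX's client API; the module URI, file name, and function name here are made up:

import org.basex.api.client.ClientQuery;
import org.basex.api.client.ClientSession;

// Sketch: all XQuery functions live in one module (functions.xqm); Java picks
// one by name when building the query. Module URI and function are assumptions.
public class BaseXFunctionCall {
    public static void main(String[] args) throws Exception {
        try (ClientSession session = new ClientSession("localhost", 1984, "admin", "admin")) {
            String q =
                "import module namespace f = 'urn:example:functions' at 'functions.xqm'; " +
                "f:lookup('some-key')";
            try (ClientQuery query = session.query(q)) {
                System.out.println(query.execute());
            }
        }
    }
}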

Related

Parse SQL and Evaluate an Expression

I have a SQL query that I would like to parse and evaluate. I have parsed the SQL using JSQL Parser. Now, I need to evaluate the WHERE clause in the SQL. I would like to do it in Flink as part of the filter function, basically stream.filter(Predicate<Row>). The Predicate<Row> is what I need to derive from the evaluation of the SQL's WHERE clause and apply to each streaming record.
Ex: SELECT COLUMN FROM TABLE WHERE (ac IS NOT NULL AND ac = 'On')
I would like to parse the above query and, given a streaming record with, say, ac = 'On', run the above expression evaluation on that record.
Any thoughts on how I can do it?
I would like to try expression evaluation with DFS but I'm kinda confused about how to go about it. Any help is appreciated!
If the SQL query is known at compile time, it's more straightforward to do this by integrating Flink SQL (via the Table API) into your DataStream application. See the docs for more info and examples.
The overall approach would be to convert your DataStream into a dynamic Table (which can be done automatically if the stream is a convenient type, such as a POJO), apply the SQL query to it, and then (if necessary) convert the resulting Table back to a DataStream.
Or maybe just implement the entire application with the Table API if you don't need any of the features that are unique to DataStreams.
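As a rough sketch of that round trip in a recent Flink version (the Event POJO and its field names are placeholders):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

// Sketch: DataStream -> Table -> SQL WHERE clause -> DataStream.
public class SqlFilterSketch {
    // Placeholder POJO; columns are derived from its public fields.
    public static class Event {
        public String ac;
    }

    public static DataStream<Row> filter(StreamExecutionEnvironment env,
                                         DataStream<Event> events) {
        StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);

        // Automatic conversion works for convenient types such as POJOs.
        Table input = tableEnv.fromDataStream(events);
        tableEnv.createTemporaryView("events", input);

        Table result = tableEnv.sqlQuery(
            "SELECT * FROM events WHERE (ac IS NOT NULL AND ac = 'On')");

        // Convert back only if DataStream-specific features are needed downstream.
        return tableEnv.toDataStream(result);
    }
}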
On the other hand, if the query is dynamic and isn't provided until runtime, you'll need to pursue something like what you've proposed. FWIW, others with similar requirements have used dynamic languages with JVM-based runtimes, such as Javascript via Rhino, or Groovy. The overall approach is to use a BroadcastProcessFunction, with the dynamic code being broadcast into the operator.
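A skeleton of that broadcast pattern might look like the following; this is purely illustrative, assumes a JSR-223 JavaScript engine (Nashorn, GraalJS) is on the classpath, and the field-to-variable mapping into the script is a placeholder:

import javax.script.Bindings;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.types.Row;
import org.apache.flink.util.Collector;

// Sketch: the predicate source code arrives on a second stream and is
// broadcast into the operator, which applies it to every record.
public class DynamicPredicateFilter {
    static final MapStateDescriptor<String, String> RULE =
        new MapStateDescriptor<>("rule", String.class, String.class);

    public static DataStream<Row> apply(DataStream<Row> events, DataStream<String> rules) {
        BroadcastStream<String> broadcast = rules.broadcast(RULE);
        return events.connect(broadcast).process(
            new BroadcastProcessFunction<Row, String, Row>() {
                private transient ScriptEngine engine; // rebuilt per task, not checkpointed

                @Override
                public void processBroadcastElement(String rule, Context ctx,
                                                    Collector<Row> out) throws Exception {
                    ctx.getBroadcastState(RULE).put("current", rule); // latest rule wins
                }

                @Override
                public void processElement(Row row, ReadOnlyContext ctx,
                                           Collector<Row> out) throws Exception {
                    String rule = ctx.getBroadcastState(RULE).get("current");
                    if (rule == null) return; // no predicate received yet
                    if (engine == null) {
                        engine = new ScriptEngineManager().getEngineByName("javascript");
                    }
                    Bindings vars = engine.createBindings();
                    vars.put("ac", row.getField(0)); // field mapping is a placeholder
                    if (Boolean.TRUE.equals(engine.eval(rule, vars))) {
                        out.collect(row);
                    }
                }
            });
    }
}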

Cross referencing in KNIME

I'm using KNIME and I need to be able to cross-reference a value within a CSV file against a value I get from an Oracle DB.
Specifically, I need to match a ZIP code I get from the DB to a CSV file I have that contains zip codes and their corresponding counties.
I'm not really sure how to approach it. I've tried Joins and Cross joins but the data ends up looking garbled and I'm unable to make any sense of it. Worst case scenario I end up manually looking things up.
If you are adding only one column (the county in this example), I prefer the Cell Replacer node with the "Append column" option. It is easier to configure and faster than the Joiner node.

Appropriate way to pass a dataset to Java from Oracle PL/SQL

I need to pass datasets from Oracle to Java through JDBC.
What is the best way to organize this so that everything works well and it is convenient for both Java and PL/SQL developers to maintain the code when, for example, a table column type changes?
I see these options:
Pass a sys_refcursor via a stored procedure, and in Java expect certain fields with certain data types (a rough sketch of this appears after the list).
Pass a strongly typed ref cursor and do the same in Java as in item 1, but with the type declared in the PL/SQL package.
Pass a SQL "table of" type declared at the schema level. If I understand correctly, Java can apparently map it onto an object. The problem is that such types cannot have fields anchored to a column type (Column_Name%TYPE).
Declare a "table of object / record" type in the PL/SQL package and use JPublisher to work with it; JPublisher apparently converts it into a SQL type. It is not entirely clear to me how this is implemented, or what needs to be done when the data type of a column changes.
Use a pipelined function instead of a cursor (does this even make sense for such a task?).
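For illustration, a rough sketch of variant 1 as seen from Java (the procedure, columns, and connection details are hypothetical; Oracle's JDBC driver exposes the cursor via OracleTypes.CURSOR):

import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;

import oracle.jdbc.OracleTypes;

// Sketch: the stored procedure returns a sys_refcursor that Java reads as a
// ResultSet, trusting that certain columns exist with certain types.
public class RefCursorSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//localhost:1521/ORCLPDB1", "scott", "tiger");
             CallableStatement cs = conn.prepareCall("{call my_pkg.get_data(?)}")) {
            cs.registerOutParameter(1, OracleTypes.CURSOR);
            cs.execute();
            try (ResultSet rs = (ResultSet) cs.getObject(1)) {
                while (rs.next()) {
                    System.out.println(rs.getLong("id") + " " + rs.getString("name"));
                }
            }
        }
    }
}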
What should I choose? Or maybe something else, outside these options?
P.S. Sorry for bad English.
I'm not sure that I've understood your question correctly, but I think you're confused.
The variants you describe are ways to execute a Java package on the server side (for example, when you have a database with application servers and want to execute a Java package there against the database's data).
But if you're thinking about JDBC, then I guess you want to build a Java application that works with the database. In that case you don't have to use sys_refcursor or subtypes like "table of object / record". JDBC provides the means to work with datasets using plain SQL: just connect to the database as a user (via JDBC) and execute a SQL query. After that you can read any data from the result set.
Examples:
Connection example via JDBC
Execute select after connection
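Along the lines of those examples, a minimal sketch (connection string, table, and columns are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Sketch: connect as a user via JDBC and read a dataset with plain SQL.
public class SimpleSelectSketch {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:oracle:thin:@//localhost:1521/ORCLPDB1", "scott", "tiger");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT id, name FROM employees WHERE dept_id = ?")) {
            ps.setInt(1, 10);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getLong("id") + " " + rs.getString("name"));
                }
            }
        }
    }
}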
So the answer to your question depends on your goals.

Is it okay to validate JSON at PostgreSQL side?

Writing APIs, I used to validate all input parameters on the Java (or PHP, whatever) side, but now we have moved our DBs to PostgreSQL, which gives us great JSON features, like building JSON from table rows and a lot more (so far I haven't found anything we can't do with the PostgreSQL JSON functions). So I thought: what if I move all parameter validation to Postgres (also considering that I can return JSON straight from the database)?
In Java I made it like this:
// params comes from @RequestBody cast to a JSONObject
if (!params.has("signature"))
    return errGenerator.genErrorResponse("e01"); // this also needs database access to get the error description
On the Postgres side I would do it like this (tested, works as expected):
CREATE OR REPLACE FUNCTION test.testFunc(_object JSON)
  RETURNS TABLE(result JSON) AS
$$
BEGIN
  IF (_object -> 'signature') IS NULL THEN -- the needed param is missing
    RETURN QUERY (SELECT row_to_json(errors)
                  FROM errors
                  WHERE errcode = 'e01');
  ELSE -- everything is okay
    RETURN QUERY (SELECT row_to_json(other_table)
                  FROM other_table);
  END IF;
END;
$$
LANGUAGE plpgsql;
And so on...
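For context, the Java side then shrinks to a single call into the function; a sketch, with the connection details assumed:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.List;

// Sketch: forward the raw JSON body to the Postgres function and collect
// whatever JSON rows come back (an error row or the result rows).
public class PgValidationCall {
    public static List<String> callTestFunc(String jsonBody) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:postgresql://localhost:5432/mydb", "app", "secret");
             PreparedStatement ps = conn.prepareStatement(
                 "SELECT result FROM test.testFunc(CAST(? AS json))")) {
            ps.setString(1, jsonBody);
            List<String> rows = new ArrayList<>();
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    rows.add(rs.getString(1));
                }
            }
            return rows;
        }
    }
}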
The one problem I see so far is that if we move to MS SQL or Sybase, we would need to rewrite all the procedures. But with NoSQL gaining more and more ground, that seems unlikely, and if we moved to a NoSQL DB we would have to recode all the APIs anyway.
You have to consider basically two items:
The closer you put your checks to the data storage, the safer it is. If you have the database perform all the checks, they'll be performed no matter how you interface with it, whether through your application or through some third-party tool you might be using (if only for maintenance). In that sense, checking on the database side improves safety (as in "data consistency"), so it makes perfect sense to have the database perform the checks.
The closer you put your checks to the user, the faster you can respond to their input. If you have a web application that needs fast response times, you probably want the checks on the client side.
And take an important additional point into consideration:
You might also have to consider your team knowledge: what the developers are more comfortable with. If you know your Java library much better than you know your database functions... it might make sense to perform all the checks Java-side.
You can take a third way: do both checks in series, first on the application (client) side, then on the database (server) side. Unless you have some sophisticated automation, this involves extra work to keep all the checks consistent; that is, nothing should be blocked on the client side that would be allowed to pass when checked by the database. This way, at least the most basic checks are performed in the first stages, and all of them (even the redundant ones) are performed by the database.
If you can afford the time to move the data through several application layers, I'd go with safety. However, the choice to be made is case-specific.
So I found some takeaways... The main one is that I can cache the error messages in my application, which lets me avoid a database request when the input parameters don't pass validation, and only go to the database to get the result data.
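A sketch of that caching idea, reusing the errors table from the function above (the loading query and class shape are assumptions):

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.HashMap;
import java.util.Map;

// Sketch: load the error descriptions once at startup so that failed
// validations never need a round trip to the database.
public class ErrorCache {
    private final Map<String, String> errorsByCode = new HashMap<>();

    public ErrorCache(Connection conn) throws Exception {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT errcode, row_to_json(errors) FROM errors")) {
            while (rs.next()) {
                errorsByCode.put(rs.getString(1), rs.getString(2));
            }
        }
    }

    public String get(String code) {
        return errorsByCode.get(code); // e.g. get("e01") for a missing signature
    }
}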

Is there a clean way to read embedded SQL resource files?

To avoid creating SQL statements as strings in a class I've placed them as .sql files in the same package and read the contents to a string in the static constructor. The reason for this is the SQL is very complex due to an ERP system that the SQL is querying.
There's no problem with this method as such, but since the reading mechanism simply reads the whole file, stripping excess whitespace and newlines as it goes, a comment at the end of a line can cause the read to fail: whatever follows it gets joined onto the same line. Fully commented lines (i.e. lines beginning with --) are removed.
I could enhance the simple reader to strip comments and so on, though I have to wonder whether something already exists that can read an SQL file and clean it up.
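Absent a ready-made library, a small reader that drops both full-line and end-of-line -- comments before collapsing the lines might look like this (deliberately naive: it would also truncate a line containing -- inside a string literal):

import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.stream.Collectors;

// Sketch: read a .sql resource from the same package and strip -- comments
// before the newlines are collapsed away.
public class SqlResourceReader {
    public static String read(Class<?> owner, String fileName) throws Exception {
        try (InputStream in = owner.getResourceAsStream(fileName);
             BufferedReader reader = new BufferedReader(
                 new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.lines()
                .map(line -> {
                    int comment = line.indexOf("--"); // naive: ignores string literals
                    return comment >= 0 ? line.substring(0, comment) : line;
                })
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.joining(" "));
        }
    }
}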
I've seen this same problem solved in a project I've worked on by storing queries in XML, and loading the XML into a custom StoredQueriesCache object at runtime. To get a query, we would call a method on the StoredQueriesCache object and just pass the query name (which is defined in the XML), and it would return the query.
Writing something like this is fairly simple. The XML would look something like this below...
<Query>
  <Name>SomeUniqueQueryName</Name>
  <SQL>
    SELECT someColumn FROM someTable WHERE somePredicate
  </SQL>
</Query>
You would have one such element for every stored query. The XML would be loaded into memory at application startup from file, or, depending on your needs, it could be lazy-loaded. Your StoredQueriesCache object holding the XML would then have methods to return individual queries by name. In my experience, having comments in a query has never caused any issue, since line breaks are part of the XML node's inner text; but if you want, the StoredQueriesCache methods that retrieve the queries could parse comments out.
I've found this to be the most organized way of storing queries without embedding them in code, and without using stored procedures. There should honestly be a library that does this for you; maybe I'll write one!
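A sketch of such a cache, assuming a root element wrapping the Query elements shown above (the root name, file location, and class shape are all assumptions):

import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Sketch: load every <Query> element into a name -> SQL map at startup.
public class StoredQueriesCache {
    private final Map<String, String> queries = new HashMap<>();

    public StoredQueriesCache(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().parse(xml);
        NodeList nodes = doc.getElementsByTagName("Query");
        for (int i = 0; i < nodes.getLength(); i++) {
            Element query = (Element) nodes.item(i);
            String name = query.getElementsByTagName("Name").item(0).getTextContent();
            String sql = query.getElementsByTagName("SQL").item(0).getTextContent();
            queries.put(name.trim(), sql.trim()); // comments survive inside the text node
        }
    }

    public String get(String name) {
        return queries.get(name); // e.g. get("SomeUniqueQueryName")
    }
}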
