Using BigQuery in a Spring Boot microservice - Java

I am following this documentation to run a query in my Spring Boot app: https://docs.spring.io/spring-cloud-gcp/docs/current/reference/html/bigquery.html
I am thinking of going with something similar to the example given there:
// BigQuery client object provided by our autoconfiguration.
@Autowired
BigQuery bigquery;
public void runQuery() throws InterruptedException {
    String query = "SELECT column FROM table;";
    QueryJobConfiguration queryConfig =
        QueryJobConfiguration.newBuilder(query).build();
    // Run the query using the BigQuery object
    for (FieldValueList row : bigquery.query(queryConfig).iterateAll()) {
        for (FieldValue val : row) {
            System.out.println(val);
        }
    }
}
However, I have some questions. The documentation says that the GcpBigQueryAutoConfiguration class configures an instance of BigQuery for you by inferring your credentials and Project ID from the machine’s environment. In the configuration section, it also says that spring.cloud.gcp.bigquery.datasetName is the BigQuery dataset that the BigQueryTemplate and BigQueryFileMessageHandler are scoped to, and that it is required.
But what if I have different projects and datasets that I want to use depending on conditions in my code? Will it be enough to define the credentials in the machine's environment and then, where the query is defined, concatenate the project ID and dataset name with the table name, something like this:
String query = "SELECT column FROM <project_id>+<dataset>+<table>;";
Is this correct, or is there a better way to do it?
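One way to read the concatenation idea above, as a minimal sketch: BigQuery standard SQL accepts a fully qualified table name of the form project.dataset.table wrapped in backticks, so the parts are joined with dots rather than "+". The names QualifiedTable and of are hypothetical, and the identifier check is a deliberately conservative assumption, not BigQuery's full naming rule.

```java
import java.util.regex.Pattern;

// Hypothetical helper: builds a backtick-quoted `project.dataset.table` reference.
final class QualifiedTable {
    // Conservative allowlist (an assumption, stricter than BigQuery's own rules):
    // letters, digits, underscores and hyphens only.
    private static final Pattern SAFE = Pattern.compile("[A-Za-z0-9_-]+");

    static String of(String projectId, String dataset, String table) {
        for (String part : new String[] {projectId, dataset, table}) {
            if (part == null || !SAFE.matcher(part).matches()) {
                throw new IllegalArgumentException("Unsafe identifier: " + part);
            }
        }
        return "`" + projectId + "." + dataset + "." + table + "`";
    }

    public static void main(String[] args) {
        // Project, dataset and table names here are placeholders.
        String query = "SELECT column FROM " + of("my-project", "my_dataset", "my_table");
        System.out.println(query);
    }
}
```

Table names cannot be supplied as BigQuery query parameters, which is why this sketch validates the parts before concatenating them instead of binding them.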
EDIT:
My second question: if I need to apply a WHERE condition in the query, is string concatenation enough for that as well? For example:
String query = "SELECT column FROM <project_id>+<dataset>+<table> where id =<ID>;";
I ask because I found a resource that handles it like this:
final var queryJobConfiguration = QueryJobConfiguration
    .newBuilder("SELECT * FROM " + tableId.getTable() + " WHERE id=@id")
    .addNamedParameter("id", QueryParameterValue.numeric(BigDecimal.valueOf(userId)))
    .setDefaultDataset(dataset)
    .build();
I don't know why they used .addNamedParameter.
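A likely reason for .addNamedParameter is that a named parameter is sent separately from the SQL text, so the value cannot change the structure of the query. The sketch below uses only plain strings (no BigQuery client; the class name and table name are hypothetical) to show what concatenation does with a hostile value.

```java
// Hypothetical demo: string concatenation lets the value rewrite the query.
final class ConcatenationRisk {
    // Builds the WHERE clause by pasting the raw value into the SQL text.
    static String concatenated(String id) {
        return "SELECT column FROM my_table WHERE id = " + id;
    }

    // With a named parameter the SQL text stays fixed; the value travels separately.
    static String parameterized() {
        return "SELECT column FROM my_table WHERE id = @id";
    }

    public static void main(String[] args) {
        String hostile = "1 OR 1 = 1";
        // The condition now matches every row instead of a single id.
        System.out.println(concatenated(hostile));
        // The text is the same no matter which value is later bound to @id.
        System.out.println(parameterized());
    }
}
```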

Related

How do you return only x number of records back

I have an API using Spring Boot that returns data from my MySQL db.
I would like to send in a parameter (to keep it simple, as part of the URI) so that only x records are returned.
My question is:
Is it easier to return all the records to the Spring Boot app, loop through them, and return x of them via an ArrayList, or
is there an actual method I can call, with either JPA or the standard CRUD superclass in Java, to get the correct result?
You can use a native query in your repository.
For example, suppose you have a controller named fetch_data_controller, a repository named fetch_data_repository, and a table named fetch_data_table from which you want to fetch only specific data.
In fetch_data_repository, write the query as follows:
@Query(value = "SELECT col_1,col_2 FROM fetch_data_table WHERE validation = 1", nativeQuery = true)
List<Map<String,String>> fetch_data_func();
In fetch_data_controller, write the code as follows:
List<Map<String,String>> fetched_data = fetch_data_repository.fetch_data_func();
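The snippets above do not yet limit the row count. As a hedged sketch of the two approaches from the question: the first method below trims an already loaded list in memory, and the second builds native query text with a LIMIT clause so that MySQL returns only the requested rows. The class name, method names, and the cap of 100 are illustrative assumptions. With Spring Data JPA, passing PageRequest.of(0, x) as a Pageable argument to a repository method is another common route.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper illustrating in-memory versus database-side limiting.
final class RowLimit {
    // Assumed upper bound so a caller cannot request an unbounded result.
    static final int MAX_ROWS = 100;

    // Clamp the value taken from the URI into the range 1..MAX_ROWS.
    static int clamp(int requested) {
        return Math.max(1, Math.min(requested, MAX_ROWS));
    }

    // Option 1: load everything, then keep the first x rows in memory.
    static <T> List<T> firstN(List<T> all, int x) {
        return all.stream().limit(clamp(x)).collect(Collectors.toList());
    }

    // Option 2: let MySQL do the work; x is an int, so appending it is safe.
    static String limitedQuery(int x) {
        return "SELECT col_1, col_2 FROM fetch_data_table WHERE validation = 1 LIMIT " + clamp(x);
    }

    public static void main(String[] args) {
        System.out.println(firstN(List.of("a", "b", "c", "d"), 2));
        System.out.println(limitedQuery(2));
    }
}
```

Limiting in the database avoids transferring rows that are then thrown away, which matters more as the table grows.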

Creating a table within a dataset in BigQuery programmatically

Is it possible to create a table within a dataset in BigQuery using the API in Java? I know it's possible with
bq mk --schema <fileName> -t <project>:<dataset>.<table>
but I can't find a way to do it programmatically.
I haven't used the Java BigQuery library personally [1], but it looks like you should call BigQuery.create(TableInfo, TableOption...). That documentation has this example code, assuming of course that you already have an instance of a BigQuery interface implementation:
String datasetName = "my_dataset_name";
String tableName = "my_table_name";
String fieldName = "string_field";
TableId tableId = TableId.of(datasetName, tableName);
// Table field definition
Field field = Field.of(fieldName, Field.Type.string());
// Table schema definition
Schema schema = Schema.of(field);
TableDefinition tableDefinition = StandardTableDefinition.of(schema);
TableInfo tableInfo = TableInfo.newBuilder(tableId, tableDefinition).build();
Table table = bigquery.create(tableInfo);
Obviously your schema construction is likely to be a bit more involved for a real table, but that should get you started. I can't see any way of loading a schema from a file, but if your schema file is machine-readable in a simple way (e.g. JSON) you could probably write your own parser fairly easily. (And contribute it to the project, should you wish...)
[1] I'm the main author of the C# BigQuery library though, so I know what to look for.
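As a rough sketch of the "write your own parser" idea, the code below parses the inline field:type,field:type schema form that bq also accepts, rather than a JSON file. The class names are hypothetical and not part of the BigQuery library; turning each parsed column into a Field is left as a comment because the exact call depends on the library version in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical parser for an inline schema such as "name:STRING,age:INTEGER".
final class InlineSchema {
    // Simple holder for one column; not part of the BigQuery library.
    static final class Column {
        final String name;
        final String type;

        Column(String name, String type) {
            this.name = name;
            this.type = type;
        }

        @Override
        public String toString() {
            return name + ":" + type;
        }
    }

    static List<Column> parse(String schema) {
        List<Column> columns = new ArrayList<>();
        for (String entry : schema.split(",")) {
            String[] parts = entry.split(":");
            if (parts.length != 2 || parts[0].trim().isEmpty() || parts[1].trim().isEmpty()) {
                throw new IllegalArgumentException("Expected name:TYPE but got: " + entry);
            }
            // Each Column could then be turned into a Field for Schema.of(...).
            columns.add(new Column(parts[0].trim(), parts[1].trim().toUpperCase(Locale.ROOT)));
        }
        return columns;
    }

    public static void main(String[] args) {
        System.out.println(parse("string_field:string, count:integer"));
    }
}
```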

Create Cassandra @Query for one or more @Params

I am trying to perform a SELECT on a Cassandra database, using the DataStax driver in a Java app.
I have already developed simple queries such as:
@Repository
public class CustomTaskRepository
        extends AbstractCassandraRepository<CustomTask> {
    @Accessor
    interface ProfileAccessor {
        @Query("SELECT * FROM tasks where status = :status")
        Result<CustomTask> getByStatus(@Param("status") String status);
    }
    public List<CustomTask> getByStatus(String status) {
        ProfileAccessor accessor = this.mappingManager.createAccessor(ProfileAccessor.class);
        Result<CustomTask> tasks = accessor.getByStatus(status);
        return tasks.all();
    }
}
That works great.
The problem I have now is that I want to execute a SELECT statement for more than one status. For example, I would like to execute the query for one, two, or more status codes (Pending, Working, Finalized, etc.).
How can I create a @Query statement flexible enough to accept one or more status codes?
Thanks in advance!!!
EDIT: The table create statement is:
CREATE TABLE tasks(
"reservation_id" varchar,
"task_id" UUID,
"status" varchar,
"asignee" varchar,
PRIMARY KEY((reservation_id),task_id)
)
WITH compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'} ;
CREATE INDEX taskid_index ON tasks( task_id );
CREATE INDEX asignee_index ON tasks( asignee );
Try using IN instead of =. If this column is the partition key, you will get the rows that you need. Also note that it might cause performance degradation if there are a lot of statuses in the IN clause.
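A minimal sketch of the IN suggestion, assuming the statuses are bound as positional ? markers: it only builds the CQL text with one placeholder per status. The class and method names are hypothetical. Note that in the table definition shown in the question, status is neither part of the primary key nor indexed, so whether Cassandra accepts this restriction depends on the schema.

```java
import java.util.Collections;

// Hypothetical helper: one "?" placeholder per status code.
final class StatusQuery {
    static String byStatuses(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("At least one status is required");
        }
        // Produces "?, ?, ?" for count = 3.
        String placeholders = String.join(", ", Collections.nCopies(count, "?"));
        return "SELECT * FROM tasks WHERE status IN (" + placeholders + ")";
    }

    public static void main(String[] args) {
        System.out.println(byStatuses(1));
        System.out.println(byStatuses(3));
    }
}
```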

Writing a basic N1QL query in Java

I have just started learning Couchbase. I am trying to write a basic query using the Java SDK, but I can't work out how to write it. Below is the query:
SELECT *
FROM users_with_orders usr
JOIN orders_with_users orders
ON KEYS ARRAY s.order_id FOR s IN usr.shipped_order_history END
This is what I have for joining without an array:
LetPath path = select("*, META(usr).id AS _ID, META(usr).cas AS _CAS").from(bucketName + " usr").join(bucketName + " orders").onKeys("usr.order_id");
How should I proceed with the above query for on keys array?
Thanks!!!!
As described in the docs on Querying from the SDK, you can use either a simple string with the Java SDK or use the DSL. For example:
// query with a simple string
System.out.println("Simple string query:");
N1qlQuery airlineQuery = N1qlQuery.simple("SELECT `travel-sample`.* FROM `travel-sample` WHERE name=\"United Airlines\" AND type=\"airline\"");
N1qlQueryResult queryResult = bucket.query(airlineQuery);
for (N1qlQueryRow result : queryResult) {
    System.out.println(result.value());
}
// query with a parameter using the DSL
System.out.println("Parameterized query using the DSL:");
Statement statement = select(path(i("travel-sample"), "*")).from(i("travel-sample")).where(x("name").eq(x("$airline_param")).and(x("type").eq(s("airline"))));
JsonObject placeholderValues = JsonObject.create().put("airline_param", "United Airlines");
N1qlQuery airlineQueryParameterized = N1qlQuery.parameterized(statement, placeholderValues);
N1qlQueryResult queryResultParameterized = bucket.query(airlineQueryParameterized);
for (N1qlQueryRow row : queryResultParameterized) {
    System.out.println(row);
}
(I posted a full gist of this example for the imports, etc.)
See the docs for more info, but you may want to use the DSL to allow IDE code completion and Java compile time checking. When developing an interactive web application, you'll probably also want to use parameterized statements (for security) and may even want prepared statements (for performance).
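For the ON KEYS ARRAY query from the question, the simple-string route may be the easiest starting point. The sketch below only assembles the statement text (the class name is hypothetical); the resulting string would then be passed to N1qlQuery.simple in the same way as the first example in this answer.

```java
// Hypothetical holder for the N1QL statement from the question.
final class OrdersJoin {
    static String statement() {
        return "SELECT * "
            + "FROM users_with_orders usr "
            + "JOIN orders_with_users orders "
            + "ON KEYS ARRAY s.order_id FOR s IN usr.shipped_order_history END";
    }

    public static void main(String[] args) {
        // With the SDK used in this answer: bucket.query(N1qlQuery.simple(statement()))
        System.out.println(statement());
    }
}
```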

UpdateString not implemented by SQLite JDBC driver

I have a table PERSON with more than 5 million rows, and I need to update the field NICKNAME on each of them based on the field NAME in the same table.
ResultSet rs = statement.executeQuery("select NAME from PERSON");
while (rs.next())
{
    // some parsing function like:
    // Nickname = myparsingfunction(rs.getString("NAME"));
    rs.updateString("NICKNAME", Nickname);
    rs.updateRow();
}
But I got this error:
not implemented by SQLite JDBC driver
I'm using sqlite-jdbc-3.8.11.2.jar downloaded at https://bitbucket.org/xerial/sqlite-jdbc/downloads.
I know I could use the following SQL query:
statement.executeUpdate("update PERSON set NICKNAME = Nickname where ID = Id");
But that would take forever, and I understand that updating the ResultSet would be faster. So what options do I have to update the table in the fastest way? Is any other driver available? Should I move away from Java?
UPDATE
I was able to find a fast solution using the syntax below. The block between CASE and END was a concatenated string that I built before executing the SQL query, so I could send all updates at once.
update PERSON
set NICKNAME= case ID
when 173567 then 'blabla'
when 173568 then 'bleble'
...
when 173569 then 'blublu'
end
where ID in (173567, 173568, 173569)
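The concatenated CASE block described above could be built along these lines. This is a sketch under assumptions: the class and method names are hypothetical, IDs are integers, and single quotes in nicknames are doubled, which is the standard SQL escape inside a string literal.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical builder for one UPDATE ... CASE statement covering many rows.
final class CaseUpdate {
    static String build(Map<Integer, String> nicknamesById) {
        if (nicknamesById.isEmpty()) {
            throw new IllegalArgumentException("Nothing to update");
        }
        StringBuilder cases = new StringBuilder();
        StringJoiner ids = new StringJoiner(", ");
        for (Map.Entry<Integer, String> e : nicknamesById.entrySet()) {
            // Double any single quote so the nickname stays inside its literal.
            String escaped = e.getValue().replace("'", "''");
            cases.append(" when ").append(e.getKey())
                 .append(" then '").append(escaped).append("'");
            ids.add(String.valueOf(e.getKey()));
        }
        return "update PERSON set NICKNAME = case ID" + cases
            + " end where ID in (" + ids + ")";
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is predictable.
        Map<Integer, String> batch = new LinkedHashMap<>();
        batch.put(173567, "blabla");
        batch.put(173568, "ble'ble");
        System.out.println(build(batch));
    }
}
```

SQLite limits the length of a single statement, so with millions of rows the map would probably need to be split into chunks and one statement sent per chunk.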
As you have encountered, the SQLite JDBC driver does not currently support the updateString operation. This can be seen in the source code for this driver.
I can think of three options:
1. As you stated in your question, you can select the name and ID of the person and then update the person by its ID. Those updates could be done in a batch (using PreparedStatement.addBatch()) to improve performance (tutorial).
2. Implement the method myparsingfunction in pure SQL so that the query could become UPDATE PERSON SET NICKNAME = some_function(NAME).
3. Create a user-defined function (using org.sqlite.Function), implemented in Java, and call it inside the SQL. Example, taken from this answer:
Function.create(db.getConnection(), "getNickName", new Function() {
    protected void xFunc() throws SQLException {
        String name = value_text(0);
        String nickName = ...; // implement myparsingfunction here
        result(nickName);
    }
});
and use it like this: UPDATE PERSON SET NICKNAME = getNickName(NAME);
SQLite does not support stored procedures, so that option is off the table.
I'm not sure which of these options would provide the best performance (certainly using pure SQL would be faster but that may not be a viable solution). You should benchmark each solution to find the one that fits you.
