I have a general question about how best to build an API that can modify records in a database.
Suppose we have a table with 10 columns and we can query these 10 columns using REST (GET). The JSON response will contain all 10 fields. This is easy and works without problems.
The next step is that someone wants to create a new record via POST. In this case the person sends only 8 of the 10 fields in the JSON request. We would then fill only those 8 fields in the database (the rest would be NULL). This also works without problems.
But what happens if someone wants to update a record? We see several possibilities here, each with advantages and disadvantages.
Only what should be updated is sent.
Problem: How can you explicitly empty / delete a field? If null is passed in the JSON, we get null in the object, but any field that is not sent is null as well. Therefore we cannot distinguish which fields should be cleared and which should be left untouched.
The complete object is sent.
Problem: Here the object could be fetched via a GET before, changed accordingly, and returned via PUT. We then get all the information back and could write it directly back into the database, because empty fields were either already empty before or were cleared by the user.
What happens if the objects are extended by an update of the API? Suppose we extend the database by five more fields. The user of the API makes a GET, gets the 15 fields, but can only read the 10 fields he knows on his side (because he hasn't updated it yet). Then he changes some of the 10 fields and sends them back via PUT. We would then update only the 10 fields on our side, and the 5 new fields would be emptied in the database.
Or do you have to create a separate endpoint for each field? We have also thought about accepting a key/value map describing exactly what should be changed.
About the technology: we use WildFly 15 with RESTEasy and Jackson.
For example:
Database at the beginning
+----+----------+---------------+-----+--------+-------+
| ID | Name     | Country       | Age | Weight | Phone |
+----+----------+---------------+-----+--------+-------+
| 1  | Person 1 | Germany       | 22  | 60     | 12345 |
| 2  | Person 2 | United States | 32  | 62     | 56789 |
| 3  | Person 3 | Canada        | 52  | 102    | 99999 |
+----+----------+---------------+-----+--------+-------+
GET .../person/2
{
  "id": 2,
  "name": "Person 2",
  "country": "United States",
  "age": 32,
  "weight": 62,
  "phone": "56789"
}
Now I want to update his weight and remove the phone number:
PUT .../person/2
{
  "id": 2,
  "name": "Person 2",
  "country": "United States",
  "age": 32,
  "weight": 78
}
or
{
  "id": 2,
  "name": "Person 2",
  "country": "United States",
  "age": 32,
  "weight": 78,
  "phone": null
}
Now the database should look like this:
+----+----------+---------------+-----+--------+-------+
| ID | Name     | Country       | Age | Weight | Phone |
+----+----------+---------------+-----+--------+-------+
| 1  | Person 1 | Germany       | 22  | 60     | 12345 |
| 2  | Person 2 | United States | 32  | 78     | NULL  |
| 3  | Person 3 | Canada        | 52  | 102    | 99999 |
+----+----------+---------------+-----+--------+-------+
The problem is: we extend the table like this (adding a salary column):
+----+----------+---------------+-----+--------+--------+-------+
| ID | Name     | Country       | Age | Weight | Salary | Phone |
+----+----------+---------------+-----+--------+--------+-------+
| 1  | Person 1 | Germany       | 22  | 60     | 1929   | 12345 |
| 2  | Person 2 | United States | 32  | 78     | 2831   | NULL  |
| 3  | Person 3 | Canada        | 52  | 102    | 3921   | 99999 |
+----+----------+---------------+-----+--------+--------+-------+
The person using the API does not know that there is a new salary field in the JSON. This person now wants to clear someone's phone number again, but does not send the salary. This would also empty the salary:
{
  "id": 3,
  "name": "Person 3",
  "country": "Canada",
  "age": 52,
  "weight": 102,
  "phone": null
}
+----+----------+---------------+-----+--------+--------+-------+
| ID | Name     | Country       | Age | Weight | Salary | Phone |
+----+----------+---------------+-----+--------+--------+-------+
| 1  | Person 1 | Germany       | 22  | 60     | 1929   | 12345 |
| 2  | Person 2 | United States | 32  | 78     | 2831   | NULL  |
| 3  | Person 3 | Canada        | 52  | 102    | NULL   | NULL  |
+----+----------+---------------+-----+--------+--------+-------+
But the salary should not be NULL, because it was not set in the JSON request.
You could deserialize your JSON to a Map.
This way, if a property has not been sent, it is not present in the map at all. If it was sent as null, it is in the map with a null value.
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.HashMap;

ObjectMapper mapper = new ObjectMapper();
TypeReference<HashMap<String, Object>> typeReference = new TypeReference<>() {};
HashMap<String, Object> jsonMap = mapper.readValue(json, typeReference);
jsonMap.keySet().forEach(System.out::println); // prints only the keys that were actually sent
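The update logic can then distinguish the two cases with a simple lookup; a minimal sketch, using the phone field from the example above:
if (!jsonMap.containsKey("phone")) {
    // field was not sent: leave the column untouched
} else if (jsonMap.get("phone") == null) {
    // field was explicitly sent as null: clear the column
} else {
    // field was sent with a value: update the column
}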
Not a very convenient solution, but it might work for you.
A common technique is to track changes on the entity POJO.
Load Dog with color = black, size = null and age = null
Set size to null (the setter will mark this field as changed)
Run update SQL
The POJO will have internal state knowing that size was changed, and will thus include that field in the UPDATE. age, on the other hand, was never set and is left unchanged. jOOQ works like that; I'm sure there are others.
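A minimal hand-rolled sketch of the idea, following the Dog example above (libraries like jOOQ generate this kind of tracking for you):
import java.util.HashSet;
import java.util.Set;

public class Dog {
    private String color;
    private Integer size;
    private Integer age;
    private final Set<String> changedFields = new HashSet<>();

    public void setSize(Integer size) {
        this.size = size;
        changedFields.add("size"); // marked as changed even when set to null
    }

    public boolean isChanged(String field) {
        return changedFields.contains(field);
    }
    // same pattern for the other setters; the UPDATE statement then
    // includes only the fields present in changedFields
}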
Only what should be updated is sent. Problem: How can you explicitly empty / delete a field? If null is passed in the JSON, we get null in the object, but any field that is not sent is null as well. Therefore we cannot distinguish which fields should be cleared and which should be left untouched.
The problem you have identified is genuine; I have faced it too. I think it is reasonable not to provide a technical solution for this, but rather to document the API usage so the caller knows the impact of leaving out a field or sending it as null. This of course assumes that the validations on the server side are tight and ensure sanity.
The complete object is sent. Problem: Here the object could be fetched via a GET before, changed accordingly, and returned via PUT. We then get all the information back and could write it directly back into the database, because empty fields were either already empty before or were cleared by the user.
This is "straighter-forward" and should be documented in the API.
What happens if the objects are extended by an update of the API?
With the onus put on the caller through the documentation, this too is handled implicitly.
Or do you have to create a separate endpoint for each field?
This, again, is a design issue, and the solution varies from person to person. I would rather keep the API at the record level than at the level of individual values. However, there may be cases where field-level endpoints are needed, e.g. status updates.
Suppose we extend the database by five more fields. The user of the API makes a GET, gets the 15 fields, but can only read the 10 fields he knows on his side (because he hasn't updated it yet). Then he changes some of the 10 fields and sends them back via PUT. We would then update only the 10 fields on our side, and the 5 new fields would be emptied in the database.
So let's start with an example - what would happen on the web, where clients interact with your API via HTML rendered in browsers. The client would GET a form, and that form would have input controls for each of the fields. The client updates the fields in the form, submits it, and you apply those changes to your database.
When you want to extend the API to include more fields, you add those fields to the form. The client doesn't know about those fields. So what happens?
One way to manage this is that you make sure that you include in the form the correct default values for the new fields; then, if the client ignores the new fields, the correct value will be returned when the form is submitted.
More generally, the representations we exchange in our HTTP payloads are messages; if we want to support old clients, then we need the discipline of evolving the message schema in a backwards compatible way, and our clients have to be written with the understanding that the message schema may be extended with additional fields.
The person using the API does not know that there is a new salary field in the JSON.
The same idea holds here - the new representation includes a field "salary" that the client doesn't know about, so it is the responsibility of the client to forward that data back to you unchanged, rather than just dropping it on the floor assuming it is unimportant.
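With Jackson, for instance, a client-side DTO can capture unknown fields and forward them back unchanged; a minimal sketch (the DTO name and fields are illustrative):
import com.fasterxml.jackson.annotation.JsonAnyGetter;
import com.fasterxml.jackson.annotation.JsonAnySetter;
import java.util.LinkedHashMap;
import java.util.Map;

public class PersonDto {
    public Long id;
    public String name;
    public String phone;

    // Fields this client version doesn't know about ("salary", ...) land
    // here on deserialization and are written back out on serialization.
    private final Map<String, Object> unknownFields = new LinkedHashMap<>();

    @JsonAnySetter
    public void setUnknown(String key, Object value) {
        unknownFields.put(key, value);
    }

    @JsonAnyGetter
    public Map<String, Object> getUnknown() {
        return unknownFields;
    }
}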
There's a bunch of prior art on this from 15-20 years ago, because people writing messages in XML were facing exactly the same sort of problems. They have left some of their knowledge behind. The easiest way to find it is to search for some key phrases, for instance "must ignore" or "must forward".
See:
Versioning XML Vocabularies
Extensibility, XML Vocabularies, and XML Schema
Events in an event store have the same kinds of problems. Greg Young's book Versioning in an Event Sourced System covers a lot of the same ground (representations of events are also messages).
The accepted answer works well, but it has a huge caveat: it's completely untyped. If the object's fields change, you'll have no compile-time warning that you're looking for the wrong fields.
Therefore I would argue that it's better to force all fields to be present in the request body. Then a null means the user explicitly set it to null, while a missing field results in a 400 Bad Request with a response body describing the error in detail.
Here's a great post on how to achieve this: Configure Jackson to throw an exception when a field is missing
Here's my example in Kotlin:
import com.fasterxml.jackson.annotation.JsonProperty

data class PlacementRequestDto(
    val contentId: Int,
    @param:JsonProperty(required = true)
    val tlxPlacementId: Int?,
    val keywords: List<Int>,
    val placementAdFormats: List<Int>
)
Notice that the nullable field is marked as required. This way the user has to explicitly include it in the request body.
You can control empty or null values as below:
import com.fasterxml.jackson.annotation.JsonInclude;
import java.math.BigDecimal;

public class Person {
    @JsonInclude(JsonInclude.Include.NON_NULL)
    private BigDecimal salary; // salary is omitted from the JSON when it is null
    private String phone;      // phone is allowed to be null/empty
    // same logic for other fields
}
i) As you're updating the weight and removing the phone number, ask the client to send only the fields that need to be updated, along with the record identifier, i.e. id in this case:
{
  "id": 2,
  "weight": 78,
  "phone": null
}
ii) As you're adding salary as one more column that is a mandatory field, the client should be aware of it; you may have to redesign the contract.
Suppose I have the following tables in an Oracle DB.
Foo:
+--------+---------+---------+
| id_foo | string1 | string2 |
+--------+---------+---------+
| 1      | foo     | bar     |
| 2      | baz     | bat     |
+--------+---------+---------+
Bar:
+--------+-----------+--------+
| id_bar | id_foo_fk | string |
+--------+-----------+--------+
| 1      | 1         | boo    |
| 2      | 1         | bum    |
+--------+-----------+--------+
When I insert into Foo using a Dataset and JDBC, such as
Dataset<Row> fooDataset = ...; // Dataset is initialized elsewhere
fooDataset.write().mode(SaveMode.Append).jdbc(url, table, properties);
an ID is auto-generated by the database. Now when I need to save Bar, using the same strategy, I want to be able to link it to Foo, via id_foo_fk.
I looked into some possibilities, such as using monotonically_increasing_id() as suggested in this question, but it won't solve the issue, as I need the ID generated by the database. I tried what was suggested in this question, but it leads to the same issue of unique, non-database IDs.
It's also not possible to select the rows back via JDBC, as string1 and string2 may not be unique. Nor is it possible to change the database; for instance, I can't change the key to a UUID, and I can't add a trigger. It's a legacy database that we can only use as-is.
How can I achieve this? Is this possible with Apache Spark?
I'm not a Java specialist, so you will have to look into the database layer for how to proceed exactly, but there are three ways you can do this (a JDBC sketch follows the list):
You can create a stored procedure, if the database server you are using supports it (most do), and call it from your code.
Create a trigger that returns the generated ID on the first insertion and use it in your next DB insertion.
Use a UUID as the key instead of the database's auto-generated key.
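As a plain-JDBC illustration of letting the database hand back the generated key (this bypasses Spark for the insert; the table and column names come from the question, while the url/user/password variables are assumptions):
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

try (Connection conn = DriverManager.getConnection(url, user, password)) {
    // Insert the parent row and ask Oracle to return the generated id_foo.
    PreparedStatement foo = conn.prepareStatement(
            "INSERT INTO Foo (string1, string2) VALUES (?, ?)",
            new String[] {"id_foo"});
    foo.setString(1, "foo");
    foo.setString(2, "bar");
    foo.executeUpdate();

    long idFoo;
    try (ResultSet keys = foo.getGeneratedKeys()) {
        keys.next();
        idFoo = keys.getLong(1); // the ID generated by the database
    }

    // Link the child row to the parent via the returned key.
    PreparedStatement bar = conn.prepareStatement(
            "INSERT INTO Bar (id_foo_fk, string) VALUES (?, ?)");
    bar.setLong(1, idFoo);
    bar.setString(2, "boo");
    bar.executeUpdate();
}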
Please bear with me. I am trying to unravel a mystery I've seen over the last few days.
I have the following scenario: a table named "Bar" with three columns, "A", "B", and "C", in an Oracle DB. Please refer to Figure 1.
I deployed a new DDL changing its data structure so that it has columns "A" and "D" (dropping "B" and "C"). Please refer to Figure 2.
Now I have a web application (JSF + Spring Data JPA + Hibernate) with a Java entity called "Bar.java" which maps to the table "Bar". However, this entity model's fields have not been updated yet; it still has the "A", "B", and "C" fields (columns). If I deployed the application to an app server (e.g. WebLogic), would Hibernate alter the actual table by adding the "B" and "C" columns back? Please refer to Figure 3.
Figure 1.
+-----------+
| A | B | C |
+-----------+
Figure 2.
+-------+
| A | D |
+-------+
Figure 3.
+---------------+
| A | B | C | D |
+---------------+
In the shop where I work, multiple developers work together, and sometimes their local working branches do not have the others' latest commits. I am in the process of investigating why the table "Bar" keeps getting altered. Let's say I verified that the table was in the Figure 2 state yesterday, but this morning it has switched to Figure 3.
My guess is that some developer's working branch still has the old entity model without them being aware of it; they work on their part of the code, deploy the app, the table is altered, and an exception is thrown in the app server.
Can someone validate this assumption?
[update]
John Bollinger and sbjavateam were right. I found an XML file (not persistence.xml) that contains hibernate.hbm2ddl.auto = update, which in turn updates the table.
Hibernate has the config option hibernate.hbm2ddl.auto with this list of possible values:
validate: validates the schema, makes no changes to the database.
update: updates the schema.
create: creates the schema, destroying previous data.
create-drop: drops the schema when the SessionFactory is closed explicitly, typically when the application is stopped.
Somebody can run the application with hibernate.hbm2ddl.auto=update and all the columns (tables) will be added back.
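If the SessionFactory is built programmatically, one way to guard against this is a sketch like the following, using Hibernate's native Configuration API (a Spring/JPA setup would set the same property in its XML config instead):
import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

// "validate" makes Hibernate compare the mappings against the schema and
// fail fast on a mismatch, instead of issuing ALTER TABLE as "update" does.
Configuration cfg = new Configuration()
        .configure() // reads hibernate.cfg.xml
        .setProperty("hibernate.hbm2ddl.auto", "validate");
SessionFactory sessionFactory = cfg.buildSessionFactory();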
I have created a simple entity with Hibernate with a @Lob String field. Everything works fine in Java; however, I am not able to check the values directly in the DB with psql or pgAdmin.
Here is the definition from the DB:
=> \d+ user_feedback
                    Table "public.user_feedback"
 Column |  Type  | Modifiers | Storage  | Stats target | Description
--------+--------+-----------+----------+--------------+-------------
 id     | bigint | not null  | plain    |              |
 body   | text   |           | extended |              |
Indexes:
    "user_feedback_pkey" PRIMARY KEY, btree (id)
Has OIDs: no
And here is what I get from a select:
=> select * from user_feedback;
 id | body
----+-------
 34 | 16512
 35 | 16513
 36 | 16514
(3 rows)
The actual "body" content is, for all rows, "normal" text, definitely not these numbers.
How do I retrieve the actual value of the body column from psql?
This will store the content of LOB 16512 in the file out.txt:
\lo_export 16512 out.txt
Note that using @Lob is usually not recommended here (database backup issues, ...); see store-strings-of-arbitrary-length-in-postgresql for alternatives.
Hibernate is storing the values as out-of-line objects in the pg_largeobject table, and storing the Object ID for the pg_largeobject entry in your table. See PostgreSQL manual - large objects.
It sounds like you expected inline byte array (bytea) storage instead. If so, you may want to map a byte[] field without a @Lob annotation, rather than a @Lob String. Note that this change will not be backward compatible - you'll have to export your data from the database, then drop the table and re-create it with Hibernate's new definition.
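A sketch of what that remapping could look like (the entity name is an assumption based on the table above):
import javax.persistence.Entity;
import javax.persistence.Id;

@Entity
public class UserFeedback {
    @Id
    private Long id;

    // No @Lob here: Hibernate maps byte[] to an inline bytea column on
    // PostgreSQL, so the content is visible directly in psql.
    private byte[] body;
}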
The selection of how to map your data is made by Hibernate, not PostgreSQL.
See related:
proper hibernate annotation for byte[]
How to store image into postgres database using hibernate
I have a parent-child mapping in Hibernate where the entities are connected through a join table.
The problem is that the column automatically created by Hibernate in this table is called "_actions_id", and Oracle says that the column name "_actions_id" is invalid.
It works fine when I wrap the name in double quotes and execute the script manually, but is there a way to make Hibernate wrap all column names in quotes?
In your example, you specified a join table, which is for scenarios like this:
People table:
PID | Name
1   | Albert
2   | Bob
TelephoneNumbers table:
TID | Tel
1   | 123-456
2   | 456-789
3   | 789-012
Join table:
PID | TID
1   | 1
1   | 2
2   | 3
I.e. the column that connects the current entity to the entity in the collection is in neither the current table nor the table for the collection entity. This is more useful for many-to-many mappings, but you can also use it for @OneToMany if, for example, you don't have control over the TelephoneNumbers table. Otherwise you should just use a plain @JoinColumn.
The usage of @JoinTable has been explained many times on many websites. See the JavaDoc and this question. A sketch contrasting the two mappings follows.
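Entity, table, and column names below follow the example above and are otherwise assumptions:
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.JoinTable;
import javax.persistence.OneToMany;

@Entity
public class Person {
    @Id
    private Long id;

    // Join-table variant: the link rows live in a separate people_phones table.
    @OneToMany
    @JoinTable(name = "people_phones",
               joinColumns = @JoinColumn(name = "pid"),
               inverseJoinColumns = @JoinColumn(name = "tid"))
    private List<TelephoneNumber> numbers;

    // Plain @JoinColumn variant: the foreign key would live in the
    // TelephoneNumbers table itself.
    // @OneToMany
    // @JoinColumn(name = "person_id")
    // private List<TelephoneNumber> numbers;
}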
I think you want a custom NamingStrategy. I got the idea here. In your case, it would be something like:
public class MyNamingStrategy extends DefaultNamingStrategy {
    @Override
    public String logicalCollectionColumnName(String columnName, String propertyName, String referencedColumn) {
        // Backticks tell Hibernate to quote the generated identifier in SQL.
        return "`" + super.logicalCollectionColumnName(columnName, propertyName, referencedColumn) + "`";
    }
}
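For it to take effect, the strategy has to be registered before the SessionFactory is built; a sketch for a classic (pre-5.x) Hibernate setup:
import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;

// Programmatic registration; with JPA/persistence.xml, the equivalent is
// setting the hibernate.ejb.naming_strategy property to the strategy class.
Configuration cfg = new Configuration()
        .setNamingStrategy(new MyNamingStrategy())
        .configure();
SessionFactory sessionFactory = cfg.buildSessionFactory();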