Auto-generating a unique varchar field - mySQL, Java, Hibernate - java

Background : I have a database table called Contact. All users of my system have details of their contacts in this table including a firstname, a lastname, and also a varchar field called 'UniqueId'. Users of the system may put anything in the UniqueId field, as long as it is unique among that user's other contacts' unique ids.
Aim : I now need to change my code so a unique id is automatically generated if the user does not provide one. This should be short and visually pleasing. Ideally it could just be an auto-incrementing number. However, AUTO_INCREMENT works for an integer field, not a varchar field.
Also note that each contact UniqueId needs to be unique among the other contacts of that user, but not necessarily unique across the entire system. Therefore, the following UniqueIds are valid :
Contact
UserId  Firstname  Lastname  UniqueId
1       Bob        Jones     1
1       Harold     Smith     2
2       Joe        Bloggs    1
Question : So, how can I achieve this? Is there a reliable and clean way to get the database to generate a unique id for each contact in the existing UniqueId varchar field (Which is my preference if possible)? Or am I forced to make Java go and get the next available unique id, and if so, what is the most reliable way of doing this? Or any alternative solution?
Edit - 11th April AM: We use hibernate to map our fields. I'm just beginning to research if that may provide an alternative solution? Any opinions?
Edit - 11th April PM: 2 options are currently standing out, but neither seems as ideal as I would like.
1. As @eis suggests, I could have an auto-incrementing field in addition to my current varchar field. Then, either when a contact is saved the int can also be saved in the varchar field, or when a contact is retrieved the int can be used if the varchar is empty. But it feels messy and wrong to use two fields rather than one.
2. I am looking into using a hibernate generator, as discussed here. But this involves holding a count elsewhere of the next id, and Java code, and seems to massively overcomplicate the process.
If my existing uniqueId field had been an int field, AUTO_INCREMENT would simply work, and work nicely. Is there no way to make the database generate this but save it as a String?

I think what you really should do is ditch your current 'uniqueid' and generate new ones that are really unique across the system, being always autogenerated and never provided by the user. You would need to do separate work to migrate to the new system. That's the only way I see to keep it sane. User could provide something like an alias to be more visually pleasing, if needs be.
On the upside, you could use autoincrement then.
Ok, one additional option, if you really really want what you're asking. You could have a prefix like §§§§ that is never allowed for a user, and always autogenerate ids based on that, like §§§§1, §§§§2 etc. If you disallow anything starting with that prefix from the end user, you would know that there would be no collisions, and you could just generate them one-by-one whenever needed.
Sequences would be ideal to generate numbers to it. You don't have sequences in MySQL, but you could emulate them for example like this.
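To make the reserved-prefix idea concrete, here is a minimal Java sketch. The class name ContactIdGenerator, its method names and the in-memory per-user counter are all assumptions made for illustration; in a real system the counter would be the emulated sequence in the database, not a map in memory. The prefix is written with Unicode escapes so the source does not depend on the file encoding.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Hibernate or MySQL.
class ContactIdGenerator {
    // Reserved prefix "§§§§" that user-supplied ids may never start with.
    static final String RESERVED_PREFIX = "\u00A7\u00A7\u00A7\u00A7";

    // Stand-in for a per-user sequence; assumed to live in the database in practice.
    private final Map<Integer, Long> lastValueByUser = new HashMap<>();

    // A user-supplied id is acceptable only if it cannot clash with generated ones.
    static boolean isAllowedUserSuppliedId(String candidate) {
        return candidate != null
                && !candidate.isEmpty()
                && !candidate.startsWith(RESERVED_PREFIX);
    }

    // Hands out prefix + 1, prefix + 2, ... independently for each user.
    synchronized String nextGeneratedId(int userId) {
        long next = lastValueByUser.merge(userId, 1L, Long::sum);
        return RESERVED_PREFIX + next;
    }
}
```

Because generated ids always carry the prefix and user-supplied ids never do, the two ranges cannot collide, and the counter only has to be unique per user.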

I apologize, I really don't know MySQL syntax, but here's how I'd do it in SQL Server. Hopefully that will still have some value to you. Basically, I'm just counting the number of existing contacts and returning it as a varchar.
CREATE FUNCTION GetNewUniqueId
(@UserId int)
RETURNS varchar(3)
AS
BEGIN
DECLARE @Count int;
SELECT @Count = COUNT(*)
FROM Contacts
WHERE UserId = @UserId;
SET @Count = @Count + 1;
RETURN CAST(@Count AS varchar(3));
END
But if you really want something "visually pleasing," why not try returning something more like Firstname + Lastname?
CREATE FUNCTION GetNewUniqueId
(@UserId int, @FirstName varchar(255), @LastName varchar(255))
RETURNS varchar(515)
AS
BEGIN
DECLARE @UniqueId varchar(515), @Count int;
SET @UniqueId = @FirstName + @LastName;
SELECT @Count = COUNT(*)
FROM Contacts
WHERE UserId = @UserId AND LEFT(UniqueId, LEN(@UniqueId)) = @UniqueId;
IF @Count > 0
SET @UniqueId = @UniqueId + '_' + CAST(@Count + 1 AS varchar(3));
RETURN @UniqueId;
END
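For the Java side, a rough equivalent of the name-based idea might look like the sketch below. Note that it is not a line-for-line port: counting rows can hand out an id that is already taken once a contact has been deleted, and a prefix match also counts unrelated ids such as "BobJonesy", so this version probes for the first free suffix instead. The class name NameBasedId and the idea of passing in the user's existing ids as a Set are assumptions for illustration.

```java
import java.util.Set;

// Hypothetical helper; existingIds is assumed to hold the ids already used by ONE user.
class NameBasedId {
    static String suggest(String firstName, String lastName, Set<String> existingIds) {
        String base = firstName + lastName;
        if (!existingIds.contains(base)) {
            return base;                      // e.g. "BobJones" is still free
        }
        int suffix = 2;                       // the second "BobJones" becomes "BobJones_2"
        while (existingIds.contains(base + "_" + suffix)) {
            suffix++;                         // skip suffixes that are already taken
        }
        return base + "_" + suffix;
    }
}
```

This still needs a unique constraint on (UserId, UniqueId) in the database, since two concurrent requests could otherwise be given the same suggestion.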

Related

Any drawbacks of using ID instead of DATE column?

I will be storing the archival users' passwords in the ArchivalPassword table:
CREATE TABLE public.ArchivalPassword (
id SERIAL,
userid INTEGER NOT NULL,
content VARCHAR(100) NOT NULL,
CONSTRAINT archivalpassword_pkey PRIMARY KEY(id),
CONSTRAINT archivalpassword_user FOREIGN KEY (userid)
REFERENCES public.user(id)
ON DELETE CASCADE
ON UPDATE CASCADE
NOT DEFERRABLE
)
WITH (oids = false);
CREATE INDEX fki_archivalpassword_user ON public.archivalpassword
USING btree (userid);
For each user I store the limited number of the passwords (based on the archived.passwords.limit property). If the user changes the password I am fetching the archived passwords number from the ArchivalPassword table and if it is greater than limit I calculate how many have to be deleted and delete them.
The requirement is that I delete the oldest passwords. The question is whether I can assume that a password with a lower ID is older than one with a greater ID. Or do I need to add an EXPIREDAT column (date), which will be used to determine which password needs to be deleted (the one with the oldest date in the EXPIREDAT column)?
Here is the hypothetical EXPIREDAT column definition:
expiredat TIMESTAMP(0) WITH TIME ZONE DEFAULT '2017-03-20 00:00:00+01' NOT NULL;
And the ID sequence definition:
CREATE SEQUENCE public.archivalpassword_id_seq
INCREMENT 1 MINVALUE 1
MAXVALUE 9223372036854775807 START 1
CACHE 1;
Can you see any drawbacks of using the ID column in the described case?
Assuming your id column is something like a BIGSERIAL, it has a sequence definition from which it automatically allocates the next id. Under normal circumstances the ids will reliably be allocated in order as users change their passwords. The sequence definition can, however, be changed manually so that it starts at a different number, and if anyone did this then id numbers would no longer represent chronological order.
I would personally opt to use the EXPIREDAT column though as that will always be accurate and the intention is clear. Not sure why you say "but then i would have to sort the dates instead of the integers" - assuming you are letting Postgres do the sorting I'm not sure why you think there is much difference?
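As a rough illustration of pruning by EXPIREDAT rather than by id, here is a small Java sketch. The classes ArchivedPassword and PasswordArchivePruner are hypothetical, and the list is assumed to hold the archived rows of a single user that were already loaded from the table.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory view of one ArchivalPassword row.
class ArchivedPassword {
    final long id;
    final Instant expiredAt;

    ArchivedPassword(long id, Instant expiredAt) {
        this.id = id;
        this.expiredAt = expiredAt;
    }
}

class PasswordArchivePruner {
    // Returns the ids of the oldest rows that exceed the limit, oldest first.
    static List<Long> idsToDelete(List<ArchivedPassword> archived, int limit) {
        List<Long> ids = new ArrayList<>();
        int excess = archived.size() - limit;
        if (excess <= 0) {
            return ids;                       // within the limit, nothing to delete
        }
        List<ArchivedPassword> sorted = new ArrayList<>(archived);
        // Order by the timestamp, not by id, so a reset sequence cannot mislead us.
        sorted.sort(Comparator.comparing((ArchivedPassword p) -> p.expiredAt));
        for (ArchivedPassword p : sorted.subList(0, excess)) {
            ids.add(p.id);
        }
        return ids;
    }
}
```

In practice the same ordering can be pushed into the database with an ORDER BY on the timestamp column, so only the rows to delete are fetched.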
If you have many users, then an integer (serial data type in Postgres) is faster than a date and time (timestamp data type in Postgres) column for accessing the record. Not sure a date column would be good if the password changes multiple times on the same day.

How to make a primary key start with a specific letter?

Here I am using MySQL and I want my primary key to start with a letter, like D000. Then every time I enter a new record, the primary key auto-increments like so:
D001
D002
D003.
How can I do this?
You can't AUTO_INCREMENT a column whose type is VARCHAR.
What you could do is make it BIGINT and AUTO_INCREMENT, and whenever you need it as String, you can prepend it with your letter 'D' like:
Long dbKey = ...;
String key = "D" + dbKey;
You could create a stored procedure for this to set an "auto-incremented" string as the default value for this column, but it just isn't worth the hassle. Plus, working with numbers is always faster and more efficient than working with strings.
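If the zero-padded form from the question (D001, D002, ...) is wanted, the conversion in both directions can be sketched like this. The class and method names are made up for illustration, and three digits is just the width used in the question; larger keys simply grow wider.

```java
// Hypothetical helper: the database keeps a plain numeric AUTO_INCREMENT key,
// and the "D" form exists only at the edges of the application.
class PrefixedKey {
    // 1 -> "D001", 42 -> "D042", 1234 -> "D1234"
    static String toDisplayKey(long dbKey) {
        return String.format("D%03d", dbKey);
    }

    // "D042" -> 42; anything that does not look like D<digits> is rejected.
    static long toDbKey(String displayKey) {
        if (displayKey == null || displayKey.length() < 2 || displayKey.charAt(0) != 'D') {
            throw new IllegalArgumentException("Not a D-prefixed key: " + displayKey);
        }
        return Long.parseLong(displayKey.substring(1));
    }
}
```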
I'm not sure whether I get your question right, but shouldn't the following work?
CREATE TRIGGER myTrigger
BEFORE INSERT
ON myTable
FOR EACH ROW
BEGIN
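-- CONCAT joins the pieces; LPAD left-pads the number with zeros to 3 digits.
-- Caveat: in a BEFORE INSERT trigger MySQL has not yet assigned the AUTO_INCREMENT value, so NEW.id is 0 unless it is set explicitly.
SET NEW.myCustomId = CONCAT('D', LPAD(NEW.id, 3, '0'));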
END
for this case you NEED a "normal" primary key column..
Two ideas.
(Useless IMHO) I think MariaDB has virtual columns, though MySQL I think not. But you have views. So you could make a normal INT AUTO_INCREMENT column and, in the view, have a calculated column that concatenates your key.
One can use different number ranges for different tables.
ALTER TABLE debtors AUTO_INCREMENT=10000;
ALTER TABLE creditors AUTO_INCREMENT=30000;
ALTER TABLE guests AUTO_INCREMENT=50000;
This admittedly is a lame solution, but might do. I think such a distinction might be what you are aiming at.
Not sure why you need it but you can add the D AFTER you fetched the data (String id = "D" + autoIncId;).
You can't insert a string or anything else non-numeric in an autoincrement field, and I can't see any way this can be useful (all the records will have a D, so it distinguishes none of them).
If you want to declare a row default, you can add a boolean column named DEFAULT.
while(rs.next()){
String id = rs.getBoolean("DEFAULT")?"D":"ND";
id+=rs.getLong(1);
}
EDIT
As per your comment I understand that you want to select the max ID and add 1 to it. Then it's ok to use an autoincrement field in your DB and it must be a number type (INTEGER, BIGINT...).
Please FORGET about adding the "D" to your primary key; it is simply not going to work as you want. The autoincrement takes the last inserted ID and adds 1 to it. If your last id is "D3", adding 1 has the same meaning as adding 4 to "apple". You are using different types.
There is no way for SQL or any other programming language to understand that if you add 1 to "D3" it should become "D4". What you need to do is get rid of that D (whose purpose I still don't understand).
You may try this aberration at your own risk:
INSERT INTO table (id, a, b, c)
VALUES ( fn_get_key( LAST_INSERT_ID("table_name ") +1), "a", "b", "c");
Where fn_get_key is a function that will convert the number into your desired string AND will execute:
ALTER TABLE table_name AUTO_INCREMENT = start_value;
Anyway, I do not recommend your approach. Numeric keys are faster and easier to sort. You could always create a view that transforms the ID, or use logic to change from the "D001" key to "1". Enforcing foreign keys and uniqueness of ids will be harder and more expensive.

How to determine if a table contains a value in SQL?

I feel like I'm missing something very obvious here, but it seems that the only way to go about doing this is to get the value, and then see if it returns a null (empty) value, which I would rather not do.
Is there an equivalent to List.contains(Object o) in SQL? Or perhaps the JDBC has something of that nature? If so, what is it?
I am using Microsoft Access 2013.
Unfortunately I don't have any useful code to show, but here is the gist of what I am trying to do. It isn't anything unique at all. I want to have a method (Java) that returns the values of a user that are stored in the database. If the user has not previously been added to the database, the user should be added, and the default values of the user should be set. Then those newly created values will be returned. If a player has already been added to the database (with the username as the primary key), I don't want to overwrite the data that is already there.
I would also advise against using MS Access for this purpose, but if you are familiar with MS Office applications, the familiar UI/UX structure might help you get your footing and require less time to learn other database environments. However, MS Access tends to be quite limited, and I would advise considering alternative options if available.
The only way to see if an SQL table contains a row with some condition on a column is to actually make an SQL query. I don't see why you wouldn't do that. Just make sure that you have an index on the column that you will be constraining the results on. Also, for better speed, use COUNT to avoid retrieving all the data from the rows.
SELECT count(*) FROM foos WHERE bar = 'baz'
Assuming you have an index on the bar column this query should be pretty fast and all you have to do is check whether it returns > 0. If it does then you have rows matching your criteria.
You can use "IF EXISTS" which returns a boolean value of 1 or 0.
select
if(
exists( select * from date1 where current_date()>now() ),
'today > now',
'today is not > now'
) as 'today > now ?' ;
+--------------------+
| today > now? |
+--------------------+
| today is not > now |
+--------------------+
1 row in set (0.00 sec)
Another Example:
SELECT IF(
EXISTS( SELECT col from tbl where id='n' ),
colX, colY
) AS 'result'
FROM TBL;
I'm also new to sql and I'm using Oracle.
In Oracle, suppose we have: TYPE: value.
We can use:
where value not in (select TYPE from table)
to make sure the value does not exist in the TYPE column of the table.
Don't know if it helps.
You can simply use Query with condition.
For example, if you have to check for records with a particular column value, you can use a where condition
select * from table where column1 = 'checkvalue'
You can use count to check the number of records matching your specified condition
select count(*) from table where column1 = 'checkvalue'
I have created the following method, which to my knowledge works perfectly. (Using the java.sql package)
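//Requires java.sql.PreparedStatement, java.sql.ResultSet and java.sql.SQLException.
public boolean containsUser(String username) throws SQLException
{
    //connection is the Connection object used to connect to my Access database.
    //"Users" is the name of the table, "Username" is the primary key.
    //The ? placeholder keeps the username out of the SQL text itself.
    String sql = "SELECT Username FROM Users WHERE Username = ?";
    try (PreparedStatement statement = this.connection.prepareStatement(sql))
    {
        statement.setString(1, username);
        try (ResultSet result = statement.executeQuery())
        {
            //There is no need for a loop because the primary key is unique.
            return result.next();
        }
    }
}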
It's an extremely simple and extremely basic method, but hopefully it might help someone in the future.
If there is anything wrong with it, please let me know. I don't want anyone learning from or using poorly written code.
IMPORTANT EDIT: It is now over half a decade after I wrote the above content (both question and answer), and I now advise against the solution I illustrated above.
While it does work, it prioritizes a "Java-mindset-friendly" approach to SQL. In short, it is typically a bad idea to migrate paradigms and mindsets of one language to another, as it is inevitable that you will eventually find yourself trying to fit a square peg into a round hole. The only way to make that work is to shave the corners off the square. The peg will then of course fit, but as you can imagine, starting with a circle peg in the first place would have been the better, cleaner, and less messy solution.
Instead, refer to the above upvoted answers for a more realistic, enterprise-friendly solution to this problem, especially as I imagine the people reading this are likely in a similar situation as I was when I originally wrote this.
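A caution for anyone who builds this kind of lookup by concatenating the username straight into the SQL text: the input can change the meaning of the query. The tiny sketch below only builds the string, it does not touch a database, and the class name is made up for illustration.

```java
// Hypothetical demo class; it shows the SQL text produced by string concatenation.
class QueryConcatenationDemo {
    static String buildConcatenatedSql(String username) {
        // The username is pasted into the statement verbatim, quotes included.
        return "SELECT * FROM Users WHERE Username = '" + username + "'";
    }
}
```

With the input x' OR '1'='1 the resulting text ends in WHERE Username = 'x' OR '1'='1', a condition that is true for every row. Binding the value through a PreparedStatement placeholder avoids this, because the value is never parsed as SQL.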

How to retrieve only the information that got changed from Cassandra?

I am designing a Cassandra column family schema for the use case below, and I am not sure what the best design is. I will be using CQL with the Datastax Java driver.
Below is my use case and the sample schema that I have designed for now -
SCHEMA_ID  RECORD_NAME  SCHEMA_VALUE       TIMESTAMP
1          ABC          some value         t1
2          ABC          some_other_value   t2
3          DEF          some value again   t3
4          DEF          some other value   t4
5          GHI          some new value     t5
6          IOP          some values again  t6
Now what I will be looking from the above table is something like this -
When my application first starts, I will ask for everything in the above table.
Then, every 5 or 10 minutes, a background thread will check this table and ask only for what has changed (the full row, if anything in that row changed). That is why I have a timestamp as one of the columns.
But I am not sure how to design the table and the query pattern so that both use cases are satisfied easily. I am thinking of using SCHEMA_ID as the primary key.
I will be using CQL and the Datastax Java driver for this.
Update:-
If I am using something like this, then is there any problem with this approach?
CREATE TABLE TEST (SCHEMA_ID TEXT, RECORD_NAME TEXT, SCHEMA_VALUE TEXT, LAST_MODIFIED_DATE TIMESTAMP, PRIMARY KEY (SCHEMA_ID));
INSERT INTO TEST (SCHEMA_ID, RECORD_NAME, SCHEMA_VALUE, LAST_MODIFIED_DATE) VALUES ('1', 't26', 'SOME_VALUE', 1382655211694);
Because in this use case I don't want anybody to insert the same SCHEMA_ID more than once. SCHEMA_ID should be unique whenever we insert a new row into this table. So with your example (@omnibear), might it be possible for somebody to insert the same SCHEMA_ID twice? Am I correct?
And also regarding type you have taken as an extra column, that type column can be record_name in my example..
Regarding 1)
Cassandra is used for heavy writing, lots of data on multiple nodes. To retrieve ALL data from this kind of set-up is daring since this might involve huge amounts that have to be handled by one client. A better approach would be to use pagination. This is natively supported in 2.0.
Regarding 2)
The point is that partition keys only support EQ or IN queries. For LT or GT (< / >) you use column keys. So if it makes sense to group your entries by some ID like "type", you can use this for your partition key, and a timeuuid as a column key. This allows you to query for all entries newer than X, like so:
create table test
(type int, SCHEMA_ID int, RECORD_NAME text,
SCHEMA_VALUE text, TIMESTAMP timeuuid,
primary key (type, timestamp));
select * from test where type IN (0,1,2,3) and timestamp > 58e0a7d7-eebc-11d8-9669-0800200c9a66;
Update:
You asked:
somebody can insert same SCHEMA_ID twice? Am I correct?
Yes, you can always make an insert with an existing primary key. The values at that primary key will be updated. Therefore, to preserve uniqueness, a UUID is often used in the primary key, for instance, timeuuid. It is a unique value containing a timestamp and the MAC address of the client. There is excellent documentation on this topic.
General advice:
Write down your queries first, then design your model. (Use case!)
Your queries define your data model which in turn is primarily defined by your primary keys.
So, in your case, I'd just adapt my schema above, like so:
CREATE TABLE TEST (SCHEMA_ID TEXT, RECORD_NAME TEXT, SCHEMA_VALUE TEXT,
LAST_MODIFIED_DATE TIMEUUID, PRIMARY KEY (RECORD_NAME, LAST_MODIFIED_DATE));
Which allows this query:
select * from test where RECORD_NAME IN ('componentA','componentB')
and LAST_MODIFIED_DATE > 1688f180-4141-11e3-aa6e-0800200c9a66;
The uuid corresponds to Wednesday, October 30, 2013 8:55:55 AM GMT,
so you would fetch everything after that.
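To check which point in time a timeuuid such as the one in the query stands for, the timestamp can be decoded with plain java.util.UUID. This is a minimal sketch; the class name TimeUuids is an assumption, and the result is truncated to milliseconds.

```java
import java.time.Instant;
import java.util.UUID;

// Hypothetical helper for inspecting version 1 (time-based) UUIDs.
class TimeUuids {
    // Number of 100-nanosecond intervals between the UUID epoch (1582-10-15)
    // and the Unix epoch (1970-01-01).
    private static final long UUID_EPOCH_OFFSET = 0x01B21DD213814000L;

    static Instant toInstant(UUID timeUuid) {
        if (timeUuid.version() != 1) {
            throw new IllegalArgumentException("Not a time-based UUID: " + timeUuid);
        }
        // UUID.timestamp() counts 100-ns intervals since the UUID epoch.
        long hundredNanosSinceUnixEpoch = timeUuid.timestamp() - UUID_EPOCH_OFFSET;
        return Instant.ofEpochMilli(hundredNanosSinceUnixEpoch / 10_000);
    }
}
```

For example, TimeUuids.toInstant(UUID.fromString("1688f180-4141-11e3-aa6e-0800200c9a66")) gives 2013-10-30T08:55:55.800Z, which matches the date quoted for that uuid.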

Hibernate and padding on CHAR primary key column in Oracle

I'm having a little trouble using Hibernate with a char(6) column in Oracle. Here's the structure of the table:
CREATE TABLE ACCEPTANCE
(
USER_ID char(6) PRIMARY KEY NOT NULL,
ACCEPT_DATE date
);
For records whose user id has less than 6 characters, I can select them without padding the user id when running queries using SQuirreL. I.E. the following returns a record if there's a record with a user id of "abc".
select * from acceptance where user_id = 'abc'
Unfortunately, when doing the select via Hibernate (JPA), the following returns null:
em.find(Acceptance.class, "abc");
If I pad the value though, it returns the correct record:
em.find(Acceptance.class, "abc   ");
The module that I'm working on gets the user id unpadded from other parts of the system. Is there a better way to get Hibernate working other than putting in code to adapt the user id to a certain length before giving it to Hibernate? (which could present maintenance issues down the road if the length ever changes)
That's God's way of telling you to never use CHAR() for primary key :-)
Seriously, however, since your user_id is mapped as String in your entity Hibernate's Oracle dialect translates that into varchar. Since Hibernate uses prepared statements for all its queries, that semantics carries over (unlike SQuirreL, where the value is specified as literal and thus is converted differently).
Based on Oracle type conversion rules column value is then promoted to varchar2 and compared as such; thus you get back no records.
If you can't change the underlying column type, your best option is probably to use HQL query and rtrim() function which is supported by Oracle dialect.
How come that your module gets an unpadded value from other parts of the system?
According to my understanding, if the other part of the system don't alter the PK, they should read 6 chars from the db and pass 6 chars all along the way -- that would be ok. The only exception would be when a PK is generated, in which case it may need to be padded.
You can circumvent the problem (by trimming or padding the value each time it's necessary), but that won't solve the underlying problem that your PK is not handled consistently. To solve the problem upfront you must either
always receive 6 chars from the other parts of the module
use varchar2 to deal with dynamic size correctly
If you can't solve the problem upfront, then you will indeed need to either
add trimming/padding all around the place when necessary
add trimming/padding in the DAO if you have one
add trimming/padding in the user type if this works (suggestion from N. Hughes)
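If the padding route is taken, keeping it in one small helper (for instance inside the DAO) at least confines the column width to a single place. A minimal sketch, with a made-up class name and the width passed in by the caller:

```java
// Hypothetical helper for CHAR(n) keys; width is the declared column length.
class FixedWidthId {
    static String padToColumnWidth(String id, int width) {
        if (id.length() > width) {
            throw new IllegalArgumentException(
                    "Id '" + id + "' is longer than the column width " + width);
        }
        // "%-6s" left-justifies the value and fills the rest with spaces.
        return String.format("%-" + width + "s", id);
    }
}
```

A lookup would then read em.find(Acceptance.class, FixedWidthId.padToColumnWidth("abc", 6)), and a later change of the column length only touches the one place where the width is defined.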
