I am trying to create a new database and new table using Mybatis and SQLite. I found from previous answers (1, 2, 3) that Mybatis does support using CREATE and ALTER statements, by marking them as "UPDATE" within Mybatis mapper syntax. However, those questions/answers were using Mapper XML whereas I'm using annotations, and also none were using SQLite.
SQLite creates a new database as soon as you open a new connection to it, so it doesn't matter if the DB exists before or not. A new database is created with a size of zero bytes, which is fine (SQLite treats a 0 byte file as an empty database). But after the table creation I would expect the database size to be non-zero as it stores the table structure for that table. After running my code which I think should create the table (I'm checking my syntax against this answer), the database size still reads as 0 bytes, which says to me that the table has not actually been created. What am I doing wrong?
My Java code to test this scenario:
public class Example {
public static void main(String[] args) {
String userHomePath = System.getProperty("user.home");
File exampleDb = new File(userHomePath, "example.sqlite3");
String jdbcConnectionString = "jdbc:sqlite:" + exampleDb.getAbsolutePath();
DataSource dataSource = new PooledDataSource("org.sqlite.JDBC", jdbcConnectionString, null, null);
Environment environment = new Environment("Main", new JdbcTransactionFactory(), dataSource);
Configuration configuration = new Configuration(environment);
configuration.addMapper(GenericMapper.class);
SqlSessionFactoryBuilder builder = new SqlSessionFactoryBuilder();
SqlSessionFactory sessionFactory = builder.build(configuration);
try (SqlSession session = sessionFactory.openSession()) {
GenericMapper genericMapper = session.getMapper(GenericMapper.class);
genericMapper.createExampleTableIfMissing();
}
}
}
My mapper:
public interface GenericMapper {
@Update("CREATE TABLE IF NOT EXISTS extbl (id INTEGER PRIMARY KEY AUTOINCREMENT)")
void createExampleTableIfMissing();
}
Checking the file after this code has run:
C:\Users\me>dir example.sqlite3
Volume in drive C is Windows
Volume Serial Number is D4DE-B46A
Directory of C:\Users\me
12/04/2021 18:14 0 example.sqlite3
1 File(s) 0 bytes
0 Dir(s) 27,326,779,392 bytes free
C:\Users\me>
I am running data.bat file with the following lines:
Rem This batch file will populate tables
cd\program files\Microsoft SQL Server\MSSQL
osql -U sa -P Password -d MyBusiness -i c:\data.sql
The contents of the data.sql file are:
insert Customers
(CustomerID, CompanyName, Phone)
Values('101','Southwinds','19126602729')
There are 8 more similar lines for adding records.
When I run this with start > run > cmd > c:\data.bat, I get this error message:
1>2>3>4>5>....<1 row affected>
Msg 8152, Level 16, State 4, Server SP1001, Line 1
string or binary data would be truncated.
<1 row affected>
<1 row affected>
<1 row affected>
<1 row affected>
<1 row affected>
<1 row affected>
Also, I am a newbie obviously, but what do Level #, and state # mean, and how do I look up error messages such as the one above: 8152?
From @gmmastros's answer:
Whenever you see the message....
string or binary data would be truncated
Think to yourself... The field is NOT big enough to hold my data.
Check the table structure for the customers table. I think you'll find that the length of one or more fields is NOT big enough to hold the data you are trying to insert. For example, if the Phone field is a varchar(8) field, and you try to put 11 characters in to it, you will get this error.
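To make that check concrete, here is a minimal Java sketch that compares each value against a declared column width before the INSERT is sent. The widths in `main` are hypothetical (only the varchar(8) Phone case is taken from the example above); the real ones have to be read from the Customers table definition, and `ColumnWidthCheck` is just an illustrative name.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ColumnWidthCheck {

    // Returns one message per value that is longer than its column's declared width.
    static List<String> findTooLong(Map<String, Integer> maxWidths, Map<String, String> row) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, String> field : row.entrySet()) {
            Integer max = maxWidths.get(field.getKey());
            String value = field.getValue();
            if (max != null && value != null && value.length() > max) {
                problems.add(field.getKey() + ": " + value.length()
                        + " characters, column allows " + max);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Hypothetical widths; the real ones come from the table definition.
        Map<String, Integer> widths = new LinkedHashMap<>();
        widths.put("CustomerID", 5);
        widths.put("CompanyName", 40);
        widths.put("Phone", 8);

        // The row from the question's data.sql.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("CustomerID", "101");
        row.put("CompanyName", "Southwinds");
        row.put("Phone", "19126602729");

        // prints: Phone: 11 characters, column allows 8
        findTooLong(widths, row).forEach(System.out::println);
    }
}
```

This only counts characters, so it is a rough pre-check and not a replacement for looking at the actual table structure.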
I had this issue although data length was shorter than the field length.
It turned out that the problem was having another log table (for audit trail), filled by a trigger on the main table, where the column size also had to be changed.
In one of the INSERT statements you are attempting to insert a too long string into a string (varchar or nvarchar) column.
If it's not obvious which INSERT is the offender by a mere look at the script, you could count the <1 row affected> lines that occur before the error message. The obtained number plus one gives you the statement number. In your case it seems to be the second INSERT that produces the error.
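If the script is long, the counting can be automated. The sketch below is an assumption-laden helper (the class name `OsqlOutputScanner` is made up): it treats every output line containing "row affected" as one successful statement and stops at the first line starting with "Msg ". It assumes each statement produces exactly one such line, which is not true if triggers report their own row counts.

```java
import java.util.Arrays;
import java.util.List;

final class OsqlOutputScanner {

    // Returns the 1-based number of the statement that raised the first error,
    // or -1 when the output contains no "Msg" line.
    static int failingStatement(List<String> outputLines) {
        int succeeded = 0;
        for (String line : outputLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("Msg ")) {
                return succeeded + 1;
            }
            if (trimmed.contains("row affected")) {
                succeeded++;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // First lines of the output shown in the question.
        List<String> output = Arrays.asList(
                "1>2>3>4>5>....<1 row affected>",
                "Msg 8152, Level 16, State 4, Server SP1001, Line 1",
                "string or binary data would be truncated.",
                "<1 row affected>");
        System.out.println(failingStatement(output)); // prints 2
    }
}
```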
Just to contribute some additional information: I had the same issue, and it was because the field wasn't big enough for the incoming data. This thread helped me solve it (the top answer clarifies it all).
However, it is very important to know the possible causes.
In my case I was creating the table with a field like this:
Select '' as Period, * Into #NewTable From Transactions
Therefore the field "Period" had a length of zero, which caused the insert operations to fail. I changed it to "XXXXXX", which is the length of the incoming data, and it then worked properly (because the field now had a length of 6).
I hope this helps anyone with the same issue :)
Some of your data does not fit into a database column because the column is too small. It is not easy to find which one. If you use C# and Linq2Sql, you can list the fields which would be truncated:
First create a helper class:
public class SqlTruncationExceptionWithDetails : ArgumentOutOfRangeException
{
public SqlTruncationExceptionWithDetails(System.Data.SqlClient.SqlException inner, DataContext context)
: base(inner.Message + " " + GetSqlTruncationExceptionWithDetailsString(context))
{
}
/// <summary>
/// Part of the code comes from the following link
/// http://stackoverflow.com/questions/3666954/string-or-binary-data-would-be-truncated-linq-exception-cant-find-which-fiel
/// </summary>
/// <param name="context"></param>
/// <returns></returns>
static string GetSqlTruncationExceptionWithDetailsString(DataContext context)
{
StringBuilder sb = new StringBuilder();
foreach (object update in context.GetChangeSet().Updates)
{
FindLongStrings(update, sb);
}
foreach (object insert in context.GetChangeSet().Inserts)
{
FindLongStrings(insert, sb);
}
return sb.ToString();
}
public static void FindLongStrings(object testObject, StringBuilder sb)
{
foreach (var propInfo in testObject.GetType().GetProperties())
{
foreach (System.Data.Linq.Mapping.ColumnAttribute attribute in propInfo.GetCustomAttributes(typeof(System.Data.Linq.Mapping.ColumnAttribute), true))
{
if (attribute.DbType.ToLower().Contains("varchar"))
{
string dbType = attribute.DbType.ToLower();
int numberStartIndex = dbType.IndexOf("varchar(") + 8;
int numberEndIndex = dbType.IndexOf(")", numberStartIndex);
string lengthString = dbType.Substring(numberStartIndex, (numberEndIndex - numberStartIndex));
int maxLength = 0;
int.TryParse(lengthString, out maxLength);
string currentValue = (string)propInfo.GetValue(testObject, null);
if (!string.IsNullOrEmpty(currentValue) && maxLength != 0 && currentValue.Length > maxLength)
{
//string is too long
sb.AppendLine(testObject.GetType().Name + "." + propInfo.Name + " " + currentValue + " Max: " + maxLength);
}
}
}
}
}
}
Then prepare the wrapper for SubmitChanges:
public static class DataContextExtensions
{
public static void SubmitChangesWithDetailException(this DataContext dataContext)
{
//http://stackoverflow.com/questions/3666954/string-or-binary-data-would-be-truncated-linq-exception-cant-find-which-fiel
try
{
//this can fail on data truncation
dataContext.SubmitChanges();
}
catch (SqlException sqlException) //when (sqlException.Message == "String or binary data would be truncated.")
{
if (sqlException.Message == "String or binary data would be truncated.") //only for EN Windows - if you are running a different Windows language, read sqlException.Message on a thread with EN culture
throw new SqlTruncationExceptionWithDetails(sqlException, dataContext);
else
throw;
}
}
}
Prepare a global exception handler and log the truncation details:
protected void Application_Error(object sender, EventArgs e)
{
Exception ex = Server.GetLastError();
string message = ex.Message;
//TODO - log to file
}
Finally use the code:
Datamodel.SubmitChangesWithDetailException();
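A note on the message comparison used in the wrapper: since it depends on the server language, comparing the error number is usually more robust. The sketch below shows the idea in Java rather than C#, under the assumption that the JDBC driver reports the SQL Server message number as the vendor code of `SQLException`. 8152 is the number from the question; 2628 is the number newer builds can report instead. The class name `TruncationErrors` is hypothetical.

```java
import java.sql.SQLException;

final class TruncationErrors {

    // "String or binary data would be truncated." (message number from the question)
    static final int LEGACY_TRUNCATION = 8152;
    // Replacement message that also names the column and the offending value.
    static final int DETAILED_TRUNCATION = 2628;

    // True when the vendor code identifies a truncation error, whatever the message language.
    static boolean isTruncation(SQLException e) {
        int code = e.getErrorCode();
        return code == LEGACY_TRUNCATION || code == DETAILED_TRUNCATION;
    }

    public static void main(String[] args) {
        // Simulated exception; a real one would come from the driver.
        SQLException simulated =
                new SQLException("String or binary data would be truncated.", null, 8152);
        System.out.println(isTruncation(simulated)); // prints true
    }
}
```

In the C# wrapper the same idea would mean checking the exception's error number instead of its message text.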
Another situation in which you can get this error is the following:
I had the same error, and the reason was that in an INSERT statement that received data from a UNION, the order of the columns in one branch was different from the original table. In the example below, changing the column order in the #table3 select to a, b, c fixes the error.
select a, b, c into #table1
from #table0
insert into #table1
select a, b, c from #table2
union
select a, c, b from #table3
On SQL Server you can use SET ANSI_WARNINGS OFF, which makes the server truncate the data silently instead of raising the error, like this:
using (SqlConnection conn = new SqlConnection("Data Source=XRAYGOAT\\SQLEXPRESS;Initial Catalog='Healthy Care';Integrated Security=True"))
{
conn.Open();
using (var trans = conn.BeginTransaction())
{
try
{
using (var cmd = new SqlCommand("", conn, trans))
{
cmd.CommandText = "SET ANSI_WARNINGS OFF";
cmd.ExecuteNonQuery();
cmd.CommandText = "YOUR INSERT HERE";
cmd.ExecuteNonQuery();
cmd.Parameters.Clear();
cmd.CommandText = "SET ANSI_WARNINGS ON";
cmd.ExecuteNonQuery();
trans.Commit();
}
}
catch (Exception)
{
trans.Rollback();
}
}
conn.Close();
}
I had the same issue. The length of my column was too short.
What you can do is either increase the length or shorten the text you want to put in the database.
I also had this problem surface in a web application.
Eventually I found out that the same error message came from a SQL update statement on a specific table.
Finally I figured out that the column definitions in the related history table(s) did not match the original table's column lengths for some nvarchar columns.
I had the same problem, even after increasing the size of the problematic columns in the table.
tl;dr: The length of the matching columns in corresponding Table Types may also need to be increased.
In my case, the error was coming from the Data Export service in Microsoft Dynamics CRM, which allows CRM data to be synced to an SQL Server DB or Azure SQL DB.
After a lengthy investigation, I concluded that the Data Export service must be using Table-Valued Parameters:
You can use table-valued parameters to send multiple rows of data to a Transact-SQL statement or a routine, such as a stored procedure or function, without creating a temporary table or many parameters.
As you can see in the documentation above, Table Types are used to create the data ingestion procedure:
CREATE TYPE LocationTableType AS TABLE (...);
CREATE PROCEDURE dbo.usp_InsertProductionLocation
@TVP LocationTableType READONLY
Unfortunately, there is no way to alter a Table Type, so it has to be dropped & recreated entirely. Since my table has over 300 fields (😱), I created a query to facilitate the creation of the corresponding Table Type based on the table's columns definition (just replace [table_name] with your table's name):
SELECT 'CREATE TYPE [table_name]Type AS TABLE (' + STRING_AGG(CAST(field AS VARCHAR(max)), ',' + CHAR(10)) + ');' AS create_type
FROM (
SELECT TOP 5000 COLUMN_NAME + ' ' + DATA_TYPE
+ IIF(CHARACTER_MAXIMUM_LENGTH IS NULL, '', CONCAT('(', IIF(CHARACTER_MAXIMUM_LENGTH = -1, 'max', CONCAT(CHARACTER_MAXIMUM_LENGTH,'')), ')'))
+ IIF(DATA_TYPE = 'decimal', CONCAT('(', NUMERIC_PRECISION, ',', NUMERIC_SCALE, ')'), '')
AS field
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = '[table_name]'
ORDER BY ORDINAL_POSITION) AS T;
After updating the Table Type, the Data Export service started functioning properly once again! :)
When I tried to execute my stored procedure I had the same problem, because the column I was adding data to was shorter than the data I wanted to add.
You can either increase the size of the column's data type or reduce the length of your data.
A SQL Server 2016/2017 update will show you the bad value and column.
A new trace flag swaps the old error for a new 2628 error, which prints out the column and the offending value. Trace flag 460 is available in the latest cumulative update for 2016 and 2017:
https://support.microsoft.com/en-sg/help/4468101/optional-replacement-for-string-or-binary-data-would-be-truncated
Just make sure that after you've installed the CU that you enable the trace flag, either globally/permanently on the server:
...or with DBCC TRACEON:
https://learn.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-traceon-trace-flags-transact-sql?view=sql-server-ver15
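For reference, the DBCC form is `DBCC TRACEON (460, -1)` for all sessions, or `DBCC TRACEON (460)` for the current session only. Below is a small Java sketch of issuing it over an existing JDBC connection. The class and method names are hypothetical, the connection is assumed to belong to a login that is allowed to run DBCC TRACEON, and a flag set this way does not survive a server restart.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class TraceFlag460 {

    // Builds the DBCC TRACEON statement; -1 as the second argument applies the flag globally.
    static String traceOnSql(int flag, boolean global) {
        return global
                ? "DBCC TRACEON (" + flag + ", -1);"
                : "DBCC TRACEON (" + flag + ");";
    }

    // Turns trace flag 460 on through an already opened connection.
    static void enable(Connection connection, boolean global) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(traceOnSql(460, global));
        }
    }

    public static void main(String[] args) {
        System.out.println(traceOnSql(460, true));  // prints DBCC TRACEON (460, -1);
        System.out.println(traceOnSql(460, false)); // prints DBCC TRACEON (460);
    }
}
```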
Another situation in which this error may occur is in SQL Server Management Studio, if you have "text" or "ntext" fields in your table, no matter what kind of field you are updating (for example bit or integer).
It seems that the Studio does not load entire "ntext" fields and also updates ALL fields instead of only the modified one.
To solve the problem, exclude "text" or "ntext" fields from the query in Management Studio.
This error occurs only when the value for one of your fields is longer than the field length specified in the SQL Server table structure.
To overcome this issue, either reduce the length of the field value
or increase the length of the database table field.
If someone is encountering this error in a C# application, I have created a simple way of finding offending fields by:
Getting the column width of all the columns of a table where we're trying to make this insert/ update. (I'm getting this info directly from the database.)
Comparing the column widths to the width of the values we're trying to insert/ update.
Assumptions/ Limitations:
The column names of the table in the database match with the C# entity fields. For eg: If you have a column like this in database:
You need to have your Entity with the same column name:
public class SomeTable
{
// Other fields
public string SourceData { get; set; }
}
You're inserting/ updating 1 entity at a time. It'll be clearer in the demo code below. (If you're doing bulk inserts/ updates, you might want to either modify it or use some other solution.)
Step 1:
Get the column width of all the columns directly from the database:
// For this, I took help from Microsoft docs website:
// https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlconnection.getschema?view=netframework-4.7.2#System_Data_SqlClient_SqlConnection_GetSchema_System_String_System_String___
private static Dictionary<string, int> GetColumnSizesOfTableFromDatabase(string tableName, string connectionString)
{
var columnSizes = new Dictionary<string, int>();
using (var connection = new SqlConnection(connectionString))
{
// Connect to the database then retrieve the schema information.
connection.Open();
// You can specify the Catalog, Schema, Table Name, Column Name to get the specified column(s).
// You can use four restrictions for Column, so you should create a 4 members array.
String[] columnRestrictions = new String[4];
// For the array, 0-member represents Catalog; 1-member represents Schema;
// 2-member represents Table Name; 3-member represents Column Name.
// Now we specify the Table_Name and Column_Name of the columns what we want to get schema information.
columnRestrictions[2] = tableName;
DataTable allColumnsSchemaTable = connection.GetSchema("Columns", columnRestrictions);
foreach (DataRow row in allColumnsSchemaTable.Rows)
{
var columnName = row.Field<string>("COLUMN_NAME");
//var dataType = row.Field<string>("DATA_TYPE");
var characterMaxLength = row.Field<int?>("CHARACTER_MAXIMUM_LENGTH");
// I'm only capturing columns whose Datatype is "varchar" or "char", i.e. their CHARACTER_MAXIMUM_LENGTH won't be null.
if(characterMaxLength != null)
{
columnSizes.Add(columnName, characterMaxLength.Value);
}
}
connection.Close();
}
return columnSizes;
}
Step 2:
Compare the column widths with the width of the values we're trying to insert/ update:
public static Dictionary<string, string> FindLongBinaryOrStringFields<T>(T entity, string connectionString)
{
var tableName = typeof(T).Name;
Dictionary<string, string> longFields = new Dictionary<string, string>();
var objectProperties = GetProperties(entity);
//var fieldNames = objectProperties.Select(p => p.Name).ToList();
var actualDatabaseColumnSizes = GetColumnSizesOfTableFromDatabase(tableName, connectionString);
foreach (var dbColumn in actualDatabaseColumnSizes)
{
var maxLengthOfThisColumn = dbColumn.Value;
var currentValueOfThisField = objectProperties.FirstOrDefault(f => f.Name == dbColumn.Key)?.GetValue(entity, null)?.ToString();
if (!string.IsNullOrEmpty(currentValueOfThisField) && currentValueOfThisField.Length > maxLengthOfThisColumn)
{
longFields.Add(dbColumn.Key, $"'{dbColumn.Key}' column cannot take the value of '{currentValueOfThisField}' because the max length it can take is {maxLengthOfThisColumn}.");
}
}
return longFields;
}
public static List<PropertyInfo> GetProperties<T>(T entity)
{
//The DeclaredOnly flag makes sure you only get properties of the object, not from the classes it derives from.
var properties = entity.GetType()
.GetProperties(System.Reflection.BindingFlags.Public
| System.Reflection.BindingFlags.Instance
| System.Reflection.BindingFlags.DeclaredOnly)
.ToList();
return properties;
}
Demo:
Let's say we're trying to insert someTableEntity of SomeTable class that is modeled in our app like so:
public class SomeTable
{
[Key]
public long TicketID { get; set; }
public string SourceData { get; set; }
}
And it's inside our SomeDbContext like so:
public class SomeDbContext : DbContext
{
public DbSet<SomeTable> SomeTables { get; set; }
}
This table in Db has SourceData field as varchar(16) like so:
Now we'll try to insert value that is longer than 16 characters into this field and capture this information:
public void SaveSomeTableEntity()
{
var connectionString = "server=SERVER_NAME;database=DB_NAME;User ID=SOME_ID;Password=SOME_PASSWORD;Connection Timeout=200";
using (var context = new SomeDbContext(connectionString))
{
var someTableEntity = new SomeTable()
{
SourceData = "Blah-Blah-Blah-Blah-Blah-Blah"
};
context.SomeTables.Add(someTableEntity);
try
{
context.SaveChanges();
}
catch (Exception ex)
{
if (ex.GetBaseException().Message == "String or binary data would be truncated.\r\nThe statement has been terminated.")
{
var badFieldsReport = "";
List<string> badFields = new List<string>();
// YOU GOT YOUR FIELDS RIGHT HERE:
var longFields = FindLongBinaryOrStringFields(someTableEntity, connectionString);
foreach (var longField in longFields)
{
badFields.Add(longField.Key);
badFieldsReport += longField.Value + "\n";
}
}
else
throw;
}
}
}
The badFieldsReport will have this value:
'SourceData' column cannot take the value of
'Blah-Blah-Blah-Blah-Blah-Blah' because the max length it can take is
16.
Kevin Pope's comment under the accepted answer was what I needed.
The problem, in my case, was that I had triggers defined on my table that would write update/insert transactions into an audit table, but the audit table had a data type mismatch: a column that was VARCHAR(MAX) in the original table was stored as VARCHAR(1) in the audit table. So my triggers were failing whenever I inserted anything longer than one character into the original table column, and I would get this error message.
I used a different tactic: some of the source fields are allocated 8K in some places, but only about 50 to 100 characters are actually used here, so I cut them down with LEFT while inserting.
declare @NVPN_list as table (
nvpn varchar(50)
,nvpn_revision varchar(5)
,nvpn_iteration INT
,mpn_lifecycle varchar(30)
,mfr varchar(100)
,mpn varchar(50)
,mpn_revision varchar(5)
,mpn_iteration INT
-- ...
) INSERT INTO @NVPN_list
SELECT left(nvpn ,50) as nvpn
,left(nvpn_revision ,10) as nvpn_revision
,nvpn_iteration
,left(mpn_lifecycle ,30)
,left(mfr ,100)
,left(mpn ,50)
,left(mpn_revision ,5)
,mpn_iteration
,left(mfr_order_num ,50)
FROM [DASHBOARD].[dbo].[mpnAttributes] (NOLOCK) mpna
I wanted speed, since I have 1M total records, and load 28K of them.
This error may occur because the field size is smaller than the data you entered.
For example, if the data type is nvarchar(7) and your value is 'aaaaddddf', the error is shown as:
string or binary data would be truncated
You simply can't beat SQL Server on this.
You can insert into a new table like this:
select foo, bar
into tmp_new_table_to_dispose_later
from my_table
and compare the table definition with the real table you want to insert the data into.
Sometimes it's helpful, sometimes it's not.
If you try inserting in the final/real table from that temporary table it may just work (due to data conversion working differently than SSMS for example).
Another alternative is to insert the data in chunks: instead of inserting everything at once, you insert with TOP 1000 and repeat the process until you find a chunk with an error. At least you have better visibility into what's not fitting into the table.
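The chunking idea can be sketched in a few lines of Java. This is an illustration only: `ChunkedInsert` and `fakeInsert` are made-up names, and the fake insert simply rejects values longer than 7 characters to stand in for a real INSERT that raises the truncation error. `chunkSize` must be greater than zero.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

final class ChunkedInsert {

    // Feeds rows to insertChunk in slices of chunkSize and returns the index of the
    // first row of the slice that failed, or -1 when every slice went through.
    static <T> int firstFailingChunkStart(List<T> rows, int chunkSize,
                                          Consumer<List<T>> insertChunk) {
        for (int start = 0; start < rows.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, rows.size());
            try {
                insertChunk.accept(rows.subList(start, end));
            } catch (RuntimeException e) {
                return start;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Stand-in for a real INSERT: rejects any value longer than 7 characters.
        Consumer<List<String>> fakeInsert = chunk -> {
            for (String value : chunk) {
                if (value.length() > 7) {
                    throw new IllegalArgumentException("would be truncated: " + value);
                }
            }
        };
        List<String> rows = Arrays.asList("aaa", "bbbb", "cc", "aaaaddddf", "dd", "e");
        System.out.println(firstFailingChunkStart(rows, 2, fakeInsert)); // prints 2
    }
}
```

Once the failing chunk is known, the same function can be run again on just that chunk with a chunk size of 1 to pin down the exact row.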
I understand this has been asked multiple times, but I am really stuck here, so if it is fairly easy, please help me.
I have a sample java program and a jar file.
Here is what is inside of the java program (WriterSample.java).
// (c) Copyright 2014. TIBCO Software Inc. All rights reserved.
package com.spotfire.samples;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Date;
import java.util.Random;
import com.spotfire.sbdf.BinaryWriter;
import com.spotfire.sbdf.ColumnMetadata;
import com.spotfire.sbdf.FileHeader;
import com.spotfire.sbdf.TableMetadata;
import com.spotfire.sbdf.TableMetadataBuilder;
import com.spotfire.sbdf.TableWriter;
import com.spotfire.sbdf.ValueType;
/**
* This example is a simple command line tool that writes a simple SBDF file
* with random data.
*/
public class WriterSample {
public static void main(String[] args) throws IOException {
// The command line application requires one argument which is supposed to be
// the name of the SBDF file to write.
if (args.length != 1)
{
System.out.println("Syntax: WriterSample output.sbdf");
return;
}
String outputFile = args[0];
// First we just open the file as usual and then we need to wrap the stream
// in a binary writer.
OutputStream outputStream = new FileOutputStream(outputFile);
BinaryWriter writer = new BinaryWriter(outputStream);
// When writing an SBDF file you first need to write the file header.
FileHeader.writeCurrentVersion(writer);
// The second part of the SBDF file is the metadata, in order to create
// the table metadata we need to use the builder class.
TableMetadataBuilder tableMetadataBuilder = new TableMetadataBuilder();
// The table can have metadata properties defined. Here we add a custom
// property indicating the producer of the file. This will be imported as
// a table property in Spotfire.
tableMetadataBuilder.addProperty("GeneratedBy", "WriterSample.exe");
// All columns in the table needs to be defined and added to the metadata builder,
// the required information is the name of the column and the data type.
ColumnMetadata col1 = new ColumnMetadata("Category", ValueType.STRING);
tableMetadataBuilder.addColumn(col1);
// Similar to tables, columns can also have metadata properties defined. Here
// we add another custom property. This will be imported as a column property
// in Spotfire.
col1.addProperty("SampleProperty", "col1");
ColumnMetadata col2 = new ColumnMetadata("Value", ValueType.DOUBLE);
tableMetadataBuilder.addColumn(col2);
col2.addProperty("SampleProperty", "col2");
ColumnMetadata col3 = new ColumnMetadata("TimeStamp", ValueType.DATETIME);
tableMetadataBuilder.addColumn(col3);
col3.addProperty("SampleProperty", "col3");
// We need to call the build function in order to get an object that we can
// write to the file.
TableMetadata tableMetadata = tableMetadataBuilder.build();
tableMetadata.write(writer);
int rowCount = 10000;
Random random = new Random();
// Now that we have written all the metadata we can start writing the actual data.
// Here we use a TableWriter to write the data, remember to close the table writer
// otherwise you will not generate a correct SBDF file.
TableWriter tableWriter = new TableWriter(writer, tableMetadata);
for (int i = 0; i < rowCount; ++i) {
// You need to perform one addValue call for each column, for each row in the
// same order as you added the columns to the table metadata object.
// In this example we just generate some random values of the appropriate types.
// Here we write the first string column.
String[] col1Values = new String[] {"A", "B", "C", "D", "E"};
tableWriter.addValue(col1Values[random.nextInt(5)]);
// Next we write the second double column.
double doubleValue = random.nextDouble();
if (doubleValue < 0.5) {
// Note that if you want to write a null value you shouldn't send null to
// addValue, instead you should use theInvalidValue property of the columns
// ValueType.
tableWriter.addValue(ValueType.DOUBLE.getInvalidValue());
} else {
tableWriter.addValue(random.nextDouble());
}
// And finally the third date time column.
tableWriter.addValue(new Date());
}
// Finally we need to close the file and write the end of table marker.
tableWriter.writeEndOfTable();
writer.close();
outputStream.close();
System.out.print("Wrote file: ");
System.out.println(outputFile);
}
}
The jar file is sbdf.jar, which is in the same directory as the java file.
I can now compile with:
javac -cp "sbdf.jar" WriterSample.java
This will generate a WriterSample.class file.
The problem is that when I try to execute the program by
java -cp .:./sbdf.jar WriterSample
I got an error message:
Error: Could not find or load main class WriterSample
What should I do? Thanks!
You should use the fully qualified name of WriterSample, which is com.spotfire.samples.WriterSample. The correct java command is:
java -cp .:./sbdf.jar com.spotfire.samples.WriterSample
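One thing to keep in mind with the fully qualified name: the launcher looks for the class file in a directory structure that mirrors the package, relative to each classpath entry. The small sketch below (the helper name `ClassLocation` is made up) just shows that mapping.

```java
final class ClassLocation {

    // Relative path at which the JVM looks for a class under each classpath root.
    static String classFilePath(String fullyQualifiedName) {
        return fullyQualifiedName.replace('.', '/') + ".class";
    }

    public static void main(String[] args) {
        // prints com/spotfire/samples/WriterSample.class
        System.out.println(classFilePath("com.spotfire.samples.WriterSample"));
    }
}
```

So with "." on the classpath, WriterSample.class has to sit in ./com/spotfire/samples/. Compiling with `javac -d . -cp sbdf.jar WriterSample.java` creates those package directories for you.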
I am looking for a SQL Library that will parse an SQL statement and return some sort of Object representation of the SQL statement. My main objective is actually to be able to parse the SQL statement and retrieve the list of table names present in the SQL statement (including subqueries, joins and unions).
I am looking for a free library with a license business friendly (e.g. Apache license). I am looking for a library and not for an SQL Grammar. I do not want to build my own parser.
The best I could find so far was JSQLParser, and the example they give is actually pretty close to what I am looking for. However it fails parsing too many good queries (DB2 Database) and I'm hoping to find a more reliable library.
I doubt you'll find anything prewritten that you can just use. The problem is that ISO/ANSI SQL is a very complicated grammar, with something like more than 600 production rules IIRC.
Terence Parr's ANTLR parser generator (Java, but can generate parsers in any one of a number of target languages) has several SQL grammars available, including a couple for PL/SQL, one for a SQL Server SELECT statement, one for mySQL, and one for ISO SQL.
No idea how complete/correct/up-to-date they are.
http://www.antlr.org/grammar/list
You needn't reinvent the wheel: there is already a reliable SQL parser library (it's commercial, not free), and this article shows how to retrieve the list of table names present in the SQL statement (including subqueries, joins and unions), which is exactly what you are looking for.
http://www.dpriver.com/blog/list-of-demos-illustrate-how-to-use-general-sql-parser/get-columns-and-tables-in-sql-script/
This SQL parser library supports Oracle, SQL Server, DB2, MySQL, Teradata and ACCESS.
You need the ultra light, ultra fast library to extract table names from SQL (Disclaimer: I am the owner)
Just add the following to your pom:
<dependency>
<groupId>com.github.mnadeem</groupId>
<artifactId>sql-table-name-parser</artifactId>
<version>0.0.1</version>
</dependency>
And do the following
new TableNameParser(sql).tables()
For more details, refer to the project.
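To give a feel for what a table-name extractor has to do, here is a deliberately naive Java sketch that is not based on any of the libraries mentioned in this thread. It only picks up the identifier that directly follows FROM or JOIN, so it misses comma-separated table lists, quoted identifiers, and anything inside comments or string literals. The class name `NaiveTableNames` is made up.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NaiveTableNames {

    // Identifier (optionally schema-qualified) right after FROM or JOIN.
    private static final Pattern TABLE_AFTER_KEYWORD = Pattern.compile(
            "\\b(?:from|join)\\s+([A-Za-z_][A-Za-z0-9_.]*)", Pattern.CASE_INSENSITIVE);

    // Returns the distinct table names in order of first appearance.
    static List<String> tables(String sql) {
        Set<String> names = new LinkedHashSet<>();
        Matcher matcher = TABLE_AFTER_KEYWORD.matcher(sql);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        String sql = "SELECT o.id FROM orders o JOIN customers c ON o.cid = c.id "
                + "WHERE o.id IN (SELECT order_id FROM returns) "
                + "UNION SELECT id FROM archive.orders";
        // prints [orders, customers, returns, archive.orders]
        System.out.println(tables(sql));
    }
}
```

The limitations are exactly why a real parser is the better choice for anything beyond simple queries.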
Old question, but I think this project contains what you need:
Data Tools Project - SQL Development Tools
Here's the documentation for the SQL Query Parser.
Also, here's a small sample program. I'm no Java programmer so use with care.
package org.lala;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.util.Iterator;
import java.util.List;
import org.eclipse.datatools.modelbase.sql.query.QuerySelectStatement;
import org.eclipse.datatools.modelbase.sql.query.QueryStatement;
import org.eclipse.datatools.modelbase.sql.query.TableReference;
import org.eclipse.datatools.modelbase.sql.query.ValueExpressionColumn;
import org.eclipse.datatools.modelbase.sql.query.helper.StatementHelper;
import org.eclipse.datatools.sqltools.parsers.sql.SQLParseErrorInfo;
import org.eclipse.datatools.sqltools.parsers.sql.SQLParserException;
import org.eclipse.datatools.sqltools.parsers.sql.SQLParserInternalException;
import org.eclipse.datatools.sqltools.parsers.sql.query.SQLQueryParseResult;
import org.eclipse.datatools.sqltools.parsers.sql.query.SQLQueryParserManager;
import org.eclipse.datatools.sqltools.parsers.sql.query.SQLQueryParserManagerProvider;
public class SQLTest {
private static String readFile(String path) throws IOException {
FileInputStream stream = new FileInputStream(new File(path));
try {
FileChannel fc = stream.getChannel();
MappedByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, 0,
fc.size());
/* Instead of using default, pass in a decoder. */
return Charset.defaultCharset().decode(bb).toString();
} finally {
stream.close();
}
}
/**
* @param args
* @throws IOException
*/
public static void main(String[] args) throws IOException {
try {
// Create an instance of the Parser Manager.
// SQLQueryParserManagerProvider.getInstance().getParserManager
// returns the best compliant SQLQueryParserManager
// supporting the SQL dialect of the database described by the given
// database product information. In the code below "DB2 UDB" and "v9.1"
// are passed; if null is passed for both the database and version,
// a generic parser is returned
SQLQueryParserManager parserManager = SQLQueryParserManagerProvider
.getInstance().getParserManager("DB2 UDB", "v9.1");
// Sample query
String sql = readFile("c:\\test.sql");
// Parse
SQLQueryParseResult parseResult = parserManager.parseQuery(sql);
// Get the Query Model object from the result
QueryStatement resultObject = parseResult.getQueryStatement();
// Get the SQL text
String parsedSQL = resultObject.getSQL();
System.out.println(parsedSQL);
// Here we have the SQL code parsed!
QuerySelectStatement querySelect = (QuerySelectStatement) parseResult
.getSQLStatement();
List columnExprList = StatementHelper
.getEffectiveResultColumns(querySelect);
Iterator columnIt = columnExprList.iterator();
while (columnIt.hasNext()) {
ValueExpressionColumn colExpr = (ValueExpressionColumn) columnIt
.next();
// DataType dataType = colExpr.getDataType();
System.out.println("effective result column: "
+ colExpr.getName());// + " with data type: " +
// dataType.getName());
}
List tableList = StatementHelper.getTablesForStatement(resultObject);
// List tableList = StatementHelper.getTablesForStatement(querySelect);
for (Object obj : tableList) {
TableReference t = (TableReference) obj;
System.out.println(t.getName());
}
} catch (SQLParserException spe) {
// handle the syntax error
System.out.println(spe.getMessage());
@SuppressWarnings("unchecked")
List<SQLParseErrorInfo> syntacticErrors = spe.getErrorInfoList();
Iterator<SQLParseErrorInfo> itr = syntacticErrors.iterator();
while (itr.hasNext()) {
SQLParseErrorInfo errorInfo = (SQLParseErrorInfo) itr.next();
// Example usage of the SQLParseErrorInfo object
// the error message
String errorMessage = errorInfo.getParserErrorMessage();
String expectedText = errorInfo.getExpectedText();
String errorSourceText = errorInfo.getErrorSourceText();
// the line numbers of error
int errorLine = errorInfo.getLineNumberStart();
int errorColumn = errorInfo.getColumnNumberStart();
System.err.println("Error in line " + errorLine + ", column "
+ errorColumn + ": " + expectedText + " "
+ errorMessage + " " + errorSourceText);
}
} catch (SQLParserInternalException spie) {
// handle the exception
System.out.println(spie.getMessage());
}
System.exit(0);
}
}