Requirement
It is required to copy data stored in one data source to another data source having the schema structure using JDBC. Whatever data is present in selected tables of origin data source should be copied to destination data source.
Challenges
Different JDBC drivers require somewhat different URLs and drivers. Even though Java.sql.DriverManager simplifies need to specify correct JDBC driver by looking at the URL, it is still necessary to make sure required driver is in classpath. Furthermore Driver Manager will not be able to load sun’s JdbcOdbcDriver when it is required to access Odbc Data Sources like MS Access DB files, thus it should explicitly be loaded at application startup. Also there might be situations as it is in MS Access, which requires matching DSN configuration, additional configurations might be necessary.
As to JDBC URLs, since they are not completely standard among different drivers, user needs to know correct construction of required JDBC URLs.
Different SQL Data Sources require different syntaxes. For example MySQL uses back-quote to escape column names with spaces while SQLServer uses square brackets. One solution to this is to leave selection of right quotes to user by putting them into application properties.
While it is good practice to specify fetch size of data to both reduce network bandwidth usage and memory footprint of application, JdbcOdbcDriver will throw exception instead of simply ignoring this parameter.
Data Sources may use and require different charsets. Which should be known and specified by the user which is sometimes hard to query? For example, for MS Access there is no easy way to learn internal charset of *.mdb file. Specifying wrong charsets may yield weird column names and incorrectly constructed SQL statements will break the correct execution of application.
Implementation
Database Specific parameters are read from properties file to avoid source code change in case of data source changes.
Name of tables to copy are read from properties file as a comma separated list. Then table schema is determined using ResultSetMetadata. This makes it very easy to copy from any number of database tables without any effort. However, this approach requires that source and destination databases shares exactly the same table schemas.
For each table name, a query is sent to origin database to retrieve table metadata. Then using this metadata, respective select and insert queries are constructed as in code listing below.
ResultSet rs = statement.executeQuery("SELECT * FROM " + tableName + " WHERE 1<>1");
ResultSetMetaData rsm = rs.getMetaData();
List<String> attributeList = new ArrayList<String>();
StringBuilder insertQueryBuilder = new StringBuilder("INSERT INTO " + tableName + " (");
StringBuilder selectQueryBuilder = new StringBuilder("SELECT ");
for (int i = 1; i <= rsm.getColumnCount(); i++) {
String columnName = rsm.getColumnName(i);
if (rsm.isAutoIncrement(i))
continue;
insertQueryBuilder.append(DST_COLUMN_QUOTE_START).append(columnName).append(DST_COLUMN_QUOTE_END).append(LIST_SEPARATOR);
selectQueryBuilder.append(SRC_COLUMN_QUOTE_START).append(columnName).append(SRC_COLUMN_QUOTE_END).append(LIST_SEPARATOR);
attributeList.add(columnName);
}
// insert
insertQueryBuilder.replace(insertQueryBuilder.length() - 1, insertQueryBuilder.length(), ")");//replace last comma
insertQueryBuilder.append(" VALUES(");
for (int i = 1; i <= attributeList.size(); i++)
insertQueryBuilder.append("?" + LIST_SEPARATOR);
insertQueryBuilder.replace(insertQueryBuilder.length() - 1, insertQueryBuilder.length(), ")"); //replace last comma
//select
selectQueryBuilder.replace(selectQueryBuilder.length() - 1, selectQueryBuilder.length(), " "); //replace last comma
selectQueryBuilder.append(" FROM ").append(tableName);
Data for each table is copied in the order which is specified in properties file. Batch size number of rows are fetched and inserted at a time. After all rows are inserted transaction is committed and connection is set back to auto-commit mode.
destinationConnection.setAutoCommit(false);
PreparedStatement preparedStatementSelect = sourceConnection.prepareStatement(selectStatements.get(tableName));
PreparedStatement preparedStatementInsert = destinationConnection.prepareStatement(insertStatements.get(tableName));
List<String> attributeList = attributeListMap.get(tableName);
// for JdbcOdbcDriver, setting fetch size yields exception
preparedStatementSelect.setFetchSize(batchSize);
ResultSet rs = preparedStatementSelect.executeQuery();
preparedStatementInsert.clearParameters();
preparedStatementInsert.clearBatch();
preparedStatementInsert.clearWarnings();
int insertCount = 0;
while (rs.next()) {
for (int i = 1; i <= attributeList.size(); i++) {
Object obj = rs.getObject(i);
preparedStatementInsert.setObject(i, obj);
}
insertCount++;
preparedStatementInsert.addBatch();
if (insertCount % batchSize == 0) {
preparedStatementInsert.executeBatch();
preparedStatementInsert.clearBatch();
}
}
// ensure that all batches are executed
preparedStatementInsert.executeBatch();
preparedStatementInsert.clearBatch();
destinationConnection.commit();
Conclusion
This application realizes a simple condition where data copy across different data sources needed. This application shows both powers and weaknesses of JDBC. Metadata information, Batch Processing, Prepared Statements are powers of JDBC, while incompatibility of different driver implementations is weakness of JDBC.
Whole application can be downloaded from this link.
Comments
Post a Comment