
Overview

Using this connector, application developers caching flat Java domain objects can easily persist those objects to a relational database table.  Ex: put an Employee object into a Region, read/write the Employee object to/from an Employee table.

Goals

Using only gfsh commands you can configure simple in-line caching between Apache Geode and a relational database (e.g. MySQL) communicating through the database’s JDBC driver.

  1. Users can easily map a region to a JDBC database table with minimal configuration.
  2. Users can persist entity objects put into the region to an external database table, synchronously or asynchronously.
  3. Users can read database table rows into a region.

Approach

  1. Expose new create data-source and create jdbc-mapping commands.

  2. Implement JdbcReader, JdbcWriter, and JdbcAsyncWriter, and automatically add them to the region specified in create jdbc-mapping.

Out of Scope

This JDBC connector is not intended to solve complex use cases involving complex entity objects, nested structures, etc., nor every inline caching use case.  It is designed to be a general fit and will not support special database features (such as auto-incremented id fields).  Users should implement normalized flat tables and corresponding domain objects covering only one topic, and are still free to code their own solutions for unique use cases.


 
GFSH Commands


Use help <command> to show the help for each command.

create data-source

list data-source

describe data-source

destroy data-source

create jdbc-mapping

describe jdbc-mapping

destroy jdbc-mapping

list jdbc-mappings

Notable New Classes

JdbcReader

JdbcWriter

JdbcAsyncWriter

 



Out-Of-Box implementation

 

The JDBC Connector is designed to be usable out of the box.  It plugs into a pattern where you have a specific table name that (probably) won’t change.

1) Create a region with the same name as the table.

2) create data-source defines the URL for Geode to communicate with the external database through its JDBC driver.

3) create jdbc-mapping will map fields in your domain object to columns in the table of the external database. This command automatically configures the JdbcReader and JdbcAsyncWriter (default) or JdbcWriter (synchronous) that use the JDBC protocol to support inline caching between the Geode region and the external table specified.

4) Once the data source and region-table mapping (JDBC Mapping) are set up, the user can perform Region.put and Region.get operations to persist data or load data from the external data source. Region.put adds or updates an entry in the region as well as updating the table in the external data source. If Region.get results in a cache miss, the JDBC cache loader will automatically load the entry into the region from the database table, and return the entry to the caller of Region.get. 
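The read-through/write-through flow in step 4 can be sketched in plain Java. This is only an illustration of the semantics; the map-backed "table" and "region" below are hypothetical stand-ins, not Geode or JDBC APIs:

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of inline-caching semantics. "table" stands in for the
// external database table and "region" for the Geode region.
class InlineCacheSketch {
    final Map<String, String> table = new HashMap<>();  // external DB table
    final Map<String, String> region = new HashMap<>(); // Geode region

    // Region.put: update the cached entry and write through to the table.
    void put(String key, String value) {
        region.put(key, value);
        table.put(key, value); // INSERT or UPDATE via the writer
    }

    // Region.get: on a cache miss the loader reads the row from the table,
    // caches it in the region, and returns it to the caller.
    String get(String key) {
        String value = region.get(key);
        if (value == null) {
            value = table.get(key); // SELECT ... WHERE id = ?
            if (value != null) {
                region.put(key, value);
            }
        }
        return value;
    }

    public static void main(String[] args) {
        InlineCacheSketch cache = new InlineCacheSketch();
        cache.put("1", "Alice");            // write-through: row appears in table
        cache.table.put("2", "Bob");        // row that exists only in the DB
        System.out.println(cache.get("2")); // loaded into the region on cache miss
    }
}
```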

Getting Started

The JDBC connector works best when starting from an empty region and an empty table (thus in sync), and when using one-application, one-region, one-table tenancy.


  Setup
  1. You’ll need an existing JDBC data source with a table you want to inline (map) a region with, and an entity object that implements PdxSerializable.

    ex: a MySQL RDS instance running in AWS

2. In gfsh you’ll need to add the JDBC driver jar to the server classpath when you start the server.

gfsh>start server --name=s1 --classpath="/path/to/mysql-connector-java-5.1.45-bin.jar"

  OR use the deploy --jar gfsh command to deploy the JDBC driver and add it to the Geode classpath.

gfsh>deploy --jar="/path/to/mysql-connector-java-5.1.45-bin.jar"

3. Create a region with the same name as the table you want to persist to (optionally you can use a table/region name mismatch and configure the mapping with the create jdbc-mapping command).

 

gfsh>create region --name="employee" --type=REPLICATE --cache-loader="org.apache.geode.connectors.jdbc.JdbcLoader"

4. Create a data source with the JDBC connection URL and credentials. Please refer to your database documentation for details about the JDBC connection URL. Here is an example:

gfsh>create data-source --name=mysql_data_source --url="jdbc:mysql://localhost/geode_db" --username=mysql_user

5. Map a table in the external data source to a region using the create jdbc-mapping command.

 

gfsh>create jdbc-mapping --data-source="mysql_data_source" --region="employee" --catalog=geode_db --table=employee --id=id --pdx-name=com.example.Employee

NOTE: For some database vendors, for example MySQL, you may need to specify --catalog for your database name. For other vendors, for example PostgreSQL, you may need to specify --schema for your database name. The --schema and --catalog options are used by the JDBC driver supplied with the vendor-specific database. Please refer to your database documentation for more details about catalog and schema.

 

Risks and Mitigations

 


It is strongly recommended that you do your own use-case-specific testing when using the JDBC Connector.  Test the actual read/write/load you would expect and make sure the JDBC Connector performs according to your SLAs.  Make sure you understand the number of connections made to your DB and review against your licensing agreement where applicable. It is also recommended that you not read or write directly to a DB which is your system of record.  If possible, use a DB that is designed to be a backup of your cache, and ETL (Extract-Transform-Load) to your system of record from there.

Dates

 

For your database date columns it is encouraged to use an integer timestamp, as it’s much simpler and consistent all around; you can translate it later.  Barring that, use a date type for the column and java.util.Date for the date field in your domain object.

 

Ex: Date date = new java.sql.Date(Calendar.getInstance().getTimeInMillis());

pdxInstanceFactory.writeDate("hire_date", date).create();
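The two options above can be sketched side by side in plain stdlib Java (the method names here are illustrative only, not connector or PDX APIs):

```java
import java.util.Date;

class DateColumnExample {
    // Option 1 (encouraged): keep an integer timestamp in the domain object
    // and translate later. The field would be written with writeLong(...).
    static long toEpochMillis(Date hireDate) {
        return hireDate.getTime();
    }

    // Option 2: a date column type, with java.util.Date in the domain object.
    // The field would be written with writeDate(...).
    static java.sql.Date toSqlDate(long epochMillis) {
        return new java.sql.Date(epochMillis);
    }

    public static void main(String[] args) {
        long millis = System.currentTimeMillis();
        // Both representations carry the same instant:
        System.out.println(toEpochMillis(toSqlDate(millis)) == millis); // true
    }
}
```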

Entity object structure


Your entity object needs to implement PdxSerializable.

public class Employee implements PdxSerializable {

Implement the toData and fromData methods from the interface.  JDBC returns objects in results which the JdbcLoader will try to convert.  Implement toData and fromData to cast the returned objects to specific field types.

example:

@Override
public void toData(PdxWriter writer) {
  writer.writeObject("first_name", this.first_name);
  writer.writeObject("last_name", this.last_name);
  writer.writeObject("hire_date", this.hire_date);
  writer.writeObject("id", this.id);
}

@Override
public void fromData(PdxReader reader) {
  this.first_name = (String) reader.readObject("first_name");
  this.last_name = (String) reader.readObject("last_name");
  this.hire_date = (java.util.Date) reader.readObject("hire_date");
  this.id = (int) reader.readObject("id");
}

Region to Database table mapping

By default the connector uses region name as table name and Java class field names as column names.

The create jdbc-mapping gfsh command requires that you specify --pdx-name. The Java class specified by --pdx-name should pre-exist on the classpath, or you may add the --pdx-class-file option to register a PDX type for your domain class.  For example, --pdx-class-file=/path/to/com/example/Employee.class.

Default Mapping and Case Sensitivity

Domain object field names must match RDBMS table column names. The default mapping of fields to columns first looks for an exact, case-sensitive name match. If none is found, it then looks for a case-insensitive match. If the create jdbc-mapping command cannot find a case-insensitive match for every domain object field and DB column, an exception is thrown.
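That two-pass rule can be sketched as follows (a plain-Java illustration of the matching behavior described above, not the connector's actual implementation):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ColumnMatcher {
    // First pass: exact, case-sensitive match. Second pass: case-insensitive.
    // Throws when a field has no matching column, as create jdbc-mapping does.
    static Map<String, String> match(List<String> fields, List<String> columns) {
        Map<String, String> mapping = new HashMap<>();
        for (String field : fields) {
            String matched = columns.contains(field) ? field : null; // exact
            if (matched == null) {
                for (String col : columns) {
                    if (col.equalsIgnoreCase(field)) { // case-insensitive
                        matched = col;
                        break;
                    }
                }
            }
            if (matched == null) {
                throw new IllegalStateException("No column found for field: " + field);
            }
            mapping.put(field, matched);
        }
        return mapping;
    }

    public static void main(String[] args) {
        Map<String, String> m =
            match(List.of("id", "first_name"), List.of("ID", "first_name"));
        System.out.println(m.get("id"));         // matched case-insensitively
        System.out.println(m.get("first_name")); // matched exactly
    }
}
```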

Field name Mapping

Every Java class field name (the fields in the Java class specified by --pdx-name) must map to an existing column name. If a column does not exist, an exception will be thrown when creating the jdbc-mapping.

Field type mapping

The connector tries to map the Java class field type to an appropriate SQL type. The user needs to make sure that the Java class field type is compatible with the database column type. The describe jdbc-mapping command will display the field and column type mapping.

The JDBC Connector relies on the JDBC driver’s ability to map a Java type to an SQL type. The JDBC driver will throw an exception if the types are not compatible. If the Java field is null, then the JDBC PreparedStatement.setNull method is called. For non-null Java field, the JDBC PreparedStatement.setObject method is called with the Java field data.
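The null handling above amounts to a simple branch. In this runnable sketch, a recorded call list stands in for a real java.sql.PreparedStatement so no database is required:

```java
import java.util.ArrayList;
import java.util.List;

class WritePathSketch {
    // Records the calls that would be made on a real PreparedStatement.
    final List<String> calls = new ArrayList<>();

    // Mirrors the connector's rule: null field -> setNull, else setObject.
    void bind(int index, Object fieldValue) {
        if (fieldValue == null) {
            calls.add("setNull(" + index + ")");
        } else {
            calls.add("setObject(" + index + ", " + fieldValue + ")");
        }
    }

    public static void main(String[] args) {
        WritePathSketch stmt = new WritePathSketch();
        stmt.bind(1, "Emma"); // non-null field data
        stmt.bind(2, null);   // null Java field
        System.out.println(stmt.calls); // [setObject(1, Emma), setNull(2)]
    }
}
```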

Writing a PDX field that is a Character or char

For characters whose value is zero, JDBC PreparedStatement.setNull will be called. All non-zero characters are changed to a String and written using PreparedStatement.setObject, so when the value is later read using ResultSet.getObject it will be an instance of String, not java.lang.Character. For this reason we advise against using Character or char fields.

Writing a PDX field that is a java.util.Date

java.util.Date instances are changed to a sql date type (java.sql.Date, java.sql.Time, or java.sql.Timestamp) based on the column type.

Type mapping while reading

When data is read from the database, the data is converted to the Java field type by calling getXXX on the ResultSet. For example, if the Java field type is short, then ResultSet.getShort is called.

The JDBC getXXX method will throw an exception if it is unable to convert the column data to a Java type.

The following special cases exist:

Reading a char field

ResultSet.getString is called and the first char of the String is put in the PDX field. If the String is null or empty then zero is put in the PDX field.
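That rule reduces to the following (illustrative stdlib Java, not the connector source):

```java
class CharFieldRead {
    // ResultSet.getString is called; the first char of the String is stored,
    // or zero if the String is null or empty.
    static char toCharField(String columnValue) {
        return (columnValue == null || columnValue.isEmpty())
                ? (char) 0
                : columnValue.charAt(0);
    }

    public static void main(String[] args) {
        System.out.println(toCharField("Geode"));    // first char: G
        System.out.println((int) toCharField(""));   // empty -> 0
        System.out.println((int) toCharField(null)); // null -> 0
    }
}
```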

Reading a date field

The column type is used to call one of the following: ResultSet.getDate, ResultSet.getTime, or ResultSet.getTimestamp. The resulting object is put in the Java field as a java.util.Date.

Reading a byte array field

If the column type is BLOB, then ResultSet.getBlob is called and the entire blob is read into a byte array. Otherwise ResultSet.getBytes is called.

Reading an array field other than byte array

ResultSet.getObject() is called and the output is cast to the field’s array type.

Changes to the Java domain class

If the developer changes the Java domain class (adding or removing fields or changing field data type) after the JDBC mapping has been created, the database table should be updated and the JDBC mapping should be destroyed and re-created. Otherwise data from Geode will not be persisted to the database table and exceptions may be thrown from the JDBC driver.

Entity Object Structure

Nested object structures in the Geode domain class are not supported.  Flat table structures with unique key columns must be used. This JDBC connector is designed for persisting flat domain objects to a table and/or reading data from external RDBMS into a Geode region for further processing.

The JDBC connector supports both single and composite keys (when more than one table column is needed to create a unique key).

If you don’t want to use reflection-based PDX serialization on your client application, your domain class has to implement one of the following interfaces for serialization: java.io.Serializable, DataSerializable, or PdxSerializable.