Real-time Data Warehousing

Batches for data warehouse loads used to be scheduled daily to weekly; today’s businesses demand
information that is as fresh as possible. Because the value of this real-time business data
decreases as it gets older, low latency in data integration is essential to the business value of
the data warehouse. At the same time, the concept of “business hours” is vanishing for a global enterprise,
as data warehouses are in use 24 hours a day, 365 days a year. This means that the traditional
nightly batch windows are becoming harder to accommodate, and interrupting or slowing down
sources is not acceptable at any time during the day. Finally, integration projects have to be
completed in shorter release timeframes, while fully meeting functional, performance, and
quality specifications on time and within budget. These processes must be maintainable over
time, and the completed work should be reusable for further, more cohesive integration initiatives.

Conventional “Extract, Transform, Load” (ETL) tools closely intermix data transformation rules with
integration process procedures, requiring the development of both data transformations and data
flow. Oracle Data Integrator (ODI) takes a different approach to integration by clearly separating
the declarative rules (the “what”) from the actual implementation (the “how”). With ODI, declarative
rules describing mappings and transformations are defined graphically, through a drag-and-drop
interface, and stored independently from the implementation. ODI automatically generates the data
flow, which can be fine-tuned if required.
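
To make the separation of “what” from “how” concrete, here is a minimal sketch in Python. It is not ODI’s API or its code generator: the `mapping` structure, the table and column names, and the `generate_insert_select` helper are all hypothetical, chosen only to show a declarative rule being turned into an executable data flow by a separate step.

```python
# Declarative rule (the "what"): target columns and the source expressions
# that feed them. All names here are hypothetical.
mapping = {
    "target": "dw_customer",
    "source": "src_customer",
    "columns": {
        "customer_id": "id",
        "full_name": "first_name || ' ' || last_name",
    },
    "filter": "active = 1",
}


def generate_insert_select(rule):
    """Implementation (the "how"): derive one INSERT ... SELECT from a rule."""
    targets = ", ".join(rule["columns"])
    exprs = ", ".join(rule["columns"].values())
    sql = (
        f"INSERT INTO {rule['target']} ({targets}) "
        f"SELECT {exprs} FROM {rule['source']}"
    )
    if rule.get("filter"):
        sql += f" WHERE {rule['filter']}"
    return sql


sql = generate_insert_select(mapping)
print(sql)
```

Because the rule is plain data, the generator could be swapped for one that emits a different loading strategy without touching the mapping itself, which is the point of keeping the two apart.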

This innovative approach to declarative design has also been applied to ODI’s framework for Changed
Data Capture. ODI’s CDC moves only changed data to the target systems and can be integrated with
Oracle GoldenGate, thereby enabling the kind of real-time integration that businesses require. This
technical brief describes several techniques available in ODI to adjust data latency from scheduled
batches to continuous real-time integration. Basic solutions, such as filtering records according to
a timestamp column or “changed” flag, are possible, but they might require modifications in the
applications. In addition, they usually do not sufficiently ensure that all changes are taken into
account. ODI’s Changed Data Capture identifies and captures data as it is being inserted, updated,
or deleted from datastores, and it makes the changed data available for integration processes.
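
The gap in the timestamp approach can be seen in a small, self-contained sketch. It uses Python’s built-in `sqlite3` module with a hypothetical `customers` table; the trigger-filled `customers_journal` table is only one generic way to record changes and is an assumption made for illustration, not a description of how ODI implements CDC.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, updated_at INTEGER);
-- Hypothetical journal table, filled by triggers on every change.
CREATE TABLE customers_journal (
    seq INTEGER PRIMARY KEY AUTOINCREMENT, op TEXT, id INTEGER);
CREATE TRIGGER customers_ins AFTER INSERT ON customers
BEGIN INSERT INTO customers_journal (op, id) VALUES ('I', NEW.id); END;
CREATE TRIGGER customers_upd AFTER UPDATE ON customers
BEGIN INSERT INTO customers_journal (op, id) VALUES ('U', NEW.id); END;
CREATE TRIGGER customers_del AFTER DELETE ON customers
BEGIN INSERT INTO customers_journal (op, id) VALUES ('D', OLD.id); END;
""")


def changed_by_timestamp(conn, watermark):
    """Basic approach: rows whose timestamp is newer than the last load."""
    return conn.execute(
        "SELECT id, name FROM customers WHERE updated_at > ? ORDER BY id",
        (watermark,),
    ).fetchall()


def changed_by_journal(conn, last_seq):
    """Journal approach: every recorded operation since the last load."""
    return conn.execute(
        "SELECT op, id FROM customers_journal WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()


# Initial load at time 100.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)", [(1, "Ada", 100), (2, "Bob", 100)]
)
last_seq = conn.execute("SELECT MAX(seq) FROM customers_journal").fetchone()[0]

# Later, one row is updated and another is deleted.
conn.execute("UPDATE customers SET name = 'Ada L.', updated_at = 200 WHERE id = 1")
conn.execute("DELETE FROM customers WHERE id = 2")

by_timestamp = changed_by_timestamp(conn, 100)
by_journal = changed_by_journal(conn, last_seq)
print(by_timestamp)
print(by_journal)
```

The timestamp filter returns only the updated row, because the deleted row no longer exists to be selected, whereas the journal records both the update and the delete. The timestamp filter also depends on the application maintaining `updated_at`, which is the kind of modification the text warns about.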
