Distributed Data Management Middleware for Data-Driven Application Systems

A key challenge in data-driven scientific applications is the storage and management of input and output data in a distributed environment.
This work describes a distributed storage middleware, based on a data and metadata management framework, that addresses this problem.
In this middleware system, applications define the structure of their input and output data using XML schemas.
The system provides support for 1) registration, versioning, and management of schemas, and 2) management of storage,
querying, and retrieval of instance data corresponding to the schemas in distributed databases. The work also reports an experimental
evaluation of the system on a set of PC clusters connected over wide- and local-area networks.

The system is implemented as a services-based middleware framework, called Mobius. The architecture of
Mobius is motivated by the activities of the Data Access and Integration Services group at Global Grid Forum and
by earlier work. Mobius consists of several services and underlying protocols that support distributed creation,
versioning, and management of data models defined by XML schemas, on-demand creation of distributed databases, federation
of existing databases, and querying of data in a distributed environment.

The Mobius Framework

Mobius consists of three core services: Global Model Exchange (GME), Metadata and Data Instance Management
(Mako), and Data Translation Service (DTS). Mobius services employ XML schemas to represent metadata definitions
or data models and XML documents to represent and exchange metadata instances or data instances. 
We provide a description of the GME and Mako services that are relevant to our target application environment.
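
To make the distinction between a data model (an XML schema) and a data instance (an XML document) concrete, here is a minimal sketch in Python. The schema, the namespace http://example.org/sim, the element names, and both helper functions are hypothetical and chosen only for illustration; they are not taken from Mobius. The check compares element names only and is not full XML Schema validation.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical data model: the structure an application might declare for its input.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.org/sim"
           xmlns="http://example.org/sim"
           elementFormDefault="qualified">
  <xs:element name="simulationRun">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="gridSize" type="xs:int"/>
        <xs:element name="outputFile" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Hypothetical data instance: one document that follows the model above.
INSTANCE = """\
<simulationRun xmlns="http://example.org/sim">
  <gridSize>128</gridSize>
  <outputFile>run-001.dat</outputFile>
</simulationRun>
"""


def declared_children(schema_text, element_name):
    """Return the child element names declared under a top-level schema element."""
    root = ET.fromstring(schema_text)
    tag = f"{{{XSD_NS}}}element"
    for el in root.findall(tag):
        if el.get("name") == element_name:
            # iter() also yields the element itself, so skip it.
            return [c.get("name") for c in el.iter(tag) if c is not el]
    return []


def instance_children(instance_text):
    """Return the local names of the instance document's direct children."""
    root = ET.fromstring(instance_text)
    return [child.tag.split("}", 1)[-1] for child in root]


# Name-level comparison only; a real service would run a full schema validator.
matches = declared_children(SCHEMA, "simulationRun") == instance_children(INSTANCE)
```

In this sketch the schema plays the role of the registered model, and the instance document plays the role of the data that would be stored and queried.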

Global Model Exchange

The Global Model Exchange (GME) is responsible for storing and linking data models as defined inside namespaces
in the distributed environment. The GME enables other services to publish, retrieve, discover, deprecate, and
version metadata definitions. GME services are composed together in a domain name server-like architecture representing
a parent-child namespace hierarchy wherein parents act as authorities for children and provide them with
a sub-namespace. When a schema is registered in GME, it is stored under the name and namespace specified by
the application and is given a version number. We refer to the tuple consisting of the schema’s name, its namespace,
and its version number as the global name id (GNI) of the schema.
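
The DNS-like namespace hierarchy and the GNI tuple can be sketched as follows. This is an illustrative toy under stated assumptions, not the Mobius API: the class names GlobalNameId and GmeNode, the methods delegate and resolve, and the dotted namespace format are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class GlobalNameId:
    """Hypothetical GNI: the (namespace, name, version) tuple naming one schema."""
    namespace: str
    name: str
    version: int


class GmeNode:
    """Toy stand-in for one GME service acting as authority for a namespace."""

    def __init__(self, namespace: str, parent: Optional["GmeNode"] = None):
        self.namespace = namespace
        self.parent = parent
        self.children: Dict[str, "GmeNode"] = {}

    def delegate(self, label: str) -> "GmeNode":
        """The parent grants a sub-namespace to a new child authority."""
        child = GmeNode(f"{label}.{self.namespace}", parent=self)
        self.children[child.namespace] = child
        return child

    def resolve(self, namespace: str) -> Optional["GmeNode"]:
        """Walk down the hierarchy to the node authoritative for `namespace`."""
        if namespace == self.namespace:
            return self
        for child in self.children.values():
            if namespace == child.namespace or namespace.endswith("." + child.namespace):
                return child.resolve(namespace)
        return None  # no authority registered for this namespace


# Usage: a root authority delegates sub-namespaces, much like DNS zones.
root = GmeNode("org")
example = root.delegate("example")  # authority for "example.org"
lab = example.delegate("lab")       # authority for "lab.example.org"
gni = GlobalNameId(namespace=lab.namespace, name="simulationRun", version=1)
```

A lookup starts at any ancestor and is handed down to the child that owns the matching sub-namespace, which is the sense in which parents act as authorities for their children.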

The GME provides model version and model-to-model dependency management. For instance, if a user service
publishes a model to the GME and later the model is modified and republished, the model will automatically be versioned.
The GME protocol provides a mechanism for stating the exact version of the model that is requested. A
model can also contain types defined by other models or references to types contained in other models, and can be
assured that the referenced entities exist. This referential integrity is arguably the most important requirement for
a GME, and one that the current use of URLs to reference schemas does not provide. The role of the GME in the greater picture is to ensure distributed
model evolution and integrity while providing the ability for storage, retrieval, versioning, and discovery of
models of all shapes, complexity, and interconnectedness in a distributed environment. A future extension of the GME
service architecture would be to support semantic model storage, versioning, and querying.
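
The versioning and reference-integrity behaviour described above can be sketched with a small in-memory registry. This is a minimal illustration under assumptions, not the GME protocol: the ModelRegistry class, its publish and get methods, and the choice to number versions from 1 are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Gni = Tuple[str, str, int]  # (namespace, name, version)


class ModelRegistry:
    """Toy registry: republishing creates a new version; references must exist."""

    def __init__(self) -> None:
        self._versions: Dict[Tuple[str, str], List[dict]] = {}

    def publish(self, namespace: str, name: str, schema_text: str,
                references: Sequence[Gni] = ()) -> Gni:
        # Reference integrity: refuse models that point at unknown models.
        for ref in references:
            if self.get(*ref) is None:
                raise LookupError(f"referenced model {ref} is not registered")
        history = self._versions.setdefault((namespace, name), [])
        history.append({"schema": schema_text, "references": tuple(references)})
        # Automatic versioning: the new version is its position in the history.
        return (namespace, name, len(history))

    def get(self, namespace: str, name: str,
            version: Optional[int] = None) -> Optional[str]:
        """Return an exact version, or the latest one when no version is given."""
        history = self._versions.get((namespace, name), [])
        if not history:
            return None
        if version is None:
            return history[-1]["schema"]
        if 1 <= version <= len(history):
            return history[version - 1]["schema"]
        return None


# Usage: publish, republish, then request an exact version.
registry = ModelRegistry()
first = registry.publish("example.org", "Patient", "<v1/>")
second = registry.publish("example.org", "Patient", "<v2/>")
visit = registry.publish("example.org", "Visit", "<visit/>", references=[second])
```

Checking references at publish time is one simple way to guarantee that every referenced entity exists; a distributed implementation would also need to handle references that cross namespace authorities.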
