Declustering a relation involves distributing the tuples of a relation among two or more disk
drives according to some distribution criteria such as applying a hash function to the key attribute
of each tuple. Declustering has its origins in the concept of horizontal partitioning initially
developed as a distribution mechanism for distributed DBMS [RIES78] (see Figure 1). One of the
key reasons for using declustering in a parallel database systems is to enable the DBMS software to
exploit the I/O bandwidth reading and writing multiple disks in parallel. By declustering the tuples
of a relation the task of parallelizing a scan operator becomes trivial. All that is required is to start a
copy of the operator on each processor or disk containing relevant tuples, and to merge their
outputs at the destination. For operations requiring a sequential scan of an entire relation, this
approach can provide the same I/O bandwidth as a RAID-style system [SALE84], [PATI88]
without needing any specialized hardware.
While tuples can simply be declustered in a round-robin fashion, more interesting
alternatives exist. One is to apply a hashing function to the key attribute of each tuple to distribute
the tuples among the disks. This distribution mechanism allows exact match selection operations
on the partitioning attribute to be directed to a single disk, avoiding the overhead of starting such
queries on multiple disks. On the other hand, range queries on the partitioning attribute, must be
sent to all disks over which a relation has been declustered. A hash declustering mechanism is
provided by Arbre, Bubba, Gamma, and Teradata.
Figure 1: The three basic declustering schemes: range declustering maps contiguous fragments of a
table to various disks. Round-Robin declustering maps the i’th record to disk i mod n. Hashed
declustering, maps each record to a disk location based on some hash function. Each of these schemes
spreads data among a collection of disks, allowing parallel disk access and parallel processing.
An alternative declustering strategy is to associate a distinct range of partitioning attribute
values with each disk by, for example, dividing the range of possible values into N units, one for
each of the N processors in the system range partitioning. The advantage of range declustering is
that it can isolate the execution of both range and exact match-selection operations to the minimal
number of processors. Another advantage of range partitioning is that it can be used to deal with
non-uniformly distributed partitioning attribute values. A range-partitioning mechanism is provided
by Arbre, Bubba, Gamma, and Tandem.
While declustering is a simple concept that is easy to implement, it raises a number of new
physical database design issues. In addition to selecting a declustering strategy for each relation,
the number of disks over which a relation should be declustered must also be decided. While
Gamma declusters all relations across all disk drives (primarily to simplify the implementation), the
Bubba, Tandem, and Teradata systems allow a subset of the disks to be used. In general,
increasing the degree of declustering reduces the response time for an individual query and
(generally) increases the overall throughput of the system. For sequential scan queries, the
response time decreases because more processors and disks are used to execute the query. For
indexed selections on the partitioning attribute, the response time improves because fewer tuples
are stored at each node and hence the size of the index that must be searched decreases. However,
there is a point beyond which further declustering actually increases the response time of a query.
This point occurs when the cost of starting a query on a node becomes a significant fraction of the
actual execution time [COPE88, DEWI88, GHAN90a]. In general, full declustering is not always
a good idea, especially in a very large configurations. Bubba [COPE88] refines the concept of
range-partitioning by considering the heat of its tuples when declustering a relation; the goal
being to balance the frequency with which each disk is accessed rather than the actual number of
tuples on each disk. In [COPE88] the effect of the degree of the declustering on the multiuser
throughput of Bubba is studied. In [GHAN90a] the impact of the alternative partitioning strategies
on the multiuser throughput of Gamma is evaluated.