Posts Tagged Throughput
Before reading about the solution, a fair question the reader may ask is: “What
is the problem? Is that problem important? And to whom?” Answering
these questions requires looking at a global picture of our computerized society.
Today, in a competitive world, enterprises of all kinds use and depend on timely
available, up-to-date information. Information volumes are growing 25-35% per
year and the traditional transaction rate has been forecast to grow by a factor
of 10 over the next five years, twice the current trend in mainframe growth.
In addition, there is already an increasing number of transactions arising
from computer systems in business-to-business interworking and by intelligent
terminals in the home, office or factory.
The profile of the transaction load is also changing as decision-support queries,
typically complex, are added to the existing simpler, largely clerical workloads.
Thus, complex queries such as those macro-generated by decision support systems
or system-generated as in production control will increase to demand significant
throughput with acceptable response times. In addition, very complex queries on
very large databases, generated by skilled staff workers or expert systems, may
hurt throughput while demanding good response times.
From a database point of view, the problem is to come up with database
servers that support all these types of queries efficiently on possibly very large
on-line databases. However, the impressive silicon technology improvements
alone cannot keep pace with these increasing requirements. Microprocessor
performance is now increasing 50% per year, and memory chips are increasing
in capacity by a factor of 16 every six years. RISC processors today can deliver
between 50 and 100 MIPS (the new 64 bit DEC Alpha processor is predicted to
deliver 200 MIPS at cruise speed!) at a much lower price/MIPS than mainframe
processors. This is in contrast to much slower progress in disk technology which
has been improving by a factor of 2 in response time and throughput over the
last 10 years. With such progress, the I/O bottleneck worsens with time.
The solution is therefore to use large-scale parallelism to magnify the raw power
of individual components by integrating these in a complete system along with the
appropriate parallel database software. Using standard hardware components is
essential to exploit the continuing technology improvements with minimal delay.
Then, the database software can exploit the three forms of parallelism inherent
in data-intensive application workloads. Interquery parallelism enables the parallel
execution of multiple queries generated by concurrent transactions. Intraquery
parallelism makes the parallel execution of multiple, independent operations (e.g.,
select operations) possible within the same query. Both interquery and intraquery
parallelism can be obtained by using data partitioning. Finally, with intraoperation
parallelism, the same operation can be executed as many suboperations using
function partitioning in addition to data partitioning. The set-oriented mode of
database languages (e.g., SQL) provides many opportunities for intraoperation
parallelism. For example, the performance of the join operation can be increased
significantly by parallelism.
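To make intraoperation parallelism concrete, here is a minimal sketch of a partitioned hash join. The relations, bucket count, and helper names are illustrative assumptions, not any particular system's implementation; in a real parallel database each bucket pair would be joined on a separate processor.

```python
# Sketch of intraoperation parallelism for a join (data is hypothetical).
# Both relations are hash-partitioned on the join key; each bucket pair can
# then be joined independently, e.g. one pair per processor.

N_BUCKETS = 4  # assumed number of processors/partitions

def partition(relation, key_index):
    """Split a relation into N_BUCKETS buckets by hashing the join key."""
    buckets = [[] for _ in range(N_BUCKETS)]
    for tup in relation:
        buckets[hash(tup[key_index]) % N_BUCKETS].append(tup)
    return buckets

def join_bucket(r_bucket, s_bucket):
    """Join one bucket pair with a simple hash join on attribute 0."""
    index = {}
    for r in r_bucket:
        index.setdefault(r[0], []).append(r)
    return [r + s[1:] for s in s_bucket for r in index.get(s[0], [])]

def parallel_join(r, s):
    r_parts, s_parts = partition(r, 0), partition(s, 0)
    # Each call below is independent of the others and could execute on its
    # own processor; here they run sequentially for clarity.
    result = []
    for i in range(N_BUCKETS):
        result.extend(join_bucket(r_parts[i], s_parts[i]))
    return result

emp = [(1, "alice"), (2, "bob"), (3, "carol")]
dept = [(1, "eng"), (3, "sales")]
print(sorted(parallel_join(emp, dept)))
```

Because matching tuples always hash to the same bucket, no bucket pair ever needs data from another, which is what makes the decomposition embarrassingly parallel.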
Declustering a relation involves distributing the tuples of a relation among two or more disk
drives according to some distribution criteria such as applying a hash function to the key attribute
of each tuple. Declustering has its origins in the concept of horizontal partitioning initially
developed as a distribution mechanism for distributed DBMS [RIES78] (see Figure 1). One of the
key reasons for using declustering in a parallel database system is to enable the DBMS software to
exploit the I/O bandwidth by reading and writing multiple disks in parallel. By declustering the tuples
of a relation the task of parallelizing a scan operator becomes trivial. All that is required is to start a
copy of the operator on each processor or disk containing relevant tuples, and to merge their
outputs at the destination. For operations requiring a sequential scan of an entire relation, this
approach can provide the same I/O bandwidth as a RAID-style system [SALE84], [PATI88]
without needing any specialized hardware.
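The parallel scan just described can be sketched as follows. The per-disk tuple lists and the predicate are made-up examples; threads stand in for the per-disk copies of the scan operator, and the final list comprehension plays the role of the merge at the destination.

```python
# Sketch of the parallel scan described above: one scan operator per "disk"
# (here a Python list), with outputs merged at the destination.
from concurrent.futures import ThreadPoolExecutor

disks = [  # a declustered relation, one tuple list per disk (assumed data)
    [(1, "a"), (4, "d")],
    [(2, "b"), (5, "e")],
    [(3, "c"), (6, "f")],
]

def scan(disk, predicate):
    """The scan operator: filter the tuples stored on one disk."""
    return [t for t in disk if predicate(t)]

def parallel_scan(predicate):
    # Start a copy of the scan operator for each disk, then merge outputs.
    with ThreadPoolExecutor(max_workers=len(disks)) as pool:
        partial_results = pool.map(scan, disks, [predicate] * len(disks))
    return [t for part in partial_results for t in part]

print(sorted(parallel_scan(lambda t: t[0] % 2 == 0)))
```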
While tuples can simply be declustered in a round-robin fashion, more interesting
alternatives exist. One is to apply a hashing function to the key attribute of each tuple to distribute
the tuples among the disks. This distribution mechanism allows exact match selection operations
on the partitioning attribute to be directed to a single disk, avoiding the overhead of starting such
queries on multiple disks. On the other hand, range queries on the partitioning attribute must be
sent to all disks over which a relation has been declustered. A hash declustering mechanism is
provided by Arbre, Bubba, Gamma, and Teradata.
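A toy illustration of hash declustering follows. The disk count, keys, and payloads are assumptions made for the example; the point is that an exact-match selection on the partitioning attribute touches exactly one disk.

```python
# Hash declustering: hashing the key attribute routes each tuple to one of
# N_DISKS disks, so an exact-match selection on that attribute needs to
# consult only a single disk.

N_DISKS = 4  # assumed configuration

def disk_of(key):
    return hash(key) % N_DISKS

# Load a few (key, payload) tuples; each lands on exactly one disk.
disks = [[] for _ in range(N_DISKS)]
for key, payload in [(10, "x"), (11, "y"), (12, "z")]:
    disks[disk_of(key)].append((key, payload))

def exact_match(key):
    # Only one disk is consulted; the other N_DISKS - 1 stay idle.
    return [t for t in disks[disk_of(key)] if t[0] == key]

print(exact_match(11))
```

A range query, by contrast, cannot be confined this way under hashing: consecutive keys are scattered across all disks, so the query must visit every one of them.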
Figure 1: The three basic declustering schemes: range declustering maps contiguous fragments of a
table to various disks. Round-robin declustering maps the i’th record to disk i mod n. Hashed
declustering maps each record to a disk location based on some hash function. Each of these schemes
spreads data among a collection of disks, allowing parallel disk access and parallel processing.
An alternative declustering strategy is to associate a distinct range of partitioning attribute
values with each disk by, for example, dividing the range of possible values into N units, one for
each of the N processors in the system (range partitioning). The advantage of range declustering is
that it can isolate the execution of both range and exact match-selection operations to the minimal
number of processors. Another advantage of range partitioning is that it can be used to deal with
non-uniformly distributed partitioning attribute values. A range-partitioning mechanism is provided
by Arbre, Bubba, Gamma, and Tandem.
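The benefit of range declustering for range queries can be sketched in a few lines. The boundary values below are illustrative; the key property is that a range query visits only the disks whose key ranges overlap it.

```python
# Range declustering: each disk owns a contiguous key range, so a range
# query can be confined to the minimal set of disks that overlap it.
import bisect

# Assumed boundaries: disk 0 holds keys < 100, disk 1 holds 100-199,
# disk 2 holds 200-299, disk 3 holds keys >= 300.
boundaries = [100, 200, 300]

def disk_of(key):
    return bisect.bisect_right(boundaries, key)

def disks_for_range(lo, hi):
    """Return the minimal set of disks a range query [lo, hi] must visit."""
    return list(range(disk_of(lo), disk_of(hi) + 1))

print(disks_for_range(150, 250))  # only disks 1 and 2 are touched
```

The same lookup handles exact-match selections (a range of width zero), which is why range declustering isolates both query types to the fewest possible processors.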
While declustering is a simple concept that is easy to implement, it raises a number of new
physical database design issues. In addition to selecting a declustering strategy for each relation,
the number of disks over which a relation should be declustered must also be decided. While
Gamma declusters all relations across all disk drives (primarily to simplify the implementation), the
Bubba, Tandem, and Teradata systems allow a subset of the disks to be used. In general,
increasing the degree of declustering reduces the response time for an individual query and
(generally) increases the overall throughput of the system. For sequential scan queries, the
response time decreases because more processors and disks are used to execute the query. For
indexed selections on the partitioning attribute, the response time improves because fewer tuples
are stored at each node and hence the size of the index that must be searched decreases. However,
there is a point beyond which further declustering actually increases the response time of a query.
This point occurs when the cost of starting a query on a node becomes a significant fraction of the
actual execution time [COPE88, DEWI88, GHAN90a]. In general, full declustering is not always
a good idea, especially in very large configurations. Bubba [COPE88] refines the concept of
range-partitioning by considering the heat of its tuples when declustering a relation; the goal
being to balance the frequency with which each disk is accessed rather than the actual number of
tuples on each disk. In [COPE88] the effect of the degree of the declustering on the multiuser
throughput of Bubba is studied. In [GHAN90a] the impact of the alternative partitioning strategies
on the multiuser throughput of Gamma is evaluated.
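Bubba's heat-based placement idea can be sketched with a simple greedy heuristic. The fragment heats and the longest-processing-time strategy below are illustrative assumptions, not Bubba's actual algorithm: fragments are assigned in decreasing heat order, each to the currently coolest disk, so that access frequency rather than tuple count is balanced.

```python
# Sketch of heat-based declustering: balance the access frequency ("heat")
# of the fragments placed on each disk, not the number of tuples.
import heapq

def place_by_heat(fragment_heats, n_disks):
    """Greedily assign fragments to disks, always picking the coolest disk."""
    disks = [(0.0, i, []) for i in range(n_disks)]  # (heat, id, fragments)
    heapq.heapify(disks)
    for frag, heat in sorted(fragment_heats.items(), key=lambda kv: -kv[1]):
        total, i, frags = heapq.heappop(disks)
        frags.append(frag)
        heapq.heappush(disks, (total + heat, i, frags))
    return {i: (total, frags) for total, i, frags in disks}

# Hypothetical fragment heats (accesses per second, say).
heats = {"A": 8.0, "B": 5.0, "C": 4.0, "D": 3.0, "E": 2.0}
print(place_by_heat(heats, 2))
```

Note that the resulting disks can hold different numbers of fragments; what is equalized is the expected access rate per disk.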
A fundamental limit of indirect statistics-based feedback is that it classifies link conditions as
either “good” or “bad.” This binary information provides some notion about the direction in which
to adapt the rate setting, but does not suffice to select the appropriate rate at once. This leads
to a slow step-by-step accommodation to large changes in conditions, and introduces the risk of
oscillation in stable conditions. A better approach is to use direct measurements of the link
conditions. The SNR is directly related to the bit error rate (BER) in the link, and hence to the
FER. Consequently, the SNR is linked to packet delay, jitter, and throughput, and has the
potential to provide rich feedback for automatic rate control. Knowing the current SNR and the
throughput vs. SNR curves for each rate setting (e.g., Fig. 2) would solve the rate selection
problem at once.
Figure 2. Throughput vs. SNR for some 802.11 modulation schemes
Despite the advantages, SNR-based rate control has not been applied in practice yet because of the
following three problems:
1) In reality, for certain link conditions the relation between the optimal rate and SNR is highly
variable, due to imperfections in the models describing the radio channel.
2) It is not trivial to obtain a reliable estimate of the SNR of a link. Many radio interfaces only
provide an uncalibrated signal strength indication (SSI).
3) The rate controller, which is on the sending side, in fact needs the SNR observed at the
receiving side.
Most work on using SNR information for automatic rate control is based on simulation and does not
consider the practical difficulties of obtaining good SNR estimates; it concentrates on how the
noisy and drifting SNR can be used to determine the correct rate setting. Some work does address
the issue of how to communicate SNR values back to the sender, but its rate selection algorithm
still relies on a straight SNR threshold technique. Another approach assumes that the channel is
symmetric, meaning that the SNR observed at either station is very similar at any given point in
time. This assumption makes it possible to use the SNR of the last ACK frame as an indicator of
the SNR at the other side, and to select the rate of the next data frame accordingly.
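The symmetric-channel idea can be sketched as follows. The threshold table below is a made-up stand-in for the throughput-vs-SNR curves of Fig. 2, not calibrated 802.11 data: the sender reads the SNR of the last ACK and picks the highest rate whose threshold that SNR meets.

```python
# Sketch of ACK-SNR-based rate selection under the symmetric-channel
# assumption. Thresholds are illustrative, not measured values.

# Minimum SNR (dB) at which each rate (Mbit/s) is assumed to outperform
# the next lower one -- a crude stand-in for the curves of Fig. 2.
RATE_THRESHOLDS = [(1, 2.0), (2, 5.0), (5.5, 8.0), (11, 12.0)]

def select_rate(ack_snr_db):
    """Return the highest rate whose SNR threshold the ACK SNR meets."""
    best = RATE_THRESHOLDS[0][0]  # fall back to the lowest rate
    for rate, threshold in RATE_THRESHOLDS:
        if ack_snr_db >= threshold:
            best = rate
    return best

print(select_rate(9.5))
```

This is exactly the "straight SNR threshold technique" criticized above; the practical difficulty lies in problems 1 and 2, i.e., in making the thresholds and the SNR estimate itself reliable.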