Before reading about the solution, a fair question the reader may ask is: “What
is the problem? Is that problem important? And to whom?” Answering
these questions requires looking at a global picture of our computerized society.
Today, in a competitive world, enterprises of all kinds use and depend on timely
available, up-to-date information. Information volumes are growing 25-35% per
year and the traditional transaction rate has been forecast to grow by a factor
of 10 over the next five years, twice the current trend in mainframe growth.
In addition, there is already an increasing number of transactions arising
from computer systems in business-to-business interworking and by intelligent
terminals in the home, office or factory.
The profile of the transaction load is also changing as decision-support queries,
typically complex, are added to the existing simpler, largely clerical workloads.
Thus, complex queries such as those macro-generated by decision support systems
or system-generated as in production control will increase to demand significant
throughput with acceptable response times. In addition, very complex queries on
very large databases, generated by skilled staff workers or expert systems, may
hurt throughput while demanding good response times.
From a database point of view, the problem is to come up with database
servers that support all these types of queries efficiently on possibly very large
on-line databases. However, the impressive silicon technology improvements
alone cannot keep pace with these increasing requirements. Microprocessor
performance is now increasing 50% per year, and memory chips are increasing
in capacity by a factor of 16 every six years. RISC processors today can deliver
between 50 and 100 MIPS (the new 64-bit DEC Alpha processor is predicted to
deliver 200 MIPS at cruise speed!) at a much lower price/MIPS than mainframe
processors. This contrasts with the much slower progress in disk technology,
which has improved by only a factor of 2 in response time and throughput over the
last 10 years. With such progress, the I/O bottleneck worsens with time.
The solution is therefore to use large-scale parallelism to magnify the raw power
of individual components by integrating these in a complete system along with the
appropriate parallel database software. Using standard hardware components is
essential to exploit the continuing technology improvements with minimal delay.
Then, the database software can exploit the three forms of parallelism inherent
in data-intensive application workloads. Interquery parallelism enables the parallel
execution of multiple queries generated by concurrent transactions. Intraquery
parallelism makes the parallel execution of multiple, independent operations (e.g.,
select operations) possible within the same query. Both interquery and intraquery
parallelism can be obtained by using data partitioning. Finally, with intraoperation
parallelism, the same operation can be executed as many suboperations using
function partitioning in addition to data partitioning. The set-oriented mode of
database languages (e.g., SQL) provides many opportunities for intraoperation
parallelism. For example, the performance of the join operation can be increased
significantly by parallelism.
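As an illustrative sketch (ours, not from the text), intraoperation parallelism of a join via data partitioning can be mimicked in a few lines: both relations are hash-partitioned on the join key, so matching tuples land in the same partition and each local join is independent and could run on its own processor. The names `hash_partition` and `local_join`, and the toy relations, are assumptions for illustration.

```python
from collections import defaultdict

def hash_partition(rows, key, n_parts):
    """Split rows across n_parts 'nodes' by hashing the join key."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        parts[hash(row[key]) % n_parts].append(row)
    return parts

def local_join(r_part, s_part, key):
    """Hash join within one partition; runs independently of the others."""
    table = defaultdict(list)
    for r in r_part:
        table[r[key]].append(r)
    return [(r, s) for s in s_part for r in table[s[key]]]

# Because R and S are partitioned on the same key, the n local joins
# together produce exactly the global join result.
R = [{"k": i % 3, "r": i} for i in range(6)]       # two R rows per key
S = [{"k": i % 3, "s": i * 10} for i in range(3)]  # one S row per key
n = 2
result = []
for rp, sp in zip(hash_partition(R, "k", n), hash_partition(S, "k", n)):
    result.extend(local_join(rp, sp, "k"))
```

In a real shared-nothing system the loop body would execute on separate nodes against locally stored partitions; here it simply runs sequentially to show that the partial results union to the full join.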
Technology is driving parallel database machines. The ideal machine would be a single infinitely fast processor
with an infinite memory with infinite bandwidth – and it would be infinitely cheap (free). Given such a
machine, the need for parallelism would be much reduced. Unfortunately, technology is not delivering such
machines – but it is coming close. Technology is promising to deliver fast one-chip processors, fast and large
disks, and large RAM memories. And each of these devices will be very cheap by today’s standards, costing
only hundreds of dollars each. So the challenge is to build an infinitely fast processor out of infinitely many
processors of finite speed, and to build an infinitely large memory with infinite memory bandwidth from
infinitely many storage units of finite speed. This sounds trivial mathematically; but in practice, when a new
processor is added to most computer designs, it slows every other processor down just a little bit. If this slow
down is 1%, then when adding a processor to a 100-processor complex, the added power is cancelled by the
slow down of the other processors. Such a system has a maximum speedup of about 25 (at 50 processors) and the
100-processor complex has the effective power of a single-processor system.
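The arithmetic behind this example can be checked directly. In the following sketch, the 1% interference figure and the quadratic model are the text's; the function name is ours.

```python
def effective_power(n, interference=0.01):
    """Aggregate power of n processors when each added processor slows
    every other one by `interference` (1% in the text's example): each
    processor runs at a fraction 1 - interference*(n-1) of full speed."""
    return n * max(0.0, 1.0 - interference * (n - 1))

# Power peaks around 50 processors at a speedup of about 25...
peak_n = max(range(1, 201), key=effective_power)
# ...and at 100 processors the mutual slowdown cancels the added power,
# leaving roughly the effective power of a single processor.
```

The quadratic n(1 - 0.01(n-1)) rises to about 25.5 near n = 50 and falls back to 1.0 at n = 100, matching the figures quoted above.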
The ideal parallel system demonstrates two key properties: (1) linear speedup: twice as much hardware can
perform the task in half the elapsed time, and (2) linear scaleup: twice as much hardware can perform
twice as large a task in the same elapsed time (see Figure 1a).
Figure 1a. The difference between a speedup design, in which a one-minute job is done in 15 seconds,
and a scaleup design, in which a ten-times bigger job is done in the same time by a ten-times bigger
system.
The barriers to these goals are the triple threats of: (1) startup: The time needed to start a parallel
operation. If thousands of processes must be started, this can easily dominate the actual computation
time. (2) interference: The slowdown each new process imposes on all others. (3) skew: The service
time of a job is often the service time of the slowest step of the job. As the number of parallel
steps increases, the average size of each step decreases, but the variance can well exceed the mean.
When the variance dominates the mean, further parallelism will not help.
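The skew threat can be made concrete with a toy simulation (ours, not the text's): split a fixed amount of work into n parallel steps whose sizes vary randomly, and observe that the job takes as long as its slowest step, not the average one. The exponential step-size model and all names here are assumptions for illustration.

```python
import random

def parallel_time(total_work, n_steps, rng):
    """Elapsed time when a job is split into n_steps parallel steps whose
    sizes are exponentially distributed around total_work / n_steps.
    The job finishes only when its slowest step (the straggler) does."""
    mean = total_work / n_steps
    return max(rng.expovariate(1.0 / mean) for _ in range(n_steps))

rng = random.Random(0)           # fixed seed for repeatability
ideal = 1000.0 / 100             # a perfectly even 100-way split: 10.0
actual = parallel_time(1000.0, 100, rng)
# For exponential step sizes the expected slowest step is roughly
# mean * ln(n), several times the ideal, so skew erodes the speedup
# even though the average step keeps shrinking.
```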
This section describes several basic techniques widely used in the design of shared-nothing parallel
database machines to achieve linear speedup and scaleup on relational operators. There are a number of
reasons for the success of shared-nothing systems. The main advantage of shared-nothing multiprocessors
is that they can be scaled up to hundreds and probably thousands of processors which do not interfere
with one another. Teradata and Tandem, for example, have shipped systems with more than 200
processors, and Intel is implementing a 2000-node hypercube. On the other hand, the largest shared-memory
multiprocessors currently available are limited to about 32 processors.
Figure 1b. The standard speedup curves. The left curve is the ideal. The middle graph shows no
speedup as hardware is added. The right curve shows the three threats to parallelism. Initial startup costs
may dominate at first. As the number of processes increases, interference can increase. Ultimately, the job
is divided so finely that the variance in service times (skew) causes a slowdown.
Interference is a major problem for shared-memory multiprocessors for database applications. As
processors become faster, each processor is given a large private cache in order to avoid interference
on shared memory resources. Recent results measuring a multiprocessor running an online transaction
workload show that the cache loading/flushing overhead of transaction processing applications
considerably degrades processor performance. We do not believe high-performance shared-memory
machines will scale beyond tens of processors when running database applications. When high degrees
of parallelism are introduced, shared resources become a bottleneck. Consequently the software
architecture must use fine granularity concurrency control, and must even partition resources to
avoid process and processor interference. This observation is true for both shared-nothing and
shared-memory designs. This partitioning on a shared-memory system creates many of the problems faced by a
shared-nothing machine.