values of the processor speeds based on the benchmark values obtained from the site http://www.netlib.org/benchmark/linpackjava/timings_list.html and the reported and normalized mean CPU times on the benchmark sets for the algorithms are summarized in Table 7.
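Normalizing CPU times across machines is commonly done by scaling each reported time by the ratio of that machine's benchmark rating to the rating of a reference machine. The following is a minimal sketch of that idea, not the authors' procedure; the function name and the sample ratings and times are hypothetical and are not taken from Table 7.

```python
def normalize_cpu_time(reported_seconds, machine_mflops, reference_mflops):
    """Estimate the CPU time the same run would take on the reference machine.

    Assumes run time scales inversely with the benchmark rating, which is
    only a rough approximation for real workloads.
    """
    if machine_mflops <= 0 or reference_mflops <= 0:
        raise ValueError("benchmark ratings must be positive")
    return reported_seconds * machine_mflops / reference_mflops


# Hypothetical example: 120 s reported on a 50 Mflops machine,
# normalized to a 100 Mflops reference machine.
normalized = normalize_cpu_time(120.0, machine_mflops=50.0, reference_mflops=100.0)
print(normalized)
```

A machine half as fast as the reference has its reported time halved, so that times from different papers can be compared on one scale.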
In both cases the performance is less than 1% of the 1100 Mflops which we can expect for Krylov subspace methods without preconditioning on the Fujitsu VPP300.
Since the Cray-2 typically runs at tens of MFLOPS, and our Sun-3 with a 68881 coprocessor typically runs at tens of KFLOPS, we naively hoped to see our run time decrease by a factor of 1,000 after porting our program to a Cray-2.
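The factor of 1,000 is simply the ratio of the two floating-point rates. A small sketch of that arithmetic, assuming a representative 10 MFLOPS and 10 KFLOPS for "tens of" (these specific values are illustrative, not measurements):

```python
MFLOPS = 1e6  # floating-point operations per second
KFLOPS = 1e3


def naive_speedup(fast_rate, slow_rate):
    """Ratio of two floating-point rates, ignoring memory, I/O and vectorization."""
    return fast_rate / slow_rate


# Illustrative values for "tens of MFLOPS" and "tens of KFLOPS".
speedup = naive_speedup(10 * MFLOPS, 10 * KFLOPS)
print(speedup)
```

The estimate is naive because it assumes the program is limited only by floating-point throughput on both machines.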
Intel has continued the pure multicomputer path by introducing its third generation, the Paragon, with up to 4K Intel i860 microprocessor-based nodes, each with a peak of 4 x 75 Mflops, for delivery in early 1993.
Figures 10 and 11 show the Mflop performance (averaged over the 18 matrix types) versus block size on the IBM and SGI platforms.
Machine Characteristics

Machine       Clock Speed (MHz)  Peak MFLOPS  Cache Size (KB)  Associativity  Line Size (Bytes)
IBM POWER2    66.5               264          256              4              256
HP 712        80                 80           256              1              32
DEC Alpha     250                500          8                1              32
SGI Indigo2   200                100          16               1              32
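Dividing the peak MFLOPS by the clock speed in MHz gives the peak number of floating-point operations per cycle for each machine. The sketch below uses only the clock and peak figures from the table above; the variable names are illustrative.

```python
# (clock speed in MHz, peak MFLOPS), copied from the table above.
machines = {
    "IBM POWER2": (66.5, 264),
    "HP 712": (80, 80),
    "DEC Alpha": (250, 500),
    "SGI Indigo2": (200, 100),
}

# MFLOPS / MHz = floating-point operations per clock cycle at peak.
flops_per_cycle = {
    name: peak_mflops / clock_mhz
    for name, (clock_mhz, peak_mflops) in machines.items()
}

for name, value in flops_per_cycle.items():
    print(f"{name:12s} {value:.2f} flops/cycle")
```

On these figures the POWER2 peaks at roughly four operations per cycle, while the Indigo2 peaks at one operation every two cycles.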
5, that being the Greatly Reduced Array of Processor Elements with Data Reduction (GRAPE-DR) system at the National Astronomical Observatory of Japan, which produced 429 Mflops
The results are given (in MFlops) for execution of 2.3....
TMS320F2833x floating-point digital signal controllers (DSCs) from Texas Instruments provide 300 MFLOPS performance at 150 MHz.
The newly developed microcontroller comes with an SH2A-FPU CPU core, enabling processing of 480 million instructions per second (MIPS) at a clock frequency of 200 MHz and floating-point performance of 400 MFLOPS.
Results from Table 1 show that the energy calculated by the SASLOBE program (-1.174444 au) differs from the exact result in the 6th digit (a 3 x 10^-5 error) with a 20-hour processing time on a 320 MFLOPS Silicon Graphics computer.
The particular machine that we used has 64 processors, each with a floating-point speed of 1000 MFLOPS, and 64 GB of shared memory.
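If the 1000 MFLOPS figure is read as a per-processor rating (an assumption; the sentence does not say so explicitly), the aggregate peak of the machine follows by multiplication. A minimal sketch with illustrative names:

```python
def aggregate_peak_gflops(num_processors, mflops_per_processor):
    """Aggregate peak in GFLOPS, assuming every processor reaches its rating at once."""
    return num_processors * mflops_per_processor / 1000.0


# Assumption: 1000 MFLOPS is the rating of each of the 64 processors.
peak = aggregate_peak_gflops(64, 1000)
print(peak)
```

This is an upper bound on floating-point throughput; sustained performance on a shared-memory machine is normally lower.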