3.1 Early Pioneers and Vector Supercomputers

The first era of commercially successful parallel computing was dominated by a single, towering figure and his singular vision: Seymour Cray. His name became synonymous with the very idea of a supercomputer, and his design philosophy shaped the high-performance landscape for nearly two decades.

Seymour Cray’s career began at Control Data Corporation (CDC), where he established his reputation as a master architect of high-speed computers [1]. His designs, like the CDC 1604 and the revolutionary CDC 6600, were the fastest of their time [2]. The CDC 6600, released in 1964, is widely considered the first supercomputer [1].

Figure: The CDC 1604 computer system with a large central console and tape drives. Credit: Marcel Brown
Figure: Two men operating the dual console of the CDC 6600, with tape drives in the background. Credit: Computer History Museum

Cray’s defining characteristic was a relentless focus on performance, achieved through elegant, minimalist designs that pushed technology to its absolute limits.

Frustrated by corporate bureaucracy at CDC, he left in 1972 to found Cray Research, a company with one goal: to build the fastest computers in the world, without compromise [2].


Before Cray could redefine the market, other companies had already attempted to harness vector processing. The idea was to perform a single operation on an entire array (“vector”) of numbers at once [2]. However, the first attempts were a lesson in unbalanced design.

| Architecture Type | Memory-to-Memory (e.g., CDC STAR-100) |
|---|---|
| Concept | Vector instructions stream data directly from main memory, through the arithmetic units, and back to memory [3]. |
| Strength | Potentially very high peak performance on perfectly vectorized problems. |
| Fatal Flaws | 1. High Startup Overhead: Enormous setup time for vector operations [4]. 2. Poor Scalar Performance: Exceptionally slow on non-vector code [3]. |
| Outcome | Often performed worse than contemporary scalar machines on real-world scientific programs, which are a mix of vector and scalar work [3]. |
Figure: Architectural diagram of the CDC STAR-100 CPU and memory layout. Credit: Purcell, 2010 [5]
Figure: Block diagram of the CDC STAR-100 system architecture. Credit: Purcell, 2010 [5]
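
The cost of a large startup overhead can be illustrated with a simple timing model. The sketch below is not a measurement of the STAR-100 or any real machine: the function names and the startup values are hypothetical, chosen only to show how a fixed setup cost erodes throughput on short vectors.

```python
def vector_time(n, startup_cycles, cycles_per_element=1.0):
    """Cycles to process an n-element vector: fixed startup plus streaming time."""
    return startup_cycles + n * cycles_per_element


def efficiency(n, startup_cycles, cycles_per_element=1.0):
    """Fraction of peak throughput actually achieved on an n-element vector."""
    useful = n * cycles_per_element
    return useful / vector_time(n, startup_cycles, cycles_per_element)


def half_performance_length(startup_cycles, cycles_per_element=1.0):
    """Vector length at which the machine reaches half of its peak rate."""
    return startup_cycles / cycles_per_element


if __name__ == "__main__":
    # Hypothetical startup costs, used only to contrast long and short setup.
    for label, startup in [("long startup", 100), ("short startup", 10)]:
        for n in (10, 100, 1000):
            print(f"{label:13s} n={n:5d} efficiency={efficiency(n, startup):.2f}")
```

In this model a machine with a long startup only approaches its peak rate on very long vectors, which is one way to read the gap between theoretical peak and real-world performance described above.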

These early machines taught Seymour Cray a critical lesson: a supercomputer must be fast at everything, not just one thing [4].


Architectural Deep Dive: The Cray-1 (1976)

In 1976, Cray delivered his masterpiece: the Cray-1 [2]. It was a holistic solution to the problems that plagued earlier vector machines, rooted in a philosophy of balance.

Cray understood, perhaps better than anyone, the principle formalized in 1967 as Amdahl’s Law: the speedup of a program is ultimately limited by its sequential, non-parallelizable fraction [6].
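
Amdahl’s Law can be written as speedup = 1 / ((1 - p) + p / s), where p is the fraction of run time that can be accelerated and s is the speedup of that fraction. The following is a minimal sketch of the formula; the function names and the example numbers are illustrative and do not describe any specific workload.

```python
def amdahl_speedup(parallel_fraction, factor):
    """Overall speedup when `parallel_fraction` of the run time is sped up by `factor`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / factor)


def speedup_limit(parallel_fraction):
    """Upper bound on speedup as the accelerated part becomes infinitely fast."""
    if parallel_fraction >= 1.0:
        return float("inf")
    return 1.0 / (1.0 - parallel_fraction)


if __name__ == "__main__":
    # Illustrative case: 90% of the work vectorizes and runs 10x faster.
    print(round(amdahl_speedup(0.9, 10), 2))  # 5.26
    print(round(speedup_limit(0.9), 2))       # 10.0
```

Even with an arbitrarily fast vector unit, the remaining 10% of scalar work in this example caps the overall speedup at 10x, which is why scalar speed matters so much.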

The Cray-1’s design was a masterclass in balanced architecture, ensuring that the machine would not bog down on the inevitable scalar portions of a program.

Figure: The iconic C-shaped Cray-1 supercomputer with its distinctive cylindrical design and padded bench seating around the base. Image Credit: Computer History Museum
  • Blazing-Fast Scalar Unit: The Cray-1 was built around what was arguably the world’s fastest scalar processor of its time. This ensured top performance on the non-vectorizable parts of any code [3].
  • Register-to-Register Architecture: Instead of streaming from slow main memory, the Cray-1 used eight 64-element vector registers. Data was loaded into these high-speed registers, operated on multiple times, and only then written back to memory. This dramatically cut down on memory traffic [4][7].
  • Vector Chaining: This groundbreaking feature allowed each result element of one vector operation to be “chained” directly into the next functional unit as an operand as soon as it was produced, without waiting for the whole vector operation to finish. This “pipeline of pipelines” allowed multiple floating-point operations per clock cycle, pushing the machine to its peak performance [2][7].
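
The benefit of chaining can be seen in a rough cycle-count model. This is a simplified sketch, not a cycle-accurate description of the Cray-1: it assumes each functional unit is fully pipelined, delivers one result per clock after a fixed latency, and it ignores the time to load and store vector registers. The function names and the latencies in the usage example are illustrative assumptions.

```python
def unchained_cycles(n, latencies):
    """Dependent vector ops run back to back: each waits for the previous
    operation to finish all n elements before it starts."""
    return sum(latency + n for latency in latencies)


def chained_cycles(n, latencies):
    """Dependent vector ops overlap: each starts as soon as the previous
    unit emits its first element, so the pipelines fill once, then stream."""
    return sum(latencies) + n


if __name__ == "__main__":
    n = 64                # one full vector register, as described above
    latencies = [6, 7]    # assumed latencies for two dependent operations
    slow = unchained_cycles(n, latencies)
    fast = chained_cycles(n, latencies)
    operations = n * len(latencies)
    print(f"unchained: {slow} cycles, {operations / slow:.2f} ops/cycle")
    print(f"chained:   {fast} cycles, {operations / fast:.2f} ops/cycle")
```

Under these assumptions the chained version completes more than one operation per clock, while the unchained version stays below one.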
| Feature | CDC STAR-100 | Cray-1 |
|---|---|---|
| Architecture | Memory-to-Memory | Register-to-Register |
| Vector Data | Streamed from Main Memory | Held in 8 Vector Registers |
| Scalar Speed | Poor | Excellent |
| Pipelining | Single vector pipeline | “Chained” vector pipelines |
| Peak Performance | 100 MFLOPS (theoretical) | 160 MFLOPS (peak) [4] |
Figure: Block diagram of the Cray-1 architecture showing its register-to-register vector processing design, with vector registers, functional units, and memory organization. Image Credit: Chris Fenton via Homebrew Cray-1A

The Cray-1’s internal elegance was matched by its iconic external design. Every aspect of its physical form was a solution to a fundamental engineering challenge.

  • C-Shaped Chassis: This unique shape, which earned the machine the nickname “the world’s most expensive love-seat,” was a direct response to the limit that the speed of light places on signal propagation [8]. To keep the clock cycle at a blistering 12.5 nanoseconds, all wires had to be extremely short. The cylindrical design kept the maximum wire length to just a few feet [9][10].
  • Dense, High-Speed Logic: The system was built with power-hungry Emitter-Coupled Logic (ECL) chips, the fastest available [11].
  • Revolutionary Cooling: To dissipate the 115 kW of heat generated by the ECL logic (enough to power a dozen homes), Cray’s team invented a novel cooling system. Copper plates drew heat away from the circuit boards to aluminum bars cooled by liquid Freon refrigerant circulating through embedded stainless steel tubes [8].
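
The link between cycle time and wire length follows from simple arithmetic. The sketch below computes how far a signal can travel in one clock period. The `velocity_factor` parameter is an assumption introduced for illustration (signals in real wiring travel slower than light in a vacuum), and the value 0.66 in the example is a placeholder, not a documented Cray-1 figure.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact by definition of the metre
METRES_PER_FOOT = 0.3048


def max_signal_distance_m(cycle_time_ns, velocity_factor=1.0):
    """Distance a signal can cover in one clock cycle, in metres."""
    return SPEED_OF_LIGHT_M_PER_S * velocity_factor * cycle_time_ns * 1e-9


if __name__ == "__main__":
    cycle_ns = 12.5  # Cray-1 cycle time quoted above
    vacuum = max_signal_distance_m(cycle_ns)
    wire = max_signal_distance_m(cycle_ns, velocity_factor=0.66)  # assumed
    print(f"vacuum: {vacuum:.2f} m ({vacuum / METRES_PER_FOOT:.1f} ft)")
    print(f"wire:   {wire:.2f} m ({wire / METRES_PER_FOOT:.1f} ft)")
```

Because logic switching also consumes part of every cycle, the usable wire length is only a fraction of this upper bound, which is consistent with a design limited to a few feet.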
| Cray-1 Specification | Value |
|---|---|
| Clock Speed | 80 MHz (12.5 ns cycle time) |
| Peak Performance | 160 MFLOPS |
| Memory | Up to 1 million 64-bit words |
| Power Consumption | 115 kW |
| Cost | $5 to $8 million USD [4] |
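
Several of these figures can be cross-checked against each other. The following sketch is illustrative only: it assumes the peak rate corresponds to two floating-point results per clock (for example, a chained add and multiply) and that “1 million words” means 2^20 words. Neither assumption is stated in the table itself.

```python
def cycle_time_ns(clock_hz):
    """Clock period in nanoseconds."""
    return 1e9 / clock_hz


def peak_mflops(clock_hz, flops_per_cycle):
    """Peak rate in millions of floating-point operations per second."""
    return clock_hz * flops_per_cycle / 1e6


def memory_bytes(words, bits_per_word=64):
    """Memory capacity in bytes for a given word count and word size."""
    return words * bits_per_word // 8


if __name__ == "__main__":
    CLOCK_HZ = 80e6                           # 80 MHz from the table
    print(cycle_time_ns(CLOCK_HZ))            # 12.5
    print(peak_mflops(CLOCK_HZ, 2))           # 160.0, assuming 2 results per cycle
    print(memory_bytes(2**20) / 2**20, "MiB") # 8.0 MiB, assuming 2**20 words
```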

The Cray-1 was an immediate and resounding success, with over 80 systems sold [4]. It became an indispensable tool for national laboratories and research universities, enabling breakthroughs in:

  • Nuclear weapons simulation
  • Cryptography
  • Weather forecasting
  • Computational fluid dynamics [1]

Its successors, the Cray X-MP and Y-MP, introduced shared-memory multiprocessing, allowing multiple vector processors to work in parallel on a single problem and pushing performance into the gigaflops range [1]. For more than a decade, Seymour Cray’s vision defined the pinnacle of computing, creating a legacy where bespoke design, heroic engineering, and a holistic approach to performance reigned supreme.

  1. Cray-1 | computer - Britannica, accessed October 2, 2025, https://www.britannica.com/topic/Cray-1

  2. History of supercomputing - Wikipedia, accessed October 2, 2025, https://en.wikipedia.org/wiki/History_of_supercomputing

  3. Vector Architectures: Past, Present and Future, accessed October 2, 2025, https://www.cs.cmu.edu/afs/cs/academic/class/15740-f03/public/doc/discussions/uniprocessors/vector/vector-past-present-future-supercomputing98.pdf

  4. Cray-1 - Wikipedia, accessed October 2, 2025, https://en.wikipedia.org/wiki/Cray-1

  5. The Control Data STAR-100: Performance measurements, accessed October 16, 2025, https://api.semanticscholar.org/CorpusID:43509695

  6. History of computer clusters - Wikipedia, accessed October 2, 2025, https://en.wikipedia.org/wiki/History_of_computer_clusters

  7. An Analysis of the Cray-1 Computer - Cray simulator, accessed October 2, 2025, https://cray.modularcircuits.com/cray_docs/articles/an_analysis_of_the_cray1_computer.PDF

  8. The CRAY-1 Computer System, accessed October 2, 2025, https://tcm.computerhistory.org/ComputerTimeline/Chap44_cray1_CS2.pdf

  9. The Cray-1 Computer System, 1977, accessed October 2, 2025, https://s3data.computerhistory.org/brochures/cray.cray1.1977.102638650.pdf

  10. The Cray-1 Supercomputer - CHM Revolution - Computer History Museum, accessed October 2, 2025, https://www.computerhistory.org/revolution/supercomputers/10/7

  11. The CRAY-1 Computer System, accessed October 2, 2025, https://www.cs.auckland.ac.nz/courses/compsci703s1c/archive/2008/resources/Russell.pdf