Earth Simulator (ES1) System Overview


Operating System

The operating system running on ES is an enhanced version of SUPER-UX, NEC's UNIX-based OS developed for the SX Series supercomputers. To support ultra-scale scientific computations, SUPER-UX was enhanced mainly in two respects:

  • extending scalability; and
  • providing special features for ES.


Extending scalability to the whole system (640 PNs) is the major requirement for the OS of ES. All OS functions, such as process management, memory management, and file management, are optimized to meet this requirement. For example, wherever possible, an OS task whose cost is of order n, such as scattering data sequentially over all PNs, is replaced with an equivalent task whose cost is of order log n, such as a binary-tree copy, where n is the number of PNs.
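The difference between an order-n and an order-log-n copy can be illustrated with a small simulation. The sketch below is not SUPER-UX code; it assumes a simple doubling schedule (in each round, every node that already holds the data forwards it to one node that does not), which is one common way to realize a tree-shaped copy. The function names and the choice of node 0 as the source are hypothetical.

```python
def sequential_copy_steps(n: int) -> int:
    """Steps needed when one source node sends to every other node in turn."""
    return max(n - 1, 0)


def tree_copy_schedule(n: int) -> list[list[tuple[int, int]]]:
    """Per-round (sender, receiver) pairs for a doubling copy from node 0.

    Assumption: nodes are numbered 0..n-1 and node 0 holds the data initially.
    """
    rounds = []
    have = 1  # nodes 0..have-1 already hold the data
    while have < n:
        # Every holder sends to the node 'have' positions ahead, if it exists.
        pairs = [(src, src + have) for src in range(have) if src + have < n]
        rounds.append(pairs)
        have = min(2 * have, n)
    return rounds


if __name__ == "__main__":
    n = 640  # number of PNs in ES
    schedule = tree_copy_schedule(n)
    print("sequential steps:", sequential_copy_steps(n))
    print("tree-copy rounds:", len(schedule))
```

For n = 640 the sequential version needs 639 sends one after another, while the doubling schedule finishes in 10 rounds, because the number of nodes holding the data doubles in each round.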
The OS also provides special features for the efficient use and administration of such a large system. These features include high-speed inter-node communication via the IN, a global address space among PNs, the super cluster system (Figure 1), and a batch job environment, among others.

Figure 1: Super Cluster System of ES. A hierarchical management system is introduced to control the ES. Every 16 nodes are grouped into a cluster, giving 40 clusters in total. One cluster, called the "S-cluster", is dedicated to interactive processing and small-scale batch jobs; a job that fits within one node can be processed on the S-cluster. The other clusters, called "L-clusters", are for medium-scale and large-scale batch jobs; parallel processing jobs spanning several nodes are executed on one or more clusters. Each cluster has a cluster control station (CCS), which monitors the state of the nodes and controls the power supply of the nodes belonging to the cluster. A super cluster control station (SCCS) integrates and coordinates the operations of all the CCSs.
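The cluster arithmetic in the caption can be checked with a short sketch. It assumes nodes are numbered from 0 and that consecutive nodes share a cluster; the position of the S-cluster (index 0 here) is not stated in the text and is chosen purely for illustration, as are all names below.

```python
NODES_TOTAL = 640        # PNs in the whole system
NODES_PER_CLUSTER = 16   # nodes grouped under one CCS
S_CLUSTER_INDEX = 0      # assumption: the source does not say which cluster is the S-cluster


def cluster_count() -> int:
    """Number of clusters managed by the SCCS."""
    return NODES_TOTAL // NODES_PER_CLUSTER


def cluster_of(node: int) -> int:
    """Cluster index of a node, assuming consecutive numbering from 0."""
    if not 0 <= node < NODES_TOTAL:
        raise ValueError(f"node {node} is out of range")
    return node // NODES_PER_CLUSTER


def cluster_kind(cluster: int) -> str:
    """'S' for the single S-cluster, 'L' for every other cluster."""
    return "S" if cluster == S_CLUSTER_INDEX else "L"


if __name__ == "__main__":
    print("clusters:", cluster_count())
    print("node 639 is in cluster", cluster_of(639), "of kind", cluster_kind(cluster_of(639)))
```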



