TOKYO – The top of a dormant volcano in Hawaii might seem like an unlikely place to work on improving the reliability of computer chips, but that’s just the spot engineers from Fujitsu chose over their well-equipped laboratory in Tokyo.
The engineers were trying to better understand how cosmic rays cause computers to make errors. Cosmic rays are high-energy particles from outer space, many of which are absorbed as they pass through the earth’s atmosphere. The top of the Mauna Kea volcano was perfect for the research because at 4,200 meters above sea level the intensity of the cosmic rays is about 16 times that of places like Tokyo, said Yoshiharu Tosaka, manager of Fujitsu’s reliability engineering department.
If a cosmic ray collides with an oxygen, carbon or nitrogen atom it can release a neutron from the atom’s nucleus. If that neutron then goes on to collide with a silicon atom, it can split the atom’s nucleus into smaller, charged particles. And if the silicon atom is in a memory chip, the charged particles can be sufficient to change the contents of a memory cell in the chip, producing what is called a “soft error.” In a trivial case this might mean a single pixel in a digital photo ends up being changed slightly, but in a more serious case it can cause a computer program to crash.
Soft errors in individual cells are extremely rare events, but as semiconductor technology advances, the number of cells per chip is increasing, so a given computer system is more likely to experience such an error, said Tosaka.
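The scaling Tosaka describes can be illustrated with a rough calculation. The sketch below assumes that soft errors strike each cell independently at a constant rate (a simple Poisson model). The per-cell rate is a hypothetical placeholder chosen only to show the trend; it is not a figure from Fujitsu's research.

```python
import math

def prob_at_least_one_error(cells, per_cell_rate_per_hour, hours):
    """Probability of one or more soft errors, assuming independent
    cells and a constant per-cell error rate (Poisson model)."""
    expected_errors = cells * per_cell_rate_per_hour * hours
    # -expm1(-x) evaluates 1 - exp(-x) accurately even when x is tiny
    return -math.expm1(-expected_errors)

# Hypothetical per-cell rate, for illustration only.
RATE = 1e-15  # errors per cell per hour (assumed)
HOURS_PER_YEAR = 24 * 365

for cells in (10**6, 10**9, 10**12):
    p = prob_at_least_one_error(cells, RATE, HOURS_PER_YEAR)
    print(f"{cells:.0e} cells: {p:.6f} chance of an error within a year")
```

Under these assumptions the per-cell risk never changes, yet the chance that the system as a whole sees an error climbs steadily as the cell count grows.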
To combat soft errors, engineers can add error-correction codes or choose materials that are less susceptible to the radiation. However, doing so for every chip would raise prices, so engineers simulate the number of soft errors a given chip might experience to see where best to invest their efforts. Getting accurate data for the simulations is difficult because the errors are so rare and are also influenced by the specific conditions of the physical location where the chips will be used. This is where Fujitsu’s research on Mauna Kea comes in.
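For readers unfamiliar with error-correction codes, the idea can be sketched with the textbook Hamming(7,4) code, which stores three extra parity bits alongside every four data bits so that any single flipped bit can be located and repaired. This is an illustration of the general technique only; the article does not say which codes Fujitsu uses, and production memory typically relies on wider codes.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The syndrome, read as a binary number, is the 1-based position
    # of the flipped bit; zero means no single-bit error was detected.
    error_pos = s1 + 2 * s2 + 4 * s3
    if error_pos:
        c[error_pos - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]

# Simulate a soft error: flip one stored bit, then recover the data.
word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1  # a particle strike flips the bit at position 5
assert hamming74_decode(stored) == word
```

The extra parity bits are the cost the article alludes to: here, three bits of overhead for every four bits of data.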
Along with engineers from the National Astronomical Observatory of Japan, Fujitsu took measurements over a three-month period inside and outside of Japan’s Subaru telescope, which is one of 13 that sit at the peak of Mauna Kea.
Both the soft-error rate and the intensity of neutrons were measured so that the relationship between the two could be assessed. By taking measurements inside and outside the telescope it was also possible to deduce the effect of the building on shielding chips from cosmic rays. Fujitsu also took measurements in Tokyo to provide a comparison with the level of cosmic rays seen near sea level.
The result is a more accurate simulation of the effects of cosmic rays not just on chips in general, but on chips at a given location, and that should allow the development of more reliable computers and devices, said Tosaka.
By doing the work on Mauna Kea, Fujitsu was able to gather the data in one-eighth the time it would have taken in Tokyo.
The problem of soft errors was highlighted in 2000 when a batch of IBM static RAM chips that were unusually susceptible to soft errors was used in servers built by Sun Microsystems. The result was repeated system failures, many of which could only be solved with reboots.
The results from the experiments, which were done in late 2006 and early 2007 but are only now being announced, are already being used in the manufacture of some Fujitsu chips, the company said.
Fujitsu plans to disclose more details of the research at the International Reliability Physics Symposium this week in Phoenix.