
Supercomputers churn through vast amounts of data using multiple processing cores working in parallel. But most supercomputers dip into a database to find information, process the data, and then produce a result--a process that can take minutes or days, depending on the task. In recent years, however, researchers have started to explore the potential of stream computing, an approach that lets them crunch a real-time stream of data in microseconds. Data from traffic cameras, accident reports, and weather feeds could be used to predict traffic, and streaming audio could be transcribed or translated more quickly.
Now IBM has shown that stream computing can be used to analyze market data faster than ever before. The result is a machine that helps automated trading systems determine the price of securities based on financial events that have just occurred. To build the system, the computing company partnered with TD Securities, an investment-banking firm, to tweak IBM software called InfoSphere Streams for financial data. The partners ran the software on one of IBM's latest supercomputers, known as Blue Gene/P.
IBM's system improves on the current generation of financial-trading systems, which collect data from numerous sources around the world, including constantly fluctuating stock prices and trading volumes. This information is broken into chunks, called messages, which are sent through the trading systems. The more messages a system can examine, the more security prices it can determine, and the more options can be sold using automated trading machines that match buyers with sellers.
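To make the notion of a message concrete, here is a hypothetical example of one such chunk of market data as a trading system might see it. The field names are illustrative assumptions, not a real exchange format or anything from IBM's system.

```python
# Hypothetical sketch of one market-data "message" -- the small chunk of
# feed data a trading system examines. Field names are illustrative
# assumptions, not a real exchange format.
message = {
    "symbol": "IBM",        # security identifier
    "bid": 128.41,          # best bid price
    "ask": 128.44,          # best ask price
    "volume": 300,          # shares available at the quoted prices
    "timestamp_us": 1_700_000_000_000_000,  # event time in microseconds
}

# A pricing engine consumes a stream of such messages one after another,
# updating its estimate of each security's price as they arrive.
mid_price = (message["bid"] + message["ask"]) / 2
print(mid_price)
```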
The significant advance, says Nagui Halim, chief scientist of the stream-computing project at IBM, is that the engineers optimized the software to run on Blue Gene/P so that the data streams were analyzed faster than is possible on other financial-analysis systems. The information arrived at a rate of five million messages per second, says Halim, and the system could process a message within 200 microseconds. The result: a supercomputer that produces security prices 21 times faster than any other financial-trading system.
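For a rough sense of the parallelism those figures imply (a back-of-the-envelope calculation of mine, not a number from IBM), Little's law says the number of messages in flight at any moment is the throughput multiplied by the per-message latency:

```python
# Back-of-the-envelope estimate (an assumption, not an IBM figure): how many
# messages must be in flight at once to sustain the quoted rates?
throughput = 5_000_000   # messages per second, from the article
latency = 200e-6         # seconds per message, from the article

in_flight = throughput * latency  # Little's law: L = lambda * W
print(f"~{in_flight:.0f} messages processed concurrently")  # ~1000
```

Keeping on the order of a thousand messages in flight at once is exactly the kind of work that Blue Gene/P's many cores are suited to.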
In some instances, says Halim, it's critical to process the data as it comes in. A system that IBM has built monitors the vital signs of patients, such as their blood gas levels, and keeps track of patient statistics, such as their weight and medication regimen. Data from these feeds, which can number in the hundreds, are analyzed and correlated, producing a picture of the patient's health that would be impossible to draw from doctors' or nurses' observations alone.
IBM's financial stream-computing system rests on three concepts, explains Halim. The first is the use of streams: data flows that move in one direction through the system. The second is that data is processed in chunks, or windows, within each stream. And the third is the use of a collection of algorithms that record the rate at which data arrives, understand the capabilities of the hardware, and direct the streams in the most efficient ways. These algorithms can take a stream and "spread it around in different ways," Halim says, and "partition it on different kinds of hardware that are specialized to do certain tasks." For instance, some cores of a supercomputer might be optimized to process and summarize the text in news reports, such as accounts of the failing health of a company's popular CEO, while others are better at performing simple mathematical operations on the numbers that flow into the system. (A minimal sketch of these ideas appears below.) IBM has developed its own stream-computing language, called Spade, that can assess the capabilities of a supercomputer and spread the data flows around appropriately, without needing much input from a programmer. Spade makes it possible, says Halim, for stream computing to run on other multiprocessing systems, not just Blue Gene/P.

Stream computing is not a new idea. In fact, concepts for processing data as it enters a computer were around in the 1960s, says Saman Amarasinghe, a professor of electrical engineering and computer science at MIT. But in recent years, it has become more practical to use, thanks to the growing popularity of multicore chips, which have multiple processing cores that crunch numbers independently. Streams of data can be broken up and partitioned among individual cores relatively easily, says Amarasinghe.

Amarasinghe adds that IBM has built on the more academic, theoretical stream-computing work and applied it to real-world problems. "IBM has brought stream computing to high performance," he says. "They can make it run very fast."

Amarasinghe suspects that the popularity of stream computing will grow due to a confluence of factors. First, the chip-making industry plans to keep increasing the number of cores it builds onto its chips. Second, stream computing is a relatively straightforward programming approach to making use of those cores. Third, "there's an explosion of data," he says, "and it's the type of data that streams in, like video and audio." It could even lead to more advanced user interfaces for computers that can process real-time video and audio interactions from people, he says.
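Halim's three concepts map naturally onto a few lines of code. The following is a minimal sketch of those ideas in plain Python, not IBM's Spade or InfoSphere Streams: the message fields, the window size, and the volume-weighted pricing step are illustrative assumptions rather than details of IBM's system.

```python
# A minimal, illustrative sketch (not IBM's Spade or InfoSphere Streams API)
# of the three ideas Halim describes: a one-directional stream of messages,
# fixed-size windows within that stream, and partitioning window-level work
# across multiple cores. Field names and window size are assumptions.
import random
from collections import defaultdict
from dataclasses import dataclass
from multiprocessing import Pool
from typing import Iterable, Iterator


@dataclass(frozen=True)
class Message:
    symbol: str   # security identifier
    price: float  # last traded price
    volume: int   # shares traded


def market_feed(n: int) -> Iterator[Message]:
    """Stand-in for a live market-data feed."""
    symbols = ["IBM", "TD", "XYZ"]
    for _ in range(n):
        yield Message(random.choice(symbols),
                      random.uniform(90.0, 110.0),
                      random.randint(1, 500))


def windows(stream: Iterable[Message], size: int) -> Iterator[list[Message]]:
    """Chop a one-directional stream into fixed-size (tumbling) windows."""
    batch: list[Message] = []
    for msg in stream:
        batch.append(msg)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def summarize(window: list[Message]) -> dict[str, float]:
    """Independent per-window work for one core: a volume-weighted average
    price per symbol (an assumed stand-in for the pricing analytics the
    article describes)."""
    notional: dict[str, float] = defaultdict(float)
    shares: dict[str, int] = defaultdict(int)
    for m in window:
        notional[m.symbol] += m.price * m.volume
        shares[m.symbol] += m.volume
    return {s: notional[s] / shares[s] for s in notional}


if __name__ == "__main__":
    # Spread the windows over a pool of worker processes, mimicking how a
    # stream runtime might partition work across many cores.
    with Pool(processes=4) as pool:
        for summary in pool.imap(summarize, windows(market_feed(10_000), 1_000)):
            print(summary)
```

The point of the sketch is the shape of the pipeline: data flows in one direction, work is done on bounded windows, and each window can be handed to whichever core is free.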