About Bulk Synchronous Parallel(BSP) model

As an alternative to mapreduce paradigm, there is another parallel computing model  called Bulk Synchronous Parallel(BSP). A BSP computer is defined as a set of processors with local memory, interconnected by a communication mechanism (e. g., a network or shared memory) capable of point-to-point communication, and a barrier synchronization mechanism. It differentiates/decouples the use of local memory from that of remote memory. A BSP program consists of a set of BSP processes and a sequence of super-steps—time intervals bounded by the barrier synchronization. Each processor has its own local memory module, and all other memories are non-local where they are accessed by networking. The communication between processors are non-blocking.The essence of the BSP model is super-step. At the start of super step computations are done locally. Then, using the messaging system in the network, the other processes can handle requests for further computation.The communication and synchronization are decoupled. There exists a barrier synchronization in which the processors wait and sync when all communications are completed. When all processes have invoked the sync method and all messages are delivered, the next super-step begins. Then the messages sent during the previous super-step can be accessed by its recipients.The data locality is an inherent part of this model in which the communication is made only when the peer data in necessary. This is different from mapreduce frameworks in which they do not preserve data locality in consecutive operations. During mapreduce processing, it generally passes input data through either many passes of mapreduce or mapreduce iteration in order to derive final results which makes communication cost added on to the processing cost. So BSP is useful with many programs requiring iterations and recursions.



Apache Hama  is one such project enabling hadoop to leverage BSP. Google Pregel uses BSP for large scale mining of graphs.

reference:

http://en.wikipedia.org/wiki/Bulk_synchronous_parallel
http://incubator.apache.org/hama/

No comments:

Post a Comment