
Getting back into parallel computing with Apache Spark

Getting back into parallel computing with Apache Spark has been great, and it has been interesting to see the McColl and Valiant BSP (Bulk Synchronous Parallel) model finally start becoming mainstream beyond GPUs.
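For anyone who has not run into BSP before, here is a minimal sketch of the idea in plain Python. It is not Spark code and not something from my repo. It assumes a hypothetical helper named `bsp_sum` and uses threads as stand-in "processors". Each worker does local computation, publishes its result, waits at a barrier, and only then reads what the others published. This is loosely the same compute, communicate, synchronize rhythm that a Spark job follows between stages.

```python
import threading


def bsp_sum(partitions):
    """Sum numbers BSP-style: local compute, publish, barrier, combine.

    `bsp_sum` is a hypothetical name used only for this illustration.
    Requires at least one partition.
    """
    n = len(partitions)
    barrier = threading.Barrier(n)
    partials = [0] * n  # shared slots standing in for "communication"
    totals = [0] * n

    def worker(rank):
        # Superstep 1: purely local computation on this worker's partition.
        partials[rank] = sum(partitions[rank])
        # Barrier: no worker continues until every partial has been published.
        barrier.wait()
        # Superstep 2: all partials are now safe to read and combine.
        totals[rank] = sum(partials)

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals


if __name__ == "__main__":
    print(bsp_sum([[1, 2, 3], [4, 5], [6]]))  # every worker ends with 21
```

The barrier is the important part: without it, a fast worker could read a slot that a slower worker has not filled in yet.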

While Spark can take some effort to set up on actual clusters and does carry some overhead, I expect these costs to be optimized over time and Spark to become more and more efficient.

I have started a GitHub repo for Spark snippets, in case any are of interest, as Apache Spark moves forward 'in parallel' with HDFS (Hadoop Distributed File System).

