Twitter and Scala - Scalding 0.8.0 and Algebird

Earlier this year we open sourced Scalding, a Scala API for Cascading that makes it easy to write big data jobs in a syntax that’s simple and concise. We use Scalding heavily — for everything from custom ad targeting algorithms to PageRank on the Twitter graph. Since open sourcing Scalding, we’ve been improving our documentation by adding a Getting Started guide and a Rosetta Code page that contains several MapReduce tasks translated from other frameworks (e.g., Pig and Hadoop Streaming) into Scalding.

Today we are excited to tell you about the 0.8.0 release of Scalding.

There are a lot of new features. For example, Scalding now includes a type-safe Matrix API, which makes it trivial to express matrix sums, products, and simple algorithms like cosine similarity. The type-safe Pipe API also has some new functions and a few bug fixes.
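
To make the cosine similarity case concrete, here is a minimal sketch in plain Scala of the underlying computation: normalize each row of a matrix to unit L2 length, then multiply the result by its own transpose. This is not the Scalding Matrix API itself; the `CosineSketch` object, the `SparseMatrix` type alias, and the helper names are hypothetical and exist only for illustration, and the sketch runs in memory rather than as a Hadoop job.

```scala
// Plain-Scala illustration (NOT the Scalding Matrix API): cosine similarity
// between the rows of a sparse matrix, computed as N * N^T, where N is the
// row-L2-normalized matrix.
object CosineSketch {
  // Hypothetical sparse representation: row id -> (column id -> value).
  type SparseMatrix = Map[Long, Map[Long, Double]]

  // Scale each row to unit L2 norm; all-zero rows are left unchanged.
  def rowL2Normalize(m: SparseMatrix): SparseMatrix =
    m.map { case (row, cols) =>
      val norm = math.sqrt(cols.values.map(v => v * v).sum)
      row -> (if (norm == 0.0) cols else cols.map { case (c, v) => c -> v / norm })
    }

  // Dot product of two sparse rows over the columns they share.
  def dot(a: Map[Long, Double], b: Map[Long, Double]): Double =
    a.iterator.collect { case (c, v) if b.contains(c) => v * b(c) }.sum

  // Entry (i, j) of N * N^T is the cosine of the angle between rows i and j.
  def cosine(m: SparseMatrix): Map[(Long, Long), Double] = {
    val n = rowL2Normalize(m)
    for ((i, ri) <- n; (j, rj) <- n) yield ((i, j), dot(ri, rj))
  }

  def main(args: Array[String]): Unit = {
    // Made-up toy data: three rows over columns 10, 20, and 30.
    val m: SparseMatrix = Map(
      1L -> Map(10L -> 1.0, 20L -> 1.0),
      2L -> Map(10L -> 1.0),
      3L -> Map(30L -> 2.0)
    )
    cosine(m).toSeq.sortBy(_._1).foreach { case ((i, j), s) =>
      println(f"$i%d $j%d $s%.3f")
    }
  }
}
```

In this toy input, rows 1 and 2 share one column and get a similarity of 1/√2 (about 0.707), while rows 1 and 3 share none and get 0. A matrix API lets the same idea be written as a normalization followed by a product, with the distributed join and aggregation handled by the framework.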