Mathematica is arguably one of the most expressive functional languages available. But because Mathematica holds all data in-core, it has been challenging to use on extremely large datasets. The HadoopLink package is one part of a new effort at Wolfram to circumvent this in-core limitation by linking Mathematica with your Hadoop cluster. After a brief introduction to functional programming in Mathematica, I'll walk through a simple example showing how to write Map/Reduce jobs using Mathematica and HadoopLink. I'll then describe a novel genome search algorithm, written specifically for Hadoop, that takes advantage of Mathematica's expressive functional constructs.
Paul-Jean is a full-stack data scientist at Wolfram, working on technology applications ranging from genomics to web analytics. He has been on the faculty of the Wolfram Science Summer School since 2005, where he teaches practical functional programming in Mathematica to students who want to explore new frontiers in the computational universe. He is currently writing in-house tools for working with Hadoop and HBase using Mathematica.