Sep 08, 2017: 2:00 - 5:30pm

C7: Owl - Data Science in OCaml

Liang Wang

Abstract

Owl is an OCaml library for scientific computing. It supports N-dimensional arrays, various matrix operations, linear algebra, regressions, fast Fourier transforms, algorithmic differentiation, and many advanced mathematical and statistical functions. Owl is evolving into a powerful toolkit that lets data scientists test new ideas and evaluate novel designs quickly and safely by providing a consistent and integrated numerical infrastructure. It not only allows us to quickly prototype machine-learning-based algorithms without sacrificing performance, but also guarantees the reliability and robustness of the application thanks to OCaml’s powerful type system.

In this tutorial, the tutor will first introduce the overall architecture of Owl by going through its major components: Matrix, Ndarray, Linear algebra, Algodiff, Dataset, the Ext module, etc. The tutor will then show how to use Owl to complete a series of common tasks that data scientists and industrial practitioners deal with in their daily work. The tasks are designed and ordered by difficulty to cover the following topics: basic data types, matrix operations, indexing, plotting, algorithmic differentiation, machine learning and neural networks, and topic modelling.

Tutorial objectives

Attendees will gain an overall understanding of Owl and basic familiarity with its APIs and major components. They should be able to use Owl to complete simple numerical computing tasks such as numerical optimization.

Target audience

Anyone with basic functional programming knowledge

Infrastructure required

The Owl library needs to be installed beforehand. Alternatively, our Docker image can be used for convenience.

Liang Wang

Liang Wang is a research associate in the Computer Laboratory at the University of Cambridge, United Kingdom. He received his MSc and PhD degrees in Computer Science from the University of Helsinki, Finland, in 2011 and 2015 respectively. Liang’s research interests include data analytics, distributed data processing, system and network optimization, and the modeling and analysis of complex networks.