Currently active projects.
Clusterless is a tool for deploying scalable and secure data-processing workloads for continuously arriving data, across clouds.
A command line tool for reading and writing data to/from multiple locations and across multiple formats.
Cascading is a collection of applications, languages, and APIs for developing data-intensive applications.
Mini-Parsers is a Java API for parsing short, discrete text strings into native types, where a single type may have multiple textual representations.
Pointer-Path is a Java API for building and transforming nested data structures such as JSON. The API was designed to allow the declaration of bulk transformations. That is, declare the transforms to be performed, then execute them over a set of input data.
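The declare-then-execute idea behind Pointer-Path can be illustrated with a minimal sketch. This is not the Pointer-Path API: the class `BulkTransformSketch`, its `copy` and `apply` methods, and the slash-separated path syntax are all hypothetical names chosen for illustration, and nested data is modeled with plain `java.util.Map` instances rather than a JSON library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "declare transforms, then execute them in bulk".
// It does not use or mirror the real Pointer-Path API.
class BulkTransformSketch {

    // One declared step: copy the value found at `from` to the location `to`.
    private static final class Copy {
        final String from;
        final String to;

        Copy(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    private final List<Copy> steps = new ArrayList<>();

    // Declaration phase: record the step without touching any data.
    BulkTransformSketch copy(String from, String to) {
        steps.add(new Copy(from, to));
        return this;
    }

    // Execution phase: apply every declared step to each input document.
    List<Map<String, Object>> apply(List<Map<String, Object>> inputs) {
        List<Map<String, Object>> results = new ArrayList<>();
        for (Map<String, Object> input : inputs) {
            Map<String, Object> output = new LinkedHashMap<>();
            for (Copy step : steps) {
                Object value = get(input, step.from);
                if (value != null) {
                    set(output, step.to, value);
                }
            }
            results.add(output);
        }
        return results;
    }

    // Walk a path such as "/user/name" through nested maps; null if absent.
    private static Object get(Map<String, Object> root, String path) {
        Object current = root;
        for (String key : path.substring(1).split("/")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(key);
        }
        return current;
    }

    // Write a value at a path, creating intermediate maps as needed.
    // Assumes any existing intermediate value is itself a map.
    @SuppressWarnings("unchecked")
    private static void set(Map<String, Object> root, String path, Object value) {
        String[] keys = path.substring(1).split("/");
        Map<String, Object> current = root;
        for (int i = 0; i < keys.length - 1; i++) {
            Object child = current.get(keys[i]);
            if (child == null) {
                child = new LinkedHashMap<String, Object>();
                current.put(keys[i], child);
            }
            current = (Map<String, Object>) child;
        }
        current.put(keys[keys.length - 1], value);
    }
}
```

In this sketch, a call such as `new BulkTransformSketch().copy("/user/name", "/name")` only records the intended change. Nothing is read or written until `apply` is called with a list of input documents, so the same declaration can be reused across any number of inputs.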