c3po

Clever, Crafty Content Profiling of Objects

C3PO is a software tool that uses metadata extracted from the files of a digital collection to generate a profile of the content set. It is designed so that metadata formats produced by different tools can be integrated easily. It currently supports FITS and Apache Tika metadata.

The tool follows a three-part profiling process and provides facilities for data export and further analysis of the content, such as visualisations of the metadata characteristics and partitioning of the collection into homogeneous sets based on any known characteristic. For each chosen partition, a machine-readable profile can be generated that contains aggregations and distributions for many of the properties. The profile can optionally include a set of sample objects that are representative of the partition.
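To make the idea of partitioning and per-partition profiles concrete, here is a minimal Python sketch. It is not C3PO's code or API: the record layout, the property names (`format`, `size`, `valid`) and the functions `partition` and `profile` are all hypothetical, chosen only to illustrate grouping a collection by one characteristic and then computing distributions and aggregations for each group.

```python
from collections import Counter, defaultdict

# Hypothetical input: one dict of metadata properties per file.
# Real tool output would first need to be mapped into a common shape like this.
records = [
    {"format": "PDF", "size": 1200, "valid": True},
    {"format": "PDF", "size": 800, "valid": False},
    {"format": "TIFF", "size": 5000, "valid": True},
]


def partition(records, prop):
    """Split records into homogeneous sets keyed by the value of one property."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.get(prop, "unknown")].append(rec)
    return dict(groups)


def profile(records, nominal=("format", "valid"), numeric=("size",)):
    """Summarise one partition.

    Nominal properties get a value distribution (value -> count);
    numeric properties get simple aggregations (min, max, sum, avg).
    """
    result = {"count": len(records), "distributions": {}, "aggregations": {}}
    for prop in nominal:
        values = [r[prop] for r in records if prop in r]
        result["distributions"][prop] = dict(Counter(values))
    for prop in numeric:
        values = [r[prop] for r in records if prop in r]
        if values:
            result["aggregations"][prop] = {
                "min": min(values),
                "max": max(values),
                "sum": sum(values),
                "avg": sum(values) / len(values),
            }
    return result


# One profile per partition, here partitioned by file format.
profiles = {key: profile(group) for key, group in partition(records, "format").items()}
```

The resulting `profiles` dictionary is plain data, so it could be serialised with `json.dumps` to obtain a machine-readable form; the profile format that C3PO actually writes may differ.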

Visual Material

Collection Overview

Additional Information

You can find more information in the following posts:

I want it!

Please download the current version from Bintray.

How can I use it?

Please read this Usage Guide.

I want to contribute!

Please read this Dev Guide. You can find the JavaDocs here.