Apache OODT
Developer | Apache Software Foundation |
---|---|
Stable release | 1.9.1 / October 3, 2021[1] |
License | Apache License 2.0 |
Website | oodt |
The Apache Object Oriented Data Technology (OODT) is an open source
History
The project started out as an internal
After deploying OODT to the
Influenced by the emerging efforts in
Features
OODT focuses on two canonical use cases:
File Manager
A File Manager is responsible for tracking file locations, their metadata, and for transferring files from a staging area to controlled access storage.
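The three responsibilities named here (tracking locations, keeping metadata, and moving files from staging into controlled storage) can be illustrated with a minimal Python sketch. It is not OODT's API: the `ToyFileManager` class, its `ingest` and `locate` methods, and the in-memory catalog layout are hypothetical names chosen only for this example.

```python
import shutil
import tempfile
from pathlib import Path


class ToyFileManager:
    """Hypothetical stand-in for a file manager; not OODT's actual API."""

    def __init__(self, archive_dir):
        self.archive_dir = Path(archive_dir)
        self.archive_dir.mkdir(parents=True, exist_ok=True)
        # Catalog: file name -> {"location": Path, "metadata": dict}
        self.catalog = {}

    def ingest(self, staged_file, metadata):
        """Move a staged file into the archive and record where it went."""
        staged_file = Path(staged_file)
        target = self.archive_dir / staged_file.name
        shutil.move(str(staged_file), str(target))
        self.catalog[staged_file.name] = {
            "location": target,
            "metadata": dict(metadata),
        }
        return target

    def locate(self, name):
        """Return the archived location recorded for a file name."""
        return self.catalog[name]["location"]


def demo():
    """Stage one file, ingest it, and report where it ended up."""
    with tempfile.TemporaryDirectory() as root:
        staging = Path(root) / "staging"
        staging.mkdir()
        staged = staging / "granule.dat"
        staged.write_text("example payload")

        fm = ToyFileManager(Path(root) / "archive")
        fm.ingest(staged, {"instrument": "example"})
        # After ingestion the file lives in the archive, not in staging
        return fm.locate("granule.dat").name, staged.exists()
```

A production file manager would persist its catalog and support remote transfers; the sketch keeps everything local and in memory to show only the bookkeeping.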
Workflow Manager
A Workflow Manager captures control flow and data flow for complex processes, and allows for reproducibility and the construction of scientific pipelines.
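The idea of capturing control flow (which task runs after which) and data flow (what each task passes on) can be sketched in Python as follows. This is an illustration under stated assumptions rather than OODT's workflow model: `run_workflow`, the task names, and the shared-context convention are all hypothetical. It relies on `graphlib.TopologicalSorter` from the standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


def run_workflow(tasks, dependencies, context=None):
    """Run tasks in dependency order, threading a shared context through.

    tasks: task name -> function(context) returning a dict of new values
    dependencies: task name -> set of task names that must run first
    """
    context = dict(context or {})
    executed = []
    # Include every task so that tasks without dependencies still run
    graph = {name: dependencies.get(name, set()) for name in tasks}
    for name in TopologicalSorter(graph).static_order():
        context.update(tasks[name](context))
        executed.append(name)
    return context, executed


# Hypothetical three-step pipeline
tasks = {
    "calibrate": lambda ctx: {"calibrated": [x * 2 for x in ctx["raw"]]},
    "average": lambda ctx: {"mean": sum(ctx["calibrated"]) / len(ctx["calibrated"])},
    "report": lambda ctx: {"summary": f"mean={ctx['mean']}"},
}
dependencies = {"average": {"calibrate"}, "report": {"average"}}

result, order = run_workflow(tasks, dependencies, {"raw": [1, 2, 3]})
```

Because the execution order is derived from the declared dependencies and recorded in `order`, rerunning the same definition on the same input repeats the same sequence of steps, which is the reproducibility property the description refers to.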
Resource Manager
A Resource Manager handles allocation of Workflow Tasks and other jobs to underlying resources, e.g., Python jobs go to nodes with Python installed on them; jobs that require a large disk or CPU are properly sent to those nodes that fulfill those requirements.
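The matching of jobs to suitable nodes can be illustrated with a short Python sketch. The `Node` and `Job` classes, their fields, and the first-match policy are assumptions made for this example and do not reflect OODT's actual scheduling logic.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    software: set = field(default_factory=set)
    disk_gb: int = 0
    cpus: int = 1


@dataclass
class Job:
    name: str
    needs_software: set = field(default_factory=set)
    min_disk_gb: int = 0
    min_cpus: int = 1


def eligible_nodes(job, nodes):
    """Return the nodes whose capabilities cover the job's requirements."""
    return [
        n for n in nodes
        if job.needs_software <= n.software
        and n.disk_gb >= job.min_disk_gb
        and n.cpus >= job.min_cpus
    ]


def assign(jobs, nodes):
    """Map each job name to the first eligible node name, or None."""
    plan = {}
    for job in jobs:
        matches = eligible_nodes(job, nodes)
        plan[job.name] = matches[0].name if matches else None
    return plan


# Hypothetical cluster and job list
nodes = [
    Node("node-a", software={"python"}, disk_gb=100, cpus=4),
    Node("node-b", software=set(), disk_gb=2000, cpus=32),
]
jobs = [
    Job("py-task", needs_software={"python"}),
    Job("big-disk-task", min_disk_gb=1000),
    Job("gpu-task", needs_software={"cuda"}),
]
plan = assign(jobs, nodes)
```

The sketch ignores current load and queueing, which a real resource manager would also have to take into account; it shows only the capability check.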
In addition to the three core services, OODT provides three client-oriented frameworks that build on these services.
File Crawler
A File Crawler automatically extracts metadata and uses Apache Tika to identify file types and ingest the associated information into the File Manager.
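A crawler of this kind can be approximated in a few lines of Python. The sketch below is an assumption-laden illustration: it uses the standard library's `mimetypes` module as a simple stand-in for Apache Tika (it guesses from the file extension only, whereas Tika also inspects content), and the `crawl` function and record fields are hypothetical.

```python
import mimetypes
import tempfile
from pathlib import Path


def crawl(directory):
    """Walk a directory and build one metadata record per file."""
    records = []
    for path in sorted(Path(directory).rglob("*")):
        if not path.is_file():
            continue
        # Extension-based guess; Tika inspects content as well
        mime_type, _ = mimetypes.guess_type(path.name)
        records.append({
            "name": path.name,
            "size_bytes": path.stat().st_size,
            "mime_type": mime_type or "application/octet-stream",
        })
    return records


def demo():
    """Crawl a temporary directory holding one known and one unknown type."""
    with tempfile.TemporaryDirectory() as root:
        (Path(root) / "notes.txt").write_text("hello")
        (Path(root) / "blob.unknownext").write_bytes(b"\x00\x01")
        return crawl(root)
```

In a full system each record would then be handed to the File Manager for ingestion rather than returned as a list.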
Catalog and Archive Crawling Framework
A Push/Pull framework acquires remote files and makes them available to the system.
Catalog and Archive Service Production Generation Executive (CAS-PGE)
A scientific algorithm wrapper (called CAS-PGE, for Catalog and Archive Service Production Generation Executive) encapsulates scientific codes so that they can be executed independently of their environment. While doing so, it captures provenance and makes the algorithms easy to integrate into a production system.
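The wrapper idea (run an external code unchanged while recording how it was run) can be sketched in Python as below. This is not CAS-PGE's interface: `run_with_provenance` and the fields of the provenance record are hypothetical, and a Python one-liner stands in for the scientific executable.

```python
import subprocess
import sys
import time


def run_with_provenance(command, env=None):
    """Run an external command and return its output plus a provenance record."""
    started = time.time()
    completed = subprocess.run(
        command, capture_output=True, text=True, env=env, check=False
    )
    provenance = {
        "command": list(command),
        "started_at": started,
        "duration_s": time.time() - started,
        "exit_code": completed.returncode,
    }
    return completed.stdout, provenance


# Stand-in for a scientific executable: a Python one-liner
stdout, prov = run_with_provenance([sys.executable, "-c", "print(6 * 7)"])
```

Keeping the command, timing, and exit code alongside the output is what allows a run to be audited or repeated later; a fuller record might also include input file versions and configuration, which this sketch omits.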
CAS RESTful Services
A set of RESTful APIs that expose the capabilities of the File Manager, Workflow Manager, and Resource Manager components.
OPSUI Monitor Dashboard
A web application for exposing services from the underlying OODT product / workflow / resource managing Control Systems via the
The overall motivation for OODT's re-architecting was described by Mattmann in a 2013 paper in the journal Nature, "A Vision for Data Science".[5]
OODT is written in the
Notable uses
OODT has been highlighted as contributing to NASA missions including Soil Moisture Active Passive[7] and New Horizons.[8] It also helps to power the Square Kilometre Array telescope,[9] broadening its use across Earth science, planetary science, radio astronomy, and other sectors. OODT is also used within bioinformatics and is part of the Knowledgent Big Data Platform.[10]
References
- ^ "[ANNOUNCE] Apache OODT 1.9.1 released". Retrieved 27 September 2022.
- ^ Crichton, Daniel; Hughes, John; Hyon, Jason; Kelly, Sean (2000). "Science Search and Retrieval using XML". The Second National Conference on Scientific and Technical Data, US National Committee for CODATA, National Research Council.
- S2CID 7699385.
- S2CID 705732.
- PMID 23344342.
- ^ "Apache OODT APIs - OODT - Apache Software Foundation". cwiki.apache.org. Retrieved 2016-06-27.
- ^ "Apache - The ASF on Twitter". Retrieved 2016-06-27.
- ^ "Apache - The ASF on Twitter". Retrieved 2016-06-27.
- ^ "Apache - The ASF on Twitter". Retrieved 2016-06-27.
- ^ "Q&A on the Advantages of OODT - Object Oriented Data Technology - Knowledgent Perspectives". 2014-07-30. Archived from the original on 2015-04-14. Retrieved 2016-06-27.