The Observations Workshop was an NSF-sponsored event held at NCEAS in July 2007, bringing together a number of researchers in the earth and life sciences who were involved with projects identified as having some convergence in terms of modeling their primary data as observations and measurements. Detailed discussion and breakout groups led to optimism that a coordinated approach to modeling scientific observations might provide a key unifying construct for achieving interoperability of data across disciplinary boundaries.
The goal of AKN is to educate the public on the dynamics of bird populations, provide interactive decision-making tools for land managers, make available a data resource for scientific research, and advance new exploratory analysis techniques to study bird populations. To do so, the AKN is bringing together observation data on birds. This includes data from bird-monitoring, bird-banding, and broad-scale citizen-based bird-surveillance programs. All of the observation data are unified within a distributed information architecture that is constantly expanding. All data can be made accessible, and are archived in Cornell University's data management infrastructure. Additionally, the AKN has gathered over 1100 environmental, climate, and human demographic variables that are linked to all AKN bird observation locations.
The SWEET project provides a common semantic framework for various Earth science initiatives. The semantic web is a transformation of the existing web that will enable software programs, applications, and agents to find meaning and understanding on web pages. SWEET developed these capabilities in the context of finding and using Earth science data and information.
The VSTO project addresses key science areas such as solar-terrestrial physics datasets and the highly interdisciplinary Center for Integrated Space-Weather Modeling (CISM) model intercomparison, providing a framework for collaboration and a basis for building and distributing advanced data assimilation tools for the solar-terrestrial physics community. The project directly addresses key needs in Cyberinfrastructure (CI) such as software tools and services, interdisciplinary data integration, representation, metadata, documentation, quality control, and user community building.
SPIRE will develop a framework to facilitate science research and education on the semantic web, and will implement and evaluate prototype tools and applications for use in the biocomplexity and biodiversity domains. These capabilities include the ability to collaborate and convey meaning through the automatic and semi-automatic semantic annotation of web documents; to improve information retrieval using background knowledge and inference; and to extract and fuse information from multiple, heterogeneous sources in response to a query. A testbed for prototyping these capabilities will be the web portal of the National Biological Information Infrastructure. The framework will include specifications for ontologies, protocols, agents, and tools for authoring, automated ingest, and annotation. These tools will leverage collaboratively constructed ontologies to bring diverse communities together and enable community construction of scientific knowledge. Additional domain-independent, general purpose ontologies will be developed to enable metadata about the contents and structure of databases and other knowledge repositories to be expressed in emerging knowledge markup languages such as RDF and OWL. This will enable agents to both access and index the hidden web, and will also support the data mining of diverse and distributed databases.
The CUAHSI Hydrologic Information System project is developing information technology infrastructure to support hydrologic science. One aspect of this is a data model for the storage and retrieval of hydrologic observations in a relational database. The purpose of such a database is to store hydrologic observations data in a system designed to optimize data retrieval for integrated analysis of information collected by multiple investigators. It is intended to provide a standard format that aids the effective sharing of information between investigators and allows analysis of information from disparate sources, both within a single study area or hydrologic observatory and across hydrologic observatories and regions. The observations data model (ODM) is designed to store hydrologic observations together with sufficient ancillary information (metadata) about the data values to provide traceable heritage from raw measurements to usable information, so that the data values can be unambiguously interpreted and used. A relational database format is used to provide the querying capability needed to retrieve data for diverse analyses.
- ODM specification
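The idea of storing observations from several investigators in one relational structure, with enough metadata to interpret each value, can be sketched as follows. This is only a minimal illustration using Python's built-in `sqlite3` module. The table names, column names, and sample values are hypothetical and are not taken from the ODM specification, which defines its own, much richer schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a simplified, hypothetical layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE sites (
            site_id INTEGER PRIMARY KEY,
            site_name TEXT NOT NULL
        );
        CREATE TABLE variables (
            variable_id INTEGER PRIMARY KEY,
            variable_name TEXT NOT NULL,
            units TEXT NOT NULL
        );
        CREATE TABLE sources (
            source_id INTEGER PRIMARY KEY,
            investigator TEXT NOT NULL
        );
        CREATE TABLE data_values (
            value_id INTEGER PRIMARY KEY,
            site_id INTEGER NOT NULL REFERENCES sites(site_id),
            variable_id INTEGER NOT NULL REFERENCES variables(variable_id),
            source_id INTEGER NOT NULL REFERENCES sources(source_id),
            observed_at TEXT NOT NULL,
            value REAL NOT NULL
        );
        """
    )
    # Made-up sample rows: two sites, two variables, two investigators.
    conn.executemany(
        "INSERT INTO sites VALUES (?, ?)",
        [(1, "Upper gauge"), (2, "Lower gauge")],
    )
    conn.executemany(
        "INSERT INTO variables VALUES (?, ?, ?)",
        [(1, "discharge", "m3/s"), (2, "water temperature", "degC")],
    )
    conn.executemany(
        "INSERT INTO sources VALUES (?, ?)",
        [(1, "Investigator A"), (2, "Investigator B")],
    )
    conn.executemany(
        "INSERT INTO data_values VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, 1, 1, 1, "2007-07-01T00:00", 3.2),
            (2, 2, 1, 2, "2007-07-01T00:00", 4.1),
            (3, 1, 2, 1, "2007-07-01T00:00", 14.5),
        ],
    )
    return conn


def values_for_variable(conn, variable_name):
    """Return every value of one variable, with its site, units, and source."""
    cursor = conn.execute(
        """
        SELECT s.site_name, d.observed_at, d.value, v.units, src.investigator
        FROM data_values AS d
        JOIN sites AS s ON s.site_id = d.site_id
        JOIN variables AS v ON v.variable_id = d.variable_id
        JOIN sources AS src ON src.source_id = d.source_id
        WHERE v.variable_name = ?
        ORDER BY d.observed_at, s.site_name
        """,
        (variable_name,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for row in values_for_variable(build_demo_db(), "discharge"):
        print(row)
```

Because each data value carries keys to its site, variable, and source, a single query can combine measurements contributed by different investigators while keeping the units and provenance attached to every row.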
The Ecological Metadata Language (EML) is a metadata specification developed by and for the ecology discipline. It is based on prior work done by the Ecological Society of America and associated efforts (Michener et al., 1997, Ecological Applications). EML is implemented as a series of XML document types that can be used in a modular and extensible manner to document ecological data. Each EML module is designed to describe one logical part of the total metadata that should be included with any ecological dataset.
The ODS is intended to benefit the research and conservation communities by facilitating data aggregation and sharing within and between organizations, such as data discovery through global search portals, and by fostering interoperability and collaboration. To achieve these broader goals, NatureServe is a co-sponsor of the Taxonomic Databases Working Group (TDWG) Observational Data Subgroup. NatureServe will be offering this provisional standard as an input for that group’s work towards an international observation data standard.
- ODS specification
The OSR Interest Group aims to explore concepts and methods of biodiversity data description, integration and transfer that fully integrate specimen and observational data into existing data exchange schemas.
The Canopy Database Project's mission is to address issues of data acquisition, management, analysis and exchange relating to canopy studies at all stages of the research process. We develop informatics tools for canopy scientists, document and publish datasets that demonstrate use of these tools, characterize (and formalize in informatics terms) fundamental structures of the forest canopy, and relate those structures to functional characterizations for retrospective, comparative, and integrative studies. We also aim to generalize the tools we develop (or show that they are generalizable) to the larger discipline of ecology and to articulate where current information technology is not adequate for implementing tools for these scientists. In the latter case, we communicate these needs to other researchers (specifically the database and information technology communities).
OBOE is a formal ontology for capturing the semantics of generic scientific observation and measurement. The ontology provides a convenient basis for adding detailed semantic annotations to scientific data, which crystallize the inherent “meaning” of observational data. The ontology can be used to characterize the context of an observation (e.g., space and time), and clarify inter-observational relationships such as dependency hierarchies (e.g., nested experimental observations) and meaningful dimensions within the data (e.g., axes for cross-classified categorical summarization). It also enables the robust description of measurement units (e.g., grams of carbon per liter of seawater), and can facilitate automatic unit conversions (e.g., pounds to kilograms). The ontology can be easily extended with specialized domain vocabularies, making it both broadly applicable and highly customizable. In particular, explicit “extension points” allow new types of observable entities (e.g., tree, rock, population), characteristics (e.g., height, color, diversity), and unit definitions to be easily added. The ontology can also be used to enrich the capabilities of data discovery and integration processes.
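The relationships described above (an observation of an entity, measurements of its characteristics in stated units, a context linking nested observations, and unit conversion) can be illustrated with a small sketch. This is not the OBOE ontology itself, which is a formal ontology rather than program code. The class names, fields, and the `converted_to` and `context_chain` helpers are hypothetical and chosen only for illustration. The one external fact used is the standard definition of the pound as 0.45359237 kg.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Unit:
    """A unit with its dimension and a multiplier to that dimension's base unit."""
    name: str
    dimension: str
    to_base: float


# Illustrative unit definitions; new units can be added the same way.
KILOGRAM = Unit("kilogram", "mass", 1.0)
POUND = Unit("pound", "mass", 0.45359237)
METER = Unit("meter", "length", 1.0)


@dataclass
class Measurement:
    """A value for one characteristic (e.g. mass, height) in a given unit."""
    characteristic: str
    value: float
    unit: Unit

    def converted_to(self, target: Unit) -> "Measurement":
        # Conversion is only meaningful between units of the same dimension.
        if target.dimension != self.unit.dimension:
            raise ValueError(
                f"cannot convert {self.unit.name} to {target.name}"
            )
        new_value = self.value * self.unit.to_base / target.to_base
        return Measurement(self.characteristic, new_value, target)


@dataclass
class Observation:
    """An observation of an entity, optionally nested in a context observation."""
    entity: str
    measurements: List[Measurement] = field(default_factory=list)
    context: Optional["Observation"] = None

    def context_chain(self) -> List[str]:
        # Walk outward through enclosing observations, nearest first.
        chain = []
        obs = self.context
        while obs is not None:
            chain.append(obs.entity)
            obs = obs.context
        return chain


if __name__ == "__main__":
    site = Observation("site")
    plot = Observation("plot", context=site)
    tree = Observation("tree", [Measurement("mass", 10.0, POUND)], context=plot)
    print(tree.context_chain())
    print(tree.measurements[0].converted_to(KILOGRAM))
```

Keeping the unit's dimension alongside its conversion factor is what lets the sketch refuse a nonsensical conversion, such as pounds to meters, instead of silently producing a number.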
O&M is a conceptual model and encoding for observations and measurements. This is formalized as an Application Schema, but is applicable across a wide variety of application domains.
- O&M specification
The goal of MMI is to promote collaborative research in the marine science domain, by simplifying the incredibly complex world of metadata into specific, straightforward guidance. MMI hopes to encourage scientists and data managers at all levels to apply good metadata practices from the start of a project, by providing the best advice and resources for data management.