Use Case 3 - Urban Ecology
by obrien, last modified Dec 28, 2010 01:06 PM
Semantic representation of Socio-economic information
Participants: Corinna*, Huiping, Rob, Simon, Nick, Peter
- Goal:
- Data fusion: combine demographic, temperature, socio-economic, and other data for correlation analysis.
- Data integration: integrate similar data that may differ in syntactic and semantic representation (e.g., temperature data).
- NOT: data discovery to find data with some specific values.
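The data fusion goal above can be illustrated with a minimal sketch. It assumes each data set has already been keyed by a shared place identifier (here a made-up census block ID); the values, the variable names, and the choice of a Pearson coefficient are illustrative assumptions, not part of the use case.

```python
from math import sqrt

# Hypothetical per-place records; the keys stand in for census block IDs.
temperature_by_place = {"block_a": 41.2, "block_b": 38.5, "block_c": 36.9, "block_d": 39.8}
income_by_place = {"block_a": 28000, "block_b": 45000, "block_c": 61000, "block_e": 52000}


def fuse(left, right):
    """Join two per-place mappings on the place keys they share."""
    shared = sorted(left.keys() & right.keys())
    return [(place, left[place], right[place]) for place in shared]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


fused = fuse(temperature_by_place, income_by_place)
r = pearson([t for _, t, _ in fused], [i for _, _, i in fused])
print(fused)
print(round(r, 3))
```

Places that appear in only one data set (block_d and block_e here) drop out of the join, which is one reason a shared semantic representation of places matters for this use case.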
- Summary:
- We need a semantic representation of places (e.g., census blocks or WRF grids, rather than specific spatial points given as longitude and latitude).
- Capture other detailed information in the metadata (e.g., column information for the CSV files)
- Background: newspaper article, graphs
- Queries:
- Data integration/fusion based on spatial feature.
- Feature types: Census area (strange shapes), WRF grid (data granularity problem)
- Vocabularies for the related domains (e.g., medical morbidity)
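One way to picture the granularity problem between census areas and WRF grid cells is area-weighted aggregation: each grid cell contributes to a region in proportion to how much of the cell the region covers. The sketch below is a simplification under stated assumptions: every shape is an axis-aligned rectangle (real census areas are irregular polygons and would need a geometry library), and the `Box` type, coordinates, and temperatures are all hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle, a stand-in for a grid cell or a census area."""
    min_x: float
    min_y: float
    max_x: float
    max_y: float


def overlap_area(a, b):
    """Area of the intersection of two boxes (0.0 if they do not overlap)."""
    width = min(a.max_x, b.max_x) - max(a.min_x, b.min_x)
    height = min(a.max_y, b.max_y) - max(a.min_y, b.min_y)
    if width <= 0 or height <= 0:
        return 0.0
    return width * height


def area_weighted_mean(region, cells):
    """Mean of the cell values, weighted by each cell's overlap with the region."""
    total_weight = 0.0
    weighted_sum = 0.0
    for cell_box, value in cells:
        weight = overlap_area(region, cell_box)
        total_weight += weight
        weighted_sum += weight * value
    if total_weight == 0:
        return None  # the region touches no grid cell
    return weighted_sum / total_weight


# A 2x2 grid of unit cells with made-up temperatures.
grid = [
    (Box(0, 0, 1, 1), 30.0),
    (Box(1, 0, 2, 1), 32.0),
    (Box(0, 1, 1, 2), 34.0),
    (Box(1, 1, 2, 2), 36.0),
]
# A region that straddles the two bottom cells equally.
region = Box(0.5, 0.0, 1.5, 1.0)
region_mean = area_weighted_mean(region, grid)
print(region_mean)
```

Area weighting is only one possible choice; population weighting or nearest-cell assignment would give different values for the same region.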
- Operations/Tasks:
- Service:
(1) Relationship between census blocks and WRF grids
(2) Getting data out of high-dimensional grids (file processing?)
(3) Integration interface (census block is the integrating feature? or can we use some other integration unit?)
(4) Logical definition of continuous region (contiguity)
- Service components:
- Data sets and associated metadata
- Temperature data (with spatial and temporal information) --another set
- Public (not private) records of aggregated health data on cases caused by extreme heat, cold, or flooding
- Heat related dispatches (some research papers)
- Survey data (e.g., detailed land cover WRF model)
- Weather service data (forecast)
- Census, temperature, and vegetation data (concrete)
- Details about each data set can be found in its metadata
- Data sets: three CSV files on survey, site, and temperature data, e.g.
- survey data (41_pass2001_1.csv)
- site information (41_sites_1.csv)
- climate model data (41_wrf_simulated_temp_1.csv)
- Challenges
- Terminology
Does the O&M model work well enough for this use case?
Different models use different representations (e.g., feature, entity, observation):
(1) Map existing representations, or
(2) Add a virtual high-level representation?
- Variable independence
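Option (1), mapping existing representations onto a common model, can be sketched as follows. The sketch assumes that O&M refers to the OGC Observations and Measurements model and borrows a few of its concepts (feature of interest, observed property, result, phenomenon time). The `Observation` class, the sample CSV text, and its column names are hypothetical and are not taken from the use-case files.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Minimal observation record, loosely following O&M concepts."""
    feature_of_interest: str  # the place the value is about (site, block, cell)
    observed_property: str    # what was measured or simulated
    result: float             # the value itself
    unit: str
    phenomenon_time: str      # when the value applies


# Hypothetical CSV content; the column names are invented for illustration.
SAMPLE_CSV = """site_id,date,temp_c
S1,2001-07-01,41.5
S2,2001-07-01,39.0
"""


def rows_to_observations(text, site_column, time_column, value_column,
                         observed_property, unit):
    """Map each CSV row to an Observation, given a column-to-role mapping."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        Observation(
            feature_of_interest=row[site_column],
            observed_property=observed_property,
            result=float(row[value_column]),
            unit=unit,
            phenomenon_time=row[time_column],
        )
        for row in reader
    ]


observations = rows_to_observations(
    SAMPLE_CSV, "site_id", "date", "temp_c", "air_temperature", "Cel"
)
for obs in observations:
    print(obs)
```

The column-to-role arguments play the part of the column-level metadata: a second data set with different column names could be mapped to the same `Observation` shape by changing only those arguments.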
- Metrics of completion/success
- Be able to map the data to the core model (vegetation data, temperature data)
- Be able to repeat it (different place, different time period)
- DO NOT:
- Cover the details of analyzing the data
- Extend this study:
- Satellite data map?
- Different time periods (weekdays, weekends)?
- Repeatability