A visual analytics approach to examining the moving patterns of asthma incidences from social media
ISBN 978-85-88783-11-9
Authors
1X. Angela Yao; 2Grundstein, A.
1University of Georgia
2University of Georgia
Abstract
Social media has become an increasingly significant source of spatially and temporally explicit data. Many social media data are available at very fine spatial and temporal scales. The availability of such fine-grained data enables researchers to address a wider range of geographical and health-related questions. This study is motivated by two research objectives. The first is to evaluate the feasibility of using social media data to track asthma occurrences in a city. In prior studies, Google Trends was used to identify influenza peaks. However, these studies were usually performed at large spatial scales and coarse spatio-temporal granularities, showing general changes across large geographical regions over days, weeks, or even months. This study examines the changing spatio-temporal patterns of self-reported asthma symptoms at an urban scale with a fine temporal granularity, i.e., minutes. The second objective is to develop a space-time representation of such data in a GIS environment and to apply visual analytics methods to examine the spatio-temporal patterns. This is an ongoing study, and only preliminary results are available. A case study was conducted for Atlanta, Georgia, in the United States. Geotagged tweets in this area were collected via Twitter's REST application programming interface (API) over a few months in late 2014 and early 2015. Ten keywords, including asthma, related symptoms, and common medications, were used to collect potentially relevant tweets. The tweets were then organized in a relational database. An automated filtering script was combined with manual review to remove duplicate and irrelevant tweets according to predefined rules. The final set of tweets was then georeferenced and time-stamped in a GIS environment. It is recognized that the distribution of tweets may be severely affected by population composition.
The literature has shown that age, gender, and other demographic factors may have a significant impact on the use of Twitter. Thus, we normalized the tweet counts by age group at a fine spatial scale. A number of visual methods, including the space-time cube and animation, were adopted for visual exploration of the changing spatial patterns over time. The visual analytics methods were applied to each day’s dataset independently. Significant differences in spatial and temporal patterns were found among different days. Further analysis will be performed to understand the factors that can explain these differences. The preliminary study also found an interesting geographic distribution of asthma tweets. The visual analysis suggests that asthma attacks are more prevalent in the central area of Atlanta (Fulton and DeKalb counties), even after adjustment for age and population. This finding provides some justification and a starting point for further research on the relationships between asthma attacks and other factors. The main limitation of the current stage of the research is that it is based on a small set of data. Data scarcity was found to be a big problem for several reasons. We expect to apply the developed methods to a much richer data set in the coming months, when more data become available.
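The processing steps described above (keyword filtering, duplicate removal, space-time binning, and population normalization) can be illustrated with a minimal Python sketch. It is not the study's implementation. The keyword list, the retweet rule, the grid cell size, the time bin width, and the `population` lookup are all assumptions introduced for illustration, since the abstract does not specify the ten keywords, the predefined filtering rules, or the spatial units used for normalization.

```python
from collections import Counter
import math

# Hypothetical keywords; the study used ten, which are not listed in the abstract.
KEYWORDS = ("asthma", "inhaler", "wheezing")


def is_relevant(text):
    """Keep tweets that mention a keyword and are not retweets (assumed rule)."""
    lowered = text.lower()
    if lowered.startswith("rt @"):
        return False
    return any(keyword in lowered for keyword in KEYWORDS)


def deduplicate(tweets):
    """Drop tweets whose (user, normalized text) pair was already seen."""
    seen = set()
    unique = []
    for tweet in tweets:
        key = (tweet["user"], tweet["text"].strip().lower())
        if key not in seen:
            seen.add(key)
            unique.append(tweet)
    return unique


def cube_cell(tweet, cell_deg=0.05, bin_minutes=60):
    """Map one tweet to an (x, y, t) index of a space-time cube for a single day.

    cell_deg and bin_minutes are illustrative defaults, not values from the study.
    """
    ix = math.floor(tweet["lon"] / cell_deg)
    iy = math.floor(tweet["lat"] / cell_deg)
    minutes_of_day = tweet["time"].hour * 60 + tweet["time"].minute
    return ix, iy, minutes_of_day // bin_minutes


def space_time_counts(tweets, cell_deg=0.05, bin_minutes=60):
    """Count tweets per space-time cube cell."""
    return Counter(cube_cell(t, cell_deg, bin_minutes) for t in tweets)


def normalized_rates(counts, population, per=1000):
    """Convert counts to tweets per `per` residents of each spatial cell.

    `population` maps (x, y) to an assumed population estimate, for example
    the number of residents in the age groups most likely to use Twitter.
    Cells with no population estimate are skipped.
    """
    rates = {}
    for (ix, iy, it), n in counts.items():
        pop = population.get((ix, iy), 0)
        if pop > 0:
            rates[(ix, iy, it)] = per * n / pop
    return rates
```

Each tweet is assumed to be a dictionary with `user`, `text`, `lon`, `lat`, and a `datetime` value under `time`. Because the time index is computed from the minute of the day, the functions are meant to be run on one day's tweets at a time, which mirrors the day-by-day analysis described above.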