Proceedings Article | 1 May 2017
KEYWORDS: Machine learning, Web 2.0 technologies, Performance modeling, Analytical research, Internet, Data integration, Computing systems
Increasing worldwide internet connectivity and access to sources of print and open social media have increased the near-real-time
availability of textual information. Capabilities to structure and integrate textual data streams can contribute to more
meaningful representations of operational environment factors (i.e., Political, Military, Economic, Social, Infrastructure,
Information, Physical Environment, and Time [PMESII-PT]) and tactical civil considerations (i.e., Areas, Structures,
Capabilities, Organizations, People, and Events [ASCOPE]). However, relying upon human analysts to encode this
information as it arrives quickly proves intractable. While human analysts possess an ability to comprehend context in
unstructured text far beyond that of computers, automated geoparsing (the extraction of locations from unstructured text)
can empower analysts to automate sifting through datasets for areas of interest. This research evaluates existing
approaches to geoprocessing and initiates the research and development of locally improved methods of tagging
parts of text as possible locations, resolving possible locations into coordinates, and interfacing such results with human
analysts. The objective of this ongoing research is to develop a more contextually complete picture of an area of interest
(AOI), including human-geographic context for events. In particular, our research aims to improve
geoparsing (i.e., the extraction of spatial context from documents), which requires development, integration, and
validation of named-entity recognition (NER) tools, gazetteers, and entity attribution. This paper provides an overview
of NER models and methodologies as applied to geoparsing, explores several challenges encountered, presents
preliminary results from the creation of a flexible geoparsing research pipeline, and introduces ongoing and future work
with the intention of contributing to the efficient geocoding of information containing valuable insights into human
activities in space.
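
The two stages named above, tagging parts of text as possible locations and resolving them into coordinates, can be illustrated with a minimal sketch. This is not the pipeline described in the paper: the gazetteer entries, the capitalized-word heuristic standing in for an NER model, and every name below (GAZETTEER, Toponym, tag_candidates, resolve, geoparse) are assumptions made purely for illustration.

```python
import re
from typing import Dict, List, NamedTuple, Optional, Tuple

Coords = Tuple[float, float]

# Hypothetical toy gazetteer: lowercase place name -> (latitude, longitude).
# A real gazetteer would hold millions of entries with alternate names.
GAZETTEER: Dict[str, Coords] = {
    "paris": (48.8566, 2.3522),
    "kabul": (34.5553, 69.2075),
    "new york": (40.7128, -74.0060),
}


class Toponym(NamedTuple):
    text: str       # surface form found in the document
    start: int      # character offset where the mention begins
    end: int        # character offset just past the mention
    coords: Coords  # resolved (latitude, longitude)


# Stand-in for an NER tagger: runs of capitalized words are treated as
# candidate locations. This over-generates (sentence-initial words match too).
CANDIDATE = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*\b")


def tag_candidates(text: str) -> List[Tuple[str, int, int]]:
    """Stage 1: tag spans of text as possible locations."""
    return [(m.group(0), m.start(), m.end()) for m in CANDIDATE.finditer(text)]


def resolve(name: str, gazetteer: Dict[str, Coords] = GAZETTEER) -> Optional[Coords]:
    """Stage 2: look a candidate up in the gazetteer; None if unknown."""
    return gazetteer.get(name.lower())


def geoparse(text: str) -> List[Toponym]:
    """Run both stages and keep only candidates the gazetteer can resolve."""
    results = []
    for name, start, end in tag_candidates(text):
        coords = resolve(name)
        if coords is not None:
            results.append(Toponym(name, start, end, coords))
    return results


if __name__ == "__main__":
    sample = "Protests were reported in Kabul and later in New York."
    for toponym in geoparse(sample):
        print(toponym)
```

Keeping the tagger and the resolver as separate functions means either stage could be swapped out independently, for example replacing the regular expression with a trained NER model, or the dictionary lookup with a disambiguating gazetteer query. Note that this toy resolver ignores ambiguity entirely: it cannot distinguish two places that share a name, nor a place name used as a person's name.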