eRulemaking: Text Mining Techniques for Large Public Comment Databases
Citizens and government administrators need a variety of navigation
aids and text analysis tools to help them cope with the hundreds of
thousands of emails received as a result of organized letter-writing
campaigns about proposed government regulations. These aids and tools
include duplicate- and near-duplicate detection, automatic
construction of browsing hierarchies, and tools that identify
stakeholder communities mentioned in comment texts. The underlying
technologies are primarily Information Retrieval, Text Data Mining, and
simple forms of Natural Language Processing. The US Fish and Wildlife
Service is using these tools to analyze comments submitted on the
proposed rule to list the polar bear as a threatened species and the
proposed rule to remove the gray wolf from the endangered species list.
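
To make the idea of near-duplicate detection concrete, the following is a minimal sketch and not the method used in the tools described above. It assumes comments are compared by Jaccard similarity over word shingles; the function names, the shingle size k, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of k-word shingles in text (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short texts become a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(comments, threshold=0.5, k=3):
    """Return (i, j, score) for every pair of comments scoring at or above threshold.

    The threshold and k are illustrative assumptions, not tuned values.
    """
    sets = [shingles(c, k) for c in comments]
    pairs = []
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            score = jaccard(sets[i], sets[j])
            if score >= threshold:
                pairs.append((i, j, score))
    return pairs


# Made-up example comments: a form letter, the same letter with a
# personal sign-off appended, and an unrelated comment.
comments = [
    "Please list the polar bear as a threatened species.",
    "Please list the polar bear as a threatened species. Thank you!",
    "I oppose removing the gray wolf from protection.",
]
matches = near_duplicates(comments)
```

Here only the first two comments are paired, since the second is the first with a short addition. The all-pairs loop is quadratic in the number of comments, so a collection of hundreds of thousands of emails would need a scalable approximation such as MinHash with locality-sensitive hashing.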