Code license: MIT License
Labels: python, govtrack, sunlight, pygtk, nltk
Project owners:
  tdfine

Words Vote.

Apps for America

"Words Vote." uses Sunlight Labs' "capitolwords" Python library to obtain lists of the words used by individual congresspeople during an interval of time (usually the period before a major vote). It also pulls the record of the selected roll call vote from Govtrack's XML data, then uses Bayesian statistics to determine which words are most informative in predicting a congressperson's vote.
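To make the idea of an "informative" word concrete, here is a minimal sketch in Python. It is not taken from wordsvote.py: the function name, the data layout, and the scoring rule (a smoothed log-likelihood ratio, as used in naive Bayes classifiers) are all assumptions chosen for illustration. Each speaker is reduced to the set of words they used plus their vote, and a word scores highly when it is much more likely to come from one side than the other.

```python
import math
from collections import Counter


def rank_informative_words(speakers, smoothing=1.0):
    """Rank words by how strongly they separate yea voters from nay voters.

    `speakers` is a list of (words, vote) pairs, where `vote` is "yea" or
    "nay". This layout is hypothetical, not the format wordsvote.py uses.
    """
    totals = Counter(vote for _, vote in speakers)
    counts = {"yea": Counter(), "nay": Counter()}
    for words, vote in speakers:
        # Count each word once per speaker (presence, not frequency).
        counts[vote].update(set(words))

    vocabulary = set(counts["yea"]) | set(counts["nay"])
    scores = {}
    for word in vocabulary:
        # Laplace-smoothed estimates of P(word | yea) and P(word | nay).
        p_yea = (counts["yea"][word] + smoothing) / (totals["yea"] + 2 * smoothing)
        p_nay = (counts["nay"][word] + smoothing) / (totals["nay"] + 2 * smoothing)
        # Positive scores point toward "yea", negative toward "nay".
        scores[word] = math.log(p_yea / p_nay)

    # Most informative first: largest absolute log-ratio.
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Made-up example data, for illustration only.
example_speakers = [
    ({"freedom", "tax", "jobs"}, "yea"),
    ({"freedom", "tax"}, "yea"),
    ({"workers", "jobs", "health"}, "nay"),
    ({"workers", "health"}, "nay"),
]

for word, score in rank_informative_words(example_speakers):
    print(f"{word:10s} {score:+.3f}")
```

In this toy data, "jobs" is used equally by both sides, so it scores zero and ranks last, while words used by only one side rank at the top.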

Sample Screenshot

Purpose

We live in an age closely attuned to rhetoric. Political agents, when speaking on the floors of Congress, carefully deploy language to have the maximum impact in a media age. Although we may feel that we know which words and types of language are associated with certain positions, statistical analysis can uncover associations we had not considered, and these reveal much about our political discourse.

Although linguistics researchers already have access to established corpora, the rise of syndicated web technologies is permitting machine-friendly corpora to emerge in real time and to be used by non-professionals for political or casual research. In association with other syndicated categorization systems (here, vote records maintained by Govtrack), machines can help us identify many interesting and hidden associations in political speech and elsewhere.

In this project, we deploy two such technologies. First, Sunlight Labs has developed an API and associated Python library that identifies the most commonly used words in the Congressional Record by speaker. Second, Govtrack maintains machine-readable records of Congressional voting that can be scraped and processed.
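As a rough sketch of the second step, the snippet below reads a roll call record with Python's standard xml.etree.ElementTree module and maps each voter to "yea" or "nay". The XML layout shown (a `roll` element containing `voter` elements with `id` and `vote` attributes, using "+" and "-" for the two sides) is an assumption made for illustration and should be checked against Govtrack's actual files. The helper name `parse_roll_call` is hypothetical as well.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; the element and attribute names are assumed.
SAMPLE_ROLL_XML = """
<roll where="house" session="111" roll="1">
  <voter id="400001" vote="+" />
  <voter id="400002" vote="-" />
  <voter id="400003" vote="0" />
</roll>
"""

# Assumed encoding: "+" for yea, "-" for nay; anything else is skipped.
VOTE_LABELS = {"+": "yea", "-": "nay"}


def parse_roll_call(xml_text):
    """Return a dict mapping voter id to "yea" or "nay"."""
    root = ET.fromstring(xml_text.strip())
    votes = {}
    for voter in root.iter("voter"):
        label = VOTE_LABELS.get(voter.get("vote"))
        if label is not None:  # drop abstentions and absences
            votes[voter.get("id")] = label
    return votes


print(parse_roll_call(SAMPLE_ROLL_XML))
```

Dropping voters who neither voted yea nor nay keeps the later word analysis a two-way comparison; a different design could keep them as a third category.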

Dependencies

The program is a single Python script (wordsvote.py) accompanied by a pickled file that contains a dictionary. Run it with "python wordsvote.py".

To-Do

Interesting Votes to Check








