Locating Websites

The website location methodology is designed to mirror the standard methods that interested citizens use to search for candidate information. Research assistants conducted Google searches to identify candidate campaign websites. To maintain consistency across data collection efforts, research assistants logged out of Google accounts, disabled history-based search parameters, disabled popular searches, and turned search customization off. Each search used the phrase “[Name] [State] [Chamber] [Year]”, and research assistants looked through the first 20 returns for individual candidate websites. Ballotpedia.org and any official Facebook.com pages that appeared in the first 20 Google returns were also consulted. Official state websites (.gov), broad state-party websites, and fundraising sites like ActBlue and WinRed were all excluded from the list.
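The search phrase and exclusion rules described above can be sketched in code. This is only an illustration, not the project's actual tooling: the searches were run by hand, and the function names, the `EXCLUDED_DOMAINS` set, and the specific fundraising domains are assumptions introduced here. State-party websites cannot be recognized from a domain pattern alone, so the sketch leaves them for manual review.

```python
from urllib.parse import urlparse

# Hypothetical exclusion list; the exact fundraising domains are assumptions.
EXCLUDED_DOMAINS = {"actblue.com", "winred.com"}
MAX_RESULTS = 20  # only the first 20 returns were reviewed


def build_query(name, state, chamber, year):
    """Assemble the "[Name] [State] [Chamber] [Year]" search phrase."""
    return f"{name} {state} {chamber} {year}"


def is_excluded(url):
    """Return True for official .gov sites and listed fundraising domains."""
    host = (urlparse(url).hostname or "").lower()
    if host.endswith(".gov"):
        return True
    return any(host == d or host.endswith("." + d) for d in EXCLUDED_DOMAINS)


def candidate_links(result_urls):
    """Keep non-excluded links from the first MAX_RESULTS search returns."""
    return [u for u in result_urls[:MAX_RESULTS] if not is_excluded(u)]
```

For example, `build_query("Jane Doe", "Ohio", "House", 2022)` yields the phrase `Jane Doe Ohio House 2022`, and `candidate_links` would drop a `secure.actblue.com` donation link while keeping a campaign site, a Ballotpedia entry, or a Facebook page for further review.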

Processing Website Text

Archived page text is processed at the page level, with unique identifiers for each candidate and web page. Statements are separated using the underlying HTML tags, including paragraph tags, headings, and list items. Statements can be kept separate by tag or assembled into a full candidate-year corpus file.
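A minimal sketch of tag-based statement separation follows, using only Python's standard-library `html.parser`. It is an assumption about how such a step could look, not the project's actual pipeline: the class and function names, the `candidate_id` and `page_id` field names, and the exact tag set are hypothetical. Text outside paragraph, heading, and list-item tags is ignored in this sketch.

```python
from html.parser import HTMLParser

# Tags treated as statement boundaries (assumed set: paragraphs, headings, list items).
STATEMENT_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "li"}


class StatementExtractor(HTMLParser):
    """Collect the text found inside each statement-level tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.statements = []
        self._buffer = []
        self._depth = 0  # number of currently open statement tags

    def _flush(self):
        # Join buffered text, collapse whitespace, and store non-empty results.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.statements.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in STATEMENT_TAGS:
            self._flush()
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in STATEMENT_TAGS:
            self._flush()
            self._depth = max(0, self._depth - 1)

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def split_statements(candidate_id, page_id, html):
    """Return one row per statement, keyed by candidate and page identifiers."""
    parser = StatementExtractor()
    parser.feed(html)
    parser.close()
    parser._flush()  # keep text from a trailing unclosed tag
    return [
        {"candidate_id": candidate_id, "page_id": page_id,
         "statement_index": i, "text": text}
        for i, text in enumerate(parser.statements)
    ]


def build_corpus(rows):
    """Assemble statement rows into a single corpus string."""
    return "\n".join(row["text"] for row in rows)
```

Keeping one row per statement preserves the tag-level separation, while `build_corpus` applied to all of a candidate's rows for a given year gives the assembled candidate-year text.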

Identifying Issue Statements

Throughout the project, undergraduate research assistants have identified candidate issue statements and coded them by issue arena according to the Policy Agendas Project codebook, maintained under the Comparative Agendas Project (https://www.comparativeagendas.net/). We are actively developing machine learning algorithms to accurately detect issue statements by policy arena and hope to make these data available soon.