California, particularly in its coastal cities, has a significant housing crisis.* Recently, an ambitious bill was put forward in the California State Senate that would have overridden local zoning laws to permit the construction of dense housing near frequent public transit stops. As someone who has spent the last five years trying to find places to rent in Santa Barbara on a grad student salary, I watched the bill with great interest. Unfortunately, it died in committee before any real polling could be done to see whether the broader community wanted it. I thought I'd try some sentiment analysis on tweets tagged #SB827 to gauge support for the bill. Overall, 73% of the tweets in the set seemed to be supportive of the bill, with 72% of tweets from the West Coast and 75% of tweets from elsewhere expressing support. It seems likely that there is enough broad support that a similar bill could succeed in the future.
Along the way, I used tweepy to extract data via the Twitter API, pandas and numpy to organize dataframes and arrays, textblob to handle some text processing and text classification, and a Naive Bayes classifier. I chose the Naive Bayes classifier primarily for its speed and simplicity, and because it is known for working well with categorical variables and with text classification problems in particular. The classifier reached only 73% accuracy and produced a fairly large number of false positives, so it likely overstated support for the bill (because the classifier seemed to predict too many positives by a factor of ~1.2x, true support may fall closer to around 59%). More complex techniques that use embeddings with more linguistic context (such as Word2Vec), followed by a simple classifier (perhaps just logistic regression), might have yielded higher accuracy on this dataset, but ultimately, gauging political support through tweet sentiment analysis is of limited accuracy anyway.
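To give a feel for what a Naive Bayes text classifier does, here is a minimal sketch using only the Python standard library. It is not the code behind the numbers above, and it is not textblob's implementation: the class name, the tokenizer, the smoothing value, and the handful of training "tweets" are all made up for illustration. The idea is to count how often each word appears under each label, then score a new tweet by adding the log prior of the label to the log likelihoods of its words (with add-one smoothing so unseen word/label pairs don't zero out the score).

```python
import math
from collections import Counter, defaultdict

PUNCT = ".,!?:;\"'()#"

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    words = (w.strip(PUNCT).lower() for w in text.split())
    return [w for w in words if w]

class TinyNaiveBayes:
    """Hypothetical multinomial Naive Bayes with additive (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing strength (assumed value)
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of training texts
        self.vocab = set()

    def train(self, examples):
        for text, label in examples:
            self.label_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def log_scores(self, text):
        total_docs = sum(self.label_counts.values())
        words = [w for w in tokenize(text) if w in self.vocab]  # skip unseen words
        scores = {}
        for label, n_docs in self.label_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            denom = sum(self.word_counts[label].values()) + self.alpha * len(self.vocab)
            for word in words:
                score += math.log((self.word_counts[label][word] + self.alpha) / denom)
            scores[label] = score
        return scores

    def classify(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)

# Invented toy examples, NOT real tweets from the dataset.
toy_training_set = [
    ("we need more housing near transit", "pos"),
    ("build dense housing now", "pos"),
    ("this bill would help renters", "pos"),
    ("this bill destroys neighborhoods", "neg"),
    ("stop the developer giveaway", "neg"),
    ("local control matters oppose this", "neg"),
]

clf = TinyNaiveBayes()
clf.train(toy_training_set)
print(clf.classify("more housing near transit helps renters"))
print(clf.classify("oppose this developer giveaway"))
```

With real data, the hand-labeled tweets would replace the toy list, and a held-out portion of them would be used to measure accuracy and count false positives.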
The overall results suggest that there was enough support for SB 827 to justify a more comprehensive study of the popularity of variations on these measures. It would be interesting to see the results of more rigorous polling on this topic.
*http://www.latimes.com/business/hiltzik/la-fi-hiltzik-housing-crisis-20180330-story.html