Part 0. Executive Summary
How do you know which voters to target in a close election? In a district where recent races have been close, how do you maximize your chance of winning? Which voting strategies and precincts do you target to sway an election efficiently?
A previous article, referenced below, described a mathematical methodology for winning an election. That methodology assumed a cost of obtaining votes in each precinct but did not describe how to arrive at those precinct-by-precinct costs.
This effort uses the framework previously developed but takes the next step of deriving the precinct-by-precinct costs needed to maximize the chance of winning an election. Inherent in the methodology are the use of demographics, the importance of getting out the vote, and the swaying of fence-sitters and middle-of-the-road voters.
Specific precincts are targeted to maximize the chance of winning an election, with a cost associated with gaining votes in each precinct. The campaign strategies to win these precincts are guided by the algorithms developed herein.
Part 1. Background
Strategies to target and attract voters come in all shapes and sizes. Many of these strategies are based on what worked in the past. One very general work is the NDI Campaign Skills Handbook (Ref. 1), which emphasizes that "…an electoral campaign must target its limited resources only to those voters who are most likely to support it". That work offers good qualitative methods but no quantitative ones. There has also been a sizeable body of work on "voting power", the likelihood that a given set of votes can affect the outcome of an election (Ref. 2), but it does not say where to find voters with voting power or how to target them. A big data approach is described in Reference 3, where databases of detailed information on voters are used to decide electoral strategies. To some extent, many big data approaches border on invasion of privacy, and the Reference 3 authors admit that "…the most valuable information campaigns acquire comes from the behaviors and direct responses provided by citizens themselves", as opposed to big data methods. Finally, Reference 4 developed a constrained minimization method that finds the least change in previous voting patterns that could win an election. That effort used assumed costs for obtaining additional votes in each precinct.
This approach uses publicly-available data and a constrained function minimization algorithm to determine least cost methods of trying to swing voters from one party to another. The approach provides guidance as to where to target spending (either resources or time) to win an election.
The costs associated with gaining votes in each precinct are obtained by developing a model of the precincts which relies on demographic data. Statistical models of the demographic distributions were used to derive a neural network model for each precinct that yielded the relative cost of obtaining new votes in each precinct. The neural networks can be thought of as finding “inefficiencies” in voting patterns for each precinct similar to an economic inefficiency that can be exploited.
The 36th Assembly District of the state of California provides a fun and competitive example for voter-flipping strategies. The district is located northeast of the city of Los Angeles and is a primary exurb of Los Angeles in the Antelope Valley. Every two years since 2012, the same Democratic and Republican candidates have faced each other in the general election, each having beaten back challengers within their own primaries in various years. Other pertinent data related to this race are given in Reference 4.
For the 36th Assembly District, the demographics associated with it and past voting patterns will be used here to derive a strategy to win an election that was lost in the previous voting cycle.
Part 2. Methodology
California Assembly District voting results from the March 2020 primary were obtained from Reference 5. These results contained the number of votes cast for the Democratic party candidates and the Republican party candidates by precinct. The raw votes were converted to percent Democrat and percent Republican for each precinct. To remove possible outliers due to small vote totals, any precinct with fewer than twenty total votes cast was eliminated.
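This preprocessing can be sketched in a few lines of pandas. The column names and vote counts below are illustrative only, not the actual Reference 5 schema:

```python
import pandas as pd

# Hypothetical precinct vote totals; column names are illustrative,
# not the actual Reference 5 schema.
raw = pd.DataFrame({
    "precinct": ["A", "B", "C"],
    "dem_votes": [120, 8, 300],
    "rep_votes": [80, 5, 200],
})

raw["total"] = raw["dem_votes"] + raw["rep_votes"]
# Remove possible outlier precincts with fewer than twenty total votes.
clean = raw[raw["total"] >= 20].copy()

# Convert raw votes to percent Democrat / percent Republican.
clean["pct_dem"] = 100.0 * clean["dem_votes"] / clean["total"]
clean["pct_rep"] = 100.0 * clean["rep_votes"] / clean["total"]
```

Precinct "B" (13 total votes) is dropped, and the survivors carry percent-vote columns for the modeling steps that follow.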
Demographic data for the various precincts were obtained from Reference 6 at the census tract level. These results were parsed down to the voting precinct level using polygon overlap area methods in Python.
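The area-weighting idea behind the polygon overlap can be illustrated with a toy example. Real tract and precinct boundaries are irregular polygons (typically handled with a geometry library such as shapely), but axis-aligned rectangles keep the sketch self-contained:

```python
def overlap_area(a, b):
    """Intersection area of two axis-aligned rectangles (xmin, ymin, xmax, ymax).
    Real tract/precinct boundaries are irregular polygons, but the
    area-weighting logic is identical."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0.0) * max(h, 0.0)

# One census tract straddling two precincts (toy geometry).
tract = (0.0, 0.0, 2.0, 1.0)
precincts = {"P1": (0.0, 0.0, 1.0, 1.0), "P2": (1.0, 0.0, 2.0, 1.0)}

tract_pop = 1000  # hypothetical tract-level demographic count
tract_area = (tract[2] - tract[0]) * (tract[3] - tract[1])

# Apportion the tract count to each precinct by overlap fraction.
alloc = {p: tract_pop * overlap_area(tract, r) / tract_area
         for p, r in precincts.items()}
```

Each precinct receives the tract's demographic count in proportion to the share of the tract's area it covers, and the allocations sum back to the tract total.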
A typical demographic distribution across California Assembly District 36 precincts is shown in Figure 1. As is the case for many of the demographics, a half-normal distribution (as shown in green) is more representative than a typical normal distribution (shown in blue). Other distributions, among the dozen or so available, could be used if they fit the data better.
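Choosing between candidate distributions can be done by comparing log-likelihoods of the fits, as in this sketch using scipy.stats with synthetic data standing in for one demographic indicator:

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for one demographic indicator across precincts,
# drawn from a half-normal distribution for illustration.
data = stats.halfnorm.rvs(scale=10.0, size=500, random_state=0)

# Fit both candidate distributions by maximum likelihood.
hn_loc, hn_scale = stats.halfnorm.fit(data, floc=0)
n_loc, n_scale = stats.norm.fit(data)

# Compare total log-likelihoods; the better-fitting distribution scores higher.
ll_halfnorm = np.sum(stats.halfnorm.logpdf(data, hn_loc, hn_scale))
ll_normal = np.sum(stats.norm.logpdf(data, n_loc, n_scale))
```

The same comparison extends to the "dozen or so" other candidate distributions scipy.stats provides; the one with the highest likelihood (or best information criterion) represents the data best.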
Added to the pure demographics of a precinct was the single pseudo-demographic of the number of nonvoters in the previous election.
In total, eighteen demographic indicators and one pseudo-demographic indicator were used to derive a voting model for each precinct in the 36th Assembly District.
Initially a linear multivariable curve fit was applied to the demographic data and matched with the percent of voters voting Democrat in the last election. This linear model did not provide results that could predict voting trends adequately.
Thus a model using neural networks was applied to the data. Inputs to the model were the demographic data for each precinct, and the output was the percent Democrat vote in each precinct in the last election. The data was split into 60% training data and 40% testing data, and the resulting neural network predictions of Democrat vote were compared with the actual Democrat vote. Figure 2 shows the resulting match, with training data in red and test data in blue. A perfect neural predictor would lie on the green line, and the desired error bounds are shown in magenta. There is no visible way to discriminate between the training and testing data, indicating the neural network is a suitable predictor of voting behavior. There are certainly some outliers that could be examined and improved upon case by case, but these results are good enough to move forward. The results used here were obtained with four hidden layers of 8, 24, 24, and 16 neurons, after experimenting with other architectures and activation functions.
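The article does not specify the software used; the sketch below shows an equivalent setup with scikit-learn's MLPRegressor, using synthetic data standing in for the 19 precinct indicators, the 60/40 split, and the four hidden layers sized as described:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in: 19 indicators (18 demographic + 1 pseudo) per precinct.
X = rng.normal(size=(200, 19))
# Hypothetical relationship to percent-Democrat vote, for illustration only.
y = 50.0 + 8.0 * X[:, 0] + 4.0 * X[:, 1] + rng.normal(scale=1.0, size=200)

# 60% training / 40% testing split, as in the article.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.6, random_state=0)

model = make_pipeline(
    StandardScaler(),  # demographic indicators arrive on very different scales
    MLPRegressor(hidden_layer_sizes=(8, 24, 24, 16),
                 max_iter=5000, random_state=0),
)
model.fit(X_tr, y_tr)
r2 = model.score(X_te, y_te)  # R^2 on the held-out precincts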
As the demographics within precincts change, the model can be retrained and the voting strategy updated; the neural network model of precinct demographics and voting tendencies directly supports such updates. Keeping demographic data and voting tendencies current is work a well-informed campaign must do anyway to know its district, and the neural network model makes those updates much easier.
The calculation of which precincts to target in a campaign strategy was posed as the constrained minimization problem (Ref. 4): minimize the total campaign cost

J = Σi bi (Vi − Voi)²

subject to the inequality constraints

TDEM ≥ TREP and Voi ≤ Vi ≤ 100,

where Vi is the derived percent Democrat vote in precinct i, Voi is the baseline percent Democrat vote for each precinct from the primary election, TDEM and TREP are the total calculated Democrat and Republican votes, and bi is a "cost" factor associated with each precinct. The cost factors bi, derived here via the neural networks as described above, are the subject of this specific effort.
During the optimization process, as a precinct's Democrat vote percentage increases, the corresponding votes are added to the Democrat total and subtracted from the Republican total, maintaining the same total number of votes in each precinct.
The constrained minimization algorithm to determine optimal vote acquisition strategies was implemented in Python and run for various scenarios as described below. Note that the cost factors bi were fixed for the various runs as derived a priori from the neural network process.
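A minimal version of this constrained minimization can be set up with scipy.optimize. The baselines, turnouts, and cost factors below are toy values (in the article the bi come from the neural network model), and a weighted least-change quadratic objective is assumed, following the formulation sketched above:

```python
import numpy as np
from scipy.optimize import minimize

# Toy inputs: baseline percent-Democrat vote V0, turnout n, and
# per-precinct cost factors b (assumed here; in the article these
# come from the neural network model).
V0 = np.array([40.0, 45.0, 48.0, 55.0])
n = np.array([1000.0, 800.0, 1200.0, 900.0])
b = np.array([1.0, 0.5, 2.0, 1.5])

def cost(V):
    # Weighted least-change objective: penalize departures from baseline.
    return np.sum(b * (V - V0) ** 2)

def margin(V):
    # T_DEM - T_REP; gained votes shift from the Republican column to the
    # Democratic one, so total turnout in each precinct is unchanged.
    t_dem = np.sum(n * V / 100.0)
    t_rep = np.sum(n * (100.0 - V) / 100.0)
    return t_dem - t_rep

res = minimize(
    cost, V0, method="SLSQP",
    constraints=[{"type": "ineq", "fun": margin}],  # require T_DEM >= T_REP
    bounds=[(v, 100.0) for v in V0],                # vote share only increases
)
```

With non-uniform bi, the optimizer concentrates its gains in the cheapest precincts, which is the behavior reported for the Figure 4 run below.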
Part 3. Results
Two voting strategy scenarios were run to demonstrate the value of the neural network model of demographics. The first run was a baseline run that used a uniform cost penalty across all precincts. That is, no demographic modeling or individual precinct costs were used. The second run was made using the method described above with the neural network model of demographics and a distinct cost of votes for each and every precinct.
Note that in the 2020 primary election there was an 8% gap between the Republican vote at 54% and the Democrat vote at 46%. Thus, vote swings on the order of 8% are required to win the general election.
The first election strategy was derived using the cost minimization procedure with all bi set to unity, indicating equal difficulty in changing the vote in any precinct. This is a reasonable starting point if one has no a priori knowledge of the costs of campaigning in each precinct. Figure 3 shows the results of the cost minimization process for a uniform precinct penalty. A large number of precincts show an 8% increase in Democrat voting, along with other precincts with greater or lesser increases. This semi-uniform 8% increase is to be expected with a uniform penalty function across all precincts.
A second election strategy was derived using the constrained minimization procedure and the neural network derived bi penalties for each precinct. The results of optimization runs made with the neural network model of demographics penalty are shown in Figure 4. In this case the semi-uniform 8% increase in precinct Democrat vote is not seen since the penalties are not uniform. Instead the optimization process picks out least cost voting precincts to gain votes and win the election.
The voting strategy using the neural network model of demographics yielded an overall campaign cost roughly half that of the uniform penalty method. The optimizer concentrates its efforts where additional votes cost the least to campaign for and obtain. This underscores the importance of knowing the precincts in your race and taking advantage of "inefficiencies" in the process.
Figures 5 and 6 show where the actual votes were gained by precinct for the uniform penalty and neural network penalty runs, respectively.
Figure 7 shows the geographic distribution of precincts with vote percent increases color coded from the neural network penalty run. The blue precincts are where the vote was increased 7% or more during the optimization process, the gold between 3% and 7%, and the green less than 3%.
The distribution of votes gained shows that the precincts to be targeted are similar geographically as well as in the neural network model of their demographics.
Part 4. Commentary
Elections are won by picking up many votes where you typically run strong, swinging fence-sitting voters, and trying to do "less bad" where you run weak. Until now, decisions on how to obtain votes in these three categories were based on experience from previous elections. This effort derived a mathematical method for targeting precincts and voters to obtain votes while minimizing costs. The cost of obtaining votes in each precinct was derived from a neural network model of precinct demographics. Results from the method can guide campaign strategies to help win elections.
It is important to examine the demographics of each precinct, alongside the mathematical results described above, to develop a campaign strategy that fits the precinct. Gathering current demographic data, past demographics and voting trends, and applying the mathematics to derive a target list and campaign strategy is how elections are won.
Knowing your district, having the data to make decisions, and using strategies that exploit inefficiencies in voting trends can save a lot of money otherwise "thrown away" on bad campaign strategies.
1. “Campaign Skills Handbook”, O’Connell, S., Smoot, S., and Khalel, S.A., National Democratic Institute, https://www.ndi.org/sites/default/files/Campaign%20Skills%20Handbook_EN.pdf.
2. “The Mathematics and Statistics of Voting Power”, Gelman, A., Katz, J.N., and Tuerlinckx, F., Statistical Science, Vol 17, No 4, 2002, pp. 420–435.
3. “Political Campaigns and Big Data”, Nickerson, D.W. and Rogers, T., Harvard Kennedy Research Working Paper Series, RWP13–045, 2013.
4. “Algorithmically Targeting Voting Precincts to Win an Election, Pt 1”, Manning, R. A., Medium, Jan 2021, https://ray-90807.medium.com/algorithmically-targeting-voting-precincts-to-win-an-election-part-1-d9adc708b9e3.
5. “The Redistricting Database for the State of California”, University of California, Berkeley, https://statewidedatabase.org/.