Saturday, November 7, 2020

Evidence Suggesting Voter Fraud in Milwaukee

I posted a version of this on Twitter, but a) the writing format there is so ugly, and b) who knows how long that thread might last. So here it is for the record.

I’ve been looking at the vote counts within Milwaukee, and there are suspicious patterns in the data that need explaining. Proving fraud is difficult, but there are a lot of irregularities here that point in that direction. First, the tl;dr, then the main analysis.

1. Democrat votes started increasing massively relative to Republicans after Tuesday night counts. This can’t be accounted for by explanations like heavily Democratic wards reporting later. When we look at the changes *within wards*, 96.6% of them favored the Democrats.

2. Democrats also improved massively against third party candidates, whereas Republicans and third party candidates showed similar changes to each other. Since there’s little incentive to manipulate third party counts, this implies that the big change after Tuesday night is in Democrat votes, not in Republican ones.  

3. When we compare different down ballot races, we find that Democrat increases within each ward were larger in races where the Democrat candidate was initially behind in the overall race on Tuesday night – that is, relatively more Democrat votes appeared in races where they were more likely to alter the outcome.

4. This result is easy to explain by fraud, but is much more complicated to explain by other explanations like Democrats mostly voting by mail. Most such theories predict all Democrat candidates should benefit in equal proportions within a ward, not that more votes come in exactly where they’re most needed.  

Ward-level vote counts are from the Milwaukee County Clerk as of 7pm last night, and from the archived version of the count as it stood on election night.

This idea came from Spotted Toad, who’s been doing great work on this too. I’m looking at Presidential, Congress, State Senate and Assembly races. One way to look at what happened is to compare the percentage increase in votes for Republican candidates versus Democrat candidates within each ward after election night.

For instance, suppose the Democrat candidate vote total went up 200% from initial counting to Thursday night. How much did the Republican vote total go up? If the distribution of votes before and after is the same, the percentage gains for each group should be similar, regardless of who was ahead.

This is very different from the usual reason that candidate totals across the entire state change as counting goes on, which is that different reports come in from other parts of the city or state. That just shows that wards differ from each other. Rather, we’re testing whether the *same ward* continues to show the same distribution of votes before and after Tuesday night.

In other words, if the before and after distributions were the same, as votes come from the same pool, you’d expect that half the time, the Republicans got a slightly unlucky draw in the early votes, and end up improving their position (regardless of whether they ultimately win or lose). And roughly half the time, the Democrats should increase their votes by more. 
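To make the "same pool" logic concrete, here is a minimal simulation sketch. It is not part of the original analysis, and the ward size, party split and early-count cutoff are hypothetical numbers chosen only for illustration. Ballots from a single ward are shuffled into a random counting order, split into an early and a late batch, and we record which party's total grows by the larger percentage.

```python
import random

def share_where_d_grows_more(trials=2000, d_total=300, r_total=200,
                             early=250, seed=0):
    """Fraction of non-tied trials in which D's percentage growth beats R's.

    All default numbers are hypothetical; ballots are counted in random order.
    """
    rng = random.Random(seed)
    ballots = ["D"] * d_total + ["R"] * r_total
    d_wins = r_wins = 0
    for _ in range(trials):
        rng.shuffle(ballots)
        d_early = ballots[:early].count("D")
        r_early = early - d_early
        # D growth > R growth  <=>  d_total / d_early > r_total / r_early.
        # Cross-multiplying avoids any division.
        lhs = d_total * r_early
        rhs = r_total * d_early
        if lhs > rhs:
            d_wins += 1
        elif lhs < rhs:
            r_wins += 1
        # equal values are ties and are left out of the fraction
    return d_wins / (d_wins + r_wins)

fraction = share_where_d_grows_more()
print(f"D grows more in {fraction:.1%} of non-tied trials")
```

Under random ordering the fraction should land near one half. The sketch says nothing about what happens when the early and late batches are different kinds of ballots, which is the case discussed further down.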

What actually happens? The Democrat candidate vote increases relative to the Republican candidate a crazy fraction of the time. The variable in question is percentage increase in Democrat vote totals for that ward (that is, the percentage change from Tuesday night to Thursday night), minus percentage increase in Republican vote totals. 

So a value above zero means that Democrat totals went up more than Republicans in that ward/race. A value of 500 means that the Democrats went up 500% in excess of the Republicans (e.g. D votes grew 600%, R votes grew 100%). Here’s the histogram.
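For anyone who wants to reproduce the variable, here is a small sketch of the calculation as described above. The function names are my own labels for illustration, and the example counts are hypothetical, picked to match the 600% versus 100% case in the text.

```python
def pct_increase(early, final):
    """Percentage change from the early count to the final count."""
    if early <= 0:
        raise ValueError("need a positive early count to compute a percentage")
    return 100.0 * (final - early) / early

def excess_growth(d_early, d_final, r_early, r_final):
    """Democrat percentage increase minus Republican percentage increase."""
    return pct_increase(d_early, d_final) - pct_increase(r_early, r_final)

# Hypothetical ward/race: D goes 100 -> 700 (+600%), R goes 100 -> 200 (+100%)
print(excess_growth(100, 700, 100, 200))  # 500.0
```

Ward/race combinations with a zero early count for either party have no defined percentage, which is why the sketch raises an error there, in line with restricting the sample to non-missing early votes.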

You see an enormously right skewed distribution: tons of large gains for Democrats, very few gains for Republicans. Not only do Democrats very often increase more than Republicans, but when they do, it’s often by a colossal amount.

Out of the 1217 ward/race combinations with non-missing early votes for both parties, 1037 saw relative increases for the Democrats, 37 saw relative increases for Republicans, and 143 were ties. Excluding the ties, the D “win” fraction here is 96.6%.  A remarkable feat!

Depending on how you assign ties, if this were a 50/50 coin (i.e. D and R were equally likely to gain relative to the other), the probability or p-value for this is between 10^-147 and a number Excel just lists as “0”.
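Numbers this small are easier to handle in log space than in a spreadsheet. The sketch below is one way to compute a one-sided coin-flip (sign test) probability from the counts above. Treating it as a one-sided upper tail is my assumption, since the exact calculation isn't spelled out here, and the function name is just an illustrative label.

```python
import math

def log10_upper_tail(k, n):
    """log10 of P(X >= k) for X ~ Binomial(n, 1/2), computed in log space."""
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        - n * math.log(2)
        for i in range(k, n + 1)
    ]
    biggest = max(log_terms)
    # log-sum-exp: factor out the largest term so nothing underflows to zero
    total = biggest + math.log(sum(math.exp(t - biggest) for t in log_terms))
    return total / math.log(10)

d_wins, r_wins, ties = 1037, 37, 143   # counts quoted in the text
n_all = d_wins + r_wins + ties

ties_to_r = log10_upper_tail(d_wins, n_all)            # least extreme case
ties_excluded = log10_upper_tail(d_wins, d_wins + r_wins)
ties_to_d = log10_upper_tail(d_wins + ties, n_all)     # most extreme case

for label, value in [("ties to R", ties_to_r),
                     ("ties excluded", ties_excluded),
                     ("ties to D", ties_to_d)]:
    print(f"{label}: p is about 10^{value:.1f}")
```

Working with the logarithm of each term means the result is returned as an exponent, so there is no risk of the answer being displayed as a flat zero.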

So, this proves incontrovertibly that something about the count skews crazily towards the Democrats after 2am Wednesday. But it doesn’t prove what it is. Maybe they counted different types of ballots or something, but only starting at 4am. 

However, there’s one thing we can test – which party’s votes is the weirdness coming from? We can answer this by looking at vote changes for other candidates – third party races, write-in candidates etc.

We can be virtually certain that nobody is bothering to manipulate the vote totals for fringe, no-hope write-in candidates. These form a great placebo group – what might you expect the changes to look like for a group where nobody is manipulating the totals?

So let’s do the same thing as the earlier graph, but compare each party with “Miscellaneous”, which I aggregate together because the counts are small. I also limit the sample here to cases where there are at least 5 votes for “Misc” in that ward by 2am Wednesday, to make sure that this isn’t coming from rounding (e.g. if you have only 1 vote, the minimum increase is 100%).

What are we predicting to find? Well, if it’s the Democrat total that’s being wildly inflated, Democrats should also be increasing relative to Miscellaneous. Meanwhile, if Republicans are just being counted as normal, then their changes should look similar to the Miscellaneous Group.

And that’s basically what we find. First, Democrats vs Miscellaneous. Visually, the picture looks even more crazily skewed than the previous one. In terms of counts, Democrats improve relative to Miscellaneous in 520 ward/race observations. They tie 89 times, and Miscellaneous improves in relative terms just 3 times. That’s not a typo.


This corresponds to p-values between 10^-73 and 10^-177. The fraction of Democratic “wins” here (520/523), excluding ties, is a ludicrous 99.4%. 

So how do Republicans compare with Miscellaneous? It turns out that while they’re not exactly the same, they’re far, far more similar to each other than either is to the Democrats. Other than a few outliers (because “Miscellaneous” has very few votes in total, remember), the distribution is fairly symmetric around zero.


In terms of counts, Republicans improve relative to Miscellaneous 179 times, Miscellaneous improves 251 times, and there are 74 ties. As a result, which p-value you get here depends enormously on how you allocate the ties. Give them to M, and it’s 10^-11. Give them to R, and it’s 0.55, or almost exactly chance (253 vs 251). 
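The sensitivity to ties is easy to check directly. Below is a minimal sketch using exact integer arithmetic. I'm assuming the quoted figures are lower-tail probabilities, P(R wins at most this many) under a fair coin, which is my reading rather than something stated above.

```python
from math import comb

def lower_tail(k, n):
    """Exact P(X <= k) for X ~ Binomial(n, 1/2)."""
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

r_wins, m_wins, ties = 179, 251, 74   # counts quoted in the text
n = r_wins + m_wins + ties

p_ties_to_m = lower_tail(r_wins, n)          # R keeps only its 179 outright wins
p_ties_to_r = lower_tail(r_wins + ties, n)   # R is credited with all 74 ties

print(f"ties to M: {p_ties_to_m:.2e}")
print(f"ties to R: {p_ties_to_r:.2f}")
```

The same 74 observations move the answer from vanishingly small to roughly a coin flip, which is why the tie-handling choice matters so much for this comparison and so little for the Democrat comparisons.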

Excluding ties, the R “win” percentage is 41.6%. So under some measures, they look slightly worse, but this ends up being affected by questions of rounding and the small vote totals for M. What’s incontrovertible is that D looks wildly, wildly different from either of them.

This is exactly what the null would predict, if votes before look like votes after. So this *does* roughly hold, but only when comparing Republicans vs Miscellaneous. This story is also inconsistent with the driver being something Trump did, like telling all his supporters to vote in-person. If so, why do changes in Miscellaneous votes look about the same? The important difference after Tuesday night, whatever you think it is, is coming on the Democrat side.

So maybe you’re wondering – are there reasons other than fraud that the ballots might be different before and after? If the ordering is random and they’re drawn from the same pool, no. But if each ward counts different types in a different order (those at 9am versus 4pm, or in-person versus mail-in), then this could happen. 

Whatever is making the vote distributions different before and after, it’s a factor that’s overwhelmingly just impacting Democrats, not Republicans. If you think it’s about in-person versus postal voting, you have to hypothesize that Republicans look kind of similar to Miscellaneous in this respect. This is possible, but not nearly as obvious. 

But there’s another more important aspect we can test here. In particular, if some of these Democrat increases are due to fraud, we would expect that the increases should be larger *when the fraud is more likely to impact the race*. And since these include lots of down-ballot races like State Assembly Representatives, we have quite a lot of variation here.

Sometimes the Democrat is way up after early counting, at which point it doesn’t matter much if they post big relative gains after that. But if the Democratic candidate is down early on, jacking up the total becomes much more important. I’m assuming that if the Party wants to rig votes, they’d also like to win as many races as possible for the least amount of rigging.

In other words, the comparison is now between two different races at the same ward. A Democrat voter comes to the ballot box or mailbox, and sees a number of races. For some, like President, it’s going to be a close call. For others, it might be a heavy favorite for the Democrat. 

The voter is a Democrat, so presumably he’s inclined to vote Democrat for both. We can compare within a given ward which of the two races showed bigger improvement for the Democrats in that particular ward after Tuesday night. 

Sure enough, the increase in Democrats relative to Republicans (the variable in our first histogram) is significantly higher when the Democratic race-wide vote share is lower during the early counting. In other words, within each ward, late vote counts break more heavily to Democrat in exactly those races where the change in votes is likely to affect the result.
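One common way to run a "within each ward" comparison like this is to subtract each ward's own averages before fitting a line, so only differences between races in the same ward drive the slope. The sketch below is an illustration of that technique under my own assumptions, not a record of the exact specification used here, and the rows at the bottom are made-up numbers purely to exercise the function. They are not the Milwaukee data.

```python
from collections import defaultdict

def within_ward_slope(rows):
    """OLS slope of excess growth on early race-wide D share, after
    demeaning both variables within each ward (ward fixed effects).

    rows: iterable of (ward, early_race_wide_d_share, excess_growth)
    """
    by_ward = defaultdict(list)
    for ward, share, excess in rows:
        by_ward[ward].append((share, excess))

    sxy = sxx = 0.0
    for obs in by_ward.values():
        mean_x = sum(x for x, _ in obs) / len(obs)
        mean_y = sum(y for _, y in obs) / len(obs)
        for x, y in obs:
            sxy += (x - mean_x) * (y - mean_y)
            sxx += (x - mean_x) ** 2
    if sxx == 0:
        raise ValueError("no within-ward variation in the race-wide share")
    return sxy / sxx

# HYPOTHETICAL toy rows, for illustration only
toy_rows = [
    ("ward A", 0.40, 120.0), ("ward A", 0.60, 80.0),
    ("ward B", 0.45, 300.0), ("ward B", 0.70, 250.0),
]
slope = within_ward_slope(toy_rows)
print(f"within-ward slope: {slope:.1f}")
```

A negative slope means that, inside the same ward, races where the Democrat was further behind early show larger Democrat excess growth. Judging significance would need standard errors on top of this, which the sketch leaves out.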


How big is this effect? Well, one way to measure it is to see how many races it impacted. There were 8 races where Republicans were ahead on a two-party basis on Wednesday morning. By Thursday night, half of them had flipped to Democratic. By contrast, there were 19 races where the Democrat was ahead, and not a single one flipped to the Republicans. 

And again, let’s recall what we’re observing here. It’s not that the races flipped because suddenly wards that were known to be heavy Democrat strongholds started reporting in. Rather, more votes started coming in for Democrats relative to the ratio that was coming in for that exact same ward the previous night. Moreover, within each ward, the votes also skewed more for races that the Democrats looked like they might lose. 

Importantly, this finding is surprisingly hard to explain with the commonly cited reasons for Democrats pulling ahead overall. For instance, one of the claims is that mail-in ballots are counted late, and these are more heavily Democrat. In general, this doesn’t explain why within the same ward, some races later skew Democrat more than others.

The key part is that for each voter, the decision to take a mail-in ballot is common to all races. In other words, a single voter can’t vote for some races by mail, and others in person. So if your claim is that the overall skew to Democrats is a mail ballot effect, most versions of this explanation predict that all races should be equally affected.

To simplify the logic, consider a stylized example where all Democrats and Republicans vote straight ticket. More Democrats vote by mail, and these are counted late.  This would predict that Democrats overall would improve, but the expected improvement is the same for all races, regardless of whether the Democrat is ahead or behind. 

More ballots come in Democratic, they each vote for every Democrat, so all Democrats increase in the same percentage terms. This isn’t what we find. In the data, within a ward, the important races go up more than the unimportant races.
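The stylized straight-ticket case can be written out in a few lines. The ballot counts and race names below are hypothetical and only there to show the mechanics: one ward, in-person ballots counted early, mail ballots counted late, and every ballot voting its own party in every race.

```python
def growth_by_race(in_person, mail, races):
    """Percentage growth per party and race in a single ward, assuming
    every ballot votes straight ticket (a deliberately stylized assumption).

    in_person, mail: dicts mapping party -> number of ballots
    """
    growth = {}
    for race in races:
        # Straight ticket: a party's early count in any race equals its
        # in-person ballots, and its late additions equal its mail ballots.
        growth[race] = {
            party: 100.0 * mail[party] / in_person[party]
            for party in in_person
        }
    return growth

# HYPOTHETICAL ward: Democrats lean mail, Republicans lean in-person
in_person = {"D": 30, "R": 70}
mail = {"D": 70, "R": 30}
races = ["President", "Congress", "Assembly"]

result = growth_by_race(in_person, mail, races)
for race, by_party in result.items():
    print(race, {party: round(pct, 1) for party, pct in by_party.items()})
```

Democrats grow by more than Republicans in every race, but by the same percentage in each one, which is the prediction being tested against the data.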

And this prediction, that all races should be equally affected, holds for a lot of other variations too. Does the answer change if every Democrat voter has a 90% chance of voting for each Democratic candidate, if this attitude is the same between Democrats who vote in-person versus those who vote by mail? No. The increase should be the same in all races.

The answer doesn’t even change if Democrat voters in general can’t be bothered as much voting for shoo-in candidates, and only cast their votes for tight races. As long as this instinct is the same in Democrats who vote by mail and those who vote in person, there should be no difference across races in how much they break late towards Democrats.

What you need is something complicated. Democrat voters can’t be bothered voting for candidates they like but who they know are going to win anyway, AND this instinct is somehow larger in Democrat voters who vote by mail than those who vote in person, AND there has to be a larger share of mail voting by Democrats overall. 

This may sound like a confusing and complicated explanation. And it is! That’s kind of the point. We’re now a long way from the simple explanation that Democrats vote more by mail. It’s not impossible, of course, and we can’t rule it out. There are other variants on this story, but if you think this is all about mail-in ballots, there has to be some difference *within Democrat voters* who vote by mail versus in person.

In other words, the bare fact is that races swung much more towards Democrats exactly for those races where the Democrats were down on Wednesday early morning. To explain this with mail-in ballots needs a very complicated story. To explain it with fraud needs a very simple story – you commit fraud more where the fraud matters more. 

This is why the evidence suggests fraud to me, but your mileage may vary here. I’ve tried very much to stick to the facts, because I don’t have any special ability to interpret the numbers above. Whatever is going on is crying out for explanation, and the simple alternatives don’t do it. To me, it looks pretty suspicious.

A final question worth pondering. What should our null hypothesis be here? When we say “there’s no evidence of it”, we’re claiming “no fraud” as the null hypothesis. But as I’ve argued (by metaphor), the system of vote counting is so rickety and broken that this is an incredibly difficult null to justify. 

A metaphor for the likelihood of voter fraud, for people who insist that it's a conspiracy theory, or there's no evidence of it.

Suppose Amazon wanted to know how many packages it had. Packages were kept in warehouses all over the country. The system was different in every warehouse.

Some people need to move packages around, and there's a list of who is allowed to do that in each warehouse. But if you go in and say you're that person, nobody checks. If someone else has already done that for you when you arrive, you just get another package.

Some packages get driven around by people in their own cars, some get moved around by the post office, some by volunteers or low paid government employees, and in each case they're largely unmonitored - there's no clear record of which ones left or arrived.

Packages are, by common consent, valuable for people to take. But nobody investigates closely what happens in each place, and very rarely are package thieves caught.

For what package system other than "votes" would this be considered a reliable and acceptable system?

For what important corporate outcome, if you proposed this setup as a manager, would you not be fired?

If someone told you there was no evidence of package fraud, how plausible would that claim be?

I find the possibility of voter fraud entirely plausible, and that belief has nothing to do with which party you think is doing it. At a minimum, I feel strongly that this possibility needs to be investigated more seriously than it is, given the evidence above.

17 comments:

  1. Great work. Suggest that you post your raw data so that people can more easily expand on your analysis.

  2. Very nice write up. I am a young student of comp sci/info sci and I would love to work with your dataset. I'll check back here!

  3. Sound logic. The problem: is it really fraud when it's completely legal?

    State officials put a moratorium on all the rules that would have prevented this, under the guise that those rules would suppress voting-by-mail, and that would be unfair during a pandemic.

    As a result, Dem operatives were allowed to go ballot harvesting as early as September, and in Georgia they continued even AFTER election day. All completely legal.

    It's probably too late to do anything to change the results now. But the rules for your elections should definitely be standardized, because this is some banana republic bullshit...

  4. Hey, can you showcase the formulas you did to bring about these calculations and such?

  5. "For instance, suppose the Democrat candidate vote total went up 200% from initial counting to Thursday night. How much did the Republican vote total go up? If the distribution of votes before and after is the same, the percentage gains for each group should be similar, regardless of who was ahead."

    Why would we expect distribution to be the same if we know that Dems vote disproportionately by mail, and mail in votes are counted last?

    i.e 70% of Republicans vote during the day, 30% by mail. 30% Dems vote during day, 70% by mail. The 'election night' vote therefore contains a greater % of the total R vote, whereas Dem vote contains a lower % of their total vote. Then mail-in ballots are counted and Dems receive a disproportionate amount of that vote. Not only is a higher % of their total electorate voting by mail relative to on election day, but the inverse is true for Republicans.

    Am I missing something here?

    Replies
    1. "Sure enough, the increase in Democrats relative to Republicans (the variable in our first histogram) is significantly higher when the Democratic race-wide vote share is lower during the early counting. In other words, within each ward, late vote counts break more heavily to Democrat in exactly those races where the change in votes is likely to affect the result."

      This makes the assumption that all sub-demographics of the Dem vote, vote at the same election day/mail in proportions. Let's say this isn't true, this can be explained simply by the demographics of a particular ward.

      For example say white Dems vote 50% election day 50% mail in, whereas black dems vote 20% election day and 80% mail in. Therefore if we assume that mail ins are counted after election day votes, the wards that are disproportionately black are going to have a higher % of the total vote come in by mail in - and if this is the case, then it WOULD be the case that these would be the wards that the Dems were lagging in most, as the majority demographic of that ward are huge mail in voters.

      So this gives the illusion of "dem vote increases disproportionately in wards in which they were losing" - because they only had the appearance of losing due to that ward being populated by a demographic that votes mail-in at a higher % than other demographics

      So would this not be consistent with a sleazy, yet ultimately legal, "get out the vote" mail in ballot harvesting scheme in areas run by people like Stacey Abrams and other Dem grifters. i.e A huge team of volunteers going to ghetto apartment blocks and signing up loads of blacks to get a mail in vote, then pro-actively following up with them and making sure they vote?

    2. Yes, you're missing two things:
      1. There are more than two candidates/parties. Why ONLY Democrats vote more by mail? If the main argument is "Trump told his supporters to not vote through mail", why did other voters with other affiliations listen to him too?
      2. More Democrat vote in mail overnight only for those counties where Trump and Biden were in close race. Naturally, you would expect around the same ratio of Democrats voting through mail in all counties...

    3. "Why ONLY Democrats vote more by mail? If the main argument is "Trump told his supporters to not vote through mail", why did other voters with other affiliations listen to him too?"

      The idea that independents listened to Trump is not realistic, I think it is probably more likely that Dem voters are the most COVID-aware demographic and therefore they were the ones who decided to vote by mail more, relative to other groups who are less afraid of COVID. They were also encouraged to vote by mail by Democrats.

      "Naturally, you would expect around the same ratio of Democrats voting through mail in all counties..."

      Why would we expect this? "Democrats" is one large amalgam that is composed of many other groups- whites, blacks, women, etc. And wards have varying demographics also. If certain Dem groups are more/less likely to vote by mail relative to other Dem groups, and if those groups have a disproportionate population % in specific wards, then this could create that effect

  6. Where do absentee Military ballots come into play? Some were not counted or counted last?

  7. you're a fucking idiot to compare votes to a coin flip

  8. I’ve just spent some time researching the logistics of the 2020 vote in Milwaukee. There’s a lot of info about it online.

    It seems that a Milwaukee resident could vote in one of two ways:

    1. In person, on election day, at their local ward.
    2. Via absentee ballot.

    If they chose Option 2 and voted via absentee ballot, they could:

    A) Mail the ballot back in the return envelope. All of the mailed-in absentee ballots were addressed to one central location in the city.
    B) Drop off the ballot at one of about fourteen drop-off locations during in-person early voting or on election day itself.

    After voting was finished on election day, the votes were counted and then reported to Milwaukee County, which added them to the totals displayed on the county website as they were reported in.

    The Option 1 votes, i.e. the in-person, local ward votes, were counted locally and the totals were sent directly to the county. Since there were many small wards, this caused a continuous, steady increase in the vote totals for some time after the polls closed. After a while, most of the wards had reported in and the vote totals were no longer increasing much (i.e. “the counting stopped”).

    All of Milwaukee’s Option 2 votes were gathered at a single Central Count Facility. The total included all of the mailed ballots as well as those left at the drop-off facilities during early voting and on election day.

    Under Wisconsin law, the counting of absentee ballots couldn’t begin until 7AM on election day. This posed a logistical issue for this year’s election, which featured a huge increase in absentee voting. Milwaukee received about 170,000 absentee ballots overall. Based on the speed at which the machines could operate, the Elections Commission Director predicted (in an article on November 1st) that the count would take until at least 3AM or 4AM on election night. They then planned to take the results from the machines to the county on hard drives under a police escort, and all of the city’s absentee votes would be added to the county website total in one single update.

    The Central Count Facility was a pretty big operation with hundreds of people working in shifts, and the whole scene was livestreamed and watched by reporters. It did take them from 7AM on election day until 3AM the next morning to count everything, and then the counting machine hard drives were brought under police escort, recorded by news cameras, to the county, where the votes were added to the in-person votes in one big chunk at 4AM or so.

    The county website reports that about 247,700 votes were cast in Milwaukee. If the newspaper’s report of 170,000 absentee votes is accurate, then this single chunk made up nearly 69% of all of the city’s votes. Based on the way the numbers add up, it appears that each absentee voter’s vote was recorded in that person’s ward’s total, even though the vote was actually counted at the Central Count Facility.

    The more I read about this, the more it seems like a fairly transparent, efficient operation. Any late-night tampering that occurred would have to have been done in a pretty slick way. Your Amazon package analogy doesn’t seem to fit at all, unless there’s more to the story that I’m missing.

  9. 2/

    Following up on my last comment. Here are some local news stories about the voting process, written before, during, and after election day:

    https://www.jsonline.com/story/news/local/milwaukee/2020/11/01/absentee-ballot-counting-milwaukees-central-count-live-streamed/6118214002/

    https://www.tmj4.com/news/election-2020/city-of-milwaukee-to-live-stream-ballot-counting-tuesday-night

    https://www.cbs58.com/news/milwaukee-election-commission-aims-to-provide-full-transparency-on-election-day

    https://www.usatoday.com/story/news/factcheck/2020/11/04/fact-check-wisconsin-never-took-break-counting-ballots/6168105002/

    https://www.jsonline.com/story/news/local/milwaukee/2020/11/07/absurd-and-insulting-milwaukee-officials-reject-vos-call-vote-count-probe/6203414002/

  10. I'd recommend the edX Harvard course on Data Science: Visualization. Your graphs make no sense at all and are hard to read. Frequency .001 .002 .003 WTF? Where's the other 99 percent of occurrences??

    Replies
    1. Actually, take the entire series on Data Science. Man, you really wanted this to be fraud going to these lengths. The data isn't there, sorry.

    2. I read research papers for fda subs and I agree

  11. 3/

    The hard shift to the Democrats in the wee hours of the morning seems pretty straightforward: all of those votes were absentee votes, and mass absentee voting was a partisan Democrat phenomenon. It appears that the small number of Miscellaneous voters behaved like Republicans rather than Democrats. That’s intuitively plausible to me; someone independent-minded enough to vote for a third party in a swing state is probably not cowering at home in fear of COVID.

    Regarding the swings in the down-ballot races, it’s hard for me to analyze this based on what you’ve presented. It would be easier if you posted your full dataset, showing which down-ballot races you’re looking at and what geographic territories they cover. That said, my initial instinct is that Scolar’s comment above is right: some subsets of the Democratic electorate were more inclined to vote via absentee ballot, and so Democrat candidates temporarily “underperformed” in the down-ballot races where those subsets were concentrated. Then this “underperformance” corrected itself when the absentee votes came in.

  12. Have you studied how ballots are processed? How is something like this feasible? Please elaborate on how cheating could occur and not be detected from a ballot security perspective.
