Friday, June 12, 2015

Judicial Coefficients

Note: if you read this entire post, please say so in the comments section. We have a statistical comparison (betting pool) on this factor. -- Melody

Prologue: We are working on a statistical instrument to measure different plea bargaining strategies. The following is an email thread discussing our approach. Dan Hansmeier, Carl Folsom, Melody Evans, and I (Kirk Redmond) are quoted. The names of judges are redacted to protect the innocent.

Spoiler (from Kirk) - Dan is wrong, but funnier and more profane than anyone else.

We are getting close to having enough data to analyze. The question is what to do with it. One of the ideas was that we would normalize between judges by adding a coefficient. The idea was to compare the SD (sentence differential: sentence expected minus sentence imposed) for each judge against the low end, and add a multiplier. If Judge XXXX sentences illegal re-entry defendants at 1.21 times the low end of the guideline range, we should multiply the sentence expected by 1.21 to account for how s/he actually sentences defendants. This is a recognition that a low end sentence is not the expectation in front of Judge XXXX in a 1326 case.
But it occurred to me just now, because I'm dense, that this approach should not extend across all plea options. Binding pleas don't affect what the judge can do to the client, unless the plea is rejected, which we keep track of separately. 5K motions don't affect what the judge does, because they are [almost] always followed. Charge bargains are a more difficult question. They usually leave the ultimate sentence up to the judge. I think we should only apply the multiplier in open plea and no plea agreement cases. The other plea (or acquittal) options don't measure judicial performance. We need to get this straight before we set up the database.
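For readers who want the arithmetic spelled out, here is a minimal sketch of the coefficient as described above. It is not the actual database design. The plea-type labels, the function names, and the choice to average per-case ratios (rather than, say, dividing total months imposed by total low-end months) are all assumptions made for illustration, and it assumes every low end is greater than zero.

```python
from statistics import mean

# Plea types where the judge's own tendency drives the sentence.
# These labels are hypothetical; the thread only names the categories.
JUDGE_DRIVEN_PLEAS = {"open", "no_agreement"}


def judge_coefficient(cases):
    """Average ratio of sentence imposed to guideline low end.

    `cases` is a list of (low_end_months, imposed_months) pairs for one
    judge and one case type. Averaging per-case ratios is an assumption.
    """
    return mean(imposed / low_end for low_end, imposed in cases)


def sentence_expected(low_end, plea_type, coefficient):
    """Low end, scaled by the coefficient only for judge-driven pleas."""
    if plea_type in JUDGE_DRIVEN_PLEAS:
        return low_end * coefficient
    return low_end


def sentence_differential(expected, imposed):
    """SD as defined in the thread: sentence expected minus sentence imposed."""
    return expected - imposed


# Made-up numbers, not real case data.
history = [(10, 12), (20, 25), (30, 36)]
coefficient = judge_coefficient(history)
expected_open = sentence_expected(24, "open", coefficient)
expected_binding = sentence_expected(24, "binding", coefficient)
```

With these made-up numbers the coefficient comes out a little above 1.2, so the expected sentence on a 24-month low end rises for an open plea and stays at 24 for a binding plea.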
Doesn't this coefficient obliterate the known comparator (the low end of the Guidelines range)?
I don't follow the coefficient. I don't understand why we need to fudge the numbers to recognize that some judges vary downward in certain cases. The actual numbers, compared to the Guidelines range, will tell us that, right?
And how does the coefficient affect our ability to compare judges? I'm lost when I try to imagine that scenario.
And I'm not following. How is the low end of the guideline range the baseline if a particular judge doesn't sentence at the low end in a particular case type? Assuming that the low end is a stable result seems dumb if the data doesn't agree. It's not fudging the data, it's adjusting for the results.
You lost me. The low end is the low end because, by definition, it is the low end. It is the one constant that allows a cross-comparison. It is why that brilliantly odd KU professor [edit: Not who you might be thinking of if you know what we are doing] got a boner when we talked with him that day. Maybe I'm wrong, but it seems to me that as soon as the low end is something other than the low end, it is impossible to compare between judges. Well, at least impossible to make a meaningful comparison between judges. And it makes it more difficult to advise clients.
Wouldn't this be like adjusting ERA based on the strength of the opposing team? Or maybe awarding less than or more than one steal based on the arm strength of the opposing catcher? Is that similar to what the coefficient does? Does Billy Beane do this shit?
We don't throw out the low end. We see whether judges actually adhere to it.
This is like baseball guys adjusting for the park in which the team plays. Say a guy hits 38 home runs. It matters whether he plays 81 games at Coors Field or 81 games at Petco (before they moved the fences in this year). If you are considering signing that guy as a free agent, you have to know whether those home runs were a product of his skill or a product of the park he played in.
Or with your ERA example, you have to consider the defense behind the pitcher. A pitcher’s ERA may be low because he is awesome, or it may be low because he puts the ball in play a lot and the defense goes and catches it.
For the same reason, it matters what court you are in when determining what approach to take to a given case. Returning to the example of Judge XXXX sentencing a 1326 case, it will be useful for the client to know whether s/he generally sentences defendants to the low end of the guideline range. I suspect that is not true. But we will be able to find out soon enough. If s/he does not, we can see how far away from the low end s/he generally imposes sentence. That is a critical piece of advice for clients: is a low-end recommendation worth bargaining for? Because if it’s not, we should try to lock in a binding plea, even if it's just to the low end.
The inverse is Judge XXXX on a drug case. If s/he is statistically likely to vary downward, then why would you ever enter a low-end plea agreement? More specifically, if Judge XXXX sentences drug cases at .89 of the low end, shouldn't you always plead open and argue for a variance? You can't find that out if you use low end as an unaltered constant.
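To make the disagreement concrete, here is a small sketch that reports both numbers side by side: the differential against the unaltered low end, and the differential against the judge-adjusted expectation. The function name and the 100-month low end are invented for illustration; only the .89 figure comes from the example above.

```python
def both_differentials(low_end, imposed, coefficient):
    """Return (raw SD, adjusted SD) in months.

    raw SD      = low end minus sentence imposed
    adjusted SD = (low end * judge coefficient) minus sentence imposed
    """
    raw = low_end - imposed
    adjusted = low_end * coefficient - imposed
    return raw, adjusted


# Hypothetical drug case: 100-month low end, judge coefficient of 0.89.
open_plea = both_differentials(100, 89, 0.89)      # judge's typical variance
low_end_deal = both_differentials(100, 100, 0.89)  # low-end plea agreement
```

Against the raw low end, the 89-month sentence looks like an 11-month win and the low-end deal looks like a wash. Against the adjusted expectation, the 89-month sentence is roughly what this judge does anyway, and the low-end deal comes out about 11 months worse than that. Keeping both columns is one way to have it both ways.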
Carl is enjoying this from afar.
Redmond's Coefficient. My children will study this one day. Well, the younger one. The older one might not make it out of preschool.
Suspiciously, your hyperlinks did not work.
I'm still missing the value in it. We are not scouting judges. We are comparing them. Right? And, for a valid comparison, one that is statistically significant, the key is a stable comparator. We have that. The guidelines range is the equivalent of the universal ball park, or the universal defense. Or the entire universe. It allows us to compare judge behavior in all types of cases. And our numbers will answer your proposed questions without the application of Redmond's Coefficient. If XXXX comes in below the range in drug cases, our numbers tell us that, and we know not to enter into a low-end plea agreement.
What if we created two sets of numbers: one based on Redmond's Coefficient, and the other based on a straight statistical comparison with the Guidelines range? Or are we doing that? Is Redmond's Coefficient like a bonus? Is it like a Lorenzo Cain bobblehead? Because if so, I'm not going to argue against it. I love bobbleheads.
So, should Clayton Kershaw's ERA depend upon the team he faces? Should he be allowed 2 earned runs per inning when he faces the Cards because the Cards own him? Should he not be allowed any earned runs when he faces the Cubs because the Cubs are, well, the Cubs?
Have you written Rob Manfred a letter about this yet? Can you set up a fantasy baseball league where the stats are based on Redmond's Coefficient?
Carl (at least the response he was writing before I got in first)
Hilariously enough, this is the unfinished response I started for Dan:
I think it's more like adjusting the ERA based on the park the team is playing in (park factor) or the defense they pitch in front of. The same fly ball out in Kauffman might be a home run in Yankee Stadium. Just like a low end sentence might be a poor result with a really favorable judge, but it might be a good result with a tough judge.
The league ERA in Kauffman is lower than it is in Yankee Stadium. This is true even when other factors are accounted for. So a pitcher with a 4.50 ERA who plays for the Royals is a worse pitcher than one with a 4.50 ERA who plays for the Yankees.
So if the league average ERA is 4.50, this is like the SE. Getting the average ERA or the low end is an average result. But if a certain judge always gives high end of the range on child porn cases, or another judge always gives downward variances - because he/she thinks the use of computer enhancement is bullshit, that should be accounted for.
Just like the SD will be worse for certain judges than other judges. Getting a 20-month variance (below the low end (SE)) from XXXX will probably be a hell of a lot harder than in front of XXXX.
What Carl actually wrote
I think we're scouting judges against the guidelines. And we're scouting our own performance against what the judges usually do (Redmond Coefficient?).
Like a Rockies pitcher with a 4.50 ERA is probably better than a Royals pitcher with a 4.50 ERA (adjusting for home park and defense).
I cannot believe that Carl just conceded that a Rockies pitcher is "probably better" than a Royals pitcher.
What the fuck is going on around here?
I'd also note the phrase "against the guidelines" Carl used in the first sentence. I think that is exactly right. The Guidelines do not adjust based on the courtroom or the judge (like ERAs and ballparks).
I feel like a member of the Sentencing Commission.
What the fuck is going on around here?
Let's continue this email until the end of time.
The guidelines comparator stays in the equation, always. But our current assumption of a low-end default is just a starting point. And it will probably be the ending point with some Judges, like XXXX. But that's not universally true. Judge XXXX says that he starts in the middle of the guideline range, and after looking at all of his drug sentences, I believe him. Evaluating whether a government recommendation influences his sentencing decision seems important. Anecdotally, I don't think it influences XXXX.
Dan, I will set up the database so that if you prefer to ignore the judge deciding the sentence, you can.
I want a XXXX bobblehead.
I can probably make this happen. Karen and I had personalized bobble heads on our wedding cake. Ordered from England.
I have my own bobblehead as well. I thought everyone did. I think some kid in Asia made mine. It is like the bobblehead equivalent to a blood diamond.
I don't want to ignore the judge. I think I want the option to ignore your coefficient. Can we have it both ways? Will the numbers tell me where the judge falls with respect to the guidelines range? Or will the numbers tell me where the judge falls with respect to some point created by Redmond's Coefficient? Do I get to know both of those things?
And yes, Hansmeier's Coefficient is in its development stage. I'm researching Rickey Henderson's stolen-base record as we speak . . . .
