Wonkiness of small samples



zitothebrave
04-21-2015, 09:07 AM
Just figured I'd take a look at stats so far this year. Most everyone is 12 or 13 games in.

I'll look at 3 different players, one for offense, one for pitching, and one for defense, and analyse the flaws in small sample statistics.

First up is one of the best hitters in baseball right now, Stephen Vogt. You may be asking yourself, who is Stephen Vogt, and you'd be right to ask that question. He's a 30-year-old second-year player. He's posting a .415 ISO, way, way, way higher than he's ever posted at any level in the majors or minors. He's also sporting a .367 BABIP, much higher than you'd expect from a slow lefty. And maybe my favorite part of his stat line: in his 2 previous seasons he had about an 8% HR/FB, and so far this year he's at 30.8%.

Second is Shane Greene. Looking through his numbers, there's not a thing that isn't lucky about him. Through his 3 starts he's compiled 23 innings with a 0.39 ERA. You may be loving those numbers until you look a little deeper. A .188 BABIP is too low for a pitcher who's ground-ball heavy. He has walked a mere 1.96 per nine, which is great, but he counters that with a 4.30 K/9. That makes his FIP 2.72, and he has not allowed a home run yet, which is why his xFIP is up at 4.01. So it's still a very good performance, but his ERA is the product of luck and sampling.
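To see how little it takes to move these numbers, here is a rough Python sketch of the per-nine arithmetic. The counts (1 earned run, 5 walks, 11 strikeouts) are not from a box score; they are back-calculated from the rates quoted above over 23 innings, so treat them as assumptions. `per_nine` is just a helper name made up for this example.

```python
def per_nine(count, innings):
    """Scale a raw count to a per-nine-innings rate (ERA, BB/9, K/9)."""
    return 9 * count / innings

INNINGS = 23        # from the post
EARNED_RUNS = 1     # assumed: back-calculated from the 0.39 ERA
WALKS = 5           # assumed: back-calculated from the 1.96 BB/9
STRIKEOUTS = 11     # assumed: back-calculated from the 4.30 K/9

era = per_nine(EARNED_RUNS, INNINGS)
bb9 = per_nine(WALKS, INNINGS)
k9 = per_nine(STRIKEOUTS, INNINGS)

# What the ERA would look like if a single extra run had scored
era_plus_one = per_nine(EARNED_RUNS + 1, INNINGS)

print(f"ERA {era:.2f}, BB/9 {bb9:.2f}, K/9 {k9:.2f}")
print(f"ERA with one more earned run: {era_plus_one:.2f}")
```

Under those assumed counts, one more earned run doubles the ERA, which is the whole problem with 23 innings.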

Lastly we look at DRS. UZR hasn't put any stats up because they don't even bother with a sample this small. Anyway, my example is the former Brave, he who should not be named. From 2010-2014 he accumulated the highest total DRS of any player in baseball. He's sitting at 0 so far this year. As we all know, whether you're a fan of Him or not, he is an all-world defender, certainly not worse than guys like Ryan Braun and Jose Bautista. Not that those guys are bad players, they're just not Him.

thethe
04-21-2015, 09:10 AM
But how about Markakis OBP!?!?!?!?

nsacpi
04-21-2015, 09:24 AM
Markakis BABIP .455 versus career .316

Walk rate 16.7% versus career 9.3%

Strikeout rate 14.6% versus career 13.0%

ISO .025 versus career .144

Line drive rate 27.3% versus career 20.4%
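For a rough sense of how far a BABIP can drift in two weeks, here is a small Python sketch that treats every ball in play as an independent coin flip at the career rate. That is a simplification, and the 35 balls in play is a hypothetical figure, since the thread doesn't give the real count. `babip_standard_error` is a name invented for this example.

```python
import math

def babip_standard_error(true_rate, balls_in_play):
    """Binomial standard error of an observed rate over n balls in play."""
    return math.sqrt(true_rate * (1 - true_rate) / balls_in_play)

CAREER_BABIP = 0.316     # from the post
OBSERVED_BABIP = 0.455   # from the post
BALLS_IN_PLAY = 35       # hypothetical: the post does not give this number

se = babip_standard_error(CAREER_BABIP, BALLS_IN_PLAY)
z = (OBSERVED_BABIP - CAREER_BABIP) / se
print(f"standard error: {se:.3f}, gap in standard errors: {z:.2f}")

# Same calculation over a hypothetical full season of balls in play
se_season = babip_standard_error(CAREER_BABIP, 500)
print(f"standard error over 500 balls in play: {se_season:.3f}")
```

Under those assumptions the .455 sits less than two standard errors from the career mark, so it isn't all that strange for this point in the season. Over a full season the same gap would be a much bigger deal.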

Hawk
04-21-2015, 09:28 AM
Markakis BABIP .455 versus career .316

Walk rate 16.7% versus career 9.3%

Strikeout rate 14.6% versus career 13.0%

ISO .025 versus career .144

Line drive rate 27.3% versus career 20.4%

https://33.media.tumblr.com/168159b985930395a03f3875a33931fc/tumblr_mv95y1BMTt1rwxg8ro1_500.gif

thethe
04-21-2015, 09:29 AM
Strange how he is hitting the ball harder but just not getting any loft on the ball. Wonder if that is intentional or just degradation of skills.

zitothebrave
04-21-2015, 09:48 AM
Strange how he is hitting the ball harder but just not getting any loft on the ball. Wonder if that is intentional or just degradation of skills.

Small sample size. And hitting more line drives doesn't mean he's hitting it harder. He's hitting way more ground balls than his career norms as well, which means he's on top of the ball more. Whether that's intentional or just how the cookie crumbles, who knows. But he went from a GB/FB ratio in the 1.3-1.5 range to a 3 this year. That doesn't indicate that he's hitting it harder.

smootness
04-21-2015, 05:10 PM
I agree that the advanced metrics that don't utilize a simple formula are often 'wonky' this early. But I don't consider someone hitting way better than normal in a small sample to be 'wonky'. In reality, all players are better than normal during short periods of time and worse than normal during others.
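A quick way to picture the hot and cold stretches is to simulate a hitter whose true talent never changes and chop his season into short chunks. This is only a sketch: it borrows the .316 career BABIP from earlier in the thread as a stand-in true rate, and the 420 balls in play and 35-ball chunks are hypothetical numbers picked for illustration. `simulate_windows` is a made-up helper.

```python
import random

def simulate_windows(true_rate, total_bip, window, seed=0):
    """Simulate hits on balls in play and return the rate in each chunk."""
    rng = random.Random(seed)
    outcomes = [1 if rng.random() < true_rate else 0 for _ in range(total_bip)]
    return [
        sum(outcomes[i:i + window]) / window
        for i in range(0, total_bip, window)
    ]

TRUE_RATE = 0.316   # career BABIP quoted earlier in the thread, used as "true talent"
TOTAL_BIP = 420     # hypothetical season of balls in play
WINDOW = 35         # hypothetical two-week chunk

rates = simulate_windows(TRUE_RATE, TOTAL_BIP, WINDOW)
season_rate = sum(rates) / len(rates)  # chunks are equal size, so this is the season rate

print(f"season: {season_rate:.3f}")
print(f"best stretch: {max(rates):.3f}, worst stretch: {min(rates):.3f}")
```

The hitter in the simulation is the same guy all year, yet some chunks land above his season rate and others below it, from nothing but random variation.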

cajunrevenge
04-21-2015, 06:08 PM
Maybe they told him to focus on hitting line drives instead of homers like they did with Chris Johnson. A fundamental change like that can have immediate results because it takes some time for scouting reports to change.

zitothebrave
04-21-2015, 09:46 PM
I agree that the advanced metrics that don't utilize a simple formula are often 'wonky' this early. But I don't consider someone hitting way better than normal in a small sample to be 'wonky'. In reality, all players are better than normal during short periods of time and worse than normal during others.

Wonky means unsteady/shaky. I know about the shenanigans that happen with bad stats. It's great to use the start of the season to teach people about sample sizes and how you have to look at things over a large sample.