One of the things about art

One of the things about art is, you give people an excuse to take some time, to be quiet, and pay attention to something, and maybe under the guise of enjoyment, think about important things in life.

—Tim O’Brien

“They’ve invented a new machine” – a folk song’s mixed emotions on automation

Bruce Molsky sings “a little sad song about a shoemaker”:

They’ve invented a new machine / I peg one shoe, it pegs fifteen / I’m gonna lay me down my awl, my peg & awl.

It’s a beautiful performance, an all-time favorite of mine, and of Linda Ronstadt’s, too. She said, “I cried when I heard that song the first time — and I’m not a crybaby.”

Ronstadt puts Molsky’s singing on Peg & Awl alongside Edith Piaf and Ella Fitzgerald. “Bruce has that ability to track deep emotion in his voice, without any unnecessary adornment. It’s pared back to only the essential architecture of emotion.”

But what is the emotion of Peg & Awl? Molsky explains the song “has really gone through the changes over the years, all the people who’ve sung it and recorded it. In the original lyrics to the song, the guy was actually pretty happy about being replaced. But I just rewrote the lyrics to make it sad.”

Indeed, in an older version from the Anthology of American Folk Music, the shoemaker is tired of tedious labor with primitive tools, and overjoyed by the new time-saving technology:

They’ve invented a new machine / prettiest little thing you’ve ever seen / Throw away my pegs, my pegs, my pegs, my awl.

(You can find more discussion & variations at “The Old Weird America” blog.)

There’s nothing wrong with rewriting lyrics or changing the mood – it’s the folk process at work. The late great Pete Seeger advised artists, “Don’t be so all-fired concerned about being original. You hear an old song you like but you’d like to change a little, there’s no crime in changing a little. It’s a process by which ordinary people take over old songs and make them their own.”

Molsky took this old song and made it his own, with melancholy, nostalgia, and bittersweet beauty.

“Progress can leave some people behind, even as it benefits society as a whole,” write Erik Brynjolfsson & Andrew McAfee, authors of Race Against the Machine. That’s the tension in the folk history of Peg & Awl, and the tension in the contemporary debate about how technology impacts the future of work.

Anxiety or optimism – which side is the song on? It depends who sings it.

The issue formerly known as privacy

I’m not really looking for privacy. I want an assurance that my data is not going to be abused.

Maybe privacy isn’t the right word. This is the issue formerly known as privacy.

Let’s call it justice, and human rights, and due process. What can we do that would allow people to have due process around their data?

Julia Angwin, investigative journalist & author of a recent book on privacy, from her keynote at the Strata 2014 NYC conference.

Angwin’s insight here, and her call to action, particularly resonated with me.

The full video, and more highlights, below:

I don’t know if you remember this, when Google was placing some code in ads that would actually allow them to circumvent the Apple Safari privacy settings and install their cookies. So they ended up paying a $22.5 million fine for that. And that is the largest fine I think anyone’s ever paid for privacy. And it’s worth reminding us, that was five hours worth of revenue for Google.

Does it matter that they have all your data? Staples gives different prices for their office supplies based on your zip code. They estimate how far you are from a competitor store. So in classical economical terms it makes sense for them to offer you a slightly cheaper price if you have the option of going to an OfficeMax. But in reality when you looked at the data across the nation, you also found that the people getting better prices were whiter and richer. And so we have this emerging world where you might make a choice about your data for a very legitimate reason, but find out you’re coming close to something that we used to call redlining.

I decided that, I’m not a Luddite, I’m not going to live in a shed, but living in the modern world, could I protect my privacy? So I took a lot of different steps. … In the end I wasn’t that successful. And I spent almost $2,500, for one year. It really was much more expensive than I thought.

Is privacy becoming a luxury good? Is that what we want for this world? It’s not even that. It’s like a crappy luxury good, like a fake BMW. Because I didn’t get out of the data brokers. I can’t protect my cell phone unless I don’t use it. I couldn’t get my friends to do this encryption. And I really didn’t have any assurances that my tools worked.

I want to feel the way I feel when I get in the car. I know it’s dangerous, but I also know it has to meet some minimum safety standards. And that if they break those, they’re going to be crying in front of Congress and paying me a lot of money in a lawsuit. That’s what I want with my data. I want to know that if it’s abused, I have due process, I have rights.

Statistics with computational methods, not agonizing pain

This is a mash-up of two talks that go together wonderfully:

  • “Statistics Without the Agonizing Pain,” John Rauser, Strata + Hadoop World NYC 2014
  • “Statistical Inference With Computational Methods,” Allen Downey, PyCon 2015

Rauser’s keynote was my favorite at the Strata conference:

Some excerpts:

When I decided to learn statistics, I read several books, which I shall politely not identify. I understood none of them. Most of them simply described statistical procedures and then applied them, without any intuitive explanation. This talk was born of that frustration, and my wish that future students of statistics will learn the deep and elegant ideas at the heart of statistics, rather than a confusing grab-bag of statistical procedures. …

I am a software engineer, who was self-taught in statistics over a period of about a decade. And I remember struggling with what seemed like the most basic questions. But it doesn’t have to be this way. …

If you can program a computer, you have superpowers when it comes to learning statistics. Because being able to program allows you to tinker with the most fundamental ideas in statistics, the way you might have tinkered with electronics, or with mechanical things, or with music, or with sports. And so I want you to go out, and to attack statistical problems with a feeling of joy, in the spirit of play, and not from a position of fear and self-doubt.

To convince you of this idea, we’re going to use statistics to figure out whether drinking beer makes you more attractive to mosquitoes.
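To make that kind of tinkering concrete, here is a minimal sketch of one simulation-based way to attack a question like the beer-and-mosquitoes one: a permutation (shuffling) test. The mosquito counts are invented for illustration and are not the data from the talk, and the function name `permutation_test` and its parameters are my own assumptions.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_iter=10_000, seed=0):
    """One-sided permutation test for mean(group_a) > mean(group_b).

    Returns the observed difference in means and the fraction of
    shuffled datasets whose difference is at least as large.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_iter):
        # If the labels don't matter, any relabeling is equally plausible.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_iter


# Hypothetical mosquito counts per volunteer, made up for this sketch.
beer = [27, 20, 21, 26, 27, 31, 24, 21, 20, 19]
water = [21, 22, 15, 12, 21, 16, 19, 15, 22, 24]

observed, p_value = permutation_test(beer, water)
print(f"observed difference in means: {observed:.2f}")
print(f"share of shuffles at least that large: {p_value:.4f}")
```

The idea is that if beer made no difference, the group labels would be arbitrary, so shuffling them shows how big a gap chance alone tends to produce. A small share of shuffles matching the observed gap suggests the gap is hard to explain by chance.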

Inspiring! I wasn’t the only one impressed:

Now how to put this into action? Rauser recommends Allen Downey:

Recently at PyCon, Downey taught a hands-on tutorial, “Statistical inference with computational methods.” I didn’t attend the conference, so I’m grateful the video, slides, and code have all been shared openly.

The video is unedited and over 3 hours long, but much of that time is silent during hands-on work and breaks. For convenience, I’ve provided an outline with timestamp links, so you can proceed at your own pace. Enjoy!

1. Effect Size

2. Quantifying Precision

3. Hypothesis Testing
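For a taste of the first two outline items, here is a small sketch of my own, not taken from Downey's materials. It assumes "effect size" can be illustrated with Cohen's d and "quantifying precision" with a percentile bootstrap interval, both common choices. The two samples are hypothetical, and the function names are just for illustration.

```python
import random
from statistics import mean, stdev


def cohens_d(a, b):
    """Difference in means, in units of the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * stdev(a) ** 2 + (n_b - 1) * stdev(b) ** 2) / (
        n_a + n_b - 2
    )
    return (mean(a) - mean(b)) / pooled_var ** 0.5


def bootstrap_ci(a, b, n_iter=5_000, level=0.90, seed=0):
    """Percentile bootstrap interval for mean(a) - mean(b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_iter):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choices(a, k=len(a))
        resample_b = rng.choices(b, k=len(b))
        diffs.append(mean(resample_a) - mean(resample_b))
    diffs.sort()
    lo_idx = int((1 - level) / 2 * n_iter)
    hi_idx = int((1 + level) / 2 * n_iter) - 1
    return diffs[lo_idx], diffs[hi_idx]


# Hypothetical measurements for two groups, invented for this sketch.
sample_a = [5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1]
sample_b = [4.8, 5.0, 4.6, 5.2, 4.9, 5.1, 4.7, 5.3]

d = cohens_d(sample_a, sample_b)
low, high = bootstrap_ci(sample_a, sample_b)
print(f"Cohen's d: {d:.2f}")
print(f"90% bootstrap interval for the difference in means: ({low:.2f}, {high:.2f})")
```

The effect size says how big the difference is relative to the spread in the data. The bootstrap interval says how much that difference might wobble if the experiment were repeated with a fresh sample.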

Downey also taught a Bayesian tutorial at PyCon, which I haven’t worked through yet but am looking forward to. Resources like these give me hope & confidence to learn more statistics.

Banjo Data Science

This data scientist definition is intended as a self-deprecating joke, but still seems well beyond my reach:

So I’ll claim another title instead:

Banjo Data Scientist (n.): Person who is better at playing banjo than any statistician or software engineer, and better at statistics & software engineering than any banjo player.

Stay tuned for my side project: statistical machine learning to generate musical arrangements in the style of Earl Scruggs, father of bluegrass banjo…