Complexity

I have the beginnings of some thoughts about teaching statistical modelling

One of my fabulous colleagues has started a book club on campus where a group of us work through Advanced R by Hadley Wickham. Apart from the day I learned about the tidyverse, this Advanced R book club has given me the biggest set of leaps in my R skills, and I’m probably only understanding about a fifth of it.

This week we began the chapter on functional programming – and Ian’s code and examples are on GitHub. I went home and spent the evening doing this:

There was one example that Ian drew up that I can’t stop thinking about from a teaching perspective. Teaching stats is really, really intimidating, because the more you know about it, the more you recognise how subjective it can be. I often see people take refuge in complexity where they refuse to answer a learner’s question in favour of reiterating the memorised textbook response. I’ve done this myself! At the same time, I’ve had a really intriguing stats challenge with a colleague where I’ve gone around the houses trying to make sure I can justify our choices.

This comes down to model selection, which is one of the most Fun(™) conversations you can ever have about statistics. The more I learn about statistics the more I feel that model selection is the personification of this tweet from my colleague:

You see, there really are no ‘right’ answers in model selection, just ‘less wrong’ ones. This is the subject of a lot of interesting blogs. One of them is David Robinson’s excellent ‘Variance Explained’.

Another of @drob’s posts, which I’m sure I’ve linked to before, is this one: Teach tidyverse to beginners. This idea fascinates me. David (and I feel I can call him David because I once asked him a question at a demo and he said it was a good question and it was honestly one of the highlights of my life) suggests that students should have goals, and they should be doing those goals as soon as possible.

I don’t know how much educational training the DataCamp/RStudio folks have but I’m always really impressed with the way they teach.

(It’s important here to take a moment to acknowledge the problems DataCamp is having at the moment regarding how they addressed a sexual harassment complaint. I have the utmost sympathy for all involved, and at the moment I don’t feel that boycotting DataCamp is the answer, but it’s worth pointing towards blog posts like this one to give a different opinion.)

‘Doing’ as soon as possible is something we struggle with in higher education. I’ve just had to rewrite a portion of a paper to defend why I think authentic assessment is so vital for science. We put ‘doing’ at the top of our assessment pyramids, and talk about how it takes us a long time to get there.

During this week’s book club, my colleague Ian had a great example of using the broom and purrr packages in R to fit multiple models to a dataset quickly and easily. And I had to derail the conversation in the room for a bit. Why don’t we teach this to our students straight away? At present, the way I teach model selection is a laborious process of fitting each model one by one, examining the results individually, and then trying to get those results into some kind of comparable format. After some brief discussion, with all the usual sciencey caveats, our Advanced R book club were all keen to use this as a way of introducing model selection to students.
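For anyone who hasn’t seen this pattern, here is a minimal sketch of the general idea. It is not Ian’s actual example: the built-in mtcars dataset, the three candidate formulas and the object names are all assumptions chosen purely for illustration. It fits several linear models in one step with purrr, then uses broom’s glance() to put one row of fit statistics per model into a single table.

```r
library(purrr)
library(broom)
library(dplyr)

# Candidate models for the same response variable.
# mtcars and these formulas are illustrative assumptions only.
formulas <- list(
  weight_only   = mpg ~ wt,
  weight_and_hp = mpg ~ wt + hp,
  interaction   = mpg ~ wt * hp
)

# Fit every candidate model in one step instead of one by one.
models <- map(formulas, ~ lm(.x, data = mtcars))

# glance() returns a one-row summary per model; bind_rows() stacks
# them, with the list names stored in a "model" column.
comparison <- bind_rows(map(models, glance), .id = "model")

# Keep the columns most often used when comparing models.
comparison[, c("model", "r.squared", "adj.r.squared", "AIC", "BIC")]
```

The result is one comparable table rather than three separate summary() printouts, which is exactly the “comparable format” step that takes so long by hand. Which statistic to compare on is still a judgment call, so the table starts the model selection conversation rather than ending it.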

I feel as though this is tickling at the edge of something quite important for higher education, especially for the sciences. Something about empowering students, and getting them to ask me about things I don’t know the answer to more quickly. I also feel just a little irate about the fact I can’t formalise this as nicely as I know David Robinson and the RStudio lot can. I kind of feel like some of the most useful stuff I’m doing lately is in the Open Educational Resources range, such as my Media Hopper channels and on my GitHub. There’s a freedom in OERs to push the boat out, and to start teaching the complex things first.

And ultimately, my disjointed ramblings might just help someone else connect a few dots. Happy spring, people!

If We Should Dress for Sun or Snow

Before Christmas I got very into the Groundhog Day musical soundtrack, particularly If I Had My Time Again, which is my new favourite shower sing-along. I was also thinking a lot about academic workload last year, and how the varying pressures of the academic role can be challenging.

Despite feeling pretty good about my work-life balance last year, I’ve been a little humbled by 2019 so far. My personal life has needed more attention than my work life, and I’ve been feeling guilty about shifting the focus. It’s been difficult to keep on top of things, and I hadn’t quite appreciated how much I’d let things creep into the evenings.

There were two articles recently that my mind kept returning to. One is Dr Anderson’s widow speaking out about academic workload, and the other is this article about email’s influence on workload. They were particularly on my mind on Monday, when I was attending an Echo360 community meet-up about learning analytics.

I had good reasons for wanting to go to this community meet-up. I’m interested in analytics, and I’m the PI on our university’s evaluation project so a little networking is always valuable. I’m also in the rare academic position of having some spare money floating around so it all seemed worthwhile. Except there was a very west-of-Scotland sounding voice in the back of my head wondering if I’m worth spending that money on. Who am I to go to That London to talk to people? Shouldn’t I be slaving over a hot laptop?

On the other side of this, I’ve also spent a little bit of my evenings this week working on a Shiny app. Now I want to emphasise that ‘a little bit’ in this context literally means five or ten minutes here and there when an idea comes to me, but it’s still very much useful time. And yet I’ve been frustrated that I haven’t been able to spend more time on it.

A couple of months ago I had a devil’s advocate style debate with my good colleague Ian about how much these kinds of extracurricular activities should contribute to our CVs. We kept circling back to how much the open science and open data analysis movements favour those people with the spare time to dedicate to this kind of work. If all your work is on proprietary data, you maybe can only contribute to things like a GitHub repository in your spare time. And what if, when you get home, you start doing the childcare, or can’t get away with not cleaning the house because you’d prefer to spend that time tweaking a package? What if all your hours out of work are spent on other tasks, and when you have that lightning moment of “ah – I should use enquo()!” you can’t immediately go to your laptop to check it out?

There are many people much busier than me who manage to contribute way more than me. Those people should be applauded. And we should definitely still value the amazing resources people put online. I think it is our responsibility as academics to support ourselves (and our managers too).

All this is a roundabout way of saying that having a little bit less time to make up for my busyness has highlighted to me how very important it is to protect time for the things that are important in your work. During one of our protected analysis times today I started a new package which I hope can be incorporated into a Shiny app I’m planning for our students. Tomorrow’s my first Writing Friday since before Christmas. This is the way to do it. And yes, my emails have been slipping in the meantime. Let ’em.

We should believe we are worth the time.

(And also I managed to go to work today wearing two different earrings and no one pointed it out. That’s not relevant but it amused me greatly.)