I’m solving a really interesting problem at work. At least, interesting to me. To detect a gas that’s really important for us, methane, we need to have a mathematical model for how light travels through our remote sensing instrument. It’s a problem that’s been solved in a general sense, and we have something that works ok, but now we need to know the specifics1. I’ve been working pretty full-time on this for about a month now, and it’s felt like a dive back into my previous academic life - but better.

company goals & support

What’s better? The support. My academic experience included a lot of hands-off advisors that gave extremely useful big-picture and scientific problem-solving advice, but didn’t have the time or expertise to help with anything too specific or technical. I had trouble forming the sorts of relationships with my peers (and/or co-authors) where I could do more than bounce ideas from time to time and show relatively finished results. I worked ok like that, but when I got stuck I was very stuck.

While I’ve been working on this problem, one of my co-workers, also a data scientist, has been an amazing support. He started a year before me, so has plenty of hands-on experience with the spectra and can kinda tell if something “looks off” where I can’t. We chat daily about results and progress, always with enthusiasm and new ideas (on both sides). I cracked a part of the problem he was stuck on last year, and he’s solved a few things I would have missed - including spotting the interfering signature of the thin plastic cover we use to protect our sensors from the weather2.

Solving this problem is an essential component of our current company goal, and the support is company-wide. Our CEO instantly approved sending our the plastic cover to a lab that will provide us with its detailed spectrum. Part of our hardware team and set aside a morning to get an extra dataset with my coworker. I’ve asked the scientific programmer on our team about some of my coding challenges and gotten pointed in the right direction. And - very important for me - I’m explicitly reassured that asking for help is the right thing to do.

Maybe I could have asked for a lot more support in academia than I did, but it’s always been difficult for me to ask for help. I had a lot of independence and ownership of my projects, but that also meant that no one was as invested in my success. When I’m goals-aligned with everyone around me, going to them for help is helping us all3.

physical intuition & research

The upside of the dive back into science is the chance to apply physical intuition to solve problems. When I’m trying to answer a question, I can make a model in my head and think about pressure, temperature, and gas content. Or refractive and reflective indices. I totally understand these aren’t intuitive to everyone; this is years of training and a decade of application. My intuition doesn’t answer every question, but it presents ideas for the next thing to check or try or visualize.

In my previous data science job, I missed this physical intuition. I was solving problems related to pricing - the underlying principles were economic. It’s a topic that I know a bit about from reading non-fiction and listenting to podcasts, but I am far from an expert. I would occasionally try and approach topics from the research standpoint, but the principles that I wanted to know were either absent or not described in a way I could even locate. Is the functional form of a cooling curve for a stellar flare easier to model than the price of a used phone? Or are economists not that into functional forms? Or do they just use such different words that it doesn’t make sense to me?

impure data science

When I’m attacking a problem, I tend to alternate between fitting/exploration and research/intuition. If I fit a line to some points, I try and do it with some idea of what it means in the context of my dataset and problem. What is the fundamental reason that these variables related? What are other variables that might be involved in this relationship? Are there some statistics that can tell me what relationship should be? Does it make sense that this is a line rather than some other function?

When I look at online data science training - and when I was hiring a data science student at my last job - I’ve found that this is not the desired approach. It looks a lot more like throwing your data into a machine learning model and seeing what fits, then drawing your conclusions from that. And this agnostic approach can be super valuable - it can point out things that aren’t obvious at first or would be overlooked. But in my view that should only be one of multiple exploratory approaches. The two times I’ve tried to solve problems using primarily machine learning models, they gave less accurate and less precise answers than more traditional fitting approaches.

So maybe pure data science is agnostic problem solving, and maybe it is super-focused on machine learning and AI. I either disagree with that, or I am just not a data scientist. Or I’m a termagant4 data scientist.

finding direction

As I’ve been thinking all this over, I’ve also been searching job ads. Not because I can imagine wanting to leave my current job, but because I can change, and companies (especially startups) can change, fail, or underpay. I am now sure that a pure data science job won’t fit me - I need science, too. I am looking at job ads to understand what’s possible and available in five or ten years? Do I want to be some sort of research scientist? Will I be a data scientist in title, but look in science-related industries? Now that I know what I like, what is my career?

  1. It’s the instrumental line function (ILS) that describes how a single wavelength of light spreads out due to both width of the aperture (opening where light enters) and tiny mis-alignments between the aperture and the detector. There is a derived functional form that works for our type of instrument (a cube-corner FTIR) but each individual one can be a little bit different and change over time. Those differences are important for our level of precision. 

  2. The plastic is of course mostly transparent - but it is slightly opaque (aka has some spectral features) around the methane line we want to detect. I’d completely forgetten we even had a cover, but he had it in mind and knew a bit of what its spectrum looked like. 

  3. I might have been happier in a large collaboration, but those have the possible downside of not letting their members shine as much on individual projects. In academia, that’s a big deal. For me now, it’s enough that folks at my company see my contributions. 

  4. I asked the internet if I’m old enough to be a curmudgeon, and it told me the female equivalent of curmudgeon is termagant. Am I old enough to be a termagant?