pdxmph



I ended up losing my afternoon to some indifferent analysis from a hostile party at some other company. I’m usually the one doing that kind of hostile analysis, so I know bad work when I see it. The process of discrediting their work helped crystallize what constitutes good work.

So I took the opportunity to write a custom gem for Gemini that ingests the kind of reporting I have to do once or twice a quarter and does the kind of analysis I end up doing each time. That’s how to automate, right? Just after being badly offended by someone else’s incompetence, and high on your own expertise, you get it all out into code. Or after being badly burned by your own incompetence, and realizing you need to take a deep breath and describe what to do step-by-step.

Anyhow, we’re supposed to get gud at this stuff, and I’m dealing with the novel sensation of having done a deep dive into something before being told I should, so I’m casting about for problems to solve, and trying to contort myself into the approved tool.

On the one hand, I completely understand the appeal of the sort of magical oracle approach a chat interface provides, where you feed it a spreadsheet and tell it what to do, and it chats its analysis back at you. But on the back-end, to keep it from hallucinating, it has to write a Python script each time to read the input, and it’s not really doing anything besides some counting. The Gem layers on some interpretation, but I think you could just as easily get a table of output saying “These are the two broad percentages you care about, and this is a list of things you should investigate further.” That seems more deterministic and analyzable. When I was experimenting a lot with AI-driven tool-building, that was my preferred approach. I’m not super florid when presenting the kind of reporting I was working on today, so I felt a little resentment having to build into the prompt some “… and please don’t make a whole thing out of your findings.”

The custom gem approach came with a few issues of its own: sometimes it crashes in the middle of the analysis. Just loses its mind and freezes. And before I explicitly told it that it had to write scripts to do the counting, it would decide that it knew how many kinds of a certain record there were, and it would state that number with a sort of thought-terminating authoritativeness. For something that needs to pick those values out of JSON, that’s sort of a drag.

So there’s the option to do a custom gem and create a magical oracle that tries to fuzz your analytical radar because it takes this know-it-all clankersplaining tone, or there’s the option to just code something up that’s going to do the analysis and spit out a table. One feels sort of mysterious and murky and cool, but periodically decides it has no context window left to give; the other is introspectable even if it’s terse. Given a decent vibecoding platform, coding it up is way faster than managing the vagaries of stating and restating what you’re after, and still getting random shit back sometimes.
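For the sake of illustration, here is roughly what the "just spit out a table" option could look like in Python. This is a sketch and not my actual script: the sample data, the field names (`id`, `kind`, `status`), and the two percentages it reports are all made up, since the real reports have their own shape. The point is only that the counting comes from the JSON itself, not from a model's confident guess.

```python
import json
from collections import Counter

# Hypothetical sample export; the real report's fields are not shown here.
SAMPLE = """
[
  {"id": 1, "kind": "incident", "status": "resolved"},
  {"id": 2, "kind": "incident", "status": "open"},
  {"id": 3, "kind": "request",  "status": "resolved"},
  {"id": 4, "kind": "request",  "status": "resolved"}
]
"""


def pct(part, whole):
    """Percentage rounded to one decimal place; 0.0 for an empty input."""
    return round(100 * part / whole, 1) if whole else 0.0


def summarize(records):
    """Count records by kind and compute two illustrative percentages."""
    total = len(records)
    kinds = Counter(r["kind"] for r in records)
    resolved = sum(1 for r in records if r["status"] == "resolved")
    return {
        "total": total,
        "kinds": dict(kinds),
        "pct_resolved": pct(resolved, total),
        "pct_incident": pct(kinds.get("incident", 0), total),
        # Anything not resolved goes on the "look at this" list.
        "investigate": [r["id"] for r in records if r["status"] != "resolved"],
    }


def format_table(summary):
    """Render the summary as a plain two-column text table."""
    rows = [("total records", summary["total"])]
    rows += [(f"kind: {k}", n) for k, n in sorted(summary["kinds"].items())]
    rows.append(("% resolved", summary["pct_resolved"]))
    rows.append(("% incident", summary["pct_incident"]))
    ids = ", ".join(str(i) for i in summary["investigate"]) or "none"
    rows.append(("investigate (ids)", ids))
    width = max(len(label) for label, _ in rows)
    return "\n".join(f"{label:<{width}}  {value}" for label, value in rows)


summary = summarize(json.loads(SAMPLE))
print(format_table(summary))
```

Same input, same table, every time, and if a number looks wrong you can read the twenty lines that produced it.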

Which is not to say Gemini is dead to me, because it’s much faster than I am at cranking out the kind of script I need to run: I’m an indifferent coder at best. I just have to think about how to teach this stuff to my team, among whom I am more like a creature from a mysterious and ancient computing culture than a friendly guide who will take them over the rainbow bridge to The Singularity.

So I signed up for some courses. I don’t have a knowledge or skill problem so much as I have a pedagogical technique problem, so I’m gonna sit still and allow myself to be trained, the better to train.