I use a lot of homemade YouTube videos as part of my lab courses. Most of the wet labs have a short (10–15 min) prelab video that covers whatever concept we’re exploring that week. Miami is also a little unusual (for US departments, at least) in that we teach the spectroscopy of organic compounds as part of the lab courses. In my classes, we do this using an inverted classroom approach, where the students watch longer (~40 min) lectures on IR spectroscopy, NMR spectroscopy, or mass spectrometry and then do an assignment in class with help from me and the TAs.
So, there are a fair number of videos associated with these courses, most of which date from around the pandemic era. An ongoing problem, however, has been that the closed captions for these videos are just the ones generated automatically by YouTube, and they are terrible. Captioning by YouTube has improved in recent years, but five years ago it produced long, stream-of-consciousness rants free of punctuation or capitalization. Even for newly uploaded videos, the captions tend to misinterpret chemistry words and are just too literal, transcribing every word exactly as spoken. I think the captions have more value if they edit out filler words (“um”) and correct misspeaking. (I have a bad habit of occasionally false-starting sentences, double-speaking the first word or two. I don’t think it’s too bad in most of the videos, but I’d rather that the captions skip over mistakes like this.)
New standards associated with the Americans with Disabilities Act mean that all course videos now need to have quality captions. Honestly, though, I should have fixed these years ago. My wife and I can’t even watch Ted Lasso without turning on the closed captions. The value for students new to the chemistry “language” is obvious. The problem is that correcting hours of poorly constructed autogenerated captions by hand is extremely tedious.
I have recently become AI-curious, and this seemed like a good test project for an LLM. I am skeptical of most of the hype around LLMs, but if they are going to be useful for anything, surely it should be manipulating language.
