We’re launching Machines on Paper from New York, where coffee shops buzz with people sipping hot beverages and planning trips to the Catskills. But if you listen closely — and direct your eavesdropping towards the tables of men with laptops — you’ll hear excitement about something else. Look West, they are saying. Find gold. Generate text.
The AI revolution has made it to New York and if it makes you anxious, you are not alone. I grew up on a hype-resistant diet of public television and Midwestern pragmatism and would prefer to ignore the whole thing, but the reality is that the new AI technology is really freaking good. It is now possible to generate text, images, video (imminently) and code (definitely) that can pass for — and sometimes exceed — human output. Even skeptics like me have to acknowledge that AI will probably lead to radical changes in white collar work.
Here at Machines on Paper, we’ll be following the integration of AI into our professional lives with curiosity, humor and a smattering of computer science. We’d love to hear from YOU about what you want to learn, especially if it's a basic question that you’re embarrassed to ask. You can send us your questions anonymously here, or send us an email at email@example.com.
We’ll kick off our Q&A content with a question from a friend. Thank you Hannah for your accidental submission, I hope the following answer is clearer than whatever I drew on that napkin at the bar.
Reader Question: What is a machine learning model?
It is a great question, Hannah, but before I try to answer it again, I should quickly address the relationship between machine learning (ML) and artificial intelligence (AI). If AI is the one having a revolution, why should you care about ML?
The short answer is that in 2023, an AI model is an ML model.
But if we transported ourselves to 1960, an AI model would be a set of hard-coded rules (If the human asks “who are you?”, print “I am your computer friend”) that tried and failed to brute force its way to making computers act as intelligent as humans. Decades later, the field of machine learning developed techniques to use statistics and data to do much less ambitious things with computers, such as identifying spam and predicting ad click rates. These ML techniques became increasingly accurate and are now so powerful that they can do the “intelligent” tasks that the old-school AI community tried and failed to achieve. You could say that ML has "solved" parts of AI, which is why modern ML techniques are often referred to as AI.
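The old-school approach is easy to sketch in code. Here is a toy rule-based "chatbot" in the 1960s spirit; the rules are invented for illustration, not taken from any real system.

```python
# Old-school, rule-based "AI": every behavior is hand-coded by a human.
# These rules are illustrative, not from any actual 1960s program.
def rule_based_chatbot(message: str) -> str:
    if message.lower() == "who are you?":
        return "I am your computer friend"
    if "hello" in message.lower():
        return "Hello, human."
    # Brittle by design: anything the author didn't anticipate fails.
    return "I do not understand."

print(rule_based_chatbot("who are you?"))  # I am your computer friend
```

Every input the programmer didn't think of falls through to the last line, which is why brute-forcing intelligence this way never worked.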
Okay, so what is an ML model?
An ML model is a particular kind of computer program. Like any program, it takes some input (text, images, videos, numbers) and transforms it into some output (text, images, videos, numbers).
What makes an ML model different from your average joe computer program is that you do not need to specify how to turn the input into the output. You do not need to say, “look at the pixel in position (10,10) and if it is black, then there is a cat in the photo.” Instead, you can provide the model with examples of inputs and desired outputs and the model will “learn” how to do the transformation. In the cat example, you would need photos of cats (input) and the number of cats in each photo (desired output).
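To make the contrast concrete, here is a toy example of learning a rule from labeled examples instead of writing it by hand. The "model" is a single learned threshold on the number of exclamation marks in a message; the data and the spam framing are made up for illustration.

```python
# Toy "learning from examples": instead of hand-writing a spam rule,
# we pick the rule that best fits labeled examples. (Made-up data.)
examples = [
    ("WIN CASH NOW!!!!", 1),   # 1 = spam
    ("FREE!!! click!!", 1),
    ("lunch at noon?", 0),     # 0 = not spam
    ("see you tomorrow", 0),
]

def exclaim_count(text: str) -> int:
    return text.count("!")

# "Training": try candidate thresholds, keep the one that classifies
# the most examples correctly.
best_threshold, best_correct = 0, -1
for threshold in range(10):
    correct = sum(
        (exclaim_count(text) >= threshold) == bool(label)
        for text, label in examples
    )
    if correct > best_correct:
        best_threshold, best_correct = threshold, correct

print(best_threshold)  # a rule nobody hand-coded
```

The threshold plays the role of a model parameter: we never told the program where the spam boundary is, only showed it examples and let it find the boundary itself.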
I’m going to attempt to explain the process of model training without math, so instead of talking about computer programs, I’m going to talk about recipes.
Like computer programs, recipes contain a set of ordered instructions for how to take inputs (ingredients) and transform them into outputs (a dish).
Recipes also include knobs that home cooks can tweak to change the output. Want more tomato flavor? Increase the NUMBER OF TOMATOES knob from 3 to 4.
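In programming terms, a recipe with knobs is just a function with parameters. A minimal sketch, with arbitrary default quantities:

```python
# A recipe as a tiny program: the quantities are the "knobs"
# (in ML terms, the parameters). Defaults here are arbitrary.
def gazpacho(tomatoes: int = 3, cucumbers: int = 1, onions: int = 1) -> str:
    # The instructions are fixed; only the quantities vary.
    return f"Blend {tomatoes} tomatoes, {cucumbers} cucumbers, {onions} onions; chill."

print(gazpacho())            # default knobs
print(gazpacho(tomatoes=4))  # more tomato flavor
```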
Let’s say we know what the inputs are to make gazpacho, but we want to learn a recipe. So we’re starting from a recipe full of unknowns:

- Ingredients: ?? tomatoes, ?? cucumbers, ?? onions, ?? olive oil
- Step 1: Mix. Step 2: Cool. Step 3: Smash.
The machine learning approach to figuring out how to make a good gazpacho would be to:
1. Fill in the ??’s randomly. Take 20 tomatoes, 1 cucumber, 0 onions and no olive oil. Mix, cool, smash.
2. Find people you trust to taste your terrible gazpacho and give you feedback based on their gazpacho tastes. Way too much tomato flavor. I don’t taste any onions. The consistency should be smoother.
3. Adjust the recipe given the collective feedback and go back to step 2, repeating until you can make gazpacho that most people mostly like.
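The loop above can be sketched in a few lines of code. Here the tasters' collective feedback is stood in for by an invented `taste_score` function with made-up ideal quantities; a real model would get its feedback from training examples, not a hard-coded answer key.

```python
import random

# A minimal sketch of the fill-taste-adjust loop. The "ideal" quantities
# and the taste_score function are invented stand-ins for real tasters.
IDEAL = {"tomatoes": 6, "cucumbers": 2, "onions": 1, "olive_oil": 3}

def taste_score(recipe):
    # Higher is better: penalize distance from what the tasters like.
    return -sum(abs(recipe[k] - IDEAL[k]) for k in recipe)

random.seed(0)
# Step 1: fill in the ??'s randomly.
recipe = {k: random.randint(0, 20) for k in IDEAL}
initial = dict(recipe)

# Steps 2-3: propose a small tweak, keep it only if tasters prefer it.
for _ in range(500):
    knob = random.choice(list(recipe))
    tweaked = dict(recipe)
    tweaked[knob] = max(0, tweaked[knob] + random.choice([-1, 1]))
    if taste_score(tweaked) > taste_score(recipe):
        recipe = tweaked

print(recipe)  # should end at or near the tasters' ideal
```

Real training doesn't tweak one knob at a time at random (calculus makes the adjustments far more efficient), but the shape of the loop, guess, get feedback, adjust, repeat, is the same.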
That’s the gist of model training. The model in this case is the recipe structure: three ordered steps and adjustable quantities of the four ingredients. The training examples come from your poor friends, who are willing to taste each attempt and tell you how to make the recipe better fit their tastes.
Given a recipe structure, there’s an upper limit to how good the result can be. You may notice that there is no salt in the recipe above, so in this case, that upper limit is quite low.
ML models that have more parameters (more ??’s that need to be learned) can be trained to be more accurate (more tasty), but they also require more examples (more taste testers) to learn. The large language models you’ve seen that can write like humans have tens of billions of parameters, and are trained on all available digitized text.
I hope that leaves you with some useful intuition. And I know it is no longer gazpacho season, but this is my favorite recipe for when it rolls around again.