March 11, 2025

Simon Willison on using LLMs to write code

Simon, who co-created the popular Python web framework Django, published a long blog post about this today (here).

He's been using LLMs in his coding workflow for about two years now.

The context available to the LLM is a major factor in its effectiveness. A product design corollary to this is that LLM context should be accessible to the user.

One of the reasons I mostly work directly with the ChatGPT and Claude web or app interfaces is that it makes it easier for me to understand exactly what is going into the context. LLM tools that obscure that context from me are less effective.

Cursor, the code editor I use, automatically indexes the existing repo so that relevant files can be pulled into its context. I've found that Cursor's agent, using Claude 3.7 Sonnet, can navigate the repo and find relevant context for a given task. Also, Cursor apparently trained its own powerful embeddings model (source).
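To make that concrete, here's a rough sketch of how embedding-based codebase retrieval generally works: chunk the repo, embed each chunk, then rank chunks by similarity to the task description. This is purely illustrative, not Cursor's actual implementation; the OpenAI embedding model, the one-chunk-per-file split, and the helper names are my own assumptions.

```python
# Sketch of embedding-based repo retrieval (illustrative, not Cursor's implementation).
from pathlib import Path
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

def embed(texts: list[str]) -> np.ndarray:
    """Embed a batch of strings with an off-the-shelf embedding model."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# 1. Chunk the repo: here, one chunk per source file (real tools split more finely).
files = [p for p in Path(".").rglob("*.py") if p.is_file()]
chunks = [p.read_text(errors="ignore")[:8000] for p in files]

# 2. Embed every chunk once; cache this index and refresh it as files change.
index = embed(chunks)

# 3. At query time, embed the task and rank chunks by cosine similarity.
def relevant_files(task: str, k: int = 5) -> list[Path]:
    q = embed([task])[0]
    sims = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    return [files[i] for i in np.argsort(sims)[::-1][:k]]

print(relevant_files("Add a checkout flow for sponsorship bookings"))
```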

LLMs are especially useful in the research phase:

Most of my projects start with some open questions: is the thing I'm trying to do possible? What are the potential ways I could implement it? Which of those options are the best? I use LLMs as part of this initial research phase.

I tend to use ChatGPT o1 for this. I ask a question like, "I'm looking to build a software application that does X, Y, Z. Can you suggest a few approaches to building this, ideally ranging from extremely simple to complex?"

I may also include parameters such as "I'd like to use Node and React for this" or "Let's use Supabase for this project."
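For what it's worth, the same research-phase prompt can be sent through an API instead of the ChatGPT app. Here's a minimal sketch using the OpenAI Python client; the model name and the prompt text are placeholders to adapt to your own project.

```python
# Sketch: a research-phase prompt sent via the OpenAI API rather than the ChatGPT app.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

prompt = (
    "I'm looking to build a software application that does X, Y, Z. "
    "Can you suggest a few approaches to building this, ideally ranging "
    "from extremely simple to complex? "
    "Constraints: I'd like to use Node and React, with Supabase for the backend."
)

response = client.chat.completions.create(
    model="o1",  # any strong reasoning model works for this step
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```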

Once you've identified your approach, you micro-manage the LLM:

Once I've completed the initial research I change modes dramatically. For production code my LLM usage is much more authoritarian: I treat it like a digital intern, hired to type code for me based on my detailed instructions.

I've found Cursor's agent to be very good at handling open-ended prompts like "Add a list of weeks that the user can multi-select to book one or more sponsorships and include a checkout flow." What I tend to have to be specific about is design details, such as "A fixed footer should reflect the current spending total, with the dollar tally on the left side and a primary color 'Proceed to Checkout' button on the right."

If I don't like what an LLM has written, they'll never complain at being told to refactor it! "Break that repetitive code out into a function", "use string manipulation methods rather than a regular expression", or even "write that better!"—the code an LLM produces first time is rarely the final implementation, but they can re-type it dozens of times for you without ever getting frustrated or bored.
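One way to picture that refactoring loop programmatically: append each follow-up instruction to the same conversation so the model always sees its previous attempt. A rough sketch with the OpenAI Python client follows; the model name and the prompts are just examples.

```python
# Sketch: iterating on generated code by appending refactor instructions to one conversation.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set

# Initial detailed instruction, followed by a series of refactor requests.
messages = [{"role": "user",
             "content": "Write a Python function that parses ISO-8601 dates out of a log file."}]
follow_ups = [
    "Break that repetitive code out into a function.",
    "Use string manipulation methods rather than a regular expression.",
    "Write that better!",
]

def ask() -> str:
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    content = reply.choices[0].message.content
    # Keep the model's previous attempt in the history so each refactor sees it.
    messages.append({"role": "assistant", "content": content})
    return content

draft = ask()
for instruction in follow_ups:
    messages.append({"role": "user", "content": instruction})
    draft = ask()

print(draft)  # the final revision after all three refactor passes
```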

In my experience there's a bit of a learning curve to this. You get really impressed by the work that products like Cursor's agent and ChatGPT produce very quickly, and there's a temptation to adopt a mental model that you're talking to a highly capable software engineer. So when a mistake is made or the agent goes down the wrong path, you want to gently give it constructive feedback to change course or roll back. You brace for a difficult conversation, but the immediate reply is chipper, almost as if it's excited that there's more work to do.

The best way to learn LLMs is to play with them. Throwing absurd ideas at them and vibe-coding until they almost sort-of work is a genuinely useful way to accelerate the rate at which you build intuition for what works and what doesn't.

Something I didn't know about until reading this piece is that Simon has an open-source repo of over 80 mini apps ("tools") that he's built using LLMs. Check it out on GitHub here. It's an early example of the discontinuous productivity individuals now have because of LLMs.