← back to bio

Thoughts on Coding Agents and Creative Agency

April 1, 2025

Reflections on building personal software through AI coding ("codegen") tools — from a Chrome extension to interactive websites — and what I've observed as someone without a traditional software development background.

I recently began dedicating a significant portion of my time to developing software with codegen tools. As a product leader with a social science and humanities background, I set out to test the limits of these tools by building 100% through prompting. A week into this journey, the term 'vibe coding' was coined and a community of such explorations has blossomed. For me, the experience has been expansive.

  • I am building small pieces of software that feel beautiful, personal and helpful.
  • I am exploring ways to connect the dots for powerful experiences even with quite limited resources (eg: in a Chrome Extension vs. an OS).
  • I am waking up each day excited to see how the latest model, tooling or protocol will change what's possible for me.

I wanted to share a few reflections so far — on what I’ve made, practices for effective building, and challenges I’ve observed with these tools in their present form.

I end with early thoughts on the evolution of generative code for people everywhere.

My background

I studied Comparative Literature and Global Affairs at Yale. I focused on statecraft, postcolonial theory, and political philosophy (and spent zero days learning to code). My work examined how narrative, policy, and networks shape the social fabric in a globalized world. After graduating, driven by the same macro forces, I turned to investing and technology. I began at D.E. Shaw, then LinkedIn, and now Google. I've worked with SQL and Python for data analysis but only recently began building software myself.

My Tech Stack

My reflections are all based on my experience with Windsurf, with websites hosted on Netlify. I also experimented with Lovable, Bolt, and Cursor when starting out. I found the agentic IDEs most intuitive and liked that they would give me the most long-term flexibility. Windsurf was more agent-first, which is what I was interested in, so I chose it over Cursor.

I use Netlify to host all of my websites, which allows me to deploy directly from the IDE.

Four Projects and Some Takeaways

I've worked on four personal projects so far, each essentially 100% generated and each with its own quirks and lessons.

Interactive Personal Portfolio

In the span of ~7 days I created this interactive personal portfolio.

My goal was to reimagine my existing portfolio – a static set of digital art and writing – into an interactive playscape that encouraged curiosity and discovery.

I aimed to do this through things like: new UX interactions requiring discovery for simple tasks (eg: the bouncing circle for navigation), and unconventional use of established paradigms to nudge exploration (eg: the shopping metaphor to encourage engagement with questions, images and poems).

(Demo flow: Shop → View Cart → Checkout → Print Receipt)

Most Fun

Leaning on the model to increase efficiency while testing out how 'aligned' our taste and creativity were.

For example, requesting:

  • "Five different fortune teller personalities and random selection among them for our poem generation prompt"
  • "50 questions touching on the viewer's current state, experience in the gallery, and childhood to gain information for their poem, using personal data similar to that in my own writing"

Most Frustrating

Formatting the 'receipt' for checkout.

It was challenging to identify which DOM elements were causing extra white space.

The model also consistently failed to dynamically adjust the html2canvas capture height based on poem length, due to poor sequencing of the poem-container height calculation. Being unseasoned, I took a while to figure this out.
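The underlying pattern, as I understand it, is that the container's height must be measured only after the poem has actually rendered, and then passed to html2canvas explicitly. The sketch below illustrates that sequencing; the element id, function names, and dimensions are my own hypothetical examples, not taken from the portfolio's code.

```javascript
// Pure helper: estimate the capture height from the rendered poem.
// In the browser the reliable source of truth is scrollHeight, read
// *after* the poem text has been inserted into the DOM.
function computeCaptureHeight(lineCount, lineHeightPx, verticalPaddingPx) {
  return lineCount * lineHeightPx + 2 * verticalPaddingPx;
}

// Browser-side usage (illustrative, assumes a #poem-container element
// and the html2canvas library loaded on the page):
//
// async function printReceipt(poemLines) {
//   const container = document.querySelector('#poem-container');
//   container.textContent = poemLines.join('\n');
//   await new Promise(requestAnimationFrame); // let layout settle first
//   const height = container.scrollHeight;    // measure AFTER render
//   const canvas = await html2canvas(container, { height });
//   // ...hand the canvas off to the receipt printer
// }
```

The key fix is the ordering: render, wait for layout, measure, then capture — measuring before the poem lands in the DOM produces a stale height.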



Practical notes for effective building

Newcomers to building software should be prepared to get into the weeds even when building through prompting. Areas where I've invested time:

  • Refining few-shot prompting methods to avoid error-prone iteration. I've found the best 'few-shot' prompts include broad context about my goals (so the model can take creative liberties aligned with the overarching objective), outline key features, and specify the UI vibe. Then I coax the model to 'keep going' until we have a working prototype. I've also found that asking the model to "follow best practices to [eg: implement a cart feature]" is useful.

    For example, sharing "This is going to be a Valentine's day website for my husband who loves retro tech and is Japanese" gave the model a clear direction for design leeway.

  • Co-deliberating on approach before implementing. This includes examining the model's code to acquire new vocabulary for precise prompting, and asking questions to guide the model to good decisions.

    For example, I frequently ask questions like: "Is this the simplest way?" "What are some other options, and the pros & cons?" And for Git commands, often: "What will that command do?" or "That's the wrong command (sic)".

  • Building a testing and debugging toolset. The models excel at writing their own tests and logs, and at walking you through where and how to monitor them. I've also found custom debugging interfaces useful, especially when richer visual feedback is helpful.

    For my interactive portfolio, I built a custom tool to track user responses and poem views. When prompted to generate a debugging interface, the model dreamed up a tiny, animated terminal-like tool that served my exact purpose:

    Check question capture

    Check poem capture
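A tool like this can be surprisingly small. Below is a minimal sketch of the core of such a debug log — a bounded in-memory event store that a terminal-style panel could render from. All names and event kinds here are hypothetical; this is my reconstruction of the idea, not the generated tool itself.

```javascript
// Minimal debug-event log: records capture events (e.g. question
// answers, poem views) and keeps only the most recent maxEntries.
function createDebugLog(maxEntries = 50) {
  const entries = [];
  return {
    record(kind, detail) {
      entries.push({ kind, detail, at: Date.now() });
      if (entries.length > maxEntries) entries.shift(); // evict oldest
      return entries.length;
    },
    byKind(kind) {
      return entries.filter((e) => e.kind === kind);
    },
  };
}

// Browser-side rendering into a terminal-style panel (illustrative):
// const log = createDebugLog();
// log.record('question', { id: 7, answer: 'a rainy porch' });
// log.record('poem', { viewedForMs: 12000 });
// panel.textContent = log.byKind('question')
//   .map((e) => `> Q${e.detail.id}: ${e.detail.answer}`)
//   .join('\n');
```

Keeping the store bounded and pure makes it easy to drop into any page, and the rendering layer is free to be as playful (animated, terminal-like) as the model likes.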

Current limitations and mitigations

While using codegen to build something like the Valentine's Website is quite straightforward, there are fundamental limitations - relevant to everyone, not just newcomers - that make more complex projects trickier.

Here's my stack-rank of these limitations by level of pain and how I mitigate them.

  1. Code Quality

    There are still challenges with producing errorless, efficient code. Even functional solutions are often overcomplicated as the models seem to want to always do more. Recent improvements like automatic lint-error correction and bringing local processes into context (so the model can read console logs) are helpful in improving quality, but challenges with overcomplication, structural mistakes, and inefficient architecture choices remain even in the latest models.

    As a mitigation, for simple changes I remain vigilant during execution and intervene with corrections (for example, "we already have a function for that, use it"). For more complex features or changes I ask the model to explain its plan before executing. The model returns a thorough description and code snippets. I then ask follow-up questions, using this as an opportunity to proactively identify errors and interrogate the approach to ensure we are making efficient, robust design choices.

  2. Context Limitations

    There are also challenges with context retrieval, especially as the codebase grows, given context-window limitations. Large refactors or changes with cross-cutting dependencies can become tricky to get right. Context limitations, paired with the models' tendency to sometimes overcomplicate solutions, can produce a downward spiral that becomes hard to vibe-climb out of. I've encountered this with both Sonnet 3.5 and 3.7. Unfortunately, as of publishing this I have not been able to gather data on Gemini 2.5 Pro (1M-token context) due to rate limiting.

    As a mitigation, I ask the model to carefully consider the entire codebase (in chunks) and form a methodical plan before implementing. The model usually iterates through key files and aims to identify dependencies before proposing a solution. In my experience this gets ~80% of the way. The final 20% requires iteration.

  3. Error Loops

    The model repeatedly generates what it itself identifies as 'bad code,' attempting corrections and regenerating its answer for a given prompt until it finally times out.

    As a mitigation, I start a new chat with fresh context.

  4. Extraneous Changes

    The model introduces changes unrelated to what you've asked for, sometimes as extreme as restyling large portions of the UI in response to a targeted request ("lower the scroll anchor for the receipt"). I encountered this frequently with Sonnet 3.5 while developing the interactive portfolio.

    As a mitigation, when requesting targeted changes I ask the model to "focus specifically on this request and make no unrelated changes".

  5. *Note: With newer models, the frequency of Error Loops and Extraneous Changes seems to have decreased. SWAG estimate: I encounter them 5x less often with Sonnet 3.7 vs 3.5. A likely confounding variable is improvement in my own prompting.

Wrapping up

On “generative tools” and mass consumer appeal

I see high potential for people from a range of backgrounds to begin building, perhaps without even realizing it.

I imagine this taking shape in emergent and ephemeral ways: small tools generated on demand to suit a specific personal use case, with a look & feel personalized to the surrounding context. Like my little debugging tools, or my personal Chrome extension.

What if anyone could summon personal programs like these? Maybe the best personal 'agent' is the one that gives me simple yet powerful agency over what it can do, and how.

I believe even the average consumer (heuristics: my parents in Montana, my GenZ sister) would find sticky delight and usefulness in the ability to generate their own tools - shaping function and form - just as folks have enjoyed generative images, answers, and text.

We could call these personal programs 'declarative agents' or, more aptly, 'generative tools'. I cautiously imagine them to be possibly the first truly revolutionary, useful consumer application of generative AI.

Generative tools would be both personal and empowering. Built for me, “by” me, on demand. Customizable to my taste. Philosophically, ushering in a future where automation enables command over one's tasks rather than separation from them.

In this way, generative tools might be a small, regular reminder of one’s own agency - a critical component of what gives people a sense of meaning - by demonstrating in small and consistent ways: with your ideas and actions you can shape your reality.

The technical foundations feel nearly there. Still, there is a gap in commercial demand: a maker-mindset shift is needed, one where people dream a little more ("wouldn't it be nice if I could ____"). How might we bridge this gap through what we build for people now, beginning to elevate the tool-user to maker?
