Updated 22. Aug 2024
by
@OtivDev
I’m a developer with more than 10 years of experience in programming and more than 5 years with React Native. I consider myself an expert in cross-platform mobile development, but I have also worked as a full-stack developer - I’ve been working with Node for at least 9 years and have also built a couple of side projects with Elixir… This will be important later on.
When I started writing this blog post in 2023, there were claims going around that developers would be redundant very soon - that you would simply need product managers to describe what the software needs to do, and the program would be written for them.
I’ve decided to explore this by building an entire game with AI: a “simple” card collectible game built with React Native, with card-trading backend functionality built in Elixir. I called the game Terraverse, as you collect cards from different places on Earth. You can see more details about the game here.
TLDR: At this point - and maybe ever - you cannot build an entire, good, captivating game off a single prompt; it requires a lot of re-prompting. It’s also extremely hard (if not impossible) to generate a good game with prompting alone and without understanding the code, so expect that an exceptional experience will require human coders for some time.
Let’s look at some of the things that did work and are in use with Terraverse.
If default ChatGPT is good for anything, it’s generating ideas. You can simply chat with the AI, express your preferences, and eventually you will arrive at a few cool conclusions. I had a few conversations with the AI and decided to go with a card collectible game built around real-world locations.
Pro tip: Ask the AI to give you a list of ideas so you can just pick and choose instead of asking again.
[Screenshots: Core Game Loop idea; asking AI to give me ideas for specific features]
Okay, now that we have an idea, let’s try to make it. First I set up the project by following the React Native setup documentation - AI didn’t really help here, as the instructions are easy to follow and you just execute one command after another.
Let’s see what really happens when you want to code with AI. According to GitHub’s CEO, AI will soon write 80% of the code. If you look into that, you quickly realise that you need to guide the AI the entire time, and completing a line (being a better autocomplete) is not really writing the code. The Copilot they are referring to is a smarter autocomplete, and I believe that is what the 80% refers to. You can hardly ask it to write bigger chunks of code that contain a lot of logic and integrate with the rest of the system.
There are tons of tools out there to help you with the code, I tried a few and in the next few paragraphs I will try to highlight the important parts, but remember pretty much all of these tools have considerably more depth than what is written here. I’m focusing on the highlights.
Insight: Even if we stop writing code in a programming language, we will be writing prompts that are extremely detailed, but in English. In a way, we will still be programming.
Everybody knows ChatGPT and its chat interface. In terms of coding, it’s useful for really well-defined and common problems and algorithms. If I asked it to write a bigger chunk of code, it would definitely contain some mistakes that I would have to review and fix (and don’t forget, it’s easier to write code than to read it!). And then waiting for the response and going back and forth between the editor… It’s just slow. I didn’t use ChatGPT (directly) to write any code, as it felt like a waste of time correcting the mistakes and copying code to the editor over and over again. That’s where the Cursor editor comes in.
Pro tip: Don’t use ChatGPT interface to write code, use a dedicated editor (see below)
Code is harder to read than to write, so why are we letting the AI write it?
Cursor is essentially VS Code with a modified interface for AI interactions. As mentioned before, if you are using ChatGPT you have to ask it a question and provide it with context, then manually copy pieces over into your editor. This can be quite error prone, as you may not copy everything correctly. Cursor will talk to ChatGPT (or another LLM) for you and present a nice diff view so you can simply pick which parts you want. It’s a massive step up from just using ChatGPT. You can also easily provide code context by mentioning @files, @symbols, and @library docs.
It was particularly interesting how well Cursor worked when I was writing in a language I am not an expert in - Elixir. I generally knew how to approach a problem, but I didn’t know all the APIs and syntax to make it work, so it sped things up quite a bit. On the other hand, I tried to use it in TypeScript for the React Native app (which I have tons of experience with) and it slowed me down more than anything - I would constantly have to review and modify the code. It doesn’t help that there are 2000 ways to work with React.
100% of the Elixir trade server was written with Cursor.
Cursor recently released a composer mode that works a lot like Aider (described below). So it seems like useful functionalities of AI tools are converging.
Insight: If you are a noob in a programming language, it will make things go fast; if you are an expert, it will make things go slow.
Aider is a bot that talks with ChatGPT/Claude/Local LLM and modifies your repository for you.
I didn’t spend quite enough time with it to form a strong opinion, but it’s definitely not trivial to pick up. From all the options, this is the future of AI programming that seems the most promising. The main benefit is that it can operate over multiple files and do many actions for you.
It will plan the change, ask you for the files to include, and then execute the change for you. As a bonus, you can even provide documentation links to Aider and it will use the content of that documentation when producing its answer.
A few months have passed since I wrote this paragraph and Claude 3.5 Sonnet was released, which works exceptionally well. I managed to write entire libraries for other projects with it, but unfortunately, in the case of Terraverse, Claude Sonnet wasn’t yet available at the time of writing, and Aider did a pretty bad job at everything I needed.
Insight: Aider is good enough to write entire libraries with common well defined problems, but it struggles with massive repositories with thousands of files. This will likely change in the future.
When it comes to writing code, Copilot felt like the only indispensable tool. While you can use Copilot to ask questions about the code, I didn’t find this to be the most productive. There are really two ways Copilot writes code for you:
It writes code in big chunks to solve a problem
It completes lines based on context
I use it only for line completion and never to solve logical problems - it’s often incorrect or fails to consider edge cases, resulting in bugs that I have to deal with later. It can be a great benefit, though, especially when you are repeating similar lines elsewhere in the same file. For example, let’s say you have a customer object defined and want to generate types for it:
const customerObject = {
  name: 'vito',
  twitterHandle: '@otivdev',
  description: '0.001x coder',
  dateOfBirth: 1995,
}

type CustomerObject = {
  // This will all be trivially completed by Copilot:
  name: string;
  twitterHandle: string;
  description: string;
  dateOfBirth: number;
}
But! It feels weird when there is no internet connection and Copilot is not offering suggestions. My current theory is that it makes you a worse coder, since you are not thinking about the nitty-gritty details of the code and edge cases - you let Copilot handle them for you. For example, what if twitterHandle can actually be null or undefined? Copilot will not know about that in this case and will define the wrong types. So how much time does it actually save, and how many bugs does it actually introduce?
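To make that concrete, here is how the type would have to look if the handle can be missing - a hypothetical extension of the example above, not something Copilot produced:

// Hypothetical: if twitterHandle is sometimes absent, the type should say so.
// Copilot, seeing only the literal object above, would happily emit `twitterHandle: string`.
type CustomerObjectCorrected = {
  name: string;
  twitterHandle: string | null;
  description: string;
  dateOfBirth: number;
}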
Insight: You will initially feel like a god, having code completed for you in ways you never imagined, but after a while you will realise it’s making you a worse coder and that it hinders your ability to code when you have no internet.
During the early stages of development I switched to phind.com full time for searching. It’s quite good and it provides the sources of its information, which is fantastic. It can also write code, but you have the same issue as with ChatGPT - you must copy-paste the code. Again, I did not use this for coding, but for asking general coding questions it can be quite useful.
Insight: I need actual documentation / GitHub links ASAP, not a long explanation of simple questions - hence I went back to Google, so I can get to that repository as fast as possible.
Here is where it gets interesting. Since it’s a card collectible game, I knew I wanted at least a few hundred cards in the same style, potentially thousands. There are tools to generate images out there, but there are no tools to generate the massive sets of cards that I needed. As a result, if you want to generate a lot of different images that are also meaningfully connected to your data, you will need to create your own dashboard for it.
I made my dashboard for generating cards public, you can have a look here: Dashboard (not mobile friendly)
So how do you do this? In 2 steps:
Generate card data
Generate images based on the card data
Insight: It’s up to you to connect all the tools together, and this is potentially an interesting space for a business idea.
I probably approached this problem sub-optimally because I wanted ChatGPT to generate all the cards from the get-go. What started to happen is that, as the AI approached the context size limits, it didn’t know which cards came before, so it started to repeat itself. Essentially, you need to figure out how to break your cards down into groups that are small enough to be processed in their entirety, and then ask it to generate them group by group.
This was a limitation in 2023; with the context sizes of 2024 (120k+ tokens) this issue may be non-existent, but I haven’t yet updated the code to use the new models.
Eventually, I updated the dashboard so you can just choose a continent and a set, press the generate cards button, and it will take all the existing cards in that set, send them to ChatGPT, and ask for a few new ones. This seems to work pretty well for now.
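For the curious, here is a minimal sketch of what that “a few more cards for this set” request could look like - assuming the standard OpenAI chat completions endpoint and a made-up Card shape; the actual dashboard code is more involved:

// Hedged sketch: send only one set's existing cards so the model stays well inside the context limit.
type Card = { name: string; continent: string; description: string };

async function generateCardsForSet(setName: string, existing: Card[], count = 5): Promise<Card[]> {
  const response = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: 'gpt-4',
      messages: [
        { role: 'system', content: 'You design cards for a collectible card game. Reply with a JSON array of cards.' },
        {
          role: 'user',
          content: `Set: ${setName}. Existing cards: ${JSON.stringify(existing)}. Generate ${count} new cards that do not repeat any of the existing ones.`,
        },
      ],
    }),
  });
  const data = await response.json();
  // Parsing and validation of the model's JSON reply is omitted for brevity.
  return JSON.parse(data.choices[0].message.content);
}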
You can do image generation via a remote API, just like with ChatGPT, or you can use Stable Diffusion - a generative model you run locally. One benefit of Stable Diffusion is that you have much more control and can guide the AI: with Midjourney or DALL-E you can pretty much only use the prompt to generate what you want, but with Stable Diffusion you have so many more options. For example, there are LoRAs, ControlNets, prompt mixers, styles…
Let’s explore what went into Terraverse Card generation:
Prompt mixers allow you to specify multiple words, and as batches of images get generated, those words get swapped. For example, the prompt:
[tree|river] landscape
will generate “tree landscape” in the first iteration and “river landscape” in the next.
Really useful if you want to generate tons of different images and compare them, but not much more. I didn’t generate batches of images in existing tools; my dashboard took care of batches and the data from ChatGPT took care of variety.
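For illustration only, here is roughly what a mixer does with a single bracketed group - a toy TypeScript sketch, not any particular tool’s implementation:

// Expands one [a|b|c] group, cycling through the options per batch index.
function mixPrompt(prompt: string, batchIndex: number): string {
  return prompt.replace(/\[([^\]]+)\]/, (_match, group: string) => {
    const options = group.split('|');
    return options[batchIndex % options.length];
  });
}

mixPrompt('[tree|river] landscape', 0); // 'tree landscape'
mixPrompt('[tree|river] landscape', 1); // 'river landscape'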
Insight: A prompt mixer on its own is not really a useful tool. How is a folder with 350 images that are not linked to your data helpful?
LoRAs (Low-Rank Adaptation) are tiny models used for fine-tuning: you can take existing images you like, train a LoRA model on them, and then use that LoRA in future generations to get images similar to the ones you trained it on.
I used them mostly as a stylistic consistency tool: two open-source LoRAs, one for landscapes and one for cities. Depending on the card they have different weights, so all the landscape images look alike and all the city images have a consistent element to them as well.
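To give a feel for it, a per-card LoRA choice might look something like the snippet below - the <lora:name:weight> tag is the common Stable Diffusion web UI convention, and the names and weights here are made up, not the ones Terraverse actually uses:

// Hypothetical: pick a LoRA tag (and weight) depending on the card type.
function loraForCard(card: { type: 'landscape' | 'city' }): string {
  return card.type === 'landscape'
    ? '<lora:landscapeStyle:0.8>'
    : '<lora:cityStyle:0.6>';
}
// The returned string becomes the ${lora} fragment of the style prompt shown later in this post.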
Unlike LoRAs, ControlNets are not tiny. In essence, ControlNets allow you to guide image generation with another image. For example, you can take a depth map, supply it to image generation with a ControlNet, and the resulting image will have the same depth. If you’ve recently seen QR codes that look a bit funky, those were made with ControlNets.
Here is an example:
You can also combine and control multiple ControlNets to get exactly what you want.
For Landscapes
I stupidly decided to generate bigger images than the model was trained on, so there were tiling issues. One way I was able to work around this was to include a simple gradient from a white bottom to a black top and use it in a depth ControlNet. This works because landscapes always have the closest point, such as the land mass, at the bottom of the image and the furthest point, like the sky or clouds, at the top.
This was an option that was turned off by default and I would turn it on if I had tiling issues.
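Here is a hedged sketch of how such a gradient can be produced as a grayscale depth image - written as a simple PGM file so no image library is needed; the real dashboard may generate it differently:

import { writeFileSync } from 'fs';

// Vertical gradient: white (near) at the bottom, black (far) at the top,
// which roughly matches the depth of a typical landscape.
function writeGradientDepthMap(path: string, width = 768, height = 512): void {
  const pixels = Buffer.alloc(width * height);
  for (let y = 0; y < height; y++) {
    const value = Math.round((y / (height - 1)) * 255); // 0 at the top, 255 at the bottom
    pixels.fill(value, y * width, (y + 1) * width);
  }
  // P5 = binary grayscale PGM; header is "P5\n<width> <height>\n<maxval>\n".
  writeFileSync(path, Buffer.concat([Buffer.from(`P5\n${width} ${height}\n255\n`), pixels]));
}

writeGradientDepthMap('landscape-depth.pgm');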
For Landmarks
If you use only Stable Diffusion and ask it to generate any landmark, even a super famous one, the result will closely resemble it, but almost always there will be obvious mistakes in the image - especially noticeable if you know the landmark or have seen it in real life. I really wanted to avoid that.
So I sent my girlfriend on a quest to help me collect copyright-free reference images for all the landmarks, and those images were fed into landmark generation. I used the “Depth” and “Canny” models, depending on the landmark and the details it required.
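If the generation runs through something like the AUTOMATIC1111 web UI’s local API (the post doesn’t say which front end was used, so this is an assumption), a landmark request with a Canny reference image could look roughly like this - the ControlNet argument shape in particular varies between extension versions, so treat it as an approximation:

import { readFileSync } from 'fs';

// Hedged sketch: txt2img call against a locally running Stable Diffusion web UI,
// passing a base64-encoded reference photo to the ControlNet extension.
async function generateLandmark(prompt: string, referenceImagePath: string): Promise<string[]> {
  const reference = readFileSync(referenceImagePath).toString('base64');
  const response = await fetch('http://127.0.0.1:7860/sdapi/v1/txt2img', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      prompt,
      negative_prompt: 'blurry, deformed',
      steps: 30,
      width: 512,
      height: 768,
      // Assumed ControlNet extension payload; switch module to 'depth' for the Depth model.
      alwayson_scripts: {
        controlnet: { args: [{ input_image: reference, module: 'canny', weight: 1.0 }] },
      },
    }),
  });
  const { images } = await response.json(); // base64-encoded result images
  return images;
}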
If you subscribe to the Terraverse newsletter here, you can reply to the emails with your own landmark images and I may include them in future updates!
[Screenshots - on the left: Landscape Generation Controls; on the right: Landmark Generation Controls]
While Midjourney, for example, will automatically “enhance” your prompt with various interesting styles, Stable Diffusion does not do this by default. But it is really easy to do: you write out a style and simply append it to every prompt.
It can be anything from artists’ names to elements you want in the picture.
Here is what is considered the style of a “Terraverse” picture:
const prompt = `intricate detail, realistic, soft shaded,
soft dramatic back lights, landscape, ((comic book)),
[Rayleigh scattering], atmosphere, shadow,
intricate detail, light cast, sun shafts, (beautiful), realistic,
(unreal engine), (rendered:${renderedWeight})(${card.continent}),
(((${card.name}))) ${characteristicList}
(horizon:${horizonWeight}), ${lora}, [[[sky]]],(${card.predominantColor}), ${artists}`
Marketing is such a broad topic that it’s impossible to cover in one chapter of a blog post. AI was useful in many aspects:
Generating logo (again stable diffusion with control net to get the shape I wanted)
Generating ideas (for blog posts, even this one!)
Writing copy (for website)
What didn’t work?
Writing entire blog posts
I tried Jasper AI last year and it definitely improves the experience, but I am not sure it’s worth it for the few blog posts that I write
It doesn’t have enough context to tell you the interesting things; it doesn’t say anything novel.
It doesn’t have a big enough context size to write a post from start to finish (I currently only have access to GPT-4 with an 8k token limit).
Doing any kind of automation
It was mostly useful for one-off tasks
I am no marketer myself, but the biggest problem, as with development, is connecting it all together. You still need a human to prompt ChatGPT and transfer the copy to other programs.
Here are 3 of my favourite tips that helped with marketing (a little):
Pro tip: Create a personal “AI Prompts” document where you save things you use often.
Pro tip: Write down a project brief that you can copy-paste into various questions so the AI immediately has more context.
Pro tip: Write a blog post on your own, ask ChatGPT to analyse and extract the writing style, and use it in the future to help you write in your own style.
Image generation: $0
ChatGPT: <$4
Various Different Tools that didn’t work: $40
Yeah, but don’t trust it as it still makes too many mistakes. Keep learning and stay up to date.
No, but it can help you do the tasks you otherwise couldn’t do on your own. In my example - I would never be able to draw 1000s different landscapes.
In terms of programming, it can code entire projects as long as their scope is small and contained. For a game like Terraverse, there were a lot of animations that needed to feel good when the user interacted with them, so they required a lot of manual tweaking. Furthermore, in a fast-changing ecosystem such as React Native, it was often suggesting out-of-date code that simply didn’t work.
Lastly, you know how good programmers keep refactoring their codebase to make it more maintainable? AI doesn’t do this unless you ask it to, so it will just stuff more and more changes into your codebase, making it hard to understand. Don’t forget to prompt it to refactor!
No, not yet at least.
Right now it definitely feels like there are many separate systems that someone has to connect together, so we are safe for at least some time. AI is good, even great, at some tasks, but all of those tasks are orchestrated and connected to other systems by humans. In the case of Terraverse, I had to connect image generation with ChatGPT and the app.
I don’t see a massive shift happening in that space just yet. If anything, it’s enhancing our jobs and making them faster; in the future you might do a lot of “coding” in English instead of your programming language of choice.
Dunno, I would do it if I knew.