Monday, May 20, 2024
Art & Design, Artificial Intelligence

Dreaming In Isometric: Variations on Midjourney Prompts

I’ve been experimenting quite a bit with Midjourney, a generative AI platform that produces images based on text prompts (and vice versa). Having recently upgraded from the “Basic” plan ($8/mo) to the “Standard” plan ($20/mo), I now have effectively unlimited “relax” time. That means the prompts don’t render instantaneously, but they didn’t render instantaneously anyway, so I figured, why not? I wanted to go deep into the question of how to create permutations on a simple prompt, and to see what the platform’s limitations are for doing this. I settled on a simple prompt with variations:

/imagine prompt: isometric tokamak, isometric rendering on a white background, {various parameters that I changed every time}
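For anyone who would rather generate the list of prompts than retype it, here is a minimal Python sketch of that template. The base prompt is the one above and the style list borrows from the variations below, but the function name `build_prompts` and the exact wording of each modifier are assumptions for illustration, not necessarily the literal text that went into Discord.

```python
# Base prompt from the article; only the trailing modifier changes each time.
BASE_PROMPT = "isometric tokamak, isometric rendering on a white background"

# A handful of the variations discussed below (wording is illustrative).
STYLES = [
    "pink and white",
    "cyan and white",
    "cyan and pink",
    "cyberpunk",
    "green and gold",
    "steampunk",
]


def build_prompts(base: str, styles: list[str]) -> list[str]:
    """Return one /imagine prompt string per style modifier."""
    return [f"/imagine prompt: {base}, {style}" for style in styles]


if __name__ == "__main__":
    # Print each prompt so it can be pasted into Midjourney one at a time.
    for prompt in build_prompts(BASE_PROMPT, STYLES):
        print(prompt)
```

Keeping the base string fixed and swapping only the last clause makes it easier to tell which change in the output came from which change in the prompt.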

Toke-A-Mac? Is That The Name Of A Northern Michigan Cannabis Festival?

(Please clap for that joke.) I first encountered the term “tokamak” while playing the critically acclaimed video game Outpost (1994), which is sort of like Sim City or Civilization but sci-fi, and was really difficult for 12-year-old me to master. In our household, we did not buy new video games, but rather waited for them to be heavily discounted in second- or third-run retail packages of ten games of yesteryear. In the primitive age of the internet, this meant that even the very term, let alone the concept of an experimental fusion reactor, took on, and to this day still holds, a certain fascination as part of an imagined world beyond.

Tokamak is an abbreviation from the Russian, тороидальная камера с аксиальным магнитным полем (toroidal’naya kamera s aksial’nym magnitnym polem), literally “toroidal chamber with an axial magnetic field,” which means, basically, magnetic donut chamber (“donut” because no one knows what a torus is, but we all know what a donut is). It’s a fusion reactor, essentially, and they’re all some permutation of a circular donut shape. Any imagined version of this usually includes the circular shape. Whether it includes other shapes, though, is an interesting question raised by the variations I’ve produced with MJ.

Why Tokamak?

I wanted to pick something that was going to have a reasonably predictable basic shape and implications, and with enough variation to allow for easier experimentation (n.b.: “industrial sciencey thing of some sort, based on a torus shape as the main component”). But I wanted something that was obscure enough to allow me to experiment without being pigeonholed into a specific thing that the platform would have a hard time altering.

If you’re wondering why you can’t credibly render a landscape of pink grass and sparkly mushrooms, it’s because Midjourney hasn’t been trained on pink grass and sparkly mushrooms, because those things don’t exist in real life. People have certainly imagined fantastical landscapes, usually rendered with software like Blender or Maya or what have you. And Midjourney has, to some degree, been trained on these. It’s been trained on plenty of scientific and science fiction imagery. But there are far more, say, selfies of white girls on Al Gore’s internet than there are science fiction renderings of an experimental fusion reactor. Hence why it is much easier to get credible results from prompts for the former than for the latter.

Starting With The Basics: Iteration Is Key

The easiest permutations in Midjourney are ones that involve changing a small part of the prompt. If you scroll through the community chat rooms to which the peasantry are relegated when they first subscribe to the trial of the platform, you’ll see a lot of silly stuff like teenagers imagining hi-tech cyberpunk cars or cyberpunk girlfriends, or some prosaic Catholic trying to get the perfect image of a battered refugee child praying for intercession to the patron saint of American immigration reform, or what have you.

What you often see, in repeated attempts from the same users, is someone who is clearly trying very hard, iteration after iteration, to get a very specific thing that the AI isn’t delivering. People will often add additional words, or even redundancies in the phrasing: “Midjourney will understand my prompt if I’m exhaustively specific!” I’ve written about this in my article about how hard it is to get a technically specific thing.

It’s not necessarily that “less is more.” Rather, there are different routes to get to the same destination.

Variation 0: Vanilla Tokamak

I find it interesting to experiment with different colors of backgrounds. If you render something with a “cyberpunk” tag in the prompt, it’s inevitably going to have a dark background, because of the aesthetic milieu associated with that term. Sometimes Midjourney will just go wild and do a bunch of different colors, too. I wanted to keep these uniform with white backgrounds, but I did try a few without specifying the background, and I’d say that around 11/12 were on a black background. Note that these are fairly similar to the “vanilla” on white background. These are just made with the term “isometric rendering of tokamak reactor.”

Variation 1: Pink and White Tokamak.

Hey, who says the space age isn’t colorful?! I am wondering if these are easier to produce for Midjourney because I’m just sticking to simple color variations.

Variation 2: Cyan & White Tokamak.

Note that while some of these end up being similar, they’re completely distinct. This is the magic of an AI model that fundamentally relies on combining different images, based on the probabilistic math of seed randomization, to the point that it’s extremely improbable (if not practically impossible) to get the exact same image twice.
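A toy illustration of what seed randomization means, using Python’s standard `random` module rather than anything from Midjourney itself: the same seed reproduces the same sequence of pseudo-random numbers, while a different seed gives a different one. The `fake_noise` helper is hypothetical and only stands in for the random starting point behind an image.

```python
import random


def fake_noise(seed: int, n: int = 5) -> list[float]:
    """Return n pseudo-random numbers from a generator seeded with `seed`."""
    # A dedicated generator, so the global random state is left untouched.
    rng = random.Random(seed)
    return [rng.random() for _ in range(n)]


same_a = fake_noise(seed=1234)
same_b = fake_noise(seed=1234)
other = fake_noise(seed=5678)

print(same_a == same_b)  # compares two runs that share a seed
print(same_a == other)   # compares two runs with different seeds
```

Midjourney’s documentation describes a `--seed` parameter along these lines, though how closely a repeated seed reproduces an image may depend on the model version.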

Variation 3: Cyan and Pink Tokamak.

In the below series, we see renderings that are basically in the same ballpark, but simply with the combination of the two colors I asked for.

Variation 4: Cyberpunk Tokamak.

There are a few color schemes that often appear together in similar formats, and one of them is the characteristic neon glow of cyberpunk, usually set against some sort of ominous, gothic darkness. This is a style whose name was coined decades ago by author Bruce Bethke, who described “…kids who trashed my computer; their kids were going to be Holy Terrors, combining the ethical vacuity of teenagers with a technical fluency we adults could only guess at. Further, the parents and other adult authority figures of the early 21st Century were going to be terribly ill-equipped to deal with the first generation of teenagers who grew up truly ‘speaking computer.’”

In other words, he was predicting Gen Z. The term now usually refers to some varietal of dystopian techno-future. It was mainstreamed by the visually stunning, though monumentally buggy, game Cyberpunk 2077. The cyberpunk aesthetic is also similar to the vaporwave aesthetic, and both notably seem to lean heavily on the extremely loud combination of cyan and pink, which is why I included both of these colors to start.

Variation 5: Green and Gold Tokamak.

As I mentioned in the article about trying to get Midjourney to render light switches, it’s interesting to see what kinds of things it can or cannot render as easily in terms of color variations. The green and gold combo was pretty striking!

Variation 6: Brown and Gold Tokamak.

As was brown and gold. I especially appreciated the depth of the color and shading on both of these. Gold is a color that can often be interpreted as “yellow,” but these all seem to include some lustrous quality from whatever original images Midjourney was trained on.

Variation 7: Steampunk Tokamak.

These might well be my favorite because they both include the Victorian architectural aesthetics common to steampunk visual tropes and keep the torus-shaped machinery. This is most apparent on the far right one, which features a neoclassical Victorian “entrance” or portal on the left side. Honorable mention to a style that didn’t make the article: Rusty tokamak, which was sorta similar to these.

Variation 8: Solarpunk Tokamak.

At the risk of generating reader fatigue over the anythingpunk, I did want to see what sorts of images have been hardcoded into the model to the point that it can generate something easily recognizable as, well, that thing. Solarpunk is another varietal of worldbuilding that relies on an ecological focus. I included it right after steampunk because it may well combine some elements of steampunk’s preoccupation with elaborate mechanics, in this case, in order to build a more ecologically stable world, or what have you. I’ve had good luck with getting landscape illustrations for solarpunk, too. Note how these are closer to the Steampunk ones in that a couple of them include more green (or actual trees and grass).

Variation 9: Food Tokamak.

Now, it’s time to turn up the weird to eleven. I knew that the “toroidal’naya” refers to the donut shape. So, I wanted to know what would happen if I tried to make an actual, literal donut tokamak? Or a cookie tokamak? Or a pizza? The results are, well. Interesting. They preserve the overall composition of the model, but they do honestly look like actual pastries. Note how many shapes on the first (far left) image appear to be some sort of unnamed pastry or baking product. The one in the middle might well be a stack of pancakes, in which cold fusion occurs. Hey, who knows what’s going to happen in the future?!

Variation 10: Lisa Frank Tokamak.

Having failed with a number of artists and having had mixed luck with architectural styles, I decided to take a break and try what would be a shoo-in: the Lisa Frank Tokamak. Again, I just used that very simple prompt. And Midjourney delivered! This was relatively simple as a matter of color manipulation (from a standpoint of what the AI has to do to create a credible image), but it also included a lot of the playful elements of contrasts and color gradients that are characteristic of Lisa Frank stuff.

Variations 11, 12, and 13: Victorian, Haussmannian, and Art Nouveau Tokamak.

Clearly, fusion reactors didn’t exist during the 17th or 18th centuries, nor in the portions of the 19th or 20th centuries when people became excited about the older architectural styles. But a lot of experimentation with AI is a question of imagination, of asking: what if they did? And, to help Midjourney along (because the AI is a complex series of algorithms and lacks its own imagination): what are the elements, whether named or implied elements of an image or elements of the Midjourney prompt itself, that might get it there? The series includes Victorian (first), Haussmannian (second), and Art Nouveau (third).

Important here is that we can better understand what interpretations are gyrating within the mechanical imagination of the artificial intelligence. The rightmost two images in the Victorian series feature spherical or toroidal-esque shapes. We might note that European architecture from the Renaissance through the early 20th century was pretty obsessed with domes and cupolas. Domes and cupolas are, topologically speaking, kinda sorta in the same realm as a torus, in that they both have a lot of curvilinear surfaces. It has been helpful for me to think about things in terms of objects (or elements of a bigger object) that are similar enough to another one that the AI might actually fill in some of the blanks.

This is also evidenced in the Haussmannian, Parisian Second Empire series, and in the Art Nouveau one, where our image on the far left looks a bit like a greenhouse (or perhaps municipal water infrastructure), while the thing on the right looks more like a water tower (water towers are extremely hard to render in Midjourney, which I’ll write about in the future!).

Variation 14: Wedding Cake Tokamak.

This was more of an experiment in fantasy than anything else, but I thought these were kinda fun! Again, we’re using as our base the idea of two objects that are some combination of cylindrical, toroidal, or spherical. I gave it a prompt referring to an extravagant wedding cake. It’s really not a bad interpretation!

Honorable Mentions for Midjourney Tokamak Styles

This was partially an experiment on my end, and partially a postprandial family activity before my brother’s wedding festivities, in which we threw around different ideas and laughed uproariously at bizarre things we could come up with. It was interesting to see what happened when we tried to put in parameters that the platform clearly wasn’t able to connect to the tokamak geometry, so it instead opted for something like style or color scheme.

A few did not quite make sense to me. Attempting to incorporate styles of, variously, Jennifer Bartlett, Jean-Michel Basquiat, George Nakashima, Helen Frankenthaler, Santiago Calatrava, Miami Beach Art Deco, Zaha Hadid, Tadao Ando, Jeanne Gang, Minoru Yamasaki, Robert Indiana, and, of course, Dahlov Ipcar and William Zorach (shout-out to my fam), I didn’t usually come up with believable results.

Robert Indiana created a plausible enough tokamak (below), but just with giant letters (lulz) and pop art colors. Miami Beach Art Deco created some hotel-esque buildings that just didn’t look like either hotels or a tokamak. The Roy Lichtenstein tokamak created some profoundly bright colors, reminiscent of pop art. I didn’t include most of these because the results were either indistinguishable from the normal one, or weren’t worth posting all in series. The Helen Frankenthaler one was kinda neat, because in all of my attempts, the quads were completely insane and beautiful, but in a completely abstract way (it seems to have latched onto more of the late artist’s color style rather than the hallowed geometry of the donut, but a few of them did have this neat round thing in it, whatever it actually was).

Conclusions: Experiment and Iterate!

Don’t be afraid to experiment with Midjourney. That’s kinda the point of it. You’re just going to have to be patient. The total product of this exercise created two whole gigabytes of data that I downloaded– to say nothing of the gigabyte or so that I left behind on the Discord server or, certainly, the gigabytes of throughput required to produce these images.

As with iterative process improvement in the business and agile development realm, every time you try a new prompt, you don’t necessarily have to blow it up and start from scratch. You have to make minor course corrections to fix things you don’t like. You have to stick with what works, and then you also have to figure out how to incorporate these minor course corrections to add new things. While a lot of data science is “garbage in, garbage out,” the stochastic and probabilistic machinations of artificial intelligence mean that a lot of this will come down to total randomness. One must simply try and try again until producing something that is going to stick.

Nat M. Zorach

Nat M. Zorach, AICP, MBA, is a city planner and energy professional based in Detroit, where he writes about infrastructure, sustainability, tech, and more. A native of Lancaster, Pennsylvania, he attended Grinnell College in Iowa, the Kogod School of Business at American University, the POCACITO transatlantic program, the SISE program at the University of Illinois Chicago, and he is also a StartingBloc Social Innovation Fellow. He enjoys long walks through historic, disinvested Rust Belt neighborhoods at sunset. (Nat's views and opinions are his own and do not represent those of his employer).
