How to make an RPG textbox

JRPGs communicate with the player via textboxes. From the contents of a chest at the end of a dungeon to the ominous words of the game's antagonist, it's the humble textbox that delivers each message. During your game players will interact with many textboxes, so it's worth getting them right!

There are broadly two categories of textbox:

  1. Textboxes that communicate speech or thought
  2. Textboxes that communicate something about the state of the world

The first type of textbox comes from characters in the game, including the player's avatar. The second type comes from the game's unseen narrator. You may not even realise the unseen narrator exists! He tells you when the game's saved, when party health is recovered at an inn, or the amount of gold hidden in a vase in some poor stranger's house.

Textboxes from Final Fantasy Tactics.

The first type of textbox delivers messages from characters in the game world. To help reinforce that the words are spoken, the text is often typed out letter by letter. That's how words work in the real world - the sounds are spoken one after another, not all at once.

Textboxes telling you some information about the world.

The second type of textbox is directed not so much at the player's avatar as at the player themselves. These words are not usually intended to come from any entity living in the game world; instead it's more like the game itself is talking. They are not being spoken in any meaningful sense and therefore appear all at once rather than being typed in.

Next let's break down the component parts of a textbox.

Anatomy of a Textbox

A textbox has three basic parts.

  • Frame
  • Text Area
  • Continue button

The three major parts of a textbox.

If we take all these parts, combine them together and add some text the textbox looks like this:

The textbox combined.

Pretty nice.

The frame is the body of the textbox and contains the text, the continue button and any number of additional components such as an avatar picture or speaker label. The frame also contains a backing panel which is rendered first. The backing panel keeps the text readable no matter what is behind it. The frame frames the text.

The text area is the most important part of the box. The text represents what is being said, thought or read. It may have a transition where it's typed into the box. If the text is long it's broken into pages. One page is shown at a time. When the player presses the continue button they move from one page to the next until they come to the end of the sequence.

The continue button tells the player that there's more text to see. The continue button sometimes uses two different images. The first image indicates that the text continues: if you press the continue button, you can read what the character says next. The second image indicates that the character speaking has no more to say and that if you press continue now, he will stop talking and something new will happen. This might mean control is returned to the player, or that a new character might speak. A very common image used for the continue button is a bouncing arrow pointing down.
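
As a rough sketch of the two-image idea, the choice of icon depends only on whether the current page is the last one. The function name `continue_icon` and the icon names "more" and "end" are made up for this illustration; they don't come from any particular game or engine.

```python
def continue_icon(page_index: int, page_count: int) -> str:
    """Pick which continue-button image to show for a zero-based page index."""
    if page_count <= 0:
        raise ValueError("a textbox needs at least one page")
    is_last_page = page_index >= page_count - 1
    # "more": the text continues on another page.
    # "end": the speaker has nothing left to say.
    return "end" if is_last_page else "more"
```

A single-page textbox would show the "end" image straight away.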

Let's take a look at some textboxes.

Example Textboxes

Games have used textboxes for decades now - that's time immemorial in the games industry! There are many variations on the theme, so let's look at a random sample:

Zelda: Breath of the Wild

The textbox from Zelda Breath of the Wild.

Above is a textbox from Zelda: Breath of the Wild. The backing panel here is a nice rounded rectangle; it's semi-transparent and has decoration on the sides. It could be a fixed-size image, a nine-box style resizable panel or possibly a distance field rendered with a shader. The text area is in the center of the frame with generous horizontal padding compared to the outer frame. The continue button is a small arrow, aligned to the bottom center of the box. Additionally there's a speaker title at the top of the box and a portion of the text is colored.

Another question that's interesting to ask about this textbox is "Who is speaking?". The title is "Old Man's Diary" so:

  • Is this the text presented by the unseen narrator?
  • Is it Link reading it out loud?
  • Or are we seeing it from the point of view of the Old Man writing it?

If it was the unseen narrator presenting the text I wouldn't use a type-in animation; for the other two I would.

Shadowrun

The textbox from Shadowrun.

This textbox is from Shadowrun, a more traditional PC-style RPG, but it conforms to many of the conventions we describe in this article. One immediate difference to note is that it's much bigger. Shadowrun is a much wordier game than most traditional JRPGs and it needs more space for text.

The frame is the large box on the right of the screen. Like Zelda it has a semi-transparent background. The frame is decorative and helps reinforce the cyberpunk style of the game. The text area probably covers most of the area in the frame. There's a continue button a little after the text.

The avatar appears as a separate element outside of the frame. Inside the frame, above the text area is a title to indicate who is speaking.

Vagrant Story

The textbox from Vagrant Story.

Vagrant Story uses textboxes inspired by comic books. For spoken text Vagrant Story uses a tail to indicate the speaker. If the text is being thought then no tail is used and instead the textbox hangs over the thinker's head. Like most JRPG-inspired games the text in the textbox has a type-in animation.

Note that Vagrant Story does not use a continue button, but of course it does support the continue action.

Vagrant Story has a number of different textbox frames that it swaps out depending on the size of the text.

Flow

How complicated you want to make your textbox is up to you. Below is a diagram showing a number of states for a fully featured implementation.

The flow of the textbox.

We'll get into the details of some of these states below.

Open & Close Transitions

Millions of years of evolution have gone into writing the subconscious behaviors that keep us alive. Our tiger detection heuristics are excellent; if something suddenly appears in front of us, we jump and get ready to run. Horror movies exploit this with jump scares: if something suddenly appears right in front of you, that means DANGER! How does this relate to RPGs? Well, for one, if a textbox appears on the screen with no warning, it's jarring! If you want your game messages to be jarring, that's great, but for most textboxes a smoother open and close transition is preferable.

The open and close transitions describe how the box appears on and disappears from the screen. Most transitions are some combination of scaling and fading.

The open and close transitions for the FFT textbox.

The textbox from Final Fantasy Tactics is shown above. You'll note its open transition uses a scale, but it actually has a bit of a bounce. Instead of scaling from 0 to 1 it scales from 0 to 1.1 (at a guess) and then back to 1. You can mimic a lot of these effects with a good Tween class.
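
Here's a minimal sketch of that bounce as a plain function rather than a full Tween class. It assumes a piecewise-linear curve, and the 1.1 overshoot and the 0.7 split point are guesses for illustration, not values taken from Final Fantasy Tactics.

```python
def open_scale(t: float, overshoot: float = 1.1, split: float = 0.7) -> float:
    """Scale of the box for normalised transition time t in [0, 1].

    Grows from 0 up to `overshoot`, then settles back down to 1.
    `split` (strictly between 0 and 1) is where the growth ends.
    """
    t = max(0.0, min(1.0, t))  # clamp so callers can pass raw timers
    if t < split:
        # First leg: 0 -> overshoot
        return overshoot * (t / split)
    # Second leg: overshoot -> 1
    return overshoot + (1.0 - overshoot) * ((t - split) / (1.0 - split))
```

Each frame you'd pass in elapsed time divided by the transition duration and scale the frame by the result. Running the same curve backwards is one simple way to get a close transition.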

Typed Text

Text rendering and layout is a vast topic, but here's the basic idea. When writing your text engine you write a function that, given a string, tells you how wide that string is in pixels (or world space units) in your game world.
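
A measuring function can be as simple as summing per-character widths. This sketch assumes a bitmap font described by a hypothetical `glyph_widths` table and ignores kerning entirely; a real font library would give you more accurate numbers.

```python
from typing import Mapping


def text_width(text: str, glyph_widths: Mapping[str, int], default_width: int = 8) -> int:
    """Width of `text` in pixels, summing each character's advance.

    Characters missing from the table fall back to `default_width`.
    """
    return sum(glyph_widths.get(ch, default_width) for ch in text)
```

For example, with a table of `{"i": 3, "l": 3, "m": 11}`, the word "mill" measures 11 + 3 + 3 + 3 pixels.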

You then have an algorithm that takes the text you want to display, the width of the text area and breaks it into lines of text to display. Then you use these lines and the height of the text area to break the lines up into pages. Each page is a list of lines that fits neatly into one text area.
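
The two steps might be sketched like this. It assumes a simple greedy word wrap and a fixed line height, and `measure` stands in for whatever string-measuring function your engine provides. A word wider than the text area just gets a line to itself here; splitting it is left out.

```python
from typing import Callable, List


def wrap_lines(text: str, max_width: int, measure: Callable[[str], int]) -> List[str]:
    """Greedy word wrap: keep adding words to a line while it still fits."""
    lines: List[str] = []
    current = ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if current and measure(candidate) > max_width:
            lines.append(current)  # line is full, start a new one
            current = word
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


def paginate(lines: List[str], area_height: int, line_height: int) -> List[List[str]]:
    """Group lines into pages that each fit the height of the text area."""
    per_page = max(1, area_height // line_height)
    return [lines[i:i + per_page] for i in range(0, len(lines), per_page)]
```

With a measure of 8 pixels per character, an 80 pixel wide area breaks "the quick brown fox jumps" into three lines, and a text area two lines tall turns those into two pages.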

For a simple type-in effect, decide how long it should take for a single character to appear, say 0.05 seconds. This means that after 1 second has passed, 20 characters will have been typed into the textbox. Using a counter that accumulates the time elapsed each frame, we can run the following test to determine if a character gets rendered:

-- index is 1-based; TypeInCounter is the time in seconds since typing began.
if CharAppearTime * index <= TypeInCounter then
    Render(text[index])
end
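
The same test can be turned around to ask how many characters are visible at a given moment, which is handy when you want to draw a slice of the page in one go. This is a small sketch; the function name and the tiny epsilon used to guard against floating point rounding are my own choices.

```python
def visible_chars(elapsed: float, char_time: float, total: int) -> int:
    """Number of characters that should be on screen after `elapsed` seconds."""
    if char_time <= 0:
        return total  # no delay configured: show everything at once
    # The epsilon stops values like 19.999999 from losing a character.
    count = int(elapsed / char_time + 1e-9)
    return max(0, min(total, count))
```

At 0.05 seconds per character this gives 20 characters after one second, matching the arithmetic above.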

The How to Make an RPG book goes into more detail about laying out text, so I won't get into the fine-grained details here.

Paged Text

Once the textbox finishes its open transition, it's filled up with text. This is the first page of text. If there's only one page of text, the textbox will close when the player hits the continue button.

RPGs are wordier than other games. In your game there will come a time when a single textbox is not enough to hold all the dialog you want a character to say. If there are multiple pages of text, when the player hits continue the box remains open but the current page of text is cleared and a new one is typed in.

For multiple pages of text we only perform one open transition, otherwise it looks odd. This can be seen by comparing the two images below:

Paged textbox versus one that opens every page.

The textbox on the left doesn't support paging; note how it opens and closes for each line of text, which feels a bit jittery. The one on the right remains open while each page of text is displayed.
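
One way to get this behavior is a small state machine where the open and close transitions sit outside the page loop. The sketch below is only an illustration: the state names, timings and the choice to ignore the continue button while text is still typing are all assumptions of mine, not a description of any particular game.

```python
from enum import Enum, auto
from typing import List


class State(Enum):
    OPENING = auto()
    TYPING = auto()
    WAITING = auto()   # page fully typed, waiting for the continue button
    CLOSING = auto()
    CLOSED = auto()


class PagedTextbox:
    def __init__(self, pages: List[str], char_time: float = 0.05,
                 transition_time: float = 0.3) -> None:
        if not pages:
            raise ValueError("need at least one page")
        self.pages = pages
        self.char_time = char_time
        self.transition_time = transition_time
        self.page_index = 0
        self.state = State.OPENING
        self.timer = 0.0

    def _enter(self, state: State) -> None:
        self.state = state
        self.timer = 0.0

    def update(self, dt: float) -> None:
        """Advance timers; call once per frame with the frame time in seconds."""
        self.timer += dt
        if self.state is State.OPENING and self.timer >= self.transition_time:
            self._enter(State.TYPING)
        elif self.state is State.TYPING:
            page_time = self.char_time * len(self.pages[self.page_index])
            if self.timer >= page_time:
                self._enter(State.WAITING)
        elif self.state is State.CLOSING and self.timer >= self.transition_time:
            self._enter(State.CLOSED)

    def on_continue(self) -> None:
        """Handle the continue button; ignored unless a page is fully shown."""
        if self.state is not State.WAITING:
            return
        if self.page_index + 1 < len(self.pages):
            self.page_index += 1
            self._enter(State.TYPING)   # box stays open, only the text changes
        else:
            self._enter(State.CLOSING)  # last page: now the box may close
```

Note that OPENING is only ever entered once, in the constructor, however many pages there are.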

Things To Avoid

Typed-in text looks bad if you do the layout and type-in effect at the same time. You tend to see words being built up character by character right up to the edge of the text area. Then typing the next character causes the word to overlap the bounds, and the entire word suddenly jumps down to the next line.

The code might look something a little like this:

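-- Anti-pattern: the layout is recalculated as each character is typed.
currentWord = currentWord .. nextChar
local pixelWidth = lenPixel(currentLine) + lenPixel(currentWord)

if pixelWidth < lineWidth then
    renderLine(currentLine, currentWord)
else
    -- The partly typed word only now jumps to the next line.
    startNewLine(currentWord)
end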

The simplest way to avoid this is to:

  1. Do the layout first and store all the positions.
  2. Draw one character at a time at the stored positions.

This will look much nicer!
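
A minimal sketch of the layout-first approach, assuming a monospace font so every character is `char_width` pixels wide. The function names and default sizes are placeholders for illustration.

```python
from typing import List, Tuple

Glyph = Tuple[str, int, int]  # (character, x, y)


def layout_glyphs(text: str, line_width: int, char_width: int = 8,
                  line_height: int = 16) -> List[Glyph]:
    """Step 1: decide where every character goes before drawing anything."""
    glyphs: List[Glyph] = []
    x = y = 0
    for word in text.split():
        word_width = len(word) * char_width
        if x > 0 and x + char_width + word_width > line_width:
            x, y = 0, y + line_height  # the whole word moves to the next line
        elif x > 0:
            x += char_width            # gap for the space between words
        for ch in word:
            glyphs.append((ch, x, y))
            x += char_width
    return glyphs


def glyphs_to_draw(glyphs: List[Glyph], count: int) -> List[Glyph]:
    """Step 2: reveal the first `count` characters at their stored positions."""
    return glyphs[:max(0, count)]
```

Because a word that doesn't fit is assigned to the next line before its first letter is ever drawn, nothing jumps around during the type-in.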

Special Effects

JRPGs started as a silent medium. It's only relatively recently that games have become fully voiced. Voice carries a lot more information than just the line being read: the timbre, volume and pace add depth to everything said. Textboxes cannot hope to match this but they can aspire to get a little closer.

Another silent medium is the comic. Here text is written in big bold letters for sound effects like WHACK. Some speech balloons are spiky to indicate interjection, and thought bubbles look like clouds. Using these, the author and artist can add more depth to the work.

Typed text can be extended in similar ways:

  • Colored text
  • Words that are typed fast and slow (such as STOP! or ...)
  • Words that after being typed jiggle or glow
  • Icons and emoticons

Icons and emoticons are also commonly seen over the speaker's head rather than embedded directly in the text, but both approaches are used.
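
As one example, variable typing speed can be sketched by giving each character its own appear time instead of a single fixed delay. The rule used here, lingering after punctuation so that "..." types slowly, is just one possible scheme and the parameter names are invented for this sketch.

```python
from typing import List


def reveal_times(text: str, base_delay: float = 0.05,
                 pause_chars: str = ".,!?", pause_scale: float = 6.0) -> List[float]:
    """Time in seconds at which each character of `text` appears."""
    times: List[float] = []
    clock = 0.0
    for ch in text:
        clock += base_delay
        times.append(clock)
        if ch in pause_chars:
            # Linger after punctuation so the next character arrives late.
            clock += base_delay * (pause_scale - 1.0)
    return times
```

A character is then drawn once the type-in counter reaches its entry in the list. Color, jiggle or glow could be handled in a similar way with a per-character list of attributes.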

Attribution

A textbox lets the player read what a character says but that's not enough! The player also needs a way to know who is speaking. Games handle attribution in different ways but here are some of the more common strategies.

Titles

The character speaking is labelled at the top of the textbox. This may be done as part of the textbox frame, or it may only be shown for the first page of spoken text.

Screen Position

The simplest way of attributing a textbox is to position the box over the speaker's head. This is pretty effective but, depending on the game, can be a little crowded if characters are grouped together.
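
Positioning might be sketched as below, assuming screen coordinates with the origin at the top left and y growing downward. The function and parameter names are hypothetical, and the sketch only keeps the box on screen; it does nothing about boxes overlapping when characters stand close together.

```python
from typing import Tuple


def box_position(speaker_x: float, speaker_top_y: float,
                 box_w: float, box_h: float,
                 screen_w: float, screen_h: float,
                 gap: float = 4) -> Tuple[float, float]:
    """Top-left corner of a box centred above the speaker, kept on screen."""
    x = speaker_x - box_w / 2          # centre horizontally on the speaker
    y = speaker_top_y - gap - box_h    # sit just above the speaker's head
    # Clamp so the box never leaves the screen.
    x = max(0, min(x, screen_w - box_w))
    y = max(0, min(y, screen_h - box_h))
    return x, y
```

A speaker standing near the left edge or the top of the screen gets a box that is nudged back inside the visible area rather than cut off.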

Final Fantasy 7 proximity textbox.

The above is a textbox from Final Fantasy 7. It's also a bit of a cheat for this example as both characters have some words in this textbox. It's presenting the player with a choice of reply. Also most Final Fantasy 7 textboxes use a title on the first page to let you know who's currently talking.

Avatars

Avatars are often used in tandem with the textbox. Commonly a label is used to help remind us who is speaking and often a nice avatar picture is shown too. This picture may have a number of variations to express the speaker's feelings about what's being said.

When an avatar is used it's often very clear who is speaking and therefore the textbox need not be drawn near the speaker. A common pattern with an avatar, especially a large one, is to draw the textbox along the bottom of the screen, taking up the full width. This style is very popular for visual novels.

A classic layout is the avatar face pic inside the frame of the textbox.

Avatars used in Xenogears textboxes.

Here you can see a textbox from Xenogears. Note that there's both a title and an avatar in this textbox.

Another very popular layout is a larger avatar positioned outside the box. Below is a screenshot from Persona 5. The avatar is drawn to the left, but it's quite common to see avatars drawn above the textbox as well.

Avatars used in Persona 5 textboxes.

A Tail

The word tail comes from comics and describes the pointed bit of a word balloon. The tail points towards the speaker and gives the impression the words are coming from that character's mouth.

Final Fantasy 9 has a textbox with a tail.

Animation

If the game resolution is high enough and the characters are large enough then animating the lips is another way to indicate who is speaking. This requires the mouth to take up a reasonable amount of the screen. Animation is more commonly used with fully voiced lines.

Closing

The things you need to implement a textbox:

  • A way to render text
  • A way to measure text in pixels
  • A function to cut the text into lines
  • A method to separate lines into pages
  • A way to control the state of the textbox and pages

This has been a high-level overview of the parts and functions of a JRPG textbox. I've not gone into the messy implementation but hopefully it gives you a good starting point for your own. I'll share more details about how I've tackled the implementation in an upcoming article.