6/14/2011

06-14-11 - How to do input for video games

1. Read all input in one spot. Do not scatter input reading all over the game. Read it into global state which then applies for the time slice of the current frame. The rest of the game code can then ask "is this key down" or "was this pressed" and it just checks the cached state, not the hardware.
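
A minimal sketch of what that cached state could look like. Everything here is made up for illustration: kNumKeys, g_fakeHardware (a stand-in for whatever platform API you actually poll), and the function names are assumptions, not any particular library's API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical key count; a real game would size this to its platform.
constexpr std::size_t kNumKeys = 256;

// Stand-in for the device. A real PollInput would query the OS or driver.
std::array<bool, kNumKeys> g_fakeHardware{};

struct InputState {
    std::array<bool, kNumKeys> down{};     // key is held during this frame
    std::array<bool, kNumKeys> pressed{};  // key went down on this frame
};

// Global cached state, valid for the time slice of the current frame.
InputState g_input;

// The one spot that touches the "hardware". Call once at the top of the frame.
void PollInput() {
    for (std::size_t k = 0; k < kNumKeys; ++k) {
        const bool now = g_fakeHardware[k];
        g_input.pressed[k] = now && !g_input.down[k];  // edge vs. last frame
        g_input.down[k] = now;
    }
}

// The rest of the game only ever reads the cache.
bool IsKeyDown(std::size_t key) {
    assert(key < kNumKeys);
    return g_input.down[key];
}

bool WasKeyPressed(std::size_t key) {
    assert(key < kNumKeys);
    return g_input.pressed[key];
}
```

Because the answers come from the cache, two systems that ask about the same key in the same frame always get the same answer, no matter when in the frame they ask.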

2. Respond to input immediately. Generally what that means is you should have a linear sequence of events that is something like this :

Poll input
Do actions triggered by input (eg. fire bullets)
Do time evolution of player-action objects (eg. move bullets)
Do environment responses (eg. did bullets hit monsters?)
Render frame
(* see later)
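
The sequence above can be sketched roughly as follows. The types (World, Bullet, FrameInput) and the numbers are invented for the example; the only point is the order of the calls inside RunFrame, which lets a bullet fired this frame also move and hit something this frame.

```cpp
#include <cassert>
#include <vector>

struct Bullet { float x; float vx; };

struct World {
    std::vector<Bullet> bullets;
    float monsterX = 5.0f;    // hypothetical monster position on a 1D line
    bool monsterHit = false;
};

// Hypothetical snapshot produced by the input poll for this frame.
struct FrameInput { bool firePressed = false; };

// Actions triggered by input (eg. fire bullets).
void DoInputActions(World& w, const FrameInput& in) {
    if (in.firePressed) w.bullets.push_back({0.0f, 600.0f});
}

// Time evolution of player-action objects (eg. move bullets).
void EvolvePlayerObjects(World& w, float dt) {
    assert(dt >= 0.0f);
    for (Bullet& b : w.bullets) b.x += b.vx * dt;
}

// Environment responses (eg. did bullets hit monsters?).
void DoEnvironmentResponses(World& w) {
    for (const Bullet& b : w.bullets) {
        if (b.x >= w.monsterX) w.monsterHit = true;
    }
}

void RunFrame(World& w, const FrameInput& in, float dt) {
    // Input has already been polled into 'in' at the top of the frame.
    DoInputActions(w, in);       // create bullets first ...
    EvolvePlayerObjects(w, dt);  // ... then move them, including new ones
    DoEnvironmentResponses(w);   // ... then test hits at the new positions
    // Render frame would go here.
}
```

If EvolvePlayerObjects ran before DoInputActions, the new bullet would sit at its spawn point for a whole frame before doing anything.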

3. On a PC you have to deal with the issue of losing focus, or pausing and resuming. This is pretty easy to get correct if you obeyed #1: since you read all your input in one spot, that spot just zeros the input state while you are out of focus. The best way to resume is, when you regain focus, to immediately query all your input channels to wipe any "new key down" flags, and just discard all the results. I find a lot of badly written apps that either lose the first real key press, or incorrectly respond to keys that were meant for the previous app and pressed while they didn't have focus.

( For example I have keys like ctrl-alt-q that toggle focus around for me, and badly written apps will respond to that "q" as if it were for them, because they just ask for the global "new key down" state and they see a Q that wasn't there the last time they checked. )
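
Here is one way the focus handling could look, as a sketch only. InputState, g_fakeHardware and the hasFocus flag are hypothetical; how you actually learn about focus changes depends on your platform. One choice this sketch makes that the text above doesn't settle: a key still held when focus comes back is reported as "down" but never as "newly pressed".

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kNumKeys = 256;

// Stand-in for the device; a real poll would query the OS.
std::array<bool, kNumKeys> g_fakeHardware{};

struct InputState {
    std::array<bool, kNumKeys> down{};
    std::array<bool, kNumKeys> pressed{};
    bool hadFocus = true;  // focus state as of the previous poll
};

// Single poll point. 'hasFocus' is assumed to come from the windowing layer.
void PollInput(InputState& s, bool hasFocus) {
    if (!hasFocus) {
        // Out of focus: the game sees no input at all.
        s.down.fill(false);
        s.pressed.fill(false);
        s.hadFocus = false;
        return;
    }
    // On the first poll after regaining focus, read everything to establish
    // a baseline but throw away the "new key down" edges.
    const bool justRegained = !s.hadFocus;
    for (std::size_t k = 0; k < kNumKeys; ++k) {
        const bool now = g_fakeHardware[k];
        s.pressed[k] = !justRegained && now && !s.down[k];
        s.down[k] = now;
    }
    s.hadFocus = true;
}

bool WasKeyPressed(const InputState& s, std::size_t key) {
    assert(key < kNumKeys);
    return s.pressed[key];
}
```

With this, the "q" from a ctrl-alt-q focus switch never shows up as a press, and the first real press after the switch is still seen.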

4. Use a remapping/abstraction layer. Don't put actual physical button/keys all around your app. Even if you are sure that you don't want to provide remapping, do it anyway, because it's useful for you as a developer. That is, in your player shooting code don't write

  if ( NewButtonDown(X) ) ...
instead write
  if ( NewButtonDown(BUTTON_SHOOT) ) ...
and have a layer that remaps BUTTON_SHOOT to a physical key. The remap can also do things like taps vs holds, combos, sequences, etc. so all that is hidden from the higher level and you are free to easily change it at a later date.

This is obvious for real games, but it's true even for test apps, because you can use the remapping layer to log your key operations and provide help and such.
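
A bare-bones version of such a layer might look like this. BUTTON_JUMP, InputMap, BindButton and ButtonName are invented for the sketch; only NewButtonDown and BUTTON_SHOOT come from the snippet above, and the real thing would be fed by your input poll rather than by hand.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Logical buttons the game code talks about.
enum Button { BUTTON_SHOOT, BUTTON_JUMP, BUTTON_COUNT };

constexpr std::size_t kNumKeys = 256;

struct InputMap {
    std::array<std::size_t, BUTTON_COUNT> keyFor{};  // logical -> physical key
    std::array<bool, kNumKeys> pressed{};            // filled by the input poll
};

InputMap g_map;

// Change which physical key drives a logical button.
void BindButton(Button b, std::size_t key) {
    assert(b < BUTTON_COUNT && key < kNumKeys);
    g_map.keyFor[b] = key;
}

// Game code asks about the logical button, never the physical key.
bool NewButtonDown(Button b) {
    assert(b < BUTTON_COUNT);
    return g_map.pressed[g_map.keyFor[b]];
}

// Having names in one table is what makes logging and help screens cheap.
const char* ButtonName(Button b) {
    static const char* const names[BUTTON_COUNT] = { "shoot", "jump" };
    assert(b < BUTTON_COUNT);
    return names[b];
}
```

Taps vs holds, combos and so on would live behind NewButtonDown in the same way, so the shooting code never has to change when the bindings do.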

(*) extra note on frame order processing.

I believe there are two okay frame sequences and I'm not sure there's a strong argument one way or the other :


Method 1 :

Time evolve all non-player game objects
Prepare draw buffers for non-player game objects
Get input
Player responds to input
Player-actions interact with world
Prepare draw buffers for player & stuff just made
Kick render buffers

Method 2 :

Get input
Player responds to input
Player-actions interact with world
Time evolve all non-player game objects
Prepare draw buffers for player & stuff just made
Prepare draw buffers for non-player game objects
Kick render buffers

The advantage of Method 1 is that the time between "get input" and "kick render" is absolutely minimized (it's reduced by the amount of time that it takes you to process the non-player world), so if you press a button that makes an explosion, you see it as soon as possible. The disadvantage is that the monsters you are shooting have moved before you do input. But there's actually a bunch of latency between "kick render" and the image getting to your eye anyway, so the monsters are *always* ahead of where you think they are, so I think Method 1 is preferable. Another disadvantage of Method 1 is that the monsters essentially "get the jump on you", eg. if they are swinging a club at you, they get to do that before your "block" button reaction is processed. This could be fixed by doing something like :

Method 3 :

Time evolve all non-player game objects (except interactions with player)
Prepare draw buffers for non-player game objects
Get input
Player responds to input
Player-actions interact with world
Non-player objects interact with player
Prepare draw buffers for player & stuff just made
Kick render buffers

This is very intentionally not "fair" between the player and the rest of the world; we want the player to basically win the initiative roll all the time.

Some game devs have this silly idea that all the physics needs to be time-evolved in one atomic step, which is absurd. You can of course time evolve all the non-player stuff first to get that done with, and then evolve the player next.
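
To make the Method 3 ordering concrete, here is a toy version with one monster swinging a club and one player who can block. All the types, names and numbers are invented for the example; the draw-buffer and render steps are left as comments.

```cpp
#include <cassert>

struct Player  { bool blocking = false; int health = 100; };
struct Monster { float swingTimer = 0.0f; bool swingLands = false; };

// Hypothetical per-frame input snapshot.
struct FrameInput { bool blockHeld = false; };

// Non-player time evolution, with the interaction with the player held back.
void EvolveMonster(Monster& m, float dt) {
    assert(dt >= 0.0f);
    m.swingTimer -= dt;
    m.swingLands = (m.swingTimer <= 0.0f);
}

void PlayerRespond(Player& p, const FrameInput& in) {
    p.blocking = in.blockHeld;
}

// Deferred step: the club only connects after the player has had a say.
void MonsterInteractWithPlayer(Monster& m, Player& p) {
    if (!m.swingLands) return;
    if (!p.blocking) p.health -= 10;
    m.swingLands = false;
    m.swingTimer = 1.0f;  // wind up the next swing
}

void RunFrameMethod3(Monster& m, Player& p, const FrameInput& in, float dt) {
    EvolveMonster(m, dt);             // time evolve non-player objects
    // Prepare draw buffers for non-player objects here.
    // Input was polled into 'in' as late as possible.
    PlayerRespond(p, in);             // player responds to input
    // Player-actions interact with world here.
    MonsterInteractWithPlayer(m, p);  // non-player objects interact with player
    // Prepare draw buffers for player, then kick render.
}
```

A block pressed on the same frame the swing lands beats the swing, which is exactly the unfair initiative the player is supposed to get.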

11 comments:

Anonymous said...

In method 1, pre-building the draw buffers is problematic if the player has direct control over the camera (e.g. mouselook), since you won't know what to draw until after processing player input.

Hmm, there MIGHT actually be something to be said for polling the mouse position at the absolute last moment before rendering, which means separate from the rest of user input (which is usually latent through accelerations/decelerations anyway).

Nino Mojo said...

John Carmack speaks about inputs about 14 minutes into this 20-minute interview.
http://www.computerandvideogames.com/306236/news/id-softwares-john-carmack-20-minute-video-interview/


Input latency is something that drives me mad, and with the rise of multicore machines, my understanding as a non-programmer is that it will only get worse as long as games use multicore engines.

cbloom said...

"pre-building the draw buffers is problematic if the player has direct control over the camera (e.g. mouselook)"

Ah yeah, good point, I was thinking in terms of 3rd person console game. But you can certainly still do most of the render work before the input poll (things like animating skinned characters).

-tom! said...

Point #2 can be further simplified by

Gather inputs

Simulate, "time evolve the world"

Render

All applications should follow this logic, they will be simpler for it. Even if you want to run your input polling on a separate thread (so you can get higher resolution response curves) you should "lock in" the view that the application has on the input state at a fixed point in the processing loop.

cbloom said...

"Point #2 can be further simplified by"

Not really. The verbose version of point #2 is important.

There's a lot of games that do

"Gather inputs, Simulate, Render"

and get it wrong. Just saying that is not enough.

For example, the simulate phase might move bullets before creating bullets.

To minimize latency, it's crucial that those sub-steps are ordered correctly.

Thatcher Ulrich said...

If the "prepare" step has any substantial variance, I'm suspicious of the inconsistency in feel. For that reason I think I lean toward the simpler version.

ross said...

You mention giving the player enough time to react to an enemy's action by rendering ASAP after input. If your game is running at 60 fps, the order of these things within a single frame is not significant. You're talking about times much shorter than 1/60th of a second. Other considerations dwarf input lag here.

Input lag frustrates me as both a developer and a competitive gamer, but I don't think this will have the effect you're imagining.

The notes about abstraction/decoupling are great though.

cbloom said...

"If your game is running at 60 fps, the order of these things within a single frame is not significant. You're talking about times much shorter than 1/60th of a second. Other considerations dwarf input lag here."

I don't agree with that. First of all, it's very rare for a game to actually run at 60 fps, the majority are 30 fps. But let's pretend for the moment that we do have a 60 fps game.

It's very very unlikely the display is actually updated 16 millis after an input event, or even 33 millis after an input event. Only the old arcade games actually updated frames that fast.

You probably have a frame of delay in your renderer (most people enqueue rendering these days for parallelism). Then you have a frame of delay in the page flip (nobody runs single buffered any more), and the graphics card/tv/monitor usually add at least 1 frame of delay.

So a "60 fps" game generally has at least 50 millis of latency already, with no input lag.

The difference in doing your input perfectly is another 10 millis or so (let's say, 2/3 of the frame time at 16 millis per frame).

So you can either have 50 millis of lag or 60 millis of lag.

It's unlikely that anyone would feel that difference (certainly running at 30 fps vs 60 fps is a much bigger issue), but why would you refuse it? It's a little bit less latency *for free* , all it takes is some careful thought.

Obviously the big issue is games that have 200 millis of lag, if you're down to 60 you're already doing well, but really if you're down to 60 you're probably thinking about these issues already.

And even if you don't really care about the 10 millis I contend that it's good to go through this kind of reasoning, because this is how you get low lag and good responsive games - you trace the effect of user input through the code processing path and see in what order it affects the world and make sure the latency is what you want.

Aaron said...

There's also the spillover effect. If you're in the mode of thinking about ordering input right to save milliseconds of input lag, you'll start catching other areas where you're allowing unnecessary lag and fix those too, and pretty soon you've eliminated three frames of lag instead of one.

jwatte_food said...

Why evolve players separately from the world? With rigid body systems, my experience is it just leads to more special cases and instability.

1. Detect penetrations and triggers
2. Read input
3. Run all rules based on triggers and input
4. Evolve physical simulation
5. Render state of world

cbloom said...

With your system physical triggers based on player input are a full extra frame behind.

I know it's appealing to think of the player as just another object to physically simulate, but I think the advantages of special-case hacking it are massive and worth doing.
