Thursday, June 14, 2018

Joystick sampling rate in games.

I investigated an interesting conundrum this morning: why was my game running so differently on my iPad Pro? The tank was snappy and turned aggressively on the iPad Pro, but not on Linux and Android.

The main difference between the iPad Pro and the other platforms is its higher display refresh rate. But I was certain I had this covered, as I step my simulation exactly the same on all platforms: with 1/120s steps. The only difference is that I render after each step on the iPad Pro, and render only once after two sim steps on the other platforms, which refresh the display every 1/60s.

The first thing to do is to rule out differences in the iOS port. When I force my iPad to render at 60Hz instead, its behaviour reverts to that of the Linux/Android ports. Confirmed: it is the display refresh rate that makes the difference, not the platform's architecture.

So why would these two scenarios have different outcomes?
[ sim 1/120s, render ] [ sim 1/120s, render ]

[ sim 1/120s,       sim 1/120s,      render ]
|                     |                     |
0ms                 8.3ms                 16.6ms

A logical explanation would be that I somehow influence the simulation somewhere, as I render. But after examining the code, nothing showed up.

It dawned on me that in the high display refresh case, the faster rendering is not the only difference. In 120Hz mode, you not only get more rendering activity, you also get more frequent input events. Touches come in faster when you render at 120Hz than they do when you render at 60Hz. Joystick changes and touch events are batched with the display refresh.

To confirm this, I put in an artificial joystick value that simply rotates the joystick at a set pace. Then I adjusted how those joystick changes were relayed for a 60Hz display frame. The result is the video below.

On the right, I adjust the joystick angle by 0.10 radians before each sim step. On the left, I adjust the joystick angle only once for every two steps, but by double the radians.

At 120Hz stick sampling, I get a smoother joystick signal. Even though the joystick rotation speed is the same, the 60Hz sampling shows more jarring deltas. I hadn't expected the effect on the simulation outcome to be this big.

The reason for the dramatic difference is that the small difference is amplified by the PID controllers I use in my game. In the case of low stick sampling rate, the PID controller will always see a zero change during the second step, and a large change in the first step. The PID controller can react a lot more effectively if it gets a higher frequency signal.
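The uneven deltas can be illustrated with a small sketch. It is not the game's PID code: `DeltaRange` and `stick_delta_range` are made-up names, and the model simply assumes a stick that rotates at a constant rate while the simulation only learns its position on some steps.

```c
#include <assert.h>
#include <math.h>

/* Smallest and largest per-step change in stick angle seen by the sim. */
typedef struct { float min_delta, max_delta; } DeltaRange;

/* Hypothetical helper. The stick rotates by `rate` radians per sim step.
 * steps_per_read = 1 models 120Hz input, 2 models 60Hz input. */
DeltaRange stick_delta_range(float rate, int steps_per_read, int num_steps)
{
    assert(steps_per_read > 0 && num_steps > 0);
    float true_angle = 0.0f;   /* where the stick really is        */
    float seen_angle = 0.0f;   /* what the sim was last told       */
    DeltaRange r = { INFINITY, -INFINITY };
    for (int step = 0; step < num_steps; ++step) {
        true_angle += rate;
        float prev = seen_angle;
        if (step % steps_per_read == 0)
            seen_angle = true_angle;   /* fresh input arrives this step */
        float delta = seen_angle - prev;   /* what a D term would react to */
        if (delta < r.min_delta) r.min_delta = delta;
        if (delta > r.max_delta) r.max_delta = delta;
    }
    return r;
}
```

With 0.10 radians per step, reading the stick every step gives a steady delta of 0.10. Reading it every second step gives deltas that alternate between 0 and 0.20 once the signal is under way, which is the jagged input described above.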

Lesson learned: these two scenarios give different simulation outcomes:

[ read stick,     sim 1/120s,     sim 1/120s,     render ]

[ read stick, sim 1/120s, read stick, sim 1/120s, render ]
|                                                        |
0ms                                                    16.6ms

Although forcing the 120Hz stick signal down to 60Hz is simple to achieve, it will be hard to provide a 120Hz stick signal if you only get your events at 60Hz. So the sweet, reactive control on iPad Pro is hard to achieve on 60Hz devices.

Friday, May 18, 2018

Differential Steering.

I just did a fun little exercise to figure out the steering in my Flank That Tank! indie game. Of course, the best way to steer a tank is by using two throttle levers, one for each track. This lets the tank driver directly control the differential steering of the tank. It also enables some pretty exciting and wild maneuvers.

So really, case closed. A gamepad typically has two analog joysticks, one joystick for each track. Done!

However, I want to be able to run this game on mobile platforms using touch. And I tried it, but the lack of physical stops really hampers the feel of driving. Two levers on a touch screen are simply no proper substitute. (Not to speak of controlling two levers and a fire button, which is even harder on a touch screen.)

So no levers on a touch screen. Could we perhaps do differential steering with a single touch? Or, quite similarly: do differential steering with a single joystick?

It's fun to figure it out.

  • The stick at 12 o'clock would mean full steam ahead, so the L and R tracks at +100% power.
  • The stick at 06 o'clock would mean full steam backwards, so the L and R tracks at -100% power.
  • The stick at 03 o'clock would mean a hard right turn, in place, so L at +100% and R at -100% power.
  • The stick at 09 o'clock would mean a hard left turn, in place, so L at -100% and R at +100% power.
For the intermediate joystick positions, interpolating these four settings is all that is required.

This ought to work nicely for touch screens. Left thumb to drive the tank, which leaves the right hand free for tapping the screen to shoot, and possibly aim the turret as well.

So I'll be implementing this scheme shortly. That leaves me to consider the issue of absolute/relative control. Some people can't steer a vehicle that drives towards the camera (or in 2D: to the bottom of the screen) as it reverses L/R from the driver's point of view. So I may implement an absolute system as well: the "12 o'clock position" will adapt to where the tank is pointing.

Friday, May 11, 2018

The curious case of FPS jitter.

I tried to record a video of my game this morning, and it bugged me that it wasn't 100% smooth. The game did report 60fps though, so let's find out what is going on.

The first order of business is to graph the delta-time for each frame, instead of just reporting the frames-per-second. And sure enough, I would see jitter in the signal: a slow frame followed by a fast frame.
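Why an fps counter can hide this is easy to show with a sketch. The helper below is hypothetical (`FrameStats` and `frame_stats` are made-up names): it averages a run of frame times into an fps figure and also keeps the extremes.

```c
#include <assert.h>

typedef struct { float avg_fps, min_dt, max_dt; } FrameStats;

/* dt: frame times in seconds, count: number of frames (at least 1). */
FrameStats frame_stats(const float *dt, int count)
{
    assert(dt && count > 0);
    FrameStats s = { 0.0f, dt[0], dt[0] };
    float total = 0.0f;
    for (int i = 0; i < count; ++i) {
        total += dt[i];
        if (dt[i] < s.min_dt) s.min_dt = dt[i];   /* fastest frame */
        if (dt[i] > s.max_dt) s.max_dt = dt[i];   /* slowest frame */
    }
    s.avg_fps = (float)count / total;   /* what an fps counter reports */
    return s;
}
```

Feed it an invented sequence such as 1/60s, 1/60s, 25ms, 1/120s and the average still comes out at 60fps, because the slow frame and the fast frame cancel out. Only the per-frame values reveal the hitch.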

What was really puzzling was that I could induce this jitter by pressing a key on the keyboard. Even if this key has no game functionality behind it, the frame time would jitter: slow+fast, for each and every press. And also for the auto-repeat events.

In the picture above, the three red lines are at values 1/60, 1/30 and 1/20 seconds. The green marks are the measured frame times. The jitter shows up for every key I press.

So, perf and FlameGraph to the rescue.

To my surprise, I noticed that SDL_IBus_UpdateTextRect() shows up in the profile. Why is SDL updating text rectangles? I'm not doing any text related things. I just render to an OpenGL window. Notice how a single key press leads to an avalanche of computation and communication, with a call stack 34 functions deep, no less!

Frogtoss told me to look into SDL's Text Input system. My code never started a text input cycle, but to be sure, I called SDL_IsTextInputActive() to check. And sure enough: Text Input is active by default! Adding an SDL_StopTextInput() call fixed the jitter.

Judging from the flamegraph, a key press while Text Input is active is incredibly costly, as it involves computation, communication with the X server, polling, waking up stuff, and more. An avalanche of IO happens for every press. So for games, it's best to turn it off as soon as you have initialized SDL.

Executive Summary: after launching your SDL2 based game, call SDL_StopTextInput() for a smoother frame rate.

 // Right after SDL initialization: Text Input was found active by default here.
 if ( SDL_IsTextInputActive() )
  SDL_StopTextInput();

Postscript: I will try recording that video again. If it still isn't smooth, at least it is not because of this.

Test specs:
Ubuntu 18.04
SDL 2.0.8

Tuesday, May 8, 2018

GDPR and iOS developers.

Two of my mobile games on iOS use AdMob to serve ads. This makes me vulnerable to EU General Data Protection fines, as I have no clear view on what exactly is collected by AdMob. Google puts the responsibility of requesting user permission for the data that AdMob collects on me.

The safest option at this time is for me to completely stop serving ads. And I may very well end up using that option. But I thought it might be interesting to examine other options.

Let's start with disabling all ads, but only for European customers. How feasible would this be?

Well, it starts with the ill-defined term "European customer." We need to identify exactly who the GDPR applies to. This is what the EU has to say about that:

It applies to all companies processing and holding the personal data of data subjects residing in the European Union, regardless of the company’s location.

Still imprecise, because it says nothing about the subject's location, other than residence. What about an EU citizen on holiday in the US? What about a US citizen on holiday in the EU? For now, let's ignore this, and just try to determine residence.

One way would be to check the user's locale. Typically, it would be set to the country of residence. So the use of NSLocale would be a good start. It is better than the alternative of actually checking the user's location, as that would first throw up an annoying dialog requesting permission to check the location.

Is that 100% foolproof? What if users have their locale set up incorrectly? Let me guess, the onus is on me? Hmm... completely disabling ads seems safer indeed.

Ok, disabling ads completely. Is it as simple as going to the AdMob portal and stopping the ad serving? Unfortunately not, because AdMob would still be active on the mobile device, and would only learn that there is no ad service after contacting the AdMob servers. And who's to say the user profile hasn't already been sent to the AdMob servers by then, before learning there are no ads to show?

So nope, disabling ads can't be done without building and uploading new versions to the app store.

One final remark: those GDPR tools that AdMob talks about? Not there!

Wednesday, April 25, 2018

Leaving Track Prints

I am currently developing my still unnamed indie game. This game is a 2D top-down tank fight, and its main gimmick is the 100% destructible world.

Nice destruction if I say so myself, but notice that the tanks don't leave track prints. Today, I will write about implementing track prints that the tanks leave behind on the terrain as they drive over it.

The first observation to be made here is that there are a lot of them, for each tank. That means generating and rendering thousands of them, if not tens of thousands or even hundreds of thousands. This immediately tells me that they can't be rendered individually. I need to apply a technique called Instanced Rendering.

Rendering

In instanced rendering, all instances share the model vertex data and have some per-instance data to make them unique. This per-instance data is typically a transformation matrix, but can also include other things like colour if need be. In my case, the per-instance data can be particularly compact because I work in two dimensions.

All the prints will be identical, except for two things: their position in the world, and their orientation. So in theory, three values would be enough: an x and y coordinate, plus a rotation angle. But personally, I find that defining rotation with a vector, like Chipmunk2D does, is more elegant. Hence, I will feed OpenGL a 2D vector for position and 2D vector for orientation.
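As a sketch of what that per-instance data does to each model vertex: the rotation vector is treated as (cos a, sin a) and applied as a complex multiplication, which is how Chipmunk2D's cpvrotate works. In the game this would run in the vertex shader; it is written here as plain C, and `Vec2`, `PrintInstance` and `instance_transform` are names assumed for illustration.

```c
#include <assert.h>
#include <math.h>

typedef struct { float x, y; } Vec2;

/* Per-instance data: four floats per print. */
typedef struct {
    Vec2 pos;   /* position of the print in the world           */
    Vec2 rot;   /* orientation as a unit vector (cos a, sin a)  */
} PrintInstance;

/* Places one vertex of the shared print model into the world. */
Vec2 instance_transform(PrintInstance inst, Vec2 v)
{
    /* The rotation vector is expected to be (close to) unit length. */
    assert(fabsf(inst.rot.x * inst.rot.x + inst.rot.y * inst.rot.y - 1.0f) < 1e-3f);
    Vec2 out;
    out.x = inst.pos.x + inst.rot.x * v.x - inst.rot.y * v.y;
    out.y = inst.pos.y + inst.rot.y * v.x + inst.rot.x * v.y;
    return out;
}
```

No trigonometry is needed per vertex, which is part of the appeal of storing a vector rather than an angle.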

The next thing to consider is the life-time of the prints. If we create a new print at frame N, then we will need to render it at that frame N and at all frames after it, up until some frame M (M much larger than N) where we need to evict this print to make space. After all, we don't want to run out of resources by creating arbitrarily many prints.

The fact that I progressively create the prints, and reclaim resources for the oldest one, leads me to the convenient solution of ring-buffers. We create a Vertex Buffer Object to hold the shared model data plus N instances. When creating instance N+1, we will reuse the slot at position 0. Each frame, we will only write the VBO at the slots that got new data that same frame.
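The CPU-side bookkeeping for such a ring could look like the sketch below. All names (`TrackPrint`, `PrintRing`, `ring_add`, `ring_first_dirty`) and the tiny capacity are assumptions for illustration. The actual upload of the dirty slots, for example with glBufferSubData, is left out.

```c
#include <assert.h>

#define MAX_PRINTS 4   /* tiny on purpose; a real game would use thousands */

typedef struct { float px, py, rx, ry; } TrackPrint;   /* position + rotation */

typedef struct {
    TrackPrint slots[MAX_PRINTS];
    int next;    /* slot the next print will be written to   */
    int count;   /* live prints, at most MAX_PRINTS          */
    int dirty;   /* prints added since the last upload       */
} PrintRing;

/* Adds a print, overwriting the oldest one when full. Returns the slot used. */
int ring_add(PrintRing *r, TrackPrint p)
{
    assert(r);
    int slot = r->next;
    r->slots[slot] = p;
    r->next = (slot + 1) % MAX_PRINTS;
    if (r->count < MAX_PRINTS) r->count++;
    if (r->dirty < MAX_PRINTS) r->dirty++;
    return slot;
}

/* First slot to upload this frame; the dirty slots run from here,
 * wrapping around the end of the buffer. */
int ring_first_dirty(const PrintRing *r)
{
    assert(r);
    return (r->next - r->dirty + MAX_PRINTS) % MAX_PRINTS;
}
```

After uploading, the caller would reset `dirty` to 0. A dirty range that wraps past the end of the buffer needs two uploads: one for the tail and one for the head.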

Generating

Having the rendering covered leaves me the problem of generating the prints. This problem is trickier. The tank has many track-segments touching the ground at any time, all leaving a print. When the tank drives straight, those prints all superimpose, so you would only really need to generate one of them. But when the tank turns, this won't work, and it gets worse if the tank turns in place. See below what happens if you leave one print at each side of the tank. The tracks look fine, until the tank does a 180 degree spin.

And it looks particularly bad if the tank gets bumped hard and moves sideways. I haven't really cracked the problem of generating proper tracks yet. I think the root of the problem lies in the fact that the game's simulation has no concept of the track links. The tank itself is just four rigid bodies: one chassis, one turret and two for the L/R tracks. The links of the tracks are just an animation effect.

So the generation of track prints needs some more work. I'll report back when and if I solve it.

Tuesday, February 20, 2018

Returning to iOS development.

It occurred to me that the new iPad Pro 120Hz display is a great motivation to update my Little Crane game for iOS. So after a long time, I returned to my iOS codebase. Here I report some random findings.

🔴 OS Support
Currently, Little Crane supports iOS3.2 and up. But the current Xcode (9.2) does not support anything under iOS8. Oh well, abandoning a few old devices then.

🔴 Launch Image
Also scrapped by iOS: Launch Images. If you want to have support for iPad Pro, you now need newfangled Launch Screen storyboards. As more iOS devices got released, the launching process got more complex over time:

  • First, they were just specially named images in your bundle.
  • Then, they were images in an Asset Catalog.
  • Now, they are a storyboard with a whole lot of crap that comes with this. Oh boy.

🔴 Bloated AdMob
The iAd product was scrapped a long time ago. So if you want to have ads in your app, you need to look elsewhere. I went with the other behemoth in advertisements: AdMob. When upgrading from AdMob SDK 7.6.0 to 7.28.0 I was unpleasantly surprised. I now need to link to a whole bunch of extra stuff. I think ads do 3D rendering now, as opposed to just playing a video? New dependencies in AdMob:

  • GL Kit
  • Core Motion
  • Core Video
  • CFNetwork
  • Mobile Core Services

🔴 GKLeaderboardViewControllerDelegate
Leaderboards with a delegate have been deprecated. It probably still works, so I am tempted to leave in the old code. I do get this weird runtime error message when closing a Game Center dialog though: "yowza! restored status bar too many times!"

Tuesday, February 13, 2018

Flame Graphs and Data Wrangling.

In my pursuit of doing Real Time (60fps) Ray Tracing for a game, I have been doing a lot of profiling with 'perf.' One way to quickly analyse the results from a perf record run, is by making a FlameGraph. Here's a graph for my ray tracing system:


During my optimization effort, I've found that lining up all the data nicely for consumption by your algorithm works wonders. Have everything ready to go, and blast through it with your SIMD units. For ray tracing, this means having your intersection routines blast through the data, as ray tracing, at its core, is testing rays versus shapes. In my game, these shapes are all AABBs, and my intersection code tests 8 AABBs versus a single ray in one go. A big contribution to hitting 60fps ray tracing is the fact that my scenes use simple geometry: AABBs, almost as simple as spheres, but more practical for world building.
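To make the "8 boxes in one go" idea concrete, here is a sketch of the classic slab test over eight boxes stored in structure-of-arrays layout. It is written as plain scalar C with an inner loop over 8 lanes, which is the shape a SIMD version would take. It is not the game's actual routine: `Box8` and `ray_vs_box8_sketch` are made-up names, and the ray direction is assumed to have no zero components.

```c
#include <assert.h>

/* Eight boxes, structure-of-arrays: lo[axis][box] and hi[axis][box]. */
typedef struct { float lo[3][8]; float hi[3][8]; } Box8;

/* Returns an 8 bit mask; bit i is set when the ray hits box i within [0, tmax].
 * org: ray origin, inv_dir: 1/direction per axis. */
unsigned ray_vs_box8_sketch(const Box8 *b, const float org[3],
                            const float inv_dir[3], float tmax)
{
    assert(b && tmax >= 0.0f);
    float tnear[8], tfar[8];
    for (int i = 0; i < 8; ++i) { tnear[i] = 0.0f; tfar[i] = tmax; }

    for (int a = 0; a < 3; ++a) {
        for (int i = 0; i < 8; ++i) {          /* same operation on 8 lanes */
            float t0 = (b->lo[a][i] - org[a]) * inv_dir[a];
            float t1 = (b->hi[a][i] - org[a]) * inv_dir[a];
            float t_in  = t0 < t1 ? t0 : t1;   /* entering this slab */
            float t_out = t0 < t1 ? t1 : t0;   /* leaving this slab  */
            if (t_in  > tnear[i]) tnear[i] = t_in;
            if (t_out < tfar[i])  tfar[i]  = t_out;
        }
    }

    unsigned mask = 0;
    for (int i = 0; i < 8; ++i)
        if (tnear[i] <= tfar[i]) mask |= 1u << i;   /* slab intervals overlap */
    return mask;
}
```

Because the data for all eight boxes sits contiguously per axis, each inner loop maps onto one 8-wide load and a handful of 8-wide arithmetic instructions on an AVX2 machine.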

This is all fine and dandy, but does expose a new problem: your CPU is busy more with wrangling the data than doing the actual computation. Even when I cache the paths that primary rays take (from camera into scene) for quick reuse, the administration around intersection tests takes up more time than the tests themselves.

This is visible in the graph above, where the actual tests are in linesegment_vs_box8 (for shadow rays) and ray_vs_box8 (for primary rays). It seems I am hitting some wall here, and I am having a hard time pushing through it for even more performance.

So my shadow rays are more costly than my primary rays. I have a fixed camera position, so the primary rays traverse the world grid in the same fashion each frame. This, I exploit. But shadow rays go all over the place, of course, and need to dynamically march through my grid.

In order to alleviate the strain on the CPU a bit, I cut the number of shadow rays in half, by only computing the shadow once every two frames for each pixel. So half the shadow information lags by one frame.
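One way to schedule this, as a sketch: alternate a checkerboard pattern between even and odd frames, so each pixel gets a fresh shadow ray every second frame. The checkerboard is an assumption, as the pattern used in the game is not stated, and `shadow_due` is a made-up name.

```c
#include <assert.h>

/* Hypothetical schedule: returns 1 when pixel (x, y) should trace a fresh
 * shadow ray on this frame, 0 when it should reuse last frame's result. */
int shadow_due(int x, int y, unsigned frame)
{
    assert(x >= 0 && y >= 0);
    return (int)(((unsigned)x + (unsigned)y + frame) & 1u);
}
```

On any given frame half of the pixels are due, and a pixel that reuses its shadow always has neighbours on all four sides that were refreshed, which spreads the one-frame lag evenly over the image.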

So to conclude: if you line up all your geometry beforehand, and have it packed in sets of 8, then the actual intersection tests take almost no time at all. This makes it possible to do real time ray tracing at an 800x400 resolution, at 60 frames per second, at 1.5 rays per pixel on 4 cores equipped with AVX2. To go faster than that, I need to find a way to accelerate the data-wrangling.