Analysing Performance
When we talk about optimizing performance, we could mean a number of different things. In the context of a gaming application, the first thing that comes to most people’s minds is frame rate, measured in frames per second (FPS). FPS is very important; after all, a jumpy or sluggish application will disappoint your end user. However, at ARM we also like to count battery life as part of performance. However good the FPS might be, there is little point if the end user is unable to play the game for a reasonable amount of time.
The other scenario is that you are completely satisfied with the FPS you are achieving on a chosen device. Even so, the vast array of mid-range devices on the market offers a rich, often untapped, opportunity: with a little optimization, many of them would be more than capable of running your content, allowing you to monetize it further. It’s always worth spending some time on optimization, and with the great tools that ARM provides, a lot of the guesswork and pain can be avoided.
ARM® DS-5™ Streamline™ has been available for some time and allows developers to optimize applications across the whole of any system with an ARM Cortex-A CPU and ARM Mali GPU. Using the tool, you can improve not only FPS but also power efficiency.
Understanding When You Are Fragment Bound
Using DS-5 Streamline, you can quickly identify where in the system the bottleneck lies. In a graphics application it could be in any of a number of areas: CPU, Vertex Processing, Fragment Processing and Bandwidth. However, to understand the cause of the bottleneck from an application perspective, you need the visibility and features provided by Mali Graphics Debugger.
In the case of the latter two, Fragment Processing and Bandwidth, we have a number of new features in the latest Mali Graphics Debugger v1.2 to address the common issues.
Overdraw
Overdraw is the term used when you write out a fragment for the same pixel more than once. This often occurs when you have transparency in a scene or when you are drawing your objects from back to front. Overdraw is not in itself a bad thing, since you might need it to achieve a certain effect. Unnecessary overdraw is what we want to avoid. It occurs, for example, if you draw with transparency when it is not required or, as mentioned above, if you draw your objects from back to front. In both cases the Fragment Processing on the GPU performs work on pixels that may never be seen, reducing performance and also burning precious joules.
The Overdraw Map feature in the Mali Graphics Debugger can be enabled with a simple toggle in the UI, which switches the device to a special mode that displays overdraw. Once in this mode, you can capture a frame and step through each draw call to see how the overdraw builds up and identify the offending draw call.
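To make the effect of draw order concrete, here is a minimal sketch that counts fragment writes for a single pixel under a simple "less than" depth test. It is an idealized model, not a description of how a Mali GPU works internally, and the function name and depth values are hypothetical.

```python
def fragment_writes(depths):
    """Count fragments that pass a simple less-than depth test at one pixel.

    `depths` lists the depth of each opaque surface covering the pixel,
    in the order the draw calls are submitted (smaller = closer to camera).
    """
    nearest = float("inf")
    writes = 0
    for depth in depths:
        if depth < nearest:  # fragment passes the depth test and is shaded
            nearest = depth
            writes += 1
    return writes


# Hypothetical data: four opaque surfaces covering the same pixel.
layers = [0.9, 0.6, 0.3, 0.1]

back_to_front = fragment_writes(sorted(layers, reverse=True))
front_to_back = fragment_writes(sorted(layers))

print("back to front:", back_to_front)  # 4 writes, 3 of them never seen
print("front to back:", front_to_back)  # 1 write
```

In this model, submitting opaque geometry front to back lets the depth test reject every hidden fragment before it is shaded, whereas back to front shades all of them.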
Figure 1. Overdraw Map
Figure 1 above shows the overdraw for a game application. The whiter the area on the map, the more overdraw there is (black being zero overdraw). The view gives you a simple way to identify areas where you could reduce overdraw.
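The idea behind such a map can be sketched in a few lines. The example below counts how many times each pixel is written by a set of axis-aligned rectangles and scales the counts to greyscale. The tool's exact brightness mapping is not stated here, so the linear scaling, the function names and the rectangle data are all assumptions made for illustration.

```python
def write_counts(width, height, rects):
    """Count how many times each pixel is written by a list of rectangles.

    Each rectangle is (x0, y0, x1, y1) with exclusive upper bounds.
    """
    counts = [[0] * width for _ in range(height)]
    for x0, y0, x1, y1 in rects:
        for y in range(max(y0, 0), min(y1, height)):
            for x in range(max(x0, 0), min(x1, width)):
                counts[y][x] += 1
    return counts


def to_greyscale(counts):
    """Scale counts to 0-255 so the most-written pixel is white."""
    peak = max(max(row) for row in counts)
    if peak == 0:
        return [[0 for _ in row] for row in counts]
    return [[255 * c // peak for c in row] for row in counts]


# Hypothetical 4x2 frame: a full-screen background plus two overlapping quads.
draws = [(0, 0, 4, 2), (1, 0, 3, 2), (2, 0, 4, 1)]
counts = write_counts(4, 2, draws)
grey = to_greyscale(counts)
average_writes = sum(map(sum, counts)) / (4 * 2)

for row in grey:
    print(row)
print("average writes per pixel:", average_writes)
```

The average number of writes per pixel is a handy single figure to track while you work on reducing overdraw.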
Shader Utilization
The Mali Graphics Debugger has a useful feature that lets you capture all the shaders (vertex and fragment) used in a frame, along with cycle count information for each shader. With only this information, it’s difficult to identify the shaders you need to spend optimization effort on. Say, for example, you have a fragment shader reported as 12 cycles compared with 5-6 cycles for the other shaders. Instinct might dictate that you start optimizing the 12-cycle shader. However, if that particular shader renders only 4 pixels on the screen, there is unlikely to be any performance gain from working on it. A new feature in the Mali Graphics Debugger v1.2, the Shader Utilization Map, allows you to identify the most expensive shaders in terms of real usage, i.e. the shaders that took the most GPU processing time.
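The reasoning above can be sketched as a simple calculation. The example below estimates a shader's cost as cycles per fragment multiplied by the number of fragments it shaded; this is an assumed approximation for illustration, not necessarily the metric the tool reports, and the shader names and instance counts are made up.

```python
from dataclasses import dataclass


@dataclass
class ShaderStats:
    name: str
    cycles: int     # reported cycles per fragment
    instances: int  # fragments shaded with this shader in the frame

    @property
    def total_cost(self):
        # Assumed estimate: per-fragment cycles times fragments shaded.
        return self.cycles * self.instances


# Hypothetical frame: the 12-cycle shader only covers 4 pixels.
shaders = [
    ShaderStats("glow", cycles=12, instances=4),
    ShaderStats("terrain", cycles=6, instances=500_000),
    ShaderStats("ui", cycles=5, instances=120_000),
]

ranked = sorted(shaders, key=lambda s: s.total_cost, reverse=True)
for s in ranked:
    print(f"{s.name:8} {s.total_cost:>10}")
```

Ranked this way, the 12-cycle shader falls to the bottom of the list, and the cheaper shaders that cover most of the screen become the obvious place to start.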
Figure 2. Shader Utilization Map
Figure 2 above shows the same gaming application used in the overdraw example. This view lets you see clearly which shader contributed to which pixel and understand where optimization efforts should start.
Figure 3. Shader Statistics
Along with the shader statistics (Figure 3), you can see the number of instances run for each shader and its total cost. A further benefit of this view is that it helps you identify opportunities for batching draw calls, which lowers CPU overhead and reduces the number of state changes.
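As a rough illustration of why grouping helps, the sketch below counts how often the bound shader changes across a submission order, before and after grouping draw calls by shader. The draw-call list and names are hypothetical, and reordering like this is only safe where draw order does not affect the final image (for example, opaque geometry with depth testing).

```python
def state_changes(draw_calls):
    """Count how often the bound shader changes across a submission order."""
    changes = 0
    current = None
    for shader, _mesh in draw_calls:
        if shader != current:
            changes += 1
            current = shader
    return changes


# Hypothetical frame: (shader, mesh) pairs in their original submission order.
frame = [
    ("lit", "rock"),
    ("water", "lake"),
    ("lit", "tree"),
    ("water", "river"),
    ("lit", "house"),
]

# Python's sort is stable, so calls sharing a shader keep their relative order.
grouped = sorted(frame, key=lambda call: call[0])

print("original order:", state_changes(frame))    # 5 shader binds
print("grouped order: ", state_changes(grouped))  # 2 shader binds
```

Once calls that share a shader sit next to each other, they also become candidates for merging into fewer, larger draw calls.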
Mali Graphics Debugger v1.2
The above features are all a part of the latest release of Mali Graphics Debugger, available on the Mali Developer Centre Site. We have also added a number of other features to the latest release. These include:
- Support for visualizing ASTC compressed textures
- Vertex Shader utilization
- Frame buffer thumbnails
- Many other improvements and fixes.
Head over to the Mali Developer Centre now to download your free copy. Please leave your feedback below.