Channel: ARM Mali Graphics

Vulkan & Validation Layers


Why the validation layers?

Unlike OpenGL, Vulkan drivers don't have a global context, don't maintain global state and don't have to validate inputs from the application. The goal is to reduce the CPU time consumed by the driver and to give applications more freedom in how they implement their engine. This approach is feasible because a reasonably well-written application or game should not pass incorrect input to the driver in release mode, so the internal checks a driver would usually perform are a waste of CPU time. During development and debugging, however, a mechanism for detecting invalid input is a powerful tool that can make a developer's life a lot easier. In Vulkan, all input validation has therefore been moved out of the driver into a separate standalone module called the validation layers. While debugging a graphics application or preparing it for release, running the validation layers is a good way to check that the application makes no obvious mistakes. A "clean" validation run doesn't guarantee a bug-free application, but it's a good step towards a happy customer. The validation layers are an open source project belonging to the Khronos community, so everyone is welcome to contribute or raise an issue: https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/issues

 

My application runs OK on this device. Am I good to ship it?

No, you are not! The Vulkan specification is the result of contributions from multiple vendors, so the API offers functionality that matters for Vendor A but may be somewhat irrelevant to Vendor B. This is especially true for Vulkan operations whose effects are not directly observable by applications, such as layout transitions and the execution of memory barriers. While applications are required to manage resources correctly, you don't know exactly what happens on a given device when, for example, a memory barrier is executed on an image sub-resource. In fact, it depends heavily on the specifics of the memory architecture and the GPU. From this perspective, mistakes in areas such as resource sharing, layout transitions, the selection of visibility scopes and the transfer of resource ownership may have different consequences on different architectures. This is a critical point: an incorrectly managed resource may cause no visible problem on this device because of the implementation options chosen by its vendor, yet prevent the application from running on a device from another vendor.

 

Frequently observed application issues with the Vulkan driver on Mali.

 

External resources ownership.

Resources such as presentable images are treated as external to the Vulkan driver, meaning that the driver doesn't own them. The driver obtains a lock on such an external resource temporarily in order to execute a rendering operation or a series of rendering operations. When this is done, the resource is released back to the system. When ownership passes to the driver, the external resource has to be mapped and given valid entries in the MMU tables so that the GPU can read and write it correctly. Once the graphics operations involving the resource have finished, it has to be released back to the system and all of its MMU entries invalidated. It is the application's responsibility to tell the driver at which stage ownership of a given external resource is supposed to change, by providing this information either as part of the render pass creation structure or as part of the execution of a pipeline barrier.

 

For example, when the presentable resource is expected to be in use by the driver, its layout is transitioned from VK_IMAGE_LAYOUT_PRESENT_SRC_KHR to VK_IMAGE_LAYOUT_GENERAL, VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL or VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL. When rendering to the attachment is done and the image is expected to be presented on the display, its layout needs to be transitioned back to VK_IMAGE_LAYOUT_PRESENT_SRC_KHR.

 

Incorrectly used synchronization

Vulkan object lifetime is another critical area in Vulkan applications. The application must ensure that Vulkan objects, or the pools they were allocated from, are destroyed or reset only when they are no longer in use. The consequences of managing object lifetimes incorrectly are unpredictable. The most likely problem is MMU faults, which result in rendering issues and a lost device. Most of these situations can be caught and reported by the validation layers. For example, if the application tries to reset a command pool while a command buffer allocated from it is still in flight, the validation layers should intercept it with the following report:

 

[DS] Code 54: Attempt to reset command pool with command buffer (0xXXXXXXXX) which is in use

 

As another example, when the application tries to record commands into a command buffer that is still in flight, the validation layers should intercept it with the following report:

 

[MEM] Code 9: Calling vkBeginCommandBuffer() on active CB 0xXXXXXXXX before it has completed. You must check CB fence before this call.

 

Memory requirements violation.

Vulkan applications are responsible for providing the memory that backs an image or buffer object via the appropriate call to vkBindBufferMemory or vkBindImageMemory. The application must not make assumptions about the memory requirements of an object, even if it is, for example, a VkImage object created with VK_IMAGE_TILING_LINEAR tiling, as there is no guarantee of contiguous memory. Allocations must be based on the size and alignment values returned by vkGetImageMemoryRequirements or vkGetBufferMemoryRequirements. Data must then be uploaded to the sub-resource according to the sub-resource layout values, such as the offset to the start of the sub-resource, its size and its row/array/depth pitch values. Violating the memory requirements of a Vulkan object can often result in segmentation faults or MMU faults on the GPU and eventually in VK_ERROR_DEVICE_LOST. Running the validation layers is recommended as protection against these kinds of issues. The validation layers can detect situations such as memory overflow, cross-object memory aliasing and mapping/unmapping issues; however, binding insufficient memory is not currently detected.
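
The alignment and row pitch rules described above can be sketched in plain C. This is only an illustration under stated assumptions, not driver or SDK code: to keep it compilable without the Vulkan headers, SubresourceLayout is a hypothetical stand-in that mirrors the offset, size and rowPitch fields of VkSubresourceLayout, and align_up and upload_rows are helper names invented for this example. It assumes the alignment is a power of two and that the source pixels are tightly packed.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in mirroring three fields of VkSubresourceLayout. */
typedef struct SubresourceLayout {
    uint64_t offset;   /* byte offset of the sub-resource in the bound memory */
    uint64_t size;     /* total bytes occupied by the sub-resource */
    uint64_t rowPitch; /* bytes between the starts of consecutive rows */
} SubresourceLayout;

/* Round offset up to the next multiple of alignment.
 * Assumes alignment is a non-zero power of two. */
static uint64_t align_up(uint64_t offset, uint64_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Copy a tightly packed source image row by row into mapped memory,
 * stepping the destination by rowPitch rather than by the row width.
 * Returns 0 on success, -1 if the rows would not fit the sub-resource. */
static int upload_rows(uint8_t *mapped, const SubresourceLayout *layout,
                       const uint8_t *src, uint64_t row_bytes, uint32_t height)
{
    if (row_bytes > layout->rowPitch) {
        return -1;
    }
    if (height > 0 &&
        (uint64_t)(height - 1) * layout->rowPitch + row_bytes > layout->size) {
        return -1;
    }

    uint8_t *dst = mapped + layout->offset;
    for (uint32_t y = 0; y < height; ++y) {
        memcpy(dst + (uint64_t)y * layout->rowPitch,
               src + (uint64_t)y * row_bytes,
               (size_t)row_bytes);
    }
    return 0;
}
```

In a real application the layout values would come from vkGetImageSubresourceLayout and the alignment from the VkMemoryRequirements structure filled in by vkGetImageMemoryRequirements, rather than being computed or guessed by the application. The point of the sketch is that a single memcpy of width * height * bytes-per-pixel is only correct when rowPitch happens to equal the row width, which the application must not assume.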

