Playing with buffers
Resulting code: step031
Resulting code: step031-vanilla
Before feeding vertex data to the render pipeline, we need to get familiar with the notion of buffer. A buffer is "just" a chunk of memory allocated in the VRAM (the GPU's memory). Think of it as some kind of new or malloc for the GPU.
In this chapter, we see how to create (i.e., allocate) a buffer, write to it from the CPU, copy it from GPU to GPU, and read it back to the CPU.
Note
Note that textures are a special kind of memory (because of the way we usually sample them), so they live in a different kind of object.
Since this is just an experiment, I suggest we temporarily write the whole code of this chapter at the end of the Initialize() function. The overall outline of our code is as follows:
// Experimentation for the "Playing with buffers" chapter
{{Create a first buffer}}
{{Create a second buffer}}
{{Write input data}}
{{Encode and submit the buffer to buffer copy}}
{{Read buffer data back}}
{{Release buffers}}
Creating a buffer
The overall structure of the buffer creation should surprise no one by now: a descriptor, and a call to createBuffer.
BufferDescriptor bufferDesc;
bufferDesc.label = "Some GPU-side data buffer";
bufferDesc.usage = BufferUsage::CopyDst | BufferUsage::CopySrc;
bufferDesc.size = 16;
bufferDesc.mappedAtCreation = false;
Buffer buffer1 = device.createBuffer(bufferDesc);
WGPUBufferDescriptor bufferDesc = {};
bufferDesc.nextInChain = nullptr;
bufferDesc.label = "Some GPU-side data buffer";
bufferDesc.usage = WGPUBufferUsage_CopyDst | WGPUBufferUsage_CopySrc;
bufferDesc.size = 16;
bufferDesc.mappedAtCreation = false;
WGPUBuffer buffer1 = wgpuDeviceCreateBuffer(device, &bufferDesc);
One notable difference with a CPU buffer is that we must state some usage hints, telling how we intend to use this memory. For instance, if we are going to write to the buffer from the CPU but never read it back, we set its CopyDst usage flag but not the CopySrc flag. This less agnostic memory management helps the device figure out the best memory layout.
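To give an idea, here are a couple of other combinations from the same BufferUsage enum that we will meet later (a sketch, not needed for this chapter):
// Sketch: other common usage combinations (not needed in this chapter)
// A buffer uploaded from the CPU and used as vertex data by the render pipeline:
bufferDesc.usage = BufferUsage::CopyDst | BufferUsage::Vertex;
// A buffer that receives GPU-side copies and is then read back on the CPU:
bufferDesc.usage = BufferUsage::CopyDst | BufferUsage::MapRead;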
Note
A GPU buffer is mapped when it is connected to a specific part of the CPU-side RAM. The driver then automatically synchronizes its content, either for reading or for writing. We do not use this feature for now.
For our little exercise, let us create a second buffer, called buffer2. We will load data in the first buffer, issue a copy command so that the GPU copies data from one to the other, then read the destination buffer back.
We can reuse the descriptor, only changing the label for now:
bufferDesc.label = "Output buffer";
Buffer buffer2 = device.createBuffer(bufferDesc);
bufferDesc.label = "Output buffer";
WGPUBuffer buffer2 = wgpuDeviceCreateBuffer(device, &bufferDesc);
Also, don't forget to release your buffers once you no longer use them:
// In Terminate()
buffer1.release();
buffer2.release();
// In Terminate()
wgpuBufferRelease(buffer1);
wgpuBufferRelease(buffer2);
Note
Buffers (as well as textures) also provide a destroy method:
buffer1.destroy();
wgpuBufferDestroy(buffer1);
This can be used to force freeing the GPU memory even if references to the buffer still exist.
Destroy frees the GPU memory that was allocated for the buffer, but the buffer object itself, which lives on the driver/backend side, still exists.
Release frees the driver/backend side object (or rather decreases its reference counter) and destroys it if nobody else uses it.
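A small sketch to illustrate the distinction (tempBuffer is a hypothetical extra buffer):
// Sketch: destroy vs. release (tempBuffer is hypothetical)
Buffer tempBuffer = device.createBuffer(bufferDesc);
// [...] use tempBuffer in some commands
tempBuffer.destroy(); // the VRAM is freed right away, even if references remain
tempBuffer.release(); // our reference to the driver-side object is dropped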
Writing to a buffer
The device queue provides a queue.writeBuffer function (or the C-style wgpuQueueWriteBuffer), to which we first give the GPU buffer to write into, then a CPU-side memory address and size from which data is copied:
// Create some CPU-side data buffer (of size 16 bytes)
std::vector<uint8_t> numbers(16);
for (uint8_t i = 0; i < 16; ++i) numbers[i] = i;
// `numbers` now contains [ 0, 1, 2, ... ]
// Copy this from `numbers` (RAM) to `buffer1` (VRAM)
queue.writeBuffer(buffer1, 0, numbers.data(), numbers.size());
// Create some CPU-side data buffer (of size 16 bytes)
std::vector<uint8_t> numbers(16);
for (uint8_t i = 0; i < 16; ++i) numbers[i] = i;
// `numbers` now contains [ 0, 1, 2, ... ]
// Copy this from `numbers` (RAM) to `buffer1` (VRAM)
wgpuQueueWriteBuffer(queue, buffer1, 0, numbers.data(), numbers.size());
Note
Uploading data from the CPU-side memory (RAM) to the GPU-side memory (VRAM) takes time. When the function writeBuffer() returns, the data transfer may not have finished yet, but what is guaranteed is that:
You can free up the memory from the address you just passed, because the backend maintains its own CPU-side copy of the buffer during transfer (use mapping if you want to avoid that).
Commands that are submitted in the queue after the writeBuffer() operation will not be executed before the data transfer is finished.
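For instance, thanks to the first guarantee, the source memory may safely go out of scope as soon as the call returns; a small sketch:
// Sketch: the source memory may be freed right after writeBuffer() returns
{
    std::vector<uint8_t> numbers(16);
    for (uint8_t i = 0; i < 16; ++i) numbers[i] = i;
    queue.writeBuffer(buffer1, 0, numbers.data(), numbers.size());
} // `numbers` is destroyed here, yet the data transfer still completes correctly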
And don't forget that commands sent through the command encoder are only submitted when calling queue.submit() with the encoded command buffer returned by encoder.finish().
Copying a buffer
We can now submit a buffer-to-buffer copy operation to the command queue. This is not directly available from the queue object but rather requires creating a command encoder. We may use the same one as the render pass for our test and simply add the following:
// After creating the command encoder
encoder.copyBufferToBuffer(buffer1, 0, buffer2, 0, 16);
// After creating the command encoder
wgpuCommandEncoderCopyBufferToBuffer(encoder, buffer1, 0, buffer2, 0, 16);
The argument 0 after each buffer is the byte offset within the buffer at which the copy must happen. This enables copying sub-parts of buffers.
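For instance, a sketch of a partial copy using non-zero offsets (the numbers are arbitrary):
// Sketch: copy only the last 8 bytes of buffer1 to the beginning of buffer2
encoder.copyBufferToBuffer(buffer1, 8, buffer2, 0, 8);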
We wrap this in a command encoding process similar to the render pass one:
CommandEncoder encoder = device.createCommandEncoder(Default);
{{Copy buffer to buffer}}
CommandBuffer command = encoder.finish(Default);
encoder.release();
queue.submit(1, &command);
command.release();
WGPUCommandEncoder encoder = wgpuDeviceCreateCommandEncoder(device, nullptr);
{{Copy buffer to buffer}}
WGPUCommandBuffer command = wgpuCommandEncoderFinish(encoder, nullptr);
wgpuCommandEncoderRelease(encoder);
wgpuQueueSubmit(queue, 1, &command);
wgpuCommandBufferRelease(command);
Reading from a buffer
The command queue, which we used to send data (writeBuffer) and instructions (copyBufferToBuffer), only goes one way: from the CPU host to the GPU device. It is thus a "fire and forget" queue: functions do not return a value since they run on a different timeline.
So, how do we read data back then? We use an asynchronous operation, like we did when using wgpuQueueOnSubmittedWorkDone in the Command Queue chapter. Instead of directly getting a value back, we set up a callback that gets invoked whenever the requested data is ready. We then poll the device to check for incoming events.
To read data from a buffer, we use buffer.mapAsync (or wgpuBufferMapAsync). This operation maps the GPU buffer into CPU memory, and then whenever it is ready it executes the callback function it was provided. Once we are done, we can unmap the buffer.
Note
This asynchronicity makes the programming workflow more complicated than synchronous operations, but it is very important to minimize wasteful processor idling. It is common to launch a mapping operation, then do other things while waiting for the data (which takes a lot of time, compared to running CPU instructions).
Mapping
Let us first change the usage of the second buffer by adding the BufferUsage::MapRead flag, so that the buffer can be mapped for reading:
bufferDesc.label = "Output buffer";
bufferDesc.usage = BufferUsage::CopyDst | BufferUsage::MapRead;
Buffer buffer2 = device.createBuffer(bufferDesc);
bufferDesc.label = "Output buffer";
bufferDesc.usage = WGPUBufferUsage_CopyDst | WGPUBufferUsage_MapRead;
WGPUBuffer buffer2 = wgpuDeviceCreateBuffer(device, &bufferDesc);
Note
The BufferUsage::MapRead flag is not compatible with the BufferUsage::CopySrc one, so make sure not to have both at the same time. It is common to create a buffer dedicated to mapping operations.
We can now call the buffer mapping with a simple callback. The wgpuBufferMapAsync procedure takes as arguments the map mode (read, write or both), the slice of buffer data to map, given by an offset (0) and a number of bytes (16), then the callback and finally some "user data" pointer. We show below what the latter is for.
auto onBuffer2Mapped = [](WGPUBufferMapAsyncStatus status, void* /* pUserData */) {
    std::cout << "Buffer 2 mapped with status " << status << std::endl;
};
wgpuBufferMapAsync(buffer2, MapMode::Read, 0, 16, onBuffer2Mapped, nullptr /*pUserData*/);
auto onBuffer2Mapped = [](WGPUBufferMapAsyncStatus status, void* /* pUserData */) {
    std::cout << "Buffer 2 mapped with status " << status << std::endl;
};
wgpuBufferMapAsync(buffer2, WGPUMapMode_Read, 0, 16, onBuffer2Mapped, nullptr /*pUserData*/);
Important
I intentionally use the C-style procedure for now, as it helps understand what is happening under the hood when using the simpler API provided by the C++ wrapper.
Asynchronous polling
If you run the program at this point, you might be surprised (and disappointed) to see that the callback is never executed! We saw this in the Device Polling section of the Command Queue chapter: there is no hidden process executed by the WebGPU library to check that the async operation is ready, so we must do it ourselves.
Unfortunately, this mechanism has no standard solution yet, so we write it differently for Dawn, wgpu-native and emscripten, with a little subtlety for the latter:
// We define a function that hides implementation-specific variants of device polling:
void wgpuPollEvents([[maybe_unused]] Device device, [[maybe_unused]] bool yieldToWebBrowser) {
#if defined(WEBGPU_BACKEND_DAWN)
    device.tick();
#elif defined(WEBGPU_BACKEND_WGPU)
    device.poll(false);
#elif defined(WEBGPU_BACKEND_EMSCRIPTEN)
    if (yieldToWebBrowser) {
        emscripten_sleep(100);
    }
#endif
}
// We define a function that hides implementation-specific variants of device polling:
void wgpuPollEvents([[maybe_unused]] WGPUDevice device, [[maybe_unused]] bool yieldToWebBrowser) {
#if defined(WEBGPU_BACKEND_DAWN)
    wgpuDeviceTick(device);
#elif defined(WEBGPU_BACKEND_WGPU)
    wgpuDevicePoll(device, false, nullptr);
#elif defined(WEBGPU_BACKEND_EMSCRIPTEN)
    if (yieldToWebBrowser) {
        emscripten_sleep(100);
    }
#endif
}
Emscripten subtlety
When our C++ code runs in a Web browser (after being compiled to WebAssembly through emscripten), there is no explicit way to tick/poll the WebGPU device. This is because the device is managed by the Web browser itself, which decides at what pace polling should happen. As a result:
The device never ticks in between two consecutive lines of our WebAssembly module; it can only tick when the execution flow leaves the module.
The device always ticks between two calls to our MainLoop() function because, as you may remember from the Emscripten section of the Opening a Window chapter, we leave the main loop management to the browser and only provide a callback to run at each frame.
Thanks to the second point, we do not need wgpuPollEvents to do anything when called at the beginning or end of our main loop (so we set yieldToWebBrowser to false).
However, if what we intend is really to wait until something happens (e.g., a callback gets invoked), the first point requires us to make sure we yield the execution flow back to the Web browser, so that it may tick its device from time to time. We do this thanks to the emscripten_sleep function, at the cost of effectively sleeping for 100 ms (we are in a case where we want to wait anyway).
Note that using emscripten_sleep requires the -sASYNCIFY link option to be passed to emscripten, which we already added.
In our example, we want to wait for read back during an iteration of the main loop, so we specify that we yield back to the browser:
bool ready = false;
{{Define callback and start mapping buffer}}
while (!ready) {
    wgpuPollEvents(device, true /* yieldToBrowser */);
}
You should now see Buffer 2 mapped with status 1 (1 being the value of BufferMapAsyncStatus::Success) when running your program. However, we never change the ready variable to true! So the program then halts forever… not great. That is why the next section shows how to pass some context to the callback.
Mapping context
So, we need the callback to access and mutate the ready variable. But how can we do this, since onBuffer2Mapped is a separate function whose signature cannot be changed? We can use the user pointer, like we did in the adapter request or the device request.
Note
When defining onBuffer2Mapped as a regular function, it is clear that ready is not accessible. When using a lambda expression like we did above, one could be tempted to add ready to the capture list (the brackets before the function arguments). But this does not work, because a capturing lambda has a different type that cannot be used as a regular callback. We see below that the C++ wrapper fixes this limitation.
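To make this limitation concrete, here is a sketch of what would not compile:
// Sketch: a capturing lambda cannot decay to a plain function pointer
// auto badCallback = [&ready](WGPUBufferMapAsyncStatus, void*) {
//     ready = true;
// };
// wgpuBufferMapAsync(buffer2, WGPUMapMode_Read, 0, 16, badCallback, nullptr);
// ^ Compilation error: only capture-less lambdas convert to the callback type.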
The user pointer is an argument that is provided to wgpuBufferMapAsync when setting up the callback, and that is then fed as-is to the callback onBuffer2Mapped when the map operation is ready. The buffer only forwards this pointer but never uses it: only you (the user of the API) interpret it.
// A first use of the 'pUserData' argument.
bool ready = false;
auto onBuffer2Mapped = [](WGPUBufferMapAsyncStatus status, void* pUserData) {
    // We know by convention with ourselves that the user data is a pointer to 'ready':
    bool* pReady = reinterpret_cast<bool*>(pUserData);
    // We set ready to 'true'
    *pReady = true;
    std::cout << "Buffer 2 mapped with status " << status << std::endl;
};
wgpuBufferMapAsync(buffer2, MapMode::Read, 0, 16, onBuffer2Mapped, (void*)&ready);
// Pass the address of 'ready' here: ^^^^^^^^^^^^
// A first use of the 'pUserData' argument.
bool ready = false;
auto onBuffer2Mapped = [](WGPUBufferMapAsyncStatus status, void* pUserData) {
    // We know by convention with ourselves that the user data is a pointer to 'ready':
    bool* pReady = reinterpret_cast<bool*>(pUserData);
    // We set ready to 'true'
    *pReady = true;
    std::cout << "Buffer 2 mapped with status " << status << std::endl;
};
wgpuBufferMapAsync(buffer2, WGPUMapMode_Read, 0, 16, onBuffer2Mapped, (void*)&ready);
// Pass the address of 'ready' here: ^^^^^^^^^^^^
Now, we need access to more than a status when running this callback: we need the buffer's content. But the onBuffer2Mapped function cannot have a second user pointer! Not a problem: we can define a context structure that holds all the fields we want to share with the callback, then pass the address of an instance of this context.
// The context shared between this main function and the callback.
struct Context {
    bool ready;
    Buffer buffer;
};
auto onBuffer2Mapped = [](WGPUBufferMapAsyncStatus status, void* pUserData) {
    Context* context = reinterpret_cast<Context*>(pUserData);
    context->ready = true;
    std::cout << "Buffer 2 mapped with status " << status << std::endl;
    if (status != BufferMapAsyncStatus::Success) return;
    {{Use context->buffer here}}
};
// Create the Context instance
Context context = { false, buffer2 };
wgpuBufferMapAsync(buffer2, MapMode::Read, 0, 16, onBuffer2Mapped, (void*)&context);
// Pass the address of the Context instance here: ^^^^^^^^^^^^^^
while (!context.ready) {
    // ^^^^^^^^^^^^^ Use context.ready here instead of ready
    wgpuPollEvents(device, true /* yieldToBrowser */);
}
// The context shared between this main function and the callback.
struct Context {
    bool ready;
    WGPUBuffer buffer;
};
auto onBuffer2Mapped = [](WGPUBufferMapAsyncStatus status, void* pUserData) {
    Context* context = reinterpret_cast<Context*>(pUserData);
    context->ready = true;
    std::cout << "Buffer 2 mapped with status " << status << std::endl;
    if (status != WGPUBufferMapAsyncStatus_Success) return;
    {{Use context->buffer here}}
};
// Create the Context instance
Context context = { false, buffer2 };
wgpuBufferMapAsync(buffer2, WGPUMapMode_Read, 0, 16, onBuffer2Mapped, (void*)&context);
// Pass the address of the Context instance here: ^^^^^^^^^^^^^^
while (!context.ready) {
    // ^^^^^^^^^^^^^ Use context.ready here instead of ready
    wgpuPollEvents(device, true /* yieldToBrowser */);
}
Tip
When the whole operation lives in a class like our Application, it can be convenient to simply use this as the user pointer, and thus retrieve the whole application object inside the callback.
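A minimal sketch of this pattern, assuming Application has an onBuffer2Mapped member function (the member function is hypothetical here):
// Sketch: forwarding the callback to a (hypothetical) member function
auto callback = [](WGPUBufferMapAsyncStatus status, void* pUserData) {
    Application* app = reinterpret_cast<Application*>(pUserData);
    app->onBuffer2Mapped(status); // dispatch to the hypothetical member function
};
wgpuBufferMapAsync(buffer2, MapMode::Read, 0, 16, callback, (void*)this);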
Using the mapped buffer
Once the buffer is mapped (either directly within the callback, or after the polling loop that makes sure it is ready), we use Buffer::getConstMappedRange (a.k.a. wgpuBufferGetConstMappedRange) to get a pointer to the CPU-side data. Once we are done with this CPU-side data, we must unmap the buffer:
// Get a pointer to wherever the driver mapped the GPU memory to the RAM
uint8_t* bufferData = (uint8_t*)context->buffer.getConstMappedRange(0, 16);
{{Do stuff with bufferData}}
// Then do not forget to unmap the memory
context->buffer.unmap();
// Get a pointer to wherever the driver mapped the GPU memory to the RAM
uint8_t* bufferData = (uint8_t*)wgpuBufferGetConstMappedRange(context->buffer, 0, 16);
{{Do stuff with bufferData}}
// Then do not forget to unmap the memory
wgpuBufferUnmap(context->buffer);
Note
When mapping the buffer in write mode, use Buffer::getMappedRange (a.k.a. wgpuBufferGetMappedRange) instead of the "const" version.
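For the record, writing through a mapping could look like this sketch (we do not need it in this chapter; buffer3 is a hypothetical extra buffer):
// Sketch: filling a (hypothetical) buffer3 through a write mapping
bufferDesc.usage = BufferUsage::MapWrite | BufferUsage::CopySrc;
bufferDesc.mappedAtCreation = true; // the buffer starts out already mapped
Buffer buffer3 = device.createBuffer(bufferDesc);
uint8_t* data = (uint8_t*)buffer3.getMappedRange(0, 16);
for (uint8_t i = 0; i < 16; ++i) data[i] = i;
buffer3.unmap();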
Back to our read example: we can simply display the content of the buffer and check that it corresponds to the data we initially fed in:
std::cout << "bufferData = [";
for (int i = 0; i < 16; ++i) {
    if (i > 0) std::cout << ", ";
    std::cout << (int)bufferData[i];
}
std::cout << "]" << std::endl;
Conclusion
Congratulations! We were able to create a GPU-side memory buffer, upload data into it, copy it remotely (an operation triggered from the CPU but executed on the GPU) using the command queue, and download the data back from the GPU. We can now use a buffer to specify vertex attributes, in particular vertex positions!
Resulting code: step031
Resulting code: step031-vanilla