Sometimes, we're required to render graphics in two dimensions instead of the
now-usual three. In a 3D game or application, this is not as easy as it
sounds because just blitting stuff to the screen requires a dynamic texture and
might even stall the GPU's rendering queue, causing performance problems.
This article will show you how to render 2D overlays in a 3D scene taking full
advantage of 3D acceleration. First, let's see why we want to go through all
the hassle instead of just using a set of blitting routines to render onto
a DirectDraw surface or texture.
- Using the graphics card to draw bitmaps and sprites is way faster than
  performing blits with the CPU these days. You can even perform other
  calculations on the CPU while the graphics card is rendering.
- Automatic pixel format conversion. Even if your application uses the current
  desktop color depth, you never need to think about rewriting every drawing
  routine for RGB-5-6-5, XRGB-1-5-5-5, RGB-8-8-8 and XRGB-8-8-8-8 formats again.
- Clipping will be done by the graphics card. The risk of having your fancy
  drawing algorithms writing outside the desired memory area is entirely eliminated.
- Special effects like translucent bitmaps, zooming and rotating sprites,
  recoloring existing images or even using hardware shaders on your 2D images.
This article isn't finished!
I published it in the hope that it will be useful to someone researching methods
for rendering 2D overlays using a 3D graphics API. For .NET / XNA you can find
a complete implementation that renders any type of primitive here:
Nuclex Framework: PrimitiveBatch class.
I am going to assume you know how to set up Direct3D, create a device and use it to
at least clear the screen. Even if you're new to this, you can use the code
accompanying the article or some of the tutorials available on the net. A basic
understanding of C++ classes and STL containers is also recommended.
Over the course of this article, we will build a class named
VertexDrawer which provides us with a convenient way of rendering
bitmaps and sprites while still performing optimal vertex batching and
texture grouping.
Preparing for 2D
Let's start with something simple. We're going to draw some lines, using a
vertex buffer to batch our lines' vertices into larger chunks so we
can efficiently send them to the graphics card.
To render in 2D, we have to use (pre-)transformed vertices. A vertex marks the
corner of a polygon or the beginning/ending of a line, pretransformed meaning that
it will not be modified by the GPU to account for the camera's position in 3D space.
This is the structure for this kind of vertex in Direct3D:
/// Predefined screen-coordinate vertex
struct PretransformedVertex {
  float X, Y, Z;
  float RHW;
  unsigned int Diffuse;
  unsigned int Specular;
  float U, V;
};
When using such vertices to draw lines and other 2D primitives, there's an interesting
fact that we should pay attention to: in Direct3D 9, a whole-number screen coordinate
addresses the center of a pixel, whereas a whole-number texel coordinate addresses the
corner of a texel (a pixel on a texture). If you draw a sprite at whole-number
coordinates, each pixel therefore samples the bitmap exactly on the boundary between
texels, and small rounding errors will decide which texel is chosen for a pixel,
generating lots of slightly dislocated pixels for any sprite or bitmap we draw. To
combat this, all we have to do is offset the screen coordinates by 0.5 pixels
to the upper left, that is, subtract 0.5 from both X and Y.
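The half-pixel offset can be applied in one place when a vertex is built. The following is a minimal sketch, not code from the original article: the helper name makeScreenVertex() is hypothetical, and the choice of Z = 0 and RHW = 1 is an assumption that suits flat overlays.

```cpp
// Same layout as the PretransformedVertex structure shown above
struct PretransformedVertex {
  float X, Y, Z;
  float RHW;
  unsigned int Diffuse;
  unsigned int Specular;
  float U, V;
};

// Hypothetical helper (an assumption, not part of the article's code):
// builds a vertex from integer pixel coordinates and shifts it half a
// pixel up and to the left.
PretransformedVertex makeScreenVertex(int x, int y, unsigned int color) {
  PretransformedVertex vertex;
  vertex.X = static_cast<float>(x) - 0.5f;
  vertex.Y = static_cast<float>(y) - 0.5f;
  vertex.Z = 0.0f;      // assumed: depth is not used for 2D overlays
  vertex.RHW = 1.0f;    // assumed: no perspective for screen-space vertices
  vertex.Diffuse = color;
  vertex.Specular = 0;
  vertex.U = 0.0f;      // texture coordinates are unused for plain lines
  vertex.V = 0.0f;
  return vertex;
}
```

Keeping the offset inside a single helper means the drawing methods can keep working with plain integer pixel coordinates.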
As you probably know, graphics cards can render large amounts of polygons, but to
eliminate the call-overhead and processing cost, these polygons have to be sent to
the graphics card in larger batches. In order to obtain good rendering performance,
we'll need to build such batches from the vertices generated for the 2D primitives
that are drawn.
Usually, when we render two-dimensional stuff, the depth buffer needs to be disabled,
lighting calculations avoided, and backface culling is of no relevance
to us. In Direct3D, these features are disabled by setting the corresponding render
states (for example D3DRS_ZENABLE, D3DRS_LIGHTING and D3DRS_CULLMODE) accordingly.
Line rendering
Let's start with a small class that batches vertices for lines and sends them to
the Direct3D device for rendering, using a dynamic vertex buffer:
#include <vector>
#include <comdef.h>
#include <d3d9.h>

// IDirect3DDevice9Ptr and IDirect3DVertexBuffer9Ptr are assumed to be COM smart
// pointers declared elsewhere (for example with _COM_SMARTPTR_TYPEDEF)

const size_t VertexBatchSize = 1024;

class VertexDrawer {
  public:
    VertexDrawer(const IDirect3DDevice9Ptr &spD3DDevice) :
      m_spD3DDevice(spD3DDevice),
      m_VertexBatches(1) {
      _com_util::CheckError(spD3DDevice->CreateVertexBuffer(...));
    }

    void drawLine(int x1, int y1, int x2, int y2, unsigned int color);
    void render();

  private:
    struct VertexBatch {
      VertexBatch() : UsedVertexCount(0) {}
      PretransformedVertex Vertices[VertexBatchSize];
      size_t UsedVertexCount; // number of vertices filled in so far
    };
    typedef std::vector<VertexBatch> VertexBatchVector;

    IDirect3DDevice9Ptr m_spD3DDevice;
    IDirect3DVertexBuffer9Ptr m_spVertexBuffer;
    VertexBatchVector m_VertexBatches;
};
As you can see, the VertexDrawer manages a set of
VertexBatches. The drawLine() method will append two
vertices to the last VertexBatch and fill them with the line's
starting and ending coordinates respectively.
When you call render(), it will lock the vertex buffer (usually using
D3DLOCK_DISCARD), fill it with the batched vertices,
unlock it and draw all lines contained in the batch using a single
DrawPrimitive() call. I'll leave the trivial implementations of
these two methods as an exercise to the reader.
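As a starting point for that exercise, the batching half of drawLine() might look roughly like the sketch below. It is deliberately free of Direct3D calls so it compiles anywhere: the class name LineBatcher, the batches() accessor and the setVertex() helper are hypothetical stand-ins, and opening a new batch once the current one is full is an assumption about how overflow is handled.

```cpp
#include <cstddef>
#include <vector>

const std::size_t VertexBatchSize = 1024;

struct PretransformedVertex {
  float X, Y, Z;
  float RHW;
  unsigned int Diffuse;
  unsigned int Specular;
  float U, V;
};

struct VertexBatch {
  VertexBatch() : Vertices(), UsedVertexCount(0) {}
  PretransformedVertex Vertices[VertexBatchSize];
  std::size_t UsedVertexCount;
};

// Hypothetical stand-in for the CPU-side part of VertexDrawer
class LineBatcher {
  public:
    LineBatcher() : m_VertexBatches(1) {}

    void drawLine(int x1, int y1, int x2, int y2, unsigned int color) {
      // A line needs two vertices; open a new batch if they no longer fit
      if(m_VertexBatches.back().UsedVertexCount + 2 > VertexBatchSize) {
        m_VertexBatches.push_back(VertexBatch());
      }
      VertexBatch &batch = m_VertexBatches.back();
      setVertex(batch.Vertices[batch.UsedVertexCount++], x1, y1, color);
      setVertex(batch.Vertices[batch.UsedVertexCount++], x2, y2, color);
    }

    const std::vector<VertexBatch> &batches() const { return m_VertexBatches; }

  private:
    // Fills one vertex, applying the half-pixel offset to the upper left
    static void setVertex(
      PretransformedVertex &vertex, int x, int y, unsigned int color
    ) {
      vertex.X = static_cast<float>(x) - 0.5f;
      vertex.Y = static_cast<float>(y) - 0.5f;
      vertex.Z = 0.0f;
      vertex.RHW = 1.0f;
      vertex.Diffuse = color;
      vertex.Specular = 0;
      vertex.U = 0.0f;
      vertex.V = 0.0f;
    }

    std::vector<VertexBatch> m_VertexBatches;
};
```

A matching render() would then walk the batches, copy each one into the locked vertex buffer and issue DrawPrimitive() with D3DPT_LINELIST and a primitive count of UsedVertexCount / 2, since every line consumes two vertices.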
Rectangle rendering
Next, we're going to add the capability to render rectangles with our
VertexDrawer. The lines were drawn using the D3DPT_LINELIST
mode of Direct3D's DrawPrimitive() method. Rendering rectangles will
require us to do a separate call to DrawPrimitive(), using
D3DPT_TRIANGLELIST instead. This means that in order to keep
the correct depth ordering of the drawing commands, we must not draw all rectangles
and then all lines in two simple steps, but the line drawing calls have to be mixed
with the rectangle drawing commands in our vertex batches.
This sounds more complicated than it is. We'll extend the VertexBatch structure
with a list of rendering operations, each of which indicates what mode to use for
DrawPrimitive(), where in the VertexBatch to begin and where
to stop:
#include <list>
#include <vector>
#include <comdef.h>
#include <d3d9.h>

const size_t VertexBatchSize = 1024;

class VertexDrawer {
  public:
    VertexDrawer(const IDirect3DDevice9Ptr &spD3DDevice) :
      m_spD3DDevice(spD3DDevice),
      m_VertexBatches(1) {
      _com_util::CheckError(spD3DDevice->CreateVertexBuffer(...));
    }

    void drawLine(int x1, int y1, int x2, int y2, unsigned int color);
    void drawRectangle(int x1, int y1, int x2, int y2, unsigned int color);
    void render();

  private:
    struct VertexBatch {
      // One DrawPrimitive() call covering a range of vertices in the batch
      struct RenderOperation {
        D3DPRIMITIVETYPE PrimitiveType;
        size_t StartVertex;
        size_t EndVertex;
      };
      typedef std::list<RenderOperation> RenderOperationList;

      PretransformedVertex Vertices[VertexBatchSize];
      RenderOperationList RenderOperations;
    };
    typedef std::vector<VertexBatch> VertexBatchVector;

    IDirect3DDevice9Ptr m_spD3DDevice;
    IDirect3DVertexBuffer9Ptr m_spVertexBuffer;
    VertexBatchVector m_VertexBatches;
};
Each VertexBatch now also has a list of RenderOperations
which specify what DrawPrimitive() commands to issue when the drawing
is rendered.
Whenever a drawing method of the VertexDrawer is called, it has to
check whether the last RenderOperation is of the required type
(D3DPT_LINELIST or D3DPT_TRIANGLELIST currently). If
the type does not match, a new RenderOperation has to be appended to
the list, starting at the EndVertex of the previous one. The vertices
can still be written to the same vertex array, allowing us to completely eliminate any
vertex buffer switches while rendering.
A working implementation of the drawLine() and
drawRectangle() methods in this state of the VertexDrawer
class is provided in the code accompanying the article.
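For readers without the accompanying code at hand, the bookkeeping described above can be sketched in isolation. This is an illustrative sketch under stated assumptions, not the article's implementation: the PrimitiveType enum stands in for D3DPRIMITIVETYPE so the example compiles without d3d9.h, the OperationRecorder class and its reserve() method are hypothetical names, and the overflow of a full VertexBatch is ignored.

```cpp
#include <cstddef>
#include <list>

// Stand-in for Direct3D's D3DPRIMITIVETYPE (assumption for portability)
enum PrimitiveType { LineList, TriangleList };

struct RenderOperation {
  PrimitiveType Type;
  std::size_t StartVertex;
  std::size_t EndVertex;
};

// Hypothetical helper that tracks which vertex ranges use which mode
class OperationRecorder {
  public:
    // Reserves 'vertexCount' vertices for a primitive of the given type and
    // returns the index at which the caller should write them
    std::size_t reserve(PrimitiveType type, std::size_t vertexCount) {
      std::size_t start =
        m_Operations.empty() ? 0 : m_Operations.back().EndVertex;

      // Start a new operation only when the primitive type changes
      if(m_Operations.empty() || (m_Operations.back().Type != type)) {
        RenderOperation operation = { type, start, start };
        m_Operations.push_back(operation);
      }

      // Otherwise the last operation simply grows to cover the new vertices
      m_Operations.back().EndVertex += vertexCount;
      return start;
    }

    const std::list<RenderOperation> &operations() const {
      return m_Operations;
    }

  private:
    std::list<RenderOperation> m_Operations;
};
```

With this scheme, drawLine() would reserve 2 vertices as a line list and drawRectangle() would reserve 6 vertices as a triangle list (two triangles). Consecutive primitives of the same kind collapse into a single operation, so drawing order is preserved while the number of DrawPrimitive() calls stays small.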
Rendering bitmaps
Using bitmaps in the context of 3D accelerators means using textures, which have
the habit of being forced into sizes that are a power of 2. But we'll take care of
that later. Supporting rectangle rendering in our lines-only VertexDrawer
required the introduction of RenderOperations to the
VertexBatches. Changing the current texture will also require a separate
DrawPrimitive() call, so we can just as well use those
RenderOperations to change the current texture:
struct VertexBatch {
  struct RenderOperation {
    D3DPRIMITIVETYPE PrimitiveType;
    size_t StartVertex;
    size_t EndVertex;
    IDirect3DTexture9Ptr spTexture;
  };
  typedef std::list<RenderOperation> RenderOperationList;

  PretransformedVertex Vertices[VertexBatchSize];
  RenderOperationList RenderOperations;
};
Doing hundreds of texture switches, one for each bitmap you're going to draw, would
certainly kill rendering performance, so we'd like to eliminate texture switches. We
also want to be able to render bitmaps of arbitrary sizes, maybe for drawing text to
the screen, maybe just for convenience. There's a solution that works well for all this,
although it is quite a complicated one.
All bitmaps that are rendered have to be copied into large internal cache textures by
the VertexDrawer in such a way that as many bitmaps as possible are placed onto
a single cache texture. Algorithms for this process are known as bin packing
or 2D allocation algorithms, varying in space efficiency and in speed. We're going to
use the simplest of these algorithms, the lower left boundary packer.
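/// Optimized rectangle allocator
struct RectanglePacker {
  RectanglePacker(size_t width, size_t height) :
    m_Width(width),
    m_Height(height) {}

  /// Tries to find a free spot for a rectangle of the given size
  bool placeRectangle(
    size_t width, size_t height,
    size_t &out_x, size_t &out_y
  );

  private:
    size_t m_Width;  // width of the packing area
    size_t m_Height; // height of the packing area
};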
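To make the placeRectangle() interface more concrete, here is a minimal sketch of one very simple packing strategy. Note that this is a plain row-by-row packer chosen for brevity, not necessarily the lower left boundary packer the article refers to: rectangles are placed from left to right, and a new row is started below the tallest rectangle of the current row once the right edge is reached. The class name SimpleRectanglePacker and the members m_CurrentX, m_CurrentY and m_RowHeight are assumptions made for this example.

```cpp
#include <cstddef>

// Hypothetical row-by-row packer with the same interface as RectanglePacker
class SimpleRectanglePacker {
  public:
    SimpleRectanglePacker(std::size_t width, std::size_t height) :
      m_Width(width), m_Height(height),
      m_CurrentX(0), m_CurrentY(0), m_RowHeight(0) {}

    bool placeRectangle(
      std::size_t width, std::size_t height,
      std::size_t &out_x, std::size_t &out_y
    ) {
      // Rectangles larger than the whole packing area can never fit
      if((width > m_Width) || (height > m_Height)) {
        return false;
      }

      // Work on copies so a failed placement leaves the packer unchanged
      std::size_t x = m_CurrentX, y = m_CurrentY, rowHeight = m_RowHeight;

      // Not enough room left in this row: continue in a new row below it
      if(width > m_Width - x) {
        x = 0;
        y += rowHeight;
        rowHeight = 0;
      }

      // Out of vertical space: the caller needs another cache texture
      if(height > m_Height - y) {
        return false;
      }

      out_x = x;
      out_y = y;

      m_CurrentX = x + width;
      m_CurrentY = y;
      m_RowHeight = (height > rowHeight) ? height : rowHeight;
      return true;
    }

  private:
    std::size_t m_Width, m_Height;      // size of the packing area
    std::size_t m_CurrentX, m_CurrentY; // next free position in the row
    std::size_t m_RowHeight;            // tallest rectangle in the row
};
```

A packer like this wastes space when bitmap heights vary a lot within a row, which is the trade-off between speed and space efficiency mentioned above. When placeRectangle() returns false, the caller would typically allocate a fresh cache texture with its own packer.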