2D Graphics with Direct3D (C++)

Careful! This article is for C++ developers using the native DirectX 9.0 COM interface!

Sometimes we need to render graphics in two dimensions instead of the now-usual three. This article shows you how to do so while taking full advantage of 3D accelerators. As in any of my articles, let's first see why we'd want to go through all the hassle instead of just using a set of blitting routines to render onto a DirectDraw surface or texture.

  • Using the graphics card to draw bitmaps and sprites is way faster than performing blits with the CPU these days. You can even perform other calculations on the CPU while the graphics card is rendering.
  • Automatic pixel format conversion. Even if your application uses the current desktop color depth, you never need to think about rewriting every drawing routine for RGB-5-6-5, XRGB-1-5-5-5, RGB-8-8-8 and XRGB-8-8-8-8 formats again.
  • Clipping will be done by the graphics card. The risk of having your fancy drawing algorithms writing outside the desired memory area is entirely eliminated.
  • Special effects like translucent bitmaps, zooming and rotating sprites, recoloring existing images or even using hardware shaders on your 2D images.

I am going to assume you know how to set up Direct3D, create a device and use it to at least clear the screen. Even if you're new to this, you can use the code accompanying the article or some of the tutorials available on the net. A basic understanding of C++ classes and STL containers is also recommended.

Over the course of this article, we will build a class named VertexDrawer which gives us a convenient way to render bitmaps and sprites while still performing optimal vertex batching and texture grouping.

Preparing for 2D

Let's start with something simple. We're going to draw some lines, using a vertex buffer to batch our lines' vertices into larger chunks so we can send them to the graphics card efficiently.

To render in 2D, we have to use (pre-)transformed vertices. A vertex marks the corner of a polygon or the start or end of a line. Pretransformed means that its coordinates are already given in screen space, so the graphics card will not modify it to account for the camera's position in 3D space. This is the structure for this kind of vertex in Direct3D:

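/// Pretransformed screen-coordinate vertex
/// Matching FVF: D3DFVF_XYZRHW | D3DFVF_DIFFUSE | D3DFVF_SPECULAR | D3DFVF_TEX1
struct PretransformedVertex {
  float X, Y, Z;          // Position in screen coordinates
  float RHW;              // Reciprocal of homogeneous W
  unsigned int Diffuse;   // Diffuse color (D3DCOLOR)
  unsigned int Specular;  // Specular color (D3DCOLOR)
  float U, V;             // Texture coordinates
};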

When using such vertices to draw lines and other 2D primitives, there's an interesting fact that we should pay attention to: in Direct3D 9, an integer screen coordinate addresses the center of a pixel, whereas the center of a texel (a pixel on a texture) lies half a texel away from the texture's edge. A quad whose corners sit on integer coordinates therefore samples the texture exactly on the boundary between two texels. Small rounding errors then decide which texel of the bitmap is chosen for a pixel, generating lots of slightly dislocated pixels for any sprite or bitmap we draw. To combat this, all we have to do is offset the screen coordinates by 0.5 pixels to the upper left.
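The half-pixel offset is easy to get wrong in one place and right in another, so it can help to funnel all vertex creation through one small function. The following is only a sketch: makeVertex() is a hypothetical helper that is not part of the article's code, and it assumes Z = 0 and RHW = 1 are suitable defaults for flat 2D drawing.

```cpp
// Same layout as the article's PretransformedVertex
struct PretransformedVertex {
  float X, Y, Z;
  float RHW;
  unsigned int Diffuse;
  unsigned int Specular;
  float U, V;
};

// Hypothetical helper (an assumption, not the article's API): builds a vertex
// for the integer pixel position (x, y), shifted 0.5 pixels up and to the left
inline PretransformedVertex makeVertex(
  int x, int y, unsigned int color, float u = 0.0f, float v = 0.0f
) {
  PretransformedVertex vertex;
  vertex.X = static_cast<float>(x) - 0.5f;
  vertex.Y = static_cast<float>(y) - 0.5f;
  vertex.Z = 0.0f;       // assumed: everything lies on one plane
  vertex.RHW = 1.0f;     // assumed: no perspective correction needed
  vertex.Diffuse = color;
  vertex.Specular = 0;
  vertex.U = u;
  vertex.V = v;
  return vertex;
}
```

With this in place, a call such as makeVertex(10, 20, 0xFFFF0000) yields a vertex at (9.5, 19.5), and no drawing routine has to remember the offset on its own.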

As you probably know, graphics cards can render large amounts of polygons, but to eliminate the call-overhead and processing cost, these polygons have to be sent to the graphics card in larger batches. In order to obtain good rendering performance, we'll need to build such batches from the vertices generated for the 2D primitives that are drawn.

When we render two-dimensional stuff, the depth buffer usually needs to be disabled, lighting calculations should be avoided, and backface culling is of no relevance to us. In Direct3D, these features are disabled by setting the corresponding RenderStates.

Line rendering

Let's start with a small class that batches vertices for lines and sends them to the Direct3D device for rendering, using a dynamic vertex buffer:

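#include <vector>
#include <comdef.h>
#include <d3d9.h>

// Smart pointer types for the Direct3D interfaces used below
_COM_SMARTPTR_TYPEDEF(IDirect3DDevice9, __uuidof(IDirect3DDevice9));
_COM_SMARTPTR_TYPEDEF(IDirect3DVertexBuffer9, __uuidof(IDirect3DVertexBuffer9));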

const size_t VertexBatchSize = 1024;

class VertexDrawer {
  public:
    VertexDrawer(const IDirect3DDevice9Ptr &spD3DDevice) :
      m_spD3DDevice(spD3DDevice),
      m_VertexBatches(1) {
      _com_util::CheckError(spD3DDevice->CreateVertexBuffer(...));
    }

    void drawLine(int x1, int y1, int x2, int y2, unsigned int color);
    void render();

  private:
    struct VertexBatch {
      VertexBatch() : UsedVertexCount(0) {}

      PretransformedVertex Vertices[VertexBatchSize];
      size_t UsedVertexCount;
    };
    typedef std::vector<VertexBatch> VertexBatchVector;

    IDirect3DDevice9Ptr m_spD3DDevice;
    IDirect3DVertexBuffer9Ptr m_spVertexBuffer;
    VertexBatchVector m_VertexBatches;
};

As you can see, the VertexDrawer manages a set of VertexBatches. The drawLine() method will append two vertices to the last VertexBatch and fill them with the line's starting and ending coordinates respectively.

When you call render(), it will lock the vertex buffer (usually using D3DLOCK_DISCARD), fill it with the batched vertices, unlock it and draw all lines contained in the batch using a single DrawPrimitive() call. I'll leave the trivial implementations of these two methods as an exercise for the reader.
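If you'd like a starting point for the batching half of that exercise, here is a minimal sketch. LineBatcher is a hypothetical stand-in for the VertexDrawer that makes no Direct3D calls at all, so it shows only how vertices are appended and when a new batch is opened. The batchCount() and lineCount() accessors are assumptions added for illustration.

```cpp
#include <cstddef>
#include <vector>

// Same layout as the article's PretransformedVertex
struct PretransformedVertex {
  float X, Y, Z;
  float RHW;
  unsigned int Diffuse;
  unsigned int Specular;
  float U, V;
};

const std::size_t VertexBatchSize = 1024;

struct VertexBatch {
  // Zero the vertex array so that copying a batch never reads garbage
  VertexBatch() : Vertices(), UsedVertexCount(0) {}

  PretransformedVertex Vertices[VertexBatchSize];
  std::size_t UsedVertexCount;
};

// Hypothetical stand-in for the batching part of VertexDrawer (no Direct3D)
class LineBatcher {
  public:
    LineBatcher() : m_VertexBatches(1) {}

    void drawLine(int x1, int y1, int x2, int y2, unsigned int color) {
      // A line needs two vertices; open a new batch if they no longer fit
      if(m_VertexBatches.back().UsedVertexCount + 2 > VertexBatchSize)
        m_VertexBatches.push_back(VertexBatch());

      VertexBatch &batch = m_VertexBatches.back();
      setVertex(batch.Vertices[batch.UsedVertexCount++], x1, y1, color);
      setVertex(batch.Vertices[batch.UsedVertexCount++], x2, y2, color);
    }

    std::size_t batchCount() const { return m_VertexBatches.size(); }
    std::size_t lineCount(std::size_t batchIndex) const {
      return m_VertexBatches[batchIndex].UsedVertexCount / 2;
    }

  private:
    // Fills one vertex, applying the half-pixel offset to the upper left
    static void setVertex(
      PretransformedVertex &vertex, int x, int y, unsigned int color
    ) {
      vertex.X = static_cast<float>(x) - 0.5f;
      vertex.Y = static_cast<float>(y) - 0.5f;
      vertex.Z = 0.0f;
      vertex.RHW = 1.0f;
      vertex.Diffuse = color;
      vertex.Specular = 0;
      vertex.U = 0.0f;
      vertex.V = 0.0f;
    }

    std::vector<VertexBatch> m_VertexBatches;
};
```

In the real class, render() would then walk the batches, copy each batch's used vertices into the locked vertex buffer and issue DrawPrimitive() with D3DPT_LINELIST and a primitive count of UsedVertexCount / 2.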

Rectangle rendering

Next, we're going to add the capability to render rectangles with our VertexDrawer. The lines were drawn using the D3DPT_LINELIST mode of Direct3D's DrawPrimitive() method. Rendering rectangles will require us to make a separate call to DrawPrimitive(), using D3DPT_TRIANGLELIST instead. This means that in order to keep the correct depth ordering of the drawing commands, we must not draw all rectangles and then all lines in two simple steps. Instead, the line drawing calls have to be interleaved with the rectangle drawing commands in our vertex batches.

This sounds more complicated than it is. We'll extend the VertexBatch structure with a list of rendering operations, each of which indicates what mode to use for DrawPrimitive(), where in the VertexBatch to begin and where to stop:

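#include <list>
#include <vector>
#include <comdef.h>
#include <d3d9.h>

// Smart pointer types for the Direct3D interfaces used below
_COM_SMARTPTR_TYPEDEF(IDirect3DDevice9, __uuidof(IDirect3DDevice9));
_COM_SMARTPTR_TYPEDEF(IDirect3DVertexBuffer9, __uuidof(IDirect3DVertexBuffer9));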

const size_t VertexBatchSize = 1024;

class VertexDrawer {
  public:
    VertexDrawer(const IDirect3DDevice9Ptr &spD3DDevice) :
      m_spD3DDevice(spD3DDevice),
      m_VertexBatches(1) {
      _com_util::CheckError(spD3DDevice->CreateVertexBuffer(...));
    }

    void drawLine(int x1, int y1, int x2, int y2, unsigned int color);
    void drawRectangle(int x1, int y1, int x2, int y2, unsigned int color);
    void render();

  private:
    struct VertexBatch {
      struct RenderOperation {
        D3DPRIMITIVETYPE PrimitiveType;
        size_t StartVertex;
        size_t EndVertex;
      };
      typedef std::list<RenderOperation> RenderOperationList;

      PretransformedVertex Vertices[VertexBatchSize];
      RenderOperationList RenderOperations;
    };
    typedef std::vector<VertexBatch> VertexBatchVector;

    IDirect3DDevice9Ptr m_spD3DDevice;
    IDirect3DVertexBuffer9Ptr m_spVertexBuffer;
    VertexBatchVector m_VertexBatches;
};

See the modification?

Each VertexBatch now also has a list of RenderOperations which specify what DrawPrimitive() commands to issue when the drawing is rendered.

Whenever a drawing method of the VertexDrawer is called, it has to check whether the last RenderOperation is of the required type (currently D3DPT_LINELIST or D3DPT_TRIANGLELIST). If the type does not match, a new RenderOperation has to be appended to the list, starting at the EndVertex of the previous one. The vertices can still be written to the same batch, allowing us to completely eliminate any vertex buffer switches while rendering.
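The check described above fits into a few lines. The sketch below uses a small enum as a stand-in for D3DPRIMITIVETYPE so that it builds without the DirectX headers, and appendVertices() is a hypothetical helper, not a method from the accompanying code. It also assumes that EndVertex points one past the last vertex of an operation and ignores the case where a batch runs full.

```cpp
#include <cstddef>
#include <list>

// Stand-in for the two D3DPRIMITIVETYPE values used so far
enum PrimitiveType { LineList, TriangleList };

struct RenderOperation {
  PrimitiveType Type;
  std::size_t StartVertex;
  std::size_t EndVertex; // assumed: one past the operation's last vertex
};
typedef std::list<RenderOperation> RenderOperationList;

// Hypothetical helper: reserves vertexCount vertices for a primitive of the
// given type and returns the index at which the caller should write them
std::size_t appendVertices(
  RenderOperationList &operations, PrimitiveType type, std::size_t vertexCount
) {
  // New vertices always continue where the previous operation stopped
  std::size_t start = operations.empty() ? 0 : operations.back().EndVertex;

  // Open a new operation only if the primitive type changes
  if(operations.empty() || operations.back().Type != type) {
    RenderOperation operation = { type, start, start };
    operations.push_back(operation);
  }

  operations.back().EndVertex += vertexCount;
  return start;
}
```

A drawLine() would call this with LineList and 2 vertices, a drawRectangle() with TriangleList and 6 vertices (two triangles). Two lines, then a rectangle, then another line end up as three operations covering the vertex ranges 0 to 4, 4 to 10 and 10 to 12.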

A working implementation of the drawLine() and drawRectangle() methods in this state of the VertexDrawer class is provided in the code accompanying the article.

Rendering bitmaps

Using bitmaps in the context of 3D accelerators means using textures, which have the habit of being forced into sizes that are a power of 2. But we'll take care of that later. Supporting rectangle rendering in our lines-only VertexDrawer required the introduction of RenderOperations to the VertexBatches. Changing the current texture also requires a separate DrawPrimitive() call, so we can just as well use those RenderOperations to change the current texture:

struct VertexBatch {
  struct RenderOperation {
    D3DPRIMITIVETYPE PrimitiveType;
    size_t StartVertex;
    size_t EndVertex;
    IDirect3DTexture9Ptr spTexture;
  };
  typedef std::list<RenderOperation> RenderOperationList;

  PretransformedVertex Vertices[VertexBatchSize];
  RenderOperationList RenderOperations;
};
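To see what such a list means at render time, the following sketch replays a list of operations and counts what a real render() would do. It is only an illustration: TextureId is an integer stand-in for IDirect3DTexture9Ptr (with 0 meaning "no texture"), and DrawStatistics and replay() are hypothetical names. The comments mark where the actual SetTexture() and DrawPrimitive() calls would go.

```cpp
#include <cstddef>
#include <list>

// Stand-ins so the sketch builds without the DirectX headers
enum PrimitiveType { LineList, TriangleList };
typedef int TextureId; // 0 = no texture

struct RenderOperation {
  PrimitiveType Type;
  std::size_t StartVertex;
  std::size_t EndVertex; // assumed: one past the operation's last vertex
  TextureId Texture;
};
typedef std::list<RenderOperation> RenderOperationList;

struct DrawStatistics {
  std::size_t TextureSwitches;
  std::size_t DrawCalls;
  std::size_t Primitives;
};

// Hypothetical replay of the operations, counting instead of drawing
DrawStatistics replay(const RenderOperationList &operations) {
  DrawStatistics statistics = { 0, 0, 0 };
  TextureId currentTexture = 0;

  for(
    RenderOperationList::const_iterator it = operations.begin();
    it != operations.end();
    ++it
  ) {
    // Only touch the device when the texture actually changes
    if(it->Texture != currentTexture) {
      currentTexture = it->Texture; // real code: SetTexture(0, ...)
      ++statistics.TextureSwitches;
    }

    // DrawPrimitive() expects a primitive count, not a vertex count
    std::size_t vertexCount = it->EndVertex - it->StartVertex;
    std::size_t verticesPerPrimitive = (it->Type == LineList) ? 2 : 3;
    statistics.Primitives += vertexCount / verticesPerPrimitive;

    // real code: DrawPrimitive(type, StartVertex, primitive count)
    ++statistics.DrawCalls;
  }

  return statistics;
}
```

Two textured rectangles with different textures followed by two untextured lines cost three texture switches and three draw calls here, which is exactly the overhead that grows with every additional bitmap.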

Doing hundreds of texture switches, one for each bitmap you're going to draw, would certainly kill rendering performance, so we'd like to eliminate texture switches. We also want to be able to render bitmaps of arbitrary sizes, maybe for drawing text to the screen, maybe just for convenience. There's a solution that handles all of this well, although it's quite a complicated one.

All bitmaps that are rendered have to be copied into large internal cache textures by the VertexDrawer, in such a way that as many bitmaps as possible are placed onto a single cache texture. Algorithms for this process are known as bin packing or 2D allocation algorithms, and they vary in space efficiency and in speed. We're going to use the simplest of all algorithms, the lower left boundary packer.

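/// Optimized rectangle allocator
struct RectanglePacker {
  RectanglePacker(size_t width, size_t height) :
    m_Width(width),
    m_Height(height) {}

  /// Tries to find a free spot for a rectangle of the given size
  bool placeRectangle(
    size_t width, size_t height,
    size_t &out_x, size_t &out_y
  );

  size_t m_Width;  ///< Width of the packing area
  size_t m_Height; ///< Height of the packing area
};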
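To make the idea of a packer more tangible, here is a deliberately simple one with the same placeRectangle() signature. Be aware that this is a plain shelf (row) packer, swapped in for illustration, and not the lower left boundary packer itself. ShelfPacker and its member names are assumptions. It fills the area row by row and never reuses the space left over above shorter rectangles in a row.

```cpp
#include <cstddef>

// Hypothetical shelf packer: places rectangles left to right in rows
class ShelfPacker {
  public:
    ShelfPacker(std::size_t width, std::size_t height) :
      m_Width(width),
      m_Height(height),
      m_CursorX(0),
      m_ShelfY(0),
      m_ShelfHeight(0) {}

    // Returns false (and changes nothing) if the rectangle does not fit
    bool placeRectangle(
      std::size_t width, std::size_t height,
      std::size_t &out_x, std::size_t &out_y
    ) {
      if(width > m_Width || height > m_Height)
        return false;

      std::size_t x = m_CursorX;
      std::size_t y = m_ShelfY;
      std::size_t shelfHeight = m_ShelfHeight;

      // Current row exhausted? Then continue on a new row behind it
      if(x + width > m_Width) {
        y += shelfHeight;
        x = 0;
        shelfHeight = 0;
      }

      if(y + height > m_Height)
        return false;

      // Commit the placement only after all checks have passed
      m_CursorX = x + width;
      m_ShelfY = y;
      m_ShelfHeight = (height > shelfHeight) ? height : shelfHeight;
      out_x = x;
      out_y = y;
      return true;
    }

  private:
    std::size_t m_Width, m_Height; // Size of the packing area
    std::size_t m_CursorX;         // Next free x position in the current row
    std::size_t m_ShelfY;          // y position of the current row
    std::size_t m_ShelfHeight;     // Height of the tallest rectangle in the row
};
```

On a 256x256 area, rectangles of 100x50 and 100x60 land next to each other at (0, 0) and (100, 0), while a third one of 100x40 no longer fits into that row and is placed at (0, 60). A better packer would waste less space, but the interface towards the VertexDrawer stays the same.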