
http://www.swedishcoding.com/2008/08/31/are-we-out-of-memory/






Are we out of memory?

August 31st, 2008 by Christian Gyrling — Game Coding — 8 Comments

If I had a dollar for every time I heard the question “Do we not have any more memory?”…
When I ask the question “How much memory are you using for subsystem X?”, very few developers know the answer. I usually get a rough ballpark figure, but no definite answer.
Memory is a crucial resource for any application. No matter what type of application you are creating, you will benefit from good memory management. With some basic memory management in place you can see how memory is distributed between systems, find memory leaks much more easily, simplify memory alignment requirements… These are just a few of the areas where good memory management pays off.

Getting the chaos under control

The first thing you need to do, unless you have already done it, is to override the global new and delete operators. Doing this gives you a starting point for all your memory management: all calls to new and delete will be handled by your custom functions. The very first step is to just intercept the allocation request and carry it out as normal.

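#include <cstdlib>
#include <new>

void* operator new(std::size_t size)
{
    // operator new must not return null, so report failure by throwing.
    if (void* mem = std::malloc(size ? size : 1))
        return mem;
    throw std::bad_alloc();
}

void operator delete(void* mem) noexcept
{
    std::free(mem);
}

// Don’t forget the array version of new/delete
void* operator new[](std::size_t size)
{
    if (void* mem = std::malloc(size ? size : 1))
        return mem;
    throw std::bad_alloc();
}

void operator delete[](void* mem) noexcept
{
    std::free(mem);
}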

I’m not going to go into any detail about intercepting calls to malloc or the like. All I will say is that you are best off staying away from alloc, malloc, calloc, realloc and free (and all other permutations that I have forgotten). In most cases it can be tricky to intercept those calls. I would say that you should only use malloc once, and that is to allocate all the needed memory for your application… but more about that later.

Custom versions of new and delete

You will quickly find that when an allocation request arrives in your ‘new’ handler it would be very handy to know who made that allocation. The code doing the allocation would sometimes also like to specify extra information, such as the alignment needed for the memory block. Aligned memory in particular is a pain to work with unless it is supported by the memory allocator. If you have a class whose members need to be 16-byte aligned (SSE for example), allocating it by hand gets messy.

// Allocate memory with the needed padding to ensure that you
// can properly align it to the required boundary. Then you need
// to placement construct the new object in that memory.
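char* pInstanceMem = new char[sizeof(MyClass) + 0x0F];
// Round UP to the next 16-byte boundary so the aligned address stays
// inside the allocated block (uintptr_t comes from <cstdint>).
char* pAlignedMem = reinterpret_cast<char*>(
    (reinterpret_cast<uintptr_t>(pInstanceMem) + 0x0F) & ~static_cast<uintptr_t>(0x0F));
MyClass* pMyClassInst = new (pAlignedMem) MyClass();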

// The allocation of the object is not terribly ugly but it is
// a pain to work with… though not as much of a pain as it is to
// delete the instance. The code below will crash or just leak.
// The destructor will be properly called but the memory address
// pointed to by pMyClassInst was never returned from a call
// to ‘new’. The memory address actually returned by the call
// to 'new' used for this object is really unknown at this point.
// What to do!?!?!?
delete pMyClassInst;

Wouldn’t it be much nicer if you could allocate your MyClass instance like this…

// Just allocate with an extra argument to ‘new’
MyClass* pMyClassInst = new (Align(16)) MyClass();

// and deletion will be straightforward… The pointer passed to
// ‘delete’ is the same pointer returned from the call to ‘new’.
delete pMyClassInst;

This is where overloaded versions of new and delete come to the rescue.

// We use a class to represent the alignment to avoid any code
// situations where 'new (0x12345678) MyClass()' and
// 'new ((void*)0x12345678) MyClass()' might cause a placement
// construction instead of construction on an aligned memory
// address.
class Align
{
public:
    explicit Align(int value) : m_value(value)
    {
    }
    int GetValue() const
    {
       return m_value;
    }
private:
    int m_value;
};

// Overridden 'normal' new/delete
void* operator new (size_t size);
void* operator new[] (size_t size);
void operator delete( void* mem );
void operator delete[]( void* mem );

// Aligned versions of new/delete
void* operator new[] (size_t size, Align alignment);
void* operator new (size_t size, Align alignment);
void operator delete (void* mem, Align alignment);
void operator delete[] (void* mem, Align alignment);
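
One possible way to back these declarations is sketched below. It is only a sketch under a few assumptions that are not in the post itself: the helper names AllocAligned and FreeAligned are made up for illustration, the alignment is assumed to be a power of two, and malloc stands in for whatever allocator you really use. Because a delete-expression always ends up in operator delete(void*) (the aligned delete only runs when a constructor throws, as the responses below point out), every allocation here goes through the same helper so that a plain ‘delete’ can free memory obtained from the aligned ‘new’.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Same Align class as above, repeated so the sketch compiles on its own.
class Align
{
public:
    explicit Align(int value) : m_value(value) {}
    int GetValue() const { return m_value; }
private:
    int m_value;
};

// Hypothetical helper: over-allocate, round up to the requested boundary and
// store the pointer returned by malloc just in front of the aligned address.
static void* AllocAligned(std::size_t size, std::size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0); // power of two
    if (alignment < alignof(std::max_align_t))
        alignment = alignof(std::max_align_t);
    void* pRaw = std::malloc(size + alignment + sizeof(void*));
    if (!pRaw)
        throw std::bad_alloc();
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(pRaw) + sizeof(void*);
    std::uintptr_t aligned =
        (start + alignment - 1) & ~(static_cast<std::uintptr_t>(alignment) - 1);
    reinterpret_cast<void**>(aligned)[-1] = pRaw; // remember the real block
    return reinterpret_cast<void*>(aligned);
}

// Hypothetical helper: look up the pointer stored by AllocAligned and free it.
static void FreeAligned(void* mem) noexcept
{
    if (mem)
        std::free(static_cast<void**>(mem)[-1]);
}

// 'Normal' new/delete use the same block layout as the aligned versions.
void* operator new(std::size_t size)   { return AllocAligned(size, alignof(std::max_align_t)); }
void* operator new[](std::size_t size) { return AllocAligned(size, alignof(std::max_align_t)); }
void operator delete(void* mem) noexcept   { FreeAligned(mem); }
void operator delete[](void* mem) noexcept { FreeAligned(mem); }

// Aligned versions of new/delete.
void* operator new(std::size_t size, Align alignment)
{
    return AllocAligned(size, static_cast<std::size_t>(alignment.GetValue()));
}
void* operator new[](std::size_t size, Align alignment)
{
    return AllocAligned(size, static_cast<std::size_t>(alignment.GetValue()));
}
// Only called automatically when a constructor throws during the aligned new.
void operator delete(void* mem, Align) noexcept   { FreeAligned(mem); }
void operator delete[](void* mem, Align) noexcept { FreeAligned(mem); }

struct MyClass { float m_values[4]; };

// Allocates with Align(64), checks the address, then frees with a plain delete.
bool DemoAlignedNew()
{
    MyClass* pMyClassInst = new (Align(64)) MyClass();
    bool isAligned = reinterpret_cast<std::uintptr_t>(pMyClassInst) % 64 == 0;
    delete pMyClassInst; // ends up in operator delete(void*)
    return isAligned;
}
```

The price of this layout is a pointer-sized header plus padding on every allocation, including the unaligned ones. That is the trade-off for keeping ‘delete’ oblivious to how the memory was requested.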

Allocate all memory up front

Now that you have ways to allocate memory and pass arbitrary parameters to the allocator, you can start organizing your memory.
If you need to ship an application that can only use 256 MB of RAM, I would suggest that you allocate 256 MB of system memory up front and call that ‘retail memory’. Most of the time you need more memory during development of various systems, and for this I would allocate another clump of memory… and call it ‘development memory’. You now have two huge buffers to use for your application and you should not allocate any more system memory. When you receive a call to ‘new’ you could check a variable to decide whether to allocate from the retail or the development memory pool.
By clearly separating the two memory pools you make sure that debug-only allocations end up in the development pool and allocations needed to ship the game are taken from the retail pool. This way it is very clear which pool each allocation comes from, and it is easy to keep track of the allocated/available memory. You could even use another custom overloaded new/delete that passes in whether the allocation should come from retail or development memory.
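
Here is a minimal sketch of what the two pools could look like. None of these names come from the original code: MemoryPool, InitMemoryPools, AllocFromPool and GetPoolBytesUsed are hypothetical, and each pool simply hands out memory linearly and never frees it, which is enough to show the bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical tag that selects the pool an allocation is taken from.
enum MemoryPool { kRetail, kDevelopment, kPoolCount };

struct PoolState
{
    char*       pBuffer; // start of the buffer grabbed at startup
    std::size_t size;    // total size of the buffer in bytes
    std::size_t used;    // bytes handed out so far
};

static PoolState g_pools[kPoolCount] = {};

// Grab both buffers from the system once, at startup.
bool InitMemoryPools(std::size_t retailSize, std::size_t developmentSize)
{
    g_pools[kRetail]      = { static_cast<char*>(std::malloc(retailSize)), retailSize, 0 };
    g_pools[kDevelopment] = { static_cast<char*>(std::malloc(developmentSize)), developmentSize, 0 };
    return g_pools[kRetail].pBuffer != nullptr && g_pools[kDevelopment].pBuffer != nullptr;
}

// Linear allocation from the chosen pool; returns nullptr when the pool is full.
void* AllocFromPool(MemoryPool pool, std::size_t size)
{
    assert(pool == kRetail || pool == kDevelopment);
    PoolState& state = g_pools[pool];
    // Round the size up so that the next allocation stays suitably aligned.
    const std::size_t align = alignof(std::max_align_t);
    size = (size + align - 1) & ~(align - 1);
    if (size > state.size - state.used)
        return nullptr;
    void* pMem = state.pBuffer + state.used;
    state.used += size;
    return pMem;
}

std::size_t GetPoolBytesUsed(MemoryPool pool) { return g_pools[pool].used; }
std::size_t GetPoolBytesFree(MemoryPool pool) { return g_pools[pool].size - g_pools[pool].used; }

// Overloaded new that takes the pool as an extra argument.
void* operator new(std::size_t size, MemoryPool pool)
{
    void* pMem = AllocFromPool(pool, size);
    if (!pMem)
        throw std::bad_alloc();
    return pMem;
}

// Only called automatically if the constructor throws; nothing to undo in
// this sketch because the pools never free individual blocks.
void operator delete(void*, MemoryPool) noexcept {}

// Example of a debug-only object that should never eat into retail memory.
struct DebugOverlay { char m_text[100]; };
```

With this in place, `new (kDevelopment) DebugOverlay()` takes its memory from the development pool, and GetPoolBytesUsed(kRetail) tells you at any time how close you are to the shipping budget. Note that an object created this way must not be passed to a plain ‘delete’ in this sketch, because the memory did not come from the global ‘new’.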

Divide and specialize

Now that you have your chunks of memory, it might be a good idea to split them up based on the needs of your application. After all, it’s terribly impractical to have only one huge buffer for an application. I am very much against dynamic allocations at run time in general, but some allocations have to happen, so what is the best way of handling them?

A good way to organize the memory is to split the memory chunks into smaller blocks managed by different allocators using various allocation algorithms. This will not just help to be more efficient about the memory usage but will also serve as a way to clearly assign the memory to the various systems.

Not all allocations are the same. Here are just a few common kinds of allocation that come to mind.

PERSISTENT ALLOCATIONS

Allocated once at startup/creation of a system (pools, ring buffers, arrays). This memory is never deleted, so we don’t need any complicated algorithm to allocate it. Linear allocation works great in this scenario. All such an allocator needs is a pointer to the buffer, the size of the buffer and the current offset into the buffer (the space allocated so far).

// Simple class to handle linear allocations
class LinearAllocator
{
public:
    LinearAllocator(char* pBuffer, size_t bufferSize) :
        m_pBuffer(pBuffer),
        m_bufferSize(bufferSize),
        m_currentOffset(0)
    {
    }
    void* Alloc(size_t size)
    {
       // Fail instead of handing out memory past the end of the buffer.
       if (size > m_bufferSize - m_currentOffset)
          return nullptr;
       void* pMemToReturn = m_pBuffer + m_currentOffset;
       // Advance the offset, not the returned pointer.
       m_currentOffset += size;
       return pMemToReturn;
    }
    void Free(void* /*pMem*/)
    {
       // We can't easily free memory in this type of allocator.
       // Therefore we just ignore this... or you could assert.
    }
private:
    char* m_pBuffer;
    size_t m_bufferSize;
    size_t m_currentOffset;
};

DYNAMIC ALLOCATIONS

This memory is allocated and freed at random points throughout the lifetime of the application. Sometimes you just need memory to create an instance of some object and you can’t predict when that will happen. This is when you can use a dynamic memory allocator. This allocator needs some type of algorithm to remember which memory blocks are free and which are allocated, and it has to consolidate neighboring free blocks. It will however suffer from fragmentation, which can greatly reduce the amount of memory available to you. There are tons of dynamic allocation algorithms out there, all with different characteristics: speed, overhead and more. Pick a simple one to start with… you can always change later.
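
To make this concrete, here is a sketch of about the simplest dynamic allocator I can think of: first-fit over a chain of blocks, each preceded by a small header. It is an illustration, not a recommendation. The class name FirstFitAllocator and all of its members are made up for this example, the buffer is assumed to be 16-byte aligned, and both Alloc and Free walk the whole chain, which is slow but easy to follow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

const std::size_t kBlockAlign = 16; // every block size is a multiple of this

class FirstFitAllocator
{
private:
    // Header stored in front of every block; padded to kBlockAlign bytes.
    struct alignas(16) Header
    {
        std::size_t size;   // payload size in bytes, excluding the header
        bool        isFree;
    };

public:
    FirstFitAllocator(char* pBuffer, std::size_t bufferSize)
        : m_pBuffer(pBuffer), m_bufferSize(bufferSize & ~(kBlockAlign - 1))
    {
        assert(reinterpret_cast<std::uintptr_t>(pBuffer) % kBlockAlign == 0);
        assert(m_bufferSize > sizeof(Header));
        // Start with one big free block covering the whole buffer.
        First()->size = m_bufferSize - sizeof(Header);
        First()->isFree = true;
    }

    void* Alloc(std::size_t size)
    {
        if (size == 0)
            size = 1;
        size = (size + kBlockAlign - 1) & ~(kBlockAlign - 1);
        for (Header* p = First(); p != End(); p = Next(p))
        {
            if (!p->isFree || p->size < size)
                continue;
            // Split the block if the remainder can hold a header and some payload.
            if (p->size - size >= sizeof(Header) + kBlockAlign)
            {
                Header* pRest = reinterpret_cast<Header*>(Payload(p) + size);
                pRest->size = p->size - size - sizeof(Header);
                pRest->isFree = true;
                p->size = size;
            }
            p->isFree = false;
            return Payload(p);
        }
        return nullptr; // no free block is large enough
    }

    void Free(void* pMem)
    {
        if (!pMem)
            return;
        Header* pBlock = reinterpret_cast<Header*>(static_cast<char*>(pMem) - sizeof(Header));
        pBlock->isFree = true;
        // Consolidate: merge every run of neighboring free blocks.
        for (Header* p = First(); p != End(); )
        {
            Header* pNext = Next(p);
            if (p->isFree && pNext != End() && pNext->isFree)
                p->size += sizeof(Header) + pNext->size; // absorb, then re-check p
            else
                p = pNext;
        }
    }

    // Fragmentation shows up as a gap between total free memory and this value.
    std::size_t LargestFreeBlock() const
    {
        std::size_t largest = 0;
        for (Header* p = First(); p != End(); p = Next(p))
            if (p->isFree && p->size > largest)
                largest = p->size;
        return largest;
    }

private:
    Header* First() const { return reinterpret_cast<Header*>(m_pBuffer); }
    Header* End() const   { return reinterpret_cast<Header*>(m_pBuffer + m_bufferSize); }
    static char* Payload(Header* p) { return reinterpret_cast<char*>(p) + sizeof(Header); }
    static Header* Next(Header* p)  { return reinterpret_cast<Header*>(Payload(p) + p->size); }

    char*       m_pBuffer;
    std::size_t m_bufferSize;
};
```

If you allocate three blocks and free only the first and the last, the free memory is split into two separate pieces and LargestFreeBlock() reports less than the total amount free. That is the fragmentation mentioned above. Once the middle block is freed as well, the consolidation pass merges everything back into one block.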

ONE-FRAME ALLOCATIONS

These allocations happen for example when one system creates data and another consumes it later in that same frame. It could be variable sized arrays used for rendering, animation or just debug text created by some system earlier in the frame. This allocator handles this by resetting itself at the beginning (or end) of every frame, hence freeing all memory automatically. By doing this we also avoid fragmentation of the memory. The above LinearAllocator can be used here as well with the addition of a simple ‘Reset()’ function.
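
As a rough sketch, a one-frame allocator is just the linear allocator plus that Reset() call. The name FrameAllocator and the GetBytesUsed() accessor are my own additions for this example.

```cpp
#include <cassert>
#include <cstddef>

// Linear allocator that is wiped once per frame.
class FrameAllocator
{
public:
    FrameAllocator(char* pBuffer, std::size_t bufferSize)
        : m_pBuffer(pBuffer), m_bufferSize(bufferSize), m_currentOffset(0)
    {
        assert(pBuffer != nullptr);
    }

    // Returns nullptr when the frame buffer is exhausted.
    void* Alloc(std::size_t size)
    {
        if (size > m_bufferSize - m_currentOffset)
            return nullptr;
        void* pMem = m_pBuffer + m_currentOffset;
        m_currentOffset += size;
        return pMem;
    }

    // Call at the beginning (or end) of every frame. Every pointer handed
    // out since the previous Reset() becomes invalid after this call.
    void Reset()
    {
        m_currentOffset = 0;
    }

    std::size_t GetBytesUsed() const { return m_currentOffset; }

private:
    char*       m_pBuffer;
    std::size_t m_bufferSize;
    std::size_t m_currentOffset;
};
```

Reset() does not run any destructors, so this kind of allocator is best kept for plain data such as vertex arrays or debug strings. Watching the peak value of GetBytesUsed() over a play session is also an easy way to size the buffer.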

Memory Map

Conclusion

Well… I hope this is useful for someone out there. This type of information has helped me when I have worked on my personal or professional projects.

Another thing I did not talk much about is memory tracking. I guess all I will say about that for now is… Do it! Spend the time to implement something quick and easy to help you track down memory leaks and find out where in your code you are using up all the memory. This is a large topic by itself and therefore I will not write about it in this post. Make use of the overloading of the new/delete operators to allow you to pass __FILE__ and __LINE__ for each allocation. Use macros or other things to make the code prettier.
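
A quick-and-easy tracker along those lines might look like the sketch below. Everything in it is an assumption made for illustration: the TRACKED_NEW macro, the fixed-size g_records table and the ReportLeaks() function are not from any real code base, the table is not thread safe, and malloc stands in for your own allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <new>

struct AllocRecord
{
    void*       pMem;
    std::size_t size;
    const char* file;
    int         line;
};

static const int   kMaxRecords = 1024; // arbitrary limit for this sketch
static AllocRecord g_records[kMaxRecords];
static int         g_recordCount = 0;

// Remove the record for pMem, if there is one.
static void Untrack(void* pMem) noexcept
{
    for (int i = 0; i < g_recordCount; ++i)
    {
        if (g_records[i].pMem == pMem)
        {
            g_records[i] = g_records[--g_recordCount];
            return;
        }
    }
}

// Untracked 'normal' new, kept on malloc so that one delete fits both paths.
void* operator new(std::size_t size)
{
    if (void* pMem = std::malloc(size ? size : 1))
        return pMem;
    throw std::bad_alloc();
}

// Tracked new: remembers who asked for the memory.
void* operator new(std::size_t size, const char* file, int line)
{
    assert(file != nullptr);
    void* pMem = std::malloc(size ? size : 1);
    if (!pMem)
        throw std::bad_alloc();
    if (g_recordCount < kMaxRecords)
        g_records[g_recordCount++] = { pMem, size, file, line };
    return pMem;
}

// Every delete-expression ends up here, tracked or not.
void operator delete(void* pMem) noexcept
{
    Untrack(pMem);
    std::free(pMem);
}

// Only called automatically if a constructor throws during a tracked new.
void operator delete(void* pMem, const char*, int) noexcept
{
    Untrack(pMem);
    std::free(pMem);
}

// The macro hides the extra arguments at the call site.
#define TRACKED_NEW new (__FILE__, __LINE__)

// Prints every allocation that is still alive and returns how many there are.
int ReportLeaks()
{
    for (int i = 0; i < g_recordCount; ++i)
        std::printf("%s(%d): %u bytes still allocated\n",
                    g_records[i].file, g_records[i].line,
                    static_cast<unsigned>(g_records[i].size));
    return g_recordCount;
}
```

Usage is then `MyClass* p = TRACKED_NEW MyClass();` followed by a normal `delete p;`, with a call to ReportLeaks() at shutdown. Array allocations would need the same treatment through operator new[].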

Until next time…

8 RESPONSES

  • Noel Llopis // Sep 2, 2008 at 5:51 am

    Very nice article, Christian. It’s very much how I like to deal with memory these days as well. I’m getting ready to write about memory allocation patterns for my column and this will definitely come in handy.

  • realtimecollisiondetection.net - the blog » Posts and links you should have read // Sep 2, 2008 at 7:34 am

    [...] a PC game developer). Learn the how’s and why’s by reading Christian Gyrling’s Are we out of memory? and Jesus de Santos Garcia’s post about Practical Efficient Memory Management (though the [...]

  • systematicgaming // Sep 3, 2008 at 10:54 pm

    This is a lot like my recent series on memory management – very similar ideas and implementations. That’s to be expected I suppose for similar applications (games) operating in similar environments.

  • James Bird // Sep 16, 2008 at 12:27 am

    I think there might be a bug in your alignment code.

    Assume that pInstanceMem = 3
    Then this would cause: pAlignedMem = 0

    Which is before the first byte of your allocated chunk of memory.

    I think what you meant to do was round up to the nearest multiple of 16, instead of down:

    pAlignedMem = (pInstanceMem + 0x0F) & ~(0x0F);

  • Christian Gyrling // Sep 16, 2008 at 9:46 am

    James… you’re right. Good catch. I did miss out on that when I wrote that example.
    I guess that will teach me to publish a post after midnight. :-)

  • Arseny Kapoulkine // Jan 3, 2009 at 2:23 am

    There is one minor problem with the code above – unless the code for delete(void*) and delete(void*, Align) is the same, it won’t work. delete ptr; always calls delete(void*), in fact the only case when delete(void*, Align) will be called is if ctor of the constructed class throws an exception.

  • Charles Nicholson // Jan 25, 2009 at 7:35 pm

    Hey Christian, Your “aligned delete” operator will only ever get called if the object being constructed throws when allocated with your “aligned new” operator.

    When you simply invoke ‘delete’ on a pointer that was allocated with your overloaded “aligned new”, your overloaded unaligned global operator delete will be called, not your “aligned delete” operator. See 5.3.4/17 and 5.3.4/19 for details.

    Or, run this code and see the compiler in action:

    #include <cstdio>
    #include <exception>

    enum MyToken
    {
        Charles
    };

    void* operator new(size_t size, MyToken token)
    {
        std::printf("MyToken new\n");
        return ::operator new(size);
    }

    void operator delete(void* p, MyToken token)
    {
        std::printf("MyToken delete\n");
        return ::operator delete(p);
    }

    struct Type
    {
        explicit Type(bool shouldThrow)
        {
            std::printf("Type ctor\n");
            if (shouldThrow)
                throw std::exception();
        }

        ~Type()
        {
            std::printf("Type dtor\n");
        }

        char c[256];
    };

    int main()
    {
        std::printf("---New/delete without ctor-throw:\n");
        delete new (Charles) Type(false);

        std::printf("---New with ctor-throw:\n");
        try
        {
            new (Charles) Type(true);
        }
        catch(const std::exception&)
        {
        }

        return 0;
    }

  • Jon B // Apr 14, 2010 at 7:18 am

    I think there may be a bug with your linear allocator…

    void* Alloc(size_t size)
    {
        void* pMemToReturn = m_pBuffer + m_currentOffset;
        pMemToReturn += size; // Oops..
        return pMemToReturn;
    }

    Should perhaps be…

    void* Alloc(size_t size)
    {
        void* pMemToReturn = m_pBuffer + m_currentOffset;
        m_currentOffset += size; // Thats better..
        return pMemToReturn;
    }
