Debugging Memory Problems

Credit: Will Ware

Problem

You’re developing C extensions, and you experience memory problems. You suspect mismanagement of reference counts and want to check whether your C extension code is managing reference counts correctly.

Solution

To chase these problems optimally, you need to alter Python’s sources and rebuild Python. Specifically, add this function in Objects/object.c immediately before the _Py_PrintReferences function:

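void
_Py_CountReferences(FILE *fp)
{
    long n = 0;
    PyObject *op;
    /* Walk the circular list of live objects and sum their reference
       counts. refchain is only the list head, so it is not counted. */
    for (op = refchain._ob_next; op != &refchain; op = op->_ob_next)
        n += op->ob_refcnt;
    fprintf(fp, "%ld refs\n", n);
}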

I place the following macros in my C extension:

#if defined(Py_DEBUG) || defined(DEBUG)
extern void _Py_CountReferences(FILE*);
#define CURIOUS(x) { fprintf(stderr, __FILE__ ":%d ", __LINE__); x; }
#else
#define CURIOUS(x)
#endif
#define MARKER()        CURIOUS(fprintf(stderr, "\n"))
#define DESCRIBE(x)     CURIOUS(fprintf(stderr, "  " #x "=%d\n", x))
#define DESCRIBE_HEX(x) CURIOUS(fprintf(stderr, "  " #x "=%08x\n", x))
#define COUNTREFS()     CURIOUS(_Py_CountReferences(stderr))

To debug, I rebuild Python using make OPT="-DPy_DEBUG", which causes the code under Py_TRACE_REFS to be built. My own makefile for my extensions does the same trick by including these lines:

debug:
        make clean; make OPT="-g -DPy_DEBUG" all
CFLAGS = $(OPT) -fpic -O2 -I/usr/local/include -I/usr/include/python1.5

Discussion

If I’m developing C extensions and I run into memory problems, I find that the typical cause is mismanagement of reference counts: misplaced Py_INCREF and Py_DECREF calls, or forgetting the reference-count effects of functions such as Py_BuildValue, PyArg_ParseTuple, PyTuple_SetItem, PyList_SetItem, PyTuple_GetItem, and PyList_GetItem. The 1.5.2 source code base offers some help with this (search for Py_TRACE_REFS), but I found it useful to add this recipe’s function in Objects/object.c just before _Py_PrintReferences.

Unlike _Py_PrintReferences, this recipe’s function will print only the total of all the reference counts in the system, so it can be used safely in loops that will repeat millions of times, whereas _Py_PrintReferences would print out way too many counts to be useful. This can help you identify errantly wandering Py_INCREFs and Py_DECREFs.
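
The traversal that produces this single total can be tried outside the interpreter. What follows is a minimal sketch, not CPython code: mock_object, chain_add, chain_count_refs, and demo_total are hypothetical names that stand in for the object header and the circular list head (refchain) that a Py_TRACE_REFS build maintains.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an object header threaded onto a doubly
   linked list; this is not CPython's real PyObject layout. */
typedef struct mock_object {
    struct mock_object *next;
    struct mock_object *prev;
    long refcnt;
} mock_object;

/* Link obj in right after the list head. */
static void chain_add(mock_object *head, mock_object *obj)
{
    assert(head != NULL && obj != NULL);
    obj->next = head->next;
    obj->prev = head;
    head->next->prev = obj;
    head->next = obj;
}

/* Sum the reference counts of every object on the chain. The head is a
   sentinel, so the walk stops when it comes back around to it. */
static long chain_count_refs(const mock_object *head)
{
    long total = 0;
    const mock_object *op;
    for (op = head->next; op != head; op = op->next)
        total += op->refcnt;
    return total;
}

/* Build a two-object chain and return its total. */
static long demo_total(void)
{
    mock_object head = { &head, &head, 1 };  /* empty circular list */
    mock_object a = { NULL, NULL, 2 };
    mock_object b = { NULL, NULL, 3 };

    chain_add(&head, &a);
    chain_add(&head, &b);
    return chain_count_refs(&head);  /* 2 + 3; the head's own count is excluded */
}
```

In this sketch demo_total returns 5. However long the chain grows, the result stays one number, which is what makes the total cheap enough to print inside a loop.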

So when I suspect that one of my functions is responsible for memory problems, I liberally sprinkle the suspect function with calls to the COUNTREFS macro. This allows me to keep track of exactly how many references are being created or destroyed as I go through my function. This is particularly useful in tight loops, in which dumb mistakes can cause reference counts to grow ridiculously fast. Also, reference counts that shrink too fast (overzealous use of Py_DECREF) can cause core dumps because the memory for objects that should still exist has been reallocated to new objects.
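
The before-and-after comparison that COUNTREFS makes possible can be illustrated without a debug build of Python. This is a sketch under stated assumptions: total_refs, mock_object, mock_incref, mock_decref, leaky_loop, balanced_loop, and refs_gained are all hypothetical stand-ins, with total_refs playing the role of the sum that _Py_CountReferences reports.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical running total, standing in for the figure that
   _Py_CountReferences would compute by walking the object chain. */
static long total_refs = 0;

typedef struct { long refcnt; } mock_object;

/* Stand-ins for Py_INCREF and Py_DECREF; not CPython's real macros. */
static void mock_incref(mock_object *op) { op->refcnt++; total_refs++; }

static void mock_decref(mock_object *op)
{
    assert(op->refcnt > 0);  /* dropping below zero is the over-DECREF bug */
    op->refcnt--;
    total_refs--;
}

/* Takes a reference on every pass but releases it only on even passes. */
static void leaky_loop(mock_object *op, int passes)
{
    int i;
    for (i = 0; i < passes; i++) {
        mock_incref(op);
        if (i % 2 != 0)
            continue;        /* bug: skips the matching mock_decref */
        mock_decref(op);
    }
}

/* Every reference taken is released again. */
static void balanced_loop(mock_object *op, int passes)
{
    int i;
    for (i = 0; i < passes; i++) {
        mock_incref(op);
        mock_decref(op);
    }
}

/* Snapshot the total before and after a call, as a pair of COUNTREFS()
   calls around the suspect code would, and return the difference. */
static long refs_gained(void (*fn)(mock_object *, int),
                        mock_object *op, int passes)
{
    long before = total_refs;
    fn(op, passes);
    fprintf(stderr, "%ld refs before, %ld refs after\n", before, total_refs);
    return total_refs - before;
}
```

With ten passes, refs_gained reports 0 for balanced_loop and 5 for leaky_loop, one leaked reference per odd-numbered pass. A steady drift of this kind between two COUNTREFS readings is the signal to look for in a real extension.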

See Also

The only documentation in this case is the source code (“Use the source, Luke!”).
