[BUG] Memory leaks when casting raw pointers to python types #2742

Open
@bstaletic

Issue description

Once again I've been playing with my toys: valgrind, pybind11@#2741 and CPython 3.9.1.

Good news: We're almost there. Almost a clean run.
Bad news: Some casters leak like a broken faucet.

More specifically, py::cast(new int{}) leaks, and so do all the other casters that copy the data; list_caster from stl.h is another example. Another thing that leaks is returning new CustomClass{} with py::return_value_policy::move.

While one could argue that py::cast(new int{}); is a user error, that argument doesn't take the following into account:

m.def("f", []{ return new int{};});
m.def("g", []{ return new Class{};}, py::return_value_policy::move);

Even our docs have return new Example(); and "don't mix things that don't play well together" is, in this case, a very subtle contract to introduce after so many years of people assuming that it just works (tm).

Reproducible example code

#include <pybind11/pybind11.h>
struct s{};
PYBIND11_MODULE(test, m) {
    pybind11::class_<s>(m, "s");
    m.def("leak1", []{ return new int{}; });
    m.def("no_leak", []{ return new s{}; });
    m.def("leak2", []{ return new s{}; }, pybind11::return_value_policy::move);
}

Calling all three functions results in these valgrind errors:

==7672== HEAP SUMMARY:
==7672==     in use at exit: 811,198 bytes in 5,907 blocks
==7672==   total heap usage: 32,402 allocs, 26,495 frees, 4,805,059 bytes allocated
==7672== 
==7672== 1 bytes in 1 blocks are definitely lost in loss record 2 of 1,971
==7672==    at 0x483ADEF: operator new(unsigned long) (vg_replace_malloc.c:342)
==7672==    by 0x5B58637: pybind11_init_test(pybind11::module_&)::{lambda()#3}::operator()() const (foo.cpp:7)
==7672==    by 0x5B59BD5: s* pybind11::detail::argument_loader<>::call_impl<s*, pybind11_init_test(pybind11::module_&)::{lambda()#3}&, , pybind11::detail::void_type>(pybind11_init_test(pybind11::module_&)::{lambda()#3}&, std::integer_sequence<unsigned long>, pybind11::detail::void_type&&) && (cast.h:2022)
==7672==    by 0x5B594A9: std::enable_if<!std::is_void<s*>::value, std::is_void>::type pybind11::detail::argument_loader<>::call<s*, pybind11::detail::void_type, pybind11_init_test(pybind11::module_&)::{lambda()#3}&>(pybind11_init_test(pybind11::module_&)::{lambda()#3}&) && (cast.h:1994)
==7672==    by 0x5B5915C: void pybind11::cpp_function::initialize<pybind11_init_test(pybind11::module_&)::{lambda()#3}, s*, , pybind11::name, pybind11::scope, pybind11::sibling, pybind11::return_value_policy>(pybind11_init_test(pybind11::module_&)::{lambda()#3}&&, s* (*)(), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call) const (pybind11.h:184)
==7672==    by 0x5B591C3: void pybind11::cpp_function::initialize<pybind11_init_test(pybind11::module_&)::{lambda()#3}, s*, , pybind11::name, pybind11::scope, pybind11::sibling, pybind11::return_value_policy>(pybind11_init_test(pybind11::module_&)::{lambda()#3}&&, s* (*)(), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&, pybind11::return_value_policy const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call) (pybind11.h:161)
==7672==    by 0x5B67099: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (pybind11.h:717)
==7672==    by 0x498F13D: cfunction_call (methodobject.c:539)
==7672==    by 0x493C463: _PyObject_MakeTpCall (call.c:191)
==7672==    by 0x4A4962F: _PyObject_VectorcallTstate (abstract.h:116)
==7672==    by 0x4A4CF66: PyObject_Vectorcall (abstract.h:127)
==7672==    by 0x4A4CF66: call_function (ceval.c:5072)
==7672==    by 0x4A5946C: _PyEval_EvalFrameDefault (ceval.c:3487)
==7672== 
==7672== 4 bytes in 1 blocks are definitely lost in loss record 3 of 1,971
==7672==    at 0x483ADEF: operator new(unsigned long) (vg_replace_malloc.c:342)
==7672==    by 0x5B58601: pybind11_init_test(pybind11::module_&)::{lambda()#1}::operator()() const (foo.cpp:5)
==7672==    by 0x5B59B65: int* pybind11::detail::argument_loader<>::call_impl<int*, pybind11_init_test(pybind11::module_&)::{lambda()#1}&, , pybind11::detail::void_type>(pybind11_init_test(pybind11::module_&)::{lambda()#1}&, std::integer_sequence<unsigned long>, pybind11::detail::void_type&&) && (cast.h:2022)
==7672==    by 0x5B593D5: std::enable_if<!std::is_void<int*>::value, std::is_void>::type pybind11::detail::argument_loader<>::call<int*, pybind11::detail::void_type, pybind11_init_test(pybind11::module_&)::{lambda()#1}&>(pybind11_init_test(pybind11::module_&)::{lambda()#1}&) && (cast.h:1994)
==7672==    by 0x5B58D48: void pybind11::cpp_function::initialize<pybind11_init_test(pybind11::module_&)::{lambda()#1}, int*, , pybind11::name, pybind11::scope, pybind11::sibling>(pybind11_init_test(pybind11::module_&)::{lambda()#1}&&, int* (*)(), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#3}::operator()(pybind11::detail::function_call) const (pybind11.h:183)
==7672==    by 0x5B58DAF: void pybind11::cpp_function::initialize<pybind11_init_test(pybind11::module_&)::{lambda()#1}, int*, , pybind11::name, pybind11::scope, pybind11::sibling>(pybind11_init_test(pybind11::module_&)::{lambda()#1}&&, int* (*)(), pybind11::name const&, pybind11::scope const&, pybind11::sibling const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call) (pybind11.h:161)
==7672==    by 0x5B67099: pybind11::cpp_function::dispatcher(_object*, _object*, _object*) (pybind11.h:717)
==7672==    by 0x498F13D: cfunction_call (methodobject.c:539)
==7672==    by 0x493C463: _PyObject_MakeTpCall (call.c:191)
==7672==    by 0x4A4962F: _PyObject_VectorcallTstate (abstract.h:116)
==7672==    by 0x4A4CF66: PyObject_Vectorcall (abstract.h:127)
==7672==    by 0x4A4CF66: call_function (ceval.c:5072)
==7672==    by 0x4A5946C: _PyEval_EvalFrameDefault (ceval.c:3487)

In case people find it easier to parse, here is the embedded interpreter version:

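#include <pybind11/pybind11.h>
#include <pybind11/embed.h>

int main() {
    pybind11::scoped_interpreter guard{}; // The interpreter must be running before any Python object is created
    pybind11::cpp_function([]{ return new int{}; }, pybind11::name("f"))(); // Remember to call the function
}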

According to @YannickJadoul's analysis, this happens because the affected casters don't end up owning the returned pointer. The casters copy the data and end up owning the copied data, but leak the returned pointer.
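To make the ownership problem described above concrete, here is a minimal sketch in plain C++ that models it without pybind11. Every name in it (Tracked, PyValue, cast_copy, cast_adopt, live_count) is hypothetical and invented for illustration; none of this is pybind11's actual caster code. cast_copy mimics a caster that copies the pointee and never takes ownership of the returned pointer (the leak1/leak2 behaviour), while cast_adopt mimics a wrapper that adopts the pointer (the no_leak behaviour).

```cpp
#include <cassert>
#include <memory>

// Hypothetical model only: these names are NOT pybind11 API.
static int live_count = 0;  // number of Tracked objects currently alive

struct Tracked {
    int value = 0;
    Tracked() { ++live_count; }
    Tracked(const Tracked& other) : value(other.value) { ++live_count; }
    ~Tracked() { --live_count; }
};

// Stand-in for the Python-side object; it owns whatever it stores.
struct PyValue {
    std::unique_ptr<Tracked> owned;
};

// Models a copying caster: the result owns a copy of *src,
// but nobody ever deletes src itself, so src leaks.
PyValue cast_copy(const Tracked* src) {
    assert(src != nullptr);
    return PyValue{std::make_unique<Tracked>(*src)};
}

// Models a wrapper that adopts the pointer: the result owns src directly,
// so src is destroyed together with the result.
PyValue cast_adopt(Tracked* src) {
    assert(src != nullptr);
    return PyValue{std::unique_ptr<Tracked>(src)};
}
```

In this model, once the PyValue returned by cast_copy(new Tracked{}) is destroyed, one Tracked is still alive (the original allocation), whereas after cast_adopt(new Tracked{}) none are. That asymmetry mirrors the difference between leak1/leak2 and no_leak in the valgrind output above.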

Labels: bug, casters (Related to casters, and to be taken into account when analyzing/reworking casters)