Discussion:
Building a tool which links with librados
David Zafman
2014-08-21 23:37:14 UTC
Has anyone seen anything like this from an application linked with librados using valgrind? Or a Segmentation fault on exit from such an application?

Invalid free() / delete / delete[] / realloc()
at 0x4C2A4BC: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0x8195C12: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.16)
by 0x13890F3: coll_t::~coll_t() (osd_types.h:468)
by 0x8944DEC: __cxa_finalize (cxa_finalize.c:56)
by 0x6E1CEC5: ??? (in /src/ceph/src/.libs/librados.so.2.0.0)
by 0x725F400: ??? (in /src/ceph/src/.libs/librados.so.2.0.0)
by 0x89449D0: __run_exit_handlers (exit.c:78)
by 0x8944A54: exit (exit.c:100)
by 0x137FF37: usage(boost::program_options::options_description&) (ceph_objectstore_tool.cc:1794)
by 0x1380572: main (ceph_objectstore_tool.cc:1849)

David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Gregory Farnum
2014-08-21 23:50:48 UTC
This looks fairly strange to me: why does ceph_objectstore_tool do
anything with librados? I thought it was just hitting the OSD
filesystem structure directly.
Also note that the crash appears to be underneath the coll_t
destructor, probably in destroying its string. That combined with the
weird librados presence makes me think memory corruption is running
over the stack somewhere.
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
Sage Weil
2014-08-21 23:56:28 UTC
Ah, this was fixed in 5d79605319fcde330bccce5e1b07276a98be02de in the
wip-libcommon branch. Part of the problem is that we link libcommon both
statically (ceph-objectstore-tool) and dynamically (librados) at the same
time. The easy fix here is not linking librados at all.

Not sure why we see this only sometimes rather than always. Maybe link
order? In any case, wip-libcommon moves libcommon.la into a .so shared
between librados and the binary using it, which avoids the problem. That
makes things slightly more restrictive with mixed versions, but I suspect
it is worth it to avoid this sort of pain.

Can you cherry-pick that commit and see if it resolves this for you?
And/or merge in that entire branch?

sage
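
To make the static-plus-dynamic hazard above concrete, here is a toy C++ model. It assumes (this is not stated in the thread) that each copy of libcommon runs its own initializer for the same global, that symbol resolution binds both copies to a single object address, and that each copy therefore registers its own destructor call for that address. That would be consistent with the invalid free under coll_t::~coll_t() in the trace, but it is only one plausible reading. All names below (ExitHandlerModel, mixed_linking_case, single_shared_copy_case) are hypothetical and exist in neither Ceph nor glibc.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Toy model only: it records which addresses would have a destructor run
// at exit. Nothing is really constructed or freed here.
struct ExitHandlerModel {
    std::vector<const void*> registered;  // addresses, in registration order

    // Models the initializer that one image (executable or shared library)
    // runs for a global object: it registers one destructor call.
    void register_destructor(const void* object_address) {
        assert(object_address != nullptr);
        registered.push_back(object_address);
    }

    // Models process exit: handlers run in reverse registration order.
    // Returns how many handler runs hit an already-destroyed address.
    std::size_t run_and_count_double_destructions() const {
        std::map<const void*, int> destroyed;
        std::size_t doubles = 0;
        for (auto it = registered.rbegin(); it != registered.rend(); ++it) {
            if (++destroyed[*it] > 1) {
                ++doubles;
            }
        }
        return doubles;
    }
};

// Assumed failure mode: a static copy in the tool plus a copy inside the
// shared library, both bound to one object address.
inline std::size_t mixed_linking_case() {
    static const int resolved_object = 0;  // stands in for the single global
    ExitHandlerModel model;
    model.register_destructor(&resolved_object);  // executable's copy
    model.register_destructor(&resolved_object);  // shared library's copy
    return model.run_and_count_double_destructions();
}

// One shared copy of the code: one initializer, one destructor call.
inline std::size_t single_shared_copy_case() {
    static const int resolved_object = 0;
    ExitHandlerModel model;
    model.register_destructor(&resolved_object);
    return model.run_and_count_double_destructions();
}
```

In this model, mixed_linking_case() counts one double destruction and single_shared_copy_case() counts none, which mirrors the idea of keeping a single shared copy of libcommon.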
David Zafman
2014-08-22 00:11:52 UTC
The import-rados feature (#8276) uses librados, so in my wip-8231 branch I now link with librados. It is hard to reproduce, but I'll play with that commit and branch.

David Zafman
Senior Developer
http://www.inktank.com
http://www.redhat.com