Discussion:
Storing cls and erasure code plugins in a pool
Loic Dachary
2014-09-07 08:26:04 UTC
Permalink
Hi Ceph,

There is a need for a cluster to share code such as cls https://github.com/ceph/ceph/tree/master/src/cls or erasure code plugins https://github.com/ceph/ceph/tree/master/src/erasure-code/.

These plugins could have a life cycle independent of Ceph, as long as they comply with the supported API ( https://github.com/ceph/ceph/blob/master/src/erasure-code/ErasureCodeInterface.h ). For erasure code plugins it currently works this way (or it will as soon as https://github.com/ceph/ceph/pull/2397 is merged):
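
To make the constraint concrete, here is a minimal sketch of what a plugin complying with that API looks like, based on my reading of the erasure-code tree linked above; the exact names (ErasureCodePluginRegistry, the factory signature, ErasureCodeExample) may differ in detail:

    // Minimal sketch of an erasure code plugin. ErasureCodeExample is
    // assumed to implement ErasureCodeInterface; names follow my reading
    // of src/erasure-code and may differ in detail.
    #include <map>
    #include <string>
    #include "erasure-code/ErasureCodePlugin.h"

    class ErasureCodePluginExample : public ErasureCodePlugin {
    public:
      virtual int factory(const std::map<std::string,std::string> &parameters,
                          ErasureCodeInterfaceRef *erasure_code) {
        *erasure_code = ErasureCodeInterfaceRef(new ErasureCodeExample());
        return 0;
      }
    };

    // C-linkage entry point resolved with dlsym() after dlopen(): this is
    // what makes the plugin loadable by name, independently of Ceph releases.
    extern "C" int __erasure_code_init(char *plugin_name, const char *directory)
    {
      ErasureCodePluginRegistry &registry = ErasureCodePluginRegistry::instance();
      return registry.add(plugin_name, new ErasureCodePluginExample());
    }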

a) upgrade from Hammer to I* half the OSD nodes. The new I* have new erasure code plugins
b) the MON will refuse to create an erasure coded pool using the new I* plugins, otherwise the Hammer nodes will find themselves unable to participate in the pool
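
For illustration, the refusal in b) would surface at profile/pool creation time, along these lines ('newplugin' standing for a hypothetical plugin that only ships with I*):

    # hypothetical: 'newplugin' only exists on the upgraded I* nodes
    ceph osd erasure-code-profile set myprofile plugin=newplugin k=4 m=2
    ceph osd pool create ecpool 128 128 erasure myprofile
    # the MON would reject this as long as Hammer OSDs are in the cluster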

Instead it could work this way:

a) upgrade from Hammer to I* half the OSD nodes. The new I* have new erasure code plugins
b) the new erasure code plugins are uploaded to a "plugins" pool
c) an erasure coded pool is created using a new plugin from I*
d) the Hammer OSD downloads the plugin from the pool and can participate in the pool
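
As a sketch of d), an OSD (or a helper tool) could fetch the binary with plain librados before dlopen()ing it; the pool name and the one-object-per-(plugin, architecture) naming convention below are invented for illustration:

    // Hypothetical sketch: fetch a plugin binary from a "plugins" pool.
    // One object per (plugin, architecture) pair, e.g. "libec_foo.so/x86_64".
    #include <rados/librados.hpp>
    #include <cerrno>
    #include <fstream>
    #include <string>

    int fetch_plugin(const std::string &plugin, const std::string &arch,
                     const std::string &dest)
    {
      librados::Rados cluster;
      int r = cluster.init("admin");           // connect as client.admin
      if (r < 0) return r;
      cluster.conf_read_file(NULL);            // default /etc/ceph/ceph.conf
      r = cluster.connect();
      if (r < 0) return r;

      librados::IoCtx io;
      r = cluster.ioctx_create("plugins", io); // the proposed "plugins" pool
      if (r < 0) return r;

      std::string oid = plugin + "/" + arch;
      uint64_t size;
      time_t mtime;
      r = io.stat(oid, &size, &mtime);         // how many bytes to read
      if (r < 0) return r;

      librados::bufferlist bl;
      r = io.read(oid, bl, size, 0);
      if (r < 0) return r;

      std::ofstream out(dest.c_str(), std::ios::binary);
      out.write(bl.c_str(), bl.length());
      return out ? 0 : -EIO;                   // caller then dlopen()s dest
    }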

It is easier said than done and there are a lot of details to consider. However, it is not different from maintaining an operating system that includes shared libraries, and the path to doing so properly is well known.

Thoughts?

Cheers
--
Loïc Dachary, Artisan Logiciel Libre
Milosz Tanski
2014-09-07 13:38:50 UTC
Permalink
If you're planning on having plugins that are not shipped with the
host software, you have to worry about both API and ABI stability.
Traditionally (and in my personal experience), keeping a C++ ABI
compatible is hard. For those reasons, I would strongly campaign for
the plugin interface to be in C (or at least to have C linkage).
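
The classic way to do that is to make the .so boundary a plain struct of function pointers, implemented in C++ behind the scenes; a sketch, with all names invented for illustration:

    /* Sketch of a C-linkage plugin boundary: the host only ever sees this
     * header, so no C++ name mangling, vtables or std:: types cross the
     * shared object boundary. */
    #include <stddef.h>

    #ifdef __cplusplus
    extern "C" {
    #endif

    typedef struct ec_plugin_ops {
      int version;                  /* bump on any incompatible change */
      void *(*create)(const char *profile);       /* opaque handle */
      int (*encode)(void *handle, const char *in, size_t in_len,
                    char **out, size_t *out_len);
      void (*destroy)(void *handle);
    } ec_plugin_ops;

    /* The only symbol the host resolves with dlsym(). */
    const ec_plugin_ops *ec_plugin_entry(void);

    #ifdef __cplusplus
    }
    #endif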

Otherwise, here are some things you can read about C++ ABI stability
(from the KDE people):
https://techbase.kde.org/Policies/Binary_Compatibility_Issues_With_C++

These are my 2 cents based on my past experience.
Post by Loic Dachary
Hi Ceph,
There is a need for a cluster to share code such as cls https://github.com/ceph/ceph/tree/master/src/cls or erasure code plugins https://github.com/ceph/ceph/tree/master/src/erasure-code/.
These plugins could have a life cycle independent of Ceph, as long as they comply with the supported API ( https://github.com/ceph/ceph/blob/master/src/erasure-code/ErasureCodeInterface.h ). For erasure code plugins it currently works this way (or it will as soon as https://github.com/ceph/ceph/pull/2397 is merged):
a) upgrade from Hammer to I* half the OSD nodes. The new I* have new erasure code plugins
b) the MON will refuse to create an erasure coded pool using the new I* plugins, otherwise the Hammer nodes will find themselves unable to participate in the pool
a) upgrade from Hammer to I* half the OSD nodes. The new I* have new erasure code plugins
b) the new erasure code plugins are uploaded to a "plugins" pool
c) an erasure coded pool is created using a new plugin from I*
d) the Hammer OSD downloads the plugin from the pool and can participate in the pool
It is easier said than done and there are a lot of details to consider. However, it is not different from maintaining an operating system that includes shared libraries, and the path to doing so properly is well known.
Thoughts?
Cheers
--
Loïc Dachary, Artisan Logiciel Libre
--
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: ***@adfin.com
Loic Dachary
2014-09-07 16:13:20 UTC
Permalink
Post by Milosz Tanski
If you're planning on having plugins that are not shipped with the
host software, you have to worry about both API and ABI stability.
Traditionally (and in my personal experience), keeping a C++ ABI
compatible is hard. For those reasons, I would strongly campaign for
the plugin interface to be in C (or at least to have C linkage).
Hi Milosz,

That's a good point indeed.
Post by Milosz Tanski
Otherwise, here are some things you can read about C++ ABI stability
https://techbase.kde.org/Policies/Binary_Compatibility_Issues_With_C++
Thanks, I did not even think C++ ABI compatibility was possible at all ;-) The web site is down at the moment but I'll take a look.

Cheers
Post by Milosz Tanski
These are my 2 cents based on my past experience.
Post by Loic Dachary
Hi Ceph,
There is a need for a cluster to share code such as cls https://github.com/ceph/ceph/tree/master/src/cls or erasure code plugins https://github.com/ceph/ceph/tree/master/src/erasure-code/.
a) upgrade from Hammer to I* half the OSD nodes. The new I* have new erasure code plugins
b) the MON will refuse to create an erasure coded pool using the new I* plugins, otherwise the Hammer nodes will find themselves unable to participate in the pool
a) upgrade from Hammer to I* half the OSD nodes. The new I* have new erasure code plugins
b) the new erasure code plugins are uploaded to a "plugins" pool
c) an erasure coded pool is created using a new plugin from I*
d) the Hammer OSD downloads the plugin from the pool and can participate in the pool
It is easier said than done and there are a lot of details to consider. However, it is not different from maintaining an operating system that includes shared libraries, and the path to doing so properly is well known.
Thoughts?
Cheers
--
Loïc Dachary, Artisan Logiciel Libre
--
Loïc Dachary, Artisan Logiciel Libre
Yehuda Sadeh
2014-09-07 19:42:07 UTC
Permalink
Post by Loic Dachary
Hi Ceph,
There is a need for a cluster to share code such as cls https://github.com/ceph/ceph/tree/master/src/cls or erasure code plugins https://github.com/ceph/ceph/tree/master/src/erasure-code/.
a) upgrade from Hammer to I* half the OSD nodes. The new I* have new erasure code plugins
b) the MON will refuse to create an erasure coded pool using the new I* plugins, otherwise the Hammer nodes will find themselves unable to participate in the pool
a) upgrade from Hammer to I* half the OSD nodes. The new I* have new erasure code plugins
b) the new erasure code plugins are uploaded to a "plugins" pool
c) an erasure coded pool is created using a new plugin from I*
d) the Hammer OSD downloads the plugin from the pool and can participate in the pool
It is easier said than done and there are a lot of details to consider. However, it is not different from maintaining an operating system that includes shared libraries, and the path to doing so properly is well known.
Thoughts?
And here we go (almost) full circle. Originally the objclass (cls)
mechanism worked somewhat similarly. The code would be injected into
the monitors, and it was then distributed to the osds. When uploading
the objects we'd also specify the architecture, and each had an
embedded version so that it was possible to disable one version and
enable another.
The problem is that this specific method is very problematic when
dealing with heterogeneous environments, where each osd can run on a
different architecture or a different distribution. You also need to
maintain a very well contained api for the objclasses to use (which we
don't have), and be very careful about versioning. In other words, it
doesn't work, and the trouble isn't worth the benefits.
I'm not too familiar with the erasure code plugins, but it seems to me
that they need to be versioned. Then you could have multiple plugins
with different versions installed, and then you could specify which
version to use. You could have a tool that would make sure that all
the osds have access to the appropriate plugin version. But the actual
installation of the plugin wouldn't be part of ceph's internal task.
It might be that the erasure code plugins are more contained than the
objclasses, and something like you suggested might actually work.
Though I'm having trouble seeing that happen with a compiled
shared object as the resource that needs to be distributed. The first
objclass implementation actually pushed python code to the nodes (it
really did!), maybe having something like that for erasure code could
work, given the appropriate environment and tools.
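
A loader-side version check along the lines Yehuda describes could be as small as this; the __erasure_code_version symbol and the convention around it are invented for illustration:

    // Hypothetical: refuse to load a plugin whose embedded version does
    // not match the version recorded for the pool.
    #include <dlfcn.h>
    #include <string>

    int check_plugin_version(const std::string &path,
                             const std::string &wanted)
    {
      void *dl = dlopen(path.c_str(), RTLD_NOW);
      if (!dl)
        return -1;
      // assume each plugin exports: const char __erasure_code_version[];
      const char *v = (const char *)dlsym(dl, "__erasure_code_version");
      if (!v || wanted != v) {
        dlclose(dl);               // missing or mismatched version: reject
        return -1;
      }
      return 0;                    // safe to call __erasure_code_init() etc.
    }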

Yehuda
Loic Dachary
2014-09-07 20:27:53 UTC
Permalink
Hi Yehuda,

You are right: erasure code plugins must obey the same constraints as objclass plugins, only worse, because once a plugin starts being used it must remain available for as long as the pool needs it to read/write chunks.

Maintaining a repository of binary plugins is indeed not trivial. I just meant to write that it is not much different from apt-get installing shared libraries from architecture-dependent repositories. We do not have to invent something new; we can mimic and adapt existing best practices.

Maybe this idea is over-engineering and there is a better/simpler solution to deal with erasure code plugin upgrades? Or objclass upgrades, for that matter.

Cheers
Post by Yehuda Sadeh
Post by Loic Dachary
Hi Ceph,
There is a need for a cluster to share code such as cls https://github.com/ceph/ceph/tree/master/src/cls or erasure code plugins https://github.com/ceph/ceph/tree/master/src/erasure-code/.
a) upgrade from Hammer to I* half the OSD nodes. The new I* have new erasure code plugins
b) the MON will refuse to create an erasure coded pool using the new I* plugins, otherwise the Hammer nodes will find themselves unable to participate in the pool
a) upgrade from Hammer to I* half the OSD nodes. The new I* have new erasure code plugins
b) the new erasure code plugins are uploaded to a "plugins" pool
c) an erasure coded pool is created using a new plugin from I*
d) the Hammer OSD downloads the plugin from the pool and can participate in the pool
It is easier said than done and there are a lot of details to consider. However, it is not different from maintaining an operating system that includes shared libraries, and the path to doing so properly is well known.
Thoughts?
And here we go (almost) full circle. Originally the objclass (cls)
mechanism worked somewhat similarly. The code would be injected into
the monitors, and it was then distributed to the osds. When uploading
the objects we'd also specify the architecture, and each had an
embedded version so that it was possible to disable one version and
enable another.
The problem is that this specific method is very problematic when
dealing with heterogeneous environments, where each osd can run on a
different architecture or a different distribution. You also need to
maintain a very well contained api for the objclasses to use (which we
don't have), and be very careful about versioning. In other words, it
doesn't work, and the trouble isn't worth the benefits.
I'm not too familiar with the erasure code plugins, but it seems to me
that they need to be versioned. Then you could have multiple plugins
with different versions installed, and then you could specify which
version to use. You could have a tool that would make sure that all
the osds have access to the appropriate plugin version. But the actual
installation of the plugin wouldn't be part of ceph's internal task.
It might be that the erasure code plugins are more contained than the
objclasses, and something like you suggested might actually work.
Though I'm having trouble seeing that happen with a compiled
shared object as the resource that needs to be distributed. The first
objclass implementation actually pushed python code to the nodes (it
really did!), maybe having something like that for erasure code could
work, given the appropriate environment and tools.
Yehuda
--
Loïc Dachary, Artisan Logiciel Libre