[ceph-users] why the erasure code pool not support random write?

Nicheal

2014-10-21 07:46:57 UTC

When I look into the Ceph source code, I found that the erasure-coded
pool does not support random writes; it only supports append writes.
Why? Is it because random writes to erasure-coded objects are costly
and the performance of deep scrub would be very poor?

To modify an EC object you need to read all chunks in order to compute
the parity again.
So that would involve a lot of reads for what might be just a very small
write.
That's also why EC can't be used for RBD images.
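
The read cost described above can be illustrated with a minimal sketch.
It assumes a single XOR parity chunk over k = 4 data chunks, which is a
simplification: real Ceph EC pools use plugin-provided codes, and the
helper names xor_parity and overwrite are hypothetical, not Ceph APIs.

```python
from functools import reduce

def xor_parity(chunks):
    """Return the byte-wise XOR of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))

def overwrite(chunks, index, offset, data):
    """Overwrite part of one data chunk and recompute the parity.

    Returns (new_chunks, new_parity, bytes_read), where bytes_read counts
    the data read back in this naive scheme to rebuild the parity.
    """
    new_chunks = list(chunks)
    old = new_chunks[index]
    new_chunks[index] = old[:offset] + data + old[offset + len(data):]
    # Naive approach: every data chunk is read again to rebuild the parity.
    bytes_read = sum(len(c) for c in chunks)
    return new_chunks, xor_parity(new_chunks), bytes_read

# Assumed layout: k = 4 data chunks of 1 KiB each, one parity chunk.
chunks = [bytes([i]) * 1024 for i in range(4)]
parity = xor_parity(chunks)

# A 16-byte overwrite inside chunk 2 still touches the whole stripe.
new_chunks, new_parity, bytes_read = overwrite(chunks, 2, 100, b"\xff" * 16)
print(bytes_read, "bytes read for a 16 byte write")
```

In this toy layout a 16-byte overwrite reads back the full 4 KiB of
data chunks, which is the kind of overhead that makes small random
writes unattractive.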

But for RBD cases, the smallest write will be 4k and the largest will
be 512k, which is determined by the block device driver. Currently, if
we use cache tiering, one 4k random write may promote the whole 4M
object into the hot pool even though just several 4k blocks in this
object are hot. So this is unreasonable. Furthermore, the hot pool can
be a common replicated pool, and we can even use kv-store for the hot
pool. In kv-store, the content is striped into 1k = 1024 byte strips
by default. So is it possible to make use of this feature to realize
this model: on a 4k miss, we promote just the 4k of data to the hot
pool and save it into kv-store? Looking forward to Haomai Wang's input.
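
To put rough numbers on the proposal, here is a small sketch of the
arithmetic. It assumes the 4M object size and 1k strip size mentioned
above; strips_for_write is a hypothetical helper, and per-strip
promotion is only the idea being asked about, not an existing Ceph
feature.

```python
OBJECT_SIZE = 4 * 1024 * 1024   # 4M object size from the discussion
STRIP_SIZE = 1024               # 1k strip size from the discussion

def strips_for_write(offset, length, strip_size=STRIP_SIZE):
    """Return the indices of the strips covered by [offset, offset + length)."""
    first = offset // strip_size
    last = (offset + length - 1) // strip_size
    return list(range(first, last + 1))

# Example: a 4k write at an 8k offset inside the object.
write_offset, write_len = 8192, 4096
strips = strips_for_write(write_offset, write_len)

whole_object_bytes = OBJECT_SIZE             # promoted today on a miss
partial_bytes = len(strips) * STRIP_SIZE     # promoted under the proposal
ratio = whole_object_bytes // partial_bytes
print(strips, ratio)
```

With these assumed sizes, promoting only the touched strips would move
4 KiB instead of 4 MiB for an aligned 4k write, a factor of 1024 less
data.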

Regards,
Nicheal

Thanks.
_______________________________________________
ceph-users mailing list
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

--
Wido den Hollander
Ceph consultant and trainer
42on B.V.
Phone: +31 (0)20 700 9902
Skype: contact42on

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html