Nicheal
2014-10-21 07:46:57 UTC
When I look into the Ceph source code, I found that erasure coding does
not support random writes, only append writes. Why? Is it because random
writes to erasure-coded objects are very costly, and the deep-scrub
performance would be very poor?
To modify an EC object you need to read all the chunks in order to
compute the parity again. So that would involve a lot of reads for what
might be just a very small write.
That's also why EC can't be used for RBD images.
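As a minimal sketch of that read-modify-write cost (using a single XOR
parity chunk for simplicity, not Ceph's actual jerasure plugin):

    # Sketch only: one XOR parity chunk over k data chunks, to show why a
    # small overwrite still forces reading the whole stripe.

    def compute_parity(chunks):
        """XOR all data chunks together to produce the parity chunk."""
        parity = bytearray(len(chunks[0]))
        for chunk in chunks:
            for i, b in enumerate(chunk):
                parity[i] ^= b
        return bytes(parity)

    def overwrite(chunks, chunk_idx, offset, data):
        """Overwrite a few bytes inside one chunk.

        Even though only len(data) bytes change, the parity must be
        recomputed, which means reading every chunk in the stripe.
        """
        new_chunk = bytearray(chunks[chunk_idx])
        new_chunk[offset:offset + len(data)] = data
        chunks[chunk_idx] = bytes(new_chunk)
        return compute_parity(chunks)   # touches all k chunks

    # Example: k=4 data chunks of 4 KiB each; a 16-byte overwrite still
    # requires reading 16 KiB of data to rebuild the parity.
    chunks = [bytes(4096) for _ in range(4)]
    parity = compute_parity(chunks)
    parity = overwrite(chunks, 1, 100, b"x" * 16)

An append-only layout avoids this, since new parity can be computed over
the freshly written data without re-reading old chunks.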
But for RBD, the smallest write will be 4k and the largest 512k, which is
determined by the block device driver. Currently, if we use cache tiering,
one 4k random write may promote the whole 4M object into the hot pool,
even though only a few 4k regions in that object are hot. So this is
unreasonable. Furthermore, the hot pool can be a common replicated pool,
and we could even use a kv-store for the hot pool. In the kv-store, the
content is striped into 1k = 1024 byte strips by default. So is it
possible to make use of this feature to realize a model where, on a 4k
miss, we promote just that 4k of data into the hot pool and save it into
the kv-store? Expecting Haomai Wang's input.
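To put numbers on the promotion cost: moving a whole 4 MiB object for a
single 4 KiB write transfers 1024 times more data than was written. A
rough sketch of the proposed sub-object promotion, with hypothetical
helper names (promote_extent, kv_put are not existing Ceph APIs):

    # Sketch of the proposal: on a 4 KiB cache miss, promote only that
    # extent into a kv-backed hot pool, split into the kv-store's 1 KiB
    # strip unit. kv_put() is hypothetical, not a Ceph API.

    OBJECT_SIZE = 4 * 1024 * 1024   # default RBD object size: 4 MiB
    WRITE_SIZE  = 4 * 1024          # smallest RBD write: 4 KiB
    KV_STRIP    = 1024              # kv-store default strip size: 1 KiB

    # Promoting the whole object moves 1024x more data than the write.
    amplification = OBJECT_SIZE // WRITE_SIZE   # -> 1024

    def promote_extent(kv_put, oid, offset, data):
        """Promote only the missed extent, split into 1 KiB kv entries.

        Keys encode (object, byte offset) so later reads and writes can
        address sub-object extents directly in the hot pool.
        """
        assert len(data) == WRITE_SIZE
        for i in range(0, len(data), KV_STRIP):
            key = "%s:%08x" % (oid, offset + i)
            kv_put(key, data[i:i + KV_STRIP])

    # Example: a 4 KiB miss at offset 0x200000 becomes four 1 KiB kv
    # entries instead of a 4 MiB object promotion.
    store = {}
    promote_extent(store.__setitem__, "rbd_data.1234.00000005",
                   0x200000, bytes(WRITE_SIZE))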
Thanks.

Regards,
Nicheal
--
Wido den Hollander
Ceph consultant and trainer
42on B.V.
Phone: +31 (0)20 700 9902
Skype: contact42on