2014-10-21 07:46:57 UTC
When I looked into the Ceph source code, I found that erasure-coded
pools do not support random writes, only append writes. Why? Is it
because random writes to erasure-coded objects are costly and deep
scrub performance would be very poor?
To modify an EC object you need to read all chunks in order to compute
the parity again.
So that would involve a lot of reads for what might be just a very
small write.
That's also why EC can't be used for RBD images.
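
A rough sketch of the read cost described above. It assumes a toy
layout of 4 data chunks of 64k each plus a single XOR parity chunk;
this is not the Ceph erasure code plugin, and the function names are
made up for illustration.

```python
# Toy model: k data chunks plus one XOR parity chunk. This is NOT the
# Ceph erasure code plugin; it only illustrates the read cost.
from functools import reduce


def xor_parity(chunks):
    """Return the byte-wise XOR of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*chunks))


def overwrite(chunks, index, offset, data):
    """Overwrite part of one chunk, then recompute parity from all chunks.

    Returns (new_chunks, new_parity, bytes_read).
    """
    updated = bytearray(chunks[index])
    updated[offset:offset + len(data)] = data
    new_chunks = list(chunks)
    new_chunks[index] = bytes(updated)
    # Parity is recomputed from every data chunk, so all of them are read.
    bytes_read = sum(len(c) for c in new_chunks)
    return new_chunks, xor_parity(new_chunks), bytes_read


k, chunk_size = 4, 64 * 1024          # hypothetical layout: 4 x 64k
chunks = [bytes([i]) * chunk_size for i in range(k)]
new_chunks, parity, bytes_read = overwrite(chunks, 2, 8192, b"\xff" * 4096)
print(bytes_read // 4096)             # bytes read per byte written: 64
```

In this model a 4k overwrite reads 256k, and the ratio grows with the
stripe size.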
But for RBD, the smallest write is 4k and the largest is 512k, which is
determined by the block device driver. Currently, if we use cache
tiering, a single 4k random write may promote the whole 4M object into
the hot pool even though only several 4k blocks in that object are hot.
That is unreasonable. Furthermore, the hot pool can be an ordinary
replicated pool, and we could even use kv-store for it. In kv-store,
object content is striped into 1k (1024-byte) strips by default. So
would it be possible to use this feature to implement the following
model: on a 4k miss, promote just that 4k of data to the hot pool and
save it in kv-store? I'd like to hear Haomai Wang's input.
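
To put numbers on the proposal, here is a small sketch of the
arithmetic. It assumes a 4M object, 1k strips, and an aligned 4k write;
strips_for_range is a hypothetical helper, not a kv-store function, and
the real key layout may differ.

```python
STRIP_SIZE = 1024              # default strip size mentioned above
OBJECT_SIZE = 4 * 1024 * 1024  # 4M RBD object


def strips_for_range(offset, length, strip_size=STRIP_SIZE):
    """Return the indices of the strips that a byte range touches."""
    first = offset // strip_size
    last = (offset + length - 1) // strip_size
    return list(range(first, last + 1))


write_offset, write_length = 12288, 4096   # one aligned 4k random write
strips = strips_for_range(write_offset, write_length)
partial_bytes = len(strips) * STRIP_SIZE

print(strips)                        # strips a partial promotion would copy
print(OBJECT_SIZE // partial_bytes)  # whole-object vs. partial promotion
```

For this write, only strips 12 to 15 would be promoted, which is 4k
instead of 4M, a factor of 1024 less data moved into the hot pool.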

