Discussion:
qemu drive-mirror to rbd storage : no sparse rbd image
Alexandre DERUMIER
2014-10-08 11:15:47 UTC
Hi,

I'm currently planning to migrate our storage to ceph/rbd through qemu drive-mirror,

and it seems that drive-mirror with the rbd block driver doesn't create a sparse image (all zeros are copied to the target rbd).

Also note that it works fine with "qemu-img convert": the rbd volume is sparse after conversion.


Could it be related to the missing "bdrv_co_write_zeroes" feature in block/rbd.c?

(It's available in other block drivers (scsi, gluster, raw-aio), and I don't have this problem with these block drivers.)



Regards,

Alexandre Derumier


--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Fam Zheng
2014-10-11 07:01:18 UTC
Post by Alexandre DERUMIER
Hi,
I'm currently planning to migrate our storage to ceph/rbd through qemu drive-mirror
and It seem that drive-mirror with rbd block driver, don't create a sparse image. (all zeros are copied to the target rbd).
Also note, that it's working fine with "qemu-img convert" , the rbd volume is sparse after conversion.
What is the source format? If the zero clusters are actually unallocated in the
source image, drive-mirror will not write those clusters either. I.e. with
"drive-mirror sync=top", both source and target should have the same "qemu-img
map" output.
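
Editorial sketch: one way to check the "same qemu-img map output" condition mechanically. It assumes the maps were produced with `qemu-img map --output=json IMAGE` and relies only on the `start`, `length` and `data` fields of each entry; the helper names are hypothetical and the sample maps in any usage are made up for illustration.

```python
import json


def data_extents(map_json):
    """Return (start, length) pairs for extents that qemu-img reports as data."""
    entries = json.loads(map_json)
    return [(e["start"], e["length"]) for e in entries if e.get("data")]


def merge(extents):
    """Merge adjacent extents so that differently split maps compare equal."""
    merged = []
    for start, length in sorted(extents):
        if merged and merged[-1][0] + merged[-1][1] == start:
            prev_start, prev_len = merged[-1]
            merged[-1] = (prev_start, prev_len + length)
        else:
            merged.append((start, length))
    return merged


def same_allocation(src_map_json, dst_map_json):
    """True if both images report data in exactly the same byte ranges."""
    return merge(data_extents(src_map_json)) == merge(data_extents(dst_map_json))
```

If the target map shows data where the source shows none, the mirror wrote zeros that the source never had allocated.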

Fam
Alexandre DERUMIER
2014-10-11 08:00:48 UTC
Post by Fam Zheng
What is the source format? If the zero clusters are actually unallocated in the
source image, drive-mirror will not write those clusters either. I.e. with
"drive-mirror sync=top", both source and target should have the same "qemu-img
map" output.
Thanks for your reply,

I tried drive-mirror (sync=full) with:

raw file (sparse) -> rbd (not sparse)
rbd (sparse) -> rbd (not sparse)
raw file (sparse) -> qcow2 on ext4 (sparse)
rbd (sparse) -> raw on ext4 (sparse)

I also see the same problem when the target file is on xfs:

raw file (sparse) -> qcow2 on xfs (not sparse)
rbd (sparse) -> raw on xfs (not sparse)


I only have this problem with drive-mirror; qemu-img convert seems to simply skip zero blocks.


Or maybe this is because I'm using sync=full?

What is the difference between full and top?

""sync": what parts of the disk image should be copied to the destination;
possibilities include "full" for all the disk, "top" for only the sectors
allocated in the topmost image".

(What is the topmost image?)


Fam Zheng
2014-10-11 08:25:35 UTC
Post by Alexandre DERUMIER
Post by Fam Zheng
What is the source format? If the zero clusters are actually unallocated in the
source image, drive-mirror will not write those clusters either. I.e. with
"drive-mirror sync=top", both source and target should have the same "qemu-img
map" output.
Thanks for your reply,
I had tried drive mirror (sync=full) with
raw file (sparse) -> rbd (no sparse)
rbd (sparse) -> rbd (no sparse)
raw file (sparse) -> qcow2 on ext4 (sparse)
rbd (sparse) -> raw on ext4 (sparse)
Also I see that I have the same problem with target file format on xfs.
raw file (sparse) -> qcow2 on xfs (no sparse)
rbd (sparse) -> raw on xfs (no sparse)
These don't tell me much. Maybe it's better to show the actual commands and how
you tell sparse from no sparse?

Does "qcow2 -> qcow2" work for you on xfs?
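
Editorial sketch: one way to "tell sparse from no sparse" for a file-backed image is to compare the space actually allocated on disk with the image size, which is what `du` versus `ls -l` shows. This assumes a POSIX system where `st_blocks` counts 512-byte units; the function names are hypothetical, and for qcow2 the size to compare against is the virtual disk size rather than the file length.

```python
import os


def apparent_and_allocated(path):
    """Return (apparent size, bytes allocated on disk) for a file.

    Assumes POSIX semantics: st_blocks is counted in 512-byte units.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def looks_sparse(virtual_size, allocated):
    """A file is treated as sparse if it occupies less space than it represents."""
    return allocated < virtual_size
```

With the numbers reported later in this thread, a 10G image that occupies 2M counts as sparse, while one that occupies 11G does not.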
Post by Alexandre DERUMIER
I only have this problem with drive-mirror, qemu-img convert seem to simply skip zero blocks.
Or maybe this is because I'm using sync=full ?
What is the difference between full and top ?
""sync": what parts of the disk image should be copied to the destination;
possibilities include "full" for all the disk, "top" for only the sectors
allocated in the topmost image".
(what is topmost image ?)
For "sync=top", only the clusters allocated in the image itself are copied; for
"full", all the clusters allocated in the image itself, in its backing
image, in its backing's backing image, and so on, are copied.

The image itself, whether or not it has a backing image, is called the topmost image.
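
Editorial sketch: a toy model of the distinction described above. It represents a backing chain as a list of sets of allocated cluster indices, topmost image first. This only mirrors the description in this message; it does not model QEMU internals, and the function name is hypothetical.

```python
def clusters_to_copy(chain, sync):
    """Return the set of cluster indices a mirror would copy.

    chain: list of sets of allocated cluster indices, topmost image first.
    sync:  "top" copies only what the topmost image allocates;
           "full" copies what any image in the chain allocates.
    """
    if sync == "top":
        return set(chain[0])
    if sync == "full":
        result = set()
        for image in chain:
            result |= image
        return result
    raise ValueError("unsupported sync mode: %r" % (sync,))
```

For an image with no backing file the chain has a single element, so "top" and "full" select the same clusters in this model.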

Fam
Andrey Korolyov
2014-10-11 08:30:40 UTC
Post by Fam Zheng
Post by Alexandre DERUMIER
Post by Fam Zheng
What is the source format? If the zero clusters are actually unallocated in the
source image, drive-mirror will not write those clusters either. I.e. with
"drive-mirror sync=top", both source and target should have the same "qemu-img
map" output.
Thanks for your reply,
I had tried drive mirror (sync=full) with
raw file (sparse) -> rbd (no sparse)
rbd (sparse) -> rbd (no sparse)
raw file (sparse) -> qcow2 on ext4 (sparse)
rbd (sparse) -> raw on ext4 (sparse)
Also I see that I have the same problem with target file format on xfs.
raw file (sparse) -> qcow2 on xfs (no sparse)
rbd (sparse) -> raw on xfs (no sparse)
These don't tell me much. Maybe it's better to show the actual commands and how
you tell sparse from no sparse?
Does "qcow2 -> qcow2" work for you on xfs?
Post by Alexandre DERUMIER
I only have this problem with drive-mirror, qemu-img convert seem to simply skip zero blocks.
Or maybe this is because I'm using sync=full ?
What is the difference between full and top ?
""sync": what parts of the disk image should be copied to the destination;
possibilities include "full" for all the disk, "top" for only the sectors
allocated in the topmost image".
(what is topmost image ?)
For "sync=top", only the clusters allocated in the image itself is copied; for
"full", all those clusters allocated in the image itself, and its backing
image, and it's backing's backing image, ..., are copied.
The image itself, having a backing image or not, is called the topmost image.
Fam
--
Just a wild guess - Alexandre, did you try the detect-zeroes blk option
for mirroring targets?
Alexandre DERUMIER
2014-10-12 10:02:42 UTC
Post by Andrey Korolyov
Just a wild guess - Alexandre, did you tried detect-zeroes blk option
for mirroring targets?
Hi, yes, I have also tried detect-zeroes (on or discard) with virtio and virtio-scsi; it doesn't help. (I'm not sure that it is implemented in drive-mirror.)

As a workaround for now, after drive-mirror I run fstrim inside the guest (with virtio-scsi + discard),
and this way I can free space on the rbd storage.



Alexandre DERUMIER
2014-10-12 10:33:45 UTC
Post by Fam Zheng
These don't tell me much. Maybe it's better to show the actual commands and how
you tell sparse from no sparse?
Well,

I created 2 empty 10G source image files (source.qcow2 and source.raw).

Then:

du -sh source.qcow2 : 2M
du -sh source.raw : 0M


Then I converted them with qemu-img convert (source.qcow2 -> target.qcow2, source.raw -> target.raw):

du -sh target.qcow2 : 2M
du -sh target.raw : 0M

(So it's OK here.)

But if I convert them with drive-mirror:

du -sh target.qcow2 : 11G
du -sh target.raw : 11G



I also double-checked with df, and I see space allocated on the filesystem when using drive-mirror.

I have the same behavior if the target is rbd storage.

I have also run the test on ext3 and ext4, and I see the same problem as with xfs.


I'm pretty sure that drive-mirror copies zero blocks: qemu-img convert takes around 2s to convert the empty file
(because it skips zero blocks), while drive-mirror takes around 5min.
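
Editorial sketch: the zero-skipping behaviour attributed to qemu-img convert above can be illustrated with a small file copy that seeks over all-zero chunks instead of writing them. This is not QEMU's code; the 64 KiB chunk size and the function name are arbitrary assumptions, and whether the skipped ranges end up as holes depends on the target filesystem.

```python
import os

CHUNK = 64 * 1024  # arbitrary chunk size chosen for this illustration


def sparse_copy(src_path, dst_path, chunk=CHUNK):
    """Copy src to dst, seeking over all-zero chunks instead of writing them.

    Returns the number of bytes actually written to the destination.
    """
    written = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            buf = src.read(chunk)
            if not buf:
                break
            if buf == bytes(len(buf)):
                # All zeros: advance the write position without writing.
                dst.seek(len(buf), os.SEEK_CUR)
            else:
                dst.write(buf)
                written += len(buf)
        # Set the final length explicitly in case the source ends in zeros.
        dst.truncate(src.tell())
    return written
```

A copy that writes every chunk, zero or not, has to push the full 10G through the target, which is consistent with the 2s versus 5min difference reported above.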

Paolo Bonzini
2014-10-12 13:02:12 UTC
Post by Alexandre DERUMIER
Hi,
I'm currently planning to migrate our storage to ceph/rbd through qemu drive-mirror
and It seem that drive-mirror with rbd block driver, don't create a sparse image. (all zeros are copied to the target rbd).
Also note, that it's working fine with "qemu-img convert" , the rbd volume is sparse after conversion.
Could it be related to the "bdrv_co_write_zeroes" missing features in block/rbd.c ?
(It's available in other block drivers (scsi,gluster,raw-aio) , and I don't have this problem with theses block drivers).
Lack of bdrv_co_write_zeroes is why detect-zeroes does not work.

Lack of bdrv_get_block_status is why sparse->sparse does not work
without detect-zeroes.
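
Editorial sketch: a toy model of one reading of the two points above, for the rbd -> rbd case. It is not QEMU code. The assumption is that a copy job can skip holes only if the source driver can report block status, and that zero detection saves space only if the target driver has a dedicated write-zeroes operation. All class and function names are hypothetical.

```python
ZERO = bytes(4)  # a toy 4-byte "cluster" full of zeros


class ToyImage:
    """Toy image: a dict of allocated clusters plus two optional driver features."""

    def __init__(self, n_clusters, has_block_status, has_write_zeroes):
        self.n_clusters = n_clusters
        self.has_block_status = has_block_status
        self.has_write_zeroes = has_write_zeroes
        self.clusters = {}  # cluster index -> bytes, allocated clusters only

    def read(self, i):
        return self.clusters.get(i, ZERO)  # holes read back as zeros

    def write(self, i, buf):
        self.clusters[i] = buf  # allocates, even if buf is all zeros

    def write_zeroes(self, i):
        self.clusters.pop(i, None)  # leaves the cluster unallocated


def toy_mirror(src, dst, detect_zeroes=False):
    """Copy src to dst and return how many clusters dst ends up allocating."""
    for i in range(src.n_clusters):
        if src.has_block_status and i not in src.clusters:
            continue  # the source can report the hole, so nothing is copied
        buf = src.read(i)
        if detect_zeroes and buf == ZERO and dst.has_write_zeroes:
            dst.write_zeroes(i)
        else:
            dst.write(i, buf)
    return len(dst.clusters)
```

In this model, a source without block status and a target without write-zeroes yields a fully allocated target whether or not detect_zeroes is set, which matches the symptom reported for rbd.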

Paolo
Alexandre DERUMIER
2014-10-13 06:06:29 UTC
Post by Paolo Bonzini
Lack of bdrv_co_write_zeroes is why detect-zeroes does not work.
Lack of bdrv_get_block_status is why sparse->sparse does not work
without detect-zeroes.
OK, thanks Paolo!

Both are missing in the rbd block driver. @ceph-devel: would it be possible to implement them?



Also, about drive-mirror: I tried detect-zeroes with a simple qcow2 file,
and it doesn't seem to help.
I'm not sure that detect-zeroes is implemented in drive-mirror.

Also, the mirrored target volume doesn't seem to have the detect-zeroes option:


# info block
drive-virtio1: /source.qcow2 (qcow2)
Detect zeroes: on

# du -sh source.qcow2 : 2M

drive-mirror source.qcow2 -> target.qcow2

# info block
drive-virtio1: /target.qcow2 (qcow2)

# du -sh target.qcow2 : 11G




Paolo Bonzini
2014-10-13 07:06:01 UTC
Also, about drive-mirror, I had tried with detect-zeroes with simple qcow2 file,
and It don't seem to help.
I'm not sure that detect-zeroes is implement in drive-mirror.
also, the target mirrored volume don't seem to have the detect-zeroes option
# info block
drive-virtio1: /source.qcow2 (qcow2)
Detect zeroes: on
#du -sh source.qcow2 : 2M
drive-mirror source.qcow2 -> target.qcow2
# info block
drive-virtio1: /target.qcow2 (qcow2)
#du -sh target.qcow2 : 11G
Ah, you're right. We need to add an options field, or use a new
blockdev-mirror command.

Paolo
Alexandre DERUMIER
2014-10-13 08:08:51 UTC
Post by Paolo Bonzini
Ah, you're right. We need to add an options field, or use a new
blockdev-mirror command.
OK, thanks. I can't help implement this, but I'll be glad to help with testing.

