Discussion: issue 8747 / 9011
Sage Weil
2014-09-19 23:43:31 UTC
Hey Dmitry,

Are you still seeing this crash?

osd/ReplicatedPG.cc: 5297: FAILED assert(soid < scrubber.start || soid >= scrubber.end)

We haven't turned it up in our testing in the last two months, so we
still have no log of it occurring.

Thanks!
sage
Dmitry Smirnov
2014-09-20 03:08:08 UTC
Hi Sage,
Post by Sage Weil
Are you still seeing this crash?
osd/ReplicatedPG.cc: 5297: FAILED assert(soid < scrubber.start || soid >= scrubber.end)
Thanks for following-up on this, Sage.
Yes, I've seen this crash quite recently on 0.80.5. It usually happens during a
long recovery, such as when an OSD is replaced. I've seen it happen after hours
of backfilling/remapping, although it may take a long time to manifest.
--
Cheers,
Dmitry Smirnov
GPG key : 4096R/53968D1B

---

However beautiful the strategy, you should occasionally look at the
results.
-- Winston Churchill
Sage Weil
2014-09-21 19:28:23 UTC
Post by Dmitry Smirnov
Hi Sage,
Post by Sage Weil
Are you still seeing this crash?
osd/ReplicatedPG.cc: 5297: FAILED assert(soid < scrubber.start || soid >=
scrubber.end)
Thanks for following-up on this, Sage.
Yes, I've seen this crash quite recently on 0.80.5. It usually happens during a
long recovery, such as when an OSD is replaced. I've seen it happen after hours
of backfilling/remapping, although it may take a long time to manifest.
Is there any possibility of enabling logging on your osds (debug ms = 1,
debug osd = 20) so that we can capture this?
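For reference, that would be something along these lines -- either in ceph.conf
(then restart the OSDs):

  [osd]
      debug ms = 1
      debug osd = 20

or injected into the running daemons without a restart:

  ceph tell osd.* injectargs '--debug-ms 1 --debug-osd 20'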

Thanks-
sage
Dmitry Smirnov
2014-09-22 01:13:49 UTC
Post by Sage Weil
Is there any possibility of enabling logging on your osds (debug ms = 1,
debug osd = 20) so that we can capture this?
I'll put it on my TODO list but I won't be able to do it anytime soon...

Meanwhile, is there any chance of #8752 getting some attention? It concerns
inconsistent PGs on an RBD caching pool. Thanks.
--
Cheers,
Dmitry Smirnov
GPG key : 4096R/53968D1B
Sage Weil
2014-09-22 02:01:52 UTC
Post by Dmitry Smirnov
Post by Sage Weil
Is there any possibility of enabling logging on your osds (debug ms = 1,
debug osd = 20) so that we can capture this?
I'll put it on my TODO list but I won't be able to do it anytime soon...
Meanwhile, is there any chance of #8752 getting some attention? It concerns
inconsistent PGs on an RBD caching pool. Thanks.
This is one we have never seen in our QA environment, and no real leads.
There are a couple slightly different scrub issues that pop up
occasionally that we are trying to nail down, but this one is a bit
different. Being able to reliably reproduce it and generate logs is the
usual strategy...

sage
Dmitry Smirnov
2014-10-02 13:47:13 UTC
Post by Sage Weil
This is one we have never seen in our QA environment, and no real leads.
I'm quite surprised by this... Is it really that unusual to use a replicated
caching pool in front of an erasure-coded RBD pool? All my OSDs are Btrfs-based,
and I recently upgraded all kernels (i.e. kernel RBD clients) to 3.16.3.

Unlike some elusive issues that may be hard to replicate, this particular one
is very persistent and noticeable; it takes no effort to reproduce at all. I've
been observing it for several months already...

It is unlikely that I have anything special in my v0.80.5 cluster's
configuration...
Post by Sage Weil
There are a couple slightly different scrub issues that pop up
occasionally that we are trying to nail down, but this one is a bit
different. Being able to reliably reproduce it and generate logs is the
usual strategy...
Please advise what kind of logs would be useful. Something like "debug ms = 1,
debug osd = 20" from the primary OSD hosting an inconsistent PG, captured at the
time the "scrub" command is issued?

Thanks.
--
All the best,
Dmitry Smirnov.
Sage Weil
2014-10-02 15:28:16 UTC
Post by Dmitry Smirnov
Post by Sage Weil
This is one we have never seen in our QA environment, and no real leads.
I'm quite surprised by this... Is it really that unusual to use a replicated
caching pool in front of an erasure-coded RBD pool? All my OSDs are Btrfs-based,
and I recently upgraded all kernels (i.e. kernel RBD clients) to 3.16.3.
The weird thing about your report is that the byte totals are off by an uneven
number of bytes (3 bytes, 9 bytes, etc.). We haven't ever seen this. We do test
RBD over cache tiers on btrfs, but not with EC on the base; I'll add that combo
to the matrix. My first guess is a btrfs issue, honestly.
Post by Dmitry Smirnov
Unlike some elusive issues that may be hard to replicate, this particular one
is very persistent and noticeable; it takes no effort to reproduce at all. I've
been observing it for several months already...
Does it continue to come up after the kernels are upgraded (and after a full
cycle of scrubs and repairs has been done to clear out inconsistencies
introduced while running the older kernel)?
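Something like the following is what I have in mind by a full cycle (a rough
sketch; the pool id is a placeholder, and deep-scrubbing everything will take a
while):

  pool=19   # placeholder: id of the affected pool
  for pg in $(ceph pg dump pgs_brief 2>/dev/null | awk -v p="$pool" '$1 ~ "^"p"\\." {print $1}'); do
      ceph pg deep-scrub "$pg"
  done
  # once the scrubs finish, repair whatever is flagged inconsistent
  ceph health detail | awk '/^pg .*inconsistent/ {print $2}' | xargs -r -n1 ceph pg repair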

sage
Post by Dmitry Smirnov
It is unlikely that I have anything special in my v0.80.5 cluster's
configuration...
Post by Sage Weil
There are a couple slightly different scrub issues that pop up
occasionally that we are trying to nail down, but this one is a bit
different. Being able to reliably reproduce it and generate logs is the
usual strategy...
Please advise what kind of logs would be useful. Something like "debug ms = 1,
debug osd = 20" from the primary OSD hosting an inconsistent PG, captured at the
time the "scrub" command is issued?
Thanks.
--
All the best,
Dmitry Smirnov.
Dmitry Smirnov
2014-10-02 21:09:41 UTC
Post by Sage Weil
The weird thing about your report is that the byte totals are off by an uneven
number of bytes (3 bytes, 9 bytes, etc.). We haven't ever seen this. We do test
RBD over cache tiers on btrfs, but not with EC on the base; I'll add that combo
to the matrix. My first guess is a btrfs issue, honestly.
I think I found where it is happening: for a while I was using Btrfs-based
OSDs with journals on an ext4 partition on an SSD. As an experiment I decided
to move all journal files back to their OSDs, and that eliminated the
inconsistencies. I've updated the ticket with this information.
This behaviour is reproducible on 0.80.6.

It looks like Btrfs snapshotting does not affect this issue.
Post by Sage Weil
Does it continue to come up after the kernels are upgraded (and after a full
cycle of scrubs and repairs has been done to clear out inconsistencies
introduced while running the older kernel)?
Yes, I tried many times after every kernel update and after any change to the
cluster whatsoever. Repair is usually ineffective and doesn't change anything:
it logs "repair 1 errors, 1 fixed", but "ceph pg scrub" finds an error again
right away. Moreover, repair is not even necessary -- inconsistencies stay on
some PGs for a while and then "move" to different PGs. For example, "ceph pg
scrub 19.NN" sometimes clears the affected PG from the "inconsistent" state and
sometimes discovers a new inconsistency, seemingly at random.
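In other words, the observed sequence looks roughly like this (PG id kept as a
placeholder):

  ceph pg repair 19.NN     # logs "repair 1 errors, 1 fixed"
  ceph pg scrub 19.NN      # yet the very next scrub flags it inconsistent again
  ceph health detail       # ...or a different PG shows up as inconsistent instead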

Thank you.
--
Cheers,
Dmitry Smirnov.

---

Odious ideas are not entitled to hide from criticism behind the human
shield of their believers' feelings.
-- Richard Stallman