Cache tiering slow request issue: currently waiting for rw locks

Hi,

I've opened http://tracker.ceph.com/issues/9285 to track this.

I think you're right--we need a check in agent_maybe_evict() that will
skip objects that are being promoted. I suspect a flag on the
ObjectContext is enough?

sage

Post by Wang, Zhiqiang
Hi all,
I ran into this slow request issue some time ago. The problem is this: when running with cache tiering, 'slow request' warnings like the following appear in the log file.
2014-08-29 10:18:24.669763 7f9b20f1b700 0 log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 30.996595 secs
2014-08-29 10:18:24.669768 7f9b20f1b700 0 log [WRN] : slow request 30.996595 seconds old, received at 2014-08-29 10:17:53.673142: osd_op(client.114176.0:144919 rb.0.17f56.6b8b4567.000000000935 [sparse-read 3440640~4096] 45.cf45084b ack+read e26168) v4 currently waiting for rw locks
Recently I made some changes to the logging, captured this problem, and finally figured out its root cause. You can check the attachment for the logs.
A read hits a cache miss. During promotion, after copying the data from the base tier OSD, the cache tier primary OSD replicates the data to the other cache tier OSDs. Sometimes this takes quite a long time. During this period, the promoted object may be evicted because the cache tier is full. When the primary OSD finally gets the replication response and restarts the original read request, it doesn't find the object in the cache tier and promotes it again. This loops several times, and we see the 'slow request' warnings in the logs. Theoretically, this could loop forever, and the client's request would never finish.
My proposed fix: add a field to the object state indicating the status of the promotion. It is set to true after the data is copied from the base tier and before the replication. It is reset to false after the replication, when the original client request starts to execute. Eviction is not allowed while this field is true.
What do you think?
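
The loop described above, and the effect of the proposed field, can be sketched with a small toy model. This is only an illustration under stated assumptions: the class and function names below are hypothetical and are not Ceph APIs, and the eviction agent is assumed to fire during every replication window, which stands in for a permanently full cache tier.

```python
# Toy model only: none of these names are Ceph APIs.
class CacheTier:
    def __init__(self, protect_promoting):
        self.objects = {}                      # oid -> {"promoting": bool}
        self.protect_promoting = protect_promoting

    def start_promote(self, oid):
        # Data has been copied from the base tier; replication is in flight.
        self.objects[oid] = {"promoting": True}

    def finish_promote(self, oid):
        # Replication finished and the client request is about to run.
        if oid in self.objects:
            self.objects[oid]["promoting"] = False

    def maybe_evict(self, oid):
        # Called by the (hypothetical) eviction agent when the tier is full.
        state = self.objects.get(oid)
        if state is None:
            return False
        if self.protect_promoting and state["promoting"]:
            return False                       # skip objects still being promoted
        del self.objects[oid]
        return True


def read_with_promotion(cache, oid, max_attempts=5):
    """Return how many promotions were needed, or None if the read never hit."""
    for attempt in range(1, max_attempts + 1):
        cache.start_promote(oid)
        cache.maybe_evict(oid)                 # eviction races with replication
        cache.finish_promote(oid)
        if oid in cache.objects:               # the requeued read finds the object
            return attempt
    return None                                # still missing: the slow request case
```

Without the field, read_with_promotion() gives up after max_attempts promotions; with it, the first promotion is enough.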

--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Sage Weil

2014-08-30 02:29:02 UTC

Hi,

Can you take a look at https://github.com/ceph/ceph/pull/2363 and see if
that addresses the behavior you saw?

Thanks!
sage

Wang, Zhiqiang

2014-09-01 01:33:20 UTC

I don't think the object context is blocked at that time. It is unblocked once the data has been copied from the base tier, so I don't think the pull request addresses the problem here. Anyway, I'll try it and see.

Wang, Zhiqiang

2014-09-02 06:54:25 UTC

I tried the pull request; checking whether the object is blocked doesn't work. This check is actually already done in the function agent_work.

I tried a fix that adds a field/flag to the object context. This is not a good idea, for the following reasons:
1) If the field/flag is persistent, then we need to persist it whenever we reset/clear it. This is not good for read requests.
2) If the field/flag is not persistent, then it is lost when the object context is removed from the cache 'object_contexts'. The object is then removed by a later eviction, so the same issue still exists.

So I came up with a fix that adds a set to the class ReplicatedPG to hold all the objects being promoted. The fix is at https://github.com/ceph/ceph/pull/2374. It is tested and works well. Please review and comment, thanks.
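
The difference between a non-persistent flag on the object context and a set kept on the PG can be sketched as follows. This is a toy model, not the code in the pull request: ContextCache is a hypothetical stand-in for 'object_contexts', assumed here to be a bounded LRU, and ToyPG and its promoting set are illustrative names only.

```python
from collections import OrderedDict


class ContextCache:
    """Bounded LRU of in-memory contexts (hypothetical stand-in for 'object_contexts')."""

    def __init__(self, max_size):
        self.max_size = max_size
        self.entries = OrderedDict()           # oid -> non-persistent state

    def get(self, oid):
        if oid in self.entries:
            self.entries.move_to_end(oid)
        else:
            # A context that was dropped comes back fresh: the flag is gone.
            self.entries[oid] = {"promoting": False}
            if len(self.entries) > self.max_size:
                self.entries.popitem(last=False)   # drop the least recently used
        return self.entries[oid]


class ToyPG:
    def __init__(self, cache_size):
        self.contexts = ContextCache(cache_size)
        self.promoting = set()                 # hypothetical set of in-flight promotions

    def start_promote(self, oid):
        self.contexts.get(oid)["promoting"] = True
        self.promoting.add(oid)

    def finish_promote(self, oid):
        self.contexts.get(oid)["promoting"] = False
        self.promoting.discard(oid)

    def evictable_by_flag(self, oid):
        return not self.contexts.get(oid)["promoting"]

    def evictable_by_set(self, oid):
        return oid not in self.promoting
```

With a cache of size 1, touching a second object while the first is being promoted drops the first object's context, so evictable_by_flag() wrongly reports it as evictable while evictable_by_set() still protects it.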

Wang, Zhiqiang

2014-09-05 07:20:40 UTC

I replied to your comments on the pull request https://github.com/ceph/ceph/pull/2374. Can you take a look? Thanks.

Wang, Zhiqiang

2014-09-09 07:12:34 UTC

Pasting the conversations in the pull request here.

wonzhq commented 6 days ago
If we check ObjectContext::RWState::waiters and ObjectContext::RWState::count for pending requests on this object, there is still a window in which the problem can happen: after the promotion replication and the requeuing of the client request, and before the client request is dequeued. Should we loop over OSD::op_wq to check for pending requests on an object? Or add something to the ObjectContext to remember the pending requests? @athanatos @liewegas

liewegas commented 10 hours ago
Hmm, that's true that there is still that window. Is it necessary that this is completely air-tight, though? As long as we avoid evicting a newly-promoted object before the request is processed we will win. I'm afraid that a complicated mechanism to cover this could introduce more complexity than we need.

wonzhq commented 2 minutes ago
I tried using ObjectContext::RWState::count to check for pending requests. In my testing, it hit the slow request just once. I checked the log, and it falls exactly into the window we talked about above. So with this solution it's possible that we still hit this issue, but much less often than before. Should we go ahead with this solution?

Sage/Sam, what do you think?
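
The remaining window can be illustrated with a toy model. The field names count and waiters mirror the ones mentioned in the pull request conversation, but their exact semantics in Ceph are an assumption here, and ToyObject, can_evict and requeue_waiters are hypothetical helpers, not Ceph functions.

```python
from collections import deque


class RWState:
    def __init__(self):
        self.count = 0            # assumed: ops currently holding the lock
        self.waiters = deque()    # assumed: ops parked on this object


class ToyObject:
    def __init__(self):
        self.rwstate = RWState()


def can_evict(obj):
    # Defer eviction while anything holds or waits on the object.
    return obj.rwstate.count == 0 and not obj.rwstate.waiters


def requeue_waiters(obj, op_queue):
    # After the promotion is replicated, parked ops go back to the shared queue.
    while obj.rwstate.waiters:
        op_queue.append(obj.rwstate.waiters.popleft())
```

Between requeue_waiters() and the moment the op is dequeued and takes the lock, neither count nor waiters shows the pending request, so can_evict() returns True. That is the window.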

Chen, Xiaoxi

2014-09-09 08:45:22 UTC

Can we set cache_min_evict_age to a reasonably large value (say 5 or 10 minutes) to work around the window? If a request cannot finish within minutes, that indicates some other issue in the cluster.
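
The idea amounts to an age check before eviction, sketched below. This is a minimal sketch with hypothetical names: old_enough_to_evict and its parameters are illustrative, and it is an assumption that the age is measured from the time of promotion, which may not match how Ceph evaluates cache_min_evict_age.

```python
def old_enough_to_evict(promoted_at, now, min_evict_age):
    """True once the object has been in the cache for at least min_evict_age seconds."""
    return (now - promoted_at) >= min_evict_age


# An object promoted 31 seconds ago (about as long as the slow request in the
# log was blocked) is still protected by a 5 minute minimum age.
FIVE_MINUTES = 300.0
protected = not old_enough_to_evict(promoted_at=1000.0, now=1031.0,
                                    min_evict_age=FIVE_MINUTES)
```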

Sage Weil

2014-09-09 16:34:33 UTC

I think it is definitely worth adding that check, even if it doesn't catch
the requeue case, because it is still useful to defer eviction if there is
a request queued for that object. That seems true at least in the
writeback cache mode... perhaps not so in other modes like forward.

I'm still not sure what would close the hole reliably. Perhaps a flag on
the obc indicating whether any request has touched it since the initial
promote? Maybe that, coupled with a time limit (so that eventually we can
still evict in case the original request never gets processed... e.g.
because the client disconnected before it was requeued or something).

?

sage
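
A flag combined with a time limit could look roughly like the sketch below. All names are hypothetical (PromotedObject, touched_since_promote, grace_seconds), and where such state would live in Ceph is left open; the sketch only shows the decision logic.

```python
class PromotedObject:
    def __init__(self, promoted_at):
        self.promoted_at = promoted_at
        self.touched_since_promote = False   # hypothetical flag on the obc

    def on_request(self):
        # Any request that reaches the object after the promote sets the flag.
        self.touched_since_promote = True


def can_evict_after_promote(obj, now, grace_seconds):
    if obj.touched_since_promote:
        return True                          # the promote has paid off
    # Time limit: still allow eviction if the original request never
    # arrives, e.g. because the client disconnected before the requeue.
    return (now - obj.promoted_at) >= grace_seconds
```

The time limit keeps an object from being pinned in the cache forever by a request that never comes back.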

Wang, Zhiqiang

2014-09-10 01:13:38 UTC

Post by Wang, Zhiqiang
-----Original Message-----
Sent: Wednesday, September 10, 2014 12:35 AM
To: Wang, Zhiqiang
Subject: RE: Cache tiering slow request issue: currently waiting for rw locks

completely air-tight, though? As long as we avoid evicting a newly-promoted
object before the request is processed we will win. I'm afraid that a
complicated mechanism to cover this could introduce more complexity than we
need.

Post by Wang, Zhiqiang
wonzhq commented 2 minutes ago
Tried to use ObjectContext::RWState::count to check for pending requests. In
my testing, it hit the slow request just once. I checked the log, and it falls
exactly into the window we talked about above. So with this solution it's still
possible to hit this issue, but much less often than before. Should we go ahead
with this solution?

Post by Wang, Zhiqiang
Sage/Sam, what do you think?

OK, I'll add this check first. But why don't we want this check in forward mode?

Post by Wang, Zhiqiang
I'm still not sure what would close the hole reliably. Perhaps a flag on the obc
indicating whether any request has touched it since the initial promote?
Maybe that, coupled with a time limit (so that eventually we can still evict in
case the original request never gets processed... e.g.
because the client disconnected before it was requeued or something).
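
The flag-plus-time-limit idea above could be sketched roughly as follows. This is a toy model only, not Ceph code: `PromoteGuard`, `on_promote`, `on_request`, `may_evict` and the grace period are hypothetical names made up for illustration, and time is passed in explicitly so the behaviour is easy to check.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: remembers objects that were promoted but not yet
// touched by any request, and refuses eviction until either a request
// arrives or a grace period expires.
class PromoteGuard {
public:
  explicit PromoteGuard(std::chrono::seconds grace) : grace_(grace) {}

  // Record the time at which the object was promoted.
  void on_promote(const std::string& oid, Clock::time_point now) {
    pending_[oid] = now;
  }

  // A request touched the object, so it no longer needs protection.
  void on_request(const std::string& oid) { pending_.erase(oid); }

  // Eviction is allowed if the object is not pending, or if it has been
  // pending for at least the grace period (e.g. the client went away).
  bool may_evict(const std::string& oid, Clock::time_point now) {
    auto it = pending_.find(oid);
    if (it == pending_.end()) return true;
    if (now - it->second >= grace_) {
      pending_.erase(it);  // give up waiting for the original request
      return true;
    }
    return false;
  }

private:
  std::chrono::seconds grace_;
  std::map<std::string, Clock::time_point> pending_;
};
```

The time limit is what keeps an object from being pinned forever if the original request is never requeued; the right value for it is an open question in this sketch.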

I tried adding a flag to the obc before, but for some reason it didn't work well. I set
the flag at the initial promote, but when checking it later, after the promotion,
the flag is sometimes not set. I haven't figured out the reason for this yet. My guess is
that we don't hold every obc in 'object_contexts'. An obc is removed from it under
some conditions (e.g., when it reaches its size limit). So when an obc is removed and the flag
is not persisted, we lose the flag on the next 'get_object_context'. Is this true?
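
To illustrate the suspected failure mode, here is a toy model of a size-limited context cache. It is not Ceph's 'object_contexts' implementation: `ToyObc`, `ToyObcCache` and the plain LRU policy are assumptions made up for this sketch. It only shows that an in-memory flag disappears once its entry is dropped and later recreated.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical in-memory object context with a non-persistent flag.
struct ToyObc {
  bool promoting = false;
};

// Toy LRU cache of contexts; capacity is assumed to be at least 1.
class ToyObcCache {
public:
  explicit ToyObcCache(std::size_t cap) : cap_(cap) {}

  // Return the cached context, or create a fresh one on a miss.
  ToyObc& get(const std::string& oid) {
    auto it = index_.find(oid);
    if (it != index_.end()) {
      // Hit: move the entry to the front (most recently used).
      lru_.splice(lru_.begin(), lru_, it->second);
      return it->second->second;
    }
    // Miss: a brand-new context, so any earlier flag value is gone.
    lru_.emplace_front(oid, ToyObc{});
    index_[oid] = lru_.begin();
    if (lru_.size() > cap_) {
      // Over the size limit: drop the least recently used entry.
      index_.erase(lru_.back().first);
      lru_.pop_back();
    }
    return lru_.front().second;
  }

private:
  using Entry = std::pair<std::string, ToyObc>;
  std::size_t cap_;
  std::list<Entry> lru_;
  std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

With a capacity of 2, setting `promoting` on one object and then touching two other objects pushes the first entry out; the next `get` for it returns a context whose flag is back to false.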

Post by Wang, Zhiqiang
?
sage

https://github.com/ceph/ceph/pull/2374. Can you take a look? Thx.

Post by Wang, Zhiqiang
-----Original Message-----
Sent: Tuesday, September 2, 2014 2:54 PM
To: Sage Weil
Subject: RE: Cache tiering slow request issue: currently waiting for rw locks
Tried the pull request; checking whether the object is blocked doesn't work.
Actually, this check is already done in the function agent_work.

Post by Wang, Zhiqiang
I tried to make a fix to add a field/flag to the object context. This is not a good
1) If this field/flag is made persistent, then resetting/clearing the flag
means we need to persist that change too. This is not good for read requests.
2) If this field/flag is not persistent, then when the object context is
removed from the cache 'object_contexts', the field/flag is removed as well.
The object is then removed by a later eviction. The same issue still exists.

Post by Wang, Zhiqiang
So, I came up with a fix that adds a set to the class ReplicatedPG to hold all the
objects being promoted. This fix is at https://github.com/ceph/ceph/pull/2374. It is
tested and works well. Please review and comment, thanks.
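
As a rough illustration of the set-based approach described above (this is not the code in the pull request; `ToyPG`, `start_promote`, `finish_promote` and `may_evict` are hypothetical names), the set lives on the placement group rather than on the object context, so it is unaffected when a context is dropped from the context cache:

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for a placement group; not Ceph's ReplicatedPG.
class ToyPG {
public:
  // Called once the data has been copied from the base tier and
  // replication to the other cache tier OSDs is about to start.
  void start_promote(const std::string& oid) { promoting_.insert(oid); }

  // Called once the original client request has been restarted.
  void finish_promote(const std::string& oid) { promoting_.erase(oid); }

  // Check for the eviction agent: skip objects that are mid-promotion.
  bool may_evict(const std::string& oid) const {
    return promoting_.count(oid) == 0;
  }

private:
  std::set<std::string> promoting_;  // objects currently being promoted
};
```

In this sketch an eviction pass would call `may_evict` for each candidate and simply move on to the next object when it returns false.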

Post by Wang, Zhiqiang
-----Original Message-----
Sent: Monday, September 1, 2014 9:33 AM
To: Sage Weil
Subject: RE: Cache tiering slow request issue: currently waiting for rw locks
I don't think the object context is blocked at that time. It is unblocked after
the data is copied from the base tier. It doesn't address the problem here.
Anyway, I'll try it and see.

Post by Wang, Zhiqiang
-----Original Message-----
Sent: Saturday, August 30, 2014 10:29 AM
To: Wang, Zhiqiang
Subject: Re: Cache tiering slow request issue: currently waiting for rw locks
Hi,
Can you take a look at https://github.com/ceph/ceph/pull/2363 and see if
that addresses the behavior you saw?
Thanks!
sage



Wang, Zhiqiang

2014-09-16 08:38:56 UTC