Discussion:
10/7/2014 Weekly Ceph Performance Meeting
Mark Nelson
2014-10-08 00:51:21 UTC
Hi All,

Just a reminder that the weekly performance meeting is on Wednesdays at
8AM PST. Same bat time, same bat channel!

Etherpad URL:
http://pad.ceph.com/p/performance_weekly

To join the Meeting:
https://bluejeans.com/268261044

To join via Browser:
https://bluejeans.com/268261044/browser

To join with Lync:
https://bluejeans.com/268261044/lync


To join via Room System:
Video Conferencing System: bjn.vc -or- 199.48.152.152
Meeting ID: 268261044

To join via Phone:
1) Dial:
+1 408 740 7256
+1 888 240 2560(US Toll Free)
+1 408 317 9253(Alternate Number)
(see all numbers - http://bluejeans.com/numbers)
2) Enter Conference ID: 268261044

Mark
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Andreas Bluemle
2014-10-08 16:32:34 UTC
Hi,

As mentioned during today's meeting, here are the kernel
boot parameters that I found to provide the basis for
good performance results:

processor.max_cstate=0
intel_idle.max_cstate=0

I understand these to basically turn off any power-saving modes
of the CPU; the CPUs we are using are models such as
Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz
Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz

At the BIOS level, we
- turn off Hyper-Threading
- turn off Turbo mode (in order to stay within the specifications)
- turn on frequency floor override

We also assert that
/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
is set to "performance"

Using the above, we see a constant frequency at the maximum level
allowed by the CPU (excluding Turbo mode).
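
A minimal sketch, in Python, of how the settings described above might be checked on a running Linux host. It assumes the kernel command line is readable from /proc/cmdline and that the governor files live under the sysfs path quoted above. The function names (parse_cmdline, missing_params, governors) are hypothetical helpers, not part of any Ceph or kernel tooling, and the command-line parser is deliberately naive: it splits on whitespace and ignores quoted values.

```python
from pathlib import Path

# Boot parameters and governor value quoted in the message above.
WANTED_PARAMS = {"processor.max_cstate": "0", "intel_idle.max_cstate": "0"}
WANTED_GOVERNOR = "performance"


def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to ''."""
    params = {}
    for token in cmdline.split():
        name, _, value = token.partition("=")
        params[name] = value
    return params


def missing_params(cmdline, wanted=WANTED_PARAMS):
    """Return the wanted parameters that are absent or set to another value."""
    seen = parse_cmdline(cmdline)
    return sorted(name for name, value in wanted.items() if seen.get(name) != value)


def governors(cpu_root=Path("/sys/devices/system/cpu")):
    """Map each cpuN that exposes cpufreq to its current scaling governor."""
    found = {}
    for path in sorted(cpu_root.glob("cpu[0-9]*/cpufreq/scaling_governor")):
        found[path.parent.parent.name] = path.read_text().strip()
    return found


if __name__ == "__main__":
    cmdline_file = Path("/proc/cmdline")
    if cmdline_file.exists():
        print("missing boot parameters:", missing_params(cmdline_file.read_text()))
    wrong = {cpu: gov for cpu, gov in governors().items() if gov != WANTED_GOVERNOR}
    print("CPUs not using the performance governor:", wrong)
```

An empty list and an empty dict in the output would mean both conditions hold. Note that an empty governor result can also mean that no CPU exposes a cpufreq directory at all.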


Best Regards

Andreas Bluemle

--
Andreas Bluemle mailto:***@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910

Company details: http://www.itxperts.de/imprint.htm
Somnath Roy
2014-10-08 17:38:26 UTC
Thanks, Andreas, for sharing this. I will try those out.
BTW, I am using Ubuntu 14.04 LTS and couldn't find any sysfs entry like 'cpufreq':

***@stormeap-4:~# ll /sys/devices/system/cpu/cpu10/
cache/ crash_notes driver/ microcode/ online subsystem/ topology/
cpuidle/ crash_notes_size firmware_node/ node0/ power/ thermal_throttle/ uevent

I am using Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz.
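
One way to confirm the observation above across every CPU, rather than just cpu10, is a small script that lists the cpuN directories with no cpufreq entry. This is only a sketch: scan_cpu_dirs and cpus_without_cpufreq are hypothetical names, and the script reports which entries are missing without explaining why they are absent.

```python
import re
from pathlib import Path

CPU_DIR = re.compile(r"^cpu(\d+)$")


def scan_cpu_dirs(cpu_root=Path("/sys/devices/system/cpu")):
    """Map each cpuN directory name to the set of entry names inside it."""
    if not cpu_root.is_dir():
        return {}
    return {
        d.name: {entry.name for entry in d.iterdir()}
        for d in cpu_root.iterdir()
        if d.is_dir() and CPU_DIR.match(d.name)
    }


def cpus_without_cpufreq(listing):
    """Return the CPUs whose directory has no 'cpufreq' entry, in numeric order."""
    missing = [cpu for cpu, entries in listing.items() if "cpufreq" not in entries]
    return sorted(missing, key=lambda name: int(CPU_DIR.match(name).group(1)))


if __name__ == "__main__":
    print("CPUs without a cpufreq entry:", cpus_without_cpufreq(scan_cpu_dirs()))
```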

Regards
Somnath

________________________________

PLEASE NOTE: The information contained in this electronic mail message is intended only for the use of the designated recipient(s) named above. If the reader of this message is not the intended recipient, you are hereby notified that you have received this message in error and that any review, dissemination, distribution, or copying of this message is strictly prohibited. If you have received this communication in error, please notify the sender by telephone or e-mail (as shown above) immediately and destroy any and all copies of this message in your possession (whether hard copies or electronically stored copies).

Duan, Jiangang
2014-10-08 17:47:10 UTC
Can you guys share the w/ HT and w/o HT data? I want to take a look at that to understand why.

-jiangang

Somnath Roy
2014-10-08 17:53:16 UTC
Hi Jiangang,
Give me a day or two; I will gather all the data and share it with the community.

Thanks & Regards
Somnath

Duan, Jiangang
2014-10-08 20:03:23 UTC
Sounds good. Thanks. -jiangang

Somnath Roy
2014-10-09 00:50:52 UTC
Hi Jiangang,
I managed to get some data for you, but it's for a 3-node cluster. I will try to get data for a single node as well.

Test config:
-------------

Cluster and RBD node config:
----------------------------------
2x E5-2680 10C 2.8GHz 25M
8x 16GB RDIMM, dual rank x4 (128GB)
Mellanox MT27500 40 Gigabit Ethernet
LSI 9207 SAS HBA

8 X 800 GB SSDs (Optimus Eco) per cluster node

3 cluster nodes + 3 rbd nodes

Total storage ~ 19 TB

We have a total of 24 OSDs running; each node has 8 OSDs, one per SSD.

Configured 3 pools with 528 PGs/pool and 6 RBDs/pool. Each RBD image size is ~230G.

We ran the 64K_RR_QD64 workload here.

HT_ENABLE
--------------

IOPS: 112500
Throughput (MB/s): 7012
Avg Resp. Time (ms): 17
Max Resp. Time (ms): 3184

HT_DISABLE
--------------

IOPS: 120864
Throughput (MB/s): 7530
Avg Resp. Time (ms): 11
Max Resp. Time (ms): 1056


So, a ~7% IOPS increase, and the average response time drops by ~35%, which is really good.
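
The percentages can be re-derived from the two result blocks above with a few lines of Python. This is just the arithmetic, as a sketch: pct_change and the dictionary keys are names chosen here for illustration, and the numbers are copied from the HT_ENABLE and HT_DISABLE results.

```python
def pct_change(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


# Figures copied from the two result blocks above.
HT_ENABLE = {"iops": 112500, "throughput_mb_s": 7012, "avg_resp_ms": 17, "max_resp_ms": 3184}
HT_DISABLE = {"iops": 120864, "throughput_mb_s": 7530, "avg_resp_ms": 11, "max_resp_ms": 1056}

for metric in HT_ENABLE:
    change = pct_change(HT_ENABLE[metric], HT_DISABLE[metric])
    print(f"{metric}: {change:+.1f}%")
```

With these inputs the script prints +7.4% for both IOPS and throughput, -35.3% for average response time, and -66.8% for maximum response time.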

Thanks & Regards
Somnath

Mark Nelson
2014-10-09 01:07:33 UTC
Hi Somnath,

Was this with HT enabled/disabled on both the cluster and the RBD nodes?

Mark
Somnath Roy
2014-10-09 06:45:01 UTC
Yes, Mark...

-----Original Message-----
From: Mark Nelson [mailto:***@inktank.com]
Sent: Wednesday, October 08, 2014 6:08 PM
To: Somnath Roy; Duan, Jiangang; Andreas Bluemle; ceph-***@vger.kernel.org
Subject: Re: 10/7/2014 Weekly Ceph Performance Meeting: kernel boot params

Hi Somnath,

Was this with HT enabled/disabled on both the cluster and the RBD nodes?

Mark
Post by Somnath Roy
Hi Jiangang,
I managed to get some data for you but it's for a 3 node cluster. I will try to get data for single node as well.
-------------
----------------------------------
"2x E5-2680 10C 2.8GHz 25M
8x 16GB RDIMM, dual rank x4 (128GB)
Mellanox MT27500 40 Gigabit Ethernet
LSI 9207 SAS HBA"
8 X 800 GB SSDs (Optimus Eco) per cluster node
3 cluster nodes + 3 rbd nodes
Total storage ~ 19 TB
We have total 24 OSDs running , each node has 8 OSDs/SSD
Configured 3 pools with 528 PGs/pool and 6 RBDs/pool . Each RBD image size is ~230G.
We have tried on 64K_RR_QD64 workload here.
HT_ENABLE
--------------
IOPS : 112500
Throughput (MB/S): 7012
Avg Resp.Time (m.sec): 17
Max Resp.Time (m.sec): 3184
HT_DISABLE
--------------
IOPS : 120864
Throughput (MB/S): 7530
Avg Resp.Time (m.sec): 11
Max Resp.Time (m.sec): 1056
So, ~7% iop increase but response time decrease is ~35% which is real good.
Thanks & Regards
Somnath
-----Original Message-----
Sent: Wednesday, October 08, 2014 1:03 PM
Subject: RE: 10/7/2014 Weekly Ceph Performance Meeting: kernel boot params
Sound good. Thanks. -jiangang
-----Original Message-----
Sent: Wednesday, October 08, 2014 10:53 AM
Subject: RE: 10/7/2014 Weekly Ceph Performance Meeting: kernel boot params
Hi Jiangang,
Give me a day or two, I will gather all the data and share with community.
Thanks & Regards
Somnath
-----Original Message-----
Sent: Wednesday, October 08, 2014 10:47 AM
Subject: RE: 10/7/2014 Weekly Ceph Performance Meeting: kernel boot params
Can you guys share the w/ HT and w/o HT data? I want to take a look at that to understand why.
-jiangang
-----Original Message-----
Sent: Wednesday, October 08, 2014 10:38 AM
Subject: RE: 10/7/2014 Weekly Ceph Performance Meeting: kernel boot params
Thanks, Andreas, for sharing this. I will try those out.
BTW, I am using Ubuntu 14.04 LTS and couldn't find any sysfs entry like 'cpufreq':
***@stormeap-4:~# ll /sys/devices/system/cpu/cpu10/
cache/ crash_notes driver/ microcode/ online subsystem/ topology/
cpuidle/ crash_notes_size firmware_node/ node0/ power/ thermal_throttle/ uevent
I am using Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz.
Regards
Somnath
Duan, Jiangang
2014-10-10 23:39:07 UTC
Permalink
Thanks. Let's try to do this test on our setup.
BTW, what workload did you use here?

-jiangang

Somnath Roy
2014-10-10 23:43:06 UTC
Permalink
As I mentioned, the total workload is ~19 TB. Each RBD is ~230 GB and io_size = 64K.

Loic Dachary
2014-10-08 17:57:56 UTC
Permalink
Hi,
Post by Somnath Roy
Thanks Andres for sharing this. I will try those out.
BTW, I am using Ubuntu 14.04 LTS and couldn't find any sysfs entry like 'cpufreq'..
cache/ crash_notes driver/ microcode/ online subsystem/ topology/
cpuidle/ crash_notes_size firmware_node/ node0/ power/ thermal_throttle/ uevent
-rw-r--r-- 1 root root 4096 oct. 8 17:31 /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
-rw-r--r-- 1 root root 4096 oct. 8 17:31 /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
-rw-r--r-- 1 root root 4096 oct. 8 17:31 /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
-rw-r--r-- 1 root root 4096 oct. 8 17:31 /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
-rw-r--r-- 1 root root 4096 oct. 8 17:31 /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
-rw-r--r-- 1 root root 4096 oct. 8 17:31 /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
-rw-r--r-- 1 root root 4096 oct. 8 17:31 /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
-rw-r--r-- 1 root root 4096 oct. 8 17:31 /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor

model name : Intel(R) Core(TM) i7-4900MQ CPU @ 2.80GHz

$ lsb_release -d
Description: Ubuntu Trusty Tahr (development branch)
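
[Editor's note] The two listings above differ only in whether a cpufreq directory exists under each cpuN entry. A minimal sketch for checking this on a given host follows; the sysfs path is the one quoted in this thread, while the function name read_governors and the choice to report a missing or unreadable entry as None are assumptions for illustration.

```python
import glob
import os

CPU_ROOT = "/sys/devices/system/cpu"  # sysfs location quoted in this thread


def read_governors(cpu_root=CPU_ROOT):
    """Map each cpuN directory to its scaling governor, or None if it cannot be read."""
    result = {}
    # "cpu[0-9]*" matches cpu0, cpu1, ... but not sibling entries such as cpuidle.
    for cpu_dir in sorted(glob.glob(os.path.join(cpu_root, "cpu[0-9]*"))):
        path = os.path.join(cpu_dir, "cpufreq", "scaling_governor")
        try:
            with open(path) as handle:
                result[os.path.basename(cpu_dir)] = handle.read().strip()
        except OSError:
            # cpufreq entry missing or unreadable (the situation Somnath describes).
            result[os.path.basename(cpu_dir)] = None
    return result


if __name__ == "__main__":
    for cpu, governor in read_governors().items():
        print(cpu, governor if governor is not None else "(no cpufreq entry)")
```

A host set up the way Andreas describes would be expected to report "performance" for every CPU; a host like Somnath's would report no cpufreq entry at all.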

Cheers
Loïc Dachary, Artisan Logiciel Libre
Alexandre DERUMIER
2014-10-08 18:07:45 UTC
Permalink
Hi,
Post by Somnath Roy
BTW, I am using Ubuntu 14.04 LTS and couldn't find any sysfs entry like 'cpufreq'..

Check this Arch wiki page about the kernel modules needed:
https://wiki.archlinux.org/index.php/CPU_frequency_scaling



Also note that all of this tuning can normally be done at the BIOS level.
(On recent Dell server BIOSes, setting the power profile to max performance sets the governor to max and disables all C-states.)

I always do this on my KVM hypervisor hosts.


There is also the C1E option to disable on AMD processors.
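
[Editor's note] Whether the C-state limits are applied through the BIOS or through kernel boot parameters, it helps to confirm what the running kernel was actually booted with. The sketch below parses a kernel command line for the two parameters named in this thread. The function names and the sample string are hypothetical; on a live system the input would be the contents of /proc/cmdline. Quoted values containing spaces are not handled by this simple split.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def cstate_limits(cmdline):
    """Return the two C-state limits discussed in this thread, or None if not set."""
    params = parse_cmdline(cmdline)
    return {
        name: params.get(name)
        for name in ("processor.max_cstate", "intel_idle.max_cstate")
    }


# Hypothetical command line for illustration; read /proc/cmdline on a real host.
sample = "root=/dev/sda1 ro quiet processor.max_cstate=0 intel_idle.max_cstate=0"
print(cstate_limits(sample))
```

If both values come back as None, the limits were not set at boot, and any C-state restriction would have to come from the BIOS settings instead.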


----- Original Message -----

From: "Somnath Roy" <***@sandisk.com>
To: "Andreas Bluemle" <***@itxperts.de>, ceph-***@vger.kernel.org
Sent: Wednesday, 8 October 2014 19:38:26
Subject: RE: 10/7/2014 Weekly Ceph Performance Meeting: kernel boot params

Thanks, Andreas, for sharing this. I will try those out.
BTW, I am using Ubuntu 14.04 LTS and couldn't find any sysfs entry like 'cpufreq'..

***@stormeap-4:~# ll /sys/devices/system/cpu/cpu10/
cache/ crash_notes driver/ microcode/ online subsystem/ topology/
cpuidle/ crash_notes_size firmware_node/ node0/ power/ thermal_throttle/ uevent

I am using Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz.

Regards
Somnath

Stefan Priebe
2014-10-08 18:35:53 UTC
Permalink
Post by Andreas Bluemle
Hi,
as mentioned during today's meeting, here are the kernel boot parameters which I found to provide the basis for good performance results:
processor.max_cstate=0
intel_idle.max_cstate=0
I understand these to basically turn off any power saving modes of the CPU; the CPUs we are using are like
Intel(R) Xeon(R) CPU E5-2640 v2 @ 2.00GHz
Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz
At the BIOS level, we
- turn off Hyper-Threading
- turn off Turbo mode (in order not to leave the specifications)
- turn on frequency floor override
We also assert that
/sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
is set to "performance"
Using the above, we see a constant frequency at the maximum level allowed by the CPU (excluding Turbo mode).
How much performance do we gain by this? Till now I thought it's just
1-3%, so I'm still running the ondemand governor plus power savings.

Greets,
Stefan
Paul Von-Stamwitz
2014-10-08 23:55:38 UTC
Permalink
Post by Stefan Priebe
How much performance do we gain by this? Till now I thought it's just 1-3%, so
I'm still running the ondemand governor plus power savings.
As always, it depends. I saw noticeable increases in some throughput tests (though I can't recall the % gain.) More important to me was that it made my fio results much more consistent. As we measure improvements, these settings remove some of the "system noise".

Best,
Paul
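
[Editor's note] One way to put a number on "more consistent" results is the coefficient of variation (standard deviation divided by the mean) across repeated runs of the same fio job. This is only a sketch of that idea, not the method Paul used. The sample values below are placeholders chosen for illustration and are NOT measurements from this thread.

```python
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean; lower means more repeatable runs."""
    mean = statistics.mean(samples)
    return statistics.stdev(samples) / mean


# Placeholder per-run results, for illustration only (not data from this thread).
runs_default = [100.0, 90.0, 110.0, 95.0, 105.0]
runs_tuned = [100.0, 99.0, 101.0, 100.0, 100.0]

print(f"default settings: CV = {coefficient_of_variation(runs_default):.3f}")
print(f"tuned settings:   CV = {coefficient_of_variation(runs_tuned):.3f}")
```

Comparing the two values before and after a tuning change shows whether run-to-run noise went down, independently of whether the mean improved.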
Andreas Bluemle
2014-10-14 11:22:42 UTC
Permalink
Hi,


On Wed, 8 Oct 2014 16:55:38 -0700
Post by Paul Von-Stamwitz
Post by Stefan Priebe
How much performance do we gain by this? Till now I thought it's
just 1-3%, so I'm still running the ondemand governor plus power savings.
As always, it depends. I saw noticeable increases in some throughput
tests (though I can't recall the % gain.) More important to me was
that it made my fio results much more consistent. As we measure
improvements, these settings remove some of the "system noise".
Best,
Paul
There were two different aspects which showed improvement:
- code was executed faster
- thread switching delays were reduced significantly

See the attached graphics. They show processing of a 4 kB write
request: processing at the Pipe::Reader is roughly 200 us in both
pictures, and something like 20 us at the OSD::Dispatcher. So there
is not much of a benefit here.

But the delay between the end of the Pipe::Reader and the start
of the OSD::Dispatcher threads was reduced significantly.
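
For anyone wanting to confirm that a node is actually running with the settings quoted above, here is a minimal Python sketch. It is not something used in these tests. It only reads the scaling_governor files and the kernel command line; the function names and the optional sysfs_root argument are made up for illustration.

```python
import glob
from pathlib import Path


def read_governors(sysfs_root="/sys/devices/system/cpu"):
    """Map each cpuN directory name to its current cpufreq governor."""
    governors = {}
    pattern = f"{sysfs_root}/cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(glob.glob(pattern)):
        cpu = Path(path).parent.parent.name  # .../cpuN/cpufreq/scaling_governor
        governors[cpu] = Path(path).read_text().strip()
    return governors


def cstate_params(cmdline):
    """Extract the *.max_cstate parameters from a kernel command line string."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key.endswith("max_cstate"):
            params[key] = value
    return params


if __name__ == "__main__":
    governors = read_governors()
    wrong = {cpu: gov for cpu, gov in governors.items() if gov != "performance"}
    print("CPUs not on the performance governor:", wrong or "none")
    try:
        cmdline = Path("/proc/cmdline").read_text()
    except OSError:
        cmdline = ""  # not on Linux, or /proc not mounted
    print("max_cstate boot parameters:", cstate_params(cmdline))
```

On a machine booted with the two parameters above, the second line should list both processor.max_cstate and intel_idle.max_cstate with value 0. The BIOS settings cannot be checked this way.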


(And sorry for the late response)
--
Andreas Bluemle mailto:***@itxperts.de
ITXperts GmbH http://www.itxperts.de
Balanstrasse 73, Geb. 08 Phone: (+49) 89 89044917
D-81541 Muenchen (Germany) Fax: (+49) 89 89044910

Company details: http://www.itxperts.de/imprint.htm
Sage Weil
2014-10-14 13:13:58 UTC
Permalink
Post by Andreas Bluemle
There were two different aspects which showed improvement:
- code was executed faster
- thread switching delays were reduced significantly
See the attached graphics. They show processing of a 4 kB write
request: processing at the Pipe::Reader is roughly 200 us in both
pictures, and something like 20 us at the OSD::Dispatcher. So there
is not much of a benefit here.
But the delay between the end of the Pipe::Reader and the start
of the OSD::Dispatcher threads was reduced significantly.
This test had a single outstanding IO, right? The question for me is whether
this reflects the latencies we'd see under a realistic workload, where there
are more IOs in flight and the CPUs aren't likely to be in low power states.
I'm not sure how low the load needs to be before those states kick in and
these latencies start to appear...
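
One way to approach that question empirically is to snapshot the kernel's cpuidle counters before and after a run at a given queue depth and see how much time the CPUs spent in each idle state. The following is only a sketch under that assumption, not a method anyone in this thread has used. It relies on the standard cpuidle sysfs files (name, time in microseconds, usage); the function names and the one-second sleep standing in for the benchmark are illustrative.

```python
import glob
import time
from pathlib import Path


def idle_residency(sysfs_root="/sys/devices/system/cpu"):
    """Sum cpuidle time (microseconds) and entry counts per idle state name."""
    totals = {}
    pattern = f"{sysfs_root}/cpu[0-9]*/cpuidle/state[0-9]*"
    for state_dir in glob.glob(pattern):
        d = Path(state_dir)
        name = (d / "name").read_text().strip()
        entry = totals.setdefault(name, {"time_us": 0, "usage": 0})
        entry["time_us"] += int((d / "time").read_text())
        entry["usage"] += int((d / "usage").read_text())
    return totals


def residency_delta(before, after):
    """Per-state difference between two idle_residency() snapshots."""
    delta = {}
    for name, now in after.items():
        old = before.get(name, {"time_us": 0, "usage": 0})
        delta[name] = {
            "time_us": now["time_us"] - old["time_us"],
            "usage": now["usage"] - old["usage"],
        }
    return delta


if __name__ == "__main__":
    before = idle_residency()
    time.sleep(1)  # stand-in for the benchmark interval
    after = idle_residency()
    for name, d in sorted(residency_delta(before, after).items()):
        print(f"{name}: {d['time_us']} us idle, {d['usage']} entries")
```

Repeating this at increasing numbers of IOs in flight would show at which load the deeper states stop being entered on a particular machine.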

sage
Andreas Bluemle
2014-10-14 14:38:06 UTC
Permalink
Hi Sage,

[embedded below]

On Tue, 14 Oct 2014 06:13:58 -0700 (PDT)
Post by Sage Weil
This test had a single outstanding IO, right? The question for me is
whether this reflects the latencies we'd see under a realistic workload,
where there are more IOs in flight and the CPUs aren't likely to be in low
power states. I'm not sure how low the load needs to be before those
states kick in and these latencies start to appear...
sage
Yes and no...

Yes: the test was a fio sequential write, 4k per write, with a
single IO in flight.

No: this means that on a given object in the osd file store with the
default size of 4 MByte, 1024 subsequent write requests will hit that
object - and hence the corresponding ceph-osd daemon. So even though
the system as a whole was not very busy, the ceph-osd daemon assigned
to the file object under pressure was fairly busy.
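
The 1024 figure is just the object size divided by the write size. A small Python sketch of that arithmetic, assuming the simplest layout in which a sequential stream fills one 4 MByte object before moving on to the next (the constants and function names are illustrative):

```python
OBJECT_SIZE = 4 * 1024 * 1024  # 4 MByte default object size, as stated above
WRITE_SIZE = 4 * 1024          # 4k per write, as in the fio test


def writes_per_object(object_size=OBJECT_SIZE, write_size=WRITE_SIZE):
    """Number of back-to-back writes needed to fill one object."""
    return object_size // write_size


def object_index(offset, object_size=OBJECT_SIZE):
    """Index of the object a byte offset falls into (one object at a time)."""
    return offset // object_size


if __name__ == "__main__":
    print(writes_per_object())  # 1024
    first_1025 = [object_index(i * WRITE_SIZE) for i in range(1025)]
    print(first_1025.count(0), first_1025[-1])  # 1024 1
```

So the first 1024 sequential writes all land on object 0, and only write number 1025 moves to the next object, which may live on a different ceph-osd.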

The intention of the test was to eliminate additional latencies
because of queues building up.

What the test shows is the contribution of the various processing
steps within ceph-osd to the overall latency for an individual
write request when CPU power state related effects have been
eliminated.
Sage Weil
2014-10-15 02:23:19 UTC
Permalink
Post by Shu, Xinxin
Hi all, recently we tested 4K random write performance on our all-SSD
setup (12 x Intel DC3700), but peak performance is ~23K IOPS, which is
much lower than the hardware capability. With a detailed latency breakdown,
we found that most of the latency comes from the osd queue. We have noticed
the optimizations on the osd queue and tried latest master on our setup,
but there is a performance regression. We also checked the qlock and pg
lock with perf counters; the waiting count and latency are very small.
The attached pdf shows the details. Any suggestions will be appreciated.
I would start by making sure 'osd enable op tracker = false' if it isn't
already.
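
Since the option is written in this thread both with spaces and with underscores, here is a small Python sketch for checking a config snippet regardless of spelling. It assumes Ceph treats spaces and underscores (and, as far as I know, dashes) in option names as equivalent, and that the setting sits in a [global] or [osd] section. It uses Python's configparser as a stand-in for Ceph's own parser, so a real ceph.conf may contain constructs it rejects; the function names are illustrative.

```python
import configparser


def normalize_option(name):
    """Fold spaces, dashes and underscores in an option name to underscores."""
    return "_".join(name.lower().replace("-", " ").replace("_", " ").split())


def op_tracker_explicitly_disabled(conf_text, sections=("global", "osd")):
    """True if 'osd enable op tracker' is set to a false value in conf_text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    disabled = False
    for section in sections:
        if not parser.has_section(section):
            continue
        for key, value in parser.items(section):
            if normalize_option(key) == "osd_enable_op_tracker":
                # Later sections in the list override earlier ones.
                disabled = value.strip().lower() in ("false", "0", "no", "off")
    return disabled


if __name__ == "__main__":
    sample = "[osd]\nosd enable op tracker = false\n"
    print(op_tracker_explicitly_disabled(sample))  # True
```

A False result only means the option was not explicitly turned off in the text given; it says nothing about what a running OSD has in effect.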

The other thing to keep in mind is that a lot of the work has enabled
OSD performance to scale as the number of clients increases. It looks like
your test has a single client. Can you try running 2, 4, 8 clients
and see if the per-OSD throughput goes up?

Digging into the code with a tool like VTune would be extremely helpful, I
think. There is a lot of time spent in do_op (osd prepare and osd queue)
that Fujitsu has called out, but we haven't narrowed down where the time is
being spent.

sage
Somnath Roy
2014-10-15 02:43:56 UTC
Permalink
Sage,
I think they are using 7 VMs (and thus 7 librbd clients) for the test.
XinXin,
You are running 2 OSDs per SSD, which is not recommended. I'm not sure whether that has an impact. Along with disabling the op tracker as Sage suggested, you may want to tweak the osd num shards and the number of filestore threads to see if that improves performance.
BTW, each librados client is now ~20% slower (even with rbd_cache = false), and with 7 clients that degradation could add up to something significant. One quick check you can do to factor out the librbd degradation is to use the firefly librbd/librados combination.

Thanks & Regards
Somnath

Shu, Xinxin
2014-10-15 02:59:56 UTC
Permalink
Hi Sage,

With latest master, we do set 'osd_enable_op_tracker = false'. We tested with up to 7 rbd clients, but beyond two clients the IOPS stays at ~23K; there is no performance gain with more clients.
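
To put the ~23K figure in per-OSD terms, as Sage asked, here is a rough Python sketch of the division. It takes the 12 SSDs from the original report and the 2 OSDs per SSD that Somnath mentions. It counts client IOPS only and ignores replication and journal writes, so the actual load on each device is higher than this suggests. The function name is illustrative.

```python
def per_device_iops(total_iops, ssds, osds_per_ssd):
    """Split an aggregate client IOPS figure evenly across OSDs and SSDs."""
    osds = ssds * osds_per_ssd
    return {
        "osds": osds,
        "iops_per_osd": total_iops / osds,
        "iops_per_ssd": total_iops / ssds,
    }


if __name__ == "__main__":
    r = per_device_iops(23000, ssds=12, osds_per_ssd=2)
    print(f"{r['osds']} OSDs, ~{r['iops_per_osd']:.0f} client IOPS per OSD, "
          f"~{r['iops_per_ssd']:.0f} per SSD")
    # prints: 24 OSDs, ~958 client IOPS per OSD, ~1917 per SSD
```

Under these assumptions each OSD sees a bit under 1000 client IOPS.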
