chrisg

Performance

So I had a good chance this weekend to get together with one of my longest-term friends. We have been friends and/or worked together for 10+ years. We talked a great deal about analyzing the performance of systems and the like. This brought back some fun memories of when I/we used to go in and analyze systems for performance, security, etc. So many times the process of appropriately designing for capacity, performance efficiency and long-term supportability is not paid enough attention. It is often a very misunderstood process where the full picture is not taken into account. Perhaps this is why consulting organizations that specialize in this area offer so much value and return on the investment in their services.

All of this also made me think about all the times in the past I have been asked about sizing a K2 solution. The question itself is much harder than it sounds on the surface. Yes, K2 is a software product we make and yes, we do know it intimately. However, one of the tremendous values that K2 brings is its flexibility. A customer can implement a process with different interfaces, different backend integrations, and different amounts of data being captured. It's truly impossible to give a good answer to this question other than "it depends."

Application and system performance is affected by literally everything. I am not even convinced that the moon doesn't have an effect at times. Let's think about some of the items that you need to look at for system performance.

 

  • Amount and speed of memory
  • Bus speed and bandwidth
  • Number and speed of processors
  • Bandwidth between system bus and the storage system
  • Backplane speed of storage system
  • Speed of drives in storage system
  • Read/write cache for Storage system.
  • And the list goes on, but for this I think we covered some of the biggies.

 

It is my personal opinion that the area most often overlooked or misunderstood is storage and IO. In order to get the most out of your application you need to get the most out of your storage. Setting aside all the ancillary items involved in getting data to the disks, I want to explain some things about disk performance in the hope that it helps you understand it better.

Disks themselves are a finite resource; they can only do what they can do. So we need to make sure we put them into a configuration that is based on what we need from them. This gives them the best chance of delivering the data efficiently and "performantly." So what does this mean? It means that we need to dig in and analyze all the pieces of the storage puzzle, such as

  • Disk Size
  • Number of disks
  • Disk rotational speed
  • Disk and/or controller cache
  • R/W cache allocation
  • RAID level
  • RAID stripe size
  • Chunk size
  • Allocation unit size.

If these are out of sync with the application and system profile, performance will be dramatically impacted.

One of the issues that is increasingly affecting system performance today is density. By this I mean that if I plan for 4 TB of space for my database, with today's large-capacity drives I could feasibly get this amount of space with as few as 4 drives. By doing this I dramatically decrease the amount of IO that this storage system can handle, because there are fewer disks to share the work. Next let's dig in and take a look at some of the numbers that go into this.
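To put some rough numbers on the density point, here is a minimal Python sketch. The drive sizes and the 100 IOPS per drive figure are made-up assumptions for illustration, not vendor numbers, and RAID overhead is ignored.

```python
import math

def drives_needed(capacity_gb: int, drive_size_gb: int) -> int:
    """Drives required to reach a raw capacity (RAID overhead ignored)."""
    return math.ceil(capacity_gb / drive_size_gb)

def aggregate_iops(drive_count: int, iops_per_drive: int) -> int:
    """Rough peak IOPS of the set: every spindle contributes its own IOPS."""
    return drive_count * iops_per_drive

# Hypothetical figure, assumed to be the same for both drive sizes.
ASSUMED_IOPS_PER_DRIVE = 100

# The same 4 TB (4000 GB) built from large drives vs. smaller drives.
for drive_size_gb in (1000, 250):
    count = drives_needed(4000, drive_size_gb)
    print(drive_size_gb, "GB drives:", count, "spindles,",
          aggregate_iops(count, ASSUMED_IOPS_PER_DRIVE), "IOPS")
```

Under these assumptions the capacity is identical, but the 16-drive set has four times the IO capability of the 4-drive set.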

Each individual disk has a maximum number of reads and/or writes that it can make in a given amount of time (IOPS, or IOs per second). In addition, it has a maximum amount of data that can be transferred to or from it in a given amount of time (throughput). You can often get these numbers from the hardware vendor. However, be aware that these numbers are often misleading, as they are PEAK performance numbers and the tests used to generate them are geared towards the results the vendor is looking to get. For example, if I want extremely high numbers to show for IOPS, I would want to make the amount of data per IO as small as possible, like 512 bytes. Of course this makes perfect sense, because writing a small amount of data over and over again is easier than writing a large amount, so the number of operations goes up as the data size goes down.

Think about this in terms of an application. What is the application doing more often? This can be either small reads and writes, or large ones. Some examples of IO patterns that require large transfers would be video on demand and media streaming. Small IO could be a web server where the HTML is light and the transfer of large files is not necessary. Oftentimes database-dependent applications fall somewhere in the middle, but this can vary a lot.
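The link between IOPS and throughput is simple arithmetic: data moved per second is operations per second times bytes per operation. The sketch below assumes a hypothetical quoted peak of 150 IOPS and holds it fixed across IO sizes, which is a simplification (a real drive delivers fewer IOPS as the IO size grows). It is only meant to show how little data a big IOPS number can represent when each IO is tiny.

```python
def throughput_mb_per_s(iops: float, io_size_bytes: int) -> float:
    """Data moved per second = operations per second * bytes per operation."""
    return iops * io_size_bytes / (1024 * 1024)

# Hypothetical quoted peak, not a real vendor figure.
ASSUMED_IOPS = 150

for io_size in (512, 8 * 1024, 64 * 1024):
    mb_s = throughput_mb_per_s(ASSUMED_IOPS, io_size)
    print(io_size, "bytes per IO ->", round(mb_s, 3), "MB/s")
```

At 512 bytes per IO the assumed drive moves well under 0.1 MB/s, so always ask what IO size a quoted IOPS figure was measured at.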

So if each disk can handle a certain amount of data, what happens when you break that up and "stripe" it across more than one disk? Well, you get RAID. RAID (Redundant Array of Independent Drives) not only helps with availability but also helps with performance, because I am splitting the amount of work that needs to be done across a set of disks. In some ways you can take the maximum throughput of an individual disk and multiply it by the number of disks involved to get total peak throughput. This is obviously not 100% true because of the overhead often involved with RAID, mainly with write operations. As an example, in RAID 5 there are 4 disk IOs for each write operation (2 reads and 2 writes, to calculate and write the parity). RAID 1 or RAID 0+1 has an overhead as well (1 logical write is actually 2 physical IOs). So if you have a purely READ ONLY system you really don't need to concern yourself too much with this. However, the more write-intensive your application is, the more carefully you need to factor this in. Typically every application has a read/write ratio or profile, and in order to properly plan for the best performance you need to know what your profile is. Sorry to say, but for K2 once again "it depends."

Consider this: I have a K2 implementation where the number of process instances is low, but reporting is a popular feature that is used extensively. This would skew the balance towards heavy read. You should always be monitoring and profiling IO activity so that you are able to make changes throughout the lifecycle of a system.
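As a rough illustration of profiling, the sketch below turns a series of read and write counts into a read/write profile. The sample numbers are invented to resemble a reporting-heavy system; in practice they would come from whatever monitoring tool you use, and the function name is just a placeholder.

```python
def read_write_profile(samples):
    """Return (read_share, write_share) over a list of (reads, writes) samples."""
    total_reads = sum(reads for reads, _ in samples)
    total_writes = sum(writes for _, writes in samples)
    total = total_reads + total_writes
    return total_reads / total, total_writes / total

# Hypothetical per-interval counters: (reads, writes).
samples = [(900, 100), (950, 50), (850, 150)]

read_share, write_share = read_write_profile(samples)
print(f"reads {read_share:.0%}, writes {write_share:.0%}")
```

Re-running something like this on fresh samples over the life of a system is one way to notice when the profile drifts away from what the storage was designed for.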

So now let's have some fun with some formulas. One formula set that I have used previously is DFT, defined as:

Estimated maximum throughput = D * F * T

Where:

D = disk speed. The maximum rate of IOPS which is usually measured and provided by the disk manufacturer.

F = fudge factor. I used 0.75 (75% of max usage). You should always build in some level of this to plan for peaks, as well as to not get too close to the maximum.

T = RAID factor. The raid factor attempts to take into account the type of RAID and the overhead involved.

For no RAID or RAID 0, the RAID factor is 1 because there is no overhead involved:

T = 1 (no raid or raid 0)

For raid 10, the raid factor is:

T = (R + W)/(R + 2W) (Raid 10)

For raid 5, the raid factor is:

T = (R + W)/(R + 4W)

R and W are the number of reads and writes to the drives in an operation. You should be able to calculate your own factor by measuring the number of reads and writes using industry-standard monitoring tools. I know you want an example so, if I am monitoring my RAID 5 array and I see that the number of reads to the drives is 1000 and the number of writes is 500, then the RAID factor for RAID 5 would be

T = (1000 + 500)/(1000 + 4*500) = 1500/3000 = 0.50
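The three RAID factor formulas differ only in how many physical IOs one write costs (1, 2 or 4), so they can be sketched as a single Python function. The names `WRITE_COST` and `raid_factor` are my own, chosen for illustration.

```python
# Physical disk IOs generated by one logical write at each RAID level.
WRITE_COST = {"raid0": 1, "raid10": 2, "raid5": 4}

def raid_factor(reads: float, writes: float, raid_level: str) -> float:
    """T = (R + W) / (R + cost * W), where cost is the physical IOs per write."""
    return (reads + writes) / (reads + WRITE_COST[raid_level] * writes)

# The monitored example above: 1000 reads and 500 writes.
print(raid_factor(1000, 500, "raid5"))   # 0.5
print(raid_factor(1000, 500, "raid10"))  # 0.75
print(raid_factor(1000, 500, "raid0"))   # 1.0
```

Only the ratio of reads to writes matters here, so raw counters, percentages or a ratio such as 2:1 all give the same T.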

The RAID factor, and with it the DFT estimate, is affected by the R:W ratio. So let's take a look at an example with a R:W ratio of 4:1. Let's use disks with a maximum raw throughput of 100 IOs per second, so D is 100.

For a R:W ratio of 4:1 on RAID 5, the RAID factor is

T = (4 + 1)/(4 + 4*1) = 5/8 = 0.625

Now the full formula,

Estimated maximum throughput = D * F * T = 100 * 0.75 * 0.625 = 46.875
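For anyone who prefers to let a script do the multiplication, the full formula can be sketched as below. The function name is hypothetical; D, F and T are the values from the example (100 IOPS, 0.75, and a RAID 5 factor for a 4:1 R:W ratio).

```python
def estimated_max_throughput(d: float, f: float, t: float) -> float:
    """Estimated maximum throughput = D * F * T, in IOs per second per disk."""
    return d * f * t

# RAID 5 factor for an R:W ratio of 4:1 -> 5/8 = 0.625
t_raid5 = (4 + 1) / (4 + 4 * 1)

print(estimated_max_throughput(100, 0.75, t_raid5))  # 46.875
```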

 

Here is some more math in case you were dying for some.

The rotational speed of the drives affects access time and thus affects IOPS.

Access time is comprised of command time, seek time, drive latency, and data transfer time (a function of bandwidth).

Total time for an example 10K rpm drive = 8.53 ms

Total time for an example 15K rpm drive = 5.95 ms

From these statistics we can determine the number of IO's per second that a single disk drive can perform. The number of IO's per second is determined by the following formula:

IO's per second (IOPS) = 1 / seconds per IO

Where the seconds per IO is the total access time from above (in practice dominated by seek time + rotational latency)

10K Drives

IOPS = 1 / (8.53 ms)

= 1 / (0.00853 s)

= 117 IOPS

VS 15K

IOPS = 1 / (5.95 ms)

= 1 / (0.00595 s)

= 168 IOPS

That is a difference of 51 IOPS.
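The same arithmetic as a small Python sketch. The 8.53 ms and 5.95 ms totals are the example figures above. The rotational latency helper is an addition of mine, using the standard relationship that average rotational latency is the time for half a revolution; it is only there to show where the rpm comes into the access time.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average rotational latency: the time for half a revolution, in ms."""
    return 0.5 * 60_000 / rpm

def iops_from_access_time(access_time_ms: float) -> float:
    """IOPS = 1 / seconds per IO, with the access time given in milliseconds."""
    return 1000.0 / access_time_ms

for label, rpm, access_ms in (("10K", 10_000, 8.53), ("15K", 15_000, 5.95)):
    print(label,
          "rotational latency:", avg_rotational_latency_ms(rpm), "ms,",
          "IOPS:", round(iops_from_access_time(access_ms)))
```

The faster spindle cuts average rotational latency from 3 ms to 2 ms, which is part of why the total access time drops and the IOPS figure rises.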

So now let's bring it together in a real example. Let's "ASSUME" you determine that your application is 60% read and 40% write, so R = 60 and W = 40.

RAID 10

T = (60 + 40)/(60 + 2*40) = 100/140 = 0.71

Max throughput RAID 10 with 10K drives (note that this example uses a fudge factor F of 0.8)

D*F*T

117 * 0.8 * 0.71 = 66.456

Max throughput Raid 10 15K drives

168*.8*.71=95.424

RAID 5

T = (60 + 40)/(60 + 4*40) = 100/220 = 0.45

Max throughput RAID 5 with 10K drives

D*F*T

117 * 0.8 * 0.45 = 42.12

Max throughput RAID 5 with 15K drives

168 * 0.8 * 0.45 = 60.48
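To compare all four combinations in one go, the calculation can be sketched in Python. This takes R = 60 and W = 40 as they are plugged into the formulas above, with F = 0.8 and the 117 and 168 IOPS figures from the access time math. The names are hypothetical, and because the code keeps T unrounded its results differ slightly from hand calculations that round T to two decimals first.

```python
# Physical disk IOs per logical write for each RAID level in the example.
WRITE_COST = {"RAID 10": 2, "RAID 5": 4}
# Per-disk IOPS derived from the example access times.
DISK_IOPS = {"10K": 117, "15K": 168}
F = 0.8        # fudge factor used in this example
R, W = 60, 40  # reads and writes as used in the formulas

def dft(d: float, f: float, reads: float, writes: float, write_cost: int) -> float:
    """D * F * T, with T = (R + W) / (R + write_cost * W)."""
    t = (reads + writes) / (reads + write_cost * writes)
    return d * f * t

results = {
    (raid, drive): dft(iops, F, R, W, cost)
    for raid, cost in WRITE_COST.items()
    for drive, iops in DISK_IOPS.items()
}

for (raid, drive), value in results.items():
    print(f"{raid} on {drive} drives: {value:.1f} IOPS per disk")
```

Changing R and W to match your own measured profile shows quickly how much the choice of RAID level costs you as the write share grows.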

I have some basics that you may want to keep in mind. They are not absolutes but hopefully they give you some level of guidance.

1. More disk spindles = better IO

2. RAID 10 can be as much as 75% faster for write activity (thus it can be very important to know what your read/write ratio will be for normal activity). Consider RAID 10 the best choice if overall system IO is more than 15% writes.

3. Keep Database files and Database logs on separate physical/logical drives to improve performance.

4. Consider dedicating physical/logical drives to each database to increase performance.

5. Stripe size can impact a system's performance as well and should not be overlooked as an important decision in a storage configuration. Administrators should choose a stripe size that most closely matches their IO profile. Once a stripe size is chosen, you should choose an NTFS allocation size that closely matches this or is at least a multiple of the stripe size. Obviously some of these choices may be affected by hardware vendors. Administrators should consult with their hardware vendor when making these and other hardware configuration choices.

Of course what you have read so far is only a very small piece of the puzzle. So many other things can affect performance. If there is interest out there let me know and I will really dig into some details, such as how sequential vs. random IO can affect performance and how full an array is can affect it as well. We could also have some fun getting into stripe and chunk sizes and how those can dramatically affect performance. I personally have seen simple changes in those make a huge difference.

Published Monday, October 22, 2007 11:58 AM by chrisg


About chrisg

I am responsible for community development for SourceCode. I have been in technology for over 14 years mainly in infrastructure and security. I absolutely love technology especially new stuff and gadgets.