Wednesday, May 5, 2010

How to Calculate Your Disk I/O Requirements

A very good read for any aspiring Storage Architect

Now that you understand which Exchange activities and components generate disk I/O and how to configure your storage to support them, you must calculate the disk I/O requirements for your users. Calculating your disk I/O requirements ultimately allows you to optimize your disk subsystem to best support your users.

Your goal is to provide enough disk I/O performance (measured by the number of I/O operations per second [IOPS] that can be performed) with acceptable latency that allows for efficient Exchange functionality.

Calculating the IOPS per mailbox is a convenient way to measure the profile for a given server based on random database read/write I/O (transaction log I/O is not factored into this equation). The higher the IOPS per mailbox, the more aggressive the mailbox profile is in terms of disk usage.

There are two approaches to calculating your disk I/O requirements:

  • Determine user needs based on theoretical data
  • Calculate user activity by using the Performance console (Perfmon)

Regardless of your approach, you should plan and calculate based on peak usage periods. In many companies, this occurs at the beginning of business hours, as people arrive at work and check their e-mail.

If you do not have Exchange installed but want to plan for impending storage needs that result from an Exchange deployment, you can use data that has already been collected. This data is in the form of mailbox profiles, which describe general usage patterns for Exchange mailboxes.

The following table lists mailbox profiles that can be used as a guideline for capacity planning of Exchange mailbox servers. These profiles represent mailbox access for the “average user” Outlook (or MAPI-based) client within the organization.

User profiles and corresponding usage patterns

User Type | Database Volume IOPS | Send/Receive per Day | Mailbox Size
Light | .5 | 20 sent/50 received | 50 MB
Average | .75 | 30 sent/75 received | 100 MB
Heavy | 1.0 | 40 sent/100 received | 200 MB
Large | 1.5 | 60 sent/150 received | 500 MB

Each profile represents total I/O to the Jet database and does not include I/O related to transaction log file activity. In order to accurately calculate your disk subsystem load, you must split this database I/O into read and write I/O because write operations are more I/O intensive than reads.

To help estimate your own read/write ratio, consider the usage patterns of a company that has a heavy mailbox profile. In a production environment, the company can expect to incur read/write ratios between 75/25 percent and 66/33 percent, depending on the group of users being evaluated.

For a mail system consisting of 2,000 mailboxes at .75 IOPS each (the average profile in the table above), a total of 1,500 IOPS is generated on the database volume. The formula to calculate this is:

Estimated IOPS per User for User Type × Number of Users

In this example, .75 IOPS × 2,000 mailboxes = 1,500 IOPS.

Using a conservative ratio of two reads for every write (66 percent reads to 33 percent writes), you would plan for 1,000 read I/O and 500 write I/O requests per second for your database volume. Every write request is first written to the transaction log file and then written to the database. Approximately 10 percent of the total 1,500 IOPS seen on the database volume will be seen on the transaction log volume (10 percent of 1,500 is 150 IOPS); 500 write I/O requests will be written to the database.
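The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the worked example: the function name and its parameters are made up for this sketch, and the 2:1 read/write split and 10 percent log figure are simply the values quoted in the paragraph above.

```python
# Database-volume IOPS per mailbox, taken from the mailbox profile table.
PROFILE_IOPS = {"light": 0.5, "average": 0.75, "heavy": 1.0, "large": 1.5}


def database_io_plan(profile, mailboxes, read_fraction=2 / 3, log_fraction=0.10):
    """Estimate database reads, writes, and transaction log IOPS.

    read_fraction and log_fraction default to the article's example values
    (two reads per write, log volume at about 10 percent of database IOPS).
    """
    total = PROFILE_IOPS[profile] * mailboxes
    reads = total * read_fraction
    writes = total - reads
    log = total * log_fraction
    # Round to whole requests per second for planning purposes.
    return {
        "total": round(total),
        "reads": round(reads),
        "writes": round(writes),
        "log": round(log),
    }


plan = database_io_plan("average", 2000)
print(plan)
```

Changing the profile name or the read fraction (for example 0.75 for a 75/25 ratio) shows how sensitive the write load is to the assumptions you pick.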

These estimated profiles are for an Exchange server that has no other components installed beyond the base operating system. If other variables, such as third-party Personal Information Management (PIM) software, MAPI-based virus scanning (server and client side), management and monitoring software, or backup software are used during peak usage periods, these profiles will not adequately describe the I/O profile in your organization. In these cases, you must also factor in the additional reads and writes that are requested by these applications.

Database IOPS for every 1,000 mailboxes

Mailboxes | Database Volume IOPS (per mailbox) | Calculated IOPS | Actual IOPS
1000 | 1.0 | 1000 | 1000
2000 | 1.0 | 2000 | 2500
3000 | 1.0 | 3000 | 3750
4000 | 1.0 | 4000 | 5000

When you increase the number of users with similar profiles on an Exchange server, more users must compete for the database cache, causing an increase in database disk transfers.

For example:

1000 mailboxes at 1.0 IOPS produce 1000 IOPS on the database logical unit number (LUN). If you double the number of users with the same profile to 2000, the actual load is 25 percent higher than the simple product, as shown in the table above, so the formula is: 1.0 IOPS × 2000 mailboxes × 1.25 = 2500 IOPS.
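A rough sketch of that adjustment follows. It assumes the 1.25 factor applies as a flat multiplier to any server with more than 1000 mailboxes, which is all the table shows; the function name and the threshold parameter are hypothetical, and real behavior between the listed data points may differ.

```python
def actual_database_iops(mailboxes, iops_per_mailbox=1.0,
                         contention_factor=1.25, threshold=1000):
    """Apply the cache-contention factor from the table above.

    Assumption: the factor is flat above `threshold` mailboxes, because the
    table only lists 1000, 2000, 3000, and 4000 mailboxes.
    """
    base = mailboxes * iops_per_mailbox
    if mailboxes > threshold:
        return base * contention_factor
    return base


for count in (1000, 2000, 3000, 4000):
    print(count, actual_database_iops(count))
```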

Databases | IOPS Factor | Actual IOPS
1 | 0% | 1000
2 | 2% | 1020
4 | 6% | 1060
8 | 14% | 1140
10 | 18% | 1180
20 | 38% | 1380

The benefits of single instance storage are reduced as you add databases to an Exchange server. The degree of the impact depends on the user profile and the average message size. For example, if you have 1000 users on 1 database consuming 1000 database IOPS, splitting those users across 10 databases would consume 18 percent more database IOPS, or 1180 IOPS. When someone sends a 10 MB PowerPoint file as an attachment to 20 mailboxes and they all reside on the same database, only 10 MB is written to the database. If those same 20 recipients are each on a different database, 200 MB must be written to disk, a twentyfold increase in write activity. This data is based on the MMB3 test, which uses a significant number of distribution lists and is a heavy corporate profile.
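Both effects described above can be sketched as follows. The lookup only covers the database counts listed in the table (no interpolation is attempted), and the helper names are invented for this illustration.

```python
# Extra database IOPS by database count, from the table above (MMB3-based).
DATABASE_IOPS_FACTOR = {1: 0.00, 2: 0.02, 4: 0.06, 8: 0.14, 10: 0.18, 20: 0.38}


def iops_after_split(base_iops, databases):
    """Estimated IOPS after spreading the same users over more databases."""
    factor = DATABASE_IOPS_FACTOR[databases]  # KeyError if not in the table
    return round(base_iops * (1 + factor))


def attachment_mb_written(size_mb, recipients_per_database):
    """With single instance storage, one copy is written per database
    that holds at least one recipient."""
    databases_touched = sum(1 for n in recipients_per_database if n > 0)
    return size_mb * databases_touched


print(iops_after_split(1000, 10))            # 1000 users split over 10 databases
print(attachment_mb_written(10, [20]))       # all 20 recipients on one database
print(attachment_mb_written(10, [1] * 20))   # 20 recipients on 20 databases
```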

User Type | Outlook Cached Mode | Outlook Online Mode | Inbox Size
Corporate | 1.0 IOPS | 1.0 IOPS | 10,000 items, 500 MB
Large | 1.0 IOPS | 1.25 IOPS | 20,000 items, 1 GB
Huge | 1.0 IOPS | 1.75 IOPS | 40,000 items, 2 GB

With Outlook cached mode, many common disk-intensive tasks are performed on the client. The initial full sync in Outlook cached mode is a disk-intensive activity, yet it is rarely done. With Outlook online mode, searches and sorts are performed on the server. After an index is initially created, it is automatically updated as mail is received. Users that have an abnormally large number of indexes can exceed the limit, causing some of the indexes to be recreated. Some Outlook plug-ins and external applications create indexes and cause additional physical disk pressure. In online mode, moving from a 500 MB to a 1 GB inbox increased the physical disk cost to the database by approximately 25 percent, while moving from 1 GB to 2 GB increased it by approximately 40 percent.
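As a quick check, the percentages quoted above follow directly from the online-mode column of the table. The snippet below is only a sanity check on that arithmetic; the dictionary keys and helper name are made up here.

```python
# Online-mode IOPS per mailbox by inbox size, from the table above.
ONLINE_MODE_IOPS = {"corporate": 1.0, "large": 1.25, "huge": 1.75}


def percent_increase(old, new):
    """Relative increase from old to new, in percent."""
    return (new - old) / old * 100


step_500mb_to_1gb = percent_increase(ONLINE_MODE_IOPS["corporate"],
                                     ONLINE_MODE_IOPS["large"])
step_1gb_to_2gb = percent_increase(ONLINE_MODE_IOPS["large"],
                                   ONLINE_MODE_IOPS["huge"])
print(round(step_500mb_to_1gb), round(step_1gb_to_2gb))
```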

If you already deployed Exchange, you should use your existing production environment to identify your I/O requirements. A benefit of monitoring your production environment is that your data includes all I/O that occurs across all applications, including third-party applications.

When you calculate IOPS per mailbox, use the current number of mailboxes on that server. If the server contains many unused mailboxes or is running other applications that do not add much load during the peak two hours, your results may not represent a typical user load. You should select a server that has typical user mailboxes for your measurements, or exclude the unused mailboxes from your calculation.
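A minimal sketch of that calculation is shown below. It assumes you have already exported peak-period samples of a database-volume counter such as Disk Transfers/sec from Perfmon; the sample values and the function name are hypothetical, and the detailed procedure is the one in the TechNet article referenced in this post, not this snippet.

```python
def measured_iops_per_mailbox(peak_samples, total_mailboxes, unused_mailboxes=0):
    """Average peak-period disk transfers per second, per active mailbox.

    peak_samples: database-volume transfers/sec readings taken during the
    peak window (assumed to come from a Perfmon log; values are made up).
    """
    active = total_mailboxes - unused_mailboxes
    if active <= 0:
        raise ValueError("no active mailboxes to divide by")
    if not peak_samples:
        raise ValueError("no samples supplied")
    average_iops = sum(peak_samples) / len(peak_samples)
    return average_iops / active


# Hypothetical readings: 1200 mailboxes on the server, 200 of them unused.
samples = [900, 1100, 1000]
print(measured_iops_per_mailbox(samples, 1200, unused_mailboxes=200))
```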

Be aware that different days of the week have slightly different usage loads. Because this varies widely, you should monitor your environment over an extended period of time (ideally one month) to determine when you are likely to experience peaks.

For detailed steps about how to calculate your storage I/O requirements using data collected from your Exchange production environment, see How to Calculate Storage I/O Requirements Using Environmental Data.

This IOPS per mailbox value is what you will experience at the operating system level. If you have a hardware RAID solution in place, then you will have a different number of IOPS at the disk level that cannot be measured with Perfmon. There is a penalty for RAID-5 that is different than the penalty for RAID-1+0. For more information about calculating IOPS for each type of RAID solution, see Disk Subsystem Performance Analysis for Windows.

After you calculate your IOPS per mailbox value, you can optimize your existing storage architecture.

Taken from Microsoft TechNet

=========================================================================

Average IOPS per disk

  • 7.2K RPM disk: 75
  • 10K RPM disk: 125
  • 15K RPM disk: 175
  • Solid State Disk: 6,000

Formula to calculate IOPS requirements

(Total Workload IOPS × Percentage of workload that is read operations) + (Total Workload IOPS × Percentage of workload that is write operations × RAID I/O Penalty)
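
Here is a small sketch of the formula combined with the per-disk averages listed above. The RAID write penalties in the code (2 for RAID-1+0, 4 for RAID-5, 6 for RAID-6) are commonly cited rule-of-thumb values and do not come from this post, so treat them as assumptions and check your array vendor's guidance. The function names are made up for this example.

```python
import math

# Average IOPS per disk, from the list above.
DISK_IOPS = {"7.2k": 75, "10k": 125, "15k": 175, "ssd": 6000}

# Assumed write penalties (rule-of-thumb values, not from this post).
RAID_WRITE_PENALTY = {"raid0": 1, "raid10": 2, "raid5": 4, "raid6": 6}


def backend_iops(total_iops, read_fraction, raid_level):
    """(total * read %) + (total * write % * RAID penalty)."""
    write_fraction = 1 - read_fraction
    penalty = RAID_WRITE_PENALTY[raid_level]
    return total_iops * read_fraction + total_iops * write_fraction * penalty


def disks_required(total_iops, read_fraction, raid_level, disk_type):
    """Minimum disk count needed to serve the back-end IOPS."""
    required = round(backend_iops(total_iops, read_fraction, raid_level), 6)
    return math.ceil(required / DISK_IOPS[disk_type])


# 1,500 workload IOPS at a 75/25 read/write ratio on 15K RPM disks.
print(backend_iops(1500, 0.75, "raid5"), disks_required(1500, 0.75, "raid5", "15k"))
print(backend_iops(1500, 0.75, "raid10"), disks_required(1500, 0.75, "raid10", "15k"))
```

The disk count here covers performance only; capacity and hot spares have to be sized separately.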

I'm currently studying for my Clariion Technology Architect exam and Technology Architect Expert exam, so this is a good article that I came across while doing some research.