Web servers control the caching behavior of Web clients (browsers and proxies) by sending them, along with a requested file, certain HTTP headers. (HTTP headers are small data fields that are sent and read by servers, browsers and proxies as part of every request or response on the Web.) Browsers and proxies use these headers to decide whether, and for how long, to cache a given copy of a file. CacheRight is designed to give developers or site managers an easy means of controlling the cache-related headers sent out by the Web server. In this way, it lets you control and optimize how browsers and proxies cache your pages. In the sections that follow, we introduce all of the concepts required to use CacheRight effectively.
Installation

Installing CacheRight is simple. Once you have downloaded the installer executable from the Port80 Web site, double-click the file to begin the CacheRight setup program. The setup program will guide you through the steps necessary to complete the installation. You will be asked to accept the license agreement and to choose an installation destination. The default installation location is C:\Program Files\Port80\CacheRight.

Note: During the installation process, you will also be given the option to install a copy of the CacheRight sample rules file in each Web site's Web root directory. Whether or not you make use of this option, a copy of the sample rules file is always installed in the CacheRight installation directory for later redistribution.

Files Installed by CacheRight

CacheRight installs the following files in the following default locations: C:\%SystemRoot%\System32\inetsrv\

Using the Settings Manager

The Settings Manager can be launched from the Port80/CacheRight program group in the Start menu, or directly by running CacheRight.exe. The Settings Manager controls the general operation of CacheRight for any Web site (virtual server) provisioned on the computer.

You will find that the configuration options in the Settings Manager are quite limited. This is because CacheRight is primarily controlled via its site-specific rules files (rules.cr). These rules files allow non-administrators to author and maintain CacheRight rules on a per-site basis. Configuration options that can only be accessed by system administrators (or those with access to the Settings Manager) are kept to a bare minimum.

On the left side of the Settings Manager is a list of all the Web sites (virtual servers) that are available on the computer. Use this list to select the Web site whose settings you want to change.

Understanding Cache Control

Ideally, the cache control life cycle works in a way that makes the most of caching for bandwidth savings and response speed:

1. A client makes an initial request for a set of resources (for example, an HTML page with linked image, script and style sheet files).
2. The server responds by sending the requested resources, together with headers that tell how long each resource should be considered fresh.
3. From this point on, subsequent requests for these same resources are mediated by a caching mechanism, as shown in the following diagram:

![]()

4. As long as a given resource is still fresh, subsequent requests by the client for that resource are served from the cache, saving both the time and bandwidth of a round trip to the server. (Case A in the diagram.)
5. When the cache needs to display a resource that has expired or gone "stale," it polls the server to find out if that resource has changed. (Case B in the diagram.)
6. In a perfect world, whenever a client does one of these checks, the resource it is asking about will in fact have been changed, and the server will respond with the newest version. (Case C in the diagram.)

Common Cache Control Problems

In a less-than-perfect world, a lot can go wrong with caching:

Problem: The server never sends the right cache control headers.
Result: The client keeps requesting files from the server that it could have cached instead.

Problem: The server sends cache control headers for some resources (like the HTML file), but not for others (like the image files).
Result: The HTML file will be served out of the cache, but the image files will constantly be re-fetched from the server, when they could have been cached.

Problem: The server does send cache control headers for the images, but the expiration times are much too short. (Suppose for example that the image files change rarely. This is often the case when images are used as navigational elements, uniform page headers and backgrounds, and so on.)
Result: When the client sees that an image file in its cache is "stale," it will start checking with the server every time it needs to display that image, to find out if the file has changed. Since the image won't have changed, every one of these "conditional requests" from the client, and the corresponding "304" or "Not Modified" responses from the server, represents wasted bandwidth and unnecessary delay. The image ends up being served out of the client's cache anyway, but only after a round trip to the server.

These are the kinds of problems CacheRight helps you to avoid, by making it easy to send the right headers with every file on your site.

The Syntax of a CacheRight Rule

CacheRight's main interface is a text file containing one or more rules that your server uses to decide what cache control headers to send out for a given response. The text file that contains these rules must be named rules.cr, and it must be located in your Web site's home (or root) directory.

The rules themselves are just short statements that declare how a file or group of files should be cached. These statements have a very simple syntax, the general form of which is as follows:

Here is a brief explanation of each of the four parts of a rule (each part is discussed in greater detail below):

scope: a keyword specifying the scope of the rule, that is, the range of files it can apply to (there are four different scopes to choose from).

Example CacheRight Rules

In order to examine CacheRight rules in more detail, it is useful to have some concrete examples to work with. Here are four example rules. They could be the entire contents of a rules.cr file for a simple Web site:

ExpiresDefault : immediately public
ExpiresByType image/jpeg : 14 days after access public no-transform
ExpiresByPath /images/* : Thu, 01 May 2003 12:00:01 GMT private
BlockByPath /images/tempimages/* : Thu, 01 May 2003 12:00:01 GMT private

The next several sections of this document will repeatedly refer back to these examples, while examining each of the four parts of a CacheRight rule in turn.
By the end of the fourth section, you will know all there is to know about writing CacheRight rules.

The Scope Keyword

Looking at the four example rules given in the previous topic, you will notice that each begins with a different scope keyword. This is because each example rule demonstrates one of the four kinds of scope a rule can have in CacheRight:

ExpiresDefault : immediately public

A rule with ExpiresDefault scope sets the default expiration time for all the files on a Web site. If no other rule applies to a given file, it will have the expiration time given in the ExpiresDefault rule. You can only have one ExpiresDefault rule in your rules.cr file.

ExpiresByType image/jpeg : 14 days after access public no-transform

Rules with this scope can set the expiration time for all those files that have the media (MIME) type specified in the rule. ExpiresByType rules always override the ExpiresDefault rule, if one is present.

ExpiresByPath /images/* : Thu, 01 May 2003 12:00:01 GMT private

Rules with this scope can set the expiration time for any files picked out by the request path specified in the rule. ExpiresByPath rules override both the ExpiresDefault rule (if it exists) and also any ExpiresByType rule(s) that would otherwise apply.

BlockByPath /images/tempimages/* : Thu, 01 May 2003 12:00:01 GMT private

Rules with this scope can block the application of other CacheRight rules for any files picked out by the request path specified in the rule. BlockByPath rules override all other CacheRight rules that would otherwise apply.

The Selector

Now that we know about the four kinds of scope a CacheRight rule can have, we are ready to look at the next part of the rule statement, the selector. This is the part of the rule just after the scope keyword and just before the colon. Recall that its function is to select one or more files to which the rule will apply.
Looking once again at our example rules:

ExpiresDefault : immediately public

As you can see, the ExpiresDefault rule is unique in that it does not have an explicit selector between the scope keyword and the colon. The reason is that an ExpiresDefault rule applies to every file on the site that is not covered by a more specific rule. Thus, its selector can always be inferred from those of all the other rules in rules.cr (or from the absence of any other rules).

ExpiresByType image/jpeg : 14 days after access public no-transform

The ExpiresByType selector is simply a media type -- also known as a MIME (Multipurpose Internet Mail Extensions) type. (Information about MIME types, including the relevant RFCs and a list of registered MIME types, is available online at http://www.oac.uci.edu/indiv/ehood/MIME/MIME.html.) Although this example uses only one MIME type, the selector can include as many as you wish, separated by commas. You will often want caches to treat files with different MIME types differently (for example, caching images longer than text files). This type of selector makes it easy to write such rules. In the example, the selector applies the rule to all files of type image/jpeg.

ExpiresByPath /images/* : Thu, 01 May 2003 12:00:01 GMT private

The ExpiresByPath and BlockByPath selectors represent the (virtual) location of one or more files, relative to the Web site's home or root directory. The selector can include multiple paths, separated by commas. Paths may contain any combination of alphanumeric characters, underscores, forward slashes ( / ) and dots ( . ), plus wildcards (the * symbol). Paths must begin with either an initial slash (representing the home or root directory) or a wildcard. In the examples, the ExpiresByPath selector picks out all files located in the images directory, or any of its subdirectories, while the BlockByPath selector does the same for the tempimages directory and its subdirectories.
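
The override order and selectors described above can be sketched in a few lines of Python. This is only an illustration, not CacheRight's actual matching code: the function names are hypothetical, the standard-library fnmatch module stands in for CacheRight's wildcard handling (its * also spans subdirectories), and exact, case-sensitive comparison of paths and MIME types is an assumption.

```python
from fnmatch import fnmatchcase

# The four example rules from this document, reduced to (scope, selectors).
RULES = [
    ("ExpiresDefault", []),
    ("ExpiresByType", ["image/jpeg"]),
    ("ExpiresByPath", ["/images/*"]),
    ("BlockByPath", ["/images/tempimages/*"]),
]

# Highest-priority scope first, following the override order described above.
PRECEDENCE = ["BlockByPath", "ExpiresByPath", "ExpiresByType", "ExpiresDefault"]


def rule_matches(scope, selectors, path, mime_type):
    """Return True if a rule with this scope and selector list covers the request."""
    if scope == "ExpiresDefault":
        return True  # no explicit selector: applies to every file
    if scope == "ExpiresByType":
        return mime_type in selectors
    # ExpiresByPath and BlockByPath: '*' may match across '/' (assumption)
    return any(fnmatchcase(path, pattern) for pattern in selectors)


def applicable_scope(path, mime_type, rules=RULES):
    """Return the scope of the winning rule, or None if no rule matches."""
    for scope in PRECEDENCE:
        for rule_scope, selectors in rules:
            if rule_scope == scope and rule_matches(scope, selectors, path, mime_type):
                return scope
    return None
```

For instance, applicable_scope("/images/nav/logo.jpg", "image/jpeg") selects the ExpiresByPath rule even though the ExpiresByType rule also matches, because path rules override type rules.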

The Expiration Clause

Having seen how CacheRight decides which rules apply to which files, we now need to look at how rules set expiration times for the files to which they apply. Three of our four example rules demonstrate the three ways an expiration clause can be written (the BlockByPath rule is unique in that it has an empty expiration clause, so we don't need to cover it in this section):

ExpiresDefault : immediately public

One way to write an expiration clause is by using one of two special keywords -- never and immediately. Using immediately, as in the example, sets the expiration time to be exactly equivalent to the date of access. (The date of access is simply the date and time when the server delivered the file to the client.) Since the file will be "stale" as soon as it is received, it will not be cached. Using never causes the expiration time to be set for one year from the date of access. The file will be cached, and the cached copy will remain "fresh" for one year.

ExpiresByType image/jpeg : 14 days after access public no-transform

A second way to write an expiration clause is to specify an interval of time relative to some starting-point, at the end of which a cached copy will be considered "stale." This is done with a simple phrase consisting of a number and an interval keyword -- year(s), month(s), week(s), day(s) or minute(s), followed by the word after and one of two starting-point keywords -- access or modification. The access keyword means that the specified interval is relative to the date on which the client accessed the file on the server. The modification keyword means that the interval is relative to the last time the file was changed. In the example, a cached copy of an affected file will be considered stale 14 days after the server delivers it to the client.

ExpiresByPath /images/* : Thu, 01 May 2003 12:00:01 GMT private

The third way to write an expiration clause is to specify an expiration date.
The date must be in GMT format (details can be found online at www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.3). In the example, cached copies of the affected files will be considered fresh until May 1st, 2003.

Additional Directives

We have now covered the main parts of a rule -- those that let you control the expiration times for every file on your Web site. The fourth and last part of a rule consists of one or more directive keywords that cause additional cache-control directives to be added to a file's headers. Three of our example rules take such supplementary directives (the BlockByPath rule is, once again, the exception).

ExpiresDefault : immediately public

The public and private keywords cause either the "public" or the "private" directive to be added to the cache control headers for a given file or files. You may have noticed that all the examples have one or the other of these directives, but never both. There is a good reason for this: the public and private directives are mutually exclusive, and one or the other of them must always be appended to every CacheRight rule. The "public" directive simply ensures that a response is cachable by both shared (proxy) and nonshared (browser) caches. The "private" directive prevents a response from being cached by shared (proxy) caches, but permits caching by browsers. In the examples, only the last of the three rules (the ExpiresByPath rule) has private rather than public, so only files affected by that rule will be uncachable by shared (proxy) caches.

ExpiresByType image/jpeg : 14 days after access public no-transform

The no-transform directive is optional. Caches are allowed to transform the objects they cache to make their storage and transmission more efficient. A cache may, for example, recode image files or compress text files. Under certain circumstances, these kinds of transformations can cause problems. Appending no-transform to a rule prevents caches from modifying the files affected by that rule.
Since shared (proxy) caches are more likely to use such transformations, no-transform will often be appended to CacheRight rules that have the public keyword, as in the example.

Getting Started Writing CacheRight Rules

You now know everything there is to know about CacheRight rules. The next step is to begin writing some of your own. To get started, you can use the sample rules.cr file that comes with CacheRight. When CacheRight is installed, a sample rules.cr file is optionally copied into the root directory of every Web site on the server. In addition (or instead) your system administrator may have chosen other means to distribute this file (for example, via an FTP server). Alternatively, you can create your own rules file in a text editor like Notepad. In either case, remember that the file must be named rules.cr and saved to your Web site's home (or root) directory.

As you start writing CacheRight rules on your own, you will need to know how to do three things with the rules.cr file: validate it, reload it, and see the effect of changes made to it. These topics are covered, each in turn, in the remainder of this help document.

Validating the Rules File

If there is an error in the syntax of one or more of the rules in your rules.cr file, CacheRight will not load the file and none of the rules in it will have any effect. Client requests will not be interrupted, but they won't have the proper cache control headers either. To avoid this, we highly recommend that you validate your rules.cr file every time you make a change to it.

CacheRight comes with a utility called cr_syntax.exe that validates rules.cr files and displays any syntax errors they may contain. The cr_syntax.exe utility is installed together with CacheRight on the IIS server computer. It can be launched from the CacheRight Settings Manager (using the Validate button) for use on the server locally.
Since it is a standalone utility, however, it can also be redistributed to developers or site managers, for remote use.

Reloading the Rules File

In order for changes to a rules.cr file to take effect, CacheRight must reload that file in memory. This happens automatically whenever the IIS process running CacheRight is restarted. Of course, you will usually want changes to your rules.cr file to take effect without having to recycle an application pool or restart the Web service. In order to do this, you must tell CacheRight that it needs to reload the rules.cr file. Naturally, this can be accomplished by clicking Apply or OK on the Settings Manager, but what about the case of a remote user (such as a developer or site manager) who has uploaded changes to the rules file for a particular site?

To remotely trigger a rules file reload for a particular Web site, use the query parameter cr_reset. Simply save your changes to rules.cr and make a request for any file on the site (for example the homepage), appending ?cr_reset to the URL. Make sure your HTTP request is an unconditional GET (for example, by using Control+F5 in Internet Explorer). You must use cr_reset every time you change the rules file, in order for your changes to take effect.

How to Examine the HTTP Headers

Since CacheRight rules work by controlling the caching-related headers sent out by the Web server, you will want to be able to examine these headers while you are editing rules.cr, in order to see the effects of any changes you make to the rules. To make this possible, you need a header-scanning tool. Header-scanning tools work like ordinary Web browsers, except that they show the HTTP headers transmitted by the server. A couple of tools that have this capability are readily available online:

If the Web site on which you are running CacheRight is accessible from the Internet, you can use the CacheCheck tool at www.cacheright.com.

If your CacheRight site is behind a firewall -- or if you would just rather work with a client-side tool that runs on your desktop -- you can download a tracing utility such as the free ieHTTPHeaders. Links to other such tools, along with instructions for using them, can be found in our CacheRight Evaluation Guide.

Once you have access to a header-scanning tool capable of reaching your Web site, the process of seeing the effects of your changes to the rules.cr file is quite simple:

1. Save any changes to your rules.cr file.
2. Validate your rules.cr file using the cr_syntax application.
3. Using the header-scanning tool, make a Web request for the file you want to test, being sure to append the ?cr_reset parameter to the query string.

Once you can see the headers, you will need to know what to look for. This is the subject of the next topic.

What to Look for in the Server's Response

CacheRight controls a number of headers and directives sent by the Web server. The most important ones to be concerned with when editing CacheRight rules are the Expires header and the max-age directive of the Cache-control header. These are the header fields that the server uses to supply the client with the expiration date (or interval) for the requested file. This date (or interval) should correspond to the one specified in the expiration clause of the applicable CacheRight rule. In addition, any directives added to the end of the CacheRight rule statement should also show up in the Cache-control header.

To see how this works, suppose a Web site had a rules.cr file containing just our three example rules. If you make a series of requests with a header-scanning tool, with each request URL chosen so as to trigger a different rule, what would the results look like in the cache-related server headers? Consider some examples:

First Example

A request with a header-scanning tool for the site's home page (index.html for example) would cause the ExpiresDefault rule to be applied.
Recall the expiration clause and directive of this simple rule:

ExpiresDefault : immediately public

As a result of this rule being applied, the following headers would be displayed from the server's response:

Because the rule's expiration clause said that this file should expire immediately, the max-age directive was added to the Cache-control header, with a value of 0. The Expires header was also set, and notice that its value is the same as that of the Date header, which gives the date of access in GMT time. Note also that the public directive was added to the Cache-control header.

Second Example

A second request by the header-scanning tool, this time for a jpeg file that is not in the /images directory, would cause the ExpiresByType rule to be applied. Here is the rule, with its expiration clause and directives:

ExpiresByType image/jpeg : 14 days after access public no-transform

And here are the corresponding headers that would be displayed in the header-scanning tool:

Because the rule's expiration clause specified a relative interval of 14 days after access for the file's expiration, the max-age directive was added to the Cache-control header with a value of 1,209,600 seconds -- the equivalent of 14 days. The Expires header was also set with a GMT date that is exactly 14 days after the one reported in the Date header. Lastly, the public and no-transform directives were added to the Cache-control header.

Note that, if the expiration clause had specified an interval of 14 days after modification (rather than after access), then the Cache-control: max-age and Expires values would have been adjusted accordingly. For Expires, the date/time would have been set relative to the Last-Modified header (not shown in the example), rather than the Date header. Since the max-age directive is always relative to the date/time of access, the number of seconds given in it would have been the interval specified in the rule less the interval between Last-Modified and Date.
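
The arithmetic in the second example can be checked with a short Python sketch. It is an illustration only, not CacheRight's implementation: the function name and the sample dates are hypothetical, and clamping max-age at zero for files modified longer ago than the interval is an assumption this document does not confirm.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expected_headers(interval, date, last_modified=None):
    """Compute the Expires value and max-age for a relative expiration clause.

    interval      -- the rule's interval, e.g. timedelta(days=14)
    date          -- date of access (the response's Date header), in UTC
    last_modified -- the Last-Modified time for "after modification" rules;
                     leave as None for "after access" rules
    """
    base = date if last_modified is None else last_modified
    expires = base + interval
    # max-age is always counted from the date of access, not from modification.
    # Assumption: a negative remainder is reported as 0.
    max_age = max(0, int((expires - date).total_seconds()))
    return format_datetime(expires, usegmt=True), max_age


# Hypothetical date of access, chosen only for illustration.
access = datetime(2003, 3, 1, 12, 0, 0, tzinfo=timezone.utc)

# "14 days after access"
expires_a, max_age_a = expected_headers(timedelta(days=14), access)

# "14 days after modification", for a file last changed 4 days before access
expires_m, max_age_m = expected_headers(
    timedelta(days=14), access, last_modified=access - timedelta(days=4)
)
```

With these inputs, max_age_a is 1209600 (14 days in seconds), while max_age_m is 864000, the 14-day interval less the 4 days between Last-Modified and Date.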

Third Example

A third and final request with the header-scanning tool, this time for a jpeg file in the /images directory, would cause the ExpiresByPath rule to be applied. Recall the rule, noting its expiration clause and directive:

ExpiresByPath /images/* : Thu, 01 May 2003 12:00:01 GMT private

And here are the headers as reported in the header-scanning tool:

Because the rule's expiration clause specified an absolute date/time for the file's expiration, that date/time was reproduced exactly in the Expires header. The max-age directive is not added to the Cache-control header in the case of a rule that specifies an absolute expiration time, since max-age is intended primarily for use with relative expiration intervals. Finally, the private directive was added to the Cache-control header, making the response non-cachable by shared (proxy) caches.

Other Effects to Look For

CacheRight also routinely adds a number of other directives to the Cache-control header (must-revalidate, proxy-revalidate) and makes sure the Date header is always present. These are best practices that CacheRight handles without being explicitly directed to do so by a particular CacheRight rule, so you should see these in the header-scanning tool's output whenever CacheRight is active.

System Requirements

CacheRight is compatible with the following:

Port80 Software Technical Support
support@port80software.com
www.port80software.com/support
888.4PORT80 (888.476.7880) toll free
858.268.7960 phone
858.268.7760 fax