2 Referencing Policies

2.1 Overview and Purpose of Policy References

Locating a P3P policy is one of the first steps in the operation of the P3P protocol. Services use policy references to state what policy applies to a specific URI or set of URIs. User agents use policy references to locate the privacy policy that applies to a page, so that they can process that policy for the benefit of their user.

Policy references are used extensively as a performance optimization. P3P policies are typically several kilobytes of data, while a URI that references a privacy policy is typically less than 100 bytes. In addition to the bandwidth savings, policy references also reduce the need for computation: policies can be uniquely associated with URIs, so that a user agent need only parse and process a policy once rather than process it with every document to which the policy applies. Furthermore, by placing the information about relevant policies in a centralized location, Web site administration is simplified.

A policy reference file is used to associate P3P policies with certain regions of URI-space. The policy reference file is an XML (see [XML]) file that can specify the policy for a single Web document, portions of a Web site, or for an entire site. The policy reference file may refer to one or more P3P policies; this allows for a single reference file to cover an entire site, even if different P3P policies apply to different portions of the site. The policy reference file is used to make any or all of the following statements:

  • The URI where a P3P policy is found

  • The URIs or regions of URI-space covered by this policy

  • The URIs or regions of URI-space not covered by this policy

  • The regions of URI-space for embedded content on other servers that are covered by this policy

  • The cookies that are or are not covered by this policy

  • The access methods for which this policy is applicable

  • The period of time for which these claims are considered to be valid

All of these statements are made in the body of the policy reference file.

2.2 Locating Policy Reference Files

This section describes the mechanisms used to indicate the location of a policy reference file. Detailed syntax is also given for the supported mechanisms.

The location of the policy reference file can be indicated using one of three mechanisms. The policy reference file may be located in a predefined "well-known" location, or a document may indicate a policy reference file through an HTML link tag, or through an HTTP header.

Note that if user agents support retrieving HTML content over HTTP, they MUST handle all three mechanisms listed above interchangeably. See also the requirements for non-ambiguity.

Note that policies are applied at the level of HTTP entities. An entity, retrieved by fetching a URI, has a P3P policy associated with it. A "page" from the user's perspective may be composed of multiple HTTP entities; each entity may have its own P3P policy associated with it. As a practical note, however, placing many different P3P policies on different entities on a single page may make rendering the page and informing the user of the relevant policies difficult for user agents. Additionally, services are encouraged to craft their policy reference files so that a single policy reference file covers any given "page"; this will speed up the user's browsing experience.

For a user agent to process the policy that applies to a given entity, it must locate the policy reference file for that entity, fetch the policy reference file, parse the policy reference file, fetch any required P3P policies, and then parse the P3P policy or policies.

This document does not specify how P3P policies may be associated with documents retrieved by means other than HTTP. However, it does not preclude future development of mechanisms for associating P3P policies with documents retrieved over other protocols. Furthermore, additional methods of associating P3P policies with documents retrieved using HTTP may be developed in the future.

2.2.1 Well-Known Location

Web sites using P3P SHOULD place a policy reference file in a "well-known" location. To do this, a policy reference file would be placed in the site's /w3c directory, under the name p3p.xml. Thus a user agent could request this policy reference file by using a GET request for the resource /w3c/p3p.xml.

Note that sites are not required to use this mechanism; however, by using this mechanism, sites can ensure that their P3P policy will be accessible to user agents before any other resources are requested from the site. This will reduce the need for user agents to access the site using safe zone practices. Additionally, if a site chooses to use this mechanism, the policy reference file located in the well-known location is not required to cover the entire site. For example, sites where not all of the content is under the control of a single organization MAY choose not to use this mechanism, or MAY choose to post a policy reference file which covers only a limited portion of the site.

Use of the well-known location for a policy reference file does not preclude use of other mechanisms for specifying a policy reference file. Portions of the site MAY use any of the other supported mechanisms to specify a policy reference file, so long as the non-ambiguity requirements are met.

For example, imagine a shopping-mall Web site run by the MallExample company. On their Web site (mall.example.com), companies offering goods or services at the mall would get a company-specific subtree of the site, perhaps in the path /companies/company-name. The MallExample company may choose to put a policy reference file in the well-known location which covers all of their site except the /companies subtree. Then if the ShoeStoreExample company has some content in /companies/shoestoreexample, they could use one of the other mechanisms to indicate the location of a policy reference file covering their portion of the mall.example.com site.

One case where using the well-known location for policy reference files is expected to be particularly useful is in the case of a site which has divided its content across several hosts. For example, consider a site which uses a different logical host for all of its Web-based applications than for its static HTML content. The other mechanisms allowed for specifying the location of a policy reference file require that some URI on the host being accessed must be fetched to locate the policy reference file. However, the well-known location mechanism has no such requirement. Consider the example of an HTML form located on www.example.com. Imagine that the action URI on that form points to server cgi.example.com. The policy reference file that covers the form is unable to make any statements about the action URI that processes the form. However, the site administrator publishes a policy reference file at http://cgi.example.com/w3c/p3p.xml that covers the action URI, thus enabling a user agent to easily locate the P3P policy that applies to the action URI before submitting the form contents.
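As an illustration of the well-known location mechanism, the following Python sketch derives the policy reference file URI for the host of any request URI. The function name `well_known_prf_uri` is a hypothetical helper, not part of this specification; the sketch assumes that the scheme and authority (host and port) of the request are reused unchanged.

```python
from urllib.parse import urlsplit, urlunsplit

WELL_KNOWN_PATH = "/w3c/p3p.xml"  # path given in section 2.2.1


def well_known_prf_uri(request_uri):
    """Return the well-known policy reference file URI for the host of request_uri."""
    parts = urlsplit(request_uri)
    # Keep scheme and authority; drop the path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN_PATH, "", ""))
```

For the form example above, `well_known_prf_uri("http://cgi.example.com/app/submit?x=1")` yields `http://cgi.example.com/w3c/p3p.xml`, which a user agent could fetch before submitting the form.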

2.2.2 HTTP Headers

Any document retrieved by HTTP MAY point to a policy reference file through the use of a new response header, the P3P header ([P3P-HEADER]). If a site is using P3P headers, it SHOULD include this on responses for all appropriate request methods, including HEAD and OPTIONS requests.

The P3P header gives one or more comma-separated directives. The syntax follows:

[1]  p3p-header  =  `P3P: ` p3p-header-field *(`,` p3p-header-field)

[2]  p3p-header-field  =  policy-ref-field | compact-policy-field | extension-field

[3]  policy-ref-field  =  `policyref="` URI `"`

[4]  extension-field  =  token [`=` (token | quoted-string) ]

Here, URI is defined as per RFC 2396 [URI], token and quoted-string are defined by [HTTP1.1].

In keeping with the rules for other HTTP headers, the name of the P3P header may be written with any casing. The contents should be specified using the casing precisely as specified in this document.

The policyref directive gives a URI which specifies the location of a policy reference file which may reference the P3P policy covering the document that pointed to the reference file, and possibly others as well. When the policyref attribute is a relative URI, that URI is interpreted relative to the request URI. Note that fetching the URI given in the policyref directive MAY result in a 300-class HTTP return code (redirection); user agents MUST interpret those redirects with normal HTTP semantics. Services should note, of course, that use of redirects will increase the time required for user agents to find and interpret their policies. The policyref URI MUST NOT be used for any other purpose beyond locating and referencing P3P policies.

The compact-policy-field is used to specify "compact policies." This is described in Section 4.

User agents which find unrecognized directives (in the extension-fields) MUST ignore the unrecognized directives. This is to allow easier deployment of future versions of P3P.
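The handling of the P3P header described above can be sketched in Python as follows. This is a minimal illustration, not a conforming parser: the function names `parse_p3p_header` and `policy_ref_uri` are hypothetical, and the regular expression is a simplified approximation of the token and quoted-string rules from [HTTP1.1]. Directives other than policyref are kept in the result but otherwise not acted upon, which mirrors the requirement to ignore unrecognized directives.

```python
import re
from urllib.parse import urljoin

# name [= (token | quoted-string)], followed by a comma or the end of the value.
_DIRECTIVE = re.compile(
    r'\s*([^\s,="]+)\s*(?:=\s*("(?:[^"\\]|\\.)*"|[^\s,"]+))?\s*(?:,|$)'
)


def parse_p3p_header(value):
    """Split a P3P header value into a dict of directive name -> value (or None)."""
    directives = {}
    pos = 0
    while pos < len(value):
        match = _DIRECTIVE.match(value, pos)
        if match is None:
            break  # malformed remainder: stop parsing (an assumption of this sketch)
        name, raw = match.group(1), match.group(2)
        if raw is not None and raw.startswith('"'):
            raw = raw[1:-1]  # strip the surrounding quotes
        directives[name] = raw
        pos = match.end()
    return directives


def policy_ref_uri(request_uri, header_value):
    """Resolve the policyref directive, if present, relative to the request URI."""
    ref = parse_p3p_header(header_value).get("policyref")
    return urljoin(request_uri, ref) if ref else None
```

A relative value such as `policyref="/P3P/PolicyReferences.xml"` received for `http://catalog.example.com/index.html` resolves to `http://catalog.example.com/P3P/PolicyReferences.xml`.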

Example 2.1:

  1. Client makes a GET request.

    GET /index.html HTTP/1.1
    Host: catalog.example.com
    Accept: */*
    Accept-Language: de, en
    User-Agent: WonderBrowser/5.2 (RT-11)
    
  2. Server returns content and the P3P header pointing to the policy of the page.

    HTTP/1.1 200 OK
    P3P: policyref="http://catalog.example.com/P3P/PolicyReferences.xml"
    Content-Type: text/html
    Content-Length: 7413
    Server: CC-Galaxy/1.3.18
    

2.2.3 The HTML link Tag

Servers MAY serve HTML content with embedded link tags that indicate the location of the relevant P3P policy reference file. This use of P3P does not require any change in the server behavior.

The link tag encodes the policy reference information that could be expressed using the P3P header. The link tag takes the following form:

[5]  p3p-link-tag  =  `<link rel="P3Pv1" href="` URI `">`

Here, URI is defined as per RFC 2396 [URI].

When the href attribute is a relative URI, that URI is interpreted relative to the request URI.

In order to illustrate with an example the use of the link tag, we consider the policy reference expressed in Example 2.1 using HTTP headers. That example can be equivalently expressed using the link tag with the following piece of HTML:

<link rel="P3Pv1"
    href="http://catalog.example.com/P3P/PolicyReferences.xml">

Finally, note that since the p3p-link-tag is embedded in an HTML document, its character encoding will be the same as that of the HTML document. In contrast to P3P policy and policy reference documents (see section 2.3 and section 3 below), the p3p-link-tag need not be encoded using [UTF-8]. Note also that the link tag is not case sensitive.
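One way a user agent might discover policy reference files from link tags is sketched below in Python, using the standard library HTML parser. The class name `P3PLinkFinder` and the function `find_policy_ref_links` are hypothetical. The sketch compares the rel value case-insensitively and resolves relative href values against the request URI, as described in this section.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class P3PLinkFinder(HTMLParser):
    """Collect href values of link tags whose rel attribute is P3Pv1."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "link":
            return
        attr = dict(attrs)
        if (attr.get("rel") or "").lower() == "p3pv1" and attr.get("href"):
            self.hrefs.append(attr["href"])


def find_policy_ref_links(html_text, request_uri):
    """Return absolute policy reference file URIs found in html_text."""
    finder = P3PLinkFinder()
    finder.feed(html_text)
    return [urljoin(request_uri, href) for href in finder.hrefs]
```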

2.2.4 HTTP Ports and Other Protocols

The mechanisms described here MAY be used for HTTP transactions over any underlying protocol. This includes plain-text HTTP over TCP/IP connections as well as encrypted HTTP over SSL connections, as well as HTTP over any other communications protocol network designers wish to implement.

URLs MAY contain TCP/IP port numbers, as specified in RFC 2396 [URI]. For the purposes of P3P, the different ports on a single host MUST be considered to be separate "sites." Thus, for example, the policy reference file at the well-known location for www.example.com on port 80 (http://www.example.com/w3c/p3p.xml) would not give any information about the policies which apply to www.example.com when accessed over SSL (as the SSL communication would take place on a different port, 443 by default).
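The rule that different ports are separate "sites" can be sketched in Python by keying cached policy reference files on the pair of host and port. The helper name `site_key` and the table of default ports are assumptions made for this illustration only.

```python
from urllib.parse import urlsplit

# Default ports assumed when the URI does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def site_key(uri):
    """Return (host, port), the unit treated as one "site" for P3P purposes."""
    parts = urlsplit(uri)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.hostname, port)
```

Under this sketch, `http://www.example.com/` and `https://www.example.com/` produce different keys, so a policy reference file fetched for one is never applied to the other.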

2.3 Policy Reference File Syntax and Semantics

This section explains the contents of policy reference files in detail.

2.3.1 Example Policy Reference File

Consider the case of a Web site wishing to make the following statements:

  1. P3P policy /P3P/Policies.xml#first applies to the entire site, except the subtrees /catalog, /cgi-bin, and /servlet.

  2. P3P policy /P3P/Policies.xml#second applies to all documents in the /catalog directory (and its subdirectories).

  3. P3P policy /P3P/Policies.xml#third applies to all documents in the /cgi-bin and /servlet directories (and their subdirectories), except for /servlet/unknown.

  4. No statement is made about what P3P policy applies to /servlet/unknown.

  5. These statements are valid for 2 days.

These statements could be represented by the following piece of XML:

Example 2.2:

<META xmlns="http://www.w3.org/2001/09/P3Pv1">
 <POLICY-REFERENCES>
  <EXPIRY max-age="172800"/>

    <POLICY-REF about="/P3P/Policies.xml#first">
      <INCLUDE>/*</INCLUDE>
      <EXCLUDE>/catalog/*</EXCLUDE>
      <EXCLUDE>/cgi-bin/*</EXCLUDE>
      <EXCLUDE>/servlet/*</EXCLUDE>
    </POLICY-REF>

    <POLICY-REF about="/P3P/Policies.xml#second">
      <INCLUDE>/catalog/*</INCLUDE>
    </POLICY-REF>

    <POLICY-REF about="/P3P/Policies.xml#third">
      <INCLUDE>/cgi-bin/*</INCLUDE>
      <INCLUDE>/servlet/*</INCLUDE>
      <EXCLUDE>/servlet/unknown</EXCLUDE>
    </POLICY-REF>

 </POLICY-REFERENCES>
</META>

Note that this example also includes, via the EXPIRY element, a relative expiry time (cf. Section 2.3.2.3.2).
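A policy reference file such as Example 2.2 can be read with an ordinary XML parser. The following Python sketch uses the standard library; the function name `load_policy_reference_file` and the shape of the returned dictionary are hypothetical choices made for illustration, and only the max-age, INCLUDE, and EXCLUDE information is extracted.

```python
import xml.etree.ElementTree as ET

# Namespace used by the META element in Example 2.2.
P3P_NS = "{http://www.w3.org/2001/09/P3Pv1}"


def load_policy_reference_file(xml_text):
    """Return the max-age value and the POLICY-REF entries, in document order."""
    root = ET.fromstring(xml_text)
    refs_el = root.find(P3P_NS + "POLICY-REFERENCES")
    if refs_el is None:
        return {"max_age": None, "policy_refs": []}
    expiry = refs_el.find(P3P_NS + "EXPIRY")  # find() returns the first match
    max_age = expiry.get("max-age") if expiry is not None else None
    policy_refs = []
    for ref in refs_el.findall(P3P_NS + "POLICY-REF"):
        policy_refs.append({
            "about": ref.get("about"),
            "include": [(e.text or "").strip() for e in ref.findall(P3P_NS + "INCLUDE")],
            "exclude": [(e.text or "").strip() for e in ref.findall(P3P_NS + "EXCLUDE")],
        })
    return {"max_age": max_age, "policy_refs": policy_refs}
```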

2.3.2 Policy Reference File Definition

This section defines the syntax and semantics of P3P policy reference files. All policies MUST be encoded using [UTF-8]. P3P servers MUST encode their policy references using this syntax. P3P user agents MUST be able to parse this syntax.

One significant point to make about the syntax of policy reference files is that the syntax defined here does not have an extension mechanism. The syntax for P3P policies has a powerful extension mechanism, but that mechanism is not supported for policy reference files.

2.3.2.1 Policy Reference File Processing

2.3.2.1.1 Significance of Order

A policy reference file may contain multiple POLICY-REF elements. If it does contain more than one element, they MUST be processed by user agents in the order given in the file. When a user agent is attempting to determine what policy applies to a given URI, it MUST use the first POLICY-REF element in the policy reference file which applies to that URI.

Note that each POLICY-REF may contain multiple INCLUDE, EXCLUDE, METHOD, COOKIE-INCLUDE, and COOKIE-EXCLUDE elements and that all of these elements within a given POLICY-REF MUST be considered together to determine whether the POLICY-REF applies to a given URI. Thus, it is not sufficient to find an INCLUDE element that matches a given URI, as EXCLUDE or METHOD elements may serve as modifiers that cause the POLICY-REF not to match.

2.3.2.1.2 Wildcards in Policy Reference Files

Policy reference files make statements about what policy applies to a given URI. Policy reference files support a simple wildcard character to allow making statements about regions of URI-space. The character asterisk ("*") is used to represent a sequence of 0 or more of any character. No other special characters (such as those found in regular expressions) are supported. Note that since the asterisk is also a legal character in URIs ([URI]), some special conventions have to be followed when encoding such "extended URIs" in a policy reference file:

  • URIs represented in policy-ref files MUST be properly escaped, as in [URI].

  • P3P user agents MUST escape any characters which should be escaped, as according to [URI], before attempting to match a URI for a policy.

  • P3P user agents MUST un-escape any escaped sequences which resolve to URI-legal characters, according to [URI], before attempting to match a URI for a policy, EXCEPT

  • Literal '*'s in URIs MUST be escaped by P3P user agents before attempting to match a URI for a policy.

  • P3P user agents MUST ignore any URI pattern that does not conform to [URI].

The wildcard character MAY be used in the INCLUDE and EXCLUDE elements, in the COOKIE-INCLUDE and COOKIE-EXCLUDE elements, and in the HINT element.
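The wildcard rule can be sketched in Python by translating a pattern into a regular expression in which only the asterisk is special. The function names are hypothetical, and the sketch assumes that both the pattern and the URI have already been escaped and un-escaped as required by the list above; the only normalization it performs itself is escaping a literal '*' in the URI as %2A.

```python
import re


def wildcard_to_regex(pattern):
    """Compile a pattern in which '*' stands for zero or more of any character."""
    # Every other character, including '.', '?' and '[', is matched literally.
    literal_parts = (re.escape(part) for part in pattern.split("*"))
    return re.compile(".*".join(literal_parts), re.DOTALL)


def wildcard_match(pattern, uri):
    """Return True if the whole uri is matched by pattern."""
    uri = uri.replace("*", "%2A")  # a literal '*' in the URI must be escaped first
    return wildcard_to_regex(pattern).fullmatch(uri) is not None
```

For example, `wildcard_match("/catalog/*", "/catalog/shoes/index.html")` is true, while `wildcard_match("/img/*.gif", "/img/x.png")` is false.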

2.3.2.2 The META and POLICY-REFERENCES Elements

The META element contains a complete policy reference file. Optionally, one POLICIES element can follow. Additionally, other XML markup MAY follow the POLICY-REFERENCES (or POLICIES, if present) element, although that markup MUST be ignored by any P3P1.0 user agent.

<POLICY-REFERENCES>

This element MAY contain one or more POLICY-REF (policy reference) elements. It MAY also contain one EXPIRY element (indicating their expiration time), and one or more HINT elements.

[6]  prf  =  `<META xmlns="http://www.w3.org/2001/09/P3Pv1">`
             policyrefs
             [policies]
             PCDATA
             "</META>"

[7]  policyrefs  =  "<POLICY-REFERENCES>"
                    [expiry]
                    *policyref
                    *hint
                    "</POLICY-REFERENCES>"

Here PCDATA is defined in [XML].

2.3.2.3 Policy Reference File Lifetimes and the EXPIRY Element

2.3.2.3.1 Motivation and Mechanism

It is desirable for servers to inform user agents about how long they can use the claims made in a policy reference file. Enabling clients to cache the contents of a policy reference file reduces the time required to process the privacy policy associated with a Web page, and also reduces load on the network. In addition, clients that don't have a valid policy reference file for a URI will need to use "safe zone" practices for their requests. If clients have policy reference files that they know are still valid, then they can make more informed decisions on how to proceed.

In order to achieve these benefits, policy reference files SHOULD contain an EXPIRY element, which indicates the lifetime of the policy reference file. If the policy reference file does not contain an EXPIRY element, then it is given a 24-hour lifetime.

The lifetime of a policy reference file tells user agents how long they can rely on the claims made in the policy reference file. By setting the lifetime of a policy reference file, the publishing site agrees that the policies mentioned in the policy reference file are appropriate for the lifetime of the policy reference file. For example, if a policy reference file has a lifetime of 3 days, then a user agent need not reload that file for 3 days, and can assume that the references made in that policy reference file are good for 3 days. All of the policy references made in a single policy reference file will receive the same lifetime. The only way to specify different lifetimes for different policy references is to use separate policy reference files.

The same mechanism used to indicate the lifetime of a policy reference file is also used to indicate the lifetime of a P3P policy. Thus P3P POLICIES elements SHOULD have an EXPIRY element associated with them as well. This lifetime applies to all P3P policies contained within that POLICIES element. If there is no EXPIRY element associated with a P3P policy, then it is given a 24-hour lifetime.

When picking a lifetime for policies and policy reference files, sites need to pick a lifetime which balances two competing concerns. One concern is that the lifetime ought to be long enough to allow user agents to receive significant benefits from caching. The other concern is that the site would like to be able to change their policy for new data collection without waiting for an extremely long lifetime to expire. It is expected that lifetimes in the range of 1 to 7 days would be a reasonable balance between these two competing desires. Sites also need to remember the policy update requirements when updating their policies.

When a policy reference file has expired, the information in the policy reference file MUST NOT be used by a user agent until that user agent has successfully revalidated the policy reference file, or has fetched a new copy of the policy reference file.

Note that while user agents are not obligated to revalidate policy reference files or policy files that have not expired, they MAY choose to revalidate those files before their expiry period has passed, in order to reduce the need for using "safe zone" practices. A valid P3P user agent implementation doesn't need to contain a cache for policies and policy reference files, though the implementation will perform better if it does.

2.3.2.3.2 The EXPIRY Element

The EXPIRY element can be used in a policy reference file and/or in a POLICIES element to state how long the policy reference file (or policies) remains valid. The expiry is given as either an absolute expiry time, or a relative expiry time. An absolute expiry time is a time, given in GMT, until which the policy reference file (or policies) is valid. A relative expiry time gives a number of seconds for which the policy reference file (or policies) is valid. This expiry time is relative to the time the policy reference file (or policies) was requested or last revalidated by the client. This computation MUST be done using the time of the original request or revalidation, and the current time, with both times generated from the client's clock. Revalidation is defined in section 13.3 of [HTTP1.1].

The minimum amount of time for any relative expiry time is 24 hours, or 86400 seconds. Any relative expiration time shorter than 86400 seconds MUST be treated as being equal to 86400 seconds in a client implementation. If a client encounters an absolute expiration time that is in the past, it MUST act as if NO policy reference file (or policy) is available. See section 2.4.7 "Absence of Policy Reference File" for the required procedure in such cases.

[8]  expiry  =  "<EXPIRY" (absdate|reldate) "/>"

[9]  absdate  =  `date="` HTTP-date `"`

[10]  reldate  =  `max-age="` delta-seconds `"`

Here, HTTP-date is defined in section 3.3.1 of [HTTP1.1], and delta-seconds is defined in section 3.3.2 of [HTTP1.1].
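The lifetime rules of this section can be sketched in Python as a function that turns the EXPIRY attributes into a deadline. The name `expiry_deadline` and its parameters are hypothetical: `fetched_at` stands for the client-clock time of the original request or last revalidation, and `max_age` and `date` are the raw attribute strings (both None when no EXPIRY element is present). Returning None for a malformed value is this sketch's way of signalling that the file should be treated as absent; comparing the deadline against the current time is left to the caller.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

MIN_LIFETIME = timedelta(seconds=86400)  # 24 hours


def expiry_deadline(fetched_at, max_age=None, date=None):
    """Return the time until which the file may be used, or None if unusable."""
    if max_age is None and date is None:
        return fetched_at + MIN_LIFETIME  # no EXPIRY element: 24-hour lifetime
    if max_age is not None:
        if not re.fullmatch(r"[0-9]+", max_age):
            return None  # malformed relative expiry
        # Relative lifetimes shorter than 86400 seconds are raised to 86400.
        return fetched_at + max(timedelta(seconds=int(max_age)), MIN_LIFETIME)
    try:
        deadline = parsedate_to_datetime(date)
    except (TypeError, ValueError):
        return None  # malformed absolute expiry
    if deadline.tzinfo is None:
        deadline = deadline.replace(tzinfo=timezone.utc)  # absolute times are in GMT
    return deadline
```

A caller would then treat the file as unavailable when the result is None or when an absolute deadline already lies in the past.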

2.3.2.3.3 Requesting Policies and Policy Reference Files

In a real-world network, there may be caches which will cache the contents of policies and policy reference files. This is good for increasing the overall network performance, but may have deleterious effects on the operation of P3P if not used correctly. There are two specific concerns:

  1. When a user agent receives a policy reference file (or policy), if it was served from a network cache, the user agent needs to know how long the policy reference file or policy resided in the network cache. This time MUST be subtracted from the lifetime of the policy or policy reference file which uses relative expiry.

  2. When a user agent needs to revalidate a policy reference file (or policy), it needs to make sure that the revalidation fetches a current version of the policy reference file (or policy). For example, consider the case where a user agent holds a policy reference file with a 1 day relative expiry. If the user agent refetches it from a network cache, and the file has been residing in the network cache for 3 days, then the resulting file is useless.

HTTP 1.1 [HTTP1.1] contains powerful cache-control mechanisms to allow clients to place requirements on the operations of network caches; these mechanisms can resolve the problems mentioned above. The specific method will be discussed below.

HTTP 1.0, however, does not provide those more sophisticated cache control mechanisms. An HTTP 1.0 network cache will, in all likelihood, compute a cache lifetime for the policy reference file (or policies) based on the file's last-modified date; the resulting cache lifetime could be significantly longer than the lifetime specified by the EXPIRY element. The network cache could then serve the policy reference file (or policies) to clients beyond the lifetime in the EXPIRY; the result would be that user-agents would receive a useless policy reference file (or policies).

The second problem with HTTP 1.0 network caches is that a user agent has no way to know how long the reference file may have been stored by the network cache. If the policy reference file (or policies) relies on relative expiry, it would then be impossible for the user agent to determine if the reference file's lifetime has already expired, or when it will expire.

Thus, if a user agent is requesting a policy reference file or a policy, and does not know for certain that there are no HTTP 1.0 caches in the path to the origin server, then the request must force an end-to-end revalidation. This can be done with the Pragma: no-cache HTTP request-header. Note that neither HTTP nor P3P defines a way to determine if there is an HTTP 1.0-compliant cache in any given network path, so unless the user agent has this information derived from an outside source, it MUST force the end-to-end revalidation.

If the user agent has some way to know that all caches in the network path to the origin server are compliant with HTTP 1.1 (or that there are no caches in the network path to the origin server), then the client MUST do the following:

  1. Use cache-control request-headers to ensure that the received response is not older than its lifetime. This is done with the max-age cache-control setting, with a maximum age significantly less than the lifetime of the policy reference file (or policies). For example, a user agent could send Cache-Control: max-age=43200, thus ensuring that the response is no more than 12 hours old.

  2. Subtract the age of the response from the lifetime of the policy reference file (or policies), if it uses a relative expiry time. The age of the response is given by the Age: HTTP response-header.

Note that it is impossible for a client to accurately predict the amount of latency that may affect an HTTP request. Thus, if the policy reference file covering a request is going to expire soon, clients MAY wish to consider warning their users and/or revalidating the policy reference file before continuing with the request.
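The request and bookkeeping rules above can be sketched in Python as two small helpers. Both names are hypothetical. The flag `all_caches_http11` stands for the outside knowledge that every cache on the path is HTTP 1.1 compliant, and treating a missing or unparseable Age header as zero is an assumption of this sketch rather than a rule stated here.

```python
def cache_request_headers(all_caches_http11, max_response_age=43200):
    """Choose request headers for fetching a policy or policy reference file."""
    if all_caches_http11:
        # Accept a cached response only if it is younger than max_response_age.
        return {"Cache-Control": "max-age=%d" % max_response_age}
    # Otherwise force an end-to-end revalidation.
    return {"Pragma": "no-cache"}


def remaining_lifetime(relative_lifetime, age_header):
    """Subtract the response age (in seconds) from a relative lifetime."""
    age = 0
    if age_header is not None and age_header.strip().isdecimal():
        age = int(age_header.strip())
    return max(relative_lifetime - age, 0)
```

For instance, a file with a relative lifetime of 172800 seconds that arrives with `Age: 3600` may be relied on for a further 169200 seconds.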

2.3.2.3.4 Error Handling for Policy Reference File and Policy Lifetimes

The following situations have their semantics specifically defined:

  1. An absolute expiry date in the past renders the policy reference file (or policies) useless, as does an invalid or malformed expiry date, whether relative or absolute. In this case, user agents MUST act as if NO policy reference file (or policies) is available. See section 2.4.7 "Absence of Policy Reference File" for the required procedure in such cases.

  2. A relative expiration time shorter than 86400 seconds (1 day) is considered to be equal to 86400 seconds.

  3. When a policy reference file contains more than one EXPIRY element, the first one takes precedence for determining the lifetime of the policy reference file.

2.3.2.4 The POLICY-REF Element

A policy reference file may refer to multiple P3P policies, specifying information about each. The POLICY-REF element describes attributes of a single P3P policy. Elements within the POLICY-REF element give the location of the policy and specify the areas of URI-space (and cookies) that each policy covers.

POLICY-REF

Contains information about a single P3P policy.

  • about (mandatory attribute)

    URI reference ([URI]), where the fragment identifier part denotes the name of the policy (given in its name attribute), and the URI part denotes the URI where the policy resides. If this is a relative URI reference, it is interpreted relative to the URI of the policy reference file.

[11]  policy-ref  =  `<POLICY-REF about="` URI-reference `">`
                     *include
                     *exclude
                     *cookie-include
                     *cookie-exclude
                     *method-element
                     `</POLICY-REF>`

Here, URI-reference is defined as per RFC 2396 [URI].
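How the about attribute splits into a policy location and a policy name can be sketched in Python with the standard URL helpers. The function name `resolve_about` is hypothetical; the sketch resolves a relative reference against the URI of the policy reference file, as described above.

```python
from urllib.parse import urldefrag, urljoin


def resolve_about(prf_uri, about):
    """Return (policy file URI, policy name) for a POLICY-REF about value."""
    absolute = urljoin(prf_uri, about)  # relative to the policy reference file
    policy_uri, policy_name = urldefrag(absolute)
    return policy_uri, policy_name
```

With the policy reference file at `http://catalog.example.com/P3P/PolicyReferences.xml`, the value `/P3P/Policies.xml#first` resolves to the file `http://catalog.example.com/P3P/Policies.xml` and the policy name `first`.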

2.3.2.5 The INCLUDE and EXCLUDE Elements

Each INCLUDE or EXCLUDE element specifies one local URI or set of local URIs. A set of URIs is specified if the wildcard character '*' is used in the URI-pattern. These elements are used to specify the portion of the Web site that is covered by the policy referenced by the enclosing POLICY-REF element.

When INCLUDE (and optionally, EXCLUDE) elements are present in a POLICY-REF element, it means that the policy specified in the about attribute of the POLICY-REF element applies to all the URIs at the requested host corresponding to the local-URI(s) matched by any of the INCLUDEs, but not matched by an EXCLUDE element.

A policy referenced in a policy reference file can be applied only to URIs on the DNS (Domain Name System) host that reference it. The INCLUDE and EXCLUDE elements MUST specify URI patterns relative to the root of the DNS host to which they are applied. This requirement does NOT apply to the location of the P3P policy file (the about attribute on the POLICY-REF element).

If a METHOD element (section 2.3.2.8) specifies one or more methods for an enclosing policy reference, it follows that all methods not mentioned are consequently not covered by this policy. In the case that this is the only policy reference for a given URI prefix, user agents MUST assume that NO policy is in effect for all methods NOT mentioned in the policy reference file. It is legal but pointless to supply a METHOD element without any INCLUDE or COOKIE-INCLUDE elements.

It is legal, but pointless, to supply an EXCLUDE element without any INCLUDE elements; in that case, the EXCLUDE element MUST be ignored by user agents.

Note that the set of URIs specified with INCLUDE and EXCLUDE does not include cookies that might be triggered when requesting one of such URIs: in order to associate policies with cookies, the COOKIE-INCLUDE and COOKIE-EXCLUDE elements are needed.

[12]  include  =  "<INCLUDE>" relativeURI "</INCLUDE>"

[13]  exclude  =  "<EXCLUDE>" relativeURI "</EXCLUDE>"

Here, relativeURI is defined as per RFC 2396 [URI], with the addition that the '*' character is to be treated as a wildcard, as defined in section 2.3.2.1.2.
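The combined effect of INCLUDE, EXCLUDE, and METHOD elements, together with the rule that the first applicable POLICY-REF is used, can be sketched in Python as follows. The `PolicyRef` class and the function names are hypothetical, cookie elements are left out, and the sketch assumes that an empty `methods` list means that no METHOD element was given, so every method is covered.

```python
import re
from dataclasses import dataclass, field


@dataclass
class PolicyRef:
    about: str
    includes: list = field(default_factory=list)
    excludes: list = field(default_factory=list)
    methods: list = field(default_factory=list)  # empty: all methods covered


def _match(pattern, local_uri):
    # '*' matches zero or more characters; everything else is literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, local_uri, re.DOTALL) is not None


def applies(ref, local_uri, method="GET"):
    """Consider all elements of one POLICY-REF together."""
    if ref.methods and method not in ref.methods:
        return False
    if not any(_match(p, local_uri) for p in ref.includes):
        return False  # an EXCLUDE without any INCLUDE therefore has no effect
    return not any(_match(p, local_uri) for p in ref.excludes)


def find_policy(refs, local_uri, method="GET"):
    """Return the about value of the first POLICY-REF that applies, else None."""
    for ref in refs:
        if applies(ref, local_uri, method):
            return ref.about
    return None
```

Applied to the three POLICY-REF elements of Example 2.2, this sketch selects the second policy for `/catalog/shoes.html` and no policy at all for `/servlet/unknown`.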

2.3.2.6 The HINT Element

Policy reference hints are a performance optimization that can be used under certain conditions. A DNS host may declare a policy reference for itself using the well-known location, the P3P response header, or the HTML link tag. The host MAY further provide a hint to additional policy references, such as those declared by other hosts. For example, an HTML page might hint at policy references for its hyperlinks, embedded content, and form submission URIs. User agents MAY use the hint mechanism to discover policy references before requesting the affected URIs when the policy references are not available from the well-known location.

Any policy reference file MAY contain zero or more policy reference hints. Each hint is contained in a HINT element, and consists of a single host or domain of hosts to which the hinted policy reference can be applied. When using a hint applicable to multiple hosts, the policy reference is expected in the same relative location on each host, but the content may vary according to the host. Therefore, a user agent that finds a policy reference on a particular host via the hint mechanism MUST NOT apply it to another host.

The domain attribute is used to domain-match (possibly using the '*' wildcard) the host(s) to which the hinted policy reference file can be applied. The path attribute specifies the location of the hinted policy reference files relative to the applicable host rather than the policy reference file containing the hint.

Here is an example of HINT elements that hint at the location of policy reference files on the host example.org and on any host in the domain shop.example.com:

Example 2.3:

<HINT domain="example.org" path="/mypolicy/p2.xml"/>
<HINT domain="*.shop.example.com" path="/w3c/prf.xml"/>

If a hinted policy reference file is not found, expired, or otherwise invalid, the user agent MUST ignore the hint. Before using a hinted policy reference, the user agent MUST check the well-known location and give precedence to any policy references directly declared by the host, with the well-known location taking the highest precedence. If a hinted policy reference is not directly declared by the host as expected, the user agent MAY ignore it.

[14] hint = `<HINT domain="` HN `" path="` token `"/>`

Here, HN and token are defined as per RFC 2965 [STATE], with the addition that in HN the '*' character is to be treated as a wildcard, as defined in section 2.3.2.1.2.
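
The hint rules above can be sketched in Python. This is a minimal illustration, not a normative algorithm. It assumes that '*' stands for any run of characters (the wildcard itself is defined in section 2.3.2.1.2), that host names compare case-insensitively, and that the hinted file is fetched over http. The function names wildcard_match and hinted_prf_uris are hypothetical.

```python
import re


def wildcard_match(pattern: str, text: str) -> bool:
    """Return True if text matches pattern; '*' is assumed to match any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text) is not None


def hinted_prf_uris(host, hints):
    """Build candidate policy reference file URIs for `host` from (domain, path) hints.

    The path is resolved against the matching host itself, not against the
    location of the policy reference file that contained the hint.
    """
    host = host.lower()
    return [
        f"http://{host}{path}"  # the http scheme is an assumption of this sketch
        for domain, path in hints
        if wildcard_match(domain.lower(), host)
    ]


# The two HINT elements of Example 2.3 as (domain, path) pairs.
hints = [
    ("example.org", "/mypolicy/p2.xml"),
    ("*.shop.example.com", "/w3c/prf.xml"),
]

print(hinted_prf_uris("eu.shop.example.com", hints))
```

Because each candidate URI is built from the requesting host, a file retrieved this way is only ever associated with that host, which is in keeping with the rule that a hinted policy reference MUST NOT be applied to another host.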

2.3.2.7 The COOKIE-INCLUDE and COOKIE-EXCLUDE Elements

The COOKIE-INCLUDE and COOKIE-EXCLUDE elements are used to associate policies to cookies.

A cookie policy MUST cover any data (within the scope of P3P) that is stored in that cookie or linked via that cookie. It MUST also reference all purposes associated with data stored in that cookie or enabled by that cookie. In addition, any data/purpose stored or linked via a cookie MUST also be put in the cookie policy. Furthermore, if that linked data is collected by HTTP, then the policy that covers the HTTP request (GET, POST, or any other method) must cover that data collection.

For example, when CatalogExample asks customers to fill out a form with their name, billing, and shipping information, the P3P policy that covers the form submittal will disclose that CatalogExample collects this data and explain how it is used. If CatalogExample sets a cookie so that it can recognize its customers and observe their behavior on its Web site, it would have a separate policy for this cookie. However, if this cookie is also linked to the user's name, billing, and shipping information (perhaps so that CatalogExample can generate custom catalog pages based on where the customer lives), then that data must also be disclosed in the cookie policy.

For the purpose of this specification, state management mechanisms use either SET-COOKIE or SET-COOKIE2 headers, and cookie-namespace is defined as the value of the NAME, VALUE, Domain and Path attributes, specified in [COOKIES] and [STATE].

Each COOKIE-INCLUDE or COOKIE-EXCLUDE element can be used to match (similarly to INCLUDE and EXCLUDE) the NAME, VALUE, Domain and Path components of a cookie, expressing the cookies which are covered by the policy specified by the about attribute when the cookies are set from the documents on the Web site where the policy reference file resides:

COOKIE-INCLUDE (resp. COOKIE-EXCLUDE)

Include (resp. exclude) cookies that match the name, value, domain and path attributes.

  • name: match the NAME portion of the cookie

  • value: match the VALUE portion of the cookie

  • domain: match the Domain portion of the cookie

  • path: match the Path portion of the cookie

All four attributes are optional. If an attribute is absent, the COOKIE-INCLUDE (resp. COOKIE-EXCLUDE) will match cookies that have that attribute set to any value.

When COOKIE-INCLUDE (and optionally, COOKIE-EXCLUDE) elements are present in a POLICY-REF element, the policy specified in the about attribute of the POLICY-REF element applies to every cookie that is matched by any COOKIE-INCLUDE element and not matched by any COOKIE-EXCLUDE element.

A site MUST NOT declare policies for cookies unless the cookies are set by its own site. User agents MUST accordingly interpret COOKIE-INCLUDE and COOKIE-EXCLUDE elements in a policy reference file to determine the policy that applies to cookies. Note that COOKIE-INCLUDE and COOKIE-EXCLUDE are the only mechanisms for associating policies with cookies in policy reference files (see Section 4).

The policy that applies to a cookie applies until the policy expires, even if the associated policy reference file expires prior to policy expiry (but after the cookie was set). If the policy associated with a cookie has expired, then the user agent SHOULD reevaluate the cookie policy before sending the cookie. In addition, user agents MUST use only non-expired policies and policy reference files when evaluating new set-cookie events.

Example 2.4 states that /P3P/Policies.xml#first applies to all cookies.

Example 2.4:

<META xmlns="http://www.w3.org/2001/09/P3Pv1">
 <POLICY-REFERENCES>
    <POLICY-REF about="/P3P/Policies.xml#first">
       <COOKIE-INCLUDE name="*" value="*" domain="*" path="*"/>
    </POLICY-REF>
 </POLICY-REFERENCES>
</META>

Example 2.5 states that /P3P/Policies.xml#first applies to all cookies, except cookies with the cookie name value of "obnoxious-cookie", a domain value of ".example.com", and a path value of "/", and that /P3P/Policies.xml#second applies to all cookies with the cookie name of "obnoxious-cookie", a domain value of ".example.com", and a path value of "/".

Example 2.5:

<META xmlns="http://www.w3.org/2001/09/P3Pv1">
 <POLICY-REFERENCES>
    <POLICY-REF about="/P3P/Policies.xml#first">
       <COOKIE-INCLUDE name="*" value="*" domain="*" path="*"/>
       <COOKIE-EXCLUDE name="obnoxious-cookie" value="*"
          domain=".example.com" path="/"/>
    </POLICY-REF>
    <POLICY-REF about="/P3P/Policies.xml#second">
       <COOKIE-INCLUDE name="obnoxious-cookie" value="*"
          domain=".example.com" path="/"/>
    </POLICY-REF>
 </POLICY-REFERENCES>
</META>

[15] cookie-include =
"<COOKIE-INCLUDE"
   [` name="` token `"`]   ; matches the cookie's NAME
   [` value="` token `"`]  ; matches the cookie's VALUE
   [` domain="` token `"`] ; matches the cookie's Domain
   [` path="` token `"`]   ; matches the cookie's Path
"/>"

[16] cookie-exclude =
"<COOKIE-EXCLUDE"
   [` name="` token `"`]   ; matches the cookie's NAME
   [` value="` token `"`]  ; matches the cookie's VALUE
   [` domain="` token `"`] ; matches the cookie's Domain
   [` path="` token `"`]   ; matches the cookie's Path
"/>"

Here, token, NAME, VALUE, Domain and Path are defined as per RFC 2965 [STATE], with the addition that the '*' character is to be treated as a wildcard, as defined in section 2.3.2.1.2.

Note that [STATE] specifies default values for the domain and path attributes of cookies: these should be used in the comparison if those attributes are not found in a specific cookie. Also, conforming to [STATE], if an explicitly specified Domain value does not start with a full stop ("."), the user agent MUST prepend a full stop to it. Note also that every Path begins with the "/" symbol.
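
The cookie matching rules of this section can be sketched in Python as follows. This is an illustration under stated assumptions rather than a normative algorithm: '*' is assumed to match any run of characters, POLICY-REF elements are assumed to be tried in document order with the first match winning, and the cookie is assumed to carry explicitly specified Domain and Path values (applying the [STATE] defaults is left to the caller). The dictionary layout and the names rule_matches and policy_for_cookie are hypothetical.

```python
import re

ATTRS = ("name", "value", "domain", "path")


def wildcard_match(pattern, text):
    """'*' is assumed to match any run of characters, including none."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None


def normalize(cookie):
    """Prepend a full stop to an explicitly specified Domain that lacks one."""
    cookie = dict(cookie)
    if not cookie["domain"].startswith("."):
        cookie["domain"] = "." + cookie["domain"]
    return cookie


def rule_matches(rule, cookie):
    """An attribute absent from the rule matches any value."""
    return all(wildcard_match(rule.get(attr, "*"), cookie[attr]) for attr in ATTRS)


def policy_for_cookie(policy_refs, cookie):
    """Return the `about` URI of the first POLICY-REF covering the cookie, else None."""
    cookie = normalize(cookie)
    for ref in policy_refs:
        included = any(rule_matches(r, cookie) for r in ref.get("cookie_include", []))
        excluded = any(rule_matches(r, cookie) for r in ref.get("cookie_exclude", []))
        if included and not excluded:
            return ref["about"]
    return None


# Example 2.5 expressed as plain data (hypothetical layout).
obnoxious = {"name": "obnoxious-cookie", "value": "*", "domain": ".example.com", "path": "/"}
example_2_5 = [
    {
        "about": "/P3P/Policies.xml#first",
        "cookie_include": [{"name": "*", "value": "*", "domain": "*", "path": "*"}],
        "cookie_exclude": [obnoxious],
    },
    {"about": "/P3P/Policies.xml#second", "cookie_include": [obnoxious]},
]

cookie = {"name": "obnoxious-cookie", "value": "1", "domain": "example.com", "path": "/"}
print(policy_for_cookie(example_2_5, cookie))
```

The sample cookie is declared with the domain "example.com"; after the full stop is prepended it is excluded from the first POLICY-REF and picked up by the second.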

2.3.2.8 The METHOD Element

By default, a policy reference applies to the stated URIs regardless of the method used to access the resource. However, a Web site may wish to define different P3P policies depending on the method to be applied to a resource. For example, a site may wish to collect more data from users when they are performing PUT or DELETE methods than when performing GET methods.

The METHOD element in a policy reference file is used to state that the enclosing policy reference only applies when the specified methods are used to access the referenced resources. The METHOD element may be repeated to indicate multiple applicable methods. If the METHOD element is not present in a POLICY-REF element, then that POLICY-REF element covers the resources indicated regardless of the method used to access them.

So, to state that /P3P/Policies.xml#first applies to all documents in the subtree /docs/ for GET and HEAD methods, while /P3P/Policies.xml#second applies for PUT and DELETE methods, the following policy reference would be written:

Example 2.6:

<META xmlns="http://www.w3.org/2001/09/P3Pv1">
 <POLICY-REFERENCES>
    <POLICY-REF about="/P3P/Policies.xml#first">
      <INCLUDE>/docs/*</INCLUDE>
      <METHOD>GET</METHOD>
      <METHOD>HEAD</METHOD>
    </POLICY-REF>
    <POLICY-REF about="/P3P/Policies.xml#second">
      <INCLUDE>/docs/*</INCLUDE>
      <METHOD>PUT</METHOD>
      <METHOD>DELETE</METHOD>
    </POLICY-REF>
 </POLICY-REFERENCES>
</META>

Note that HTTP requires the same behavior for GET and HEAD requests; it is therefore inappropriate to specify different P3P policies for these methods. The syntax for the METHOD element is:

[17] method-element = `<METHOD>` Method `</METHOD>`

Here, Method is defined in section 5.1.1 of [HTTP1.1].

Finally, note that the METHOD element is designed to be used in conjunction with INCLUDE or COOKIE-INCLUDE elements. A METHOD element by itself will never apply a POLICY-REF to a URI.
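
A short Python sketch may help show how METHOD narrows a POLICY-REF. It is a simplified illustration only: it handles INCLUDE and METHOD but omits EXCLUDE and the cookie elements, it assumes '*' matches any run of characters, and it assumes POLICY-REF elements are tried in document order. The dictionary layout and the name policy_for_request are hypothetical.

```python
import re


def wildcard_match(pattern, text):
    """'*' is assumed to match any run of characters, including none."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.fullmatch(regex, text, flags=re.DOTALL) is not None


def policy_for_request(policy_refs, method, path):
    """Return the `about` URI of the first POLICY-REF covering (method, path), else None."""
    for ref in policy_refs:
        methods = ref.get("methods", [])
        # No METHOD elements means the reference applies regardless of method.
        if methods and method not in methods:
            continue
        # A METHOD without a matching INCLUDE never applies a policy.
        if any(wildcard_match(p, path) for p in ref.get("include", [])):
            return ref["about"]
    return None


# Example 2.6 expressed as plain data (hypothetical layout).
example_2_6 = [
    {"about": "/P3P/Policies.xml#first", "include": ["/docs/*"], "methods": ["GET", "HEAD"]},
    {"about": "/P3P/Policies.xml#second", "include": ["/docs/*"], "methods": ["PUT", "DELETE"]},
]

print(policy_for_request(example_2_6, "GET", "/docs/index.html"))
print(policy_for_request(example_2_6, "POST", "/docs/index.html"))
```

In this sketch a POST to /docs/index.html yields no policy, since neither POLICY-REF lists that method.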

2.3.3 Applying a Policy to a URI

A policy reference file specifies the policy which applies to a given URI. The meaning of this is that the indicated policy describes all effects of performing any of the methods listed in the policy reference file against the given URI.

There is a general rule which describes what it means for a P3P policy to cover a URI: the referenced policy MUST cover actions that the user's client software is expected to perform as a result of requesting that URI. Obviously, the policy must describe all data collection performed by the site as a result of processing the request for the URI. Thus, if a given URI is covered in terms of GET requests, then the policy given by the policy reference file MUST describe all data collection performed by the site when that URI is fetched. Likewise, if a URI is covered for POST requests, then any data collection that occurs as a result of posting a form or other content to that URI MUST be described by the policy.

The concept of "actions that the client software is expected to perform" includes the setting of client-side cookies or other state-management mechanisms invoked by the response. If executable code is returned when a URI is requested, then the P3P policy covering that URI MUST cover certain actions which will occur when that code is executed. The covered actions are any actions which could take place without the user explicitly invoking them. If explicit user action causes data to be collected, then the P3P policy covering the URI for that action would disclose that data collection.

Some specific examples:

  1. Fetching a URI returns an HTML page which contains a form, and the form contents are sent to a second URI when the user clicks a "Submit" button. The P3P policy covering the second URI MUST disclose all data collected by the form. The P3P policy covering the first URI (the URI the form was loaded from) MAY or MAY NOT disclose any of the data that will be collected on the form.

  2. An HTML page includes JavaScript code which tracks how long the page is displayed and whether the user moved the mouse over a certain object on the page; when the page is unloaded, the JavaScript code sends that information to the server where the HTML page originated. The activity of the JavaScript code MUST be covered by the P3P policy of the HTML page. The reasoning is that this activity takes place without the user's knowledge or consent, and it occurs automatically as a result of loading the page.

  3. A response is an installable image for an electronic mail program. In order to use the email program, the user must run an installation program, start the email program, and use its facilities. The P3P policy covering the URI from which the email program was downloaded is not required to make a statement about the data which could be collected by using the email program. Installing and running the email program is clearly outside the Web browsing experience, so it is not covered by this specification. A separate protocol could be designed to allow downloaded applications to present a P3P policy, but this is outside the scope of this specification.

  4. An HTML page containing a form includes a reference to an executable which provides a custom client-side control. The data in the control is submitted to a site when the form is submitted. In this case, the policies covering the URI for the HTML page and the URI for the custom control are not required to make a statement about the data the custom control represents. However, the policy covering the URI to which the form contents are posted MUST cover the data from the custom control, just as it would cover any other data collected by processing the form. This behavior is similar to the way HTML forms are handled when they use only standard HTML controls: the control itself collects no data, and the data is collected when the form is posted. Note that this example assumes that the form is only posted when the user actively presses a "submit" or similar button. If the form were posted automatically (for example, by some JavaScript code in the page), then this example would be similar to example #2, and the data collected by the form MUST be described in the P3P policy which covers the HTML form.

  5. Requests to a URI are redirected to a third party. If the first party embeds previously collected personal data in the query string or other part of the redirect URI, the privacy policy for the first party's URI MUST describe the types of data transmitted and include the third party as a recipient.

2.3.4 Forms and Related Mechanisms

Forms deserve special consideration, as they often link to CGI scripts or other server-side applications in their action URIs. It is often the case that those action URIs are covered by a different policy than the form itself.

If a user agent is unable to find a matching include-rule for a given action URI in the policy reference file that was referenced from the page, it SHOULD assume that no policy is in effect. Under these circumstances, user agents SHOULD check the well-known location on the host of the action URI to attempt to find a policy reference file that covers the action URI. If this does not provide a P3P policy to cover the action URI, then a user agent MAY try to retrieve the policy reference file by using the HINT mechanism on the action URI, and/or by issuing a HEAD request to the action URI before actually submitting any data in order to find the policy in effect. Services SHOULD ensure that server-side applications can properly respond to such HEAD requests and return the corresponding policy reference link in the headers. In case the underlying application does not understand the HEAD request and no policy has been predeclared for the action URI in question, user agents MUST assume that no policy is in effect and SHOULD inform the user about this or take the corresponding actions according to the user's preferences.
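
The order of attempts described above can be sketched in Python. This is only an outline of the sequencing, not a user agent implementation: each lookup is a hypothetical stand-in callable supplied by the caller, taking the action URI and method and returning a policy URI or None, and the name find_action_policy is likewise hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# A lookup takes (action_uri, method) and returns a policy URI, or None.
Lookup = Callable[[str, str], Optional[str]]


def find_action_policy(
    action_uri: str,
    method: str,
    page_prf: Lookup,
    well_known: Lookup,
    optional_steps: Sequence[Tuple[str, Lookup]] = (),
):
    """Try each source in order; return (policy_uri, source) or (None, None)."""
    steps = [
        ("page policy reference file", page_prf),  # the file referenced from the page
        ("well-known location", well_known),       # SHOULD be checked next
        *optional_steps,                           # MAY steps: HINT, HEAD request
    ]
    for source, lookup in steps:
        policy = lookup(action_uri, method)
        if policy is not None:
            return policy, source
    return None, None  # no policy is in effect; the user agent acts on user preferences


# Stand-in lookups for illustration only.
def finds_nothing(uri, method):
    return None


def head_request(uri, method):
    return "/P3P/Policies.xml#search"  # hypothetical policy URI


result = find_action_policy(
    "http://example.com/cgi-bin/search",
    "POST",
    page_prf=finds_nothing,
    well_known=finds_nothing,
    optional_steps=[("HINT", finds_nothing), ("HEAD request", head_request)],
)
print(result)
```

Placing the page's own policy reference file first means that a HEAD request is attempted only after the earlier sources have failed to supply a policy.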

Note that services might want to make use of the <METHOD> element in order to declare policies for server-side applications that only cover a subset of supported methods, e.g., POST or GET. Under such circumstances, it is acceptable that the application in question only supports the methods given in the policy reference file (i.e., HEAD requests need not be supported). User agents SHOULD NOT attempt to issue a HEAD request to an action URI if the relevant methods specified in the form's method attribute have been properly predeclared in the page's policy reference file.

In some cases, different data is collected at the same action URI depending on some selection in the form. For example, a search service might offer to both search for people (by name and/or email) and (arbitrary) images. Using a set of radio buttons on the form, a single server-side application located at one and the same action URI handles both cases and collects the required information necessary for the search. If a service wants to predeclare the data collection practices of the server-side application it MAY declare all of the data collection practices in a single policy file (using an <INCLUDE> declaration matching the action URI). In this case, user agents MUST assume that all data elements are collected under every circumstance. This solution offers the convenience of a single policy but might not properly reflect the fact that only parts of the listed data elements are collected at a time. Services SHOULD make sure that a simple HEAD request to the action URI (i.e., without any arguments, especially without the value of the selected radio button) will return a policy that covers all cases.

Note that if a form is handled through use of the GET method, then the action URI reflects the choice of form elements selected by the user. In some cases, it will be possible to make use of the wildcard syntax allowed in policy reference files to specify different policies for different uses of the same form action-handler URI. Therefore, user agents MUST include the query-string portion of URIs when making comparisons with INCLUDE