Caching

Cloudrexx provides several types of cache. This article explains which types exist, what they do, and how they work.

Cache types

Cloudrexx knows the following types of cache:

  • OPcode cache
  • Database cache
  • Page cache
  • Reverse proxy cache
  • SSI/ESI cache

OPcode cache

OPcode caching is provided by PHP or PHP extensions. It caches the compiled PHP bytecode so that scripts do not have to be recompiled on every request. Using this cache type is strongly recommended, as it makes PHP faster without any relevant downsides.

Clear OPcode cache

Sometimes, especially on development setups, the OPcode cache keeps serving an outdated version of a file after the file has changed. In this situation the "Cache" command can clear the OPcode cache. With Zend OPcache this should not be necessary.

./cx Cache clear opcode

Database cache

Using a cache provider like Memcached or APC, data can be cached to avoid costly operations such as database queries or YAML parsing. Cloudrexx uses the cache drivers provided by Doctrine and turns on Doctrine's metadata, query, and result caching.

Possible problems

Data inconsistencies

Using the database cache can lead to inconsistent data if the same data is read and written through two different paths. For example, if a change in the database happens without using Doctrine, Doctrine cannot know about this change and will not update the cache. As a result, Doctrine returns data that differs from what the database contains.
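The effect can be reproduced with a minimal sketch. It does not use Doctrine: the two arrays and the functions cachedRead and writeBypassingCache are hypothetical stand-ins for the database, the cache provider, and a write that bypasses Doctrine.

```php
<?php
// Stand-ins for the real database and the cache provider (hypothetical).
$database = array('title' => 'Old title');
$cache = array();

/**
 * Read through the cache: only touches the "database" on a cache miss.
 */
function cachedRead(array $database, array &$cache, string $key): string {
    if (!array_key_exists($key, $cache)) {
        $cache[$key] = $database[$key];
    }
    return $cache[$key];
}

/**
 * Write directly to the "database" without informing the cache.
 */
function writeBypassingCache(array &$database, string $key, string $value): void {
    $database[$key] = $value;
}

$first = cachedRead($database, $cache, 'title');      // fills the cache
writeBypassingCache($database, 'title', 'New title'); // cache is not updated
$second = cachedRead($database, $cache, 'title');     // served from the stale cache
```

Here $second still holds the old value although the database already contains the new one. Clearing the cache resolves the mismatch.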

Security considerations

Cache providers like Memcached typically do not enforce access control, so cached data could be read or modified by any client that can reach the cache server. Multiple sites should therefore never share the same Memcached server.

Clear database cache

The database cache can be cleared by clicking the clear button in the general settings. Alternatively, the "Cache" command can clear the user cache (replace "memcached" with the cache engine in use):

./cx Cache clear user memcached

If the command mode no longer works due to data inconsistencies caused by the database cache, Memcached itself can be instructed to drop the complete cache. This can be done using the following command:

./cx debug --clear-usercache

Page cache

Output in frontend mode is cached using a simple mechanism: before content is sent to the browser, it is stored in a file. On the next request, the content of the file is sent directly to the browser, which drastically reduces the memory and CPU footprint per request. To avoid outdated data, this cache has a configurable lease time and can be cleared per page.
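The mechanism can be illustrated with a minimal, self-contained sketch. This is not the Cloudrexx implementation: the functions serveWithPageCache and clearPageCacheEntry, the use of md5() for the file name, and the file modification time as lease indicator are assumptions made for illustration only.

```php
<?php
/**
 * Return the cached output for $requestKey, or render and cache it.
 * Hypothetical helper for illustration; not the Cloudrexx implementation.
 *
 * @param string   $cacheDir     Existing, writable directory for cache files
 * @param string   $requestKey   String identifying the request
 * @param int      $leaseSeconds Lease time of a cache file in seconds
 * @param callable $render       Callback that produces the page output
 */
function serveWithPageCache(string $cacheDir, string $requestKey, int $leaseSeconds, callable $render): string {
    $file = $cacheDir . '/' . md5($requestKey);
    clearstatcache(true, $file); // avoid reading a stale modification time
    if (is_file($file) && (time() - filemtime($file)) < $leaseSeconds) {
        $cached = file_get_contents($file);
        if ($cached !== false) {
            return $cached; // cache hit: no rendering needed
        }
    }
    $content = (string) $render();
    file_put_contents($file, $content, LOCK_EX); // store before sending
    return $content;
}

/**
 * Drop the cache file of a single request, if present.
 */
function clearPageCacheEntry(string $cacheDir, string $requestKey): void {
    $file = $cacheDir . '/' . md5($requestKey);
    if (is_file($file)) {
        unlink($file);
    }
}
```

The second request for the same key within the lease time is answered from the file without calling the render callback again.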

Technical details

As the same page can be rendered differently based on certain parameters the cache of a single page is specific to the following request information:

  • URL
  • GET and POST arguments
  • Resolved locale
  • Whether the mobile version of the page is requested

The request information is hashed (<hash>) and, together with some additional information, used as the filename under which the response is cached. The filename schema is as follows:

<hash><type><pageId><user>

Explanation:

Fragment   Description
<hash>     Hash of the request information.
<type>     One of:
             • _: the cache file contains the actual response body
             • _h: the cache file contains the HTTP headers of the response
<pageId>   ID of the resolved page.
           Note: cached URL redirections have this set to 0.
<user>     Either empty or the prefix _u followed by one of:
             • a valid session ID
             • 0 if the user has no session yet
             • nothing if the user has a session, but the response is not user-based
           See section User-based cache below for more details.

Examples:

  • 044cb929975a275957cfa0f00cf9c3bf_0
  • 044cb929975a275957cfa0f00cf9c3bf_h0
  • 044cb929975a275957cfa0f00cf9c3bf_3
  • 044cb929975a275957cfa0f00cf9c3bf_h3
  • 044cb929975a275957cfa0f00cf9c3bf_h3_u
  • 044cb929975a275957cfa0f00cf9c3bf_h3_u0
  • 044cb929975a275957cfa0f00cf9c3bf_h3_ub3939259f9f62ca8b3e79241627b23cd

For each request, two cache files are created: one with <type> set to _ and another with <type> set to _h. The former caches the body of the response, the latter caches the HTTP headers of the response (as a serialized PHP array).
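The filename schema can be made concrete with a small parser. This is a sketch only: parsePageCacheFileName is a hypothetical helper that is not part of Cloudrexx, and it assumes that the hash is a lowercase hexadecimal string and that session IDs consist of letters, digits, commas, and hyphens.

```php
<?php
/**
 * Split a page cache file name into its documented fragments.
 * Hypothetical helper for illustration; not part of Cloudrexx.
 *
 * @param string $fileName e.g. "044cb929975a275957cfa0f00cf9c3bf_h3_u0"
 * @return array|null Parsed fragments, or null if the name does not match
 */
function parsePageCacheFileName(string $fileName): ?array {
    // <hash>, then _ or _h, then <pageId>, optionally followed by _u
    // and either nothing, 0, or a session ID.
    $pattern = '/^([0-9a-f]+)_(h?)(\d+)(?:_u([A-Za-z0-9,-]*))?$/';
    if (!preg_match($pattern, $fileName, $match)) {
        return null;
    }
    return array(
        'hash'     => $match[1],
        'isHeader' => $match[2] === 'h',
        'pageId'   => (int) $match[3],
        // null: no <user> fragment, '': "_u", '0': "_u0", else a session ID
        'user'     => $match[4] ?? null,
    );
}

$parsed = parsePageCacheFileName('044cb929975a275957cfa0f00cf9c3bf_h3_u0');
```

For the name used above, the result describes a header cache file for page 3 whose <user> fragment is 0.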

User-based cache

The last part of the file name scheme indicates whether the cache is user specific. There are four possible cases:

                     User-based cache   Not user-based cache
Session exists       _u<sessionId>      _u
No session present   _u0                (empty)

A cache entry is user-based if the page it belongs to is protected or if a component calls \Cx\Core_Modules\Cache\Controller\Cache::forceUserbasedPageCache() during cache generation.
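The four cases can be expressed as a small function. This is a sketch: userCacheSuffix is a hypothetical helper, and the empty result for "no session, not user-based" is an assumption derived from the file name examples that carry no _u part.

```php
<?php
/**
 * Build the <user> fragment of a page cache file name.
 * Hypothetical helper; the empty result for "no session, not user-based"
 * is an assumption based on the file name examples.
 *
 * @param string|null $sessionId   Session ID, or null if there is no session
 * @param bool        $isUserBased Whether the response is user-based
 */
function userCacheSuffix(?string $sessionId, bool $isUserBased): string {
    if ($sessionId === null) {
        return $isUserBased ? '_u0' : '';
    }
    return $isUserBased ? '_u' . $sessionId : '_u';
}
```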

Possible problems

Outdated cache data

Requests that introduce a change on a page can lead to inconsistencies if they fail. The reason is that the cache is cleared only after the change has been persisted. If the request fails after persisting the change but before clearing the cache, the cache holds outdated data. Likewise, if a change is made that the system is not aware of, the cache cannot be cleared and holds outdated data. This happens, for example, if a theme file is changed via FTP (or, more generally, not via the backend).

Dynamic content

If a page contains data that changes frequently (like a webcam image), the page cannot be cached efficiently. The SSI/ESI cache provides a way around this.

User-based data

If a page contains data that is related to the current user, the system needs to know about that fact. The following example shows how to do so:

// in Controller/ComponentController.class.php

    /**
     * @inheritdoc
     */
    public function adjustResponse(\Cx\Core\Routing\Model\Entity\Response $response) {
        // This method is only called if a page of this component is requested.
        // You may want to apply a filter to force user-based cache only
        // for certain cmds, e.g.:
        // if (!in_array($response->getPage()->getCmd(), array(...))) {
        //     return;
        // }
        $this->getComponent('Cache')->forceUserbasedPageCache();
    }

Clear page cache

There are multiple ways to drop parts of or the complete page cache:

Clear complete page cache

The following code clears the complete page cache. Use it with care:

$cx->getComponent('Cache')->deleteAllFiles('cxPages');

Clear cache for one page

The following code clears the cache for one page (including its headers):

$cx->getComponent('Cache')->deleteSingleFile($pageId);

Clear cache for a component

The following code clears the cache for all pages pointing to a component (including symlink pages):

$cx->getComponent('Cache')->deleteComponentFiles($componentName);

Clear header cache

The following code clears all cached headers that do not belong to a page:

$cx->getComponent('Cache')->deleteNonPagePageCache();

Clear cache for a user

The following code clears page cache that is related to a user:

$cx->getComponent('Cache')->clearUserBasedPageCache($sessionId);

Set cache lease time

Additionally, a custom cache lease time can be set per page. The following code sets a custom lease time for the current page:

$cx->getResponse()->setExpirationDate($date);
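As a hedged illustration, the lease time could be derived from a number of seconds. This sketch assumes that setExpirationDate() accepts a \DateTime object, which is not stated above, and that the number of seconds is not negative. expirationDateIn is a hypothetical helper.

```php
<?php
/**
 * Build an expiration date that lies $seconds in the future.
 * Hypothetical helper for illustration; expects $seconds >= 0.
 */
function expirationDateIn(int $seconds): \DateTime {
    $date = new \DateTime('now');
    $date->modify('+' . $seconds . ' seconds');
    return $date;
}

$date = expirationDateIn(3600);

// Cache the current page for one hour. This requires a Cloudrexx context
// and assumes that setExpirationDate() takes a \DateTime:
// $cx->getResponse()->setExpirationDate($date);
```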

Reverse proxy cache

A reverse proxy (like NGINX) can do more or less the same task as the page cache, but without forwarding cached requests to the webserver at all. This reduces the load on the webserver. Cloudrexx supports Varnish and NGINX as reverse proxies. It sends invalidation requests to the reverse proxy whenever a page changes.

SSI/ESI cache

To avoid regenerating the complete page cache every time a small part of the page changes, Cloudrexx uses SSI or ESI. The page cache contains SSI or ESI placeholders instead of the rendered content for these parts. When the cached content is served to the client, these placeholders are replaced. The content for each placeholder can be cached as well. This cache can have its own lease time for each placeholder and can be cleared separately or based on certain parameters.

The reverse proxy can be used to replace the SSI or ESI tags. Cloudrexx can send cache invalidation requests to Varnish or NGINX for ESI/SSI cache.

For setups without an SSI- or ESI-capable reverse proxy with cache support, Cloudrexx has a built-in minimalistic ESI parser.
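The placeholder replacement can be sketched as follows. This is not the parser shipped with Cloudrexx: replaceEsiIncludes is a hypothetical function that only handles self-closing esi:include tags with a double-quoted src attribute, and the /widget URL is a made-up example.

```php
<?php
/**
 * Replace ESI include tags in cached content with rendered fragments.
 * Hypothetical sketch; not the parser shipped with Cloudrexx.
 *
 * @param string   $cachedPage    Cached page content containing ESI tags
 * @param callable $fetchFragment Receives the src URL, returns the fragment
 */
function replaceEsiIncludes(string $cachedPage, callable $fetchFragment): string {
    $result = preg_replace_callback(
        '/<esi:include\s+src="([^"]*)"\s*\/>/',
        function (array $match) use ($fetchFragment): string {
            // Attribute values are HTML-encoded, so decode &amp; etc.
            return (string) $fetchFragment(html_entity_decode($match[1]));
        },
        $cachedPage
    );
    // preg_replace_callback() returns null on error; fall back to the input
    return $result ?? $cachedPage;
}

// Made-up cached page with one placeholder
$page = '<h1>News</h1><esi:include src="/widget?name=clock&amp;lang=en"/>';
$html = replaceEsiIncludes($page, function (string $url): string {
    // A real resolver would look up or render the fragment for $url
    return '<span>fragment for ' . htmlspecialchars($url) . '</span>';
});
```

Because each fragment is resolved through a callback, the fragments can be cached independently of the page that embeds them.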

See ESI Widgets for more information about how to interact with this cache type.

Clear ESI cache

In addition to the cache clear methods mentioned for Widgets there are the following two, more generic, possibilities:

Clear complete ESI cache

The following code clears the complete SSI/ESI cache. Use it with care:

$cx->getComponent('Cache')->clearSsiCache();

This method accepts an optional URL pattern parameter so that only cache entries matching the pattern are dropped.

Clear cache for a user

The following code clears ESI cache that is related to a user:

$cx->getComponent('Cache')->clearUserBasedEsiCache($sessionId);