#+HTML: <a href="https://github.com/sillygod/cdp-cache/actions?query=workflow%3ACI"><img src="https://github.com/sillygod/cdp-cache/workflows/CI/badge.svg?branch=master" /></a>

#+HTML: <a href="https://www.codacy.com/manual/sillygod/cdp-cache?utm_source=github.com&amp;utm_medium=referral&amp;utm_content=sillygod/cdp-cache&amp;utm_campaign=Badge_Grade"><img src="https://app.codacy.com/project/badge/Grade/43d801ba437a42419e479492eca72ee2" /></a>


#+HTML: <a href="https://goreportcard.com/report/github.com/sillygod/cdp-cache"><img src="https://goreportcard.com/badge/github.com/sillygod/cdp-cache" /></a>

#+HTML: <a href="https://codeclimate.com/github/sillygod/cdp-cache/test_coverage"><img src="https://api.codeclimate.com/v1/badges/a99b4ae948836cdedd12/test_coverage" /></a>


* Caddy HTTP Cache

This is an HTTP cache plugin for Caddy 2. The difference between this and cache-handler (https://github.com/caddyserver/cache-handler) is that this one is much easier to understand and configure, but it does not support a =distributed cache=. It needs to be built with Go 1.20 and Caddy v2.6.4.


  In fact, I am not so familiar with the CDN cache mechanism, so I referenced the code from https://github.com/nicolasazrak/caddy-cache, which is the cache module for Caddy v1. I then migrated that codebase to be consistent with the Caddy 2 architecture and developed more features on top of it.

  Currently, this does not support a =distributed cache= yet; it is still planned.

* Features

** Multi storage backends support
   The following backends are currently supported.

   - file
   - inmemory
   - redis

   Later sections show example Caddyfiles for serving each type of cache backend.

** Conditional rules to cache the upstream response
   - uri path matcher
   - http header matcher
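
   As an illustration, a minimal sketch of these matchers might look like the block below. The =http_cache= directive name is an assumption taken from the configs under [[file:example/][example]]; see the Configuration section for the full list of options.

   #+begin_src caddyfile
     # Sketch only: restrict caching with a path matcher and a
     # request-header matcher (directive name assumed from example/).
     http_cache {
         match_path /static
         match_header Content-Type image/jpg image/png "text/plain; charset=utf-8"
     }
   #+end_src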

** Set the cache's default max age
   A default age for matched responses that do not have an explicit expiration.
** Purge cache
   There are exposed endpoints to purge the cache, implemented via Caddy's admin API. The following shows how to purge the cache.

*** First, list the current cache entries

    Given that you serve the Caddy admin endpoint on port 7777, you can get the list of cache entries with the API below.

    #+begin_src restclient
      GET http://example.com:7777/caches
    #+end_src

*** Purge the cache with an HTTP DELETE request
    The =uri= field supports regular expressions.

    #+begin_src restclient
      DELETE http://example.com:7777/caches/purge
      Content-Type: application/json

      {
        "method": "GET",
        "host": "localhost",
        "uri": ".*\\.txt"
      }
    #+end_src
** Support cluster with consul

   NOTE: this is still under development, and only the =inmemory= backend supports it.

   I've provided a simple example to mimic a cluster environment.

   #+begin_src sh
     PROJECT_PATH=/app docker-compose --project-directory=./ -f example/distributed_cache/docker-compose.yaml up
   #+end_src

* How to build

  In development, go to the =cmd= folder and run the following command.

  #+begin_src sh
    go build -ldflags="-w -s" -o caddy
  #+end_src

  To build a Linux binary:
  #+begin_src sh
    GOOS=linux GOARCH=amd64 go build -ldflags="-w -s" -o caddy
  #+end_src

  Or you can use [[https://github.com/caddyserver/xcaddy][xcaddy]] to build the executable.
  Ensure you've installed it with =go install github.com/caddyserver/xcaddy/cmd/xcaddy@latest=.
  #+begin_src sh
    xcaddy build v2.6.4 --with github.com/sillygod/cdp-cache
  #+end_src

  xcaddy also provides a way to develop the plugin locally.
  #+begin_src sh
    xcaddy run --config cmd/Caddyfile
  #+end_src

  To remove unused dependencies
  #+begin_src sh
    go mod tidy
  #+end_src

* How to run

  You can run the binary directly.
  #+begin_src sh
    caddy run --config [caddy file] --adapter caddyfile
  #+end_src

  Or, if you prefer to use Docker:
  #+begin_src sh
    docker run -it -p80:80 -p443:443 -v [the path of caddyfile]:/app/Caddyfile docker.pkg.github.com/sillygod/cdp-cache/caddy:latest
  #+end_src

* Configuration

  The following lists the directives currently supported in the Caddyfile.

*** status_header
    The header used to report the cache status. Default value: =X-Cache-Status=.

*** match_path
    Only requests whose path matches the condition will be cached. For example, =/= means every request is cached, because all request paths start with =/=.

*** match_methods
    By default, only =GET= and =POST= requests are cached. If you would like to cache other methods as well, you can configure here which methods should be cached, e.g. =GET HEAD POST=.

    To be able to distinguish different POST requests, it is advisable to include the body hash in the cache key, e.g.: ={http.request.method} {http.request.host}{http.request.uri.path}?{http.request.uri.query} {http.request.contentlength} {http.request.bodyhash}=
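
    As an illustration, a sketch combining these two settings might look like this (the =http_cache= directive name is an assumption taken from the example configs):

    #+begin_src caddyfile
      # Sketch only: cache GET/HEAD/POST and include the body hash in the
      # cache key so different POST bodies map to different entries.
      http_cache {
          match_methods GET HEAD POST
          cache_key "{http.request.method} {http.request.host}{http.request.uri.path}?{http.request.uri.query} {http.request.contentlength} {http.request.bodyhash}"
      }
    #+end_src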

*** default_max_age
    The default expiration time for cached responses that do not carry an explicit expiration.

*** stale_max_age
    The duration that a cache entry is kept in the cache, even though it has already expired. The default duration is =0=.

    If this duration is > 0 and the upstream server answers with an HTTP status code >= 500 (server error) this plugin checks whether there is still an expired (stale) entry from a previous, successful call in the cache. In that case, this stale entry is used to answer instead of the 5xx response.
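
     As a sketch, the two directives might be combined like this (the =http_cache= directive name and the duration values are assumptions for illustration):

     #+begin_src caddyfile
       # Sketch only: entries expire after 5 minutes, but a stale copy is
       # kept for another 10 minutes and can be served if the upstream
       # answers with a 5xx status.
       http_cache {
           default_max_age 5m
           stale_max_age 10m
       }
     #+end_src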

*** match_header
    Only requests whose headers match the conditions will be cached, e.g.:

    #+begin_quote
    match_header Content-Type image/jpg image/png "text/plain; charset=utf-8"
    #+end_quote

*** path
    The directory where cache files are saved. Only applies when =cache_type= is =file=.

*** cache_key
    The key of cache entry. The default value is ={http.request.method} {http.request.host}{http.request.uri.path}?{http.request.uri.query}=

*** cache_bucket_num
    The number of buckets used to shard cache entries; an entry's bucket is the checksum of its =cache_key= modulo this number. The default value is 256.

*** cache_type
    Indicates which storage backend to use. Currently, there are two choices: =file= and =in_memory=.

*** cache_max_memory_size

    The maximum memory usage for the =in_memory= backend.

*** distributed

    Work in progress. Currently, only =consul= is supported for establishing a cluster of cache server nodes.

    For an example config, please refer to [[file:example/distributed_cache/Caddyfile::health_check ":7777/health"][this]].

**** service_name
     The service name to register with the Consul agent.

**** addr
     The address of the Consul agent.

**** health_check
     The health check endpoint that the Consul agent uses to verify that the cache server is healthy.
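
     Putting these together, a distributed block might look roughly like the sketch below. The block syntax, service name, Consul address, and health check endpoint are assumptions; please check the example config linked above for the exact form.

     #+begin_src caddyfile
       # Rough sketch of a consul-backed distributed cache; values are placeholders.
       http_cache {
           cache_type in_memory
           distributed consul {
               service_name cache_server
               addr consul:8500
               health_check ":7777/health"
           }
       }
     #+end_src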


** Example configs
   See the [[file:example/][example]] directory for a configuration of each cache type.
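
   As a quick reference, a minimal reverse-proxy cache setup might look roughly like the sketch below. The =order= global option, the =http_cache= directive name, and the addresses and paths are assumptions based on the example configs; adjust them to your environment.

   #+begin_src caddyfile
     {
         # A handler directive from a plugin usually needs an explicit order.
         order http_cache before reverse_proxy
     }

     :9991 {
         reverse_proxy localhost:9995

         http_cache {
             cache_type file
             path /tmp/caddy-cache
             match_path /
             default_max_age 5m
             status_header X-Cache-Status
         }
     }
   #+end_src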

* Benchmarks

  For now, this simply compares the performance of the in-memory and file (disk) backends.

** Env
   Caddy was run with the config file under the =benchmark= directory, and the tests were run on a MacBook Pro (1.4 GHz Intel Core i5, 16 GB 2133 MHz LPDDR3).

** Test Result

   The following benchmark was produced with =wrk -c 50 -d 30s --latency -t 4 http://localhost:9991/pg31674.txt= with logging disabled.
   Before running it, provision the test data with =bash benchmark/provision.sh=.


   | backend                 | req/s | latency (50% / 90% / 99%)  |
   |-------------------------+-------+----------------------------|
   | proxy + file cache      | 13853 | 3.29ms / 4.09ms / 5.26ms   |
   | proxy + in memory cache | 20622 | 2.20ms / 3.03ms / 4.68ms   |

* Todo list

  - [ ] upgrade to the latest caddy
  - [ ] add proxy protocol support
    - https://github.com/mastercactapus/caddy2-proxyprotocol
    - https://caddyserver.com/docs/caddyfile/options#listener-wrappers
    - https://github.com/pires/go-proxyproto
  - [ ] distributed cache (in progress)
  - [ ] more optimizations