Slide 0

About this show...

GIMP is cool

Scheme is cool

This presentation was brought
to you by PinPoint, the new GIMP
presentation tool.

Text -> GIMP -> JPEG

Slide 1

rproxy: dynamic web caching

Martin Pool
Linuxcare, Inc.

Slide 2

Problem Statement

People use web resources repeatedly

Therefore: cache recently-used resources
on client or proxy

On each request, check currency: either
reload or use same

Increasingly, content is dynamic:
all-or-nothing caches are less effective
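
A minimal sketch of the all-or-nothing behaviour described on this slide, assuming a hypothetical `origin(url, validator)` callable that returns `(validator, body)` and sets `body` to `None` when the cached copy is still current. None of these names come from the talk; they only illustrate why a small change to a dynamic page forces a full reload.

```python
def fetch(url, cache, origin):
    """All-or-nothing cache: reuse the cached body or reload all of it."""
    cached = cache.get(url)                      # (validator, body) or None
    validator, body = origin(url, cached[0] if cached else None)
    if body is None:                             # still current: use same
        return cached[1]
    cache[url] = (validator, body)               # changed at all: full reload
    return body


def make_origin(pages):
    """Hypothetical origin server; the validator is the page's version number."""
    def origin(url, validator):
        version, body = pages[url]
        if validator == version:
            return version, None                 # "not modified"
        return version, body
    return origin
```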

Slide 3


It would be nice if we could transfer
only differences

Must interoperate smoothly with HTTP

Must work on dynamic documents

Must fit into popular HTTP software

Slide 4


Fast file transfer protocol

Finds identical blocks between two
files, therefore the delta

Send per-block checksums

Search for matching blocks

Whatever's left is the difference
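
The three steps above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the real implementation: the block size is tiny, the weak checksum is recomputed at every offset instead of being rolled, and the names `signature`, `delta` and `patch` are made up for this example.

```python
import hashlib
import zlib

BLOCK = 4  # tiny block size, for illustration only


def signature(old, block=BLOCK):
    """Per-block checksums of the old file: weak -> {strong: block index}."""
    sig = {}
    for index, start in enumerate(range(0, len(old), block)):
        chunk = old[start:start + block]
        strong = hashlib.sha256(chunk).digest()
        sig.setdefault(zlib.adler32(chunk), {}).setdefault(strong, index)
    return sig


def delta(sig, new, block=BLOCK):
    """Search the new file for blocks listed in sig; the rest is literal data."""
    ops, literal, pos = [], bytearray(), 0
    while pos < len(new):
        chunk = new[pos:pos + block]
        candidates = sig.get(zlib.adler32(chunk))      # cheap weak check first
        index = None
        if candidates is not None:
            index = candidates.get(hashlib.sha256(chunk).digest())
        if index is None:
            literal.append(new[pos])                   # no match: slide by one byte
            pos += 1
            continue
        if literal:
            ops.append(("literal", bytes(literal)))
            literal.clear()
        ops.append(("copy", index))                    # receiver already has this block
        pos += len(chunk)
    if literal:
        ops.append(("literal", bytes(literal)))
    return ops


def patch(old, ops, block=BLOCK):
    """Rebuild the new file from the old file plus the delta."""
    out = bytearray()
    for kind, arg in ops:
        out += old[arg * block:(arg + 1) * block] if kind == "copy" else arg
    return bytes(out)
```

Only the `literal` entries carry new bytes; each `copy` entry is just a block number. rsync itself uses a rolling weak checksum so that sliding by one byte is cheap, which this sketch does not attempt.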

Slide 5

Integration with HTTP

Request/response protocol



Every response may be different

Slide 6


Client transmits signature of cached
resource to server

Server computes & sends differences

Signature sent as new HTTP header

Delta as HTTP Transfer-Encoding

Ignored if not supported
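
A rough sketch of the negotiation on this slide, modelling headers as plain dicts. The header name `X-Example-Signature` and the coding token `x-example-delta` are placeholders invented for this example, not the names rproxy uses, and message framing is ignored entirely. `encode_delta` and `apply_delta` are supplied by the caller.

```python
SIG_HEADER = "X-Example-Signature"   # hypothetical header name
DELTA_CODING = "x-example-delta"     # hypothetical transfer-coding token


def legacy_server(headers, body):
    """A server without delta support: the unknown header is simply ignored."""
    return {}, body


def delta_server(headers, body, encode_delta):
    """A delta-aware server: encode against the client's signature if present."""
    sig = headers.get(SIG_HEADER)
    if sig is None:
        return {}, body                      # client did not offer a signature
    return {"Transfer-Encoding": DELTA_CODING}, encode_delta(sig, body)


def client_decode(resp_headers, payload, cached, apply_delta):
    """Apply the delta only if the server says it sent one."""
    if resp_headers.get("Transfer-Encoding") == DELTA_CODING:
        return apply_delta(cached, payload)
    return payload                           # plain response, use as-is
```

Either side can lack support and the exchange still yields the full resource, which is the point of "ignored if not supported".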

Slide 7

Standalone Proxy

Run on client, one upstream

Compress across slow links

Already in Debian/Woody & Sid

Slide 8


Integrate smoothly with many apps

Become the encoding library for
rsync 3.0

LGPL license for nonfree apps

Slide 9

Hosting Applications:

Mozilla: threaded

Apache: multi-process model

Squid: select/poll-based

Therefore: do no IO in library, caller
supplies buffer

State machine
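
A minimal sketch of what "no IO in library, caller supplies buffer" can look like, assuming a hypothetical `SignatureJob` class that is not part of any real library API. The job never reads or writes anything itself; the host application (threaded, multi-process or select/poll-based) pushes in whatever bytes it has and collects the output.

```python
import zlib


class SignatureJob:
    """Push-style state machine that emits one weak checksum per block."""

    def __init__(self, block=4):
        self.block = block
        self.pending = b""        # bytes carried over between calls
        self.state = "running"    # "running" -> "done"

    def feed(self, data):
        """Accept a caller-supplied buffer of any size; return new checksums."""
        if self.state != "running":
            raise ValueError("job already finished")
        self.pending += data
        out = []
        while len(self.pending) >= self.block:
            chunk = self.pending[:self.block]
            self.pending = self.pending[self.block:]
            out.append(zlib.adler32(chunk))
        return out

    def finish(self):
        """Signal end of input; flush the final short block, if any."""
        self.state = "done"
        out = [zlib.adler32(self.pending)] if self.pending else []
        self.pending = b""
        return out
```

Because the job keeps its own state between calls, the result does not depend on how the caller happens to split the input across buffers.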

Slide 10

Privacy problems?

Client holds server-supplied data &

A "stealth cookie"?

No more so than normal Last-Modified

Client-generated signatures are even

Slide 11


Encode particular content-types

Fuzzy-matching of resources

Cache signatures

Choose block size

~90% saving

Slide 12

Other schemes

Explicit versioning

Client-side variable portions

Slide 13

Bonus slide: rsync 3.0

Scale to larger trees (1TB+ data, 10M+
files, 1000 machines)

Less hardcoded structure

Cached signatures, fuzzy matching

Multicast 1:m, n:m

Slide 14

rsync 3.0/2

Scriptable (Perl/Python/...): filtering,
matching, reporting, ...

Simpler client-server architecture

Documented protocol


rdiff tool: rsync-over-email?

Slide 15


Come and see Linus's penguin at the
Canberra aquarium.

Produced by PinPoint at Sat Jan 20 14:15:59 2001 UTC