RPROXY DESIGN NOTES
Copyright (C) 2000 by Martin Pool

>>> Why bother?

If this ships standard with Apache, people will use it and save
bandwidth. If it's non-standard it will be interesting but will at
best be slowly adopted. Apache is very nearly the standard, so this
is where it ought to go.

>>> Alternatives

Against this, integrating it into an Apache module looks nontrivial,
because Apache doesn't have an easy mechanism to hook into output
generated by another module. It's not impossible -- see
mod_gzip/design.txt for an exploration of how to do this.

Perhaps the best solution would be for it to live in an external
server accelerator. If we do that, we have a choice of either
extending rproxy to make it really good, or merging it into an
existing proxy. In this case the proxy would run on the same machine
as the origin server, which is why it's called an "accelerator".
Candidate proxies are

 * squid
 * tinyproxy

In favour of this is that it will be useful to people running servers
other than Apache, and that it keeps the code separated out. (And it
would be something to talk about at ApacheCon ;-)

Against it is that data will have to be copied across a (probably
local) socket from the web server to the accelerator, and that people
may be slower to deploy it if they have to install a new program
rather than just turn on an option in a new version of Apache.

>>> Improving rproxy

The right thing would be to begin by profiling the existing
implementation. Possible performance improvements include

 * merging with an existing server framework
 * pre-forking

>>> Client side

We can integrate this with

 * Java
 * libwww
 * Mozilla
 * Squid

>>> hsync

It looks like the protocol will not be the same as plain rsync. What
about calling it "hsync", for "HTTP Synchronization"?

>>> Structure

Traditional rsync uses three processes to keep things pipelined: a
"generator" on the destination calculates the signatures for the
existing file; a "sender" on the source generates and transmits the
differences; and a "receiver" on the destination applies the
differences to the original file to produce the new file.

In hsync the situation is quite different, because the server
generates the signatures. We therefore have a requester process that
sends the cached signature of the best-matching file. The server has
to generate the new file, which typically will come from an upstream
proxy or from a source inside the web server. As content is generated
the server must compare it to the signature, and also generate new
signatures. The differences and the new signature are sent
interleaved to the client. The client has to apply the differences to
the cached file and deliver the result to its own client. It also has
to cache the result and signature for next time.

>>> Implementation structure

Possibly the code should be fully stateless, so that it can be
suspended and resumed at arbitrary points. This would make it a lot
like Boa: all reads and writes are non-blocking. In fact, reads and
writes should probably be from external buffers supplied by the
caller. This gives us the most generality, but it complicates the
code a fair bit. Handling each request on its own thread is a
time-honoured and simple way to do things.

If possible, we should use the existing rsync code. Really. I find
myself strangely inclined to rewrite it, but it would be more
sensible to reuse it. Amongst other things, that would mean we need
to be able to mingle the signatures with the rest of the code.
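To make the buffer-driven, resumable idea concrete, here is a rough
sketch. Every name in it (hsync_job_t, hsync_encode_iter, the status
codes) is hypothetical, not an existing librsync interface:

    /* Hypothetical sketch of a stateless, caller-driven encoder.
     * All state lives in an opaque job object; the caller owns the
     * buffers and decides when to call again. */
    #include <stddef.h>

    typedef struct hsync_job hsync_job_t;   /* opaque encoder state */

    typedef enum {
        HSYNC_DONE,     /* encoding finished */
        HSYNC_AGAIN,    /* need more input or output space; call again */
        HSYNC_ERROR     /* unrecoverable failure */
    } hsync_status_t;

    /* Feed whatever input is available and drain whatever output fits.
     * *in_used and *out_used report how much was consumed or produced,
     * so the caller (Apache, Boa, a CGI wrapper) can suspend the job
     * and resume it whenever its own event loop allows. */
    hsync_status_t hsync_encode_iter(hsync_job_t *job,
                                     const char *in, size_t in_len,
                                     size_t *in_used,
                                     char *out, size_t out_len,
                                     size_t *out_used);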
>>> HTTP

Proposed standard RFC 2068 defines HTTP/1.1.

Implement username-password access control, and store the credentials
in a database file. This will let us offer rproxy on a public server
without having to worry so much about abuse. Use the
proxy-authentication extensions to do this.

Transfer coding is about the encoding of the entity body in order to
pass through the network, and not of the message itself. I suppose
this is where rsync fits in. Chunked is an example of this -- it
should never be necessary to use it in addition to rsync. We must not
send transfer encodings to HTTP/1.0 clients.

The message-body is defined as the entity-body transformed by the
transfer-encoding. It is explicitly allowed to change the
transfer-encoding in the request/response chain.

Transfer encoding is defined to be a hop-by-hop header. I hope this
doesn't mean proxies will discard it? Proxies are generally allowed
to modify the Content-Encoding header.

>> HTTP Via header

    Via = "Via" ":" 1#( received-protocol received-by [ comment ] )

Each gateway or proxy must add an entry. It should be configurable
whether the received-by entry is a hostname:port or a pseudonym.

>>> Format

If the transfer is interrupted, we want the client to be able to
continue it using the rsync mechanism, and certainly not to be forced
to restart from scratch. If the server generates all the signatures,
then we have to make sure the signatures are transmitted along with
the data they describe, and not just at the end.

The rsync protocol consists of alternating instructions to either
copy an existing block or insert literal data. Perhaps we can add
another instruction which means "append data to the signature"? This
should be emitted after each block is transmitted. Can the client
tell when we've covered a block? Perhaps, but it's probably simpler
not to make it worry about this. We also have to make sure that
signatures can simply be concatenated, and that the server will do
the right thing if it receives a truncated signature. (A conceptual
sketch of such an instruction stream appears below, after the CGI
wrapper notes.)

>>> Proxy authentication

See RFC 2069. It might be good to put proxy authentication support
into rproxy so that it can be safely run on the net. It ought to get
a security audit too. For the time being it should be run as a
restricted user, in a jail, on an unimportant machine.

>>> Boa

It should be pretty simple to put rsync into Boa: the code is far
simpler than Apache's. In particular, since it doesn't *support*
modules, we don't have to worry about how to filter their output. It
does support CGIs, which will be a decent test.

>>> Merging Boa and standalone rproxy

In fact, Boa might be a decent framework for a more efficient
standalone rproxy, or at least the concepts from Boa might be.
Forking for each connection is probably OK for low-traffic sites, and
might be OK overall, but it's not terribly efficient. I think there
might be possible overflows in the existing rproxy code, but I
haven't traced it all the way through.

>>> CGI wrapper

We could also implement this as a CGI wrapper that passes requests
through to other CGIs or serves static files. That would be even
simpler, and would work under all servers.

The procedure is to read and interpret the headers. We fork a task to
execute the real CGI, with its stdin coming from the request and its
stdout writing to a pipe that rproxy.cgi filters through librsync. If
there is no rsync header, then we simply get out of the way by
exec'ing the real CGI on top of ourselves. This sounds like a pretty
cool idea. In some ways it is similar to the cgiwrapper/suexec tools.
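A minimal sketch of that wrapper, assuming a fixed path to the
wrapped CGI and a made-up request header name; the librsync call is
replaced by a plain copy loop so the sketch stands on its own:

    /* rproxy.cgi sketch: wrap a real CGI and filter its output.
     * REAL_CGI and the HTTP_RSYNC header name are illustrative only. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <sys/wait.h>

    #define REAL_CGI "/usr/lib/cgi-bin/real-script"

    int main(int argc, char **argv)
    {
        int fds[2];
        pid_t child;
        (void) argc;

        /* No rsync signature from the client: get out of the way by
         * exec'ing the real CGI on top of ourselves. */
        if (getenv("HTTP_RSYNC") == NULL) {
            execv(REAL_CGI, argv);
            return 1;                   /* exec failed */
        }

        if (pipe(fds) < 0)
            return 1;

        child = fork();
        if (child == 0) {
            /* Child: the real CGI keeps the request on stdin, but its
             * stdout now goes into the pipe instead of to the client. */
            dup2(fds[1], STDOUT_FILENO);
            close(fds[0]);
            close(fds[1]);
            execv(REAL_CGI, argv);
            _exit(1);
        }

        /* Parent: in the real wrapper this is where the CGI output
         * would be fed through librsync against the client's cached
         * signature; here it is just copied through verbatim. */
        close(fds[1]);
        {
            char buf[4096];
            ssize_t n;
            while ((n = read(fds[0], buf, sizeof buf)) > 0)
                write(STDOUT_FILENO, buf, (size_t) n);
        }
        close(fds[0]);
        waitpid(child, NULL, 0);
        return 0;
    }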
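Going back to the Format section above: the delta stream could be
thought of as a sequence of instructions like the following. This is
a conceptual sketch only, not the real rsync wire format; the
SIGNATURE opcode is the proposed extra instruction:

    /* Conceptual instruction stream for the hsync delta. COPY and
     * LITERAL exist in spirit in rsync today; SIGNATURE is the
     * proposed addition carrying a fragment of the server-generated
     * signature for the data sent so far. */
    enum hsync_op {
        HSYNC_OP_COPY,       /* copy an existing block from the cached file */
        HSYNC_OP_LITERAL,    /* insert the literal data that follows */
        HSYNC_OP_SIGNATURE   /* append the following bytes to the new signature */
    };

    struct hsync_instruction {
        enum hsync_op op;
        unsigned long offset;   /* COPY: block offset in the old file */
        unsigned long length;   /* COPY/LITERAL/SIGNATURE: payload length */
    };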
>>> Supporting CGIs

In general, in Apache we're going to have people writing to a file
descriptor. They might be external tasks, or they might be modules in
the same task. One important principle is that if the user-agent
doesn't understand rsync, then we should just get out of the way and
let the originating module write straight through to the socket as
usual.

It might be good to use an anonymous pipe to give people an fd to
write to as usual, but to have us reading the other end. This is
complicated in Apache by the fact that the writers don't expect to
give up control, and the pipe will fill up and block unless they do.
So in general that is no good in Apache 1.3. One option is to fork a
second task to do the rsync'ing, but that's a bit ugly.

The obvious place to hook this in is the buffer interface:
ap_bwrite() and so on. This would be OK if we could be sure that in
every case people actually go through this interface and never go
directly to the socket. The key question is what modules like mod_cgi
and mod_php do. At least in the case of mod_cgi, Apache opens pipes
to the child and copies between them and the client connection. This
is pretty cool: we can interpose ourselves in the buffer interface
and rewrite everything as it goes.

If we could use just EAPI and still be able to release against the
standard module interface in 1.3.11, that would be much better. The
hook ap::buff::write looks promising. We can't just hook it every
time, though: we have to make sure it only modifies output after the
headers have been written, because the same stream is used for both
the headers and the body. One would expect that Apache has to filter
everything through there to implement SSL, chunking, and so on.

We might instead get away with using sfio, as there is some support
for it in buff.c. I don't think it's often linked in, though, so this
could cause more trouble than it is worth.

>>> Prototype in Python

Worthwhile? Perhaps, but only if it's really necessary to rewrite
everything. In particular, it feels like rproxy might be better off
in Python with just some calls into librsync: things like finding the
best cache file will be easier to do in an HLL.

>>> Rethink, Tue Dec 21 1999

This is all getting a bit complicated. Remember, the only result we
really want here is to make the code available in Apache: focus on
that result, and on getting there with the least code and the least
risk of breakage possible.

Server-side generation of signatures is somewhat difficult:

 * we have to change the transfer format to support sending the
   signatures intermingled with content
 * alternatively, we can just send the signature at the end, but this
   means interrupted transfers cannot be completed using hsync
 * therefore we have to mess with the librsync code
 * if the client's cached signature is ever out of sync with the
   cached content then very bad things will happen

None of these problems arise with client-generated signatures,
though that approach has issues of its own:

 * the client code is a little more complex, but that code has
   already been written
 * there is the problem of working out whether a particular server
   supports rsync or not
 * there are possible patent problems

There are certainly enough challenges in getting the existing code
merged in:

 * we have to hook into Apache output through EAPI or some other
   means
 * we have to make sure the blocking/concurrency behaviour is OK
 * rproxy should be cleaned up

So for now I'm inclined to try to get the existing code to work as an
Apache module, and to worry about server-generated signatures later.

>>> Mingling using Chunked encoding

Perhaps we can send chunks of real rsync data, with partial
server-generated signatures in their chunk-ext headers? That would
work OK, but we can do it later.
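For illustration only, a body sent this way might look like the
following on the wire. Chunk sizes are in hex as usual for chunked
encoding; the "sig" extension name and the sizes are invented, and
the angle-bracketed parts stand in for binary data:

    9c4;sig="<signature fragment for the blocks covered so far>"
    <0x9c4 bytes of rsync delta stream>
    5f0;sig="<next signature fragment>"
    <0x5f0 bytes of rsync delta stream>
    0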
>>> Blocking

So, the question here is: if we're not relying on the operating
system to synchronize IO, how do we do it? Are the mechanisms
compatible between Apache and librsync, or how can they be made so?

In Apache, we're hooking into the buffer interface
(src/main/buff.c). We may need the EAPI hooks to do this, but perhaps
not. Handler modules are likely to call functions like ap_bwrite(),
which go down through write_it_all() to buff_write(); everything
eventually comes down to ap_write(), which is where the EAPI hook is.

We have the small problem of making sure that the headers aren't
encoded in this way. That should be OK: we can either not install the
hook until after sending the headers, or we can set a variable that
turns it off, or we can do something else again. We can do this using
EAPI's "context attachment" mechanism, which essentially lets us put
this setting into the BUFF structure.

We need to make sure our handler is called. That shouldn't be too
hard: I guess we install ourselves as the handler and then do an
internal redirect? Something like that.

>>> rproxy from inetd

This might be an easier way for people to install and use rproxy. We
don't lose a lot by starting it from inetd, because it doesn't have
much initialization work to do. This would be a clean way to put in
access control, which will be important if it's widely deployed.
Probably a good way to do this is just a command-line option saying
that stdin/stdout are connected to the socket. Is inetd a bit out of
fashion these days? (A possible inetd.conf entry is sketched below,
after the url-matching notes.)

>>> rproxy cache

The current rproxy only does exact matching on the cache. What we
would prefer is to progressively truncate the URL, looking for any
matches on the same server. That's pretty simple.

>>> libtool

As of 1999-12-26 everything is working pretty nicely with
libtool/automake, which is cool.

This package should be configured with --with-librsync to let us find
the library. We ought to add a default to cope if the library has
been installed and we don't have the source. Can we rely on having
its .la file around? That seems reasonable.

Libtool is not very happy about creating a shared library whose name
doesn't match /^lib/. For the time being, I get around this by
calling it a program and giving it the -shared -module LDFLAGS.

>>> url-matching

We want to find the best match for a URL in the cache of
already-known URLs. How do we do that? For example, we need to work
out that http://localhost/cgi/foo/bar/quux is a candidate URL for
http://localhost/cgi/shamble/shuffle/quux

There are really two questions here: one is how to evaluate a
possible cache file compared to a target URL; the other is how to
efficiently search through all the cache files.

For the time being we'll just read the whole directory to get a list
of cache files. This will be too slow when the cache gets large, but
there's no need to over-optimize. We can split the cache into
directories per host-port so they don't get too big.

The best match is considered to be the file which has the longest
prefix in common with the requested URL. Perhaps we can have a shared
memory pool between instances holding this information.
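A minimal sketch of the longest-common-prefix rule, assuming the
cache entries have already been mapped back to the URLs they hold.
All of the names here are illustrative, not the existing rproxy code:

    #include <stddef.h>

    /* Length of the common prefix of two strings. */
    static size_t common_prefix(const char *a, const char *b)
    {
        size_t i = 0;
        while (a[i] && a[i] == b[i])
            i++;
        return i;
    }

    /* Pick the cached URL sharing the longest prefix with the
     * requested URL. Returns -1 if the cache is empty. */
    static int best_cache_match(const char *url, const char **cached, int n)
    {
        int best = -1;
        size_t best_len = 0;
        int i;

        for (i = 0; i < n; i++) {
            size_t len = common_prefix(url, cached[i]);
            if (len > best_len) {
                best_len = len;
                best = i;
            }
        }
        return best;
    }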
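For the inetd idea above, the entry might look something like this.
The --inetd option, the install path and the service name are
assumptions, and an rproxy line would also be needed in
/etc/services:

    rproxy  stream  tcp  nowait  nobody  /usr/local/bin/rproxy  rproxy --inetd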
>>> Other rules for cache matching

We can improve the cache-selection algorithm in two directions. In
the first case, we want to make the best possible guess even when
there's nothing immediately obvious. For example, one can imagine
using a file from another host in the same domain, in the hope that
the two are mirrors. Perhaps the headers sent back from the host can
give information on what other files this one is likely to be useful
for?

Secondly, when we already have some good choices we want to guess the
best possible one. We can guess at the MIME type of the URL and try
to choose a cache file with the same extension: this is likely to
give very good agreement, since JPEGs are not likely to have much in
common with HTML.

>>> Interrupted transfers

If a transfer is interrupted by either end, we probably want to save
the cache file anyway, but only if there is not already a cache file
for that URL. Is that reasonable?

>>> HTTP complications

We ought to cope with

 * partial transfers
 * if-modified-since
 * chunked encoding

In each case, the right thing might be to disable that option if we
can do better, but in some cases I am afraid we may have to parse
them. Perhaps if the output already has a content-encoding then we
should decline to do anything.

>>> Write hook

If we just plug into the write hook and do nothing else, Apache's
basic behaviour is to call us just once with a single buffer
containing both the headers and the start of the body. This is no
good, because we need to rsync-encode the body and not the headers.
How do we fix this?

One approach is to simply re-parse the headers in the result. This
would be simple enough, but is it a bit ugly? Really we want Apache
to tell us when it has finished sending the headers, so that we can
turn on encoding at that point, but there is no obvious way to get
that.

Alternatively, we could try not hooking in here at all, but rather
doing a subrequest and trying to hook its output. I'm not sure that
would be a lot better. This would mean that we basically insert
child_main as-is into the stream of output going back to the client.
We would just need a bit of glue to let it interact with pools and
buffers to do allocation and IO respectively, which shouldn't be too
hard. It keeps most of the hsync code free of Apache-specific
details, which would be pretty good. We will need to make the request
object available, and perhaps use the fixup phase to get the
signature and insert our Content-Encoding header. If we go that way
we can implement a chaining CGI as a preliminary step: this may be
easier to debug.

Can we get a hook in the right place without using EAPI? Are
subrequests any use? We'd come in through ap_run_sub_req, after
creating the subrequest with ap_sub_req_lookup_uri. We can then
duplicate the connection and modify the buffer structure to go in
through us... but again there's no

>>> mod_proxy

I wonder if this will even work with mod_proxy? That'd be pretty
cool, though I don't know whether mod_proxy is very widely used. It's
not a very efficient proxy.

>>> Interface 2

We also want a stateless interface, which would allow librsync to be
used even inside Boa: we could interrupt reading from one stream to
work on another session. We'd need the IO callbacks to be able to
return a value meaning "try again later"; in that case the top-level
function ought to return the same value, and leave the caller
responsible for completing the call. This would be a pretty good
thing, but perhaps it is not worth worrying about too much. I would
have guessed this is the point of the second interface, but it
doesn't seem possible to interrupt it: the librsync_encode2 functions
don't return until they have either failed (returning -1) or
completed.
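To make the "try again later" idea concrete, a sketch of what such
callbacks might look like. None of these names are the real librsync
interface; they only illustrate the shape of a resumable call:

    #include <stddef.h>

    /* Return codes a non-blocking IO callback could use. */
    #define HS_OK        0
    #define HS_BLOCKED   1   /* nothing available right now: try again later */
    #define HS_FAILED   -1

    /* Read/write callbacks supplied by the caller (Apache, Boa, ...).
     * They may return HS_BLOCKED instead of blocking in the kernel. */
    typedef int (*hs_read_fn)(void *priv, char *buf, size_t len,
                              size_t *done);
    typedef int (*hs_write_fn)(void *priv, const char *buf, size_t len,
                               size_t *done);

    /* A resumable encode step: whenever a callback reports HS_BLOCKED,
     * this returns HS_BLOCKED too, and the caller re-invokes it later
     * from its own event loop with the same state object. */
    int hs_encode_step(void *state, hs_read_fn rd, hs_write_fn wr,
                       void *priv);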
>>> Content-digest

If the origin server sends a content-digest, then we should check it
at the end to make sure we haven't broken anything. This would be a
pretty useful thing during an automated test.
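For instance, if the digest arrives as a Content-MD5 header, a check
on the reconstructed body could look roughly like this. It uses
OpenSSL for MD5 and base64 purely for illustration; the function name
is invented:

    #include <string.h>
    #include <openssl/md5.h>
    #include <openssl/evp.h>

    /* Return non-zero if the reconstructed body matches the base64
     * MD5 value the origin server sent in its Content-MD5 header. */
    static int content_md5_ok(const unsigned char *body, size_t len,
                              const char *header_value)
    {
        unsigned char digest[MD5_DIGEST_LENGTH];
        unsigned char b64[32];   /* 16-byte digest -> 24 base64 chars + NUL */

        MD5(body, len, digest);
        EVP_EncodeBlock(b64, digest, MD5_DIGEST_LENGTH);
        return strcmp((const char *) b64, header_value) == 0;
    }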