[Top]
[Description]
[Cache Indexes]
[Request Pages]
[Monitor Pages]
[Control Program]
[Configuration Editing]
[Cache Search]
[Internet Links]
[E-Mail]
The WWWOFFLE programs simplify World Wide Web browsing from computers that have
intermittent connections to the internet.
The wwwoffled program is a simple proxy server with special features for use
with intermittent internet links. This means that it is possible to browse web
pages and read them without having to remain connected.
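Because wwwoffled is an ordinary HTTP proxy, any client that accepts a proxy setting can use it. The following is a minimal sketch in Python, not part of WWWOFFLE itself; the address http://localhost:8080 and the helper name make_proxy_opener are assumptions for illustration and should be changed to match the local configuration.

```python
import urllib.request

# Assumption: wwwoffled listens on localhost port 8080; adjust as needed.
PROXY_ADDRESS = "http://localhost:8080"


def make_proxy_opener(proxy: str = PROXY_ADDRESS) -> urllib.request.OpenerDirector:
    """Build an opener that sends http requests through the given proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy})
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    opener = make_proxy_opener()
    # With the proxy running, a request could then be made like this:
    # response = opener.open("http://www.gedanken.org.uk/software/wwwoffle/")
    # print(response.status)
```

A web browser achieves the same thing through its own proxy settings; the sketch only shows the equivalent for a script.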
The links in this list either refer to the Configuration file options
[CONF], to the page that allows editing of the options [EDIT] (may
require a password), to more information in this document [READ] or to
another URL [URL].
Basic Features
- Caching of HTTP, FTP and finger protocols.
- Allows the 'GET', 'HEAD', 'POST' and 'PUT' HTTP methods.
- Interactive or command line control of online/offline/autodial status.
[URL]
- Highly configurable.
[CONF]
- Low maintenance, start/stop and online/offline status can be automated.
While Online
- Caching of pages that are viewed for later review.
- Conditional fetching to only get pages that have changed.
[CONF]
[EDIT]
- Based on expiration date, time since last fetched or once per session.
[CONF]
[EDIT]
- Non-cached support for SSL (Secure Socket Layer e.g. https).
[CONF]
[EDIT]
- Caching for https connections. (compile time option).
[CONF]
[EDIT]
- Can be used with one or more external proxies based on web page.
[CONF]
[EDIT]
- Control which pages cannot be accessed.
[CONF]
[EDIT]
- Control which pages are not to be stored in the cache.
[CONF]
[EDIT]
- Create backups of cached pages when server cannot be contacted.
- Option to create backup when server sends back an error page.
[CONF]
[EDIT]
- Requests compressed pages from web servers (compile time option).
[CONF]
[EDIT]
- Requests chunked transfer-encoding from web servers.
[CONF]
[EDIT]
While Offline
- Can be configured to use dial-on-demand for pages that are not cached.
- Selection of pages to download next time online.
- Using normal browser to follow links.
- Command line interface to select pages for downloading.
- Control which pages can be requested when offline.
[CONF]
[EDIT]
- Provides non-cached access to intranet servers.
[CONF]
[EDIT]
Automated Download
- Downloading of specified pages non-interactively.
[CONF]
[EDIT]
- Options to automatically fetch objects in requested pages
[CONF]
[EDIT]
- Understands various types of pages
- HTML 4.0, Java classes, VRML (partial), XML (partial).
- Options to fetch different classes of objects
[CONF]
[EDIT]
- Images, Stylesheets, Frames, Scripts, Java or other objects.
- Option to not fetch webbug images (images of 1 pixel square).
[CONF]
[EDIT]
- Automatically follows links for pages that have been moved.
- Can monitor pages at regular intervals to fetch those that have changed.
[URL]
- Recursive fetching
[URL]
- To specified depth.
- On any host or limited to same server or same directory.
- Chosen from command line or from browser.
- Control over which links can be fetched recursively.
Convenience
- Optional information footer on HTML pages showing date cached and options.
[CONF]
[EDIT]
- Options to modify HTML pages
[CONF]
[EDIT]
- Provides information about cached pages
- Headers, raw and modified.
- Contents, images, links etc.
- Source code unmodified by WWWOFFLE.
- Automatic proxy configuration with Proxy Auto-Config file.
[URL]
- Searchable cache with the addition of the ht://Dig, mnoGoSearch (UdmSearch), Namazu or Hyper Estraier programs.
[URL]
[URL]
[URL]
[URL]
- Built in simple web-server for local pages.
[URL]
- HTTP and HTTPS access (compile time option).
- Allows CGI scripts
[CONF]
[EDIT]
- Timeouts to stop proxy lockups
[CONF]
[EDIT]
- Continue or stop downloads interrupted by client.
[CONF]
[EDIT]
- Purging of pages from cache
[CONF]
[EDIT]
- Based on URL matching.
[CONF]
[EDIT]
- To keep the cache size below a specified limit.
[CONF]
[EDIT]
- To keep the free disk space above a specified limit.
[CONF]
[EDIT]
- Interactive or command line control.
- Compression of cached pages based on age.
[CONF]
[EDIT]
- Provides compressed pages to web browser (compile time option).
[CONF]
[EDIT]
- Use chunked transfer-encoding to web browser.
[CONF]
[EDIT]
Indexes
- Multiple indexes of pages stored in cache
[READ]
- Servers for each protocol (http, ftp ...).
- Pages on each server.
- Pages waiting to be fetched.
- Pages fetched last time online.
- Pages requested last time offline.
- Pages monitored on a regular basis.
- Configurable indexes
- Sorted by name, date, server domain name, type of file.
- Options to delete, refresh or monitor pages.
- Selection of complete list of pages or hide uninteresting pages.
[CONF]
[EDIT]
Security
- Works with pages that require basic username/password authentication.
- Automates proxy authentication for external proxies that require it.
[CONF]
[EDIT]
- Control over access to the proxy
- Defaults to local host access only.
[CONF]
[EDIT]
- Host access configured by hostname or IP address.
[CONF]
[EDIT]
- Optional proxy authentication for user level access control.
[CONF]
[EDIT]
- Optional password control for proxy management functions.
[CONF]
[EDIT]
- HTTPS access to all proxy management web pages (compile time option).
- Can censor incoming and outgoing HTTP headers to maintain user privacy.
[CONF]
[EDIT]
Configuration
- All options controlled using a configuration file.
[URL]
- Interactive web page to allow editing of the configuration file.
[EDIT]
- User customisable error and information pages.
- Log file or syslog reporting with user specified error level.
The index of all of the pages stored in the WWWOFFLE cache is available in a
number of formats.
The indexes are organised by hostname with a separate list for each host. The
basic URL is /index/, but this can be modified with a path as follows.
- /index/
- The main index that has links to all of the other indexes.
- /index/http/
- A list of the http pages that are cached (or any other protocol).
- /index/http/www.gedanken.org.uk/
- A list of the http pages that are cached from the host www.gedanken.org.uk (or any other protocol and host).
- /index/outgoing/
- The list of requests waiting to be fetched.
- /index/monitor/
- The list of requests for pages that are monitored on a regular basis.
- /index/lasttime/
- The list of pages that were accessed the last time that the program was online.
This also provides access to a history of the pages accessed the previous five times online.
- /index/lastout/
- The list of pages that were requested the last time that the program was offline.
This also provides access to a history of the pages requested the previous five times offline.
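The index paths listed above follow a regular pattern, which can be sketched as a small helper. This is an illustration only: the function name index_url and the proxy address http://localhost:8080 are assumptions, not part of WWWOFFLE.

```python
from typing import Optional

# Assumption: the proxy is reachable at this address; adjust to the local setup.
PROXY_BASE = "http://localhost:8080"

# Index names from the list above that are not protocols and take no host part.
SPECIAL_INDEXES = ("outgoing", "monitor", "lasttime", "lastout")


def index_url(name: str = "", host: Optional[str] = None,
              base: str = PROXY_BASE) -> str:
    """Return the URL of an index page (hypothetical helper).

    name is "" for the main index, a protocol such as "http" or "ftp",
    or one of the special indexes; host narrows a protocol index to
    a single server.
    """
    path = "/index/"
    if name:
        path += name + "/"
    if host:
        if not name or name in SPECIAL_INDEXES:
            raise ValueError("a host can only be combined with a protocol index")
        path += host + "/"
    return base.rstrip("/") + path


if __name__ == "__main__":
    print(index_url())
    print(index_url("http", "www.gedanken.org.uk"))
    print(index_url("outgoing"))
```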
Pages that are wanted can be fetched by entering them into the browser's own
page selector, by using the command line interface, or interactively from a
WWWOFFLE web page form.
- /refresh-options/
- An interactive page for specifying pages to be fetched recursively.
Pages that are wanted on a regular basis can be fetched by entering them into a
WWWOFFLE web page form.
- /monitor-options/
- An interactive page for specifying pages to be fetched on a regular basis.
The WWWOFFLE proxy can be controlled either through a command line interface, or
interactively from a web page.
- /control/
- An interactive page to control the proxy program.
[Requires the password from the configuration file if set, but any user name.]
The WWWOFFLE proxy configuration file can be edited either as a plain text file
or interactively from a web page.
- /configuration/
- An interactive page to edit the configuration file.
[Requires the password from the configuration file if set, but any user name.]
The WWWOFFLE proxy can be accessed as an https server or can intercept and cache
https connections if this feature is enabled at compile time.
- /certificates/
- The certificates that WWWOFFLE has created for itself, has seen from other
servers, or that are set as trusted.
With any of the programs ht://Dig, mnoGoSearch (UdmSearch), Namazu or
Hyper Estraier installed, it is possible to search the pages in the WWWOFFLE
cache.
- /search/htdig/search.html
- The ht://Dig search form.
- /search/mnogosearch/search.html
- The mnoGoSearch (UdmSearch) search form.
- /search/namazu/search.html
- The Namazu search form.
- /search/hyperestraier/search.html
- The Hyper Estraier search form.
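The four search form paths above can be collected in a lookup table, for example to link to whichever search program is installed. This is a sketch only: the short engine keys, the function name search_form_url and the proxy address http://localhost:8080 are assumptions for illustration.

```python
# Assumption: the proxy is reachable at this address; adjust to the local setup.
PROXY_BASE = "http://localhost:8080"

# Search form paths as listed above, keyed by a short (hypothetical) engine name.
SEARCH_FORMS = {
    "htdig": "/search/htdig/search.html",
    "mnogosearch": "/search/mnogosearch/search.html",
    "namazu": "/search/namazu/search.html",
    "hyperestraier": "/search/hyperestraier/search.html",
}


def search_form_url(engine: str, base: str = PROXY_BASE) -> str:
    """Return the URL of the search form for the named search program."""
    try:
        path = SEARCH_FORMS[engine]
    except KeyError:
        raise ValueError("unknown search program: " + engine) from None
    return base.rstrip("/") + path


if __name__ == "__main__":
    for engine in SEARCH_FORMS:
        print(engine, search_form_url(engine))
```

A form only works if the corresponding search program has been installed and set up for the cache.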
The WWWOFFLE homepage on the internet is available at
http://www.gedanken.org.uk/software/wwwoffle/
and contains the latest information about the program in general.