# codearchiver-bot

An IRC bot to drive codearchiver. It takes commands on an IRC channel (via http2irc), runs codearchiver, and uploads its results to the Internet Archive.

## Configuration

Configuration happens via environment variables:

* `HTTP2IRC_GET_URL` and `HTTP2IRC_POST_URL`: GET/POST URLs for IRC channel interaction
* `IA_S3_ACCESS` and `IA_S3_SECRET`: authentication for IA
* `CODEARCHIVER_BOT_TEST` (optional): enables test mode when set to any non-empty value, with uploads going into items prefixed with `test_` and placed into `test_collection`.
* `CODEARCHIVER_BOT_TIMEOUT` (optional): maximum number of seconds a `codearchiver` command may run. Default: unlimited (0)
* `CODEARCHIVER_BOT_NPROC` (optional): number of parallel `codearchiver` processes. Default: 1

The data produced by `codearchiver-bot` must be kept in its working directory for correct deduplication. However, nothing there is unique: all data is uploaded to IA continuously, and operation can be restored from IA by downloading all `*_codearchiver_metadata.txt` files and creating placeholders (e.g. symlinks to `.uploaded`, as the script does by default) for everything else.

A `Dockerfile` is provided for convenience. A volume should be mounted to `/data` so that the data persists across container replacements.

## License

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
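
## Example: checking the configuration

The environment variables listed under Configuration can be checked before starting the bot. The following is a minimal sketch and not the bot's own code: the function name `load_config`, the treatment of empty required values as missing, and the range checks on the timeout and process count are assumptions made for illustration. Only the variable names and defaults come from the list above.

```python
import os

# Required variables, as listed under Configuration.
REQUIRED = ("HTTP2IRC_GET_URL", "HTTP2IRC_POST_URL", "IA_S3_ACCESS", "IA_S3_SECRET")


def load_config(env=os.environ):
    """Collect the bot's settings from a mapping of environment variables.

    Hypothetical helper: the real bot may read and validate these differently.
    """
    # Assumption: an empty required value is treated the same as a missing one.
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))

    config = {name: env[name] for name in REQUIRED}
    # Test mode is enabled by any non-empty value.
    config["test_mode"] = bool(env.get("CODEARCHIVER_BOT_TEST", ""))
    # 0 means no time limit; the default is unlimited.
    config["timeout"] = int(env.get("CODEARCHIVER_BOT_TIMEOUT", "0"))
    # Number of parallel codearchiver processes; the default is 1.
    config["nproc"] = int(env.get("CODEARCHIVER_BOT_NPROC", "1"))

    # Assumption: negative timeouts and fewer than one process are rejected.
    if config["timeout"] < 0:
        raise ValueError("CODEARCHIVER_BOT_TIMEOUT must be 0 or a positive number of seconds")
    if config["nproc"] < 1:
        raise ValueError("CODEARCHIVER_BOT_NPROC must be at least 1")
    return config
```

Calling `load_config()` with no argument reads the current process environment; passing a plain dictionary instead makes it easy to try out a configuration without exporting anything.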