An IRC bot to drive codearchiver. It takes commands on an IRC channel (via http2irc), runs codearchiver, and uploads its results to the Internet Archive.
Configuration happens via environment variables:
HTTP2IRC_POST_URL: GET/POST URLs for IRC channel interaction
IA_S3_SECRET: authentication for IA
CODEARCHIVER_BOT_TEST (optional): enables test mode when set to any non-empty value, with uploads going into items prefixed with test_ and placed into
CODEARCHIVER_BOT_TIMEOUT (optional): maximum number of seconds a codearchiver command may run. Default: unlimited (0)
CODEARCHIVER_BOT_NPROC (optional): number of parallel codearchiver processes. Default: 1
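As an illustration of how these variables might be interpreted, the following is a minimal Python sketch, not the bot's actual code. The variable names and defaults come from the list above; the BotConfig class, the load_config function, and the treatment of HTTP2IRC_POST_URL and IA_S3_SECRET as required are assumptions made for this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class BotConfig:
    """Hypothetical container for the documented settings."""
    post_url: str
    ia_s3_secret: str
    test_mode: bool
    timeout: Optional[int]  # None means unlimited
    nproc: int


def load_config(env: Mapping[str, str] = os.environ) -> BotConfig:
    """Read the documented variables from a mapping (os.environ by default)."""

    def required(name: str) -> str:
        # Assumption: variables not marked "(optional)" must be non-empty.
        value = env.get(name, '')
        if not value:
            raise ValueError(f'{name} must be set')
        return value

    timeout = int(env.get('CODEARCHIVER_BOT_TIMEOUT', '0'))
    nproc = int(env.get('CODEARCHIVER_BOT_NPROC', '1'))
    if timeout < 0:
        raise ValueError('CODEARCHIVER_BOT_TIMEOUT must not be negative')
    if nproc < 1:
        raise ValueError('CODEARCHIVER_BOT_NPROC must be at least 1')

    return BotConfig(
        post_url=required('HTTP2IRC_POST_URL'),
        ia_s3_secret=required('IA_S3_SECRET'),
        # Any non-empty value enables test mode.
        test_mode=bool(env.get('CODEARCHIVER_BOT_TEST', '')),
        # 0 is documented as "unlimited"; map it to None here.
        timeout=timeout or None,
        nproc=nproc,
    )
```

Passing the environment as a mapping keeps the function easy to exercise with a plain dict; a timeout of None can be handed directly to subprocess.run(..., timeout=None), which waits indefinitely.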
The data produced by codearchiver-bot must be kept in its working directory for correct deduplication. However, nothing there is unique: all data is uploaded to IA continuously, and operation can be restored from there by downloading all *_codearchiver_metadata.txt files and creating placeholders (e.g. symlinks to .uploaded, as the script does by default) for everything else.
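The placeholder step of such a restore could look roughly like the sketch below. It assumes the *_codearchiver_metadata.txt files have already been downloaded into the working directory and that the caller supplies the list of all other uploaded filenames (how that list is obtained from IA is not shown). The function name create_placeholders is hypothetical and not part of codearchiver-bot.

```python
import os
from typing import Iterable, List

METADATA_SUFFIX = '_codearchiver_metadata.txt'
PLACEHOLDER_TARGET = '.uploaded'  # symlink target named in the text above


def create_placeholders(workdir: str, filenames: Iterable[str]) -> List[str]:
    """Create a symlink to '.uploaded' for every non-metadata filename.

    Returns the names for which a placeholder was created.
    """
    created = []
    for name in filenames:
        if os.path.basename(name) != name:
            raise ValueError(f'expected a bare filename, got {name!r}')
        if name.endswith(METADATA_SUFFIX):
            # Metadata files must exist as real files; never replace them.
            continue
        path = os.path.join(workdir, name)
        if os.path.lexists(path):
            # Keep anything already present, including existing symlinks.
            continue
        # The link may dangle; it only marks the file as already uploaded.
        os.symlink(PLACEHOLDER_TARGET, path)
        created.append(name)
    return created
```

Using os.path.lexists rather than os.path.exists matters here: a placeholder whose target does not exist is a dangling symlink, which os.path.exists would report as missing.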
A Dockerfile is provided for convenience. A volume should be mounted to /data to preserve the data across container replacements etc.
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.