432 Commits (master)
Author SHA1 Message Date
  JustAnotherArchivist e2085e6c81 Add cloudflare-email-decode 1 year ago
  JustAnotherArchivist 73f35f5591 Fix infinite loop when file ends with something that is not a WARC record 1 year ago
  JustAnotherArchivist 06d60a798c Bump read size 1 year ago
  JustAnotherArchivist 3e0b70be6b Handle processes with too many open connections 1 year ago
  JustAnotherArchivist df7b25c2db Error on unknown options 1 year ago
  JustAnotherArchivist 4bd4f5a30c Fix 'Argument list too long' error when using --urls-from-stdin with many URLs 2 years ago
  JustAnotherArchivist e20d35a553 Fix crash on 429 2 years ago
  JustAnotherArchivist cef61434a0 Add --urls-from-stdin 2 years ago
  JustAnotherArchivist b5cf04947b Add Wasabi 2 years ago
  JustAnotherArchivist d2afd1309d Add s3-bucket-find-direct-url 2 years ago
  JustAnotherArchivist 95988466ec Make S3 response pattern matching more flexible (so it also works on Scaleway) 2 years ago
  JustAnotherArchivist a9a03d3a00 Add urlsort 2 years ago
  JustAnotherArchivist 9798cc1188 Typo 2 years ago
  JustAnotherArchivist d193637e5e Add kill-connections 2 years ago
  JustAnotherArchivist 6cfe8e51ba Make job a global variable in --pyfilter expressions so it can be used in genexps 2 years ago
  JustAnotherArchivist a4627fa1c6 Queue derives with `ia tasks` instead of this manual curl rubbish 2 years ago
  JustAnotherArchivist c68b310afc Always print the parts value if there is an upload ID 2 years ago
  JustAnotherArchivist fdc3c3d69e Support float values for --partsize with M or G suffix 2 years ago
  JustAnotherArchivist 002c1eb7ae Wait until item exists 2 years ago
  JustAnotherArchivist 142a5a9c49 Get rid of asyncio 2 years ago
  JustAnotherArchivist b6663ae731 Add concurrency 2 years ago
  JustAnotherArchivist 22f2e68356 Add JSONL output option for S3 listing 2 years ago
  JustAnotherArchivist bfebe9a2a5 Fix only sending partial file contents on retries 2 years ago
  JustAnotherArchivist 39b3b7793a Add support for IA_CONFIG_FILE environment variable 2 years ago
  JustAnotherArchivist 7ed2906dd2 Add progress bar 2 years ago
  JustAnotherArchivist 58f0f0f8d0 Fix being unable to resume an upload that crashed in the first part 2 years ago
  JustAnotherArchivist 74485c399b Require decompressed WARCs with warc-tiny 2 years ago
  JustAnotherArchivist e24790132e Add at-tracker-sample-user-item-size 2 years ago
  JustAnotherArchivist a14939b069 Add base64url 2 years ago
  JustAnotherArchivist 5c2ce7ec10 Add cdx-chunk 2 years ago
  JustAnotherArchivist fe0b020352 Add support for reading from stdin 2 years ago
  JustAnotherArchivist 1010769c3c Handle connection errors 2 years ago
  JustAnotherArchivist 1acdc88c81 Add ia-upload-stream 2 years ago
  JustAnotherArchivist 360c4d9371 Add youtube-extract-rapid 2 years ago
  JustAnotherArchivist d07b5a7d09 Remove debugging prints 2 years ago
  JustAnotherArchivist bf5e065a0f Add URL/percent decoding tool 2 years ago
  JustAnotherArchivist 11485d9404 Add infrastructure for simple C-based tools 2 years ago
  JustAnotherArchivist c50a8fd796 Fix 'Dictionary mismatch' error when very small dicts are used because the temporary file isn't written to disk before zstdcat gets executed 2 years ago
  JustAnotherArchivist 5bc3d4b020 Fix crash on an empty response 2 years ago
  JustAnotherArchivist 7f25c092d1 Catch other connection errors 2 years ago
  JustAnotherArchivist f8352809f3 Handle ConnectionResetError 2 years ago
  JustAnotherArchivist 0b34268210 Catch socket.timeout, which is a separate exception class from TimeoutError before Python 3.10 2 years ago
  JustAnotherArchivist 0f7a2b32a3 Log number of results on a page 2 years ago
  JustAnotherArchivist 628aeb052f Handle rate limiting 2 years ago
  JustAnotherArchivist d3ea3ce8a0 Switch from urllib to http.client to reuse connections 2 years ago
  JustAnotherArchivist 8f7619ff3a Add retries 2 years ago
  JustAnotherArchivist f98fdd5f01 Fix printing HTTP response line to stdout instead of stderr 2 years ago
  JustAnotherArchivist c9400ac46f Fix recognition of command without optional parts 2 years ago
  JustAnotherArchivist 5ca15a7c94 Add concurrency support 2 years ago
  JustAnotherArchivist 191948cf9d Print number of modified records on requeueing 2 years ago