298 Commits (d07b5a7d09e5348e59cdae34571f07ef92d8cd78)
 

Author SHA1 Message Date
  JustAnotherArchivist d07b5a7d09 Remove debugging prints 2 years ago
  JustAnotherArchivist bf5e065a0f Add URL/percent decoding tool 2 years ago
  JustAnotherArchivist 11485d9404 Add infrastructure for simple C-based tools 2 years ago
  JustAnotherArchivist c50a8fd796 Fix 'Dictionary mismatch' error with very small dicts, caused by the temporary file not being written to disk before zstdcat gets executed 2 years ago
  JustAnotherArchivist 5bc3d4b020 Fix crash on an empty response 2 years ago
  JustAnotherArchivist 7f25c092d1 Catch other connection errors 2 years ago
  JustAnotherArchivist f8352809f3 Handle ConnectionResetError 2 years ago
  JustAnotherArchivist 0b34268210 Catch socket.timeout, which is a separate exception class from TimeoutError before Python 3.10 2 years ago
  JustAnotherArchivist 0f7a2b32a3 Log number of results on a page 2 years ago
  JustAnotherArchivist 628aeb052f Handle rate limiting 2 years ago
  JustAnotherArchivist d3ea3ce8a0 Switch from urllib to http.client to reuse connections 2 years ago
  JustAnotherArchivist 8f7619ff3a Add retries 2 years ago
  JustAnotherArchivist f98fdd5f01 Fix printing HTTP response line to stdout instead of stderr 2 years ago
  JustAnotherArchivist c9400ac46f Fix recognition of command without optional parts 2 years ago
  JustAnotherArchivist 5ca15a7c94 Add concurrency support 2 years ago
  JustAnotherArchivist 191948cf9d Print number of modified records on requeueing 2 years ago
  JustAnotherArchivist 5121524f83 Log retrieval of showNumPages 2 years ago
  JustAnotherArchivist aba7a1b0b8 Replace resumeKey pagination with page number pagination 2 years ago
  JustAnotherArchivist d57324a26c Add --where for arbitrary conditions 2 years ago
  JustAnotherArchivist fed64387bd Invert count/write logic 2 years ago
  JustAnotherArchivist f914b6afbe Also reset the status_code on requeueing 2 years ago
  JustAnotherArchivist 303bb69c37 Add ia-cdx-search 2 years ago
  JustAnotherArchivist 0b45f7b2ba Swap syntaxes 2 years ago
  JustAnotherArchivist b2c9ea2fa4 Refactor 2 years ago
  JustAnotherArchivist eaf53e1a44 Add alphabetseq 2 years ago
  JustAnotherArchivist c9c8b7e1f7 Add ia-wait-item-tasks 2 years ago
  JustAnotherArchivist b440b35c2f Handle ancient /?v= URLs 2 years ago
  JustAnotherArchivist 0044281b9d Add YouTube channel listing script 2 years ago
  JustAnotherArchivist 1686e04cbe Add a timeout to prevent potentially indefinite blocking 2 years ago
  JustAnotherArchivist 2fc9652ee9 Add support for other instances and full-instance listing 2 years ago
  JustAnotherArchivist b72da478b2 Fix org repo listing on new design/site structure 2 years ago
  JustAnotherArchivist ce7a069af5 Add --jsonl option 2 years ago
  JustAnotherArchivist 9412f0c81c Add azure-storage-list 2 years ago
  JustAnotherArchivist 696e221fc1 Add support for password-protected folders 2 years ago
  JustAnotherArchivist 158c1f1fe0 Fix usage error 2 years ago
  JustAnotherArchivist 53bfe468bf Basic error checks 2 years ago
  JustAnotherArchivist 8c612082b6 Restore MD5 check as the API returns it again 2 years ago
  JustAnotherArchivist 8554c01a84 Fix gofile.io download to the new getFolder endpoint and download server structure 2 years ago
  JustAnotherArchivist a246bad957 Add support for Shorts 2 years ago
  JustAnotherArchivist 6d019e63fc Fix removenonyt performance by using simpler fixed-string patterns instead of a PCRE 2 years ago
  JustAnotherArchivist b27a428787 Fix usage notes from URLs to lines on stdin 2 years ago
  JustAnotherArchivist c4b62c2fea Fix piping when reads return less data than expected 2 years ago
  JustAnotherArchivist dba6d1fb0e Fix stderr printing 2 years ago
  JustAnotherArchivist 6e5a019d9e Always decode stdin with surrogateescape to avoid breaking on binary input 2 years ago
  JustAnotherArchivist e48fb9d1b6 Tighten patterns for user and custom channel URLs so they can handle HTML input more easily 2 years ago
  JustAnotherArchivist 9cbc3f7968 Extract playlist and channel IDs from watch URLs 2 years ago
  JustAnotherArchivist 80bf010433 Percent-decode each line only once 2 years ago
  JustAnotherArchivist f1fcfabafa Add support for reading warc.zst from stdin 2 years ago
  JustAnotherArchivist d5f646f995 Add zstdwarccat 2 years ago
  JustAnotherArchivist 4415c8d5dd Add support for img.youtube.com (old thumbnails) 2 years ago
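
Several of the commits above (8f7619ff3a, 0b34268210, f8352809f3, 7f25c092d1) concern retries and the handling of timeouts and connection errors. The following is a minimal sketch of that general pattern, not the repository's actual code: `with_retries`, `TRANSIENT_ERRORS`, and the parameters `attempts` and `delay` are hypothetical names chosen for illustration. It relies on one fact stated in commit 0b34268210: before Python 3.10, `socket.timeout` is a separate exception class from `TimeoutError`, so both are listed.

```python
import socket
import time

# Exceptions treated as transient in this sketch (an assumption, not taken
# from the repository). Before Python 3.10, socket.timeout is a class of its
# own; from 3.10 on it is an alias of TimeoutError, so naming both is harmless.
# ConnectionError also covers its subclass ConnectionResetError.
TRANSIENT_ERRORS = (socket.timeout, TimeoutError, ConnectionError)


def with_retries(func, attempts=3, delay=0.0):
    """Call func() up to `attempts` times, retrying on transient errors.

    Hypothetical helper: returns func()'s result on the first success and
    re-raises the last transient error if every attempt fails.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except TRANSIENT_ERRORS as e:
            last_error = e
            if attempt < attempts:
                # Wait before the next attempt; no sleep after the final one.
                time.sleep(delay)
    raise last_error
```

A caller would wrap a single request in a zero-argument function or lambda and pass it in, for example `with_retries(lambda: fetch(url), attempts=5, delay=1.0)`, where `fetch` stands for whatever request function is in use.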