JustAnotherArchivist
8f7619ff3a
Add retries
2 years ago
JustAnotherArchivist
f98fdd5f01
Fix printing HTTP response line to stdout instead of stderr
2 years ago
JustAnotherArchivist
c9400ac46f
Fix recognition of command without optional parts
2 years ago
JustAnotherArchivist
5ca15a7c94
Add concurrency support
The proper way to do that (with asyncio) is of course aiohttp. A major drawback of the implemented approach is that running tasks can't be cancelled in case of an error. However, it works with just the standard library, and that advantage outweighs the awkward error handling for now.
2 years ago
JustAnotherArchivist
191948cf9d
Print number of modified records on requeueing
2 years ago
JustAnotherArchivist
5121524f83
Log retrieval of showNumPages
2 years ago
JustAnotherArchivist
aba7a1b0b8
Replace resumeKey pagination with page number pagination
resumeKey pagination is horribly broken. It may return incomplete results or loop infinitely.
2 years ago
JustAnotherArchivist
d57324a26c
Add --where for arbitrary conditions
2 years ago
JustAnotherArchivist
fed64387bd
Invert count/write logic
Previously, write was the actual default action, and in some forms of the command, the action value isn't actually checked against the possible values, so on a typo, it would write instead of count.
2 years ago
JustAnotherArchivist
f914b6afbe
Also reset the status_code on requeueing
2 years ago
JustAnotherArchivist
303bb69c37
Add ia-cdx-search
2 years ago
JustAnotherArchivist
0b45f7b2ba
Swap syntaxes
2 years ago
JustAnotherArchivist
b2c9ea2fa4
Refactor
2 years ago
JustAnotherArchivist
eaf53e1a44
Add alphabetseq
2 years ago
JustAnotherArchivist
c9c8b7e1f7
Add ia-wait-item-tasks
2 years ago
JustAnotherArchivist
b440b35c2f
Handle ancient /?v= URLs
2 years ago
JustAnotherArchivist
0044281b9d
Add YouTube channel listing script
2 years ago
JustAnotherArchivist
1686e04cbe
Add a timeout to prevent potentially indefinite blocking
2 years ago
JustAnotherArchivist
2fc9652ee9
Add support for other instances and full-instance listing
2 years ago
JustAnotherArchivist
b72da478b2
Fix org repo listing on new design/site structure
2 years ago
JustAnotherArchivist
ce7a069af5
Add --jsonl option
2 years ago
JustAnotherArchivist
9412f0c81c
Add azure-storage-list
2 years ago
JustAnotherArchivist
696e221fc1
Add support for password-protected folders
2 years ago
JustAnotherArchivist
158c1f1fe0
Fix usage error
2 years ago
JustAnotherArchivist
53bfe468bf
Basic error checks
2 years ago
JustAnotherArchivist
8c612082b6
Restore MD5 check as the API returns it again
Effectively partially reverts 06cf71f7
2 years ago
JustAnotherArchivist
8554c01a84
Fix gofile.io download for the new getFolder endpoint and download server structure
2 years ago
JustAnotherArchivist
a246bad957
Add support for Shorts
2 years ago
JustAnotherArchivist
6d019e63fc
Fix removenonyt performance by using simpler fixed-string patterns instead of a PCRE
2 years ago
JustAnotherArchivist
b27a428787
Fix usage notes from URLs to lines on stdin
2 years ago
JustAnotherArchivist
c4b62c2fea
Fix piping when reads return less data than expected
2 years ago
JustAnotherArchivist
dba6d1fb0e
Fix stderr printing
2 years ago
JustAnotherArchivist
6e5a019d9e
Always decode stdin with surrogateescape to avoid breaking on binary input
2 years ago
JustAnotherArchivist
e48fb9d1b6
Tighten patterns for user and custom channel URLs so they can handle HTML input more easily
2 years ago
JustAnotherArchivist
9cbc3f7968
Extract playlist and channel IDs from watch URLs
2 years ago
JustAnotherArchivist
80bf010433
Percent-decode each line only once
2 years ago
JustAnotherArchivist
f1fcfabafa
Add support for reading warc.zst from stdin
2 years ago
JustAnotherArchivist
d5f646f995
Add zstdwarccat
2 years ago
JustAnotherArchivist
4415c8d5dd
Add support for img.youtube.com (old thumbnails)
2 years ago
JustAnotherArchivist
50a0fcc7b0
Fix performance regression due to 479c2684
2 years ago
JustAnotherArchivist
479c268441
Fix whitespace handling
2 years ago
JustAnotherArchivist
56f21d1fc0
Add aggressive video ID v parameter extraction
2 years ago
JustAnotherArchivist
99c83eb331
Handle optional slash in generic watch matcher
2 years ago
JustAnotherArchivist
9f88f76e59
Handle a few more odd and rare URLs
2 years ago
JustAnotherArchivist
a0f3b16c9e
Handle youtu.be case variations and port numbers
2 years ago
JustAnotherArchivist
273d3ed45a
Handle gaming.youtube.com
2 years ago
JustAnotherArchivist
0cb61f4dae
Add b64grep
2 years ago
JustAnotherArchivist
8e6e47d623
Fix ytimg extraction
2 years ago
JustAnotherArchivist
0b13758659
Add Bugzilla URL list generator
2 years ago
JustAnotherArchivist
ce0ae88b21
Add ia-verify-file
2 years ago
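Note on 5ca15a7c94 (Add concurrency support): the commit body describes concurrency built on the standard library alone, with the drawback that running tasks can't be cancelled on error. The log does not show the code, so the following is only a minimal sketch of one way such an approach could look, assuming asyncio with `run_in_executor` wrapping a blocking call. `blocking_fetch`, `run_all`, and the `concurrency` parameter are hypothetical names for illustration, not taken from the repository.

```python
import asyncio
import time


def blocking_fetch(item):
    # Hypothetical stand-in for a blocking standard-library call
    # (for example, an http.client request).
    time.sleep(0.01)
    return item * 2


async def run_all(items, concurrency=4):
    loop = asyncio.get_running_loop()
    # Limit how many blocking calls are in flight at once.
    sem = asyncio.Semaphore(concurrency)

    async def worker(item):
        async with sem:
            # The call runs in a thread of the default executor. Once it has
            # started, cancelling the awaiting task does not interrupt the
            # thread, which is the drawback noted in the commit message.
            return await loop.run_in_executor(None, blocking_fetch, item)

    # gather returns the results in the order of the inputs.
    return await asyncio.gather(*(worker(i) for i in items))


if __name__ == "__main__":
    print(asyncio.run(run_all(range(5))))
```

With aiohttp, the requests themselves would be coroutines and could be cancelled cleanly; the sketch trades that for having no third-party dependency.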