JustAnotherArchivist
09cf078e34
Add metadata indexing
10 months ago
JustAnotherArchivist
3dc1009e5b
Fix deserialisation of metadata producing an object with the wrong type
10 months ago
JustAnotherArchivist
a839834050
Fix unrecognised repeated keys getting reported as unrepeatable
10 months ago
JustAnotherArchivist
518541eb81
Fix metadata fields list caching for subclasses
Because the _allFieldsCache attribute gets inherited, when it gets set for a class, all subclasses will also see that list rather than their own, potentially different list. To fix this, use a global dict indexing on the metadata class instead.
10 months ago
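The caching problem described in 518541eb81 can be illustrated with a minimal sketch. All names here (`BuggyBase`, `FixedBase`, `_FIELDS_CACHE`, `fields`, `all_fields`) are hypothetical stand-ins, not the project's actual identifiers; only the idea of replacing an inherited cache attribute with a global dict keyed on the class comes from the commit message.

```python
def _collect_fields(cls):
    # Walk the MRO from the most basic class to the most derived one and
    # gather the 'fields' tuple that each class declares itself.
    collected = []
    for klass in reversed(cls.__mro__):
        collected.extend(klass.__dict__.get('fields', ()))
    return tuple(collected)


class BuggyBase:
    fields = ('id',)
    _cache = None  # class attribute, so subclasses inherit whatever is set here

    @classmethod
    def all_fields(cls):
        # Attribute lookup follows the MRO: once BuggyBase._cache is set,
        # a subclass sees that value instead of computing its own list.
        if cls._cache is None:
            cls._cache = _collect_fields(cls)
        return cls._cache


class BuggyChild(BuggyBase):
    fields = ('url',)


# Sketch of the fix: a module-level dict keyed on the class itself.
_FIELDS_CACHE = {}


class FixedBase:
    fields = ('id',)

    @classmethod
    def all_fields(cls):
        if cls not in _FIELDS_CACHE:
            _FIELDS_CACHE[cls] = _collect_fields(cls)
        return _FIELDS_CACHE[cls]


class FixedChild(FixedBase):
    fields = ('url',)
```

In this sketch, calling `BuggyBase.all_fields()` before `BuggyChild.all_fields()` makes the child return the parent's list, while the `Fixed*` variants keep a separate entry per class.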
JustAnotherArchivist
47fe0a4e70
Handle URLs with queries and fragments
1 year ago
JustAnotherArchivist
9de50bebdb
Fix metadata parsing on field values containing a colon
1 year ago
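The kind of bug fixed in 9de50bebdb typically arises when a `key: value` line is split on every colon. The sketch below assumes such a line format and uses a hypothetical `parse_field_line` helper; the project's real metadata format and parser may differ.

```python
def parse_field_line(line):
    # Split on the FIRST colon only, so that colons inside the value
    # (for example in a URL or a timestamp) are preserved.
    key, sep, value = line.partition(':')
    if not sep:
        raise ValueError(f'not a key-value line: {line!r}')
    return key.strip(), value.strip()
```

A naive `line.split(':')` would break `URL: https://example.org/` into three pieces, whereas `partition` keeps the value intact.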
JustAnotherArchivist
7eb175fb63
Document how inheritance on Metadata classes works
1 year ago
JustAnotherArchivist
a361fe54e5
Add a metadata version field
1 year ago
JustAnotherArchivist
fb8af13c15
Return all metadata validation errors at the same time
1 year ago
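Returning all validation errors together, as fb8af13c15 describes, usually means collecting problems in a list instead of raising on the first one. The following is a rough sketch under that assumption; `validate_record`, the `required` parameter, and the error strings are hypothetical and not taken from the project.

```python
def validate_record(record, required=('ID', 'Date')):
    # Accumulate every problem found rather than stopping at the first,
    # so the caller can report them all in one pass.
    errors = []
    for key in required:
        if key not in record:
            errors.append(f'missing field: {key}')
    for key, value in record.items():
        if not isinstance(value, str):
            errors.append(f'field {key} is not a string')
    return errors
```

An empty list then signals a valid record, and the caller decides whether to raise a single exception carrying the whole list.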
JustAnotherArchivist
811e119835
Add retrieval start/end time metadata fields
1 year ago
JustAnotherArchivist
fa4b60225c
Index → Metadata
'Index' was a misnomer from the start since it contains critical information for the operation that can't be reconstructed (e.g. existing refs).
1 year ago
JustAnotherArchivist
4259d34ec8
Set default ID
1 year ago
JustAnotherArchivist
d5891c795c
More metadata
1 year ago
JustAnotherArchivist
25792d9006
Fix missing inheritance from abc.ABCMeta
1 year ago
JustAnotherArchivist
a910d4851c
Add support for inheritance of index fields; change type of field list to a tuple to lessen the risk of modification
1 year ago
JustAnotherArchivist
8e83c9b7b4
Support incremental Git bundles
Also fix a small discrepancy between the commit list and bundle due to --reflog vs --all
1 year ago
JustAnotherArchivist
ed69ba16c9
logger → _logger
1 year ago
JustAnotherArchivist
2257305872
Disallow underscores in module names
Using the preferred file naming scheme of {moduleName}_{someInputURLDerivative}_{date}*, this allows mapping files to modules without ambiguity.
1 year ago
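The reasoning in 2257305872 can be sketched briefly: if module names never contain an underscore, the module name of a file following the `{moduleName}_{someInputURLDerivative}_{date}*` scheme is everything before the first underscore. The helper name `module_name_from_filename` and the example filename are hypothetical.

```python
def module_name_from_filename(filename):
    # Because module names contain no underscores, the first underscore
    # unambiguously ends the module name, even if later parts of the
    # filename contain underscores of their own.
    name, sep, _rest = filename.partition('_')
    if not sep or not name:
        raise ValueError(f'filename does not follow the naming scheme: {filename!r}')
    return name
```

For instance, a file named `git_example.org_some_repo_20200101.bundle` would map to the module `git` under this sketch.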
JustAnotherArchivist
4dcac08585
Fix import order
1 year ago
JustAnotherArchivist
0f1f5abc64
Add indices for files
1 year ago
JustAnotherArchivist
e3da8c7736
Use generic alias types
This requires at least Python 3.9.
1 year ago
JustAnotherArchivist
2a2c9373d0
Documentation of the core
3 years ago
JustAnotherArchivist
1b73693b37
Keep track of and handle errors in modules via metaclass
3 years ago
JustAnotherArchivist
922900ac4e
Add support for selecting a module explicitly using `name+` URL prefix
E.g. `git+https://example.org/`
3 years ago
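One way to recognise the `name+` prefix from 922900ac4e is to look for a `+` inside the URL scheme. This is only a sketch; `split_module_prefix` is a hypothetical helper, and the project's actual parsing may work differently.

```python
def split_module_prefix(url):
    # Only inspect the scheme part, so a '+' elsewhere in the URL
    # (path, query, fragment) is never mistaken for a module prefix.
    scheme, sep, rest = url.partition('://')
    if not sep:
        return None, url
    name, plus, real_scheme = scheme.partition('+')
    if not plus:
        return None, url
    return name, real_scheme + '://' + rest
```

With this sketch, `git+https://example.org/` yields the module name `git` and the URL `https://example.org/`, while a URL without a prefix is returned unchanged with `None` as the module name.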
JustAnotherArchivist
22c707c04f
Add Module.name attribute
3 years ago
JustAnotherArchivist
90e0af88b9
Fix return type of get_module_{class,instance}
No need to quote the class name since the methods are not inside the class (anymore)
3 years ago
JustAnotherArchivist
5f9547d600
Get rid of inheritance-level-based module selection and instead raise an exception if there are no or multiple matching modules
3 years ago
JustAnotherArchivist
7e8958b063
Allow overriding the archive ID
4 years ago
JustAnotherArchivist
90f80e41a9
Add __repr__ methods
4 years ago
JustAnotherArchivist
9f6e5a9f48
Move InputURL handling to base Module.__init__ and extract URL string for convenience
4 years ago
JustAnotherArchivist
ca68893a59
Run submodules directly within the modules and return results from there instead of processing that externally
4 years ago
JustAnotherArchivist
74a6fc7641
Use dataclass instead of namedtuple for module results
4 years ago
JustAnotherArchivist
07dc1927cf
Initial commit
A significant part of this code (e.g. the module loading, HTTP retrieval, CLI) was mostly or entirely copied from snscrape.
4 years ago