JustAnotherArchivist
c02859987f
Skip temporary metadata dependency resolution if there are no dependencies
1 year ago
JustAnotherArchivist
c4e3fd29e0
Add debug messages for artefacts listing
1 year ago
JustAnotherArchivist
c18c440871
Fix stripping of prefix when it contains escapes
1 year ago
JustAnotherArchivist
cc802f7727
Make all `*_temporary_metadata` methods take/return unique names rather than filenames
1 year ago
JustAnotherArchivist
ee581f5c5c
Fix wait_temporary_metadata checking for incorrect filenames
1 year ago
JustAnotherArchivist
9474c44171
Ensure that FD 3 gets closed
It appears that Python doesn't reliably (or maybe just doesn't at all) flush and close non-standard FDs on exit. This randomly caused the artefacts list to get lost to /dev/null.
1 year ago
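The FD 3 fix above can be illustrated with a minimal sketch. It is not the project's actual code: `write_artefacts` and the file names are hypothetical, and the only point shown is that wrapping the descriptor in a context manager flushes and closes it explicitly rather than relying on interpreter exit.

```python
import os

def write_artefacts(fd, names):
    # Hypothetical helper: writes one artefact name per line to an already-open FD.
    # os.fdopen takes ownership of the descriptor (closefd defaults to True),
    # so leaving the with-block flushes the buffer and closes the FD.
    with os.fdopen(fd, 'w') as fp:
        for name in names:
            fp.write(name + '\n')

if __name__ == '__main__':
    # Demonstration with a pipe standing in for FD 3.
    read_end, write_end = os.pipe()
    write_artefacts(write_end, ['example.bundle'])
    print(os.read(read_end, 1024))
    os.close(read_end)
```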
JustAnotherArchivist
2fa88c4cce
Record metadata filename without full path
1 year ago
JustAnotherArchivist
66666a1538
Workaround for incremental bundles with deltified objects
1 year ago
JustAnotherArchivist
d3c701daa9
Support parallel runs against the same storage
Closes #15
1 year ago
JustAnotherArchivist
9fa36654f5
Add undocumented --write-artefacts-fd-3 for codearchiver-bot
1 year ago
JustAnotherArchivist
d42ee45bb2
Module puts to storage directly
1 year ago
JustAnotherArchivist
543c6b0595
Use temporary directory for Git clone directory
The previous approach was flawed and broke on URLs ending with a slash.
1 year ago
JustAnotherArchivist
47fe0a4e70
Handle URLs with queries and fragments
1 year ago
JustAnotherArchivist
e08919d89f
Fix crash on incremental bundling with warnings
For example, if the HEAD is excluded:
warning: ref 'refs/heads/master' is excluded by the rev-list options
warning: ref 'HEAD' is excluded by the rev-list options
fatal: Refusing to create empty bundle.
The fatal message always appears last (though that's of course undocumented).
1 year ago
JustAnotherArchivist
0da610744c
Keep a record of what HEAD points at
1 year ago
JustAnotherArchivist
9de50bebdb
Fix metadata parsing on field values containing a colon
1 year ago
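The class of bug named in the colon fix above typically comes from splitting a `Key: value` line on every colon. A minimal sketch of the safer approach follows; the line format and the `parse_field` name are assumptions for illustration, not the project's actual metadata format or parser.

```python
def parse_field(line):
    # Hypothetical parser for a 'Key: value' line.
    # partition splits at the FIRST colon only, so colons inside the value
    # (for example in a URL or a timestamp) are preserved.
    key, sep, value = line.partition(':')
    if not sep:
        raise ValueError(f'not a metadata field: {line!r}')
    return key.strip(), value.strip()

if __name__ == '__main__':
    print(parse_field('Input URL: https://example.org/repo.git'))
```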
JustAnotherArchivist
e39548c50b
Fix file extensions
1 year ago
JustAnotherArchivist
7861036624
Add submodule check
#13
1 year ago
JustAnotherArchivist
9a61800758
Refactor Git bundling to allow for verification of the bundle contents
This verifies that all objects from the current clone are in either the dependency bundles or the current bundle. This guarantees that the repo as it was cloned at the time of retrieval can be reconstructed exactly from the bundles.
As a side-effect, if a non-standard Git server were to include objects in a clone pack that are not discoverable from refs, this will fail any attempt to archive such a clone. This could in the future be resolved by adding custom refs for those extra objects.
This also fixes a bug where prior bundles could be included as a dependency even though they contain no relevant data due to their refs (as refs are always listed in the bundle metadata). Instead, dependency detection now operates directly on commit and tag objects, which can only be present in one bundle.
1 year ago
JustAnotherArchivist
f5fe0496f5
Add support for supplying a file-like object as stdin
1 year ago
JustAnotherArchivist
adafd6bd01
Fix race condition in subprocess runner
stdin, stdout, and stderr being closed does not necessarily imply that the process has exited, although it usually does. Still need to explicitly wait for it to terminate after the I/O loop. This matches what the stdlib `subprocess.Popen._communicate` does as well.
1 year ago
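The race described in the commit above can be sketched as follows. This is a simplified illustration, not codearchiver's subprocess runner: `run_and_collect` is a hypothetical helper that handles only stdout, and the point is the explicit `wait()` after the read loop.

```python
import subprocess
import sys

def run_and_collect(args):
    # Hypothetical helper: runs a command and collects its stdout.
    proc = subprocess.Popen(args, stdout=subprocess.PIPE)
    chunks = []
    # I/O loop: read until the pipe reports EOF.
    while True:
        chunk = proc.stdout.read(4096)
        if not chunk:
            break
        chunks.append(chunk)
    proc.stdout.close()
    # EOF only means the child closed its end of the pipe; the process itself
    # may still be running. Waiting here ensures the return code is available.
    returncode = proc.wait()
    return returncode, b''.join(chunks)

if __name__ == '__main__':
    print(run_and_collect([sys.executable, '-c', 'print("hi")']))
```

Without the final `wait()`, reading `proc.returncode` immediately after the loop could still yield `None` if the child has closed its streams but not yet exited.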
JustAnotherArchivist
3ca99d8839
Require `Storage.search_metadata` to return files in lexicographical order to minimise dependencies between bundles
1 year ago
JustAnotherArchivist
cc7bdbb3f4
Fix tag objects not getting deduplicated
1 year ago
JustAnotherArchivist
f1edf4b752
Fix TypeError due to lack of `glob.glob`'s `root_dir` option on Python 3.9
1 year ago
JustAnotherArchivist
4d6a423fb5
Replace hacky module importing (taken from snscrape commit aa7d7d3d)
1 year ago
JustAnotherArchivist
7eb175fb63
Document how inheritance on Metadata classes works
1 year ago
JustAnotherArchivist
a361fe54e5
Add a metadata version field
1 year ago
JustAnotherArchivist
fb8af13c15
Return all metadata validation errors at the same time
1 year ago
JustAnotherArchivist
811e119835
Add retrieval start/end time metadata fields
1 year ago
JustAnotherArchivist
eab6db9f27
Better storage metadata search now that the module name is recorded there anyway
1 year ago
JustAnotherArchivist
fa4b60225c
Index → Metadata
'Index' was a misnomer from the start since it contains critical information for the operation that can't be reconstructed (e.g. existing refs).
1 year ago
JustAnotherArchivist
4259d34ec8
Set default ID
1 year ago
JustAnotherArchivist
d5891c795c
More metadata
1 year ago
JustAnotherArchivist
25792d9006
Fix missing inheritance from abc.ABCMeta
1 year ago
JustAnotherArchivist
a910d4851c
Add support for inheritance of index fields; change type of field list to a tuple to lessen the risk of modification
1 year ago
JustAnotherArchivist
80995bccde
Add comment about FETCH_HEAD
1 year ago
JustAnotherArchivist
2a9ff2ee15
Support empty incremental bundles
1 year ago
JustAnotherArchivist
0e7b17d3fd
Capture and return stderr
1 year ago
JustAnotherArchivist
a6e256c58f
Fix invalid usage of codearchiver.subprocess
Introduced by 240dcceb
1 year ago
JustAnotherArchivist
8e83c9b7b4
Support incremental Git bundles
Also fix a small discrepancy between the commit list and bundle due to --reflog vs --all
1 year ago
JustAnotherArchivist
021b26973b
Fix handling empty input
1 year ago
JustAnotherArchivist
ed69ba16c9
logger → _logger
1 year ago
JustAnotherArchivist
6f7a95d289
Add --progress option to cloning for more details
1 year ago
JustAnotherArchivist
42e420ad0d
Disable prompts on password-protected repos
1 year ago
JustAnotherArchivist
a9e838adde
Raise exception if file already exists in DirectoryStorage target
1 year ago
JustAnotherArchivist
6af07cb51c
Raise exceptions on fatal errors
1 year ago
JustAnotherArchivist
2257305872
Disallow underscores in module names
Using the preferred file naming scheme of {moduleName}_{someInputURLDerivative}_{date}*, this allows mapping files to modules without ambiguity.
1 year ago
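The reasoning behind the underscore restriction above can be sketched briefly: if module names never contain an underscore, everything before the first underscore in a file name is the module name. The function name and example file name below are hypothetical and only follow the `{moduleName}_{someInputURLDerivative}_{date}*` scheme quoted in the commit.

```python
def module_from_filename(filename):
    # Hypothetical helper: because module names may not contain '_',
    # the text before the first underscore identifies the module unambiguously,
    # even when the URL-derived part contains underscores itself.
    module, sep, _rest = filename.partition('_')
    if not sep or not module:
        raise ValueError(f'file name does not follow the naming scheme: {filename!r}')
    return module

if __name__ == '__main__':
    print(module_from_filename('git_example.org_some_repo_20230101T000000Z.bundle'))
```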
JustAnotherArchivist
4dcac08585
Fix import order
1 year ago
JustAnotherArchivist
0f1f5abc64
Add indices for files
1 year ago
JustAnotherArchivist
e3da8c7736
Use generic alias types
This requires at least Python 3.9.
1 year ago