09cf078
(HEAD -> master)
Add metadata indexing
2023-07-23 19:26:09 +0000
3dc1009
Fix deserialisation of metadata producing an object with the wrong type
2023-07-23 19:04:56 +0000
a839834
Fix unrecognised repeated keys getting reported as unrepeatable
2023-07-23 18:42:02 +0000
518541e
Fix metadata fields list caching for subclasses
2023-07-23 18:40:10 +0000
1355db6
(tag: v1.1)
Reduce memory usage by deleting potentially big objects when they're no longer needed
2023-03-30 00:07:29 +0000
c028599
Skip temporary metadata dependency resolution if there are no dependencies
2023-03-29 21:55:12 +0000
c4e3fd2
Add debug messages for artefacts listing
2023-03-29 21:54:26 +0000
c18c440
Fix stripping of prefix when it contains escapes
2023-03-29 21:47:36 +0000
cc802f7
Make all `*_temporary_metadata` methods take/return unique names rather than filenames
2023-03-29 21:26:33 +0000
ee581f5
Fix wait_temporary_metadata checking for incorrect filenames
2023-03-29 21:10:20 +0000
9474c44
Ensure that FD 3 gets closed
2023-03-29 09:21:06 +0000
2fa88c4
Record metadata filename without full path
2023-03-29 00:55:49 +0000
66666a1
Workaround for incremental bundles with deltified objects
2023-03-28 04:36:46 +0000
d3c701d
Support parallel runs against the same storage
2023-03-28 04:32:45 +0000
9fa3665
Add undocumented --write-artefacts-fd-3 for codearchiver-bot
2023-03-27 20:40:57 +0000
d42ee45
Module puts to storage directly
2023-03-27 19:46:48 +0000
543c6b0
(tag: v1.0)
Use temporary directory for Git clone directory
2023-03-25 05:34:56 +0000
47fe0a4
Handle URLs with queries and fragments
2023-03-25 05:27:43 +0000
e08919d
Fix crash on incremental bundling with warnings
2023-03-24 23:03:51 +0000
0da6107
Keep a record of what HEAD points at
2023-03-24 23:03:10 +0000
9de50be
Fix metadata parsing on field values containing a colon
2023-03-24 23:02:23 +0000
e39548c
Fix file extensions
2023-03-24 21:07:19 +0000
7861036
Add submodule check
2023-03-21 19:44:20 +0000
9a61800
Refactor Git bundling to allow for verification of the bundle contents
2023-03-20 01:17:17 +0000
f5fe049
Add support for supplying a file-like object as stdin
2023-03-20 01:13:56 +0000
adafd6b
Fix race condition in subprocess runner
2023-03-20 01:11:24 +0000
9a0c839
Document minimum Git version
2023-03-14 23:41:21 +0000
3ca99d8
Require `Storage.search_metadata` to return files in lexicographical order to minimise dependencies between bundles
2023-03-14 23:40:59 +0000
cc7bdbb
Fix tag objects not getting deduplicated
2023-03-14 23:37:20 +0000
f1edf4b
Fix TypeError due to lack of `glob.glob`'s `root_dir` option on Python 3.9
2023-03-12 18:28:42 +0000
4d6a423
Replace hacky module importing (taken from snscrape commit aa7d7d3d)
2023-03-10 20:15:02 +0000
7eb175f
Document how inheritance on Metadata classes works
2023-03-10 13:20:47 +0000
a361fe5
Add a metadata version field
2023-03-10 13:20:28 +0000
fb8af13
Return all metadata validation errors at the same time
2023-03-10 11:41:09 +0000
811e119
Add retrieval start/end time metadata fields
2023-03-10 11:24:22 +0000
b0505f9
Fix typo in package name
2023-03-10 01:44:46 +0000
eab6db9
Better storage metadata search now that the module name is recorded there anyway
2023-03-10 01:18:41 +0000
fa4b602
Index → Metadata
2023-03-10 01:16:25 +0000
4259d34
Set default ID
2023-03-10 01:03:59 +0000
d5891c7
More metadata
2023-03-10 01:00:51 +0000
25792d9
Fix missing inheritance from abc.ABCMeta
2023-03-10 00:46:34 +0000
a910d48
Add support for inheritance of index fields; change type of field list to a tuple to lessen the risk of modification
2023-03-09 23:34:12 +0000
2779148
Add .gitignore
2023-03-09 11:33:55 +0000
d5a7d39
setup.py → pyproject.toml
2023-03-09 11:31:33 +0000
80995bc
Add comment about FETCH_HEAD
2023-03-09 11:26:12 +0000
2a9ff2e
Support empty incremental bundles
2023-03-09 11:19:59 +0000
0e7b17d
Capture and return stderr
2023-03-09 11:18:04 +0000
a6e256c
Fix invalid usage of codearchiver.subprocess
2023-03-09 11:16:14 +0000
8e83c9b
Support incremental Git bundles
2023-03-09 10:53:02 +0000
021b269
Fix handling empty input
2023-03-09 10:45:22 +0000
ed69ba1
logger → _logger
2023-03-09 10:44:47 +0000
6f7a95d
Add --progress option to cloning for more details
2023-03-09 08:13:35 +0000
42e420a
Disable prompts on password-protected repos
2023-03-09 08:12:59 +0000
a9e838a
Raise exception if file already exists in DirectoryStorage target
2023-03-09 08:11:50 +0000
6af07cb
Raise exceptions on fatal errors
2023-03-09 08:05:39 +0000
2257305
Disallow underscores in module names
2023-03-09 07:56:04 +0000
4dcac08
Fix import order
2023-03-09 07:55:55 +0000
0f1f5ab
Add indices for files
2023-03-09 07:55:40 +0000
e3da8c7
Use generic alias types
2023-03-06 00:21:29 +0000
f2d2df9
Simplify storage design; there is no need for the queue
2023-03-05 03:02:42 +0000
550afa8
Add storage abstraction
2023-03-05 02:58:02 +0000
06daea1
Remove GitHub module as it is not ready for use yet
2023-03-05 02:27:10 +0000
240dcce
Add subprocess wrapper for logging stderr
2023-03-04 21:44:51 +0000
6fb0ac4
Initial GitHub module only retrieving the actual repository
2020-06-27 13:16:06 +0000
2a2c937
Documentation of the core
2020-06-27 01:19:23 +0000
715420e
Fix imports in CLI: core and modules aren't needed in the argument parser
2020-06-27 01:15:55 +0000
1b73693
Keep track of and handle errors in modules via metaclass
2020-06-26 22:12:58 +0000
922900a
Add support for selecting a module explicitly using `name+` URL prefix
2020-06-26 17:57:02 +0000
22c707c
Add Module.name attribute
2020-06-26 17:53:05 +0000
90e0af8
Fix return type of get_module_{class,instance}
2020-06-26 16:38:08 +0000
5f9547d
Get rid of inheritance-level-based module selection and instead raise an exception if there are no or multiple matching modules
2020-06-26 16:22:13 +0000
7e8958b
Allow overriding the archive ID
2020-06-22 13:44:28 +0000
90f80e4
Add __repr__ methods
2020-06-19 23:26:16 +0000
9f6e5a9
Move InputURL handling to base Module.__init__ and extract URL string for convenience
2020-06-19 23:23:48 +0000
ca68893
Run submodules directly within the modules and return results from there instead of processing that externally
2020-06-19 23:22:10 +0000
74a6fc7
Use dataclass instead of namedtuple for module results
2020-06-19 23:20:21 +0000
07dc192
Initial commit
2020-06-18 03:24:00 +0000