Commit Graph

  • 09cf078 (HEAD -> master) Add metadata indexing by JustAnotherArchivist 2023-07-23 19:26:09 +0000
  • 3dc1009 Fix deserialisation of metadata producing an object with the wrong type by JustAnotherArchivist 2023-07-23 19:04:56 +0000
  • a839834 Fix unrecognised repeated keys getting reported as unrepeatable by JustAnotherArchivist 2023-07-23 18:42:02 +0000
  • 518541e Fix metadata fields list caching for subclasses by JustAnotherArchivist 2023-07-23 18:40:10 +0000
  • 1355db6 (tag: v1.1) Reduce memory usage by deleting potentially big objects when they're no longer needed by JustAnotherArchivist 2023-03-30 00:07:29 +0000
  • c028599 Skip temporary metadata dependency resolution if there are no dependencies by JustAnotherArchivist 2023-03-29 21:55:12 +0000
  • c4e3fd2 Add debug messages for artefacts listing by JustAnotherArchivist 2023-03-29 21:54:26 +0000
  • c18c440 Fix stripping of prefix when it contains escapes by JustAnotherArchivist 2023-03-29 21:47:36 +0000
  • cc802f7 Make all `*_temporary_metadata` methods take/return unique names rather than filenames by JustAnotherArchivist 2023-03-29 21:26:33 +0000
  • ee581f5 Fix wait_temporary_metadata checking for incorrect filenames by JustAnotherArchivist 2023-03-29 21:10:20 +0000
  • 9474c44 Ensure that FD 3 gets closed by JustAnotherArchivist 2023-03-29 09:21:06 +0000
  • 2fa88c4 Record metadata filename without full path by JustAnotherArchivist 2023-03-29 00:55:49 +0000
  • 66666a1 Workaround for incremental bundles with deltified objects by JustAnotherArchivist 2023-03-28 04:36:46 +0000
  • d3c701d Support parallel runs against the same storage by JustAnotherArchivist 2023-03-28 04:32:45 +0000
  • 9fa3665 Add undocumented --write-artefacts-fd-3 for codearchiver-bot by JustAnotherArchivist 2023-03-27 20:40:57 +0000
  • d42ee45 Module puts to storage directly by JustAnotherArchivist 2023-03-27 19:46:48 +0000
  • 543c6b0 (tag: v1.0) Use temporary directory for Git clone directory by JustAnotherArchivist 2023-03-25 05:34:56 +0000
  • 47fe0a4 Handle URLs with queries and fragments by JustAnotherArchivist 2023-03-25 05:27:43 +0000
  • e08919d Fix crash on incremental bundling with warnings by JustAnotherArchivist 2023-03-24 23:03:51 +0000
  • 0da6107 Keep a record of what HEAD points at by JustAnotherArchivist 2023-03-24 23:03:10 +0000
  • 9de50be Fix metadata parsing on field values containing a colon by JustAnotherArchivist 2023-03-24 23:02:23 +0000
  • e39548c Fix file extensions by JustAnotherArchivist 2023-03-24 21:07:19 +0000
  • 7861036 Add submodule check by JustAnotherArchivist 2023-03-21 19:44:20 +0000
  • 9a61800 Refactor Git bundling to allow for verification of the bundle contents by JustAnotherArchivist 2023-03-20 01:17:17 +0000
  • f5fe049 Add support for supplying a file-like object as stdin by JustAnotherArchivist 2023-03-20 01:13:56 +0000
  • adafd6b Fix race condition in subprocess runner by JustAnotherArchivist 2023-03-20 01:11:24 +0000
  • 9a0c839 Document minimum Git version by JustAnotherArchivist 2023-03-14 23:41:21 +0000
  • 3ca99d8 Require `Storage.search_metadata` to return files in lexicographical order to minimise dependencies between bundles by JustAnotherArchivist 2023-03-14 23:40:59 +0000
  • cc7bdbb Fix tag objects not getting deduplicated by JustAnotherArchivist 2023-03-14 23:37:20 +0000
  • f1edf4b Fix TypeError due to lack of `glob.glob`'s `root_dir` option on Python 3.9 by JustAnotherArchivist 2023-03-12 18:28:42 +0000
  • 4d6a423 Replace hacky module importing (taken from snscrape commit aa7d7d3d) by JustAnotherArchivist 2023-03-10 20:15:02 +0000
  • 7eb175f Document how inheritance on Metadata classes works by JustAnotherArchivist 2023-03-10 13:20:47 +0000
  • a361fe5 Add a metadata version field by JustAnotherArchivist 2023-03-10 13:20:28 +0000
  • fb8af13 Return all metadata validation errors at the same time by JustAnotherArchivist 2023-03-10 11:41:09 +0000
  • 811e119 Add retrieval start/end time metadata fields by JustAnotherArchivist 2023-03-10 11:24:22 +0000
  • b0505f9 Fix typo in package name by JustAnotherArchivist 2023-03-10 01:44:46 +0000
  • eab6db9 Better storage metadata search now that the module name is recorded there anyway by JustAnotherArchivist 2023-03-10 01:18:41 +0000
  • fa4b602 Index → Metadata by JustAnotherArchivist 2023-03-10 01:16:25 +0000
  • 4259d34 Set default ID by JustAnotherArchivist 2023-03-10 01:03:59 +0000
  • d5891c7 More metadata by JustAnotherArchivist 2023-03-10 01:00:51 +0000
  • 25792d9 Fix missing inheritance from abc.ABCMeta by JustAnotherArchivist 2023-03-10 00:46:34 +0000
  • a910d48 Add support for inheritance of index fields; change type of field list to a tuple to lessen the risk of modification by JustAnotherArchivist 2023-03-09 23:34:12 +0000
  • 2779148 Add .gitignore by JustAnotherArchivist 2023-03-09 11:33:55 +0000
  • d5a7d39 setup.py → pyproject.toml by JustAnotherArchivist 2023-03-09 11:31:33 +0000
  • 80995bc Add comment about FETCH_HEAD by JustAnotherArchivist 2023-03-09 11:26:12 +0000
  • 2a9ff2e Support empty incremental bundles by JustAnotherArchivist 2023-03-09 11:19:59 +0000
  • 0e7b17d Capture and return stderr by JustAnotherArchivist 2023-03-09 11:18:04 +0000
  • a6e256c Fix invalid usage of codearchiver.subprocess by JustAnotherArchivist 2023-03-09 11:16:14 +0000
  • 8e83c9b Support incremental Git bundles by JustAnotherArchivist 2023-03-09 10:53:02 +0000
  • 021b269 Fix handling empty input by JustAnotherArchivist 2023-03-09 10:45:22 +0000
  • ed69ba1 logger → _logger by JustAnotherArchivist 2023-03-09 10:44:47 +0000
  • 6f7a95d Add --progress option to cloning for more details by JustAnotherArchivist 2023-03-09 08:13:35 +0000
  • 42e420a Disable prompts on password-protected repos by JustAnotherArchivist 2023-03-09 08:12:59 +0000
  • a9e838a Raise exception if file already exists in DirectoryStorage target by JustAnotherArchivist 2023-03-09 08:11:50 +0000
  • 6af07cb Raise exceptions on fatal errors by JustAnotherArchivist 2023-03-09 08:05:39 +0000
  • 2257305 Disallow underscores in module names by JustAnotherArchivist 2023-03-09 07:56:04 +0000
  • 4dcac08 Fix import order by JustAnotherArchivist 2023-03-09 07:55:55 +0000
  • 0f1f5ab Add indices for files by JustAnotherArchivist 2023-03-09 07:55:40 +0000
  • e3da8c7 Use generic alias types by JustAnotherArchivist 2023-03-06 00:21:29 +0000
  • f2d2df9 Simplify storage design; there is no need for the queue by JustAnotherArchivist 2023-03-05 03:02:42 +0000
  • 550afa8 Add storage abstraction by JustAnotherArchivist 2023-03-05 02:58:02 +0000
  • 06daea1 Remove GitHub module as it is not ready for use yet by JustAnotherArchivist 2023-03-05 02:27:10 +0000
  • 240dcce Add subprocess wrapper for logging stderr by JustAnotherArchivist 2023-03-04 21:44:51 +0000
  • 6fb0ac4 Initial GitHub module only retrieving the actual repository by JustAnotherArchivist 2020-06-27 13:16:06 +0000
  • 2a2c937 Documentation of the core by JustAnotherArchivist 2020-06-27 01:19:23 +0000
  • 715420e Fix imports in CLI: core and modules aren't needed in the argument parser by JustAnotherArchivist 2020-06-27 01:15:55 +0000
  • 1b73693 Keep track of and handle errors in modules via metaclass by JustAnotherArchivist 2020-06-26 22:12:58 +0000
  • 922900a Add support for selecting a module explicitly using `name+` URL prefix by JustAnotherArchivist 2020-06-26 17:57:02 +0000
  • 22c707c Add Module.name attribute by JustAnotherArchivist 2020-06-26 17:53:05 +0000
  • 90e0af8 Fix return type of get_module_{class,instance} by JustAnotherArchivist 2020-06-26 16:38:08 +0000
  • 5f9547d Get rid of inheritance-level-based module selection and instead raise an exception if there are no or multiple matching modules by JustAnotherArchivist 2020-06-26 16:22:13 +0000
  • 7e8958b Allow overriding the archive ID by JustAnotherArchivist 2020-06-22 13:44:28 +0000
  • 90f80e4 Add __repr__ methods by JustAnotherArchivist 2020-06-19 23:26:16 +0000
  • 9f6e5a9 Move InputURL handling to base Module.__init__ and extract URL string for convenience by JustAnotherArchivist 2020-06-19 23:23:48 +0000
  • ca68893 Run submodules directly within the modules and return results from there instead of processing that externally by JustAnotherArchivist 2020-06-19 23:22:10 +0000
  • 74a6fc7 Use dataclass instead of namedtuple for module results by JustAnotherArchivist 2020-06-19 23:20:21 +0000
  • 07dc192 Initial commit by JustAnotherArchivist 2020-06-18 03:24:00 +0000