45 commits (v0.2.1)

Author SHA1 Message Date
  JustAnotherArchivist 93df9cd18d Get rid of the temporary extra log file and read the plain file instead 4 years ago
  JustAnotherArchivist 08c3d55376 Add comment on block digest workaround (cf. f14a664b) 4 years ago
  JustAnotherArchivist 413435b7fb Work around warcio not writing the correct WARC-Profile header for revisit records on WARC/1.1 4 years ago
  JustAnotherArchivist 08d96b37c5 Support deep/multiple inheritance from Item 4 years ago
  JustAnotherArchivist 9d8de13775 Add Item.flush_subitems to flush the new subitems to the database while the item is still being processed 4 years ago
  JustAnotherArchivist 50b936b18c Refactor QWARC class to keep relevant variables in instance attributes instead of local variables 4 years ago
  JustAnotherArchivist c5d8d93166 Remove stray whitespace 4 years ago
  JustAnotherArchivist 8ee9b20718 Remove WARC-Target-URI header from warcinfo record 4 years ago
  JustAnotherArchivist f14a664b1c Work around warcio not writing a block digest for warcinfo records (https://github.com/webrecorder/warcio/issues/87) 4 years ago
  JustAnotherArchivist 7d53577522 Add parameter for disabling SSL/TLS certificate validation 4 years ago
  JustAnotherArchivist 7e049423a4 The memory leak has vanished as of CPython 3.7.3 4 years ago
  JustAnotherArchivist bd14ab3901 Fix crash due to closing the log handler on reaching the max WARC size 4 years ago
  JustAnotherArchivist 08117630b0 Remove warcinfo record in each data WARC and refer to the process's warcinfo record in the meta WARC instead 4 years ago
  JustAnotherArchivist 26aab15605 urn:X-qwarc instead of urn:qwarc 4 years ago
  JustAnotherArchivist 50d46ad51c Use log filename in the target URI of the log resource record 4 years ago
  JustAnotherArchivist e093211496 Set content type for resource records 4 years ago
  JustAnotherArchivist ae46b53401 Always write a WARC-Warcinfo-ID header 4 years ago
  JustAnotherArchivist 23fcdd4026 Write microsecond dates for request and response records 4 years ago
  JustAnotherArchivist 3030ad10ab Mark private API accordingly 4 years ago
  JustAnotherArchivist e0b4104d21 Remove log handler before writing log record since that requires closing the stream 4 years ago
  JustAnotherArchivist 6cfd352f68 Write WARC/1.1 files 4 years ago
  JustAnotherArchivist e1ad5c232e Write warcinfo and resource records in meta WARC on firing up qwarc rather than at the end 4 years ago
  JustAnotherArchivist f038cf91db Fix unfound distribution handling 4 years ago
  JustAnotherArchivist a5dfd5c805 Write spec file + its dependencies and command line to meta WARC 4 years ago
  JustAnotherArchivist e99e2304c9 Write meta WARC with log file 4 years ago
  JustAnotherArchivist d751844626 Fix starting another item before stopping on STOP file or memory limit exceedance 4 years ago
  JustAnotherArchivist 2b0778f9b5 Remove leftovers from initial code rewrite 4 years ago
  JustAnotherArchivist 85d78cee13 Add warcinfo record with version information on Python, system, and dependencies 4 years ago
  JustAnotherArchivist 9cff6bd5c1 Only open a WARC file when necessary to avoid producing empty WARCs at the end 4 years ago
  JustAnotherArchivist 21cf784102 Use setuptools_scm for versioning 4 years ago
  JustAnotherArchivist ab22966fef Add to log which item a message is coming from 4 years ago
  JustAnotherArchivist 6fafd32685 Error when the retries are exceeded 4 years ago
  JustAnotherArchivist 8647d6b396 Use f-strings instead of str.format 4 years ago
  JustAnotherArchivist 5008e6e8cd Deduplicate items 4 years ago
  JustAnotherArchivist 46c95e2157 Disable decoding the response content 4 years ago
  JustAnotherArchivist 85f6f7bd82 Make qwarc.utils.handle_response_limit_error_retries more useful by passing the deferring handler as an argument 5 years ago
  JustAnotherArchivist ad22a2327a Support adding headers to individual requests 5 years ago
  JustAnotherArchivist 67076f964c Add support for POST requests 5 years ago
  JustAnotherArchivist 2d52e78d85 Fix reference to aiohttp.CientError 5 years ago
  JustAnotherArchivist c1574a06c9 Fix sleep task type 5 years ago
  JustAnotherArchivist e0ca88c807 Fix reference to get_rss 5 years ago
  JustAnotherArchivist 984d28ede0 Fix type of --memorylimit, --disklimit, and --warcsplit values 5 years ago
  JustAnotherArchivist 8a8935810d Fix references to memory and disk space check methods 5 years ago
  JustAnotherArchivist be5673cfbf Add record deduplication within a process 5 years ago
  JustAnotherArchivist e892a6b6a7 Initial commit 5 years ago