71 commits (v0.2.3)
 

Author SHA1 Message Date
  JustAnotherArchivist 1678075a89 Log traceback on exceptions raised from an item 4 years ago
  JustAnotherArchivist 4ff8b260a1 Don't close raw data tempfiles until the response gets GC'd 4 years ago
  JustAnotherArchivist 4d9e4d8fe8 Fix ClientResponse._read returning more than nbytes if the entire response fits into the first block fed into the parser 4 years ago
  JustAnotherArchivist 2895f4bfdf Catch TypeError in Content-Length parsing 4 years ago
  JustAnotherArchivist 8358ba9131 Add support for only reading part of the response into memory 4 years ago
  JustAnotherArchivist 939978beec Handle EOF from the HTTP payload parser correctly 4 years ago
  JustAnotherArchivist b1a1c03f7e Handle STOP file and high memory usage before full disk to allow stopping while the disk is above the limit 4 years ago
  JustAnotherArchivist dd44d9b174 Adjust logging levels: log individual request failures only at WARNING and cancelled tasks at ERROR level 4 years ago
  JustAnotherArchivist 820384fe1e Stop deduping small responses 4 years ago
  JustAnotherArchivist 91035d769c Catch exceptions in Item.process and mark the items as errors instead of crashing 4 years ago
  JustAnotherArchivist 69984765b3 Fix taskType typo silencing cancellation warnings 4 years ago
  JustAnotherArchivist 461cedbbde Avoid temporary files created by warcio due to not knowing the record payload length 4 years ago
  JustAnotherArchivist c263ad0b03 Return ClientResponse object from fetch only if the retrieval was successful 4 years ago
  JustAnotherArchivist cb0d11284e Write only successful retrievals (i.e. ones that don't cause an exception) to WARC 4 years ago
  JustAnotherArchivist 1214409a0b Flush big responses to a temporary file instead of trying to keep everything in-memory 4 years ago
  JustAnotherArchivist 37dbcfad21 Don't write responses to WARC that triggered an exception 4 years ago
  JustAnotherArchivist 93df9cd18d Get rid of the temporary extra log file and read the plain file instead 4 years ago
  JustAnotherArchivist 08c3d55376 Add comment on block digest workaround (cf. f14a664b) 4 years ago
  JustAnotherArchivist 413435b7fb Work around warcio not writing the correct WARC-Profile header for revisit records on WARC/1.1 4 years ago
  JustAnotherArchivist 08d96b37c5 Support deep/multiple inheritance from Item 4 years ago
  JustAnotherArchivist 9d8de13775 Add Item.flush_subitems to flush the new subitems to the database while the item is still being processed 4 years ago
  JustAnotherArchivist 50b936b18c Refactor QWARC class to keep relevant variables in instance attributes instead of local variables 4 years ago
  JustAnotherArchivist c5d8d93166 Remove stray whitespace 4 years ago
  JustAnotherArchivist 8ee9b20718 Remove WARC-Target-URI header from warcinfo record 4 years ago
  JustAnotherArchivist f14a664b1c Work around warcio not writing a block digest for warcinfo records (https://github.com/webrecorder/warcio/issues/87) 4 years ago
  JustAnotherArchivist 7d53577522 Add parameter for disabling SSL/TLS certificate validation 4 years ago
  JustAnotherArchivist 7e049423a4 The memory leak has vanished as of CPython 3.7.3 4 years ago
  JustAnotherArchivist bd14ab3901 Fix crash due to closing the log handler on reaching the max WARC size 4 years ago
  JustAnotherArchivist 08117630b0 Remove warcinfo record in each data WARC and refer to the process's warcinfo record in the meta WARC instead 4 years ago
  JustAnotherArchivist 26aab15605 urn:X-qwarc instead of urn:qwarc 4 years ago
  JustAnotherArchivist 50d46ad51c Use log filename in the target URI of the log resource record 4 years ago
  JustAnotherArchivist e093211496 Set content type for resource records 4 years ago
  JustAnotherArchivist ae46b53401 Always write a WARC-Warcinfo-ID header 4 years ago
  JustAnotherArchivist 23fcdd4026 Write microsecond dates for request and response records 4 years ago
  JustAnotherArchivist 3030ad10ab Mark private API accordingly 4 years ago
  JustAnotherArchivist e0b4104d21 Remove log handler before writing log record since that requires closing the stream 4 years ago
  JustAnotherArchivist 6cfd352f68 Write WARC/1.1 files 4 years ago
  JustAnotherArchivist e1ad5c232e Write warcinfo and resource records in meta WARC on firing up qwarc rather than at the end 4 years ago
  JustAnotherArchivist f038cf91db Fix unfound distribution handling 4 years ago
  JustAnotherArchivist a5dfd5c805 Write spec file + its dependencies and command line to meta WARC 4 years ago
  JustAnotherArchivist e99e2304c9 Write meta WARC with log file 4 years ago
  JustAnotherArchivist d751844626 Fix starting another item before stopping on STOP file or memory limit exceedance 4 years ago
  JustAnotherArchivist 2b0778f9b5 Remove leftovers from initial code rewrite 4 years ago
  JustAnotherArchivist 85d78cee13 Add warcinfo record with version information on Python, system, and dependencies 4 years ago
  JustAnotherArchivist 9eaa7be4c8 Python 3.7 compatibility 4 years ago
  JustAnotherArchivist 9cff6bd5c1 Only open a WARC file when necessary to avoid producing empty WARCs at the end 4 years ago
  JustAnotherArchivist 21cf784102 Use setuptools_scm for versioning 4 years ago
  JustAnotherArchivist ab22966fef Add to log which item a message is coming from 4 years ago
  JustAnotherArchivist 6fafd32685 Error when the retries are exceeded 4 years ago
  JustAnotherArchivist 8647d6b396 Use f-strings instead of str.format 4 years ago