105 Commits (master)

Author SHA1 Message Date
  JustAnotherArchivist 2e1dc59e9d Fix log level of one message 3 years ago
  JustAnotherArchivist f025c4e9f3 Add extensive debug logging 3 years ago
  JustAnotherArchivist ce7f8fdc92 Make optional arguments to fetch kwarg-only 3 years ago
  JustAnotherArchivist b29db245fb Configurable verbosity for log file and stderr 3 years ago
  JustAnotherArchivist dbe1ed71ab "Freeze" log file object before writing to WARC to ensure that further log messages aren't picked up 3 years ago
  JustAnotherArchivist 8ca2a6bde5 Fix exceptions on journal errors 3 years ago
  JustAnotherArchivist 3c8b45b3a6 Refactor cleanup code 3 years ago
  JustAnotherArchivist dcd5455388 Fix crash on starting a run while the DB is locked 3 years ago
  JustAnotherArchivist 168fa78736 Avoid locking the DB when there are no subitems to insert 3 years ago
  JustAnotherArchivist 4484d6c588 Add Item representation 3 years ago
  JustAnotherArchivist 5675118877 Rename id to id_ to avoid clash with builtin 3 years ago
  JustAnotherArchivist a1e693739e Replace DB locking with an async context manager 3 years ago
  JustAnotherArchivist cbcef2f173 Add Linux classifier 3 years ago
  JustAnotherArchivist 733506aed7 Remove obsolete TODO 3 years ago
  JustAnotherArchivist c7fac0ec3f Add WARC journalling with rollback on errors 3 years ago
  JustAnotherArchivist a4cf1a4225 Fix str_get_all_between yielding half-overlapping matches 3 years ago
  JustAnotherArchivist 15203bd991 Handle redirect traps/loops 3 years ago
  JustAnotherArchivist f8f5258197 Track redirect depth 3 years ago
  JustAnotherArchivist a3d6fb35f8 Turn response handlers into kwarg-only functions for easier extendability without breaking existing code 3 years ago
  JustAnotherArchivist a91cc23d47 Simplify get_software_info's signature to just the extra dependency packages 3 years ago
  JustAnotherArchivist 6cc4adb901 Remove stray TODO 3 years ago
  JustAnotherArchivist c5604ef965 Simplify header merging 3 years ago
  JustAnotherArchivist 59ae1183d2 Add fromResponse parameter for URL completion and automatic Referer header 3 years ago
  JustAnotherArchivist 2324216016 Add baseUrl and evaluate incomplete URLs relative to it 3 years ago
  JustAnotherArchivist b30ccf8bf8 Move response/exception history to ClientResponse.qhistory 3 years ago
  JustAnotherArchivist e69527c715 Add defaultResponseHandler on the Item level 3 years ago
  JustAnotherArchivist 03336e4988 Add item to response handler arguments (e.g. for logging) 3 years ago
  JustAnotherArchivist 005999fcb9 Disable aiohttp's Content-Type checking on JSON parsing by default 3 years ago
  JustAnotherArchivist 6bdcfe71f0 Refactor database creation and item generation: call `Item.generate()` on every qwarc run and dedupe its output, allowing the addition of further items by modifying the spec file 3 years ago
  JustAnotherArchivist c878241f24 Switch from concurrent.futures.CancelledError to asyncio.CancelledError 3 years ago
  JustAnotherArchivist 749158b97a Use the Future's result directly rather than awaiting again 3 years ago
  JustAnotherArchivist 5c6169ee4d Bump Python version classifiers 3 years ago
  JustAnotherArchivist a85e80ffa2 Configurable request timeout 3 years ago
  JustAnotherArchivist 429ac94689 Make it possible to override and remove headers 3 years ago
  JustAnotherArchivist e40be54578 Document verify_ssl parameter 3 years ago
  JustAnotherArchivist d3437bde19 Move default headers to qwarc.const 3 years ago
  JustAnotherArchivist 17fc3499ff Fix infinite loop in workaround for aiohttp issue 4630 3 years ago
  JustAnotherArchivist b6003af1e5 Work around aiohttp bug on parsing chunked transfer encoding responses when the buffer ends in an unfortunate spot 4 years ago
  JustAnotherArchivist 1678075a89 Log traceback on exceptions raised from an item 4 years ago
  JustAnotherArchivist 4ff8b260a1 Don't close raw data tempfiles until the response gets GC'd 4 years ago
  JustAnotherArchivist 4d9e4d8fe8 Fix ClientResponse._read returning more than nbytes if the entire response fits into the first block fed into the parser 4 years ago
  JustAnotherArchivist 2895f4bfdf Catch TypeError in Content-Length parsing 4 years ago
  JustAnotherArchivist 8358ba9131 Add support for only reading part of the response into memory 4 years ago
  JustAnotherArchivist 939978beec Handle EOF from the HTTP payload parser correctly 4 years ago
  JustAnotherArchivist b1a1c03f7e Handle STOP file and high memory usage before full disk to allow stopping while the disk is above the limit 4 years ago
  JustAnotherArchivist dd44d9b174 Adjust logging levels: log individual request failures only at WARNING and cancelled tasks at ERROR level 4 years ago
  JustAnotherArchivist 820384fe1e Stop deduping small responses 4 years ago
  JustAnotherArchivist 91035d769c Catch exceptions in Item.process and mark the items as errors instead of crashing 4 years ago
  JustAnotherArchivist 69984765b3 Fix taskType typo silencing cancellation warnings 4 years ago
  JustAnotherArchivist 461cedbbde Avoid temporary files created by warcio due to not knowing the record payload length 4 years ago