
Exclude backslashes in channel patterns

master
JustAnotherArchivist 1 year ago
parent
commit
98adc6cfac
1 changed file with 2 additions and 2 deletions
youtube-extract (+2, -2)

@@ -53,10 +53,10 @@ noisePattern = '|'.join([
 ])

 channelPattern = '|'.join([
-r'''/www\.youtube\.com/c/[^/?&=."'>\s]+''',
+r'''/www\.youtube\.com/c/[^/?&=."'>\\\s]+''',
 r'/www\.youtube\.com/user/[A-Za-z0-9]{1,20}',
 r'/www\.youtube\.com/channel/UC[0-9A-Za-z_-]{22}',
-r'''/www\.youtube\.com/[^/?&=."'>\s]+(?=/?(\s|["'>]|$))''',
+r'''/www\.youtube\.com/[^/?&=."'>\\\s]+(?=/?(\s|\\?["'>]|$))''',
 ])

 # Make sure that the last 11 chars of the match are always the video ID (because Python's re doesn't support \K).
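
The effect of the two changed patterns can be illustrated with a small sketch. It assumes the input contains channel links inside strings whose quotes are backslash-escaped (for example HTML embedded in JSON). The commit itself does not state this motivation, so treat it as a guess. The sample strings and the helper `first_match` are hypothetical and not part of youtube-extract; only the four pattern strings are copied from the diff.

```python
import re

# Patterns copied from the diff: the old and new versions of the two changed lines.
OLD_C = r'''/www\.youtube\.com/c/[^/?&=."'>\s]+'''
NEW_C = r'''/www\.youtube\.com/c/[^/?&=."'>\\\s]+'''
OLD_BARE = r'''/www\.youtube\.com/[^/?&=."'>\s]+(?=/?(\s|["'>]|$))'''
NEW_BARE = r'''/www\.youtube\.com/[^/?&=."'>\\\s]+(?=/?(\s|\\?["'>]|$))'''

# Hypothetical inputs (assumption): links followed by a backslash-escaped quote.
SAMPLE_C = r'{"html": "<a href=\"https://www.youtube.com/c/SomeChannel\">x</a>"}'
SAMPLE_BARE = r'{"html": "<a href=\"https://www.youtube.com/SomeChannel\">x</a>"}'


def first_match(pattern, text):
    """Hypothetical helper: return the first match of pattern in text, or None."""
    m = re.search(pattern, text)
    return m.group(0) if m else None


# Old patterns swallow the escaping backslash as part of the channel name.
print(first_match(OLD_C, SAMPLE_C))        # /www.youtube.com/c/SomeChannel\
print(first_match(OLD_BARE, SAMPLE_BARE))  # /www.youtube.com/SomeChannel\

# New patterns stop before the backslash; the lookahead in NEW_BARE still
# accepts the quote because it allows an optional backslash in front of it.
print(first_match(NEW_C, SAMPLE_C))        # /www.youtube.com/c/SomeChannel
print(first_match(NEW_BARE, SAMPLE_BARE))  # /www.youtube.com/SomeChannel
```

In a raw string, `\\\s` inside the character class is read by `re` as an escaped backslash followed by `\s`, so the class excludes both a literal backslash and whitespace.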

