Commit

throw error on empty URL extractions
emcf committed Sep 9, 2024
1 parent 3d680df commit ccfe50a
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions thepipe/scraper.py
@@ -541,6 +541,9 @@ def scrape_url(url: str, text_only: bool = False, ai_extraction: bool = False, v
     else:
         chunk = extract_page_content(url=url, text_only=text_only, verbose=verbose)
         chunks = chunking_method([chunk])
+        # if no text or images were extracted, return error
+        if not any(chunk.texts for chunk in chunks) and not any(chunk.images for chunk in chunks):
+            raise ValueError("No content extracted from URL.")
         return chunks

 def format_timestamp(seconds, chunk_index, chunk_duration):
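
The added check can be tried in isolation. The following is a minimal sketch, not the library's code: `Chunk` here is a hypothetical stand-in that models only the `texts` and `images` attributes the diff reads (both assumed to be lists), and `ensure_content` is an illustrative helper name that does not appear in the commit.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Chunk:
    # Hypothetical stand-in for the real chunk type; only the two
    # attributes read by the new check are modeled, both as lists.
    texts: List[str] = field(default_factory=list)
    images: List[object] = field(default_factory=list)


def ensure_content(chunks: List[Chunk]) -> List[Chunk]:
    # Same condition as the three added lines: raise only when no chunk
    # carries any text and no chunk carries any image.
    if not any(chunk.texts for chunk in chunks) and not any(chunk.images for chunk in chunks):
        raise ValueError("No content extracted from URL.")
    return chunks


if __name__ == "__main__":
    # A page that produced text passes through unchanged.
    print(len(ensure_content([Chunk(texts=["hello"])])))

    # A page that produced nothing now fails loudly instead of
    # returning empty chunks to the caller.
    try:
        ensure_content([Chunk()])
    except ValueError as err:
        print(f"rejected: {err}")
```

Under these assumptions an empty list of chunks is rejected as well, since `any()` over an empty iterable is `False`. The check tests list truthiness only, so a chunk whose `texts` is `[""]` would still pass.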
