feat: adding new article on gemini citations #1186

Merged
ivanleomk merged 2 commits into main from add-rag-citations on Nov 15, 2024

Conversation

@ivanleomk (Collaborator) commented Nov 15, 2024

Important

Adds a new blog post on generating PDF citations using Google's Gemini model with setup instructions and code examples.

  • New Blog Post:
    • Adds generating-pdf-citations.md to docs/blog/posts/.
    • Discusses using Google's Gemini model with Instructor for generating PDF citations.
    • Provides code examples for environment setup, data models, Gemini client initialization, PDF processing, and citation highlighting.
  • Structured Outputs:
    • Explains benefits of using Pydantic for structured outputs, including ease of definition, robust validation, and separation of concerns.

This description was created by Ellipsis for b2eb911. It will automatically update as commits are pushed.

@github-actions bot added labels on Nov 15, 2024: documentation (Improvements or additions to documentation), enhancement (New feature or request), size:L (This PR changes 100-499 lines, ignoring generated files).
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

❌ Changes requested. Reviewed everything up to 4c57253 in 52 seconds

More details
  • Looked at 173 lines of code in 1 file.
  • Skipped 1 file when reviewing.
  • Skipped posting 4 drafted comments based on config settings.
1. docs/blog/posts/generating-pdf-citations.md:46
  • Draft comment:
    The import statement for pymupdf should be import fitz, because PyMuPDF is imported under the name fitz. This applies to every place pymupdf is used in the code.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment addresses a potential issue with the import statement that could lead to runtime errors. If pymupdf is indeed supposed to be imported as fitz, then the code would not work as intended. This is a valid concern that needs to be addressed to ensure the code functions correctly.
    I might be assuming that the import statement is incorrect without verifying if pymupdf can be used directly. It's possible that the library has been updated to allow direct import as pymupdf.
    I should verify if pymupdf can be used directly or if it must be imported as fitz. This will confirm whether the comment is valid.
    Verify if pymupdf should be imported as fitz to determine if the comment is valid and should be kept.
2. docs/blog/posts/generating-pdf-citations.md:89
  • Draft comment:
    Consider adding error handling for the file upload process to manage potential issues such as timeouts or failures.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The code for uploading and processing the PDF file is missing error handling. This could lead to issues if the file upload fails or takes too long.
3. docs/blog/posts/generating-pdf-citations.md:137
  • Draft comment:
    Add a check to ensure citation.page_number - 1 is within the valid range of pages in the document to avoid potential errors.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The code uses doc.load_page(citation.page_number - 1) which assumes that the page number is always valid. There should be a check to ensure the page number is within the valid range of the document.
4. docs/blog/posts/generating-pdf-citations.md:1
  • Draft comment:
    Ensure this new markdown file is added to the mkdocs.yml to be included in the documentation.
  • Reason this comment was not posted:
    Confidence changes required: 80%
    The new markdown file should be added to the mkdocs.yml for documentation purposes.
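
The page-range check suggested in draft comment 3 could look roughly like the sketch below. This is a minimal illustration, not the code from the blog post: the helper name page_index is hypothetical, and it assumes citations carry 1-based page numbers while the PDF library expects 0-based indices, as the expression citation.page_number - 1 implies.

```python
def page_index(page_number: int, page_count: int) -> int:
    """Convert a 1-based page number into a 0-based index, rejecting out-of-range values.

    `page_index` is a hypothetical helper, not part of the reviewed post.
    """
    if not 1 <= page_number <= page_count:
        raise ValueError(
            f"page_number {page_number} is outside the valid range 1..{page_count}"
        )
    return page_number - 1
```

Under those assumptions, the call site would become something like doc.load_page(page_index(citation.page_number, doc.page_count)), so that a bad page number from the model produces a clear error instead of a failure inside the PDF library.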

Workflow ID: wflow_Ksl5XPSUmYgu7x1f


# Wait for file to finish processing
while file.state != File.State.ACTIVE:
    time.sleep(1)
Contributor

Add import time at the top of the script; without it, the call to time.sleep(1) raises a NameError.
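
Beyond the missing import, a polling loop like the one quoted above needs to re-read the file state on each pass and stop eventually. The sketch below shows one way to do that. It is an illustration under stated assumptions, not the post's code: wait_until_active and its get_state callback are hypothetical names, and the state values "ACTIVE" and "FAILED" are plain strings standing in for whatever the real client returns.

```python
import time
from typing import Callable


def wait_until_active(
    get_state: Callable[[], str],
    timeout_s: float = 60.0,
    interval_s: float = 1.0,
) -> None:
    """Poll get_state() until it reports "ACTIVE", failing on "FAILED" or timeout.

    Hypothetical helper: the state names are placeholders for the real client's values.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()  # re-fetch the state on every pass
        if state == "ACTIVE":
            return
        if state == "FAILED":
            raise RuntimeError("file processing failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"file still in state {state!r} after {timeout_s}s")
        time.sleep(interval_s)
```

With the Gemini client, get_state would presumably wrap a call that re-fetches the uploaded file (for example via genai.get_file(file.name)) and returns its current state. That wiring is an assumption and is not shown in this hunk.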

cloudflare-workers-and-pages bot commented Nov 15, 2024

Deploying instructor-py with Cloudflare Pages

Latest commit: b2eb911
Status: ✅  Deploy successful!
Preview URL: https://66c60f52.instructor-py.pages.dev
Branch Preview URL: https://add-rag-citations.instructor-py.pages.dev

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

❌ Changes requested. Incremental review on b2eb911 in 18 seconds

More details
  • Looked at 15 lines of code in 1 file.
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comment based on config settings.
1. docs/blog/posts/generating-pdf-citations.md:45
  • Draft comment:
    Removing Field from the import statement is correct, since it is not used in the code. Remove unused imports to keep the code clean.
  • Reason this comment was not posted:
    Confidence changes required: 10%
    Field was imported from Pydantic but never used, so removing the import is a good cleanup; it is noted here only for clarity.

Workflow ID: wflow_a9py1Wn9qY3lmFct


Review thread on docs/blog/posts/generating-pdf-citations.md: resolved
@ivanleomk ivanleomk merged commit 3f371ab into main Nov 15, 2024
14 of 15 checks passed
@ivanleomk ivanleomk deleted the add-rag-citations branch November 15, 2024 04:32