Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FAQ rewriting pipeline #191

Open
wants to merge 20 commits into
base: main
Choose a base branch
from
Open

FAQ rewriting pipeline #191

wants to merge 20 commits into from

Conversation

cremertim
Copy link

@cremertim cremertim commented Jan 16, 2025

New Features

  • Added new Pipeline for rephrasing string content

Summary by CodeRabbit

Release Notes

  • New Features

    • Added a rewriting pipeline functionality to improve and proofread text.
    • Introduced a new API endpoint for triggering text rewriting.
    • Implemented a new system prompt for AI-driven FAQ text refinement.
    • Expanded the set of defined pipeline types with a new enumeration member.
    • Added new classes to manage rewriting execution and status updates.
  • Improvements

    • Enhanced text processing capabilities with intelligent proofreading.
    • Added support for maintaining original text meaning and markdown formatting.
  • Technical Updates

    • Implemented robust error handling for new pipeline execution.
    • Introduced new data transfer objects to support rewriting functionality.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (10)
app/pipeline/rephrasing_pipeline.py (2)

51-61: Avoid variable shadowing of 'prompt' parameter in __call__ method

The parameter prompt in the __call__ method is being shadowed by a local variable assignment. This can lead to confusion and potential bugs. Consider renaming the local variable to avoid shadowing.

Apply this diff to rename the local variables:

         if not dto.to_be_rephrased:
             raise ValueError("You need to provide a text to rephrase")

-        prompt = system_prompt_faq.format(
+        formatted_prompt = system_prompt_faq.format(
             rephrased_text=dto.to_be_rephrased,
         )
-        prompt = PyrisMessage(
+        formatted_message = PyrisMessage(
             sender=IrisMessageRole.SYSTEM,
-            contents=[TextMessageContentDTO(text_content=prompt)],
+            contents=[TextMessageContentDTO(text_content=formatted_prompt)],
         )

         response = self.request_handler.chat(
-            [prompt], CompletionArguments(temperature=0.4)
+            [formatted_message], CompletionArguments(temperature=0.4)
         )

67-67: Avoid logging potentially sensitive data

Logging the final rephrased text may expose sensitive information in the logs. Consider removing this log statement or ensuring that sensitive data is handled appropriately according to your application's privacy policies.

Apply this diff to modify the log statement:

-        logging.info(f"Final rephrased text: {final_result}")
+        logging.info("Rephrasing completed successfully.")
app/domain/status/rephrasing_status_update_dto.py (1)

1-1: Remove unused import Competency

The import Competency is not used in this file. Removing it will clean up the code.

Apply this diff to remove the unused import:

-from app.domain.data.competency_dto import Competency
🧰 Tools
🪛 Ruff (0.8.2)

1-1: app.domain.data.competency_dto.Competency imported but unused

Remove unused import: app.domain.data.competency_dto.Competency

(F401)

app/domain/rephrasing_pipeline_execution_dto.py (2)

1-1: Remove unused import List

The import List from the typing module is not used in this file. Removing it will clean up the code.

Apply this diff to remove the unused import:

-from typing import List
🧰 Tools
🪛 Ruff (0.8.2)

1-1: typing.List imported but unused

Remove unused import: typing.List

(F401)


6-6: Remove unused imports CompetencyTaxonomy and Competency

The imports CompetencyTaxonomy and Competency are not used in this file. Removing them will improve code maintainability.

Apply this diff to remove the unused imports:

-from .data.competency_dto import CompetencyTaxonomy, Competency
🧰 Tools
🪛 Ruff (0.8.2)

6-6: .data.competency_dto.CompetencyTaxonomy imported but unused

Remove unused import

(F401)


6-6: .data.competency_dto.Competency imported but unused

Remove unused import

(F401)

app/pipeline/prompts/faq_rephrasal.py (3)

1-2: Fix grammatical error in the prompt.

Change "a excellent" to "an excellent" to fix the grammar.

-You are a excellent tutor with expertise in computer science and practical applications teaching an university course.
+You are an excellent tutor with expertise in computer science and practical applications teaching a university course.

12-12: Review placement of the {rephrased_text} placeholder.

The {rephrased_text} placeholder appears in the middle of the instructions, which could lead to confusion. Consider moving it after all instructions to maintain a clear separation between instructions and input.


10-10: Enhance markdown formatting instructions.

The current instruction "keep the markdown formatting intact" is vague. Consider specifying which markdown elements should be preserved (e.g., headers, lists, code blocks, etc.) and how to handle them.

app/web/status/status_update.py (1)

278-296: Consider adjusting the stage weight.

The stage weight of 10 seems low compared to other pipelines (e.g., CourseChatStatusCallback uses 40). Consider increasing the weight to better reflect the computational effort of the rephrasing task.

             StageDTO(
-                weight=10,
+                weight=30,
                 state=StageStateEnum.NOT_STARTED,
                 name="Generating Rephrasing",
             )
app/web/routers/pipelines.py (1)

227-234: Remove TODO-style comments.

The comments indicating "Replace with actual..." should be removed as they are no longer needed - the actual implementations are already in place.

-        # Replace with actual Callback class
         callback = RephrasingCallback(
             run_id=dto.execution.settings.authentication_token,
             base_url=dto.execution.settings.artemis_base_url,
             initial_stages=dto.execution.initial_stages,
         )
-        #Replace with actual pipeline RefrasingPipeline
         pipeline = RephrasingPipeline(callback=callback)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ddd77de and 3a7dbd9.

⛔ Files ignored due to path filters (1)
  • docker/pyris-dev.yml is excluded by !**/*.yml
📒 Files selected for processing (7)
  • app/common/PipelineEnum.py (1 hunks)
  • app/domain/rephrasing_pipeline_execution_dto.py (1 hunks)
  • app/domain/status/rephrasing_status_update_dto.py (1 hunks)
  • app/pipeline/prompts/faq_rephrasal.py (1 hunks)
  • app/pipeline/rephrasing_pipeline.py (1 hunks)
  • app/web/routers/pipelines.py (4 hunks)
  • app/web/status/status_update.py (2 hunks)
🧰 Additional context used
🪛 Ruff (0.8.2)
app/domain/rephrasing_pipeline_execution_dto.py

1-1: typing.List imported but unused

Remove unused import: typing.List

(F401)


6-6: .data.competency_dto.CompetencyTaxonomy imported but unused

Remove unused import

(F401)


6-6: .data.competency_dto.Competency imported but unused

Remove unused import

(F401)

app/domain/status/rephrasing_status_update_dto.py

1-1: app.domain.data.competency_dto.Competency imported but unused

Remove unused import: app.domain.data.competency_dto.Competency

(F401)

🔇 Additional comments (1)
app/common/PipelineEnum.py (1)

17-17: LGTM!

The new enum member follows the established naming convention and is correctly placed before the NOT_SET enum.

app/web/routers/pipelines.py Outdated Show resolved Hide resolved
app/web/routers/pipelines.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (8)
app/pipeline/rewriting_pipeline.py (1)

41-46: Remove unused parameters in __call__ method

The parameters prompt and **kwargs in the __call__ method are not used. Removing them can simplify the method signature.

Apply this diff to remove the unused parameters:

 def __call__(
     self,
     dto: RewritingPipelineExecutionDTO,
-    prompt: Optional[ChatPromptTemplate] = None,
-    **kwargs,
 ):
🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

app/domain/status/rewriting_status_update_dto.py (1)

1-1: Remove unused import Competency

The import Competency from app.domain.data.competency_dto is not used and can be safely removed.

Apply this diff to remove the unused import:

-from app.domain.data.competency_dto import Competency
🧰 Tools
🪛 Ruff (0.8.2)

1-1: app.domain.data.competency_dto.Competency imported but unused

Remove unused import: app.domain.data.competency_dto.Competency

(F401)

app/domain/rewriting_pipeline_execution_dto.py (2)

1-1: Remove unused import List

The import List from typing is not used and can be removed.

Apply this diff to remove the unused import:

-from typing import List
🧰 Tools
🪛 Ruff (0.8.2)

1-1: typing.List imported but unused

Remove unused import: typing.List

(F401)

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black


6-6: Remove unused imports CompetencyTaxonomy and Competency

The imports CompetencyTaxonomy and Competency from .data.competency_dto are not used and can be removed.

Apply this diff to remove the unused imports:

-from .data.competency_dto import CompetencyTaxonomy, Competency
🧰 Tools
🪛 Ruff (0.8.2)

6-6: .data.competency_dto.CompetencyTaxonomy imported but unused

Remove unused import

(F401)


6-6: .data.competency_dto.Competency imported but unused

Remove unused import

(F401)

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

app/pipeline/prompts/faq_rewriting.py (2)

2-2: Fix grammar in role description.

Change "a excellent" to "an excellent" for correct grammar.

-You are a excellent tutor with expertise in computer science and practical applications teaching an university course.
+You are an excellent tutor with expertise in computer science and practical applications teaching a university course.
🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black


4-10: Ensure consistent punctuation in numbered list.

The numbered list has inconsistent punctuation at the end of each item. Some end with periods while others don't.

-1. Correct all spelling and grammatical errors.
-2. Ensure the text is written in simple and clear language, making it easy to understand for students.
-3. Preserve the original meaning and intent of the text.
-4. Ensure that the response is always written in complete sentences. If you are given a list of bullet points, convert them into complete sentences.
-5. Make sure to use the original language of the input text
-6. Avoid repeating any information that is already present in the text.
-7. Make sure to keep the markdown formatting intact and add formatting for the most important information
+1. Correct all spelling and grammatical errors.
+2. Ensure the text is written in simple and clear language, making it easy to understand for students.
+3. Preserve the original meaning and intent of the text.
+4. Ensure that the response is always written in complete sentences. If you are given a list of bullet points, convert them into complete sentences.
+5. Make sure to use the original language of the input text.
+6. Avoid repeating any information that is already present in the text.
+7. Make sure to keep the markdown formatting intact and add formatting for the most important information.
🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

app/web/routers/pipelines.py (2)

233-234: Fix typo and comment style.

The comment has a typo ("RefrasingPipeline" instead of "RewritingPipeline") and uses inconsistent comment style.

-        #Replace with actual pipeline RefrasingPipeline
+        # Replace with actual pipeline RewritingPipeline
🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black


262-262: Remove unnecessary empty lines.

Multiple consecutive empty lines can be reduced to a single empty line.

@router.post(
    "/rewriting/{variant}/run",
    status_code=status.HTTP_202_ACCEPTED,
    dependencies=[Depends(TokenValidator())],
)
-

def run_rewriting_pipeline(
-

@router.get("/{feature}/variants")

Also applies to: 277-277

🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a7dbd9 and 8b7a7c9.

📒 Files selected for processing (7)
  • app/common/PipelineEnum.py (1 hunks)
  • app/domain/rewriting_pipeline_execution_dto.py (1 hunks)
  • app/domain/status/rewriting_status_update_dto.py (1 hunks)
  • app/pipeline/prompts/faq_rewriting.py (1 hunks)
  • app/pipeline/rewriting_pipeline.py (1 hunks)
  • app/web/routers/pipelines.py (4 hunks)
  • app/web/status/status_update.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • app/common/PipelineEnum.py
  • app/web/status/status_update.py
🧰 Additional context used
🪛 Ruff (0.8.2)
app/domain/rewriting_pipeline_execution_dto.py

1-1: typing.List imported but unused

Remove unused import: typing.List

(F401)


6-6: .data.competency_dto.CompetencyTaxonomy imported but unused

Remove unused import

(F401)


6-6: .data.competency_dto.Competency imported but unused

Remove unused import

(F401)

app/domain/status/rewriting_status_update_dto.py

1-1: app.domain.data.competency_dto.Competency imported but unused

Remove unused import: app.domain.data.competency_dto.Competency

(F401)

🪛 GitHub Actions: Run linters
app/domain/rewriting_pipeline_execution_dto.py

[warning] File needs to be reformatted using Black

app/pipeline/prompts/faq_rewriting.py

[warning] File needs to be reformatted using Black

app/pipeline/rewriting_pipeline.py

[warning] File needs to be reformatted using Black

app/web/routers/pipelines.py

[warning] File needs to be reformatted using Black

⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: Build and Push Docker Image
  • GitHub Check: Mend Security Check
🔇 Additional comments (3)
app/web/routers/pipelines.py (3)

244-244: Fix incorrect error message.

The error message refers to "extraction" which is incorrect for a rewriting pipeline.

-        logger.error(f"Error running rewriting extraction pipeline: {e}")
+        logger.error(f"Error running rewriting pipeline: {e}")
🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black


271-271: Avoid logging entire DTO object.

Logging the entire DTO object could expose sensitive information.

-    logger.info(f"Rewriting pipeline started with variant: {variant} and dto: {dto}")
+    logger.info(f"Rewriting pipeline started with variant: {variant}")
🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black


Line range hint 1-349: Format file using Black.

The pipeline indicates that this file needs to be reformatted using Black.

Run Black formatter on this file:

🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

app/pipeline/rewriting_pipeline.py Outdated Show resolved Hide resolved
app/pipeline/rewriting_pipeline.py Outdated Show resolved Hide resolved
app/pipeline/rewriting_pipeline.py Show resolved Hide resolved
app/domain/rewriting_pipeline_execution_dto.py Outdated Show resolved Hide resolved
app/pipeline/prompts/faq_rewriting.py Outdated Show resolved Hide resolved
app/web/routers/pipelines.py Outdated Show resolved Hide resolved
app/web/routers/pipelines.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (1)
app/web/routers/pipelines.py (1)

Tests missing for pipeline implementations

The rewriting pipeline implementation follows the established patterns and integrates correctly with the existing infrastructure. However, there appears to be no test coverage for any of the pipeline implementations, which is a significant concern. Consider adding tests for:

  • Pipeline initialization and execution flows
  • Error handling and callback behaviors
  • FastAPI endpoint integration
  • DTO validation and processing
🔗 Analysis chain

Line range hint 223-347: Verify integration with existing pipeline infrastructure.

The rewriting pipeline implementation follows the established patterns and integrates well with the existing infrastructure. However, consider adding tests to verify:

  1. Pipeline initialization and execution
  2. Error handling and callback behavior
  3. Integration with the FastAPI router

Run this script to check for existing tests:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for test coverage of the rewriting pipeline implementation

# Look for test files related to the rewriting pipeline
rg -l "test.*rewriting.*pipeline" -g "test_*.py"

# Check for test cases using the new endpoint
rg -A 5 "test.*rewriting.*run" -g "test_*.py"

Length of output: 396


Script:

#!/bin/bash
# First, find all test files and directories
echo "=== Finding test files ==="
fd -t f "test_.*\.py$"

echo -e "\n=== Searching for rewriting pipeline tests ==="
# Search for rewriting-related tests with broader context
fd -t f "test_.*\.py$" --exec rg -l "rewriting|RewritingPipeline"

echo -e "\n=== Checking test directories structure ==="
# List directories containing test files
fd -t f "test_.*\.py$" -d 3 --exec dirname {} | sort -u

Length of output: 427


Script:

#!/bin/bash
# Look for any test-related Python files
echo "=== Finding any test-related Python files ==="
fd ".*test.*\.py$"

echo -e "\n=== Looking for FastAPI test client usage ==="
rg "TestClient|pytest" -l -g "*.py"

echo -e "\n=== Finding pipeline-related files ==="
fd "pipeline" -e py

Length of output: 1690

🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

🧹 Nitpick comments (2)
app/web/routers/pipelines.py (2)

275-275: Fix formatting: Remove extra blank line.

The Black formatter indicates that this file needs reformatting. Remove the extra blank line to comply with the formatting rules.

🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black


338-338: Fix formatting: Remove extra blank line.

The Black formatter indicates that this file needs reformatting. Remove the extra blank line to comply with the formatting rules.

🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b7a7c9 and 2ff6d82.

📒 Files selected for processing (1)
  • app/web/routers/pipelines.py (4 hunks)
🧰 Additional context used
🪛 GitHub Actions: Run linters
app/web/routers/pipelines.py

[warning] File needs to be reformatted using Black

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Push Docker Image
🔇 Additional comments (4)
app/web/routers/pipelines.py (4)

14-14: LGTM: Import statements are properly organized.

The new imports for the rewriting pipeline components are correctly placed and grouped with related imports.

Also applies to: 20-20, 25-25

🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black


223-245: Fix incorrect error message in logging statement.

The error message incorrectly mentions "extraction" which seems to be a copy-paste error.

-        logger.error(f"Error running rewriting extraction pipeline: {e}")
+        logger.error(f"Error running rewriting pipeline: {e}")
🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black


269-269: Avoid logging entire DTO object.

Logging the entire DTO object could potentially expose sensitive information.

-    logger.info(f"Rewriting pipeline started with variant: {variant} and dto: {dto}")
+    logger.info(f"Rewriting pipeline started with variant: {variant}")
🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black


339-346: LGTM: Pipeline variant implementation is consistent.

The implementation of the rewriting pipeline variant follows the established pattern and provides appropriate metadata.

🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

@FelixTJDietrich FelixTJDietrich changed the title Basic rephrasing FAQ rewriting pipeline Jan 17, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (3)
app/pipeline/rewriting_pipeline.py (2)

61-62: ⚠️ Potential issue

Add error handling for empty response content.

Accessing response.contents[0] without checking if response.contents is not empty may raise an IndexError.

Apply this diff to handle empty response content:

-        response = response.contents[0].text_content
+        if response.contents:
+            response = response.contents[0].text_content
+        else:
+            raise ValueError("Empty response from LLM")

63-63: ⚠️ Potential issue

Change logging level for sensitive information.

Logging the rewritten text at INFO level may expose sensitive information.

Apply this diff to adjust the logging level:

-        logging.info(f"Final rewritten text: {final_result}")
+        logging.debug(f"Final rewritten text: {final_result}")
app/web/routers/pipelines.py (1)

266-266: ⚠️ Potential issue

Avoid logging entire DTO object.

Logging the entire DTO object could expose sensitive information.

Apply this diff to log only necessary information:

-    logger.info(f"Rewriting pipeline started with variant: {variant} and dto: {dto}")
+    logger.info(f"Rewriting pipeline started with variant: {variant}")
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ff6d82 and 79e2dc2.

📒 Files selected for processing (8)
  • app/domain/rewriting_pipeline_execution_dto.py (1 hunks)
  • app/pipeline/chat/course_chat_pipeline.py (1 hunks)
  • app/pipeline/chat/exercise_chat_agent_pipeline.py (1 hunks)
  • app/pipeline/prompts/faq_rewriting.py (1 hunks)
  • app/pipeline/rewriting_pipeline.py (1 hunks)
  • app/pipeline/shared/citation_pipeline.py (1 hunks)
  • app/web/routers/pipelines.py (4 hunks)
  • app/web/status/status_update.py (2 hunks)
✅ Files skipped from review due to trivial changes (3)
  • app/pipeline/shared/citation_pipeline.py
  • app/pipeline/chat/course_chat_pipeline.py
  • app/pipeline/chat/exercise_chat_agent_pipeline.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/web/status/status_update.py
🧰 Additional context used
🪛 GitHub Actions: Run linters
app/pipeline/prompts/faq_rewriting.py

[warning] 2-2: Line too long (207 characters, limit is 120)


[warning] 7-7: Line too long (148 characters, limit is 120)


[warning] 14-14: Line too long (203 characters, limit is 120)

app/domain/rewriting_pipeline_execution_dto.py

[warning] 1-1: Unused import: 'typing.List' is imported but never used


[warning] 6-6: Unused import: '.data.competency_dto.CompetencyTaxonomy' is imported but never used


[warning] 6-6: Unused import: '.data.competency_dto.Competency' is imported but never used

🪛 Ruff (0.8.2)
app/domain/rewriting_pipeline_execution_dto.py

1-1: typing.List imported but unused

Remove unused import: typing.List

(F401)


6-6: .data.competency_dto.CompetencyTaxonomy imported but unused

Remove unused import

(F401)


6-6: .data.competency_dto.Competency imported but unused

Remove unused import

(F401)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Push Docker Image
🔇 Additional comments (5)
app/domain/rewriting_pipeline_execution_dto.py (1)

9-11: LGTM! Clean and well-structured DTO.

The DTO class is well-designed with:

  • Clear inheritance from BaseModel
  • Proper use of Field for JSON alias mapping
  • Appropriate type hints
app/pipeline/prompts/faq_rewriting.py (1)

1-15: LGTM! Well-structured prompt with clear instructions.

The prompt provides clear and comprehensive instructions for the AI model with:

  • Specific tasks enumerated
  • Emphasis on maintaining original meaning
  • Clear formatting requirements
🧰 Tools
🪛 GitHub Actions: Run linters

[warning] 2-2: Line too long (207 characters, limit is 120)


[warning] 7-7: Line too long (148 characters, limit is 120)


[warning] 14-14: Line too long (203 characters, limit is 120)

app/pipeline/rewriting_pipeline.py (1)

27-38: LGTM! Well-structured constructor with appropriate initialization.

The constructor properly:

  • Calls the parent constructor
  • Initializes the callback
  • Sets up the request handler with specific requirements
app/web/routers/pipelines.py (2)

260-269: LGTM! Well-structured endpoint with proper authentication.

The endpoint implementation:

  • Uses proper HTTP status codes
  • Includes token validation
  • Follows the pattern of other pipeline endpoints

241-241: ⚠️ Potential issue

Fix incorrect error message.

The error message refers to "extraction" which is incorrect for the rewriting pipeline.

Apply this diff to fix the error message:

-        logger.error(f"Error running rewriting extraction pipeline: {e}")
+        logger.error(f"Error running rewriting pipeline: {e}")

Likely invalid or redundant comment.

app/domain/rewriting_pipeline_execution_dto.py Outdated Show resolved Hide resolved
app/pipeline/prompts/faq_rewriting.py Outdated Show resolved Hide resolved
app/pipeline/rewriting_pipeline.py Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
app/pipeline/rewriting_pipeline.py (1)

38-43: Add return type hint to call method.

The method is missing a return type annotation.

     def __call__(
         self,
         dto: RewritingPipelineExecutionDTO,
         prompt: Optional[ChatPromptTemplate] = None,
         **kwargs,
-    ):
+    ) -> str:
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79e2dc2 and 0715367.

📒 Files selected for processing (3)
  • app/domain/rewriting_pipeline_execution_dto.py (1 hunks)
  • app/pipeline/prompts/faq_rewriting.py (1 hunks)
  • app/pipeline/rewriting_pipeline.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • app/domain/rewriting_pipeline_execution_dto.py
  • app/pipeline/prompts/faq_rewriting.py
🧰 Additional context used
🪛 Ruff (0.8.2)
app/pipeline/rewriting_pipeline.py

12-12: app.domain.data.competency_dto.Competency imported but unused

Remove unused import: app.domain.data.competency_dto.Competency

(F401)

🪛 GitHub Actions: Run linters
app/pipeline/rewriting_pipeline.py

[error] 12-12: Unused import: 'app.domain.data.competency_dto.Competency' imported but unused

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Push Docker Image
🔇 Additional comments (4)
app/pipeline/rewriting_pipeline.py (4)

32-33: Verify GPT model requirements.

The GPT version (4.5) and context length (16385) specifications seem unusual:

  • GPT-4.5 is not a standard version
  • Context length is typically a power of 2 (e.g., 16384)

Please verify these requirements match the intended model specifications.


60-60: Add validation for response contents.

Direct access to response.contents[0] could raise an IndexError if the response is empty.


62-62: Adjust logging level for sensitive content.

Logging the rewritten text at INFO level could expose sensitive information.


25-25: ⚠️ Potential issue

Remove unused output_parser declaration.

The output_parser is declared but never used in the code.

-    output_parser: PydanticOutputParser

Likely invalid or redundant comment.

app/pipeline/rewriting_pipeline.py Show resolved Hide resolved
app/pipeline/rewriting_pipeline.py Outdated Show resolved Hide resolved
coderabbitai[bot]
coderabbitai bot previously approved these changes Jan 21, 2025
@ls1intum ls1intum deleted a comment from coderabbitai bot Jan 21, 2025
coderabbitai[bot]
coderabbitai bot previously approved these changes Jan 21, 2025
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (5)
app/pipeline/prompts/faq_rewriting.py (5)

1-4: Add type annotation for better type safety.

Consider adding a type annotation to the variable declaration:

-system_prompt_faq = """:
+system_prompt_faq: str = """:
🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter


9-9: Capitalize proper noun "german".

Proper nouns should be capitalized:

-5. Original Language: Use the same language as the input text. The input text will be either german or english.
+5. Original Language: Use the same language as the input text. The input text will be either German or English.
🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter


5-12: Consider adding examples for each guideline.

To make the guidelines more actionable, consider adding brief examples for each point. This would help ensure consistent output across different model responses.

🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter


13-14: Specify the criteria for "short inputs".

The current instruction for handling short inputs is ambiguous. Consider defining specific criteria (e.g., minimum character count or sentence structure) to ensure consistent handling.

🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter


26-28: Consider using unique markers and specify string formatting method.

Two suggestions for improvement:

  1. Use more unique markers to avoid potential conflicts with markdown content (e.g., ###PYRIS_START###).
  2. Document the expected string formatting method:
# Example usage:
prompt = system_prompt_faq.format(rewritten_text="input text")
🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76e5c9b and 402c65c.

📒 Files selected for processing (1)
  • app/pipeline/prompts/faq_rewriting.py (1 hunks)
🧰 Additional context used
🪛 GitHub Actions: Run linters
app/pipeline/prompts/faq_rewriting.py

[warning] File needs to be reformatted using Black code formatter

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: Build and Push Docker Image
🔇 Additional comments (2)
app/pipeline/prompts/faq_rewriting.py (2)

16-16: Remove trailing whitespace.

Remove the trailing whitespace at the end of line 16.

🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter


1-30: Black formatting is needed.

The file requires formatting with Black to maintain consistent code style.

🧰 Tools
🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter

coderabbitai[bot]
coderabbitai bot previously approved these changes Jan 22, 2025
coderabbitai[bot]
coderabbitai bot previously approved these changes Jan 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants