FAQ rewriting pipeline #191

cremertim · 2025-01-16T16:10:36Z

New Features

Added new Pipeline for rephrasing string content

Summary by CodeRabbit

Release Notes

New Features
- Added a rewriting pipeline functionality to improve and proofread text.
- Introduced a new API endpoint for triggering text rewriting.
- Implemented a new system prompt for AI-driven FAQ text refinement.
- Expanded the set of defined pipeline types with a new enumeration member.
- Added new classes to manage rewriting execution and status updates.
Improvements
- Enhanced text processing capabilities with intelligent proofreading.
- Added support for maintaining original text meaning and markdown formatting.
Technical Updates
- Implemented robust error handling for new pipeline execution.
- Introduced new data transfer objects to support rewriting functionality.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (10)

app/pipeline/rephrasing_pipeline.py (2)
51-61: Avoid variable shadowing of 'prompt' parameter in __call__ method

The parameter prompt in the __call__ method is being shadowed by a local variable assignment. This can lead to confusion and potential bugs. Consider renaming the local variable to avoid shadowing.

Apply this diff to rename the local variables:
         if not dto.to_be_rephrased:
             raise ValueError("You need to provide a text to rephrase")

-        prompt = system_prompt_faq.format(
+        formatted_prompt = system_prompt_faq.format(
             rephrased_text=dto.to_be_rephrased,
         )
-        prompt = PyrisMessage(
+        formatted_message = PyrisMessage(
             sender=IrisMessageRole.SYSTEM,
-            contents=[TextMessageContentDTO(text_content=prompt)],
+            contents=[TextMessageContentDTO(text_content=formatted_prompt)],
         )

         response = self.request_handler.chat(
-            [prompt], CompletionArguments(temperature=0.4)
+            [formatted_message], CompletionArguments(temperature=0.4)
         )
67-67: Avoid logging potentially sensitive data

Logging the final rephrased text may expose sensitive information in the logs. Consider removing this log statement or ensuring that sensitive data is handled appropriately according to your application's privacy policies.

Apply this diff to modify the log statement:
-        logging.info(f"Final rephrased text: {final_result}")
+        logging.info("Rephrasing completed successfully.")
app/domain/status/rephrasing_status_update_dto.py (1)
1-1: Remove unused import Competency

The import Competency is not used in this file. Removing it will clean up the code.

Apply this diff to remove the unused import:
-from app.domain.data.competency_dto import Competency
🧰 Tools

🪛 Ruff (0.8.2)

1-1: app.domain.data.competency_dto.Competency imported but unused

Remove unused import: app.domain.data.competency_dto.Competency

(F401)
app/domain/rephrasing_pipeline_execution_dto.py (2)
1-1: Remove unused import List

The import List from the typing module is not used in this file. Removing it will clean up the code.

Apply this diff to remove the unused import:
-from typing import List
🧰 Tools

🪛 Ruff (0.8.2)

1-1: typing.List imported but unused

Remove unused import: typing.List

(F401)

6-6: Remove unused imports CompetencyTaxonomy and Competency

The imports CompetencyTaxonomy and Competency are not used in this file. Removing them will improve code maintainability.

Apply this diff to remove the unused imports:
-from .data.competency_dto import CompetencyTaxonomy, Competency
🧰 Tools

🪛 Ruff (0.8.2)

6-6: .data.competency_dto.CompetencyTaxonomy imported but unused

Remove unused import

(F401)

6-6: .data.competency_dto.Competency imported but unused

Remove unused import

(F401)
app/pipeline/prompts/faq_rephrasal.py (3)
1-2: Fix grammatical error in the prompt.

Change "a excellent" to "an excellent" to fix the grammar.
-You are a excellent tutor with expertise in computer science and practical applications teaching an university course.
+You are an excellent tutor with expertise in computer science and practical applications teaching a university course.
12-12: Review placement of the {rephrased_text} placeholder.

The {rephrased_text} placeholder appears in the middle of the instructions, which could lead to confusion. Consider moving it after all instructions to maintain a clear separation between instructions and input.

10-10: Enhance markdown formatting instructions.

The current instruction "keep the markdown formatting intact" is vague. Consider specifying which markdown elements should be preserved (e.g., headers, lists, code blocks, etc.) and how to handle them.
app/web/status/status_update.py (1)
278-296: Consider adjusting the stage weight.

The stage weight of 10 seems low compared to other pipelines (e.g., CourseChatStatusCallback uses 40). Consider increasing the weight to better reflect the computational effort of the rephrasing task.
             StageDTO(
-                weight=10,
+                weight=30,
                 state=StageStateEnum.NOT_STARTED,
                 name="Generating Rephrasing",
             )
app/web/routers/pipelines.py (1)
227-234: Remove TODO-style comments.

The comments indicating "Replace with actual..." should be removed as they are no longer needed - the actual implementations are already in place.
-        # Replace with actual Callback class
         callback = RephrasingCallback(
             run_id=dto.execution.settings.authentication_token,
             base_url=dto.execution.settings.artemis_base_url,
             initial_stages=dto.execution.initial_stages,
         )
-        #Replace with actual pipeline RefrasingPipeline
         pipeline = RephrasingPipeline(callback=callback)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ddd77de and 3a7dbd9.

⛔ Files ignored due to path filters (1)

docker/pyris-dev.yml is excluded by !**/*.yml

📒 Files selected for processing (7)

app/common/PipelineEnum.py (1 hunks)
app/domain/rephrasing_pipeline_execution_dto.py (1 hunks)
app/domain/status/rephrasing_status_update_dto.py (1 hunks)
app/pipeline/prompts/faq_rephrasal.py (1 hunks)
app/pipeline/rephrasing_pipeline.py (1 hunks)
app/web/routers/pipelines.py (4 hunks)
app/web/status/status_update.py (2 hunks)

🧰 Additional context used

🪛 Ruff (0.8.2)

app/domain/rephrasing_pipeline_execution_dto.py

1-1: typing.List imported but unused

Remove unused import: typing.List

(F401)

6-6: .data.competency_dto.CompetencyTaxonomy imported but unused

Remove unused import

(F401)

6-6: .data.competency_dto.Competency imported but unused

Remove unused import

(F401)

app/domain/status/rephrasing_status_update_dto.py

1-1: app.domain.data.competency_dto.Competency imported but unused

Remove unused import: app.domain.data.competency_dto.Competency

(F401)

🔇 Additional comments (1)

app/common/PipelineEnum.py (1)

17-17: LGTM!

The new enum member follows the established naming convention and is correctly placed before the NOT_SET enum.

app/web/routers/pipelines.py

coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (8)

app/pipeline/rewriting_pipeline.py (1)
41-46: Remove unused parameters in __call__ method

The parameters prompt and **kwargs in the __call__ method are not used. Removing them can simplify the method signature.

Apply this diff to remove the unused parameters:
 def __call__(
     self,
     dto: RewritingPipelineExecutionDTO,
-    prompt: Optional[ChatPromptTemplate] = None,
-    **kwargs,
 ):
🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black
app/domain/status/rewriting_status_update_dto.py (1)
1-1: Remove unused import Competency

The import Competency from app.domain.data.competency_dto is not used and can be safely removed.

Apply this diff to remove the unused import:
-from app.domain.data.competency_dto import Competency
🧰 Tools

🪛 Ruff (0.8.2)

1-1: app.domain.data.competency_dto.Competency imported but unused

Remove unused import: app.domain.data.competency_dto.Competency

(F401)
app/domain/rewriting_pipeline_execution_dto.py (2)
1-1: Remove unused import List

The import List from typing is not used and can be removed.

Apply this diff to remove the unused import:
-from typing import List
🧰 Tools

🪛 Ruff (0.8.2)

1-1: typing.List imported but unused

Remove unused import: typing.List

(F401)

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

6-6: Remove unused imports CompetencyTaxonomy and Competency

The imports CompetencyTaxonomy and Competency from .data.competency_dto are not used and can be removed.

Apply this diff to remove the unused imports:
-from .data.competency_dto import CompetencyTaxonomy, Competency
🧰 Tools

🪛 Ruff (0.8.2)

6-6: .data.competency_dto.CompetencyTaxonomy imported but unused

Remove unused import

(F401)

6-6: .data.competency_dto.Competency imported but unused

Remove unused import

(F401)

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black
app/pipeline/prompts/faq_rewriting.py (2)
2-2: Fix grammar in role description.

Change "a excellent" to "an excellent" for correct grammar.
-You are a excellent tutor with expertise in computer science and practical applications teaching an university course.
+You are an excellent tutor with expertise in computer science and practical applications teaching a university course.
🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

4-10: Ensure consistent punctuation in numbered list.

The numbered list has inconsistent punctuation at the end of each item. Some end with periods while others don't.
-1. Correct all spelling and grammatical errors.
-2. Ensure the text is written in simple and clear language, making it easy to understand for students.
-3. Preserve the original meaning and intent of the text.
-4. Ensure that the response is always written in complete sentences. If you are given a list of bullet points, convert them into complete sentences.
-5. Make sure to use the original language of the input text
-6. Avoid repeating any information that is already present in the text.
-7. Make sure to keep the markdown formatting intact and add formatting for the most important information
+1. Correct all spelling and grammatical errors.
+2. Ensure the text is written in simple and clear language, making it easy to understand for students.
+3. Preserve the original meaning and intent of the text.
+4. Ensure that the response is always written in complete sentences. If you are given a list of bullet points, convert them into complete sentences.
+5. Make sure to use the original language of the input text.
+6. Avoid repeating any information that is already present in the text.
+7. Make sure to keep the markdown formatting intact and add formatting for the most important information.
🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black
app/web/routers/pipelines.py (2)
233-234: Fix typo and comment style.

The comment has a typo ("RefrasingPipeline" instead of "RewritingPipeline") and uses inconsistent comment style.
-        #Replace with actual pipeline RefrasingPipeline
+        # Replace with actual pipeline RewritingPipeline
🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

262-262: Remove unnecessary empty lines.

Multiple consecutive empty lines can be reduced to a single empty line.
@router.post(
    "/rewriting/{variant}/run",
    status_code=status.HTTP_202_ACCEPTED,
    dependencies=[Depends(TokenValidator())],
)
-

def run_rewriting_pipeline(
-

@router.get("/{feature}/variants")
Also applies to: 277-277

🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3a7dbd9 and 8b7a7c9.

📒 Files selected for processing (7)

app/common/PipelineEnum.py (1 hunks)
app/domain/rewriting_pipeline_execution_dto.py (1 hunks)
app/domain/status/rewriting_status_update_dto.py (1 hunks)
app/pipeline/prompts/faq_rewriting.py (1 hunks)
app/pipeline/rewriting_pipeline.py (1 hunks)
app/web/routers/pipelines.py (4 hunks)
app/web/status/status_update.py (2 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

app/common/PipelineEnum.py
app/web/status/status_update.py

🧰 Additional context used

🪛 Ruff (0.8.2)

app/domain/rewriting_pipeline_execution_dto.py

1-1: typing.List imported but unused

Remove unused import: typing.List

(F401)

6-6: .data.competency_dto.CompetencyTaxonomy imported but unused

Remove unused import

(F401)

6-6: .data.competency_dto.Competency imported but unused

Remove unused import

(F401)

app/domain/status/rewriting_status_update_dto.py

1-1: app.domain.data.competency_dto.Competency imported but unused

Remove unused import: app.domain.data.competency_dto.Competency

(F401)

🪛 GitHub Actions: Run linters

app/domain/rewriting_pipeline_execution_dto.py

[warning] File needs to be reformatted using Black

app/pipeline/prompts/faq_rewriting.py

[warning] File needs to be reformatted using Black

app/pipeline/rewriting_pipeline.py

[warning] File needs to be reformatted using Black

app/web/routers/pipelines.py

[warning] File needs to be reformatted using Black

⏰ Context from checks skipped due to timeout of 90000ms (2)

GitHub Check: Build and Push Docker Image
GitHub Check: Mend Security Check

🔇 Additional comments (3)

app/web/routers/pipelines.py (3)

244-244: Fix incorrect error message.

The error message refers to "extraction" which is incorrect for a rewriting pipeline.
-        logger.error(f"Error running rewriting extraction pipeline: {e}")
+        logger.error(f"Error running rewriting pipeline: {e}")
🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

271-271: Avoid logging entire DTO object.

Logging the entire DTO object could expose sensitive information.
-    logger.info(f"Rewriting pipeline started with variant: {variant} and dto: {dto}")
+    logger.info(f"Rewriting pipeline started with variant: {variant}")
🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

Line range hint 1-349: Format file using Black.

The pipeline indicates that this file needs to be reformatted using Black.

Run Black formatter on this file:

🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

app/pipeline/rewriting_pipeline.py

app/domain/rewriting_pipeline_execution_dto.py

app/pipeline/prompts/faq_rewriting.py

app/web/status/status_update.py

app/pipeline/prompts/faq_rewriting.py

app/web/routers/pipelines.py

coderabbitai

Actionable comments posted: 0

🔭 Outside diff range comments (1)

app/web/routers/pipelines.py (1)
Tests missing for pipeline implementations

The rewriting pipeline implementation follows the established patterns and integrates correctly with the existing infrastructure. However, there appears to be no test coverage for any of the pipeline implementations, which is a significant concern. Consider adding tests for:

Pipeline initialization and execution flows

Error handling and callback behaviors

FastAPI endpoint integration

DTO validation and processing

🔗 Analysis chain

Line range hint 223-347: Verify integration with existing pipeline infrastructure.

The rewriting pipeline implementation follows the established patterns and integrates well with the existing infrastructure. However, consider adding tests to verify:

Pipeline initialization and execution

Error handling and callback behavior

Integration with the FastAPI router

Run this script to check for existing tests:
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Check for test coverage of the rewriting pipeline implementation

# Look for test files related to the rewriting pipeline
rg -l "test.*rewriting.*pipeline" -g "test_*.py"

# Check for test cases using the new endpoint
rg -A 5 "test.*rewriting.*run" -g "test_*.py"
Length of output: 396

Script:
#!/bin/bash
# First, find all test files and directories
echo "=== Finding test files ==="
fd -t f "test_.*\.py$"

echo -e "\n=== Searching for rewriting pipeline tests ==="
# Search for rewriting-related tests with broader context
fd -t f "test_.*\.py$" --exec rg -l "rewriting|RewritingPipeline"

echo -e "\n=== Checking test directories structure ==="
# List directories containing test files
fd -t f "test_.*\.py$" -d 3 --exec dirname {} | sort -u
Length of output: 427

Script:
#!/bin/bash
# Look for any test-related Python files
echo "=== Finding any test-related Python files ==="
fd ".*test.*\.py$"

echo -e "\n=== Looking for FastAPI test client usage ==="
rg "TestClient|pytest" -l -g "*.py"

echo -e "\n=== Finding pipeline-related files ==="
fd "pipeline" -e py
Length of output: 1690
🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

🧹 Nitpick comments (2)

app/web/routers/pipelines.py (2)

275-275: Fix formatting: Remove extra blank line.

The Black formatter indicates that this file needs reformatting. Remove the extra blank line to comply with the formatting rules.

🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

338-338: Fix formatting: Remove extra blank line.

The Black formatter indicates that this file needs reformatting. Remove the extra blank line to comply with the formatting rules.

🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8b7a7c9 and 2ff6d82.

📒 Files selected for processing (1)

app/web/routers/pipelines.py (4 hunks)

🧰 Additional context used

🪛 GitHub Actions: Run linters

app/web/routers/pipelines.py

[warning] File needs to be reformatted using Black

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: Build and Push Docker Image

🔇 Additional comments (4)

app/web/routers/pipelines.py (4)

14-14: LGTM: Import statements are properly organized.

The new imports for the rewriting pipeline components are correctly placed and grouped with related imports.

Also applies to: 20-20, 25-25

🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

223-245: Fix incorrect error message in logging statement.

The error message incorrectly mentions "extraction" which seems to be a copy-paste error.
-        logger.error(f"Error running rewriting extraction pipeline: {e}")
+        logger.error(f"Error running rewriting pipeline: {e}")
🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

269-269: Avoid logging entire DTO object.

Logging the entire DTO object could potentially expose sensitive information.
-    logger.info(f"Rewriting pipeline started with variant: {variant} and dto: {dto}")
+    logger.info(f"Rewriting pipeline started with variant: {variant}")
🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

339-346: LGTM: Pipeline variant implementation is consistent.

The implementation of the rewriting pipeline variant follows the established pattern and provides appropriate metadata.

🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (3)

app/pipeline/rewriting_pipeline.py (2)
61-62: ⚠️ Potential issue

Add error handling for empty response content.

Accessing response.contents[0] without checking if response.contents is not empty may raise an IndexError.

Apply this diff to handle empty response content:
-        response = response.contents[0].text_content
+        if response.contents:
+            response = response.contents[0].text_content
+        else:
+            raise ValueError("Empty response from LLM")
63-63: ⚠️ Potential issue

Change logging level for sensitive information.

Logging the rewritten text at INFO level may expose sensitive information.

Apply this diff to adjust the logging level:
-        logging.info(f"Final rewritten text: {final_result}")
+        logging.debug(f"Final rewritten text: {final_result}")
app/web/routers/pipelines.py (1)
266-266: ⚠️ Potential issue

Avoid logging entire DTO object.

Logging the entire DTO object could expose sensitive information.

Apply this diff to log only necessary information:
-    logger.info(f"Rewriting pipeline started with variant: {variant} and dto: {dto}")
+    logger.info(f"Rewriting pipeline started with variant: {variant}")

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ff6d82 and 79e2dc2.

📒 Files selected for processing (8)

app/domain/rewriting_pipeline_execution_dto.py (1 hunks)
app/pipeline/chat/course_chat_pipeline.py (1 hunks)
app/pipeline/chat/exercise_chat_agent_pipeline.py (1 hunks)
app/pipeline/prompts/faq_rewriting.py (1 hunks)
app/pipeline/rewriting_pipeline.py (1 hunks)
app/pipeline/shared/citation_pipeline.py (1 hunks)
app/web/routers/pipelines.py (4 hunks)
app/web/status/status_update.py (2 hunks)

✅ Files skipped from review due to trivial changes (3)

app/pipeline/shared/citation_pipeline.py
app/pipeline/chat/course_chat_pipeline.py
app/pipeline/chat/exercise_chat_agent_pipeline.py

🚧 Files skipped from review as they are similar to previous changes (1)

app/web/status/status_update.py

🧰 Additional context used

🪛 GitHub Actions: Run linters

app/pipeline/prompts/faq_rewriting.py

[warning] 2-2: Line too long (207 characters, limit is 120)

[warning] 7-7: Line too long (148 characters, limit is 120)

[warning] 14-14: Line too long (203 characters, limit is 120)

app/domain/rewriting_pipeline_execution_dto.py

[warning] 1-1: Unused import: 'typing.List' is imported but never used

[warning] 6-6: Unused import: '.data.competency_dto.CompetencyTaxonomy' is imported but never used

[warning] 6-6: Unused import: '.data.competency_dto.Competency' is imported but never used

🪛 Ruff (0.8.2)

app/domain/rewriting_pipeline_execution_dto.py

1-1: typing.List imported but unused

Remove unused import: typing.List

(F401)

6-6: .data.competency_dto.CompetencyTaxonomy imported but unused

Remove unused import

(F401)

6-6: .data.competency_dto.Competency imported but unused

Remove unused import

(F401)

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: Build and Push Docker Image

🔇 Additional comments (5)

app/domain/rewriting_pipeline_execution_dto.py (1)

9-11: LGTM! Clean and well-structured DTO.

The DTO class is well-designed with:

Clear inheritance from BaseModel

Proper use of Field for JSON alias mapping

Appropriate type hints

app/pipeline/prompts/faq_rewriting.py (1)

1-15: LGTM! Well-structured prompt with clear instructions.

The prompt provides clear and comprehensive instructions for the AI model with:

Specific tasks enumerated

Emphasis on maintaining original meaning

Clear formatting requirements

🧰 Tools

🪛 GitHub Actions: Run linters

[warning] 2-2: Line too long (207 characters, limit is 120)

[warning] 7-7: Line too long (148 characters, limit is 120)

[warning] 14-14: Line too long (203 characters, limit is 120)

app/pipeline/rewriting_pipeline.py (1)

27-38: LGTM! Well-structured constructor with appropriate initialization.

The constructor properly:

Calls the parent constructor

Initializes the callback

Sets up the request handler with specific requirements
app/web/routers/pipelines.py (2)

260-269: LGTM! Well-structured endpoint with proper authentication.

The endpoint implementation:

Uses proper HTTP status codes

Includes token validation

Follows the pattern of other pipeline endpoints

241-241: ⚠️ Potential issue

Fix incorrect error message.

The error message refers to "extraction" which is incorrect for the rewriting pipeline.

Apply this diff to fix the error message:
-        logger.error(f"Error running rewriting extraction pipeline: {e}")
+        logger.error(f"Error running rewriting pipeline: {e}")
Likely invalid or redundant comment.

app/domain/rewriting_pipeline_execution_dto.py

app/pipeline/prompts/faq_rewriting.py

app/pipeline/rewriting_pipeline.py

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (1)

app/pipeline/rewriting_pipeline.py (1)
38-43: Add return type hint to call method.

The method is missing a return type annotation.
     def __call__(
         self,
         dto: RewritingPipelineExecutionDTO,
         prompt: Optional[ChatPromptTemplate] = None,
         **kwargs,
-    ):
+    ) -> str:

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79e2dc2 and 0715367.

📒 Files selected for processing (3)

app/domain/rewriting_pipeline_execution_dto.py (1 hunks)
app/pipeline/prompts/faq_rewriting.py (1 hunks)
app/pipeline/rewriting_pipeline.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

app/domain/rewriting_pipeline_execution_dto.py
app/pipeline/prompts/faq_rewriting.py

🧰 Additional context used

🪛 Ruff (0.8.2)

app/pipeline/rewriting_pipeline.py

12-12: app.domain.data.competency_dto.Competency imported but unused

Remove unused import: app.domain.data.competency_dto.Competency

(F401)

🪛 GitHub Actions: Run linters

app/pipeline/rewriting_pipeline.py

[error] 12-12: Unused import: 'app.domain.data.competency_dto.Competency' imported but unused

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: Build and Push Docker Image

🔇 Additional comments (4)

app/pipeline/rewriting_pipeline.py (4)

32-33: Verify GPT model requirements.

The GPT version (4.5) and context length (16385) specifications seem unusual:

GPT-4.5 is not a standard version

Context length is typically a power of 2 (e.g., 16384)

Please verify these requirements match the intended model specifications.

60-60: Add validation for response contents.

Direct access to response.contents[0] could raise an IndexError if the response is empty.

62-62: Adjust logging level for sensitive content.

Logging the rewritten text at INFO level could expose sensitive information.

25-25: ⚠️ Potential issue

Remove unused output_parser declaration.

The output_parser is declared but never used in the code.
-    output_parser: PydanticOutputParser
Likely invalid or redundant comment.

app/pipeline/rewriting_pipeline.py

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (5)

app/pipeline/prompts/faq_rewriting.py (5)
1-4: Add type annotation for better type safety.

Consider adding a type annotation to the variable declaration:
-system_prompt_faq = """:
+system_prompt_faq: str = """:
🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter

9-9: Capitalize proper noun "german".

Proper nouns should be capitalized:
-5. Original Language: Use the same language as the input text. The input text will be either german or english.
+5. Original Language: Use the same language as the input text. The input text will be either German or English.
🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter

5-12: Consider adding examples for each guideline.

To make the guidelines more actionable, consider adding brief examples for each point. This would help ensure consistent output across different model responses.

🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter

13-14: Specify the criteria for "short inputs".

The current instruction for handling short inputs is ambiguous. Consider defining specific criteria (e.g., minimum character count or sentence structure) to ensure consistent handling.

🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter

26-28: Consider using unique markers and specify string formatting method.

Two suggestions for improvement:

Use more unique markers to avoid potential conflicts with markdown content (e.g., ###PYRIS_START###).

Document the expected string formatting method:
# Example usage:
prompt = system_prompt_faq.format(rewritten_text="input text")
🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76e5c9b and 402c65c.

📒 Files selected for processing (1)

app/pipeline/prompts/faq_rewriting.py (1 hunks)

🧰 Additional context used

🪛 GitHub Actions: Run linters

app/pipeline/prompts/faq_rewriting.py

[warning] File needs to be reformatted using Black code formatter

⏰ Context from checks skipped due to timeout of 90000ms (1)

GitHub Check: Build and Push Docker Image

🔇 Additional comments (2)

app/pipeline/prompts/faq_rewriting.py (2)

16-16: Remove trailing whitespace.

Remove the trailing whitespace at the end of line 16.

🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter

1-30: Black formatting is needed.

The file requires formatting with Black to maintain consistent code style.

🧰 Tools

🪛 GitHub Actions: Run linters

[warning] File needs to be reformatted using Black code formatter

cremertim and others added 3 commits January 16, 2025 11:06

Setup rephrasing on new branch without FAQ

9c899e8

add missing DTO

cf272ef

Minor rephrasement of Prompt, fixed docker setup for windows

3a7dbd9

github-actions bot added component:Pipeline component:FastAPI component:Domain component:Docker labels Jan 16, 2025

cremertim mentioned this pull request Jan 16, 2025

Iris: Add Artemis Intelligence rewrite action for FAQs ls1intum/Artemis#10157

Open

18 tasks

coderabbitai bot requested changes Jan 16, 2025

View reviewed changes

app/web/routers/pipelines.py Outdated Show resolved Hide resolved

app/web/routers/pipelines.py Outdated Show resolved Hide resolved

Renamed everything to Rewrite

8b7a7c9

coderabbitai bot requested changes Jan 16, 2025

View reviewed changes

wasnertobias reviewed Jan 17, 2025

View reviewed changes

app/web/status/status_update.py Show resolved Hide resolved

wasnertobias reviewed Jan 17, 2025

View reviewed changes

app/pipeline/prompts/faq_rewriting.py Outdated Show resolved Hide resolved

FelixTJDietrich added the deploy:pyris-test label Jan 17, 2025

github-actions bot added lock:pyris-test and removed deploy:pyris-test labels Jan 17, 2025

FelixTJDietrich temporarily deployed to Iris Test January 17, 2025 10:12 — with GitHub Actions Inactive

FelixTJDietrich reviewed Jan 17, 2025

View reviewed changes

app/web/routers/pipelines.py Outdated Show resolved Hide resolved

app/web/routers/pipelines.py Outdated Show resolved Hide resolved

Removed unnecessary comments

2ff6d82

coderabbitai bot reviewed Jan 17, 2025

View reviewed changes

FelixTJDietrich mentioned this pull request Jan 17, 2025

Add problem statement rewriting to the rewriting pipeline #192

Open

FelixTJDietrich changed the title ~~Basic rephrasing~~ FAQ rewriting pipeline Jan 17, 2025

Reformat using

79e2dc2

coderabbitai bot requested changes Jan 18, 2025

View reviewed changes

app/domain/rewriting_pipeline_execution_dto.py Outdated Show resolved Hide resolved

app/pipeline/prompts/faq_rewriting.py Outdated Show resolved Hide resolved

app/pipeline/rewriting_pipeline.py Outdated Show resolved Hide resolved

Reformat using

0715367

coderabbitai bot requested changes Jan 18, 2025

View reviewed changes

app/pipeline/rewriting_pipeline.py Show resolved Hide resolved

app/pipeline/rewriting_pipeline.py Outdated Show resolved Hide resolved

FelixTJDietrich requested review from bassner, sebastianloose and isabellagessl January 20, 2025 14:10

coderabbitai bot previously approved these changes Jan 21, 2025

View reviewed changes

cremertim temporarily deployed to Iris Test January 21, 2025 13:32 — with GitHub Actions Inactive

ls1intum deleted a comment from coderabbitai bot Jan 21, 2025

cremertim added 2 commits January 21, 2025 17:49

Test

f35f2b5

Test

76e5c9b

coderabbitai bot previously approved these changes Jan 21, 2025

View reviewed changes

cremertim added deploy:pyris-test and removed lock:pyris-test labels Jan 21, 2025

github-actions bot added lock:pyris-test and removed deploy:pyris-test labels Jan 21, 2025

cremertim temporarily deployed to Iris Test January 21, 2025 16:59 — with GitHub Actions Inactive

cremertim added deploy:pyris-test and removed lock:pyris-test labels Jan 21, 2025

github-actions bot added lock:pyris-test and removed deploy:pyris-test labels Jan 21, 2025

cremertim temporarily deployed to Iris Test January 21, 2025 21:21 — with GitHub Actions Inactive

Test

402c65c

cremertim dismissed coderabbitai[bot]’s stale review via 402c65c January 22, 2025 06:02

coderabbitai bot reviewed Jan 22, 2025

View reviewed changes

coderabbitai bot previously approved these changes Jan 22, 2025

View reviewed changes

cremertim added deploy:pyris-test and removed lock:pyris-test labels Jan 22, 2025

github-actions bot added lock:pyris-test and removed deploy:pyris-test labels Jan 22, 2025

cremertim deployed to Iris Test January 22, 2025 06:13 — with GitHub Actions View deployment

Pushed another prompt improvement

8776ea2

cremertim dismissed coderabbitai[bot]’s stale review via 8776ea2 January 22, 2025 06:14

coderabbitai bot previously approved these changes Jan 22, 2025

View reviewed changes

Redo task part of the prompt

9ddf634

cremertim dismissed coderabbitai[bot]’s stale review via 9ddf634 January 22, 2025 06:17

cremertim removed the lock:pyris-test label Jan 22, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

FAQ rewriting pipeline #191

FAQ rewriting pipeline #191

cremertim commented Jan 16, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

coderabbitai bot left a comment

FAQ rewriting pipeline #191

Are you sure you want to change the base?

FAQ rewriting pipeline #191

Conversation

cremertim commented Jan 16, 2025 • edited by coderabbitai bot Loading

Summary by CodeRabbit

Release Notes

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

cremertim commented Jan 16, 2025 •

edited by coderabbitai bot

Loading