Remove instructor-hub (#1197)
Co-authored-by: devin-ai-integration[bot] <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Ivan Leo <[email protected]>
devin-ai-integration[bot] and ivanleomk authored Nov 20, 2024
1 parent e277818 commit f0c9e4c
Showing 40 changed files with 95 additions and 2,273 deletions.
4 changes: 2 additions & 2 deletions docs/blog/index.md
Original file line number Diff line number Diff line change
@@ -50,8 +50,8 @@ If you want to get updates on new features and tips on how to use Instructor, yo
- [Ollama Integration](../integrations/ollama.md)
- [llama-cpp-python Integration](../integrations/llama-cpp-python.md)
- [Together Compute Integration](../integrations/together.md)
- [Extracting Data into Pandas DataFrame using GPT-3.5 Turbo](../hub/pandas_df.md)
- [Implementing Streaming Partial Responses with Field-Level Streaming](../hub/partial_streaming.md)
- [Pandas DataFrame Examples](../examples/bulk_classification.md#working-with-dataframes)
- [Streaming Response Examples](../examples/bulk_classification.md#streaming-responses)

## Media and Resources

2 changes: 1 addition & 1 deletion docs/blog/posts/best_framework.md
@@ -64,7 +64,7 @@ Other features on instructor, in and out of the library are:
3. [Parallel Tool Calling](../../concepts/parallel.md) with correct types
4. Streaming [Partial](../../concepts/partial.md) and [Iterable](../../concepts/iterable.md) data.
5. Returning [Primitive](../../concepts/types.md) Types and [Unions](../../concepts/unions.md) as well!
6. Lots, and Lots of [Cookbooks](../../examples/index.md), [Tutorials](../../tutorials/1-introduction.ipynb), Documentation and even [instructor hub](../../integrations/index.md)
6. Lots of [Cookbooks](../../examples/index.md), [Tutorials](../../tutorials/1-introduction.ipynb), and comprehensive Documentation in our [Integration Guides](../../integrations/index.md)

## Instructor's Broad Applicability

7 changes: 4 additions & 3 deletions docs/blog/posts/langsmith.md
@@ -37,10 +37,11 @@ pip install -U langsmith
pip install -U instructor
```

If you want to pull this example down from [instructor-hub](../../hub/index.md) you can use the following command:
You can find this example in our [examples directory](../../examples/bulk_classification.md):

```bash
instructor hub pull --slug batch_classification_langsmith --py > batch_classification_langsmith.py
# The example code is available in the examples directory
# See: https://python.useinstructor.com/examples/bulk_classification
```

In this example we'll use the `wrap_openai` function to wrap the OpenAI client with LangSmith. This will allow us to use LangSmith's observability and monitoring features with the OpenAI client. Then we'll use `instructor` to patch the client with the `TOOLS` mode. This will allow us to use `instructor` to add additional functionality to the client. We'll use [asyncio](./learn-async.md) to classify a list of questions.
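The wiring described above can be sketched in a few lines. The `QuestionClassification` schema below is a simplified stand-in for the one in the full example, and the client factory assumes `langsmith`, `instructor`, and `openai` are installed with an `OPENAI_API_KEY` set:

```python
from pydantic import BaseModel, Field


class QuestionClassification(BaseModel):
    # Simplified stand-in for the schema used in the full example
    classification: str = Field(description="The predicted label for the question")


def make_traced_client():
    # Imports deferred so the sketch stays importable without these
    # packages installed; all three are assumed to be pip-installed.
    import instructor
    from langsmith.wrappers import wrap_openai
    from openai import AsyncOpenAI

    # wrap_openai records every request/response in LangSmith;
    # instructor.from_openai then patches the same client so responses
    # are validated into Pydantic models.
    return instructor.from_openai(
        wrap_openai(AsyncOpenAI()),
        mode=instructor.Mode.TOOLS,
    )
```

The two wrappers compose cleanly because both return a client with the same interface they received.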
@@ -169,4 +170,4 @@ If you follow what we've done is wrapped the client and proceeded to quickly use

To take a look at trace of this run check out this shareable [link](https://smith.langchain.com/public/eaae9f95-3779-4bbb-824d-97aa8a57a4e0/r).

![](./img/langsmith.png)
![](./img/langsmith.png)
17 changes: 0 additions & 17 deletions docs/hub/action_items.md → docs/examples/action_items.md
@@ -7,12 +7,6 @@ description: Learn to extract actionable items from meeting transcripts using Op

In this guide, we'll walk through how to extract action items from meeting transcripts using OpenAI's API and Pydantic. This use case is essential for automating project management tasks, such as task assignment and priority setting.

If you want to try outs via `instructor hub`, you can pull it by running

```bash
instructor hub pull --slug action_items --py > action_items.py
```

For multi-label classification, we introduce a new enum class and a different Pydantic model to handle multiple labels.

!!! tips "Motivation"
@@ -89,27 +83,16 @@ def generate(data: str) -> Iterable[Ticket]:
prediction = generate(
"""
Alice: Hey team, we have several critical tasks we need to tackle for the upcoming release. First, we need to work on improving the authentication system. It's a top priority.
Bob: Got it, Alice. I can take the lead on the authentication improvements. Are there any specific areas you want me to focus on?
Alice: Good question, Bob. We need both a front-end revamp and back-end optimization. So basically, two sub-tasks.
Carol: I can help with the front-end part of the authentication system.
Bob: Great, Carol. I'll handle the back-end optimization then.
Alice: Perfect. Now, after the authentication system is improved, we have to integrate it with our new billing system. That's a medium priority task.
Carol: Is the new billing system already in place?
Alice: No, it's actually another task. So it's a dependency for the integration task. Bob, can you also handle the billing system?
Bob: Sure, but I'll need to complete the back-end optimization of the authentication system first, so it's dependent on that.
Alice: Understood. Lastly, we also need to update our user documentation to reflect all these changes. It's a low-priority task but still important.
Carol: I can take that on once the front-end changes for the authentication system are done. So, it would be dependent on that.
Alice: Sounds like a plan. Let's get these tasks modeled out and get started."""
)
```
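The `Ticket` model returned by `generate` is truncated out of the diff above; a minimal sketch of what such a schema could look like follows. The field names here are illustrative assumptions, not necessarily the exact ones from the example:

```python
from enum import Enum
from typing import List, Optional

from pydantic import BaseModel


class PriorityEnum(str, Enum):
    high = "High"
    medium = "Medium"
    low = "Low"


class Subtask(BaseModel):
    id: int
    name: str


class Ticket(BaseModel):
    id: int
    name: str
    description: str
    priority: PriorityEnum
    assignees: List[str]
    subtasks: Optional[List[Subtask]] = None
    dependencies: Optional[List[int]] = None  # ids of blocking tickets


# The meeting transcript above would yield tickets like this one:
auth_ticket = Ticket(
    id=1,
    name="Improve authentication system",
    description="Front-end revamp and back-end optimization",
    priority=PriorityEnum.high,
    assignees=["Bob", "Carol"],
    subtasks=[
        Subtask(id=2, name="Front-end revamp"),
        Subtask(id=3, name="Back-end optimization"),
    ],
    dependencies=[],
)
```

Modeling dependencies as ticket ids keeps the schema flat, which tends to be easier for the model to fill in reliably than nested object references.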
@@ -7,11 +7,10 @@ description: Discover how to integrate LangSmith with the OpenAI client for impr

It's a common misconception that LangChain's [LangSmith](https://www.langchain.com/langsmith) is only compatible with LangChain's models. In reality, LangSmith is a unified DevOps platform for developing, collaborating, testing, deploying, and monitoring LLM applications. In this blog we will explore how LangSmith can be used to enhance the OpenAI client alongside `instructor`.

If you want to try this example using `instructor hub`, you can pull it by running
First, install the necessary packages:

```bash
pip install -U langsmith
instructor hub pull --slug batch_classification_langsmith --py > langsmith_example.py
```

## LangSmith
@@ -101,7 +100,6 @@ async def classify(data: str) -> QuestionClassification:
"""
Perform multi-label classification on the input text.
Change the prompt to fit your use case.
Args:
data (str): The input text to classify.
"""
@@ -7,12 +7,6 @@ description: Learn to construct knowledge graphs from textual data using OpenAI'

In this tutorial, we will explore the process of constructing knowledge graphs from textual data using OpenAI's API and Pydantic. This approach is crucial for efficiently automating the extraction of structured information from unstructured text.

To experiment with this yourself through `instructor hub`, you can obtain the necessary code by executing:

```bash
instructor hub pull --slug knowledge_graph --py > knowledge_graph.py
```

```python
from typing import List
from pydantic import BaseModel, Field
@@ -95,4 +89,4 @@ if __name__ == "__main__":
]
}
"""
```
```
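The imports shown above only hint at the schema; a minimal sketch of a graph model in the same spirit is below. The `Node` and `Edge` field names are assumptions for illustration:

```python
from typing import List

from pydantic import BaseModel, Field


class Node(BaseModel):
    id: int
    label: str


class Edge(BaseModel):
    source: int  # id of the source node
    target: int  # id of the target node
    label: str


class KnowledgeGraph(BaseModel):
    nodes: List[Node] = Field(default_factory=list)
    edges: List[Edge] = Field(default_factory=list)


# A tiny hand-built graph; in the example the LLM fills this in
# from unstructured text instead.
graph = KnowledgeGraph(
    nodes=[Node(id=1, label="Jason"), Node(id=2, label="Instructor")],
    edges=[Edge(source=1, target=2, label="created")],
)
```

Referencing nodes by integer id in the edges, rather than nesting node objects, keeps the JSON schema small and avoids duplicated entities in the model's output.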
2 changes: 1 addition & 1 deletion docs/examples/classification.md
@@ -5,7 +5,7 @@ description: Learn to implement single-label and multi-label text classification

# Text Classification using OpenAI and Pydantic

This tutorial showcases how to implement text classification tasks—specifically, single-label and multi-label classifications—using the OpenAI API and Pydantic models. If you want to see full examples check out the hub examples for [single classification](../hub/single_classification.md) and [multi classification](../hub/multiple_classification.md)
This tutorial showcases how to implement text classification tasks—specifically, single-label and multi-label classifications—using the OpenAI API and Pydantic models. For complete examples, check out our [single classification](bulk_classification.md#single-label-classification) and [multi-label classification](bulk_classification.md#multi-label-classification) examples in the cookbook.
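Both variants boil down to plain Pydantic response models; a minimal single-label sketch, validated locally without any API call, looks like this:

```python
from typing import Literal

from pydantic import BaseModel


class ClassificationResponse(BaseModel):
    """Single-label spam classifier response."""

    label: Literal["SPAM", "NOT_SPAM"]


# With instructor, this class would be passed as response_model to
# client.chat.completions.create; here we just validate a value directly.
resp = ClassificationResponse(label="SPAM")
```

The `Literal` type doubles as both documentation for the LLM (via the generated JSON schema) and a hard validation constraint on the response.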

!!! tips "Motivation"

@@ -7,12 +7,6 @@ description: Learn to extract customer lead details using OpenAI's API and Pydan

In this guide, we'll walk through how to extract customer lead information using OpenAI's API and Pydantic. This use case is essential for seamlessly automating the process of extracting specific information from a context.

If you want to try this out via `instructor hub`, you can pull it by running:

```bash
instructor hub pull --slug extract_contact_info --py > extract_contact_info.py
```

## Motivation

You could potentially integrate this into a chatbot to extract relevant user information from user messages. With the use of machine learning driven validation it would reduce the need for a human to verify the information.
@@ -99,4 +93,4 @@ if __name__ == "__main__":
"""
```

In this example, the `parse_lead_from_message` function successfully extracts lead information from a user message, demonstrating how automation can enhance the efficiency of collecting accurate customer details. It also shows how the function successfully catches that the phone number is invalid so functionality can be implemented for the user to get prompted again to give a correct phone number.
In this example, the `parse_lead_from_message` function successfully extracts lead information from a user message, demonstrating how automation can enhance the efficiency of collecting accurate customer details. It also shows how the function successfully catches that the phone number is invalid so functionality can be implemented for the user to get prompted again to give a correct phone number.
13 changes: 11 additions & 2 deletions docs/examples/index.md
@@ -18,7 +18,7 @@ Welcome to our collection of cookbooks showcasing the power of structured output
7. [Complex Query Decomposition](planning-tasks.md): Break down intricate queries into manageable subtasks for thorough analysis.
8. [Entity Extraction and Resolution](entity_resolution.md): Identify and disambiguate named entities in text.
9. [PII Sanitization](pii.md): Detect and redact sensitive personal information from text data.
10. [Action Item and Dependency Extraction](../hub/action_items.md): Generate structured task lists and relationships from meeting transcripts.
10. [Action Item Extraction](planning-tasks.md): Generate structured task lists and relationships from meeting transcripts.
11. [OpenAI Content Moderation Integration](moderation.md): Implement content filtering using OpenAI's moderation API.
12. [Table Extraction with GPT-Vision](extracting_tables.md): Convert image-based tables into structured data using AI vision capabilities.
13. [AI-Powered Ad Copy Generation from Images](image_to_ad_copy.md): Create compelling advertising text based on visual content.
@@ -34,7 +34,16 @@ Welcome to our collection of cookbooks showcasing the power of structured output
23. [Slide Content Extraction with GPT-4 Vision](extract_slides.md): Convert presentation slide images into structured, analyzable text data.
24. [Few-Shot Learning with Examples](examples.md): Improve AI model performance by providing contextual examples in prompts.
25. [Local Classification without API](local_classification.md): Perform text classification tasks locally without relying on external API calls.

26. [Action Items Extraction](action_items.md): Extract structured action items and tasks from text content.
27. [Batch Classification with LangSmith](batch_classification_langsmith.md): Efficiently classify content in batches using LangSmith integration.
28. [Contact Information Extraction](extract_contact_info.md): Extract structured contact details from unstructured text.
29. [Knowledge Graph Building](building_knowledge_graph.md): Create and manipulate knowledge graphs from textual data.
30. [Multiple Classification Tasks](multiple_classification.md): Handle multiple classification categories simultaneously.
31. [Pandas DataFrame Integration](pandas_df.md): Work with structured data using Pandas DataFrames.
32. [Partial Response Streaming](partial_streaming.md): Stream partial results for real-time processing.
33. [Single Classification Tasks](single_classification.md): Implement focused single-category classification.
34. [Table Extraction from Images](tables_from_vision.md): Convert visual tables into structured data formats.
35. [YouTube Clip Analysis](youtube_clips.md): Extract and analyze information from YouTube video clips.

## Subscribe to our Newsletter for Updates and Tips

@@ -3,12 +3,6 @@ title: Multi-Label Classification with OpenAI and Pydantic
description: Learn how to implement multi-label classification using OpenAI's API and Pydantic for effective support ticket classification.
---

If you want to try outs via `instructor hub`, you can pull it by running

```bash
instructor hub pull --slug multiple_classification --py > multiple_classification.py
```

For multi-label classification, we introduce a new enum class and a different Pydantic model to handle multiple labels.

```python
@@ -28,7 +22,6 @@ LABELS = Literal["ACCOUNT", "BILLING", "GENERAL_QUERY"]
class MultiClassPrediction(BaseModel):
"""
A few-shot example of multi-label classification:
Examples:
- "My account is locked and I can't access my billing info.": ACCOUNT, BILLING
- "I need help with my subscription.": ACCOUNT
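Completing the truncated snippet above, a sketch of the multi-label model follows. `LABELS` comes from the snippet itself; the `labels` field name is an assumption:

```python
from typing import List, Literal

from pydantic import BaseModel

LABELS = Literal["ACCOUNT", "BILLING", "GENERAL_QUERY"]


class MultiClassPrediction(BaseModel):
    """Multi-label classification: one ticket may carry several labels."""

    labels: List[LABELS]


# A ticket touching both account access and billing:
pred = MultiClassPrediction(labels=["ACCOUNT", "BILLING"])
```

Switching from a single `Literal` field to `List[LABELS]` is the entire difference between the single-label and multi-label variants.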
6 changes: 3 additions & 3 deletions docs/examples/ollama.md
@@ -21,7 +21,7 @@ Instructor offers several key benefits:

- :material-code-braces: **Powered by Type Hints**: Leverage Pydantic for schema validation, prompting control, less code, and IDE integration. [:octicons-arrow-right-16: Learn more](https://docs.pydantic.dev/)

- :material-lightning-bolt: **Simplified LLM Interactions**: Support for various LLM providers including OpenAI, Anthropic, Google, Vertex AI, Mistral/Mixtral, Anyscale, Ollama, llama-cpp-python, Cohere, and LiteLLM. [:octicons-arrow-right-16: See Hub](../hub/index.md)
- :material-lightning-bolt: **Simplified LLM Interactions**: Support for various LLM providers including OpenAI, Anthropic, Google, Vertex AI, Mistral/Mixtral, Anyscale, Ollama, llama-cpp-python, Cohere, and LiteLLM. [:octicons-arrow-right-16: See Examples](../examples/index.md)

For more details on these features, check out the [Concepts](../concepts/models.md) section of the documentation.

@@ -107,10 +107,10 @@ To explore more about Instructor and its various applications, consider checking

2. [Concepts](../concepts/models.md) - Dive deeper into the core concepts of Instructor, including models, retrying, and validation.

3. [Hub](../hub/index.md) - Explore the Instructor Hub for more examples and integrations with various LLM providers.
3. [Examples](../examples/index.md) - Explore our comprehensive collection of examples and integrations with various LLM providers.

4. [Tutorials](../tutorials/1-introduction.ipynb) - Step-by-step tutorials to help you get started with Instructor.

5. [Learn Prompting](../prompting/index.md) - Techniques and strategies for effective prompt engineering with Instructor.

By exploring these resources, you'll gain a comprehensive understanding of Instructor's capabilities and how to leverage them in your projects.
By exploring these resources, you'll gain a comprehensive understanding of Instructor's capabilities and how to leverage them in your projects.
6 changes: 0 additions & 6 deletions docs/hub/pandas_df.md → docs/examples/pandas_df.md
@@ -5,12 +5,6 @@ description: Learn how to extract and convert Markdown tables directly into Pand

# Extracting directly to a DataFrame

You can pull this example into your IDE by running the following command:

```bash
instructor hub pull --slug pandas_df --py > pandas_df.py
```

In this example we'll show you how to extract directly to a `pandas.DataFrame`
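A hand-rolled sketch of the conversion this example performs, parsing a pipe-delimited Markdown table into a `DataFrame`, is below. In the instructor version this logic lives in a validator attached to an annotated type rather than a free function:

```python
from io import StringIO

import pandas as pd


def markdown_to_df(md: str) -> pd.DataFrame:
    # Read the pipe-delimited table; the leading/trailing pipes become
    # empty, all-NaN edge columns that we drop afterwards.
    df = pd.read_csv(StringIO(md.strip()), sep="|", skipinitialspace=True)
    df = df.dropna(axis=1, how="all")
    df.columns = [c.strip() for c in df.columns]
    # Drop the |---|---| separator row, then strip cell whitespace.
    df = df[~df.iloc[:, 0].astype(str).str.fullmatch(r"-+")].copy()
    for col in df.columns:
        df[col] = df[col].astype(str).str.strip()
    return df.reset_index(drop=True)


table = """
| Name  | Age |
|-------|-----|
| Alice | 30  |
| Bob   | 25  |
"""
df = markdown_to_df(table)
```

All cells come back as strings here; a production version would also coerce numeric columns, which is part of what the validator in the full example handles.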

```python
@@ -9,12 +9,6 @@ Field level streaming provides incremental snapshots of the current state of the

Instructor supports this pattern by making use of `Partial[T]`. This lets us dynamically create a new class that treats all of the original model's fields as `Optional`.
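That behavior can be sketched by hand with `pydantic.create_model`. Everything below is illustrative only, since instructor ships its own `Partial` implementation:

```python
from typing import Optional

from pydantic import BaseModel, create_model


class Person(BaseModel):
    name: str
    email: str
    twitter: str


def make_partial(model: type[BaseModel]) -> type[BaseModel]:
    # Rebuild the model with every field Optional and defaulting to None,
    # so half-finished JSON fragments still validate during streaming.
    fields = {
        name: (Optional[field.annotation], None)
        for name, field in model.model_fields.items()
    }
    return create_model(f"Partial{model.__name__}", **fields)


PartialPerson = make_partial(Person)
snapshot = PartialPerson(name="John Doe")  # email/twitter not streamed yet
```

Each streamed chunk re-validates against this relaxed model, so consumers always see a well-typed, if incomplete, object.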

If you want to try outs via `instructor hub`, you can pull it by running

```bash
instructor hub pull --slug partial_streaming --py > partial_streaming.py
```

```python
import instructor
from openai import OpenAI
@@ -25,15 +19,11 @@ client = instructor.from_openai(OpenAI())

text_block = """
In our recent online meeting, participants from various backgrounds joined to discuss the upcoming tech conference. The names and contact details of the participants were as follows:
- Name: John Doe, Email: [email protected], Twitter: @TechGuru44
- Name: Jane Smith, Email: [email protected], Twitter: @DigitalDiva88
- Name: Alex Johnson, Email: [email protected], Twitter: @CodeMaster2023
During the meeting, we agreed on several key points. The conference will be held on March 15th, 2024, at the Grand Tech Arena located at 4521 Innovation Drive. Dr. Emily Johnson, a renowned AI researcher, will be our keynote speaker.
The budget for the event is set at $50,000, covering venue costs, speaker fees, and promotional activities. Each participant is expected to contribute an article to the conference blog by February 20th.
A follow-up meeting is scheduled for January 25th at 3 PM GMT to finalize the agenda and confirm the list of speakers.
"""

@@ -5,12 +5,6 @@ description: Learn to implement single-label classification using the OpenAI API

# Single-Label Classification

IF you want to try this code with `instructor hub` you can pull it by running

```bash
instructor hub pull --slug single_classification --py > single_classification.py
```

This example demonstrates how to perform single-label classification using the OpenAI API. The example uses the `gpt-3.5-turbo` model to classify text as either `SPAM` or `NOT_SPAM`.

```python
@@ -27,7 +21,7 @@ client = instructor.from_openai(OpenAI())
class ClassificationResponse(BaseModel):
"""
A few-shot example of text classification:
Examples:
- "Buy cheap watches now!": SPAM
- "Meeting at 3 PM in the conference room": NOT_SPAM
@@ -58,7 +58,7 @@ MarkdownDataFrame = Annotated[
{
"type": "string",
"description": """
The markdown representation of the table,
The markdown representation of the table,
each one should be tidy, do not try to join tables
that should be separate""",
}
@@ -108,11 +108,9 @@ def extract(url: str) -> MultipleTables:
"type": "text",
"text": """
First, analyze the image to determine the most appropriate headers for the tables.
Generate a descriptive h1 for the overall image, followed by a brief summary of the data it contains.
Generate a descriptive h1 for the overall image, followed by a brief summary of the data it contains.
For each identified table, create an informative h2 title and a concise description of its contents.
Finally, output the markdown representation of each table.
Make sure to escape the markdown table properly, and make sure to include the caption and the dataframe.
including escaping all the newlines and quotes. Only return a markdown table in dataframe, nothing else.
""",
5 changes: 2 additions & 3 deletions docs/hub/youtube_clips.md → docs/examples/youtube_clips.md
@@ -7,12 +7,10 @@ description: Learn to create concise YouTube clips from video transcripts with `

This guide demonstrates how to generate concise, informative clips from YouTube video transcripts using the `instructor` library. By leveraging the power of OpenAI's models, we can extract meaningful segments from a video's transcript, which can then be recut into smaller, standalone videos. This process involves identifying key moments within a transcript and summarizing them into clips with specific titles and descriptions.

If you're interested in trying this example using `instructor hub`, you can pull it by running:

First, install the necessary packages:

```bash
pip install youtube_transcript_api instructor rich
instructor hub pull --slug youtube-clips --py > youtube_clips.py
```

![youtube clip streaming](../img/youtube.gif)
@@ -127,3 +125,4 @@ if __name__ == "__main__":
str(youtube_clip.end),
)
console.print(table)
```