diff --git a/DataSets/AdVec/advec_demo.ipynb b/DataSets/AdVec/advec_demo.ipynb index 7aed8e9..96770cb 100644 --- a/DataSets/AdVec/advec_demo.ipynb +++ b/DataSets/AdVec/advec_demo.ipynb @@ -1,5267 +1,626 @@ { - "cells": [ - { - "cell_type": "markdown", - "source": [ - "![63f78014766fd30436c18a79_Hyperspace - navbar logo.png]()\n", - "\n", - "# Application Semantic Search Using Hyperspace\n", - "\n", - "This notebook demonstrates the use of Hyperspace to perform hybrid search over an App database.\n", - "In addition to hybrid search, the notebook includes examples for classic search and vector search, based on embedding of a user provided query.\n", - "\n", - "The relevent score functions can be downloaded from [Hyperspace git](https://github.com/hyper-space-io/QuickStart/blob/main/DataSets/AdVec/classic_score.py).\n", - "For more info, see the [Hyperspace documentation](https://docs.hyper-space.io/hyperspace-docs/getting-started/overview).\n", - "\n", - "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/hyper-space-io/QuickStart/blob/master/DataSets/AdVec/advec_demo.ipynb)\n", - "# The Dataset\n", - "![AdVec_logo.PNG]()\n", - "\n", - "The dataset includes 89330 documents with the following fields:\n", - "1. **id** [float] - unique identifier per application\n", - "2. **title** [Keyword] - Application name\n", - "3. **bundle_id** [keyword] - identifier of the App bundle, if such exists\n", - "4. **ios** [boolean] - Is the App an IOS App (True) or Android (False)\n", - "5. **categories** [list[keyword]] - list of categories to which the App belongs\n", - "6. **content** [Keyword] - app description as text\n", - "7. **embedded_app** [list[float]] - text embedding of the app description. 
Text was embedded using the Hugging face [bge-small-en model](https://huggingface.co/BAAI/bge-small-en)\n", - "\n", - "The data was taken from [AdVec ML](https://demo.advecml.com/) and the search engine was built in collabortation with [Argmax.io](https://www.linkedin.com/company/argmax/?originalSubdomain=il).\n", - "The data can be downloaded from the following links: [vectors](http://hyperspace-datasets.s3.amazonaws.com/vectors.npy)\n", - ", [metadata](http://hyperspace-datasets.s3.amazonaws.com/context.jsonl)\n", - "\n", - "# Hybrid search with Hyperspace\n", - "This notebook combines brute-force KNN (accurate) with metadata filtering. In this scheme, Hyperspace uses the pre-filtering approach, by which the metadata is first filtered, and KNN is applied only to vectors that pass the initial filtering. With KNN, this approach optimizes the query latency without reducing its recall.\n", - "\n", - "![image.png]()\n" - ], - "metadata": { - "id": "_trhSpIUhamm" - }, - "id": "_trhSpIUhamm" - }, - { - "cell_type": "markdown", - "source": [ - "# Setting up the Hyperspace environment\n", - "Working with Hyperspace requires the followin steps\n", - "\n", - "1. Install the client API\n", - "2. Create data config file\n", - "3. Connect to a server\n", - "4. Create collection\n", - "5. Ingest data\n", - "6. Run query" - ], - "metadata": { - "id": "K41CEp06-JmN" - }, - "id": "K41CEp06-JmN" - }, - { - "cell_type": "markdown", - "source": [ - "## 1. 
Install the client API\n", - "You can install the Hyperspace API directly from Git by executing the following command:" - ], - "metadata": { - "id": "7UVt24r6-Mft" - }, - "id": "7UVt24r6-Mft" - }, - { - "cell_type": "code", - "source": [ - "pip install git+https://github.com/hyper-space-io/hyperspace-py" - ], - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "edxBW-er-Lvi", - "outputId": "0a9be8ca-7bb7-4943-9a7d-5b8e8ed7e89e" - }, - "id": "edxBW-er-Lvi", - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "source": [ - "### Download dataset" - ], - "metadata": { - "collapsed": false - }, - "id": "cd585f6aa21b383a" - }, - { - "cell_type": "code", - "execution_count": null, - "outputs": [], - "source": [ - "from urllib.request import urlretrieve\n", - "import os\n", - "\n", - "def download_data(url, file_name):\n", - " \"\"\"\n", - " url (str): URL of the file to download.\n", - " file_name (str): Local path where the file will be saved.\n", - " \"\"\"\n", - " # Check if the file already exists and is not empty\n", - " if os.path.exists(file_name) and os.path.getsize(file_name) > 0:\n", - " print(f\"The file {file_name} already exists and is not empty.\")\n", - " else:\n", - " try:\n", - " # Attempt to download the file from `url` and save it locally under `file_name`\n", - " urlretrieve(url, file_name)\n", - " # Check if the file was downloaded and is not empty\n", - " if os.path.exists(file_name) and os.path.getsize(file_name) > 0:\n", - " print(f\"Successfully downloaded {file_name}\")\n", - " else:\n", - " print(\"Download failed or file is empty.\")\n", - " \n", - " except Exception as e:\n", - " print(f\"An error occurred: {e}\")\n" - ], - "metadata": { - "collapsed": false - }, - "id": "9384a45a031b2a23" - }, - { - "cell_type": "code", - "execution_count": null, - "outputs": [], - "source": [ - "metadata_url = \"http://hyperspace-datasets.s3.amazonaws.com/context.jsonl\"\n", - "vectors_url = 
\"http://hyperspace-datasets.s3.amazonaws.com/vectors.npy\"\n", - "download_data(metadata_url, \"./context.jsonl\")\n", - "download_data(vectors_url, \"./vectors.npy\")" - ], - "metadata": { - "collapsed": false - }, - "id": "57cc9837ffc666d9" - }, - { - "cell_type": "markdown", - "metadata": { - "id": "TCZSwM6DVeDm" - }, - "source": [ - "## 2. Connect to a server\n", - "\n", - "Once the Hyperspace API is installed, you can access database by creating a local instance of the Hyperspace client. This step requires host address, username and password, provided by Hyperspace" - ], - "id": "TCZSwM6DVeDm" - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "17a978a2" - }, - "outputs": [], - "source": [ - "import hyperspace\n", - "from getpass import getpass\n", - "\n", - "username = \"USERNAME\"\n", - "host = \"HOST_URL\"\n", - "\n", - "hyperspace_client = hyperspace.HyperspaceClientApi(host=host, username=username, password=getpass())\n" - ], - "id": "17a978a2" - }, - { - "cell_type": "markdown", - "metadata": { - "id": "HXmdh3YGVfQV" - }, - "source": [ - "## 3. Create a Data Schema File\n", - "\n", - "As other search databases, Hyper-Space database requires a configuration file that outlines the data schema. Attached below is a config file that corresponds to the fields of the given dataset.\n", - "\n", - "For vector fields, we also provide the index type to be used, and the metric. . Current options for index include \"**brute_force**\", \"**hnsw**\", \"**ivf**\", and \"**bin_ivf**\" for binary vectors, and \"**IP**\" ([inner product](https://en.wikipedia.org/wiki/Inner_product_space)) as a metric for floating point vectors and \"**Hamming**\" ([hamming distance](https://en.wikipedia.org/wiki/Hamming_distance)) for binary vectors.\n", - "Here, we use \"brute_force\" (exact KNN) with inner product." 
- ], - "id": "HXmdh3YGVfQV" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "63ea0ce9-ce9a-45b2-9747-2f0e504c3514", - "metadata": { - "id": "63ea0ce9-ce9a-45b2-9747-2f0e504c3514" - }, - "outputs": [], - "source": [ - "import json\n", - "\n", - "config = {\n", - " \"configuration\": {\n", - " \"id\": {\n", - " \"type\": \"keyword\",\n", - " \"id\": True\n", - " },\n", - " \"title\":{\n", - " \"type\":\"keyword\"\n", - " },\n", - " \"bundle_id\": {\n", - " \"type\":\"keyword\"\n", - " },\n", - " \"ios\":{\n", - " \"type\":\"boolean\"\n", - " },\n", - " \"categories\": {\n", - " \"type\":\"keyword\",\n", - " \"struct_type\":\"list\"\n", - " },\n", - " \"content\": {\n", - " \"type\":\"keyword\"\n", - " },\n", - " \"embedded_app\": {\n", - " \"type\": \"dense_vector\",\n", - " \"dim\": 384,\n", - " \"index_type\": \"brute_force\",\n", - " \"metric\": \"IP\"\n", - " }\n", - " }\n", - "}\n", - "\n", - "with open('advec_config.json', 'w') as f:\n", - " f.write(json.dumps(config, indent=2))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "7RTDkUsr3ead" - }, - "source": [ - "## 4. Create Collection\n", - "The Hyerspace engine stroes data in Collections, where each collecction commonly hosts data of similar context, etc. Each search is then perfomed within a collection. We create a collection using the command \"**create_collection**(schema_filename, collection_name)\"." 
- ], - "id": "7RTDkUsr3ead" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "092053ea-a4e2-4dbb-90c7-4adbfc953384", - "metadata": { - "id": "092053ea-a4e2-4dbb-90c7-4adbfc953384", - "outputId": "e21d42f9-206b-4300-a248-a784b4275d98", - "colab": { - "base_uri": "https://localhost:8080/" - } - }, - "outputs": [], - "source": [ - "collection_name = 'advec'\n", - "if collection_name not in hyperspace_client.collections_info()[\"collections\"]:\n", - " hyperspace_client.create_collection('advec_config.json', collection_name)\n", - "\n", - "hyperspace_client.collections_info()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "lUpgHD2VWFXd" - }, - "source": [ - "# 5. Ingest Data\n", - "\n", - "In the next step, we ingest the dataset in batches. The number documents in each batch can be controlled by the user, and specifically, it can be increased to reduce ingestion time.\n", - "Batches of data are added using the add_batch(batch, collection_name) command" - ], - "id": "lUpgHD2VWFXd" - }, - { - "cell_type": "code", - "source": [ - "import numpy as np\n", - "vectors_path = \"vectors.npy\"\n", - "data_file_path = \"context.jsonl\"\n", - "vecs = np.load(vectors_path)\n", - "with open(data_file_path, encoding='cp437') as metadata_file:\n", - " metadata= [json.loads(row) for row in metadata_file]\n" - ], - "metadata": { - "id": "TQsSPSqeTMXq" - }, - "id": "TQsSPSqeTMXq", - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0904d325-495f-410d-8bc6-f84a30bac7af", - "metadata": { - "scrolled": true, - "id": "0904d325-495f-410d-8bc6-f84a30bac7af", - "outputId": "90378193-b817-404a-dfca-e4a6e6240039", - "colab": { - "base_uri": "https://localhost:8080/" - } - }, - "outputs": [], - "source": [ - "\n", - "BATCH_SIZE = 500\n", - "\n", - "batch = []\n", - "for i, (metadata_row, vec) in enumerate(zip(metadata, vecs)):\n", - " row = {key: value for key, value in metadata_row.items() if key in 
config[\"configuration\"].keys()}\n", - " row['embedded_app'] = np.ndarray.tolist(vec)\n", - " row[\"id\"] = str(i)\n", - " batch.append(row)\n", - "\n", - " if i % BATCH_SIZE == 0:\n", - " response = hyperspace_client.add_batch(batch, collection_name)\n", - " batch.clear()\n", - " print(i, response)\n", - "response = hyperspace_client.add_batch(batch, collection_name)\n", - "batch.clear()\n", - "print(i, response)\n", - "hyperspace_client.commit(collection_name)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8bac72a3-45c7-445e-8f00-e206608969f5", - "metadata": { - "id": "8bac72a3-45c7-445e-8f00-e206608969f5", - "outputId": "06d1ba8f-5437-40ce-ff4a-70065f0b420a", - "colab": { - "base_uri": "https://localhost:8080/" - } - }, - "outputs": [], - "source": [ - "hyperspace_client.collections_info()" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "8lRgTyKwWJ84" - }, - "source": [ - "# 6. Define Logic and Run a Query\n", - "In the last step we build a Hyperspace hybrid search query. We randomly select an App from the database and search for similar applications. The overall score is defined by the weights, provided under the \"boost\" fields. These weights allow to contorl the relative weights of the classic search and vector search scores." 
- ], - "id": "8lRgTyKwWJ84" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5ff8391b-b679-43f7-bf79-00a14aef2981", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "5ff8391b-b679-43f7-bf79-00a14aef2981", - "outputId": "f121c109-81d6-40cc-f9c5-3decab90c9fb" - }, - "outputs": [], - "source": [ - "input_document = hyperspace_client.get_document(collection_name, 42)\n", - "print(input_document)" - ] - }, - { - "cell_type": "markdown", - "source": [ - "## Vector Search" - ], - "metadata": { - "id": "iawRCHKM_8Sj" - }, - "id": "iawRCHKM_8Sj" - }, - { - "cell_type": "markdown", - "source": [ - "Let us first perform a vector search over the embedded description. This step does not require a score function" - ], - "metadata": { - "id": "B_IsUeKYAB5W" - }, - "id": "B_IsUeKYAB5W" - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1ca689dd-b0f9-4ff4-aaea-c041b7135bb9", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "1ca689dd-b0f9-4ff4-aaea-c041b7135bb9", - "outputId": "1873c114-b89a-4697-fd66-a4253433d475" - }, - "outputs": [], - "source": [ - "results = hyperspace_client.search({'params': input_document,\n", - " 'knn' : [{'field': 'embedded_app',\"boost\": 0},\n", - " {'field': 'query',\"boost\": 1}]},\n", - " size=10,\n", - " collection_name=collection_name)\n", - "\n", - "for i,result in enumerate(results['similarity']):\n", - " vector_api_response = hyperspace_client.get_document(document_id=result['document_id'], collection_name=collection_name)\n", - " response = f\"{i+1} - {result['document_id']} : {result['score']} --- \"\n", - " keys_str = \" - \".join([str(vector_api_response[k]) for k in [\"title\",\"bundle_id\",\"categories\"]])\n", - " print(response+keys_str)" - ] - }, - { - "cell_type": "markdown", - "id": "5f29e4ba-6e8b-43f7-97ff-3037b705dff0", - "metadata": { - "jp-MarkdownHeadingCollapsed": true, - "id": "5f29e4ba-6e8b-43f7-97ff-3037b705dff0" - }, - 
"source": [ - "## Classic Search\n", - "We repeat the process with classic search, using pre-defined score function, that can be downloaded from [Hyperspace git](https://github.com/hyper-space-io/QuickStart/blob/main/DataSets/AdVec/classic_score.py)" - ] - }, - { - "cell_type": "code", - "source": [ - "import inspect\n", - "\n", - "def set_score_function(func, collection_name, score_function_name='func'):\n", - " source = inspect.getsource(func)\n", - " with open('sf.py', 'w') as f:\n", - " f.write(source)\n", - " return hyperspace_client.set_function('sf.py', collection_name, score_function_name)\n", - " " - ], - "metadata": { - "id": "G6acmbv1vnNq" - }, - "id": "G6acmbv1vnNq", - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "code", - "source": [ - "def ios_score(params, doc):\n", - " if match(\"bundle_id\"):\n", - " return 0.0\n", - "\n", - " score = 0.0\n", - " if match(\"categories\"):\n", - " score += rarity_sum(\"categories\")\n", - "\n", - " if doc[\"ios\"]:\n", - " score *= 2\n", - "\n", - " return score\n", - "\n", - "print(set_score_function(ios_score, collection_name, score_function_name=\"ios_score\"))" - ], - "metadata": { - "id": "g3PLGAF_t2eH" - }, - "id": "g3PLGAF_t2eH", - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "f5fa75ff-6559-462e-b5fb-60f8cfb268ce", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "f5fa75ff-6559-462e-b5fb-60f8cfb268ce", - "outputId": "ee7a15b5-cd56-4839-fa77-13b0f02d7855" - }, - "outputs": [], - "source": [ - "query_with_knn = {\n", - " 'params': input_document,\n", - " 'knn' : [{'field': 'embedded_app',\"boost\": 0},\n", - " {'field': 'query',\"boost\": 1}]\n", - "}\n", - "\n", - "results = hyperspace_client.search(query_with_knn,\n", - " size=10,\n", - " function_name=\"ios_score\",\n", - " collection_name=collection_name)\n", - "\n", - "for i,result in enumerate(results['similarity']):\n", - " 
vector_api_response = hyperspace_client.get_document(document_id=result['document_id'], collection_name=collection_name)\n", - " response = f\"{i+1} - {result['document_id']} : {result['score']} --- \"\n", - " keys_str = \" - \".join([str(vector_api_response[k]) for k in [\"title\",\"bundle_id\",\"categories\"]])\n", - " print(response+keys_str)" - ] - }, - { - "cell_type": "markdown", - "id": "31b4bd2a-3fb0-432d-9f29-e556fd0cd870", - "metadata": { - "id": "31b4bd2a-3fb0-432d-9f29-e556fd0cd870" - }, - "source": [ - "## Embed User Query and Search\n", - "Let us now create a free text query, embed it using the [bge-small-en model](https://huggingface.co/BAAI/bge-small-en) and retrieve relevant apps" - ] - }, - { - "cell_type": "code", - "source": [ - "try:\n", - " import sentence_transformers\n", - "except ImportError:\n", - " !pip install sentence_transformers" - ], - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "Xxq39DC5CQ1p", - "outputId": "7c90d148-3a00-4ddb-be12-02d926002f06" - }, - "id": "Xxq39DC5CQ1p", - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "dfd1e3b2-3be5-42d2-ab8f-6aa68e708e75", - "metadata": { - "id": "dfd1e3b2-3be5-42d2-ab8f-6aa68e708e75", - "outputId": "af26df99-647e-4b12-e319-fa0f61e30824", - "colab": { - "base_uri": "https://localhost:8080/", - "height": 433, - "referenced_widgets": [ - "edab0fcca6734d49bf178ecd28e84aa2", - "dde169eb76a14027a51f72beec02b7a8", - "a0459f9566a041c78d1d18a67eeefd46", - "ba9de2869272485b89537cc762b4d398", - "e26f297fcfd74e81b06241aac3b49023", - "7970346e5d7547bd8e5e02695ccc5b49", - "68c216897f5d403bb00dd52dc32c6019", - "e9c047d2947e496c93974f5f49a5c172", - "f355a9ec295f4c078163992c73662d28", - "7b6dd89030ff43d08880d3ad3613a7c0", - "ed06c1df20ee40fb98f1cc3b5fcc827d", - "841dfdd7fcef43dab3055b13300c3652", - "37794a9b345c4456bc8faf36942af2dd", - "bfc09d93ac2040b7ae5e46830c00d58c", - "94fc6060c95544549b7b59d1df345819", - 
"cf26414873dc439e9916b5ff228c4bf4", - "ded50ec8861d4d3998547189e1beeab2", - "5361ce35bc6343039fde123a4af9b58d", - "f96975b7c17243778855154d885152f9", - "824ddea4eaea482db26b873e582c8a03", - "aefab71bec87427f82e4dd0285f6dda4", - "8ae88cb01a2c4bdcb45f0a92acb90442", - "ef834f09ab1e4046843ed1733bcbeba9", - "2ebc9188a308434bb256e91e80dc3e2e", - "c696d208a7554ba1a5971e047687721a", - "c157baf6a71c4ca093b638f6e3448d63", - "d5f2c1fc46cb402596d57420efa8ca10", - "175e7704c0b74f9cb122484a73f38401", - "b56f7bf3439e415792086133ab517a26", - "4740498b4bb44f399e67163e4b2728d2", - "93b82ea6ee024af794d3948503f2ac15", - "c72d2cb4b307474db663fc4bdaa9ff9c", - "cc6f8809c55545c39dbf95f94a8c5ac7", - "0fe94447ec1742cbbe755e10e96659fc", - "268ce921f6994617a0320e3afcd35c77", - "5bd260d90fcd449590f659554344d002", - "b205d9cfe8334bb791d16477cf9005d8", - "867f79f4e5ff4067a487456eff17e795", - "02cbe0caf1c94c0eac7130cfd523da03", - "30f9a3ea9c03491091c7eec441219ef6", - "56ab87077873497b9730cd2306757d84", - "fca4944712734053b1f12479895b8a5f", - "4a44360888194d8e95eb2c0a74c97dbf", - "bf692c66cdb3420b99e9699ca70af6a5", - "4bcb60b1268d4e86b00d12c979591ddb", - "1443b51e5f034419a6c645e47e661449", - "bdb2ed9d6a2941bc804a20d1b9246cc8", - "fa44afce87da4a94800d014e4651e19c", - "896cfd5cf3c64be6862c12101b6868d4", - "103082c03c2a4078ad7f18cc75bf8838", - "225b97737ed5415aafb7de51b70c650c", - "c302387a15b44ca195e17d3a814660e5", - "f2fec96988c044d5936726bfad7a8de6", - "bf3aafe4b3194d97bf7529743de0c388", - "de6e42cf341b467d8473f5ca47605a06", - "f62091b77fee4c139a9e74df7416d17b", - "6f12628603954490be891844c2ad525a", - "8bd3f746083244cf860e7e339998a158", - "2c46df6c36ec47ac8b4b68367ff9a842", - "96571e4398b64ae4a440838e3b699711", - "0190d9216db747d2b6c4cf6f7e3deed8", - "be90bb1519884ef6a42ba9ff66706292", - "f2c7b0c095a148918c3d7941e943a584", - "06593db36a3f4d7f82c682001f3b7421", - "7ec550f71fbb4fa98a019c6a19656ad4", - "3a8d7229525e4ba7bc376ac82df5d2d1", - "8c8cd5238b9d49a3b8396fe6ada21a40", - 
"fb4093faaa6349fda9bddb7647d3d846", - "63f117175e5d4342b9f7d1f443874070", - "371bc87227b94fe4a9cf7726e20593c6", - "ecf3daf027734c99a94e91f8990c6955", - "a84dd759931842a1ade720f7358caf3a", - "1adff9ab83bd43c69909e2fe3d31ef9a", - "c4daa6cde32f473d8245827d7dfc2c7c", - "f48249b28c0846b5902d3e5805454df0", - "0c4cf500201c4105b656d5890324fa61", - "40ced2dae396443f86faf07bbd923a90", - "7cd8d8954a0140e2ac192b762669ef5c", - "f5a823df899a4a2e9e68396a0229f259", - "c13d059e56a5455ea6057dfcc4003524", - "e219e9aa16014593abbf1843125130e0", - "8222bd8a6e684f2abc9a4a033ef4c4d4", - "8c4d6a396c00443db3d17afb3b55ff35", - "42eef07ae276418fa02d6885401d5d5a", - "659df989d3c6491bacef23aca65a8e08", - "5740a110796e42ae8ad3230e7876889a", - "5fc767a015fa468ea30d2eb7b24b5a75", - "ea319de9c8254b59ae0d0e9209218e36", - "9bff5ae41d714e31b8472dfb163ff0d4", - "e1df1c91733c4108adee77846d11278e", - "ebe50f5013774c9bbecc704dc9e6b3a1", - "3951e3f34008406a86d69473e8beb32f", - "11ef22c9716d439a96f1896ef05a4470", - "129ab1a2f2d946ca9c70723ce46d0f5b", - "49ea66c4778343608641e04a0cb19637", - "cb56178d93ab43bd9143eb01da0eef1f", - "30d5690e1b014dccba847f2f68abba4f", - "70cfc3b2ddc146dd8bfb595c199ffe8a", - "6ac778f22cdd4e49b78d455e37491b6c", - "95d4316e8e5a4f92aae48d22824df3b9", - "ba49dc57a28c4f319b49211b0669b95a", - "5af787f313364d54a95121512f030510", - "e81b9e3dfd4b49a49a5c42feb5b65f04", - "41ce1cfa2e054d4daf95177682bbcf34", - "dc89470ee36847a78880c997a0b9f63e", - "467c02d8f5f147908729d80de1f51f70", - "514eed6d562344f7866867ff4550dbcb", - "a4a139ce4cb94f91a78839a406e67f31", - "d9b549be0e594b6a8105d77221ca40f0", - "d9306c2d52724a31b8d29a90c123f508", - "a2c457a03fb04ee9b155e65a082e58e5", - "a624d6afa6a2418592026f037c48e253", - "1ee5fb825f114993b575795584f98301", - "b0a2a43e44e047349248b9b94a0f58d0", - "113da0edbfb14ef0b1844020dc814c33", - "34b6b2cf879e4a87a66af87888ccd0b9", - "3c40e4d223b145bcb9ef881d8d5fb268", - "cf9182ed0fbd4e5bac5b5b0badfdc938", - "a6b694a9ea964133bf085967984f1bf7", - 
"99971a53f7194457bbc06e6458b340d6", - "22ddbb4eccf04e50b77e6f64df8f6cee", - "d661e8d0f34241f98be969a3d0853f6e", - "7dbd603ea2e741cc808ba5f0798f70fb", - "42ef47c7184e45649a3f2e0434e9ecc9", - "3d1b70ed00d74d288198320225ef8bd4", - "7ae4b92b9e144702977befd7c0b45f91", - "2bf03c2eeccc4572b8903c92cc8f30d1", - "80b3dec35c2f484fb05ae76518a44428", - "043077e63bed45329811efc1e07aa35c", - "af04eee5a56840b384a1c5dbaa679cab", - "8d25496b978f4df38ca2e5781b5a5472", - "ae2d2c1526fc476b8ddca3b71f520cd9", - "e16139aa130d4b90af89e7b67c5a01ae", - "e86bf22fa4e943f8bc448a284385c531", - "e9166ca449f04fb783e06baa9f8413b1", - "d08e4837a5314677808b8dcb4d67162d", - "c0bd65caef214882a25faa0ae5615c08", - "215d0f33bd8d4692a375e185d0fb4ece", - "7dc550ffd0fe4448aff53c73ed5bdb73", - "bac733d0ec2344f7ab2c1a68dc6210ba", - "f2dda96c9ec84ef4a0880db78ae26206", - "bc13d9867f384020a5ffae6ee51a6c44", - "162b9b0e624445d286fff0e805a7c573" - ] - } - }, - "outputs": [], - "source": [ - "from sentence_transformers import SentenceTransformer\n", - "embbeding_model = SentenceTransformer('BAAI/bge-small-en')" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "outputs": [], - "source": [ - "sim_sentence = \"\"\"a great app for gaming with my friends \"\"\"\n", - "sim_embedding = embbeding_model.encode([sim_sentence], normalize_embeddings=True)[0]\n", - "\n", - "results = hyperspace_client.search({'params': {'embedded_app': sim_embedding.tolist()},\n", - " 'knn' : [{'field': 'embedded_app',\"boost\": 0},\n", - " {'field': 'query',\"boost\": 1}]},\n", - " size=10,\n", - " collection_name=collection_name)\n", - "\n", - "for i,result in enumerate(results['similarity']):\n", - " vector_api_response = hyperspace_client.get_document(document_id=result['document_id'], collection_name=collection_name)\n", - " response = f\"{i+1} - {result['document_id']} : {result['score']} --- \"\n", - " keys_str = \" - \".join([str(vector_api_response[k]) for k in [\"title\",\"bundle_id\",\"categories\"]])\n", - " 
print(response+keys_str)" - ], - "metadata": { - "collapsed": false - }, - "id": "da81d11b390d1231" - }, - { - "cell_type": "markdown", - "id": "1d96433e-80fe-47b2-befb-9197e5ec75cf", - "metadata": { - "id": "1d96433e-80fe-47b2-befb-9197e5ec75cf" - }, - "source": [ - "## Hybrid Search\n", - "In the last step, we perform a hybrid search that combines KNN using the embedded app description with a classic score function." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b07eaf8c-2b68-4d9a-8d15-d5e1c24b2aa7", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/", - "height": 36 - }, - "id": "b07eaf8c-2b68-4d9a-8d15-d5e1c24b2aa7", - "outputId": "95f6925c-bccc-48e9-ee27-cba9ddf84aa6" - }, - "outputs": [], - "source": [ - "input_document = hyperspace_client.get_document(collection_name, 7960)\n", - "input_document['title'] + \"\\n\" + str(input_document['categories'])" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5fff6916-3310-4653-98e9-6ec9dc68c1d1", - "metadata": { - "colab": { - "base_uri": "https://localhost:8080/" - }, - "id": "5fff6916-3310-4653-98e9-6ec9dc68c1d1", - "outputId": "58ffce97-acc2-4320-ca23-fa9c36f2b97a" - }, - "outputs": [], - "source": [ - "results = hyperspace_client.search({'params': input_document},\n", - " size=10,\n", - " function_name=\"ios_score\",\n", - " collection_name=collection_name)\n", - "\n", - "for i,result in enumerate(results['similarity']):\n", - " vector_api_response = hyperspace_client.get_document(document_id=result['document_id'], collection_name=collection_name)\n", - " response = f\"{i+1} - {result['document_id']} : {result['score']} --- \"\n", - " keys_str = \" - \".join([str(vector_api_response[k]) for k in [\"title\",\"bundle_id\",\"categories\"]])\n", - " print(response+keys_str)" - ] - }, - { - "cell_type": "markdown", - "source": [ - "This notebook demonstrated the use of Hyper search for classic, vector and hybrid search. 
For more info, visit us at [Hyper-space.io](https://www.hyper-space.io/)" - ], - "metadata": { - "id": "Sj6caGDZEHBz" - }, - "id": "Sj6caGDZEHBz" - } - ], - "metadata": { - "kernelspec": { - "name": "python3", - "language": "python", - "display_name": "Python 3 (ipykernel)" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.11" - }, - "colab": { - "provenance": [] - }, - "widgets": { - "application/vnd.jupyter.widget-state+json": { - "edab0fcca6734d49bf178ecd28e84aa2": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_dde169eb76a14027a51f72beec02b7a8", - "IPY_MODEL_a0459f9566a041c78d1d18a67eeefd46", - "IPY_MODEL_ba9de2869272485b89537cc762b4d398" + "cells": [ + { + "cell_type": "markdown", + "source": [ + "![63f78014766fd30436c18a79_Hyperspace - navbar logo.png]()\n", + "\n", + "# Application Semantic Search Using Hyperspace\n", + "\n", + "This notebook demonstrates the use of Hyperspace to perform hybrid search over an App database.\n", + "In addition to hybrid search, the notebook includes examples for classic search and vector search, based on embedding of a user provided query.\n", + "\n", + "The relevent score functions can be downloaded from [Hyperspace git](https://github.com/hyper-space-io/QuickStart/blob/main/DataSets/AdVec/classic_score.py).\n", + "For more info, see the [Hyperspace documentation](https://docs.hyper-space.io/hyperspace-docs/getting-started/overview).\n", + 
"\n", + "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/hyper-space-io/QuickStart/blob/master/DataSets/AdVec/advec_demo.ipynb)\n", + "# The Dataset\n", + "![AdVec_logo.PNG]()\n", + "\n", + "The dataset includes 89330 documents with the following fields:\n", + "1. **id** [float] - unique identifier per application\n", + "2. **title** [Keyword] - Application name\n", + "3. **bundle_id** [keyword] - identifier of the App bundle, if such exists\n", + "4. **ios** [boolean] - Is the App an IOS App (True) or Android (False)\n", + "5. **categories** [list[keyword]] - list of categories to which the App belongs\n", + "6. **content** [Keyword] - app description as text\n", + "7. **embedded_app** [list[float]] - text embedding of the app description. Text was embedded using the Hugging face [bge-small-en model](https://huggingface.co/BAAI/bge-small-en)\n", + "\n", + "The data was taken from [AdVec ML](https://demo.advecml.com/) and the search engine was built in collabortation with [Argmax.io](https://www.linkedin.com/company/argmax/?originalSubdomain=il).\n", + "The data can be downloaded from the following links: [vectors](http://hyperspace-datasets.s3.amazonaws.com/vectors.npy)\n", + ", [metadata](http://hyperspace-datasets.s3.amazonaws.com/context.jsonl)\n", + "\n", + "# Hybrid search with Hyperspace\n", + "This notebook combines brute-force KNN (accurate) with metadata filtering. In this scheme, Hyperspace uses the pre-filtering approach, by which the metadata is first filtered, and KNN is applied only to vectors that pass the initial filtering. 
With KNN, this approach optimizes the query latency without reducing its recall.\n", + "\n", + "![image.png]()\n" ], - "layout": "IPY_MODEL_e26f297fcfd74e81b06241aac3b49023" - } - }, - "dde169eb76a14027a51f72beec02b7a8": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_7970346e5d7547bd8e5e02695ccc5b49", - "placeholder": "​", - "style": "IPY_MODEL_68c216897f5d403bb00dd52dc32c6019", - "value": ".gitattributes: 100%" - } - }, - "a0459f9566a041c78d1d18a67eeefd46": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_e9c047d2947e496c93974f5f49a5c172", - "max": 1519, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_f355a9ec295f4c078163992c73662d28", - "value": 1519 - } - }, - "ba9de2869272485b89537cc762b4d398": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": 
"HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_7b6dd89030ff43d08880d3ad3613a7c0", - "placeholder": "​", - "style": "IPY_MODEL_ed06c1df20ee40fb98f1cc3b5fcc827d", - "value": " 1.52k/1.52k [00:00<00:00, 11.8kB/s]" - } - }, - "e26f297fcfd74e81b06241aac3b49023": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "7970346e5d7547bd8e5e02695ccc5b49": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - 
"bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "68c216897f5d403bb00dd52dc32c6019": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "e9c047d2947e496c93974f5f49a5c172": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - 
"grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "f355a9ec295f4c078163992c73662d28": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "7b6dd89030ff43d08880d3ad3613a7c0": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, 
- "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "ed06c1df20ee40fb98f1cc3b5fcc827d": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "841dfdd7fcef43dab3055b13300c3652": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_37794a9b345c4456bc8faf36942af2dd", - "IPY_MODEL_bfc09d93ac2040b7ae5e46830c00d58c", - "IPY_MODEL_94fc6060c95544549b7b59d1df345819" + "metadata": { + "id": "_trhSpIUhamm" + }, + "id": "_trhSpIUhamm" + }, + { + "cell_type": "markdown", + "source": [ + "# Setting up the Hyperspace environment\n", + "Working with Hyperspace requires the following steps:\n", + "\n", + "1. Install the client API\n", + "2. Create data config file\n", + "3. Connect to a server\n", + "4. Create collection\n", + "5. Ingest data\n", + "6. 
Run query" ], - "layout": "IPY_MODEL_cf26414873dc439e9916b5ff228c4bf4" - } - }, - "37794a9b345c4456bc8faf36942af2dd": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_ded50ec8861d4d3998547189e1beeab2", - "placeholder": "​", - "style": "IPY_MODEL_5361ce35bc6343039fde123a4af9b58d", - "value": "1_Pooling/config.json: 100%" - } - }, - "bfc09d93ac2040b7ae5e46830c00d58c": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_f96975b7c17243778855154d885152f9", - "max": 190, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_824ddea4eaea482db26b873e582c8a03", - "value": 190 - } - }, - "94fc6060c95544549b7b59d1df345819": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": 
"IPY_MODEL_aefab71bec87427f82e4dd0285f6dda4", - "placeholder": "​", - "style": "IPY_MODEL_8ae88cb01a2c4bdcb45f0a92acb90442", - "value": " 190/190 [00:00<00:00, 1.81kB/s]" - } - }, - "cf26414873dc439e9916b5ff228c4bf4": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "ded50ec8861d4d3998547189e1beeab2": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - 
"grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "5361ce35bc6343039fde123a4af9b58d": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "f96975b7c17243778855154d885152f9": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - 
"justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "824ddea4eaea482db26b873e582c8a03": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "aefab71bec87427f82e4dd0285f6dda4": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": 
null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "8ae88cb01a2c4bdcb45f0a92acb90442": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "ef834f09ab1e4046843ed1733bcbeba9": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_2ebc9188a308434bb256e91e80dc3e2e", - "IPY_MODEL_c696d208a7554ba1a5971e047687721a", - "IPY_MODEL_c157baf6a71c4ca093b638f6e3448d63" + "metadata": { + "id": "K41CEp06-JmN" + }, + "id": "K41CEp06-JmN" + }, + { + "cell_type": "markdown", + "source": [ + "## 1. 
Install the client API\n", + "You can install the Hyperspace API directly from Git by executing the following command:" ], - "layout": "IPY_MODEL_d5f2c1fc46cb402596d57420efa8ca10" - } - }, - "2ebc9188a308434bb256e91e80dc3e2e": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_175e7704c0b74f9cb122484a73f38401", - "placeholder": "​", - "style": "IPY_MODEL_b56f7bf3439e415792086133ab517a26", - "value": "README.md: 100%" - } - }, - "c696d208a7554ba1a5971e047687721a": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_4740498b4bb44f399e67163e4b2728d2", - "max": 90189, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_93b82ea6ee024af794d3948503f2ac15", - "value": 90189 - } - }, - "c157baf6a71c4ca093b638f6e3448d63": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - 
"_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_c72d2cb4b307474db663fc4bdaa9ff9c", - "placeholder": "​", - "style": "IPY_MODEL_cc6f8809c55545c39dbf95f94a8c5ac7", - "value": " 90.2k/90.2k [00:00<00:00, 1.04MB/s]" - } - }, - "d5f2c1fc46cb402596d57420efa8ca10": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "175e7704c0b74f9cb122484a73f38401": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - 
"border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "b56f7bf3439e415792086133ab517a26": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "4740498b4bb44f399e67163e4b2728d2": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - 
"grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "93b82ea6ee024af794d3948503f2ac15": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "c72d2cb4b307474db663fc4bdaa9ff9c": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, 
- "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "cc6f8809c55545c39dbf95f94a8c5ac7": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "0fe94447ec1742cbbe755e10e96659fc": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_268ce921f6994617a0320e3afcd35c77", - "IPY_MODEL_5bd260d90fcd449590f659554344d002", - "IPY_MODEL_b205d9cfe8334bb791d16477cf9005d8" + "metadata": { + "id": "7UVt24r6-Mft" + }, + "id": "7UVt24r6-Mft" + }, + { + "cell_type": "code", + "source": [ + "pip install git+https://github.com/hyper-space-io/hyperspace-py" ], - "layout": "IPY_MODEL_867f79f4e5ff4067a487456eff17e795" - } - }, - "268ce921f6994617a0320e3afcd35c77": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - 
"description_tooltip": null, - "layout": "IPY_MODEL_02cbe0caf1c94c0eac7130cfd523da03", - "placeholder": "​", - "style": "IPY_MODEL_30f9a3ea9c03491091c7eec441219ef6", - "value": "config.json: 100%" - } - }, - "5bd260d90fcd449590f659554344d002": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_56ab87077873497b9730cd2306757d84", - "max": 684, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_fca4944712734053b1f12479895b8a5f", - "value": 684 - } - }, - "b205d9cfe8334bb791d16477cf9005d8": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_4a44360888194d8e95eb2c0a74c97dbf", - "placeholder": "​", - "style": "IPY_MODEL_bf692c66cdb3420b99e9699ca70af6a5", - "value": " 684/684 [00:00<00:00, 7.28kB/s]" - } - }, - "867f79f4e5ff4067a487456eff17e795": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - 
"_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "02cbe0caf1c94c0eac7130cfd523da03": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - 
"overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "30f9a3ea9c03491091c7eec441219ef6": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "56ab87077873497b9730cd2306757d84": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "fca4944712734053b1f12479895b8a5f": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": 
{ - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "4a44360888194d8e95eb2c0a74c97dbf": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "bf692c66cdb3420b99e9699ca70af6a5": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - 
"description_width": "" - } - }, - "4bcb60b1268d4e86b00d12c979591ddb": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_1443b51e5f034419a6c645e47e661449", - "IPY_MODEL_bdb2ed9d6a2941bc804a20d1b9246cc8", - "IPY_MODEL_fa44afce87da4a94800d014e4651e19c" + "metadata": { + "id": "edxBW-er-Lvi" + }, + "id": "edxBW-er-Lvi", + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "markdown", + "source": [ + "### Download dataset" ], - "layout": "IPY_MODEL_896cfd5cf3c64be6862c12101b6868d4" - } - }, - "1443b51e5f034419a6c645e47e661449": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_103082c03c2a4078ad7f18cc75bf8838", - "placeholder": "​", - "style": "IPY_MODEL_225b97737ed5415aafb7de51b70c650c", - "value": "config_sentence_transformers.json: 100%" - } - }, - "bdb2ed9d6a2941bc804a20d1b9246cc8": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": 
"1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_c302387a15b44ca195e17d3a814660e5", - "max": 124, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_f2fec96988c044d5936726bfad7a8de6", - "value": 124 - } - }, - "fa44afce87da4a94800d014e4651e19c": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_bf3aafe4b3194d97bf7529743de0c388", - "placeholder": "​", - "style": "IPY_MODEL_de6e42cf341b467d8473f5ca47605a06", - "value": " 124/124 [00:00<00:00, 1.02kB/s]" - } - }, - "896cfd5cf3c64be6862c12101b6868d4": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - 
"min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "103082c03c2a4078ad7f18cc75bf8838": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "225b97737ed5415aafb7de51b70c650c": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "c302387a15b44ca195e17d3a814660e5": { - 
"model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "f2fec96988c044d5936726bfad7a8de6": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "bf3aafe4b3194d97bf7529743de0c388": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": 
"@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "de6e42cf341b467d8473f5ca47605a06": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "f62091b77fee4c139a9e74df7416d17b": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_6f12628603954490be891844c2ad525a", - "IPY_MODEL_8bd3f746083244cf860e7e339998a158", - 
"IPY_MODEL_2c46df6c36ec47ac8b4b68367ff9a842" + "metadata": { + "collapsed": false, + "id": "cd585f6aa21b383a" + }, + "id": "cd585f6aa21b383a" + }, + { + "cell_type": "code", + "execution_count": null, + "outputs": [], + "source": [ + "from urllib.request import urlretrieve\n", + "import os\n", + "\n", + "def download_data(url, file_name):\n", + " \"\"\"\n", + " url (str): URL of the file to download.\n", + " file_name (str): Local path where the file will be saved.\n", + " \"\"\"\n", + " # Check if the file already exists and is not empty\n", + " if os.path.exists(file_name) and os.path.getsize(file_name) > 0:\n", + " print(f\"The file {file_name} already exists and is not empty.\")\n", + " else:\n", + " try:\n", + " # Attempt to download the file from `url` and save it locally under `file_name`\n", + " urlretrieve(url, file_name)\n", + " # Check if the file was downloaded and is not empty\n", + " if os.path.exists(file_name) and os.path.getsize(file_name) > 0:\n", + " print(f\"Successfully downloaded {file_name}\")\n", + " else:\n", + " print(\"Download failed or file is empty.\")\n", + "\n", + " except Exception as e:\n", + " print(f\"An error occurred: {e}\")\n" ], - "layout": "IPY_MODEL_96571e4398b64ae4a440838e3b699711" - } - }, - "6f12628603954490be891844c2ad525a": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_0190d9216db747d2b6c4cf6f7e3deed8", - "placeholder": "​", - "style": "IPY_MODEL_be90bb1519884ef6a42ba9ff66706292", - "value": "model.safetensors: 100%" - } - }, - "8bd3f746083244cf860e7e339998a158": { - "model_module": 
"@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_f2c7b0c095a148918c3d7941e943a584", - "max": 133466304, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_06593db36a3f4d7f82c682001f3b7421", - "value": 133466304 - } - }, - "2c46df6c36ec47ac8b4b68367ff9a842": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_7ec550f71fbb4fa98a019c6a19656ad4", - "placeholder": "​", - "style": "IPY_MODEL_3a8d7229525e4ba7bc376ac82df5d2d1", - "value": " 133M/133M [00:02<00:00, 91.8MB/s]" - } - }, - "96571e4398b64ae4a440838e3b699711": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": 
null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "0190d9216db747d2b6c4cf6f7e3deed8": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "be90bb1519884ef6a42ba9ff66706292": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - 
"model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "f2c7b0c095a148918c3d7941e943a584": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "06593db36a3f4d7f82c682001f3b7421": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": 
"StyleView", - "bar_color": null, - "description_width": "" - } - }, - "7ec550f71fbb4fa98a019c6a19656ad4": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "3a8d7229525e4ba7bc376ac82df5d2d1": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "8c8cd5238b9d49a3b8396fe6ada21a40": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": 
"@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_fb4093faaa6349fda9bddb7647d3d846", - "IPY_MODEL_63f117175e5d4342b9f7d1f443874070", - "IPY_MODEL_371bc87227b94fe4a9cf7726e20593c6" + "metadata": { + "id": "9384a45a031b2a23" + }, + "id": "9384a45a031b2a23" + }, + { + "cell_type": "code", + "execution_count": null, + "outputs": [], + "source": [ + "metadata_url = \"http://hyperspace-datasets.s3.amazonaws.com/context.jsonl\"\n", + "vectors_url = \"http://hyperspace-datasets.s3.amazonaws.com/vectors.npy\"\n", + "download_data(metadata_url, \"./context.jsonl\")\n", + "download_data(vectors_url, \"./vectors.npy\")" ], - "layout": "IPY_MODEL_ecf3daf027734c99a94e91f8990c6955" - } - }, - "fb4093faaa6349fda9bddb7647d3d846": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_a84dd759931842a1ade720f7358caf3a", - "placeholder": "​", - "style": "IPY_MODEL_1adff9ab83bd43c69909e2fe3d31ef9a", - "value": "pytorch_model.bin: 100%" - } - }, - "63f117175e5d4342b9f7d1f443874070": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": 
"1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_c4daa6cde32f473d8245827d7dfc2c7c", - "max": 133508397, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_f48249b28c0846b5902d3e5805454df0", - "value": 133508397 - } - }, - "371bc87227b94fe4a9cf7726e20593c6": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_0c4cf500201c4105b656d5890324fa61", - "placeholder": "​", - "style": "IPY_MODEL_40ced2dae396443f86faf07bbd923a90", - "value": " 134M/134M [00:04<00:00, 27.5MB/s]" - } - }, - "ecf3daf027734c99a94e91f8990c6955": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": 
null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "a84dd759931842a1ade720f7358caf3a": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "1adff9ab83bd43c69909e2fe3d31ef9a": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - 
"c4daa6cde32f473d8245827d7dfc2c7c": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "f48249b28c0846b5902d3e5805454df0": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "0c4cf500201c4105b656d5890324fa61": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - 
"_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "40ced2dae396443f86faf07bbd923a90": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "7cd8d8954a0140e2ac192b762669ef5c": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_f5a823df899a4a2e9e68396a0229f259", - "IPY_MODEL_c13d059e56a5455ea6057dfcc4003524", 
- "IPY_MODEL_e219e9aa16014593abbf1843125130e0" + "metadata": { + "id": "57cc9837ffc666d9" + }, + "id": "57cc9837ffc666d9" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TCZSwM6DVeDm" + }, + "source": [ + "## 2. Connect to a server\n", + "\n", + "Once the Hyperspace API is installed, you can access the database by creating a local instance of the Hyperspace client. This step requires a host address, username, and password, provided by Hyperspace." ], - "layout": "IPY_MODEL_8222bd8a6e684f2abc9a4a033ef4c4d4" - } - }, - "f5a823df899a4a2e9e68396a0229f259": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_8c4d6a396c00443db3d17afb3b55ff35", - "placeholder": "​", - "style": "IPY_MODEL_42eef07ae276418fa02d6885401d5d5a", - "value": "sentence_bert_config.json: 100%" - } - }, - "c13d059e56a5455ea6057dfcc4003524": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_659df989d3c6491bacef23aca65a8e08", - "max": 52, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_5740a110796e42ae8ad3230e7876889a", - "value": 52 - } - }, - "e219e9aa16014593abbf1843125130e0": { - "model_module": 
"@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_5fc767a015fa468ea30d2eb7b24b5a75", - "placeholder": "​", - "style": "IPY_MODEL_ea319de9c8254b59ae0d0e9209218e36", - "value": " 52.0/52.0 [00:00<00:00, 1.90kB/s]" - } - }, - "8222bd8a6e684f2abc9a4a033ef4c4d4": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "8c4d6a396c00443db3d17afb3b55ff35": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - 
"model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "42eef07ae276418fa02d6885401d5d5a": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "659df989d3c6491bacef23aca65a8e08": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", 
- "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "5740a110796e42ae8ad3230e7876889a": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "5fc767a015fa468ea30d2eb7b24b5a75": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - 
"grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "ea319de9c8254b59ae0d0e9209218e36": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "9bff5ae41d714e31b8472dfb163ff0d4": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_e1df1c91733c4108adee77846d11278e", - "IPY_MODEL_ebe50f5013774c9bbecc704dc9e6b3a1", - "IPY_MODEL_3951e3f34008406a86d69473e8beb32f" + "id": "TCZSwM6DVeDm" + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "17a978a2" + }, + "outputs": [], + "source": [ + "import hyperspace\n", + "from getpass import getpass\n", + "\n", + "username = \"USERNAME\"\n", + "host = \"HOST_URL\"\n", + "\n", + "hyperspace_client = 
hyperspace.HyperspaceClientApi(host=host, username=username, password=getpass())\n" ], - "layout": "IPY_MODEL_11ef22c9716d439a96f1896ef05a4470" - } - }, - "e1df1c91733c4108adee77846d11278e": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_129ab1a2f2d946ca9c70723ce46d0f5b", - "placeholder": "​", - "style": "IPY_MODEL_49ea66c4778343608641e04a0cb19637", - "value": "special_tokens_map.json: 100%" - } - }, - "ebe50f5013774c9bbecc704dc9e6b3a1": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_cb56178d93ab43bd9143eb01da0eef1f", - "max": 125, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_30d5690e1b014dccba847f2f68abba4f", - "value": 125 - } - }, - "3951e3f34008406a86d69473e8beb32f": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - 
"description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_70cfc3b2ddc146dd8bfb595c199ffe8a", - "placeholder": "​", - "style": "IPY_MODEL_6ac778f22cdd4e49b78d455e37491b6c", - "value": " 125/125 [00:00<00:00, 3.19kB/s]" - } - }, - "11ef22c9716d439a96f1896ef05a4470": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "129ab1a2f2d946ca9c70723ce46d0f5b": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - 
"display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "49ea66c4778343608641e04a0cb19637": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "cb56178d93ab43bd9143eb01da0eef1f": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - 
"grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "30d5690e1b014dccba847f2f68abba4f": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "70cfc3b2ddc146dd8bfb595c199ffe8a": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - 
"order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "6ac778f22cdd4e49b78d455e37491b6c": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "95d4316e8e5a4f92aae48d22824df3b9": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_ba49dc57a28c4f319b49211b0669b95a", - "IPY_MODEL_5af787f313364d54a95121512f030510", - "IPY_MODEL_e81b9e3dfd4b49a49a5c42feb5b65f04" + "id": "17a978a2" + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HXmdh3YGVfQV" + }, + "source": [ + "## 3. Create a Data Schema File\n", + "\n", + "As other search databases, Hyper-Space database requires a configuration file that outlines the data schema. Attached below is a config file that corresponds to the fields of the given dataset.\n", + "\n", + "For vector fields, we also provide the index type to be used, and the metric. . 
Current options for index include \"**brute_force**\", \"**hnsw**\", \"**ivf**\", and \"**bin_ivf**\" for binary vectors, and \"**IP**\" ([inner product](https://en.wikipedia.org/wiki/Inner_product_space)) as a metric for floating point vectors and \"**Hamming**\" ([hamming distance](https://en.wikipedia.org/wiki/Hamming_distance)) for binary vectors.\n", + "Here, we use \"brute_force\" (exact KNN) with inner product." ], - "layout": "IPY_MODEL_41ce1cfa2e054d4daf95177682bbcf34" - } - }, - "ba49dc57a28c4f319b49211b0669b95a": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_dc89470ee36847a78880c997a0b9f63e", - "placeholder": "​", - "style": "IPY_MODEL_467c02d8f5f147908729d80de1f51f70", - "value": "tokenizer.json: 100%" - } - }, - "5af787f313364d54a95121512f030510": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_514eed6d562344f7866867ff4550dbcb", - "max": 711396, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_a4a139ce4cb94f91a78839a406e67f31", - "value": 711396 - } - }, - "e81b9e3dfd4b49a49a5c42feb5b65f04": { - "model_module": "@jupyter-widgets/controls", - "model_name": 
"HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_d9b549be0e594b6a8105d77221ca40f0", - "placeholder": "​", - "style": "IPY_MODEL_d9306c2d52724a31b8d29a90c123f508", - "value": " 711k/711k [00:00<00:00, 11.8MB/s]" - } - }, - "41ce1cfa2e054d4daf95177682bbcf34": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "dc89470ee36847a78880c997a0b9f63e": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - 
"_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "467c02d8f5f147908729d80de1f51f70": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "514eed6d562344f7866867ff4550dbcb": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, 
- "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "a4a139ce4cb94f91a78839a406e67f31": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "d9b549be0e594b6a8105d77221ca40f0": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": 
null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "d9306c2d52724a31b8d29a90c123f508": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "a2c457a03fb04ee9b155e65a082e58e5": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_a624d6afa6a2418592026f037c48e253", - "IPY_MODEL_1ee5fb825f114993b575795584f98301", - "IPY_MODEL_b0a2a43e44e047349248b9b94a0f58d0" + "id": "HXmdh3YGVfQV" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "63ea0ce9-ce9a-45b2-9747-2f0e504c3514", + "metadata": { + "id": "63ea0ce9-ce9a-45b2-9747-2f0e504c3514" + }, + "outputs": [], + "source": [ + "import json\n", + "\n", + "config = {\n", + " \"configuration\": {\n", + " \"id\": {\n", + " \"type\": \"keyword\",\n", + " \"id\": True\n", + " },\n", + " \"title\":{\n", 
+ " \"type\":\"keyword\"\n", + " },\n", + " \"bundle_id\": {\n", + " \"type\":\"keyword\"\n", + " },\n", + " \"ios\":{\n", + " \"type\":\"boolean\"\n", + " },\n", + " \"categories\": {\n", + " \"type\":\"keyword\",\n", + " \"struct_type\":\"list\"\n", + " },\n", + " \"content\": {\n", + " \"type\":\"keyword\"\n", + " },\n", + " \"embedded_app\": {\n", + " \"type\": \"dense_vector\",\n", + " \"dim\": 384,\n", + " \"index_type\": \"brute_force\",\n", + " \"metric\": \"IP\"\n", + " }\n", + " }\n", + "}\n", + "\n", + "with open('advec_config.json', 'w') as f:\n", + " f.write(json.dumps(config, indent=2))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "7RTDkUsr3ead" + }, + "source": [ + "## 4. Create Collection\n", + "The Hyperspace engine stores data in collections, where each collection typically hosts data of a similar context. Each search is then performed within a single collection. We create a collection using the command \"**create_collection**(schema_filename, collection_name)\"."
], - "layout": "IPY_MODEL_113da0edbfb14ef0b1844020dc814c33" - } - }, - "a624d6afa6a2418592026f037c48e253": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_34b6b2cf879e4a87a66af87888ccd0b9", - "placeholder": "​", - "style": "IPY_MODEL_3c40e4d223b145bcb9ef881d8d5fb268", - "value": "tokenizer_config.json: 100%" - } - }, - "1ee5fb825f114993b575795584f98301": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_cf9182ed0fbd4e5bac5b5b0badfdc938", - "max": 366, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_a6b694a9ea964133bf085967984f1bf7", - "value": 366 - } - }, - "b0a2a43e44e047349248b9b94a0f58d0": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": 
"IPY_MODEL_99971a53f7194457bbc06e6458b340d6", - "placeholder": "​", - "style": "IPY_MODEL_22ddbb4eccf04e50b77e6f64df8f6cee", - "value": " 366/366 [00:00<00:00, 11.0kB/s]" - } - }, - "113da0edbfb14ef0b1844020dc814c33": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "34b6b2cf879e4a87a66af87888ccd0b9": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - 
"grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "3c40e4d223b145bcb9ef881d8d5fb268": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "cf9182ed0fbd4e5bac5b5b0badfdc938": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - 
"justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "a6b694a9ea964133bf085967984f1bf7": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "99971a53f7194457bbc06e6458b340d6": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": 
null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "22ddbb4eccf04e50b77e6f64df8f6cee": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "d661e8d0f34241f98be969a3d0853f6e": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_7dbd603ea2e741cc808ba5f0798f70fb", - "IPY_MODEL_42ef47c7184e45649a3f2e0434e9ecc9", - "IPY_MODEL_3d1b70ed00d74d288198320225ef8bd4" + "id": "7RTDkUsr3ead" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "092053ea-a4e2-4dbb-90c7-4adbfc953384", + "metadata": { + "id": "092053ea-a4e2-4dbb-90c7-4adbfc953384" + }, + "outputs": [], + "source": [ + "collection_name = 'advec'\n", + "if collection_name not in hyperspace_client.collections_info()[\"collections\"]:\n", + " hyperspace_client.create_collection('advec_config.json', collection_name)\n", + "\n", + "hyperspace_client.collections_info()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "lUpgHD2VWFXd" + }, + "source": [ + "# 5. Ingest Data\n", + "\n", + "In the next step, we ingest the dataset in batches. 
The number of documents in each batch can be controlled by the user; increasing it can reduce ingestion time.\n", + "Batches of data are added using the add_batch(batch, collection_name) command." ], - "layout": "IPY_MODEL_7ae4b92b9e144702977befd7c0b45f91" - } - }, - "7dbd603ea2e741cc808ba5f0798f70fb": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_2bf03c2eeccc4572b8903c92cc8f30d1", - "placeholder": "​", - "style": "IPY_MODEL_80b3dec35c2f484fb05ae76518a44428", - "value": "vocab.txt: 100%" - } - }, - "42ef47c7184e45649a3f2e0434e9ecc9": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_043077e63bed45329811efc1e07aa35c", - "max": 231508, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_af04eee5a56840b384a1c5dbaa679cab", - "value": 231508 - } - }, - "3d1b70ed00d74d288198320225ef8bd4": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": 
null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_8d25496b978f4df38ca2e5781b5a5472", - "placeholder": "​", - "style": "IPY_MODEL_ae2d2c1526fc476b8ddca3b71f520cd9", - "value": " 232k/232k [00:00<00:00, 4.53MB/s]" - } - }, - "7ae4b92b9e144702977befd7c0b45f91": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "2bf03c2eeccc4572b8903c92cc8f30d1": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": 
"LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "80b3dec35c2f484fb05ae76518a44428": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "043077e63bed45329811efc1e07aa35c": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - 
"grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "af04eee5a56840b384a1c5dbaa679cab": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "8d25496b978f4df38ca2e5781b5a5472": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - 
"max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "ae2d2c1526fc476b8ddca3b71f520cd9": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "e16139aa130d4b90af89e7b67c5a01ae": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HBoxModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HBoxModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HBoxView", - "box_style": "", - "children": [ - "IPY_MODEL_e86bf22fa4e943f8bc448a284385c531", - "IPY_MODEL_e9166ca449f04fb783e06baa9f8413b1", - "IPY_MODEL_d08e4837a5314677808b8dcb4d67162d" + "id": "lUpgHD2VWFXd" + }, + { + "cell_type": "code", + "source": [ + "import numpy as np\n", + "vectors_path = \"vectors.npy\"\n", + "data_file_path = \"context.jsonl\"\n", + "vecs = np.load(vectors_path)\n", + "with open(data_file_path, encoding='cp437') as metadata_file:\n", + " metadata= [json.loads(row) for row in metadata_file]\n" ], - "layout": "IPY_MODEL_c0bd65caef214882a25faa0ae5615c08" - } - }, - "e86bf22fa4e943f8bc448a284385c531": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - 
"_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_215d0f33bd8d4692a375e185d0fb4ece", - "placeholder": "​", - "style": "IPY_MODEL_7dc550ffd0fe4448aff53c73ed5bdb73", - "value": "modules.json: 100%" - } - }, - "e9166ca449f04fb783e06baa9f8413b1": { - "model_module": "@jupyter-widgets/controls", - "model_name": "FloatProgressModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "FloatProgressModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "ProgressView", - "bar_style": "success", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_bac733d0ec2344f7ab2c1a68dc6210ba", - "max": 349, - "min": 0, - "orientation": "horizontal", - "style": "IPY_MODEL_f2dda96c9ec84ef4a0880db78ae26206", - "value": 349 - } - }, - "d08e4837a5314677808b8dcb4d67162d": { - "model_module": "@jupyter-widgets/controls", - "model_name": "HTMLModel", - "model_module_version": "1.5.0", - "state": { - "_dom_classes": [], - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "HTMLModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/controls", - "_view_module_version": "1.5.0", - "_view_name": "HTMLView", - "description": "", - "description_tooltip": null, - "layout": "IPY_MODEL_bc13d9867f384020a5ffae6ee51a6c44", - "placeholder": "​", - "style": "IPY_MODEL_162b9b0e624445d286fff0e805a7c573", - "value": " 349/349 [00:00<00:00, 11.3kB/s]" - } - }, - "c0bd65caef214882a25faa0ae5615c08": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - 
"model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "215d0f33bd8d4692a375e185d0fb4ece": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, 
- "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "7dc550ffd0fe4448aff53c73ed5bdb73": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } - }, - "bac733d0ec2344f7ab2c1a68dc6210ba": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": 
null, - "right": null, - "top": null, - "visibility": null, - "width": null - } - }, - "f2dda96c9ec84ef4a0880db78ae26206": { - "model_module": "@jupyter-widgets/controls", - "model_name": "ProgressStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "ProgressStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "bar_color": null, - "description_width": "" - } - }, - "bc13d9867f384020a5ffae6ee51a6c44": { - "model_module": "@jupyter-widgets/base", - "model_name": "LayoutModel", - "model_module_version": "1.2.0", - "state": { - "_model_module": "@jupyter-widgets/base", - "_model_module_version": "1.2.0", - "_model_name": "LayoutModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "LayoutView", - "align_content": null, - "align_items": null, - "align_self": null, - "border": null, - "bottom": null, - "display": null, - "flex": null, - "flex_flow": null, - "grid_area": null, - "grid_auto_columns": null, - "grid_auto_flow": null, - "grid_auto_rows": null, - "grid_column": null, - "grid_gap": null, - "grid_row": null, - "grid_template_areas": null, - "grid_template_columns": null, - "grid_template_rows": null, - "height": null, - "justify_content": null, - "justify_items": null, - "left": null, - "margin": null, - "max_height": null, - "max_width": null, - "min_height": null, - "min_width": null, - "object_fit": null, - "object_position": null, - "order": null, - "overflow": null, - "overflow_x": null, - "overflow_y": null, - "padding": null, - "right": null, - "top": null, - "visibility": null, - "width": null - } + "metadata": { + "id": "TQsSPSqeTMXq" + }, + "id": "TQsSPSqeTMXq", + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "execution_count": null, + "id": 
"0904d325-495f-410d-8bc6-f84a30bac7af", + "metadata": { + "scrolled": true, + "id": "0904d325-495f-410d-8bc6-f84a30bac7af" + }, + "outputs": [], + "source": [ + "\n", + "BATCH_SIZE = 500\n", + "\n", + "batch = []\n", + "for i, (metadata_row, vec) in enumerate(zip(metadata, vecs)):\n", + " row = {key: value for key, value in metadata_row.items() if key in config[\"configuration\"].keys()}\n", + " row['embedded_app'] = np.ndarray.tolist(vec)\n", + " row[\"id\"] = str(i)\n", + " batch.append(row)\n", + "\n", + " if i % BATCH_SIZE == 0:\n", + " response = hyperspace_client.add_batch(batch, collection_name)\n", + " batch.clear()\n", + " print(i, response)\n", + "response = hyperspace_client.add_batch(batch, collection_name)\n", + "batch.clear()\n", + "print(i, response)\n", + "hyperspace_client.commit(collection_name)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8bac72a3-45c7-445e-8f00-e206608969f5", + "metadata": { + "id": "8bac72a3-45c7-445e-8f00-e206608969f5" + }, + "outputs": [], + "source": [ + "hyperspace_client.collections_info()" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8lRgTyKwWJ84" + }, + "source": [ + "# 6. Define Logic and Run a Query\n", + "In the last step we build a Hyperspace hybrid search query. We randomly select an App from the database and search for similar applications. The overall score is defined by the weights, provided under the \"boost\" fields. These weights allow to contorl the relative weights of the classic search and vector search scores." 
+ ], + "id": "8lRgTyKwWJ84" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5ff8391b-b679-43f7-bf79-00a14aef2981", + "metadata": { + "id": "5ff8391b-b679-43f7-bf79-00a14aef2981" + }, + "outputs": [], + "source": [ + "input_document = hyperspace_client.get_document(collection_name, 42)\n", + "print(input_document)" + ] + }, + { + "cell_type": "markdown", + "source": [ + "## Vector Search" + ], + "metadata": { + "id": "iawRCHKM_8Sj" + }, + "id": "iawRCHKM_8Sj" + }, + { + "cell_type": "markdown", + "source": [ + "Let us first perform a vector search over the embedded description. This step does not require a score function." + ], + "metadata": { + "id": "B_IsUeKYAB5W" + }, + "id": "B_IsUeKYAB5W" + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1ca689dd-b0f9-4ff4-aaea-c041b7135bb9", + "metadata": { + "id": "1ca689dd-b0f9-4ff4-aaea-c041b7135bb9" + }, + "outputs": [], + "source": [ + "results = hyperspace_client.search({'params': input_document,\n", + " 'knn' : [{'field': 'embedded_app',\"boost\": 0},\n", + " {'field': 'query',\"boost\": 1}]},\n", + " size=10,\n", + " collection_name=collection_name)\n", + "\n", + "for i,result in enumerate(results['similarity']):\n", + " vector_api_response = hyperspace_client.get_document(document_id=result['document_id'], collection_name=collection_name)\n", + " response = f\"{i+1} - {result['document_id']} : {result['score']} --- \"\n", + " keys_str = \" - \".join([str(vector_api_response[k]) for k in [\"title\",\"bundle_id\",\"categories\"]])\n", + " print(response+keys_str)" + ] + }, + { + "cell_type": "markdown", + "id": "5f29e4ba-6e8b-43f7-97ff-3037b705dff0", + "metadata": { + "jp-MarkdownHeadingCollapsed": true, + "id": "5f29e4ba-6e8b-43f7-97ff-3037b705dff0" + }, + "source": [ + "## Classic Search\n", + "We repeat the process with classic search, using a predefined score function that can be downloaded from [Hyperspace 
git](https://github.com/hyper-space-io/QuickStart/blob/main/DataSets/AdVec/classic_score.py)" + ] + }, + { + "cell_type": "code", + "source": [ + "import inspect\n", + "\n", + "def set_score_function(func, collection_name, score_function_name='func'):\n", + " source = inspect.getsource(func)\n", + " with open('sf.py', 'w') as f:\n", + " f.write(source)\n", + " return hyperspace_client.set_function('sf.py', collection_name, score_function_name)\n", + "" + ], + "metadata": { + "id": "G6acmbv1vnNq" + }, + "id": "G6acmbv1vnNq", + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "source": [ + "def ios_score(params, doc):\n", + " if match(\"bundle_id\"):\n", + " return 0.0\n", + "\n", + " score = 0.0\n", + " if match(\"categories\"):\n", + " score += rarity_sum(\"categories\")\n", + "\n", + " if doc[\"ios\"]:\n", + " score *= 2\n", + "\n", + " return score\n", + "\n", + "set_score_function(ios_score, collection_name, \"ios_score\")" + ], + "metadata": { + "id": "g3PLGAF_t2eH" + }, + "id": "g3PLGAF_t2eH", + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f5fa75ff-6559-462e-b5fb-60f8cfb268ce", + "metadata": { + "id": "f5fa75ff-6559-462e-b5fb-60f8cfb268ce" + }, + "outputs": [], + "source": [ + "query_with_knn = {\n", + " 'params': input_document,\n", + " 'knn' : [{'field': 'embedded_app',\"boost\": 0},\n", + " {'field': 'query',\"boost\": 1}]\n", + "}\n", + "\n", + "results = hyperspace_client.search(query_with_knn,\n", + " size=10,\n", + " function_name=\"ios_score\",\n", + " collection_name=collection_name)\n", + "\n", + "for i,result in enumerate(results['similarity']):\n", + " vector_api_response = hyperspace_client.get_document(document_id=result['document_id'], collection_name=collection_name)\n", + " response = f\"{i+1} - {result['document_id']} : {result['score']} --- \"\n", + " keys_str = \" - \".join([str(vector_api_response[k]) for k in 
[\"title\",\"bundle_id\",\"categories\"]])\n", + " print(response+keys_str)" + ] + }, + { + "cell_type": "markdown", + "id": "31b4bd2a-3fb0-432d-9f29-e556fd0cd870", + "metadata": { + "id": "31b4bd2a-3fb0-432d-9f29-e556fd0cd870" + }, + "source": [ + "## Embed User Query and Search\n", + "Let us now create a free text query, embed it using the [bge-small-en model](https://huggingface.co/BAAI/bge-small-en) and retrieve relevant apps" + ] + }, + { + "cell_type": "code", + "source": [ + "!pip install sentence_transformers" + ], + "metadata": { + "id": "gupjaQ57dt0j" + }, + "id": "gupjaQ57dt0j", + "execution_count": null, + "outputs": [] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "dfd1e3b2-3be5-42d2-ab8f-6aa68e708e75", + "metadata": { + "id": "dfd1e3b2-3be5-42d2-ab8f-6aa68e708e75" + }, + "outputs": [], + "source": [ + "import sentence_transformers\n", + "\n", + "from sentence_transformers import SentenceTransformer\n", + "embbeding_model = SentenceTransformer('BAAI/bge-small-en')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "outputs": [], + "source": [ + "sim_sentence = \"\"\"a great app for gaming with my friends \"\"\"\n", + "sim_embedding = embbeding_model.encode([sim_sentence], normalize_embeddings=True)[0]\n", + "\n", + "results = hyperspace_client.search({'params': {'embedded_app': sim_embedding.tolist()},\n", + " 'knn' : [{'field': 'embedded_app',\"boost\": 0},\n", + " {'field': 'query',\"boost\": 1}]},\n", + " size=10,\n", + " collection_name=collection_name)\n", + "\n", + "for i,result in enumerate(results['similarity']):\n", + " vector_api_response = hyperspace_client.get_document(document_id=result['document_id'], collection_name=collection_name)\n", + " response = f\"{i+1} - {result['document_id']} : {result['score']} --- \"\n", + " keys_str = \" - \".join([str(vector_api_response[k]) for k in [\"title\",\"bundle_id\",\"categories\"]])\n", + " print(response+keys_str)" + ], + "metadata": { + "id": 
"da81d11b390d1231" + }, + "id": "da81d11b390d1231" + }, + { + "cell_type": "markdown", + "id": "1d96433e-80fe-47b2-befb-9197e5ec75cf", + "metadata": { + "id": "1d96433e-80fe-47b2-befb-9197e5ec75cf" + }, + "source": [ + "## Hybrid Search\n", + "In the last step, we perform a hybrid search that combines KNN using the embedded app description with a classic score function." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b07eaf8c-2b68-4d9a-8d15-d5e1c24b2aa7", + "metadata": { + "id": "b07eaf8c-2b68-4d9a-8d15-d5e1c24b2aa7" + }, + "outputs": [], + "source": [ + "input_document = hyperspace_client.get_document(collection_name, 7960)\n", + "input_document['title'] + \"\\n\" + str(input_document['categories'])" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5fff6916-3310-4653-98e9-6ec9dc68c1d1", + "metadata": { + "id": "5fff6916-3310-4653-98e9-6ec9dc68c1d1" + }, + "outputs": [], + "source": [ + "results = hyperspace_client.search({'params': input_document},\n", + " size=10,\n", + " function_name=\"ios_score\",\n", + " collection_name=collection_name)\n", + "\n", + "for i,result in enumerate(results['similarity']):\n", + " vector_api_response = hyperspace_client.get_document(document_id=result['document_id'], collection_name=collection_name)\n", + " response = f\"{i+1} - {result['document_id']} : {result['score']} --- \"\n", + " keys_str = \" - \".join([str(vector_api_response[k]) for k in [\"title\",\"bundle_id\",\"categories\"]])\n", + " print(response+keys_str)" + ] + }, + { + "cell_type": "markdown", + "source": [ + "This notebook demonstrated the use of Hyper search for classic, vector and hybrid search. 
For more info, visit us at [Hyper-space.io](https://www.hyper-space.io/)" + ], + "metadata": { + "id": "Sj6caGDZEHBz" + }, + "id": "Sj6caGDZEHBz" + } + ], + "metadata": { + "kernelspec": { + "name": "python3", + "language": "python", + "display_name": "Python 3 (ipykernel)" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.11" }, - "162b9b0e624445d286fff0e805a7c573": { - "model_module": "@jupyter-widgets/controls", - "model_name": "DescriptionStyleModel", - "model_module_version": "1.5.0", - "state": { - "_model_module": "@jupyter-widgets/controls", - "_model_module_version": "1.5.0", - "_model_name": "DescriptionStyleModel", - "_view_count": null, - "_view_module": "@jupyter-widgets/base", - "_view_module_version": "1.2.0", - "_view_name": "StyleView", - "description_width": "" - } + "colab": { + "provenance": [] } - } - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} + }, + "nbformat": 4, + "nbformat_minor": 5 +} \ No newline at end of file
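
Review note on the ingestion step ("# 5. Ingest Data"): the batching pattern can be sketched independently of the client. The snippet below is a minimal sketch, not the notebook's code; `batched`, `ingest`, and the `send` callback are hypothetical names introduced here, and `send` only stands in for a call such as `hyperspace_client.add_batch(batch, collection_name)`. It yields full batches of `batch_size` rows and flushes a shorter remainder once at the end, so no empty batch is ever sent.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(rows: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: List[T] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []  # start a fresh list so yielded batches are not mutated
    if batch:  # flush the final, possibly shorter, batch
        yield batch


def ingest(rows: Iterable[T], batch_size: int, send: Callable[[List[T]], object]) -> int:
    """Pass every batch to `send` and return the number of rows sent."""
    total = 0
    for batch in batched(rows, batch_size):
        send(batch)  # hypothetical stand-in for the client's batch upload
        total += len(batch)
    return total


# Usage with a list standing in for the remote collection.
sent_batches = []
rows = [{"id": str(i)} for i in range(1201)]
total_sent = ingest(rows, 500, sent_batches.append)
print(total_sent, [len(b) for b in sent_batches])
```

In the notebook, `send` would presumably wrap `hyperspace_client.add_batch(batch, collection_name)`, with `hyperspace_client.commit(collection_name)` called once after the loop, as in the ingestion cell.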