Skip to content

Latest commit

 

History

History
283 lines (221 loc) · 7.14 KB

EXTERNAL-SOURCES.md

File metadata and controls

283 lines (221 loc) · 7.14 KB

External data sources

External data sources are an alternative to regular data sources when you want to manage the data yourself.

Deep Intelligence will fetch the data from an API you must provide.

Advantages:

  • No storage cost. The external sources will appear as 0 size to your workspace.
  • Self hosted and self stored data.
  • Fully customizable. You can use any internal storage system and you can restrict the data access of your API.

Disadvantages:

  • You have to provide your own infrastructure.
  • Data updating features are disabled.

How to host an external source

In order to host an external source you need to provide an API with at least these endpoints:

Handshake endpoint

Property Value
Method GET
Path /
Header: x-public-key Source public key (an arbitrary string)
Header: x-secret-key Source secret key (an arbitrary string)
Expected Response The expected response if the authentication details are correct is 200 OK with a JSON body including a property public_key with the same value as the header x-public-key

This endpoint is called when you create the external source. To verify there is an active external source for the provided host.

Example response:

{
  "public_key": "string"
}

Metadata endpoint

Property Value
Method POST
Path /source/metadata
Header: x-public-key Source public key (an arbitrary string)
Header: x-secret-key Source secret key (an arbitrary string)
Expected Response The expected response if the authentication details are correct is 200 OK with a JSON body including the source metadata.

This endpoint is used to get the source metadata.

The metadata consists of the following fields:

  • id - Identifier of the source. Usually the public key.
  • features - Array of features, For each including the index and the type. Available types are: NOMINAL, TEXT, NUMERIC, LOGIC and DATE.
  • count - Amount of rows in the source.

Example response:

{
  "id": "iris",
  "features": [
    {
      "index": 0,
      "type": "NUMERIC"
    },
    {
      "index": 1,
      "type": "NUMERIC"
    },
    {
      "index": 2,
      "type": "NUMERIC"
    },
    {
      "index": 3,
      "type": "NUMERIC"
    },
    {
      "index": 4,
      "type": "NOMINAL"
    }
  ],
  "count": 150
}

Query endpoint

Property Value
Method POST
Path /source/query
Header: x-public-key Source public key (an arbitrary string)
Header: x-secret-key Source secret key (an arbitrary string)
Expected Response The expected response if the authentication details are correct is 200 OK with a JSON body including the source instances.

This endpoint is used to fetch data from the source.

The request body consists of the following fields:

  • filter - Data filter. See Sources filtering.
  • projection - List of features (indexes) to return, split by commas. Example: 0,1,2
  • order - Index of the feature to order by. Set to -1 to use default order.
  • dir - Can be asc (Ascendant) or desc (descendant), for the order direction.
  • skip - Number of rows to skip. Set to -1 to ignore.
  • limit - Max number of rows to return. Set to -1 for no limit.

Example request body:

{
  "filter": {
    "type": "single",
    "operation": "eq",
    "left": 4,
    "right": "setosa"
  },
  "projection": "1,2,4",
  "order": 1,
  "dir": "asc",
  "skip": 0,
  "limit": 150
}

The response has the following fields:

  • features - List of returned features, including the index and the type.
  • instances - Instances in bidimensional matrix format (array of arrays).

Example response:

{
  "features": [
    {
      "index": 0,
      "type": "NUMERIC"
    },
    {
      "index": 1,
      "type": "NUMERIC"
    },
    {
      "index": 2,
      "type": "NUMERIC"
    },
    {
      "index": 3,
      "type": "NUMERIC"
    },
    {
      "index": 4,
      "type": "NOMINAL"
    }
  ],
  "instances": [
    [
      5.1,
      3.5,
      1.4,
      0.2,
      "setosa"
    ],
    [
      4.9,
      3,
      1.4,
      0.2,
      "setosa"
    ]
  ]
}

Count endpoint

Property Value
Method POST
Path /source/count
Header: x-public-key Source public key (an arbitrary string)
Header: x-secret-key Source secret key (an arbitrary string)
Expected Response The expected response if the authentication details are correct is 200 OK with a JSON body including the number of expected instances.

This endpoint is used to count instances.

The request body consists of the following fields:

Example request body:

{
  "filter": {
    "type": "single",
    "operation": "eq",
    "left": 4,
    "right": "setosa"
  }
}

The response is a JSON with the count property.

{
  "count": 50
}

Nominal values enpoint

Property Value
Method POST
Path /source/nominal/values
Header: x-public-key Source public key (an arbitrary string)
Header: x-secret-key Source secret key (an arbitrary string)
Expected Response The expected response if the authentication details are correct is 200 OK with a JSON body including the list of nominal values.

This endpoint is used to fetch a list of available nominal values for nominal features.

The request body consists of the following fields:

  • feature - Nominal feature index.
  • filter - Data filter. See Sources filtering.
  • query - String that the nominal value must start with.

Example request body:

{
  "feature": 4,
  "filter": null,
  "query": ""
}

The response must be an array with the list of available values:

[
  "setosa",
  "versicolor",
  "virginica"
]

Note: You should limit the amount of values returned, for example, to 100, in order to speed up the requests.

Link an external source to Deep Intelligence

In order to link the external source you have to provide an URL containing the host, port, public key and secret key. Example:

https://example.sources.deepint.net/?pub=public-key&secret=secret-key

Update the source

You can change the data internally when using external sources. However, there are 2 concerns:

  • The cache
  • The alerts

In order to invalidate the cache and trigger alerts, you have to use the external source updating endpoint:

Calling this endpoint will invalidate the cache and trigger alerts using the instances you provided in the request.

Note: the endpoint only accepts 100 instances for each request, so if you have more you need to split the instances in multiple requests.

Example implementations