Release TickerTick stock news dataset 2023-11-23 · hczhu/TickerTick-API

Use the following link to download the dataset: Dataset Download Link
The dataset contains close to 8 million news stories. The dataset file stores one news story per line as a JSON object, in reverse chronological order. An example news story, prettified as multi-line JSON, is shown below:

{
  "title": "Europe gives Meta, TikTok six days to share information on response to Israel-Hamas conflict",
  "url": "https://www.cnbc.com/2023/10/19/israel-hamas-eu-gives-meta-tiktok-six-days-to-provide-information.html",
  "unix_timestamp": 1697727889,
  "id": "3341850707742811898",
  "tickers_direct": [
    "meta",
    "fb"
  ],
  "tickers_indirect": [
    ".bytedance"
  ],
  "description": "The EU said it would like Meta and TikTok to hand over information on how they're tackling misinformation about the Israel-Hamas war."
}

The fields of each JSON object are explained below. Most of them have the same semantics as the corresponding fields in the TickerTick API response.

Field name	Meaning	Optional field? (If yes, this field can be missing)
title	The title of this news story	No
url	The original URL for the full news story	No
unix_timestamp	The UNIX timestamp when the news was reported	No
id	A unique string ID of this news story	No
description	A short description of this news story	Yes
tickers_direct	The tickers that the news story is directly about, e.g., the name of the company for the ticker is mentioned	Yes
tickers_indirect	The tickers that the news story is indirectly about, e.g., the CEO or a product of the company for this ticker is mentioned	Yes
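
As a rough illustration of the table above, the following Python sketch parses one line of the dataset file and fills in defaults for the optional fields. The function name `parse_story` and the output key `reported_at` are hypothetical, chosen only for this example. The sample line is abridged from the example story above, with `description` left out to exercise the default. The sketch also assumes that `unix_timestamp` is in seconds, which is consistent with the example value.

```python
import json
from datetime import datetime, timezone


def parse_story(line):
    """Parse one JSON line into a dict, with defaults for the optional fields.

    `parse_story` is a hypothetical helper, not part of the TickerTick API.
    """
    story = json.loads(line)
    return {
        # Required fields: a KeyError here means the line is malformed.
        "id": story["id"],
        "title": story["title"],
        "url": story["url"],
        # Assumption: the timestamp is in seconds since the UNIX epoch.
        "reported_at": datetime.fromtimestamp(story["unix_timestamp"], tz=timezone.utc),
        # Optional fields: may be missing from the JSON object.
        "description": story.get("description", ""),
        "tickers_direct": story.get("tickers_direct", []),
        "tickers_indirect": story.get("tickers_indirect", []),
    }


# One line of the file, abridged from the example story (no description).
sample_line = (
    '{"title": "Europe gives Meta, TikTok six days to share information on '
    'response to Israel-Hamas conflict", '
    '"url": "https://www.cnbc.com/2023/10/19/israel-hamas-eu-gives-meta-tiktok-'
    'six-days-to-provide-information.html", '
    '"unix_timestamp": 1697727889, "id": "3341850707742811898", '
    '"tickers_direct": ["meta", "fb"], "tickers_indirect": [".bytedance"]}'
)

story = parse_story(sample_line)
print(story["reported_at"].date(), story["tickers_direct"])
```

Because the file holds one story per line, the same helper can be applied to each line of an open file handle in turn, without loading all of the roughly 8 million stories into memory at once.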

Note that many well-known pre-IPO startups (e.g., Bytedance, the parent company of TikTok) have made-up tickers like .bytedance and .databricks.
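
A minimal sketch of separating such made-up tickers from the rest, assuming that every made-up ticker starts with a dot, as the two examples above do. The source does not state this rule explicitly, so treat it as an assumption. The function name `split_tickers` is hypothetical.

```python
def split_tickers(tickers):
    """Split tickers into (other, made_up) lists.

    Assumption: a made-up ticker is one that starts with ".",
    like ".bytedance" and ".databricks".
    """
    other = [t for t in tickers if not t.startswith(".")]
    made_up = [t for t in tickers if t.startswith(".")]
    return other, made_up


other, made_up = split_tickers(["meta", "fb", ".bytedance"])
print(other, made_up)
```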

