The FLEDGE Key/Value server sends real-time signals to buyers and sellers during a FLEDGE auction. The server reads its data from files in a cloud file storage service. This doc explains the expected file format and the steps for common data loading operations:
- For simple testing, you can use the sample data generator provided.
- To integrate with your own data source, you will need to write C++ code. Use the sample data generator as a reference.
- The data generation part is a general process that applies to all cloud providers, but the uploading instructions are for AWS only.
Data is consumed as delta files. When newer data is read, it overwrites the existing key-value pair, if one already exists.
Delta file names must conform to the regular expression "DELTA_\d{16}", that is, the prefix DELTA_ followed by 16 digits. See public/constants.h for the most up-to-date format.
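The naming rule can be checked before a file is uploaded. The following is a minimal shell sketch, not part of the repository. The function names delta_file_name and is_valid_delta_name are hypothetical, and deriving the 16-digit number from the current time in microseconds is an assumption made only for illustration; public/constants.h remains the authority on the format.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the FLEDGE Key/Value server repository.

# Build a delta file name from a non-negative decimal number
# (written without leading zeros), zero-padded to 16 digits.
delta_file_name() {
  printf 'DELTA_%016d\n' "$1"
}

# Succeed only if the whole name is DELTA_ followed by exactly 16 digits.
is_valid_delta_name() {
  printf '%s\n' "$1" | grep -Eq '^DELTA_[0-9]{16}$'
}

# Example: use seconds since the Unix epoch scaled to microseconds.
# This is an assumption for illustration; the value currently has 16 digits.
name=$(delta_file_name "$(( $(date +%s) * 1000000 ))")
if is_valid_delta_name "$name"; then
  echo "valid: $name"
fi
```

A name such as DELTA_001 fails this check because it has only 3 digits after the prefix.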
A tool is available to generate sample data in Riegeli format. From the repo base directory, run:
> ./tools/serving_data_generator/generate_test_riegeli_data
Confirm that the sample data file riegeli_data has been generated.
The server watches an S3 bucket for new files. You provide the bucket name in the Terraform config, and the name is globally unique.
You can upload the sample data to S3 with the AWS CLI or through the UI.
> S3_BUCKET="[[YOUR_BUCKET]]"
> aws s3 cp riegeli_data s3://${S3_BUCKET}/DELTA_0000000000000001
Caution: The filename must start with the DELTA_ prefix, followed by a 16-digit number.
Confirm that the file is present in the S3 bucket.
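One way to do the upload and the check from the command line is sketched below. The function name upload_delta is hypothetical. It only wraps the aws s3 cp and aws s3 ls commands, and it assumes the AWS CLI is installed and configured with access to the bucket.

```shell
#!/bin/sh
# Hypothetical helper: upload a local data file under a valid delta name,
# then list the object to confirm that it is present.
# Usage: upload_delta LOCAL_FILE BUCKET DELTA_NAME
upload_delta() {
  local_file=$1
  bucket=$2
  name=$3

  # The generated data file must exist and be non-empty.
  if [ ! -s "$local_file" ]; then
    echo "missing or empty file: $local_file" >&2
    return 1
  fi

  # Reject names that are not DELTA_ followed by exactly 16 digits.
  if ! printf '%s\n' "$name" | grep -Eq '^DELTA_[0-9]{16}$'; then
    echo "invalid delta file name: $name" >&2
    return 1
  fi

  aws s3 cp "$local_file" "s3://${bucket}/${name}" || return 1
  # Lists the uploaded object (date, size, key) if it is in the bucket.
  aws s3 ls "s3://${bucket}/${name}"
}

# Example call, commented out so that sourcing this file has no side effects:
# upload_delta riegeli_data "$S3_BUCKET" DELTA_0000000000000001
```

Validating the name before the copy avoids uploading a file that does not match the expected naming pattern.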
Currently, real data can only be generated with C++. The file format is Riegeli, and each data record is in Flatbuffers format.
To generate the data:
- Use the public/data/records_utils.h library to create the Flatbuffers record.
- Use Riegeli libraries to write the records to a file. The file must contain metadata in the format of public/data_loading/riegeli_metadata.proto. The code from tools/serving_data_generator/test_serving_data_generator.cc can be used as a good reference on how to write the file.
AWS provides libraries to communicate with S3, such as the C++ SDK. As soon as a file is uploaded to a watched bucket, the server reads it, provided that it has a higher logical commit timestamp.