diff --git a/README.md b/README.md
index 3d24434..360d487 100644
--- a/README.md
+++ b/README.md
@@ -140,13 +140,16 @@ logging.info(ci.configuration.parameters[SOME_PARAMETER])
 
 ## Processing input tables - Manifest vs I/O mapping
 
-Input and output tables specified by user are listed in the [configuration file](/extend/common-interface/config-file/).
-Apart from that, all input tables provided by user also include manifest file with additional metadata.
+All input tables provided by user include manifest file with additional metadata.
 
 Tables and their manifest files are represented by the `keboola.component.dao.TableDefinition` object and may be loaded
 using the convenience method `get_input_tables_definitions()`. The result object contains all metadata about the table,
 such as manifest file representations, system path and name.
 
+Apart from that, input and output tables specified by user are listed in
+the [configuration file](/extend/common-interface/config-file/).
+In most cases it is not recommended to use this option, as it is not compatible with Processors and component chaining.
+
 ### Manifest & input folder content
 
 ```python
@@ -167,7 +170,10 @@ logging.info(f'The first table has following columns defined in the manifest {fi
 
 ```
 
-### Using I/O mapping
+### (Alternative) Using I/O mapping
+
+**NOTE** This is a legacy option, it is recommended to use the manifest files instead. This is useful only in cases when
+you need to access the User configuration of the Input or Output mapping. E.g. in transformation components.
 
 ```python
 import csv
@@ -193,7 +199,7 @@ for table in tables:
     outDestination = ci.configuration.tables_output_mapping[j]['destination']
 ```
 
-## I/O table manifests and processing results
+## Output Table Manifests and storing tabular results
 
 The component may define
 output [manifest files](https://developers.keboola.com/extend/common-interface/manifest-files/#dataouttables-manifests)
@@ -222,7 +228,8 @@ from keboola.component import dao
 ci = CommonInterface()
 
 # create container for the result
-result_table = ci.create_out_table_definition('my_new_result_table', primary_key=['id'], incremental=True)
+result_table = ci.create_out_table_definition('my_new_result_table', primary_key=['id'], incremental=True,
+                                              write_always=False)
 
 # write some content
 with open(result_table.full_path, 'w') as result:
@@ -240,51 +247,72 @@ result_table.table_metadata.add_column_data_type('id', dao.SupportedDataTypes.ST
 ci.write_manifest(result_table)
 ```
 
-### Get input table by name
+### Retrieve raw manifest file definition (CommonInterface compatible)
+
+To retrieve the manifest file representation that is compliant with Keboola Connection Common Interface use
+the `table_def.get_manifest_dictionary()` method.
 
 ```python
-from keboola.component import CommonInterface
+from keboola.component import dao, CommonInterface
 
 # init the interface
 ci = CommonInterface()
-table_def = ci.get_input_table_definition_by_name('input.csv')
+table_def = ci.create_out_table_definition('test.csv')
+
+# get the manifest file representation
+manifest_dict = table_def.get_manifest_dictionary()
 
 ```
 
-### Initializing TableDefinition object from the manifest file
+## Input Table Manifests and working with input tables
 
-```python
-from keboola.component import dao
+All input tables can be retrieved via the `get_input_tables_definitions()` method. The result is a list of
+`keboola.component.dao.TableDefinition` objects that contain all metadata about the table, such as manifest file
+representations, system path and name. In some cases the input tables may not have a manifest file, in such cases the
+`TableDefinition` object is initialized with default values.
 
-table_def = dao.TableDefinition.build_from_manifest('data/in/tables/table.csv.manifest')
+**NOTE:**
 
-# print table.csv full-path if present:
+The input table manifests are different from the output table manifests. The input table manifests are
+automatically generated by the Keboola Connection. The output table manifests are created by the component or a user.
 
-print(table_def.full_path)
+The `CommonInterface.write_manifest` method can be used to write the input table manifest into the out stage,
+however the `stage` attribute needs to be changed explicitly to `out`.
+Some of the Input Manifest specific attributes are not supported and will be ignored when storing on out stage.
 
-# rows count
+## Get all input tables
 
-print(table_def.rows_count)
-```
+All input tables can be retrieved via the `get_input_tables_definitions()` method. The result is a list of
+`keboola.component.dao.TableDefinition` objects that contain all metadata about the table, such as manifest file
+representations, system path and name. In some cases the input tables may not have a manifest file, in such cases the
+`TableDefinition` object is initialized with default values.
 
-### Retrieve raw manifest file definition (CommonInterface compatible)
+```python
+from keboola.component import CommonInterface
 
-To retrieve the manifest file representation that is compliant with Keboola Connection Common Interface use
-the `table_def.get_manifest_dictionary()` method.
+# init the interface
+ci = CommonInterface()
+input_tables = ci.get_input_tables_definitions()
 
-```python
-from keboola.component import dao
+for table in input_tables:
+    print(table.full_path)
 
-table_def = dao.TableDefinition.build_from_manifest('data/in/tables/table.csv.manifest')
+```
 
-# get the manifest file representation
-manifest_dict = table_def.get_manifest_dictionary()
+### Get input table by name
+
+```python
+from keboola.component import CommonInterface
+
+# init the interface
+ci = CommonInterface()
+table_def = ci.get_input_table_definition_by_name('input.csv')
 
 ```
 
-## Processing input files
+## Input File Manifests and working with input files
 
-Similarly as tables, files and their manifest files are represented by the `keboola.component.dao.FileDefinition` object
+Similarly to tables, files and their manifest files are represented by the `keboola.component.dao.FileDefinition` object
 and may be loaded using the convenience method `get_input_files_definitions()`. The result object contains all metadata
 about the file, such as manifest file representations, system path and name.
 
@@ -333,7 +361,33 @@ logging.info(input_files_by_name['image.jpg'])
 
 ```
 
-## Processing state files
+## Output File Manifests and storing files
+
+The component may store results also into the [File Storage](https://help.keboola.com/storage/files/).
+The library provides methods that simplifies the manifest file creation and allows defining the export options
+and metadata of the result file using helper object `FileDefinition`.
+
+## Storing files
+
+```python
+from keboola.component import CommonInterface
+
+# init the interface
+ci = CommonInterface()
+
+# create metadata container for the result. This file will be stored temporarily with tags 'my_tag' and 'my_tag2'
+result_file = ci.create_out_file_definition('my_new_result_file.dat', tags=['my_tag', 'my_tag2'], is_public=False,
+                                            is_permanent=False)
+
+with open(result_file.full_path, 'w+') as result:
+    result.write('something')
+
+ci.write_manifest(result_file)
+
+
+```
+
+## Component state and State Files
 
 [State files](https://developers.keboola.com/extend/common-interface/config-file/#state-file) can be easily written and
 loaded using the `get_state_file()` and `write_state_file()` methods: