|
| 1 | +# @(#)PORTING.NOTES 2.1.8.1 |
| 2 | + |
| 3 | +Table of Contents |
| 4 | +================== |
| 5 | +1. General Program Structure |
| 6 | +2. Naming Conventions and Variable Usage |
| 7 | +3. Porting Procedures |
| 8 | +4. Compilation Options |
| 9 | +5. Customizing QGEN |
| 10 | +6. Further Enhancements |
| 11 | +7. Known Porting Problems |
| 12 | +8. Reporting Problems |
| 13 | + |
| 14 | +1. General Program Structure |
| 15 | + |
| 16 | +The code provided with TPC-H and TPC-R benchmarks includes a database |
| 17 | +population generator (DBGEN) and a query template translator (QGEN). It |
| 18 | +is written in ANSI-C, and is meant to be easily portable to a broad variety |
| 19 | +of platforms. The program is composed of five source files and some |
| 20 | +support and header files. The main modules are: |
| 21 | + |
| 22 | + build.c: each table in the database schema is represented by a |
| 23 | + routine mk_XXXX, which populates a structure |
| 24 | + representing one row in table XXXX. |
| 25 | + See Also: dsstypes.h, bm_utils.c, rnd.* |
| 26 | + print.c: each table in the database schema is represented by a |
| 27 | + routine pr_XXXX, which prints the contents of a |
| 28 | + structure representing one row in table XXXX. |
| 29 | + See Also: dsstypes.h, dss.h |
| 30 | + driver.c: this module contains the main control functions for |
| 31 | + DBGEN, including command line parsing, distribution |
| 32 | + management, database scaling and the calls to mk_XXXX |
| 33 | + and pr_XXXX for each table generated. |
| 34 | + qgen.c: this module contains the main control functions for |
| 35 | + QGEN, including query template parsing. |
| 36 | + varsub.c: each query template includes one or more parameter |
| 37 | + substitution points; this routine handles the |
| 38 | + parameter generation for the TPC-H/TPC-R benchmark. |
| 39 | + |
| 40 | +The support utilities provide a generalized set of functions for data |
| 41 | +generation and include: |
| 42 | + |
| 43 | + bm_utils.c: data type generators, string management and |
| 44 | + portability routines. |
| 45 | + |
| 46 | + rnd.*: a general purpose random number generator used |
| 47 | + throughout the code. |
| 48 | + |
| 49 | + dss.h: |
| 50 | + shared.h: a set of '#defines' for limits, formats and fixed |
| 51 | + values |
| 52 | + dsstypes.h: structure definitions for each table definition |
| 53 | + |
| 54 | +2. Naming Conventions and Variable Usage |
| 55 | + |
| 56 | +Since DBGEN will be maintained by a large number of people, it is |
| 57 | +particularly important to observe the coding, variable naming and usage |
| 58 | +conventions detailed here. |
| 59 | + |
| 60 | + #define |
| 61 | + -------- |
| 62 | + All #define directives are found in header files (*.h). In general, |
| 63 | + the header files segregate variables and macros as follows: |
| 64 | + rnd.h -- anything exclusively referenced by rnd.c |
| 65 | + dss.h -- general defines for the benchmark, including *all* |
| 66 | + extern declarations (see below). |
| 67 | + shared.h -- defines related to the tuple definitions in |
| 68 | + dsstypes.h. Isolated to ease automatic processing needed by many |
| 69 | + direct load routines (see below). |
| 70 | + dsstypes.h -- structure definitions and typedef directives to |
| 71 | + detail the contents of each table's tuples. |
| 72 | + config.h -- any porting and configuration related defines should |
| 73 | + go here, to localize the changes necessary to move the suite |
| 74 | + from one machine to another. |
| 75 | + tpcd.h -- defines related to QGEN, rather than DBGEN |
| 76 | + |
| 77 | + extern |
| 78 | + ------ |
| 79 | + DBGEN and QGEN make extensive use of extern declarations. This could |
| 80 | + probably stand to be changed at some point, but has made the rapid |
| 81 | + turnaround of prototypes easier. In order to be sure that each |
| 82 | + declaration was matched by exactly one definition per executable, |
| 83 | + they are all declared as EXTERN, a macro dependent on DECLARER. In |
| 84 | + any module that defines DECLARER, all variables declared EXTERN will |
| 85 | + be defined as globals. DECLARER should be defined only in modules |
| 86 | + containing a main() routine. |
| 87 | + |
| 88 | + Naming Conventions |
| 89 | + ------------------ |
| 90 | + defines |
| 91 | + o All defines use upper case |
| 92 | + o All defines use a table prefix, if appropriate: |
| 93 | + O_* relates to orders table |
| 94 | + L_* relates to lineitem table |
| 95 | + P_* relates to part table |
| 96 | + PS_* relates to partsupplier table |
| 97 | + C_* relates to customer table |
| 98 | + S_* relates to supplier table |
| 99 | + N_* relates to nation table |
| 100 | + R_* relates to region table |
| 101 | + T_* relates to time table |
| 102 | + o All defines have a usage suffix, if appropriate: |
| 103 | + *_TAG environment variable name |
| 104 | + *_DFLT environment variable default |
| 105 | + *_MAX upper bound |
| 106 | + *_MIN lower bound |
| 107 | + *_LEN average length |
| 108 | + *_SD random number seed (see rnd.*) |
| 109 | + *_FMT printf format string |
| 110 | + *_SCL divisor (for scaled arithmetic) |
| 111 | + *_SIZE tuple length |
| 112 | + |
| 113 | +3. Porting Procedures |
| 114 | + |
| 115 | +The code provided should be easily portable to any machine providing an |
| 116 | +ANSI C compiler. |
| 117 | + -- Copy makefile.suite to makefile |
| 118 | + -- Edit the makefile to match the name of your C compiler |
| 119 | + and to include appropriate compilation options in the CFLAGS |
| 120 | + definition |
| 121 | + -- make. |
| 122 | + |
| 123 | +Special care should be taken in modifying any of the monetary |
| 124 | +calculations in DBGEN. These have proven to be particularly sensitive to |
| 125 | +portability problems. If you decide to create the routines for inline |
| 126 | +data load (see below), be sure to compare the resulting data to that |
| 127 | +generated by flat file data generation to be sure that all numeric |
| 128 | +conversions are correct. |
| 129 | + |
| 130 | +If the compile generates errors, refer to "Compilation Options", below. |
| 131 | +The problem you are encountering may already have been addressed in the |
| 132 | +code. |
| 133 | + |
| 134 | +If the compile is successful, but QGEN is not generating the appropriate |
| 135 | +query syntax for your environment, refer to "Customizing QGEN", below. |
| 136 | + |
| 137 | +For other problems, refer to "Reporting Problems" at the end of this |
| 138 | +document. |
| 139 | + |
| 140 | +4. Compilation Options |
| 141 | + |
| 142 | +config.h and makefile.suite contain a number of compile-time options intended |
| 143 | +to make the process of porting the code provided with TPC-H/TPC-R as easy as |
| 144 | +possible on a broad range of platforms. Most ports should consist of reviewing |
| 145 | +the possible settings described in config.h and modifying the makefile |
| 146 | +to employ them appropriately. |
| 147 | + |
| 148 | +5. Customizing QGEN |
| 149 | + |
| 150 | +QGEN relies on a number of vendor-specific conventions to generate |
| 151 | +appropriate query syntax. These are controlled by #defines in tpcd.h, |
| 152 | +and enabled by a #define in config.h. If you find that the syntax |
| 153 | +generated by QGEN is not sufficient for your environment you will need |
| 154 | +to modify these two files. It is strongly recommended that you not change |
| 155 | +the general organization of the files. |
| 156 | + |
| 157 | +Currently defined options are: |
| 158 | + |
| 159 | +VTAG -- marks a variable substitution point [:] |
| 160 | +QDIR_TAG -- environment variable which points to query templates |
| 161 | + [DSS_QUERY] |
| 162 | +GEN_QUERY_PLAN -- syntax to generate a query plan ["Set Explain On;"] |
| 163 | +START_TRAN -- syntax to begin a transaction ["Begin Work;"] |
| 164 | +END_TRAN -- syntax to end a transaction ["Commit Work;"] |
| 165 | +SET_OUTPUT -- syntax to redirect query output ["Output to"] |
| 166 | +SET_ROWCOUNT -- syntax to set the number of rows returned |
| 167 | + ["{return %d rows}"] |
| 168 | +SET_DBASE -- syntax to connect to a database |
| 169 | + |
| 170 | +6. Further Enhancements |
| 171 | + |
| 172 | +load_stub.c provides entry points for two likely enhancements. |
| 173 | + |
| 174 | +The ld_XXXX routines make it possible to load the |
| 175 | +database directly from DBGEN without first writing the database |
| 176 | +population out to the filesystem. This may prove particularly useful |
| 177 | +when loading larger database populations. Be particularly careful about |
| 178 | +monetary amounts. To assure portability, all monetary calculations are |
| 179 | +done using long integers (which hold money amounts as a number of |
| 180 | +pennies). These will need to be scaled to dollars and cents (by dividing |
| 181 | +by 100), before the values are presented to the DBMS. |
| 182 | + |
| 183 | +The hd_XXXX routines allow header information to be written before the |
| 184 | +creation of the flat files. This should allow systems that require |
| 185 | +formatting information in database load files to use DBGEN with only |
| 186 | +a small amount of custom code. |
| 187 | + |
| 188 | +qgen.c defines the translation table for query templates in the |
| 189 | +routine qsub(). |
| 190 | + |
| 191 | +varsub.c defines the parameter substitutions in the routine varsub(). |
| 192 | + |
| 193 | +If you are porting DBGEN to a machine that supports a native word |
| 194 | +size larger than 32 bits, you may wish to modify the default values for |
| 195 | +BITS_PER_LONG and MAX_LONG. These values are used in the generation of |
| 196 | +the sparse primary keys in the order and lineitem tables. The code has |
| 197 | +been structured to run on any machine supporting a 32 bit long, but |
| 198 | +may be slightly more efficient on machines that are able to make use of |
| 199 | +a larger native type. |
| 200 | + |
| 201 | +7. Known Porting Problems |
| 202 | + |
| 203 | +The current codeline will not compile under SunOS 4.1. Solaris 2.4 and later |
| 204 | +are supported, and anyone wishing to use DBGEN on a Sun platform is |
| 205 | +encouraged to use one of these OS releases. |
| 206 | + |
| 207 | + |
| 208 | +8. Reporting Problems |
| 209 | + |
| 210 | +The code provided with TPC-H/TPC-R has been written to be easily portable, |
| 211 | +and has been tested on a wide variety of platforms. If you have any |
| 212 | +trouble porting the code to your platform, please help us to correct |
| 213 | +the problem in a later release by sending the following information |
| 214 | +to the TPC D subcommittee: |
| 215 | + |
| 216 | + Computer Make and Model |
| 217 | + Compiler Type and Revision Number |
| 218 | + Brief Description of the problem |
| 219 | + Suggested modification to correct the problem |
| 220 | + |
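Section 6 notes that monetary amounts are held as long integers counting pennies and must be scaled to dollars and cents before being presented to the DBMS. The following is a minimal sketch of one way a direct-load routine might do that scaling without floating point. The name fmt_money is hypothetical and is not part of DBGEN, and the use of snprintf assumes a C99 library rather than strict ANSI C.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of DBGEN): format an amount held as a
 * long count of pennies as "dollars.cents" using only integer
 * arithmetic, so no floating point rounding is involved.
 * Returns the value returned by snprintf().
 */
int fmt_money(char *buf, size_t len, long pennies)
{
    /* Emit the sign separately so amounts such as -5 print as -0.05. */
    const char *sign = (pennies < 0) ? "-" : "";
    long dollars = labs(pennies / 100);   /* whole dollars */
    long cents = labs(pennies % 100);     /* remaining pennies, 0..99 */

    return snprintf(buf, len, "%s%ld.%02ld", sign, dollars, cents);
}
```

Comparing strings produced this way against the corresponding columns of the flat files is one way to carry out the cross-check recommended in section 3.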