Commit 5eab24a

Import 2.14.0
0 parents  commit 5eab24a

128 files changed

+33751
-0
lines changed

BUGS

+993
Large diffs are not rendered by default.

HISTORY

+535
Large diffs are not rendered by default.

PORTING.NOTES

+220
@@ -0,0 +1,220 @@
# @(#)PORTING.NOTES 2.1.8.1

Table of Contents
==================
 1. General Program Structure
 2. Naming Conventions and Variable Usage
 3. Porting Procedures
 4. Compilation Options
 5. Customizing QGEN
 6. Further Enhancements
 7. Known Porting Problems
 8. Reporting Problems

1. General Program Structure

The code provided with the TPC-H and TPC-R benchmarks includes a database
population generator (DBGEN) and a query template translator (QGEN). It
is written in ANSI C, and is meant to be easily portable to a broad variety
of platforms. The program is composed of five source files and some
support and header files. The main modules are:

    build.c:    each table in the database schema is represented by a
                routine mk_XXXX, which populates a structure
                representing one row in table XXXX.
                See Also: dsstypes.h, bm_utils.c, rnd.*
    print.c:    each table in the database schema is represented by a
                routine pr_XXXX, which prints the contents of a
                structure representing one row in table XXXX.
                See Also: dsstypes.h, dss.h
    driver.c:   this module contains the main control functions for
                DBGEN, including command line parsing, distribution
                management, database scaling and the calls to mk_XXXX
                and pr_XXXX for each table generated.
    qgen.c:     this module contains the main control functions for
                QGEN, including query template parsing.
    varsub.c:   each query template includes one or more parameter
                substitution points; this routine handles the
                parameter generation for the TPC-H/TPC-R benchmark.
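
As a rough illustration of the mk_XXXX / pr_XXXX pairing described above,
the sketch below builds and prints one row of an invented table. The names
example_row_t, mk_example and pr_example, the two columns, and the '|'
separator are assumptions made for this sketch; they are not taken from
build.c, print.c or dsstypes.h.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical row structure; the real ones live in dsstypes.h. */
typedef struct {
    long key;
    char name[32];
} example_row_t;

/* mk_ style routine: populate the structure for row number `index`. */
void mk_example(long index, example_row_t *row)
{
    assert(row != NULL);
    row->key = index;
    /* "row-" plus at most 20 characters for a long fits in 32 bytes. */
    sprintf(row->name, "row-%ld", index);
}

/* pr_ style routine: print one populated row; returns fprintf's result. */
int pr_example(FILE *out, const example_row_t *row)
{
    assert(out != NULL && row != NULL);
    return fprintf(out, "%ld|%s|\n", row->key, row->name);
}
```

A driver in the style of driver.c would call mk_example() and then
pr_example() once per row. Keeping generation and output separate means
the output side can be swapped without touching how rows are built.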

The support utilities provide a generalized set of functions for data
generation and include:

    bm_utils.c: data type generators, string management and
                portability routines.

    rnd.*:      a general purpose random number generator used
                throughout the code.

    dss.h:
    shared.h:   a set of '#defines' for limits, formats and fixed
                values
    dsstypes.h: structure definitions for each table definition

2. Naming Conventions and Variable Usage

Since DBGEN will be maintained by a large number of people, it is
particularly important to observe the coding, variable naming and usage
conventions detailed here.

#define
--------
All #define directives are found in header files (*.h). In general,
the header files segregate variables and macros as follows:
    rnd.h -- anything exclusively referenced by rnd.c
    dss.h -- general defines for the benchmark, including *all*
        extern declarations (see below).
    shared.h -- defines related to the tuple definitions in
        dsstypes.h. Isolated to ease automatic processing needed by many
        direct load routines (see below).
    dsstypes.h -- structure definitions and typedef directives to
        detail the contents of each table's tuples.
    config.h -- any porting and configuration related defines should
        go here, to localize the changes necessary to move the suite
        from one machine to another.
    tpcd.h -- defines related to QGEN, rather than DBGEN

extern
------
DBGEN and QGEN make extensive use of extern declarations. This could
probably stand to be changed at some point, but has made the rapid
turnaround of prototypes easier. In order to be sure that each
declaration is matched by exactly one definition per executable,
they are all declared as EXTERN, a macro dependent on DECLARER. In
any module that defines DECLARER, all variables declared EXTERN will
be defined as globals. DECLARER should be defined only in modules
containing a main() routine.
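
A minimal sketch of how an EXTERN macro keyed on DECLARER can work,
collapsed into a single file for illustration. The variable example_scale
and its accessor are invented for this sketch, and the #ifdef layout is an
assumption; the actual macro text should be checked in dss.h.

```c
/* In the real layout this line would appear only in a module with main(). */
#define DECLARER

/* Assumed shape of the header logic; check dss.h for the actual text. */
#ifdef DECLARER
#define EXTERN              /* this module defines the globals */
#else
#define EXTERN extern       /* every other module only declares them */
#endif

/* Hypothetical shared variable, written once in a header. */
EXTERN long example_scale;

/* Any module that includes the header can then use the variable. */
long get_example_scale(void)
{
    return example_scale;
}
```

With this arrangement the same header line yields a definition in the
module that defines DECLARER and a plain extern declaration elsewhere, so
the linker sees one definition per executable.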

Naming Conventions
------------------
defines
    o All defines use upper case
    o All defines use a table prefix, if appropriate:
        O_*     relates to orders table
        L_*     relates to lineitem table
        P_*     relates to part table
        PS_*    relates to partsupplier table
        C_*     relates to customer table
        S_*     relates to supplier table
        N_*     relates to nation table
        R_*     relates to region table
        T_*     relates to time table
    o All defines have a usage suffix, if appropriate:
        *_TAG   environment variable name
        *_DFLT  environment variable default
        *_MAX   upper bound
        *_MIN   lower bound
        *_LEN   average length
        *_SD    random number seed (see rnd.*)
        *_FMT   printf format string
        *_SCL   divisor (for scaled arithmetic)
        *_SIZE  tuple length
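
To show how the prefix and suffix rules combine, here is a small sketch
for an invented column of the orders table. The O_EXAMPLE_* names and
their values are placeholders made up for this illustration; they do not
appear in dss.h or shared.h.

```c
#include <stdio.h>

/* Hypothetical defines: table prefix O_, then the usage suffix. */
#define O_EXAMPLE_MIN   1L          /* lower bound */
#define O_EXAMPLE_MAX   50L         /* upper bound */
#define O_EXAMPLE_FMT   "%ld|"      /* printf format string */

/* Clamp a value into [O_EXAMPLE_MIN, O_EXAMPLE_MAX]. */
long o_example_clamp(long v)
{
    if (v < O_EXAMPLE_MIN) return O_EXAMPLE_MIN;
    if (v > O_EXAMPLE_MAX) return O_EXAMPLE_MAX;
    return v;
}

/* Print a clamped value using the column's format define. */
int o_example_print(FILE *out, long v)
{
    return fprintf(out, O_EXAMPLE_FMT, o_example_clamp(v));
}
```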

3. Porting Procedures

The code provided should be easily portable to any machine providing an
ANSI C compiler.
    -- Copy makefile.suite to makefile
    -- Edit the makefile to match the name of your C compiler
       and to include appropriate compilation options in the CFLAGS
       definition
    -- make.

Special care should be taken in modifying any of the monetary
calculations in DBGEN. These have proven to be particularly sensitive to
portability problems. If you decide to create the routines for inline
data load (see below), be sure to compare the resulting data to that
generated by flat file data generation to be sure that all numeric
conversions are correct.

If the compile generates errors, refer to "Compilation Options", below.
The problem you are encountering may already have been addressed in the
code.

If the compile is successful, but QGEN is not generating the appropriate
query syntax for your environment, refer to "Customizing QGEN", below.

For other problems, refer to "Reporting Problems" at the end of this
document.

4. Compilation Options

config.h and makefile.suite contain a number of compile time options intended
to make the process of porting the code provided with TPC-H/TPC-R as easy as
possible on a broad range of platforms. Most ports should consist of reviewing
the possible settings described in config.h and modifying the makefile
to employ them appropriately.

5. Customizing QGEN

QGEN relies on a number of vendor-specific conventions to generate
appropriate query syntax. These are controlled by #defines in tpcd.h,
and enabled by a #define in config.h. If you find that the syntax
generated by QGEN is not sufficient for your environment, you will need
to modify these two files. It is strongly recommended that you not change
the general organization of the files.

Currently defined options are:

    VTAG            -- marks a variable substitution point [:]
    QDIR_TAG        -- environment variable which points to query templates
                       [DSS_QUERY]
    GEN_QUERY_PLAN  -- syntax to generate a query plan ["Set Explain On;"]
    START_TRAN      -- syntax to begin a transaction ["Begin Work;"]
    END_TRAN        -- syntax to end a transaction ["Commit Work;"]
    SET_OUTPUT      -- syntax to redirect query output ["Output to"]
    SET_ROWCOUNT    -- syntax to set the number of rows returned
                       ["{return %d rows}"]
    SET_DBASE       -- syntax to connect to a database
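
To illustrate how such options might be grouped, the sketch below places
the bracketed values listed above under a single vendor switch and expands
SET_ROWCOUNT. EXAMPLE_VENDOR and set_rowcount_text are invented names, and
the #ifdef layout is an assumption about how tpcd.h and config.h
cooperate, not a copy of either file.

```c
#include <stdio.h>

#define EXAMPLE_VENDOR          /* hypothetical switch; would live in config.h */

#ifdef EXAMPLE_VENDOR           /* hypothetical block; would live in tpcd.h */
#define GEN_QUERY_PLAN  "Set Explain On;"
#define START_TRAN      "Begin Work;"
#define END_TRAN        "Commit Work;"
#define SET_OUTPUT      "Output to"
#define SET_ROWCOUNT    "{return %d rows}"
#endif

/* Expand SET_ROWCOUNT for `rows` into buf; buf must hold at least
 * 32 bytes. Returns the number of characters written. */
int set_rowcount_text(char *buf, int rows)
{
    return sprintf(buf, SET_ROWCOUNT, rows);
}
```

Supporting another environment would then mean adding a second block with
that vendor's syntax and selecting it from config.h, which leaves the
general organization of the files unchanged.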

6. Further Enhancements

load_stub.c provides entry points for two likely enhancements.

The ld_XXXX routines make it possible to load the
database directly from DBGEN without first writing the database
population out to the filesystem. This may prove particularly useful
when loading larger database populations. Be particularly careful about
monetary amounts. To assure portability, all monetary calculations are
done using long integers (which hold money amounts as a number of
pennies). These will need to be scaled to dollars and cents (by dividing
by 100) before the values are presented to the DBMS.
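
One way to do that scaling is sketched below. The function name and
signature are hypothetical, and staying in integer arithmetic (rather than
converting to a floating point type) is a choice made for this sketch so
that the cents are never rounded; a given DBMS interface may call for a
different representation.

```c
#include <stdio.h>

/* Format a penny count as dollars and cents using integer arithmetic
 * only. buf must hold at least 32 bytes. Returns characters written.
 * The name and signature are hypothetical. */
int fmt_money(char *buf, long pennies)
{
    const char *sign = (pennies < 0) ? "-" : "";
    /* Negate in unsigned arithmetic so LONG_MIN does not overflow. */
    unsigned long mag = (pennies < 0)
        ? 0UL - (unsigned long)pennies
        : (unsigned long)pennies;

    return sprintf(buf, "%s%lu.%02lu", sign, mag / 100UL, mag % 100UL);
}
```

For example, fmt_money(buf, 12345L) leaves "123.45" in buf, and
fmt_money(buf, -5L) leaves "-0.05".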

The hd_XXXX routines allow header information to be written before the
creation of the flat files. This should allow systems which require
formatting information in database load files to use DBGEN with only
a small amount of custom code.

qgen.c defines the translation table for query templates in the
routine qsub().

varsub.c defines the parameter substitutions in the routine varsub().

If you are porting DBGEN to a machine that supports a native word
size larger than 32 bits, you may wish to modify the default values for
BITS_PER_LONG and MAX_LONG. These values are used in the generation of
the sparse primary keys in the order and lineitem tables. The code has
been structured to run on any machine supporting a 32 bit long, but
may be slightly more efficient on machines that are able to make use of
a larger native type.
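
Before changing those defaults, it can help to check what the target
compiler actually provides. The sketch below derives comparable values
from <limits.h>. The EXAMPLE_* names are stand-ins invented here; whether
the key generation code accepts values other than its shipped defaults is
not confirmed by this document and should be verified against the source.

```c
#include <limits.h>

/* Hypothetical stand-ins for BITS_PER_LONG and MAX_LONG, derived from
 * the compiler's own limits rather than hard-coded. */
#define EXAMPLE_BITS_PER_LONG   ((int)(sizeof(long) * CHAR_BIT))
#define EXAMPLE_MAX_LONG        LONG_MAX

/* Returns 1 if long is wider than the 32-bit baseline, else 0. */
int example_has_wide_long(void)
{
    return EXAMPLE_BITS_PER_LONG > 32;
}
```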

7. Known Porting Problems

The current codeline will not compile under SunOS 4.1. Solaris 2.4 and later
are supported, and anyone wishing to use DBGEN on a Sun platform is
encouraged to use one of these OS releases.

8. Reporting Problems

The code provided with TPC-H/TPC-R has been written to be easily portable,
and has been tested on a wide variety of platforms. If you have any
trouble porting the code to your platform, please help us to correct
the problem in a later release by sending the following information
to the TPC D subcommittee:

    Computer Make and Model
    Compiler Type and Revision Number
    Brief Description of the problem
    Suggested modification to correct the problem
