forked from ggobi/qtbase
NOTES
====================================================
R Design
====================================================
General organization: modules contain classes, which contain methods
Modules: A module (i.e. all of Qt) roughly corresponds to an R
namespace. A package is required to define a namespace, which would
make sense here. While this would be quite natural for user-defined
classes, the smoke-based modules are most easily provided dynamically.
We could try populating the namespace with the classes at build-time,
but it would be difficult. We would need to link to a shared object to
get the class names and then use exportPattern() in NAMESPACE. Then
there is the documentation, which is not yet build-time dynamic. Even
if it were, we would need to find the entire Qt documentation (for
conversion).
Of course, at some point we would want documentation in R (and it
would be nice to avoid the use of Qt$ for retrieving every
class). Perhaps as a separate package?
To keep everything dynamic, we provide a special smoke module object
(e.g. "Qt") that supports '$' access. How is such an object
represented in R? Extptr would be the obvious first choice, but the
symbol would need to be replaced or populated somehow at load
time. This is tricky, because all objects are duplicated upon
export. Could change the pointer address, or make it an environment,
instead. But what goes into the environment? A single external
pointer, or populate it with the class objects (directly or lazily)?
The most efficient would be to lazily accumulate the class
objects. Simplest would be to fully populate the environment and, if
done natively, this would be fast. Of course, the environment should
be locked after it is populated. The advantage over using an external
pointer is that everything is stored in a well-supported R object.
One interesting feature of an environment-based Smoke module is the
ability to attach() it, though this would not work in the package
context. Not that it would help much.
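A minimal sketch of the "fully populate, then lock" option discussed above. The class names, the make_class() helper and the 'Demo' module are hypothetical stand-ins; a real module would obtain its class list from Smoke rather than from a hard-coded vector.

```r
# Hypothetical class names; a real module would query Smoke for these.
class_names <- c("QWidget", "QLabel")

# A class object extends 'function' so that it can be called as a constructor.
make_class <- function(name) {
  force(name)  # avoid lazy evaluation picking up the last loop value
  ctor <- function(...) structure(list(args = list(...)), class = name)
  structure(ctor, class = c("DemoClass", "function"))
}

# Fully populate the module environment, then lock it.
Demo <- new.env()
for (nm in class_names) assign(nm, make_class(nm), envir = Demo)
lockEnvironment(Demo, bindings = TRUE)

w <- Demo$QWidget()  # '$' access on the module, then a constructor call
```

Once locked, the environment rejects new bindings, so the module contents cannot be changed by accident.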
Class objects: extracted using $ from a module object. These should
extend 'function' so that they can be invoked as constructors.
Methods: These are the messages that can be received by a particular
class of object. Some are static (sent to the class), while others are
sent to specific instances. Unlike classes, methods must be keyed by
signature, not only by name. But the user will not want to specify the
signature; this should be automatically resolved from the method name
and argument types. The latter are not known until invocation, so the
method wrapper, retrieved by name, needs to perform the selection.
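A rough sketch of a wrapper that is retrieved by name and selects the overload at invocation time. The overload table and make_wrapper() are hypothetical, and matching on the first element of class() is a simplifying assumption; the real selection would score candidates natively.

```r
# Hypothetical overloads of one method name, keyed by argument types.
overloads <- list(
  list(signature = "character", impl = function(text) paste("text:", text)),
  list(signature = c("integer", "integer"), impl = function(w, h) w * h)
)

# Build the wrapper retrieved by name; selection happens when it is called,
# because the argument types are not known any earlier.
make_wrapper <- function(overloads) {
  function(...) {
    args <- unname(list(...))
    types <- vapply(args, function(a) class(a)[1], character(1))
    for (o in overloads) {
      if (identical(types, o$signature)) return(do.call(o$impl, args))
    }
    stop("no overload matches argument types: ", paste(types, collapse = ", "))
  }
}

set <- make_wrapper(overloads)
set("hello")   # selects the (character) overload
set(2L, 3L)    # selects the (integer, integer) overload
```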
Given the functional nature of R, it may be natural to represent a
method as a function, where say the first argument represents the
class object or instance. This gives rise to familiar behavior, like:
> lapply(widgets, Qt$QWidget$show)
This leads quickly to questioning why the first argument is special,
and then one arrives at multiple dispatch systems like S4. Multiple
dispatch interferes with bundling, and thus conflicts with the design
of C++ and thus Qt. With single dispatch, it seems redundant to
specify the initial 'self' parameter, as the message has already been
delivered to a specific instance. The parameters then constitute the
message contents. Here is a compromise:
> lapply(widgets, qinvoke, "show")
This uses a function, qinvoke, to send the message of type "show" to
each widget in the list. This might be preferable to the former
approach, since calling '$' is more explicitly leveraging named state,
which is not a purely functional concept. Explicit 'self' can come in
handy, however, to implement a method with an existing ordinary
function. In this case, though, delegation is probably sufficient. In
short, we take Ruby over Python.
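To make the compromise concrete, here is a small sketch in which instances are plain environments and qinvoke() looks the message up by name. The new_widget() constructor and this qinvoke() are illustrative assumptions, not the actual implementation.

```r
# Hypothetical instance: an environment holding state and its methods.
new_widget <- function(title) {
  self <- new.env()
  self$title <- title
  self$visible <- FALSE
  self$show <- function() {
    self$visible <- TRUE
    invisible(self)
  }
  self
}

# Send the message 'method' to 'obj'; the remaining arguments are its contents.
qinvoke <- function(obj, method, ...) {
  fun <- get(method, envir = obj, mode = "function", inherits = FALSE)
  fun(...)
}

widgets <- list(new_widget("a"), new_widget("b"))
invisible(lapply(widgets, qinvoke, "show"))
```

The method itself never receives an explicit 'self' argument; the instance is implied by where the message was delivered.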
How then is the named state stored? The natural way is an
environment. We would prefer not to create a static environment for
every class, though. Rather, the environment for a class (or rather
two environments, to separate static and instance methods) should be
populated upon first retrieval of the class from the Smoke
module. This suggests that the module environment needs to be
dynamically bound, using makeActiveBinding(), to avoid any overhead.
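A sketch of lazy population with makeActiveBinding(). The cache, the build counter and the structure of the class record are assumptions for illustration; only makeActiveBinding() itself is base R.

```r
module <- new.env()  # what the user sees, e.g. via module$QWidget
cache  <- new.env()  # class records built so far
stats  <- new.env()  # bookkeeping, only to show the laziness
stats$built <- 0L

# Returns the zero-argument function that backs one active binding.
lazy_class <- function(name) {
  force(name)
  function() {
    if (!exists(name, envir = cache, inherits = FALSE)) {
      stats$built <- stats$built + 1L
      # Two environments per class: static and instance methods.
      record <- list(name = name, static = new.env(), instance = new.env())
      assign(name, record, envir = cache)
    }
    get(name, envir = cache, inherits = FALSE)
  }
}

for (nm in c("QWidget", "QLabel")) makeActiveBinding(nm, lazy_class(nm), module)

first  <- module$QWidget  # builds the class record
second <- module$QWidget  # served from the cache
```

Nothing is built for QLabel until it is first retrieved, so unused classes cost only a binding.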
Instance methods: Use the $ syntax on the instance.
Static methods: Use $ syntax on a class object.
Global functions: Use $ syntax on a module object. Unlikely to be
useful, but could be stored with the class objects.
(In-)Out parameters: RGtk2 handles this through basic lists. If a
function has an out parameter, a list is returned with the actual
return value and the out parameters. The first challenge is
identifying an 'out' parameter. This is difficult. Non-const reference
parameters are probably in-out. Pointers are difficult to distinguish
from arrays. The second challenge is method overload resolution:
arguments cannot be omitted, as they would be for an 'out' parameter,
if we are to select a method. We could have the user pass a special
type of parameter, that indicates the type of an out parameter or, for
in-out parameters, the input value. The function 'qout()' marks an
out parameter and 'qinout()' an in-out parameter.
str <- "broken"
v <- QValidator()
fixed <- v$fixup(qinout(str))
str <- fixed$str
For in-out parameters, we do not need any help selecting a method, but
this still offers benefits. It makes it obvious that we are obtaining
a value by reference parameter. C# does the same thing with its 'ref'
and 'out' keywords. It also makes it easy to ignore uninteresting
return parameters, i.e.:
text <- clipboard$text(subtype) # ignore change in subtype
# worry about subtype
ret <- clipboard$text(qinout(subtype))
subtype <- ret$subtype
text <- ret$text
If the return value of the method is void, do we just assume that the
marked out parameter is the desired return value?
str <- v$fixup(qinout(str))
This seems reasonable.
What about pure out parameters?
ns <- QXmlNamespaceSupport()
names <- ns$splitName("ns:element", qout(""), qout(""))
prefix <- names$prefix
This is similar to the .C() contract.
We'll hold off on out parameters until we need them.
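For reference, a sketch of how the qinout() marker could behave for a void method, where the marked parameter becomes the return value. The fixup() function here is a hypothetical stand-in for a native method, and trimming whitespace stands in for the by-reference modification.

```r
# Marker for an in-out argument; it simply wraps the input value.
qinout <- function(value) structure(list(value = value), class = "qinout")

# Hypothetical stand-in for a void native method that modifies its
# argument by reference.
fixup <- function(input) {
  marked <- inherits(input, "qinout")
  value <- if (marked) input$value else input
  fixed <- trimws(value)  # the "by reference" modification
  # Void method: return the modified value only if the caller asked for it.
  if (marked) fixed else invisible(NULL)
}

str <- "  broken  "
str <- fixup(qinout(str))
```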
Signals:
Emitting a signal: just like an instance method
Connecting a signal handler: qconnect() or some fancy syntax?
Accessing properties: Should be consistent with accessor patterns
in R. One idea is to use the element extraction operator:
> val <- obj$prop
> obj$prop <- val
However, vector element extraction/replacement is somewhat different
from setting fields on an object. The closest thing to a 'field' in R
is the S4 slot. Slots are internally accessed via '@' but externally
via accessors of the form:
> val <- prop(obj)
> prop(obj) <- val
Unfortunately, it is technically too difficult to support this syntax
dynamically. Both of the above rely on the replacement function
'<-'. The underlying logic is that the left hand side is being
replaced by the modified copy of the original object. With mutable
objects, however, this becomes unnecessary, and we can use the
C++/Java-like set syntax:
> obj$prop()
> obj$setProp(val)
This means a bit more typing, but might make it clear that we are
dealing with mutable objects. Then again, since our objects are
environments, the '$<-' syntax is perfectly consistent with R.
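A sketch of the '$<-' variant on an environment, using an active binding so that reads and writes go through one accessor. The 'state' environment stands in for the native object and the property name is made up.

```r
obj   <- new.env()
state <- new.env()  # stand-in for the native object holding the property
state$title <- "untitled"

# One function serves as getter (no argument) and setter (one argument).
makeActiveBinding("title", function(value) {
  if (missing(value)) {
    state$title
  } else {
    state$title <- as.character(value)
  }
}, obj)

obj$title <- "Main"  # calls the setter
val <- obj$title     # calls the getter
```

Because 'obj' is an environment, the assignment mutates it in place rather than replacing a copy.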
Instances: Need to store the pointer somewhere. Could just make this
an external pointer, but might be more useful as an environment, with
the pointer stored as an attribute. The environment would dynamically
bind closures for each method (and property if a QObject). It would
also contain the symbol 'self' which refers to the environment.
User Classes: We need a simple mechanism for defining new classes and
adding methods to classes. Usually, these will be overrides of native
virtual methods. Example syntax:
> MyWindow <- qclass("MyWindow", Qt$QWindow)
> MyWindow$actionEvent <- function(event) { }
> MyWindow$mySlot <- qslot(function(x) { })
> MyWindow$myStaticMethod <- qstatic(function(x) { })
The 'MyWindow' object is a class object, like Qt$QWidget.
User Methods: Methods stored in a class-level environment, which is
referenced from each instance. The methods will be enclosed within the
instance environment.
Do we really want to bind every method (wrapper) to the class
environments, or do we want them dynamic, e.g. through ObjectTable?
Static binding would mean generating wrappers for each method. We
already have this working for static methods. For object methods, the
wrappers would be similar, just using qinvoke() rather than
qinvokeStatic(). We would want the static symbols in a separate
environment from the object symbols. For R classes, the wrapper could
directly invoke the class, after enclosing it, or call qinvoke(),
relying on RMethod to do the enclosing. I would favor the direct
invocation mechanism, at least at first. Of course, the instance could
remain as a dynamic environment, feeding off the field and class
environments. This design makes sense, because:
- The symbols for R classes are already stored in an environment.
- There are not that many classes, so storing them (lazily) into
environments will not incur much overhead, though it is unfortunate
that we cannot use chaining due to multiple inheritance.
- While there is some redundancy between the Class objects and these
environments, they are really different. The former is a database of
actual methods, while the latter is just a database of wrappers. The
Class should hold a reference to the environment.
The main concern is that this does not gracefully handle evaluation
of user static methods. Need to change the parent of the static
environment to the original function enclosure. However, we are not
sure yet if we are going to support user static symbols.
As a compromise, we could keep the static symbols in "static" R
environments and place the object symbols (for R classes) in a
separate environment.
One question is disambiguation of overloaded virtuals. In such cases,
it may be necessary to do something like:
> MyWindow$overloadedMethod <- qmethod(function(x) { }, c("integer", "integer"))
In order to keep the method selection centralized in native code, a
wrapper also needs to be added, just like for native methods. The
method implementation is then called from native code. This might be
going too far though. R is a dynamically typed language, and a
function will accept any invocation, as long as the argument count is
not exceeded. If multiple dispatch is required, one can always
delegate to an S4 generic.
Another issue is chaining up to the parent class. Here is one syntax:
> Parent$foo()
Where 'Parent' is the name of a class from which this instance is
derived. This means the classes need to behave specially inside the
instance method. Or we can use 'super' like in Java:
> super$foo()
But what is 'super'? It needs to be some object, most conveniently
another reference to the instance, like 'this'. This does not make
sense though. There is no such thing as a "super instance" -- only a
"super class". The user could do weird things, like pass 'super' as an
argument to other methods. Instead, we could take the S3/S4 route,
with a special function that calls a super method, as in:
> callSuper(...)
But unlike S3/S4, it makes sense when bundling to allow an alternative
method name.
> callSuper("foo", ...)
A little extra typing, but it is more explicit.
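One way callSuper() could work, sketched with class environments chained by single inheritance. The invoke() helper and the 'Base'/'Derived' classes are hypothetical; the point is only that callSuper() is defined per invocation and knows which class owns the running method.

```r
# Class environments; the chain ends at the empty environment.
Base <- new.env(parent = emptyenv())
Base$describe <- function() "base"

Derived <- new.env(parent = Base)
Derived$describe <- function() paste("derived, then", callSuper("describe"))

invoke <- function(cls, name, ...) {
  # Find the class that actually defines 'name'.
  owner <- cls
  while (!identical(owner, emptyenv()) &&
         !exists(name, envir = owner, inherits = FALSE)) {
    owner <- parent.env(owner)
  }
  if (identical(owner, emptyenv())) stop("no method named ", name)
  fun <- get(name, envir = owner, inherits = FALSE)
  # Enclose the method so that callSuper() resumes above the owner.
  frame <- new.env(parent = environment(fun))
  frame$callSuper <- function(method, ...) invoke(parent.env(owner), method, ...)
  environment(fun) <- frame
  fun(...)
}

result <- invoke(Derived, "describe")
```

Since callSuper() is a function rather than an object, there is nothing like a "super instance" for the user to pass around.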
User objects: Should support storing fields in an environment. Should
not need to declare these, unless they are properties. The instance
environment should inherit from this field environment.
----------------------------------------------------------
C++ Design
----------------------------------------------------------
Thoughts:
An R user requests a method invocation, providing the method name,
the target and the arguments. We need to:
1) Find the method given the name, target class and argument types
2) Marshal the arguments
3) Invoke the method
4) Marshal the return value
5) Return the value to R
We start with a MethodCallRequest, constructed by the R
wrapper. Currently, we pass the MethodCallRequest to a
MethodSelector to obtain the MethodCall, which is then evaluated to
obtain the result for returning to R. The MethodCall consists of an
Executor and the arguments. When evaluated, the MethodCall marshals
the arguments by delegating to TypeHandlers and then passes itself
to an Executor for method execution. The Executor is an adaptor
that obtains the necessary information from the MethodCall to
invoke the Method. Finally, it marshals the return value with a
TypeHandler and returns the result to the wrappers. This could be
simplified by having the Method play the role of the Executor and
perform the low-level invocation based on the MethodCall.
We could reduce the MethodCallRequest to a MethodRequest, with only
type information, no data. This would then be passed to the
MethodSelectors to obtain a Method. The Method would be passed the
arguments, perhaps through overloading invoke() or passing a
MarshalContext. The MarshalContext might be preferred, since it
would avoid the need for each Method to have its own set of
overloads. Either way, the Method obtains a MarshalContext to
marshal the arguments, call itself, and marshal the return value,
which the Method returns to the wrappers. This is strange, because
the Method is presiding over the marshalling, as well as the
low-level invocation. This could be avoided by creating a
MethodCall from the Method and the arguments, and evaluating
it. The Method would be the most natural factory of the MethodCall,
but then it is playing too many roles. The question is whether the
ability to select a method separate from invocation is worth the
complexity. Given there is no use-case, probably not.
Here is a variation on the first idea:
1) MethodSelector transforms the MethodCallRequest to a MethodCall.
2) When MethodCall is evaluated, it passes itself to its Method
3) The Method requests the required stack from the MethodCall,
either R or C++/Smoke.
4) If the request is across the marshalling boundary, the arguments
are marshalled, and the method is invoked again, this time with the
arguments available. The original request comes back empty, and is
ignored by the Method.
5) The result is retrieved and returned to R by the wrappers.
Wow, #4 is a bit complicated. Given that we only have two stacks (R
and Smoke/C++), we can just ask the Method which it requires. The
combination of the Method type and the stack provided to the
MethodCall determines the marshal mode (Identity, RToSmoke, SmokeToR).
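The mode selection is small enough to write down. A sketch in R for brevity, assuming the two stacks are labelled "R" and "Smoke"; the function name is made up.

```r
# Which marshal mode applies, given the stack the Method requires and the
# stack the MethodCall was provided with.
marshal_mode <- function(method_stack, provided_stack) {
  stacks <- c("R", "Smoke")
  stopifnot(method_stack %in% stacks, provided_stack %in% stacks)
  if (method_stack == provided_stack) {
    "Identity"   # nothing crosses the marshalling boundary
  } else if (provided_stack == "R") {
    "RToSmoke"   # R arguments for a Smoke method
  } else {
    "SmokeToR"   # Smoke arguments for an R method
  }
}

marshal_mode("Smoke", "R")
```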
This should easily handle R->R, R->Smoke, and R->Moc. Handling
callbacks, e.g. Smoke->R and Moc->R, is more complicated. Idea:
reverse the process. Start with a SmokeMethod or MocMethod that
delegates to an RMethod. The Smoke/MocMethod creates a MethodCall
to the RMethod and evaluates it. For SmokeMethod, it would probably
be more direct to just create the MethodCall and eval it. MocMethod
handles the bridging of the Moc and Smoke stacks.
Taking that idea to the extreme, we have all invocations going
through proxy methods. For example, the R invocation will call an
RProxyMethod, which will create the MethodCallRequest, pass it to
the MethodSelector to obtain a MethodCall, evaluate the call, and
return the result in the appropriate manner. This seems elegant.
Another question: should the MethodCallRequest and MethodCall be
part of the same hierarchy, as they share many of the same
attributes? As in, BoundMethodCall and UnboundMethodCall, both
inheriting from MethodCall? The BoundMethodCall would gain a
Method. It is also possible to make the Method optional on the
MethodCall and add an isBound() method. This might work, but the
Binder/Selector usually only understands one type of request
(foreign or native). Is that true? In theory, the MethodSelector only
needs to know:
1) Identity of instance, if object method (SmokeObject <-> SEXP)
2) Class of method, if static method (string)
3) Name of method (string)
4) Types of arguments (vector of SmokeTypes)
Note that the actual argument values are not on this list. They are
only used now to build the MethodCall, but if we already *have* a
MethodCall, this is not an issue.
To get SmokeTypes for R objects, we will need an extensible
mechanism that could probably replace the current scoring
mechanism, i.e. all scoring is based on SmokeTypes, which is
straightforward.
So should there be a separate (Un)BoundMethodCall type? The bound
variant would add the eval() method. When binding, are we really
making a new object, or just changing the state of the existing
one? In some ways, this is identical to opening and closing a
connection. We don't use different classes for that, so....
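The single-class option, with binding as a change of state, might look like the following. This is sketched in R rather than C++ for brevity, and every name in it is hypothetical.

```r
# One MethodCall type; 'method' is NULL until a selector binds it.
new_method_call <- function(name, arg_types) {
  mc <- new.env()
  mc$name <- name
  mc$arg_types <- arg_types
  mc$method <- NULL
  mc
}

is_bound <- function(mc) !is.null(mc$method)

# Binding changes the state of the existing object, like opening a connection.
bind <- function(mc, method) {
  mc$method <- method
  invisible(mc)
}

eval_call <- function(mc, ...) {
  if (!is_bound(mc)) stop("method call is not bound")
  mc$method(...)
}

mc <- new_method_call("show", character(0))
bound_before <- is_bound(mc)
bind(mc, function() "shown")
value <- eval_call(mc)
```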
How do properties integrate with this? Properties seem general
enough to be included in the base Class interface, even though
Smoke does not implement them. The question is the abstraction:
what type system? QVariant is convenient for Moc, but it is limited
by compile-time support. The MocProperty could be implemented via a
Moc call to setProperty(), with a QVariant type handler.
========================================================
User Classes Design
========================================================
One of the main reasons we are using Qt from R, rather than from C++,
is the dynamic nature of R. Thus, we should retain flexibility when
defining new Qt-based classes. This follows the route of 'proto'.
To define a class, we need to specify the name of the parent and the
constructor. Only single inheritance is allowed, mostly due to
technical limitations. This also makes things easier to implement with
environments. Note that there is only one constructor (and only one
method of a given name), because overloading is a difficult problem in
dynamic languages. S4 already does a good job of dispatch, so users
are encouraged to leverage it to simulate overloading.
Example:
MyClass <- qclass(Qt$QWidget, function() { this$x <- 0 })
The class, like the other classes, is simply an environment, although
it is unlocked by default. Methods, static fields and enums may be
defined/replaced at any time. It would even be possible to extract a
method and insert it into another class. Sometimes, simple casts are
required. Examples:
qmethods(MyClass)$method <- function(...) { }
qmethods(MyClass)$staticMethod <- qstatic(function(...) { })
qenums(MyClass)$enum <- c("small", "medium", "large")
MyClass$staticField <- "foobar"
There is some debate as to whether we really require static method
support. The examples seem to use static methods for:
- Private utility functions. In R, we can often define these inline,
inside the client functions. Obviously, this is not possible in C++.
- Easy namespacing of functions
- Singletons
The last two can be handled by simple closure tricks like local(), but
it might be nice to have the name match the class name, which may not
always be possible...
Anyway, do we like the qmethods()$<- syntax, or would qsetMethod() be
better? The latter is more like qsetClass() and the setGeneric/Method
from the methods package. Compare:
> qmethods(MyClass)$method <- qprotected(function(...) { })
> qsetMethod(MyClass, "method", function(...) { }, "protected")
The main difference is that the former requires three method calls,
while the latter consists of a single call. It is much easier to
document the process of defining a method as a single call. For
example, how many casting functions are available, and which can be
combined? This would be made immediately obvious by qsetMethod().
While it is nice to be consistent with familiar syntax from S4, the
requirements of this design are completely different: we are bundling
a method, identified by a name, into a class. However, let us consider
a more complex case, that of slot definition. We need to specify the
signature of the slot. Compare:
> qmethods(MyClass)$slotName <- qslot(function(...) { }, "int")
> qsetSlot(MyClass, "slotName(int)", function(...) { })
Now what happens when defining a new signature on the same name?
> qmethods(MyClass)$slotName <- qslot(function(...) { }, "double") # oops!
> qsetSlot(MyClass, "slotName(double)", function(...) { })
The former syntax is misleading, because it appears as if we are
replacing the previous slot. There is this alternative:
> qslots(MyClass)$"slotName(double)" <- function(...) { }
A bit ugly, but it works. Now let us consider signals, where there is
no function to set.
> qsignals(MyClass)$"signalName(int, int)" <- what goes here? access?
> qsetSignal(MyClass, "signalName(int, int)", "protected")
Thus, we will take the simpler qset* route for now.
Do we really want a separate function for each slot? As a
dynamically-typed language, we could implement multiple slots (of the
same name) with the same function. If we need more advanced dispatch,
there is always S4. Going this route:
> qsetMethod(MyClass, "methodName", function(...) { })
> qsetSlot(MyClass, "methodName(int)")
In other words, we are simply advertising our function to the
statically-typed world. The above could certainly be the default
behavior, even if we support functions per slot, which would be a lot
more complicated. We would need to perform method selection within
RClass. It is not common within Qt to have multiple signatures for a
slot, but when this does occur it is limited to leaving off a single
argument (remember that 95% of slots have one argument or fewer). This
is why QtRuby bases Moc method selection merely on argument count.
For now, we go the single function route. To make this easier, what
about adding a 'slot' argument to qsetMethod?
> qsetMethod(MyClass, "methodName", function() { }, slot = "methodName()")
Saves a bit of typing, but still suffers from the redundancy.
> qsetMethod(MyClass, "methodName", function() { }, slot = "")
No more typing, but a bit odd looking.
> qsetMethod(MyClass, "methodName", function() { }, slot = TRUE)
Looks nice, but what about when we have arguments?
> qsetMethod(MyClass, "methodName", function(x) { }, slot = "int")
Ok, but what if 'x' is optional?
> qsetMethod(MyClass, "methodName", function(x = 0) { }, slot = c("", "int"))
Could be worse.
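A sketch of how the 'slot' argument could be interpreted. The class layout (a 'methods' environment plus a 'slots' list) and the function name qsetMethodDemo() are assumptions, chosen so as not to be mistaken for the real qsetMethod().

```r
# Hypothetical class object: an environment of methods plus slot signatures.
new_class <- function() {
  cls <- new.env()
  cls$methods <- new.env()
  cls$slots <- list()
  cls
}

qsetMethodDemo <- function(cls, name, fun, slot = NULL) {
  assign(name, fun, envir = cls$methods)
  # slot = TRUE advertises a no-argument slot; slot = FALSE advertises none.
  if (is.logical(slot)) slot <- if (isTRUE(slot)) "" else NULL
  if (!is.null(slot)) {
    # One signature per element, e.g. c("", "int") for an optional argument.
    cls$slots[[name]] <- sprintf("%s(%s)", name, slot)
  }
  invisible(cls)
}

MyClass <- new_class()
qsetMethodDemo(MyClass, "reset", function() { }, slot = TRUE)
qsetMethodDemo(MyClass, "setValue", function(x = 0) { }, slot = c("", "int"))
qsetMethodDemo(MyClass, "helper", function() { })
```

A single R function backs every advertised signature of its name, which matches the single function route chosen above.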
Since we are following the bundling idiom, methods do not take a
'this' or 'self' argument. Rather, the methods are enclosed,
transiently during evaluation, in an instance environment. This
environment has the special symbol 'this' that refers to the instance.
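The transient enclosure can be sketched as follows. The 'Counter' class, new_instance() and call_method() are hypothetical; the relevant detail is that only a local copy of the function is re-enclosed, so the method stored in the class is left untouched.

```r
# Hypothetical class: an environment of methods that refer to 'this'.
Counter <- new.env()
Counter$increment <- function(by = 1) {
  this$count <- this$count + by
  invisible(this)
}

new_instance <- function() {
  inst <- new.env()
  inst$count <- 0
  inst
}

call_method <- function(instance, cls, name, ...) {
  fun <- get(name, envir = cls, mode = "function", inherits = FALSE)
  # Enclose a copy of the method in an environment that defines 'this'.
  enclosure <- new.env(parent = environment(fun))
  enclosure$this <- instance
  environment(fun) <- enclosure
  fun(...)
}

a <- new_instance()
b <- new_instance()
call_method(a, Counter, "increment")
call_method(a, Counter, "increment", by = 4)
call_method(b, Counter, "increment")
```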
The MyClass object needs to be serializable, so it cannot consist of
any external pointers. The link to the base Smoke class should be
encoded by the module name and class name.
Upon construction, we need to create an instance of the parent class
(eventually chaining up to an actual Smoke class). This has to happen
first, though we do not know how to invoke the base constructor. C++
uses initializer lists to solve this. R does not have them, so we can
take the Java route, where the base constructor must come first,
defaulting to the no-arg constructor. As in:
MyClass <- qclass(Qt$QWidget, function(parent = NULL) { super(parent); ... })
The next challenge is specifying the class name. We want to enforce,
if possible, the symbol in the environment to have the same name as
the class. One way is the assign()/setClass()-like syntax:
qclass("MyClass", Qt$QWidget, constructor, where = topenv())
Which has the side-effect of assigning "MyClass" into the 'where'
env. That seems reasonable; it's the same contract as setGeneric().
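A sketch of that contract, with the class reduced to a bare environment. The name qclassDemo() and the fields stored on the class are placeholders; the parent is given as a string here only to keep the example self-contained.

```r
qclassDemo <- function(name, parent, constructor,
                       where = topenv(parent.frame())) {
  cls <- new.env()
  cls$name <- name
  cls$parent <- parent
  cls$constructor <- constructor
  # Side effect: the symbol in 'where' always matches the class name.
  assign(name, cls, envir = where)
  invisible(cls)
}

pkg_env <- new.env()  # stand-in for a package namespace
qclassDemo("MyClass", parent = "QWidget", constructor = function() NULL,
           where = pkg_env)
```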
How is the reference to the parent class stored? At first thought,
simply storing the base class object with the new class seems
reasonable. The base environment could become the parent of the
environment for the new class, which facilitates propagating changes
on the base class. However, note that lazy loading will cause
trouble. Consider:
- To define subclass, access a class, thus caching it in the library
- Load package, library is regenerated, class envs are different
Anyway, users need to disable lazy loading to make that work.
> qmethods(MyClass)$method <- function(...) { }
This has a number of advantages. First, qmethods(MyClass)$method
retrieves the unmodified function (i.e. we might want to have
qmethods() anyway). It also makes it obvious that we are setting a
method. What about:
- Fields: We probably do not need to define fields; just set them at run-time.
- Properties: These require special treatment, as they have a type,
permissions, etc. Use qproperties() accessor.
> qproperties(MyClass)$prop <- qproperty("integer", writable = FALSE)
> qproperties(MyClass)$prop <- 0L # read/write, integer, defaults to 0
- Enums:
> qenums(MyClass)$enum <- c("small", "medium", "large")
Are these necessary? R already has enumerations in the form of
factors. Enumerations are mostly useful for type safety. The
possible values are fixed, and consistency can be verified by the
compiler. The same is true in many respects for factors, even though
R is dynamically typed. For example, we can protect against setting
an illegal value:
> f <- factor("small", levels = c("small", "big"))
> f[] <- "huge" # warning: invalid factor level, NA generated
When passing an enumeration to a function, there is the match.arg()
idiom:
> fun <- function(x = levels(f)) { x <- match.arg(x); ... }
Thus, we will hold off on formal enums.
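A runnable version of both idioms, for reference. Note that an illegal factor assignment yields a warning and an NA rather than an error, while match.arg() does raise an error. The 'sizes' values and set_size() are examples only.

```r
sizes <- c("small", "medium", "large")

# Factors: an illegal value is rejected (warning, and the element becomes NA).
f <- factor("small", levels = sizes)
g <- f
suppressWarnings(g[] <- "huge")

# match.arg(): the default is the first choice, partial matching is allowed,
# and an illegal value is an error.
set_size <- function(size = c("small", "medium", "large")) {
  match.arg(size)
}

default_size <- set_size()
partial_size <- set_size("med")
```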
========================================================
Documentation
========================================================
We need documentation for Qt and other libraries in a form that works
for the R user. It is not clear if integrating with the R help system
(like we did with RGtk2) is the best approach:
- The bindings do not exist in any R namespace, so there is no need to
give help aliases to them.
- Translating C++ HTML to R HTML is much easier than HTML to Rd.
- Overhead: storing, installing and managing R documentation is
costly. Definitely a pain with RGtk2.
- Although R help has many output targets, it is not clear if anything
beyond HTML is necessary.
- The Qt Assistant utility is a useful interface that is well designed
for our use case. People are working on new interfaces to R help,
but Qt Assistant already works well. It would be nice if the new R
help front-ends supported interfacing with other systems that served
up HTML.
How to provide our documentation? Could perform translation at
installation. Or it could just be semi-statically generated by running
a script periodically. Lazy run-time generation would not work well,
because there would be no index. Any dynamic generation requires us to
find the installed Qt docs, if any, on the user's machine. While that
may not be tough, it's an extra complication. An alternative is to
distribute it with a separate package, maybe called 'qthelp'. But
again, that's an extra complication. How much overhead is involved in
generating the documentation? There's quite a lot of it.
Let us just embed the compiled documentation in qtbase. That is what
Qt does. Most people get binary packages anyway, which means they
will need to download all the help, whether it is generated at build
time or not. What about version differences? Most people (binary
users) are tied to a particular version anyway. Just make sure that
the version of the documentation corresponds to the binary builds.
API: the qhelp() function will look up an identifier and return an
object representing the search hits, essentially URLs pointing to
qthelp resources. Then, the print method for the object will display
the help in Qt Assistant. Calling as.character on a single hit would
convert it to an HTML string, supporting integration.
========================================================
Support for Reference Classes
========================================================
We want to have Qt classes, including R derivatives, registered in the
S4/R5 class system. How to achieve this?
Obviously, we need a setRefClass() wrapper that will see the classes
in the Smoke typelib. Call it setSmokeRefClass(). At least for now,
this should sit on top of the existing R interface to Smoke, i.e.,
RQtClass, RQtObject, etc. Calling setSmokeRefClass on one class will
automatically register the base classes. Anyway, seems pretty trivial.
The question is whether to implicitly declare the class when loading
the Smoke module. This would involve a fair bit of overhead. But more
importantly, would we necessarily expect a large, external library,
like Qt or Java, to be entirely represented in the R5 hierarchy, upon
load? The namespace system could manage this in the case of Qt,
maybe. Still, do we consider the symbols in Qt to be on the same level
as the R symbols declared by the qtbase package? Not currently, at
least. We always need to qualify symbols with the module object. This
is helpful for separating the low-level from the high-level
interface. The qtbase package is a bridge; it does not inject all of
Qt into R. In Rcpp, which does use implicit registration, the
programmer has already explicitly imported a C++ library into R using
their macro system. In our case, we import all of Qt, wholesale, as
the module object. Explicitly registering a reference class would thus
be analogous to declaring an interface with Rcpp macros.
This would be easier if we had some use cases, which are currently
unclear. It may be that an R package will define an S4/R5 API for a
GUI, where the Qt classes (and R subclasses) are only used
internally. We have seen, from gWidgets for example, that this is not
always the case. The gWidgets package calls setOldClass to bring the
S3 RGtk2 objects into S4. That is a chore that should be avoided. Then
again, calling setSmokeRefClass("Foo") is not much worse than calling
importClass("Foo") in the NAMESPACE. Sure, it's better to have things
in the NAMESPACE, but it's not clear how to get there.
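A sketch of explicit registration that pulls in the base classes automatically. The hierarchy table is a hard-coded stand-in for what the Smoke typelib would report, and setSmokeRefClassDemo() is a placeholder name; only setRefClass(), getRefClass(), isClass() and is() come from the methods package.

```r
library(methods)

# Hypothetical class hierarchy: child = parent (NA marks a root class).
hierarchy <- c(DemoObject = NA, DemoWidget = "DemoObject",
               DemoLabel = "DemoWidget")

setSmokeRefClassDemo <- function(name) {
  # Already registered: return the existing generator.
  if (isClass(name)) return(invisible(getRefClass(name)))
  parent <- hierarchy[[name]]
  gen <- if (is.na(parent)) {
    # Root class: holds a field standing in for the native pointer.
    setRefClass(name, fields = list(pointer = "ANY"))
  } else {
    setSmokeRefClassDemo(parent)  # register the base classes first
    setRefClass(name, contains = parent)
  }
  invisible(gen)
}

DemoLabel <- setSmokeRefClassDemo("DemoLabel")
label <- DemoLabel$new(pointer = 1)
```

Registering the leaf class is the only explicit step; the rest of the chain follows from the hierarchy.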