This section how to manage processes and external programs. Besides replacements for cmake's execute_process
function this section defines a control system for parallel processes controlled from any cmake.
- async
- await
- command_line
- command_line_args_combine
- command_line_args_escape
- command_line_parse
- command_line_parse_string
- command_line_to_string
- execute
- execute_script
- process_execute
- process_handle
- process_handle_change_state
- process_handle_get
- process_handle_new
- process_handle_register
- process_handles
- process_info
- process_isrunning
- process_kill
- process_list
- process_refresh_handle
- process_return_code
- process_start
- process_start_info_new
- process_start_script
- process_stderr
- process_stdout
- process_timeout
- process_wait
- process_wait_all
- process_wait_any
- process_wait_n
- string_take_commandline_arg
- wrap_executable
- wrap_executable_bare
The following definitions are common to the following subsections.
<command>
a path or filename to an executable programm.<process start info> ::= { <command>, <<args:<any ...>>, <cwd:<directory>> }
Using external applications is more complex than necessary in cmake in my opinion. I tried to make it as easy as possible. Using the convenience of the filesystem functions and maps wrapping an external programm and using it as well as pure execution is now very simple as you can see in the following example for git:
This is all the code you need to create a function which wraps the git executable. It uses the initializer function pattern.
function(git)
# initializer block (will only be executed once)
find_package(Git)
if(NOT GIT_FOUND)
message(FATAL_ERROR "missing git")
endif()
# function redefinition inside wrap_executable
wrap_executable(git "${GIT_EXECUTABLE}")
# delegate to redefinition and return value
git(${ARGN})
return_ans()
endfunction()
Another example showing usage of the execute()
function:
find_package(Hg)
set(cmdline --version)
execute({path:$HG_EXECUTABLE, args: $cmdline} --process-handle)
ans(res)
map_get(${res} result)
ans(error)
map_get(${res} output)
ans(stdout)
assert(NOT error) # error code is 0
assert("${stdout}" MATCHES "mercurial") # output contains mercurial
json_print(${res}) # outputs input and output of process
execute(<process start info ?!> [--process-handle|--exit-code]) -> <stdout>|<process info>|<int>
executes the process described by<process start ish>
and by default fails fatally if return code is not 0. if--exit-code
flag is specified<process info>
is returned and if<return-code>
is specified the command's return code is returned. (the second two options will not cause a fatal error)- example:
execute("{path:'<command>', args:['a','b']}")
- example:
wrap_executable(<name:string> <command>)
takes the executable/command and wraps it inside a function called<name>
it has the same signature asexecute(...)
<process start info?!>
a string which can be converted to a<process start>
object using theprocess_start_info()
function.<process start info>
a map/object containing the following fieldscommand
command name / path of executable requiredargs
command line arguments to pass along to command, usestring_encode_semicolon
if you want to have an argument with semicolons optionaltimeout:<int>
optional number of seconds to wait before failingcwd:<unqualified path>
optional by default it is whateverpwd()
currently returns
<process info>
contains all the fields of<process start>
and additionalyoutput:<stdout>
the output of the command execution. (merged error and stdout streams)result:<int>
the return code of the execution. 0 normally indicates success.
process_start_info(<process start info?!>):<process start info>
creates a valid<process start info>
object from the input argument. If the input is invalid the function fails fatally.
When working with large applications in cmake it can become necessary to work in parallel processes. Since all cmake target systems support multitasking from the command line it is possible to implement cmake functions to control it. I implemented a 'platform independent' (platform on which either powershell or bash is available) control mechanism for starting, killing, querying and waiting for processes. The lowlevel functions are platform specific the others are based on the abstraction layer that the provide.
This example starts a script into three separate cmake processes. The program ends when all scripts are done executing.
# define a script which counts to 10 and then
# note that a fresh process means that cmake has not loaded cmakepp
set(script "
foreach(i RANGE 0 10)
message(\${i})
execute_process(COMMAND \${CMAKE_COMMAND} -E sleep 1)
endforeach()
message(end)
")
# start each script - fork_script returns without waiting for the process to finish.
# a handle to the created process is returned.
process_start_script("${script}")
ans(handle1)
process_start_script("${script}")
ans(handle2)
process_start_script("${script}")
ans(handle3)
# wait for every process to finish. returns the handles in order in which the process finishes
process_wait_all(${handle1} ${handle2} ${handle3})
ans(res)
## print the process handles in order of completion
json_print(${res})
This example shows a more usefull case: Downloading multiple 'large' files parallely to save time
## define a function which downloads
## all urls specified to the current dir
## returns the path for every downloaded files
function(download_files_parallel)
## get current working dir
pwd()
ans(target_dir)
## process start loop
## starts a new process for every url to download
set(handles)
foreach(url ${ARGN})
## start download by creating a cmake script
process_start_script("
include(${cmakepp_base_dir}/cmakepp.cmake) # include cmakepp
download(\"${url}\" \"${target_dir}\")
ans(result_path)
message(STATUS ${target_dir}/\${result_path})
")
ans(handle)
## store process handle
list(APPEND handles ${handle})
endforeach()
## wait for all downloads to finish
process_wait_all(${handles})
set(result_paths)
foreach(handle ${handles})
## get process stdout
process_stdout(${handle})
ans(output)
## remove '-- ' from beginning of output which is
## automatically prependend by message(STATUS)
string(SUBSTRING "${output}" 3 -1 output)
## store returned file path
list(APPEND result_paths ${output})
endforeach()
## return file paths of downloaded files
return_ref(result_paths)
endfunction()
## create and goto ./download_dir
cd("download_dir" --create)
## start downloading files in parallel by calling previously defined function
download_files_parallel(
http://www.cmake.org/files/v3.0/cmake-3.0.2.tar.gz
http://www.cmake.org/files/v2.8/cmake-2.8.12.2.tar.gz
)
ans(paths)
## assert that every the files were downloaded
foreach(path ${paths})
assert(EXISTS "${path}")
endforeach()
- datatypes
<process handle> ::= { state:<process state> , pid:<process id> }
process handle is a runtime unique map which is used to address a process. The process handle may contain more properties than specified - only the specified ones are available on all systems - these properties contain values which are implementation specific.<process info> ::= { }
a map containing verbose information on a proccess. only the specified fields are available on all platforms. More are available depending on the OS you use. You should not try to use these without examining their origin / validity.<process state> ::= "running"|"terminated"|"unknown"
<process id> ::= <string>
a unspecified systemwide unique string which identifies a process (normally a integer)
- platform specific low level functions
process_start(<process start info?!>):<process handle>
platfrom specific function which starts a process and returns a process handleprocess_kill(<process handle?!>)
platform specific function which stops a process.process_list():<process handle ...>
platform specific function which returns a process handle for all running processes on OS.process_info(<process handle?!>):<process info>
platform specific function which returns a verbose process infoprocess_isrunning(<process handle?!>):<bool>
returns true iff process is running.
process_timeout(<n:<seconds>>):<process handle>
starts a process which times out in<n>
seconds.process_wait(<process handle~> [--timeout <n:seconds>]):<process handle>
waits for the specified process to finish or the specified timeout to run out. returns null if timeout runs out before process finishes.process_wait_all(<process handle?!...> <[--timeout <n:seconds>] [--quietly]):<process handle ...>
waits for all specified process handles and returns them in the order that they completed. If the--timeout <n>
value is specified the function returns as soon as the timeout is reached returning only the process finished up to that point. The function normally prints status messages which can be supressed via the--quietly
flag.process_wait_any(<process handle?!...> <?"--timeout" <n:<seconds>>> <?"--quietly">):<?process handle>
waits for any of the specified processes to finish, returning the handle of the first one to finished. If--timeout <n>
is specified the function will returnnull
aftern
seconds if no process completed up to that point in time. You can specify--quietly
if you want to suppress the status messages.process_stdout(<process handle~>):<string>
returns all data written to standard output stream of the process specified up to the current point in timeprocess_stderr(<process handle~>):<string>
return all data written to standard error stream of the process specified up to the current point in timeprocess_return_code(<process handle~>):<int?>
returns nothing or the process return code if the process is finishedprocess_start_script(<cmake code>):<process handle>
starts a separate cmake process which runs the specified cmake code.
To communicate with you processes you can use any of the following well known mechanisms
- Environment Variables
- the started processes have access to you current Environment. So when you call
set(ENV{VAR} value)
before starting a process that process will have read access to the variable$ENV{VAR}
- the started processes have access to you current Environment. So when you call
- Command Line Arguments
- all variables passed to
start_process
will be passed allong - Command Line Variables are sometimes problematic as they must be escaped correctly and this does not always happen as expected. So you might want to choose another mechanism to transmit complex data to your process
- Command Line Variables are limited by their string length depending on you host os.
- all variables passed to
- Files
- Files are the easiest and safest way to communicate large amounts of data to another process. If you can try to use file communication
- stderr, stdout
- The output of a process started with
start_process
becomes available to you when the process ends at latest, You can choose to poll stdout and take data as soon as it is written to the output streams
- The output of a process started with
- return code
- the returns code tells you how you process finished and is often enough result information for a process you start
- process starting is slow - it can take seconds (it takes 900ms on my machine). The task needs to be a very large one for it to compensate the overhead.
- parallel processes use platform specific functions - It might cause problems on less well tested OSs and some may not be supported. (currently only platforms with bash or powershell are supported ie Windows and Linux)
async((<args...)) ->
executes a callable asynchroniously
todo:
capture necessary scope vars
include further files for custom functions
environment vars
combines the list of command line args into a string which separates and escapes them correctly
escapes a command line quoting arguments as needed
command_line_parse parses the sepcified cmake style command list which starts with COMMAND or parses a single command line call returns a command line object: { command:, args: ... }
(<process start info> [--process-handle] [--exit-code] [--async] [--silent-fail] [--success-callback <callable>] [--error-callback <callable>] [--state-changed-callback <callable>])-><process handle>|<exit code>|<stdout>|<null>
options
--process-handle
--exit-code
--async
--silent-fail
--success-callback <callable>[exit_code](<process handle>)
--error-callback <callable>[exit_code](<process handle>)
--state-changed-callback <callable>[old_state;new_state](<process handle>)
--lean
example
execute(cmake -E echo_append hello) -> 'hello'
(<cmake code> [--pure] <args...>)-><execute result>
equivalent to execute(...)->...
runs the specified code using cmake -P
.
prepends the current cmakepp.cmake
to the script (this default behaviour can be stopped by adding --pure
)
all not specified args
are forwarded to execute
(<process handle>)-><process handle>
executes the specified command with the specified arguments in the
current working directory
creates and registers a process handle which is then returned
this function accepts arguments as encoded lists. this allows you to include
arguments which contain semicolon and other special string chars.
the process id of processes start with process_execute
is always -1
because CMake
's execute_process
does not return it. This is not too much of a problem
because the process will always be terminated as soon as the function returns
parameters
<command>
the executable (may contain spaces)<arg...>
the arguments - may be an encoded list scopepwd()
used as the working-directory eventson_process_handle_created
global event is emitted when the process_handle is readyprocess_handle.on_state_changed
returns
<process handle> ::= {
pid: "-1"|"0"
}
returns the runtime unique process handle information may differ depending on os but the following are the same for any os
- pid
- status
(<process start info>)-><process handle>
returns a new process handle which has the following layout:
<process handle> ::= {
pid: <pid>
start_info: <process start info>
state: "unknown"|"running"|"terminated"
stdout: <text>
stderr: <text>
exit_code: <integer>|<error string>
command: <executable>
command_args: <encoded list>
on_state_change: <event>[old_state, new_state](${process_handle})
}
transforms a list of <process handle?!> into a list of
process_info(<process handle?!>): returns information on the specified process handle
returns true iff the process identified by is running
returns a list of containing all processes currently running on os process_list():...
refreshes the fields of the process handle returns true if the process is still running false otherwise this is the only function which is allowed to change the state of a process handle
returns the <return_code> for the specified process handle if process is not finished the result is empty
starts a process and returns a handle which can be used to controll it.
(<command string>|<object> [TIMEOUT <n:int>] [WORKING_DIRECTORY ~<path>] [--passthru])-><process start info>
<command string> ::= "COMMAND"? <command> <arg...>
creates a new process_start_info with the following fields
<process start info> ::= {
command: <executable>
command_arguments: <encoded list>
working_directory: <directory>
timeout: <n>
passthru: true|false
}
the syntax of the process_start_info_new is equivalent to cmake's built in execute_process
command.
the command needs to point to an executable, preferrably a fully qualified path the command_arguments contain an encoded list. you can specify any argument in the execute function - they will be passed along as you write them (this deserves extra appreciation because doing so in cmake is hard) If the flags you want to pass along to the process conflict with the flags of the process function you can always encode them yourself and pass them along as an encoded list
example
process_start_info_new(COMMAND cmake -E echo "asd bsd" csd) -> { "command":"cmake", "command_arguments":[ "-E", "echo", "asd;bsd" ], "working_directory":"/home/anotherfoxguy/Documents/Github/cmakepp/cmake/process", "timeout":"-1", "passthru":false }
shorthand to fork a cmake script
returns the current error output This can change until the process is finished
returns the current stdout of a this changes until the process is ove
returns a to a process that runs for n seconds
blocks until given process has terminated returns nothing if the process does not exist - is deleted etc updates and returns the process_handle if a timeout greater 0 the function will return nothing if the timeout is reached process_wait( <?--timeout:>)
(<handles: <process handle...>> [--timeout <seconds>] [--idle-callback <callable>] [--task-complete-callback <callable>] )
waits for all specified to finish returns them in the order in which they completed
--timeout <n>
if value is specified the function will return all
finished processes after n seconds
--idle-callback <callable>
if value is specified it will be called at least once
and between every query if a task is still running
--task-complete-callback <callable>
if value is specified it will be called whenever a
task completes.
Example
process_wait_all(${handle1} ${handle1} --task-complete-callback "[](handle)message(FORMAT '{handle.pid}')")
prints the process id to the console whenver a process finishes
waits until any of the specified handles stops running returns the handle of that process if --timeout is specified function will return nothing after n seconds
(<n:<int>|"*"> <process handle>... [--idle-callback:<callable>])-><process handle>...
waits for at least processes to complete
returns:
- at least
n
terminated processes
arguments:
n
an integer the number of processes to return (lower bound) ifn
is clamped to the number of processes. ifn
is * it is replaced with the number of processes--idle-callback
is called after every time a processes state was polled. It is guaranteed to be called once per process handle. it has access to the following scope variablesterminated_count
number of terminated processesrunning_count
number of running processeswait_time
time that was waitedwait_counter
number of times the waiting loop iteratedrunning_processes
list of running processescurrent_process
the current process being polledis_running
the running state of the current processterminated_processes
the list of terminated processes
a fast wrapper for the specified executable this should be used for executables that are called often and do not need to run async