This pull request implements memory-mapped access to (larger) reference files in `bwa mem`. The `mmap`s are created as private (copy-on-write), read-only mappings and are locked in memory. This approach has two main advantages over allocating space in memory and `fread`ing the data into it.

The first advantage is that the reference data is automatically shared by all bwa processes using it: by memory-mapping the file, the reference is in effect loaded into a shared memory space.
The second advantage is that the OS will not evict the reference data from memory as soon as the process exits. Instead, it will be kept around until the memory is needed by the system. This means that for multiple invocations of bwa with the same reference, only the first has to wait for the reference to be loaded; subsequent ones re-use the same data and proceed directly to aligning reads. This of course only holds if the system isn't under heavy memory pressure from other processes in the meantime.
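As a rough illustration of the approach described above, here is a minimal sketch of how a reference file could be mapped this way (this is not the code in the pull request; the function name and error handling are placeholders):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a reference file read-only as a private (copy-on-write) region
 * instead of malloc()+fread(). The pages are backed by the page cache,
 * so they are shared between processes and survive process exit. */
static void *map_reference(const char *path, size_t *len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) { perror(path); return NULL; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); close(fd); return NULL; }
    *len = (size_t)st.st_size;

    void *p = mmap(NULL, *len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  /* the mapping keeps the file referenced */
    if (p == MAP_FAILED) { perror("mmap"); return NULL; }
    return p;
}
```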
To enable memory-mapped reference access, call `bwa mem` with the new `-z` option.

Because the reference is locked in memory, this approach needs more memory on the system than the standard `malloc` & `fread` strategy; if the system can't comfortably fit the entire reference in memory, the OS will refuse to map the file into memory. In that case `bwa mem` will exit with an error message.
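The failure path mentioned above could look roughly like the following sketch of the locking step (again hypothetical, not the actual code in the pull request):

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>

/* Lock the mapped reference so the OS cannot evict it. If the system
 * can't accommodate the whole reference (or RLIMIT_MEMLOCK is too low),
 * mlock() fails and we exit with an error. */
static void lock_reference_or_die(void *addr, size_t len)
{
    if (mlock(addr, len) != 0) {
        fprintf(stderr, "failed to lock reference in memory: %s\n",
                strerror(errno));
        exit(EXIT_FAILURE);
    }
}
```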
## Comparison to SHM

I originally wrote this code against version 0.7.8, then got sidetracked before I managed to put together the pull request. I see that in the meantime there has been some work towards using POSIX shared memory with a similar objective in mind (sharing references in memory across multiple processes).
While both strategies avoid loading multiple copies of the same reference in memory, I would argue that the one presented in this pull request should be preferred -- or at least co-exist with SHM -- especially for its simplicity. A user need not do anything special to take advantage of the feature other than passing the `-z` option, whereas using SHM objects requires that the user pre-load the references with `bwa shm` before running `bwa mem`, and that they remember to delete them afterwards to free the resources on the system.

Explicitly having to create and delete the shared reference is also problematic and less effective for anyone launching parallel alignment jobs, perhaps through a batch queueing system, where the nodes on which the `bwa` instances will end up running are not easily predicted. In such a scenario, an alignment task should clean up after itself when it finishes, but in doing so it may try to delete a shared memory object that is still being used by other concurrent bwa jobs (not so bad; it should stick around until the last process releases it). Moreover, by deleting the shared memory object when it finishes, subsequent alignment runs using the same reference will not find it in memory and thus will not benefit from caching; they will instead have to spend time reloading it.

## Testing
I'm not sure whether there's an official test suite for contributions. I tested by mapping 200k pairs to the human reference with both SHM and `-z` memory mapping. I compared the outputs by md5 checksum, after removing the `@PG` line, and got identical results.