Given an array of integers, find the first missing positive integer in linear time and constant space. In other words, find the lowest positive integer that does not exist in the array. The array can contain duplicates and negative numbers as well.
For example, the input [3, 4, -1, 1]
should give 2
. The input [1, 2, 0]
should give 3
.
You can modify the input array in-place.
Array of integers arr
Lowest positive integer not in the array num
I like this one! Really made me think about some outside-the-box ways to approach this problem.
I thought I could sort the numbers, then just start at the lowest positive integer and keep a concurrently incrementing value next to it, and return that incrementing value if they're not equal. But sorting is nlogn time, so that fails specs.
I then looked more into the math of the problem, and found out a couple powerful simplifications:
- We can ignore all non-positive integers
- We can also ignore all integers greater than the length of
arr
because the maximum possible output is given by a permutation of[1, 2, ..., n]
, which givesn + 1
, and any modification to a value ofarr
would change the output to the pre-modification value.
Example:
let arr = [1, 2, 3, 4]
firstMissingInteger(arr) // 5
let arr2 = [1, 2, 7, 4]
firstMissingInteger(arr2) // 3
So I thought of keeping a parallel boolean array markers
of size n
that marks the values from 1
to n
that we encounter, then iterate through markers
. That would be n time, but also n space which, once again, fails specs.
I then revisited the point at the end of the problem: you can modify the input array in-place. Say no more fam.
So I figured I could go through the array in 3 passes. It could have been less, but I kept it in 3 for the sake of clarity:
- Set all irrelevant values in
arr
tofalse
- For each value in
arr
, if it is a numbernum
, set the value at that corresponding indexnum-1
totrue
. If there is currently a numbernum2
there, set the value at that corresponding indexnum2-1
totrue
. Do this recursively.
The reason this is n
time is because we are modifying the value at the desired index to true
before recursing. We only recursively look at the next value if there is a number. We'll never have to revisit that index in the same recursive fashion, but we may have to set an already true
value to true
again. That's ok, because that doubles the usual work in the absolute worst case.
Example:
arr = [3, 4, -1, 1]
=> [3, 4, false, 1]
=> [3, 4, true, 1] // saw 3, changed arr[2] to true, didn't recurse
=> [3, 4, true, true] // saw 4, changed arr[3] to true, recursed with 1
=> [true, 4, true, true] // saw 1, changed arr[0] to true, recursed with 3
=> [true, 4, true, true] // saw 3, changed arr[3] to true, didn't recurse
- Iterate through
arr
and return the firstfalse
orNumber
encountered. If none are found, returnn + 1
.
Although this completely mutilates our original array, we do 3 subequent n time actions, resulting in overall n time, and none of the space allocated is dependent on n, so it's also constant space.
Just to kinda delve further, the most efficient way I can think of to maintain the input array and achieve the same result would be to lose the constant space requirement, and use a boolean filter array like I described above. That would maintain n time, and assuming the programming language has a data structure to pack booleans into all bits of a byte, it would use at most floor(n/8)
bytes of space.