I use the following Python solution to discuss the correctness of the algorithm:
class Solution:
"""
@param: nums: a list of integers
@return: The majority number that occurs more than 1/3
"""
def majorityNumber(self, nums):
if nums is None:
return None
if len(nums) == 0:
return None
num1 = None
num2 = None
count1 = 0
count2 = 0
# Loop 1
for i, val in enumerate(nums):
if count1 == 0:
num1 = val
count1 = 1
elif val == num1:
count1 += 1
elif count2 == 0:
num2 = val
count2 = 1
elif val == num2:
count2 += 1
else:
count1 -= 1
count2 -= 1
count1 = 0
count2 = 0
for val in nums:
if val == num1:
count1 += 1
elif val == num2:
count2 += 1
if count1 > count2:
return num1
return num2
First, we need to prove claim A:
Claim A: Consider a list C which contains a majority number m which occurs more floor(n/3) times. After 3 different numbers are removed from C, we have C'. m is the majority number of C'.
Proof: Use R to denote m's occurrence count in C. We have R > floor(n/3). R > floor(n/3) => R - 1 > floor(n/3) - 1 => R - 1 > floor((n-3)/3). Use R' to denote m's occurrence count in C'. And use n' to denote the length of C'. Since 3 different numbers are removed, we have R' >= R - 1. And n'=n-3 is obvious. We can have R' > floor(n'/3) from R - 1 > floor((n-3)/3). So m is the majority number of C'.
Now let's prove the correctness of the loop 1. Define L as count1 * [num1] + count2 * [num2] + nums[i:]. Use m to denote the majority number.
Invariant
The majority number m is in L.
Initialization
At the start of the first itearation, L is nums[0:]. So the invariant is trivially true.
Maintenance
if count1 == 0 branch: Before the iteration, L is count2 * [num2] + nums[i:]. After the iteration, L is 1 * [nums[i]] + count2 * [num2] + nums[i+1:]. In other words, L is not changed. So the invariant is maintained.
if val == num1 branch: Before the iteration, L is count1 * [nums[i]] + count2 * [num2] + nums[i:]. After the iteration, L is (count1+1) * [num[i]] + count2 * [num2] + nums[i+1:]. In other words, L is not changed. So the invariant is maintained.
f count2 == 0 branch: Similar to condition 1.
elif val == num2 branch: Similar to condition 2.
else branch: nums[i], num1 and num2 are different to each other in this case. After the iteration, L is (count1-1) * [num1] + (count2-1) * [num2] + nums[i+1:]. In other words, three different numbers are moved from count1 * [num1] + count2 * [num2] + nums[i:]. From claim A, we know m is the majority number of L.So the invariant is maintained.
Termination
When the loop terminates, nums[n:] is empty. L is count1 * [num1] + count2 * [num2].
So when the loop terminates, the majority number is either num1 or num2.