Greedy Algorithm¶
When we are facing some multi-phase choice problems(like dp),
greedy algorithm tells us to do "current-optimal" choices, and hoping that we can get "global optimal" solution.
Design and proof of a greedy algorithm¶
- Choose a subproblem
- Prove that choice is safe
- Prove that after every choice, there always exists a optimal solution
Matroids¶
Ordered Pair \(M = (S,\mathcal I)\)
- \(S\): finite, nonempty set
- \(\mathcal I \subseteq 2^S\): hereditary(遗传的), if \(A \in \mathcal I\), then for any \(B \subseteq A\), \(B \in \mathcal I\)
Exchange property:
for \(A, B \in \mathcal I\) and \(|A| < |B|\), then there is some element \(x \in B - A\) such that \(A \cup \{x\} \in \mathcal I\).
Intuition: No "seam"
Extension:
\(x \not\in A\) is a extension of \(A\) \(\iff\) \(A \cup \{ x \} \in \mathcal I\)
Maximum property:
If \(A\) has no extension, then \(A\) is maximal.
Theorem
All maximal independent subsets in a matroid has the same size.
Weight Matroid¶
Every elemnt in \(S\) has a weight, we use the function \(w\) to represent it.
Definition on set: \(w(A) = \max_{x\in A} w(x)\)
Graphic Matroid¶
\(G = (V,E)\)
- \(S_G = E\)
- \(A \subseteq 2^E\), \(A \in \mathcal I_G\) if and only if \(A\) has no cycle(acyclic, forest)
Transformation to greedy algorithm¶
Find the maximum-weight independent subset in a weighted matroid.
The "greedy" approach is right.
Intuitional proof of correctness:
- If an element is not picked in the first place, then it won't be choosed afterward.
一方面:选择了 \(x\) 之后,根据子集遗传的性质,\(A - \{x\}\) 仍然是合法的独立集合。
另一方面:刨除掉 \(x\) 之后产生的 \(A'\) 一定可以加上一个 \(x\) 之后仍然是独立子集。
Problems¶
Activity Selection Problem¶
DP solution¶
...
Greedy Solution¶
Hint
Consider \(S_k = \{a_i \in S \mid s_i \geq f_k \}\) (start after \(k\) finish)
The one with earliest finish time will be in some maximum-size subset of mutually compatible activities of \(S_k\)
Intuitional Proof:
Proof by contradiction(?).
Change the first of activity
The result:
- We reduce the size of subproblmes.
- We only need to consider one choice.(choose the first-finished task)
Huffman Tree/Codes¶
- Varibale-length 0/1 code;
- Prefix-free codes
- The codes in which no codeword is also a prefix of some other codeword
Code method can be represented in a tree, like:
The tree had a cost, which is the total length of encoded text.
We want to minify that cost.
Correctness:
Theorem
\(C\) : alphabet, with frequency \(c.freq\) for \(c \in C\)
\(T\) : tree, with optimal cost
From \(C\) to \(C'\):
- choose two least frequent character \(x,y\)
- \(C' = C - \{x, y\} \cup \{ z \}\)
- \(z.freq = x.freq + y.freq\)
Then we can construct \(T\) from \(T'\), by adding \(x,y\) as two sons of \(z\) in \(T '\).
Intuitional proof:
- 在最优的树中,最小权值的两个结点一定是兄弟。
- 用反证法
- 这样的话,两个点和缩成一个点就是等价的;
- 用反证法
Task-scheduling problem(solved by using matroid)¶
Input:
- \(n\) unitary tasks \(a_1, \cdots, a_n\)
- each with a deadline: \(d_1, \cdots, d_n\)
- each with a penalty: \(w_1, \cdots w_n\)
Output:
- minify the total penalty
Solution:
If there are two tasks,
- 超时的在前面,没超时的在后面
- 那么我们可以交换两者,前者仍然没超时,后者仍然超时
- 所以我们一定可以找到最优方案,让某点前面都没超时,后面都超时了。
Construct Independent set \(\mathcal I\). We have a no penalty schedule over \(a \in \mathcal I\).
Easy to see this formed a matroid.
We want to find the maximum-weight independent set in \(\mathcal I\)