\begin{enumerate}[label=(\alph*)]
\item Yes. Some words (e.g.\ ``I'', ``a'', ``are'', \dots) occur very
often, so the reducers handling those keys take a long time to process
all their values, while reducers handling less frequently used words
finish much earlier and then sit idle.
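A minimal, self-contained Python sketch (not actual Hadoop code; names are illustrative) of the map and partition steps shows why one frequent key concentrates work on a single reducer:

```python
from collections import defaultdict

def map_phase(documents):
    # Word-count map: emit (word, 1) for every word occurrence.
    for doc in documents:
        for word in doc.lower().split():
            yield (word, 1)

def partition(pairs, num_reducers):
    # Hash-partition the mapped pairs: all values for one key are
    # routed to the same reducer, so a frequent word concentrates
    # its entire value list on a single reducer.
    buckets = defaultdict(list)
    for word, count in pairs:
        buckets[hash(word) % num_reducers].append((word, count))
    return buckets

docs = ["a a a a the the the", "unique words here", "a the a the"]
buckets = partition(map_phase(docs), 4)
sizes = sorted(len(values) for values in buckets.values())
# "a" occurs 6 times and "the" 5 times, so whichever reducer
# receives those keys gets far more values than the rest.
```

The reducer holding the bucket with ``a'' must process at least 6 of the 14 total values, while reducers holding only rare words finish almost immediately.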
\item With fewer reducers the total time will go up, but the skew will
go down: a long value list is more likely to be assigned to a reducer
that previously held short lists (since that reducer finishes sooner).
With 10000 reducers the skew will go up, as a few of those reducers
have to handle very long lists while the others finish early and sit
idle. This still assumes that no combiner is used.
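A toy simulation of this trade-off, under the assumption of a Zipf-like distribution of value-list lengths and a simple round-robin key assignment (both are modelling choices, not part of the exercise):

```python
# Zipf-like value-list lengths: key i has roughly 10000/(i+1) values,
# mimicking a few very frequent words and many rare ones.
lengths = [10000 // (i + 1) for i in range(10000)]

def max_over_avg(num_reducers):
    # Assign key i to reducer i % num_reducers, then compare the most
    # loaded reducer against the average load (1.0 = no skew).
    loads = [0] * num_reducers
    for i, n in enumerate(lengths):
        loads[i % num_reducers] += n
    return max(loads) / (sum(loads) / num_reducers)

# Few reducers: each gets many keys, so long and short lists average out.
skew_few = max_over_avg(10)
# One reducer per key: the busiest reducer gets the whole longest list
# while most reducers hold a single short list and idle.
skew_many = max_over_avg(10000)
```

With 10 reducers the max/average ratio stays close to 1, while with 10000 reducers it is dominated by the longest list and grows by orders of magnitude.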
\item The skew will be much smaller than without a combiner. Many
words appear many times on a single page, so pre-aggregating them with
a combiner reduces the number of values sent to the reducers
significantly, which also reduces the communication cost.
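A small sketch (Python stand-in for a map-side combiner; the document and function names are illustrative) of how combining shrinks the shuffled data:

```python
from collections import Counter

def mapped_pairs(doc):
    # Without a combiner: one (word, 1) pair per word occurrence
    # is shuffled to the reducers.
    return [(word, 1) for word in doc.lower().split()]

def combined_pairs(doc):
    # With a combiner: occurrences are pre-summed on the map side,
    # so only one (word, count) pair per distinct word is shuffled.
    return sorted(Counter(doc.lower().split()).items())

doc = "the cat sat on the mat and the cat slept"
shuffled_without = len(mapped_pairs(doc))  # one pair per occurrence
shuffled_with = len(combined_pairs(doc))   # one pair per distinct word
```

Here 10 word occurrences collapse into 7 `(word, count)` pairs; on a real page with about 2500 words and heavy repetition, the reduction is far larger.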
\item Communication cost will increase, as the reducer size can be set
to lower values. This will also reduce the skew in the times taken by
the reducers. The replication rate will stay the same, as it is
determined by the number of values emitted by the mappers divided by
the number of inputs (which is about 2500 for an average web page).
The reducer size can go down, in order to reduce the skew in reducer
run times.
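The last point can be made explicit from the definition of replication
rate (using the figure from the text, with one web page counted as one
input):
\[
  r \;=\; \frac{\text{key-value pairs emitted by the mappers}}{\text{number of inputs}}
    \;\approx\; \frac{2500 \text{ pairs per page}}{1 \text{ page}}
    \;=\; 2500 .
\]
Neither the number of reducers nor the reducer size appears in this
ratio, so lowering the reducer size leaves $r$ unchanged.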