3 \begin{enumerate}[label=(\alph*)]
4 \item Hash Join in Merge Join.
6 EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
7 SELECT a,b,c FROM r NATURAL JOIN s NATURAL JOIN t;
12 Merge Join (cost=126688.84..189746.85 rows=4812309 width=12)
13 (actual time=1223.252..3564.720 rows=4790517 loops=1)
15 Merge Cond: ((t.a = r.a) AND (t.c = s.c))
16 Buffers: shared hit=138, temp read=1320 written=3593
17 -> Sort (cost=809.39..834.39 rows=10000 width=8)
18 (actual time=6.542..9.442 rows=10000 loops=1)
21 Sort Method: quicksort Memory: 853kB
22 Buffers: shared hit=45
23 -> Seq Scan on public.t (cost=0.00..145.00 rows=10000 width=8)
24 (actual time=0.013..1.698 rows=10000 loops=1)
26 Buffers: shared hit=45
27 -> Materialize (cost=125878.37..130770.59 rows=978445 width=12)
28 (actual time=1216.698..1848.300 rows=4790518 loops=1)
30 Buffers: shared hit=93, temp read=1320 written=3593
31 -> Sort (cost=125878.37..128324.48 rows=978445 width=12)
32 (actual time=1216.694..1265.485 rows=105629 loops=1)
35 Sort Method: external merge Disk: 21312kB
36 Buffers: shared hit=93, temp read=1320 written=3593
37 -> Hash Join (cost=270.00..11799.45 rows=978445 width=12)
38 (actual time=5.707..326.759 rows=993774 loops=1)
40 Hash Cond: (r.b = s.b)
41 Buffers: shared hit=93
42 -> Seq Scan on public.r (cost=0.00..145.00 rows=10000 width=8)
43 (actual time=0.013..2.200 rows=10000 loops=1)
45 Buffers: shared hit=45
46 -> Hash (cost=145.00..145.00 rows=10000 width=8)
47 (actual time=5.636..5.636 rows=10000 loops=1)
49 Buckets: 16384 Batches: 1 Memory Usage: 519kB
50 Buffers: shared hit=45
51 -> Seq Scan on public.s (cost=0.00..145.00 rows=10000 width=8)
52 (actual time=0.012..2.656 rows=10000 loops=1)
54 Buffers: shared hit=45
55 Planning time: 0.448 ms
56 Execution time: 4093.957 ms
63 set enable\_hashjoin=1; set enable\_mergejoin=0; set enable\_nestloop=0;
64 EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
65 SELECT a,b,c FROM r NATURAL JOIN s NATURAL JOIN t;
70 Hash Join (cost=565.00..382720.42 rows=4812309 width=12)
71 (actual time=10.065..2336.284 rows=4790517 loops=1)
72 Execution time: 2839.835 ms
77 set enable\_hashjoin=0; set enable\_mergejoin=1; set enable\_nestloop=0;
78 EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
79 SELECT a,b,c FROM r NATURAL JOIN s NATURAL JOIN t;
84 Merge Join (cost=127309.80..190367.82 rows=4812309 width=12)
85 (actual time=1400.115..3961.542 rows=4790517 loops=1)
86 Planning time: 0.374 ms
87 Execution time: 4215.626 ms
92 set enable\_hashjoin=0; set enable\_mergejoin=0; set enable\_nestloop=1;
93 EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
94 SELECT a,b,c FROM r NATURAL JOIN s NATURAL JOIN t;
99 Nested Loop (cost=0.00..172728360.00 rows=4812309 width=12)
100 Execution time: query-timeout
104 TODO: Explain perfomance diffrences.
106 TODO: change create numbers to change performance. -> (non)duplicate stuff?
110 \item create indices.
112 create index i1 on r(a);
113 create index i2 on r(b);
114 create index i3 on s(b);
115 create index i4 on s(c);
116 create index i5 on t(a);
117 create index i6 on t(c);
118 create index ii1 on r(a,b);
119 create index ii2 on s(b,c);
120 create index ii3 on t(a,c);
121 REINDEX DATABASE u00726236; VACUUM ANALYZE;
124 EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
125 SELECT a,b,c FROM r NATURAL JOIN s NATURAL JOIN t;
128 New query plan for merge join.
130 Merge Join (cost=126115.03..189598.35 rows=4812309 width=12)
131 (actual time=1471.775..3798.577 rows=4790517 loops=1)
132 Output: r.a, r.b, s.c
133 Merge Cond: ((t.a = r.a) AND (t.c = s.c))
134 Buffers: shared hit=7289 read=58, temp read=1325 written=3598
135 -> Index Only Scan using ii3 on public.t (cost=0.29..449.50 rows=10000 width=8)
136 (actual time=0.034..8.250 rows=10000 loops=1)
139 Buffers: shared hit=6299 read=29
140 -> Materialize (cost=126114.75..131006.97 rows=978445 width=12)
141 (actual time=1471.732..2093.037 rows=4790518 loops=1)
142 Output: r.a, r.b, s.c
143 Buffers: shared hit=990 read=29, temp read=1325 written=3598
144 -> Sort (cost=126114.75..128560.86 rows=978445 width=12)
145 (actual time=1471.727..1519.766 rows=105629 loops=1)
146 Output: r.a, r.b, s.c
148 Sort Method: external merge Disk: 21312kB
149 Buffers: shared hit=990 read=29, temp read=1325 written=3598
150 -> Merge Join (cost=809.67..12035.83 rows=978445 width=12)
151 (actual time=5.865..568.309 rows=993774 loops=1)
152 Output: r.a, r.b, s.c
153 Merge Cond: (r.b = s.b)
154 Buffers: shared hit=990 read=29
155 -> Index Scan using i2 on public.r (cost=0.29..449.80 rows=10000 width=8)
156 (actual time=0.026..7.013 rows=10000 loops=1)
158 Buffers: shared hit=945 read=29
159 -> Sort (cost=809.39..834.39 rows=10000 width=8)
160 (actual time=5.836..189.024 rows=993775 loops=1)
163 Sort Method: quicksort Memory: 853kB
164 Buffers: shared hit=45
165 -> Seq Scan on public.s (cost=0.00..145.00 rows=10000 width=8)
166 (actual time=0.013..2.654 rows=10000 loops=1)
168 Buffers: shared hit=45
169 Planning time: 1.060 ms
170 Execution time: 4323.022 ms
173 It seems like no performance improvement is made: Query planning and execution time is bigger with indices.
176 Suddenly nested loop joins become feasible.
178 Nested Loop (cost=0.57..498778.26 rows=4812309 width=12)
179 (actual time=0.096..8351.002 rows=4790517 loops=1)
180 Output: r.a, r.b, s.c
181 Buffers: shared hit=7476815 read=11
182 -> Nested Loop (cost=0.29..30520.00 rows=1007007 width=12)
183 (actual time=0.076..1017.339 rows=987869 loops=1)
184 Output: r.a, r.b, t.c
185 Buffers: shared hit=950972 read=4
186 -> Seq Scan on public.t (cost=0.00..145.00 rows=10000 width=8)
187 (actual time=0.015..2.720 rows=10000 loops=1)
189 Buffers: shared hit=45
190 -> Index Only Scan using ii1 on public.r (cost=0.29..2.05 rows=99 width=8)
191 (actual time=0.007..0.076 rows=99 loops=10000)
193 Index Cond: (r.a = t.a)
195 Buffers: shared hit=950927 read=4
196 -> Index Only Scan using ii2 on public.s (cost=0.29..0.41 rows=5 width=8)
197 (actual time=0.003..0.006 rows=5 loops=987869)
199 Index Cond: ((s.b = r.b) AND (s.c = t.c))
200 Heap Fetches: 4790517
201 Buffers: shared hit=6525843 read=7
202 Planning time: 0.604 ms
203 Execution time: 8884.429 ms
211 RESET ALL; EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
212 SELECT a,b,c FROM r NATURAL JOIN t WHERE b IN (SELECT s.b FROM s);
217 Hash Join (cost=578.59..12368.66 rows=1007007 width=12)
218 (actual time=25.336..340.303 rows=987869 loops=1)
219 Output: r.a, r.b, t.c
220 Hash Cond: (t.a = r.a)
221 Buffers: shared hit=135
222 -> Seq Scan on public.t (cost=0.00..145.00 rows=10000 width=8)
223 (actual time=0.035..1.820 rows=10000 loops=1)
225 Buffers: shared hit=45
226 -> Hash (cost=453.59..453.59 rows=10000 width=8)
227 (actual time=25.235..25.235 rows=10000 loops=1)
229 Buckets: 16384 Batches: 1 Memory Usage: 519kB
230 Buffers: shared hit=90
231 -> Hash Join (cost=172.27..453.59 rows=10000 width=8)
232 (actual time=14.607..22.222 rows=10000 loops=1)
234 Hash Cond: (r.b = s.b)
235 Buffers: shared hit=90
236 -> Seq Scan on public.r (cost=0.00..145.00 rows=10000 width=8)
237 (actual time=0.026..1.943 rows=10000 loops=1)
239 Buffers: shared hit=45
240 -> Hash (cost=171.01..171.01 rows=101 width=4)
241 (actual time=14.558..14.559 rows=101 loops=1)
243 Buckets: 1024 Batches: 1 Memory Usage: 12kB
244 Buffers: shared hit=45
245 -> HashAggregate (cost=170.00..171.01 rows=101 width=4)
246 (actual time=14.423..14.486 rows=101 loops=1)
249 Buffers: shared hit=45
250 -> Seq Scan on public.s (cost=0.00..145.00 rows=10000 width=4)
251 (actual time=0.028..6.220 rows=10000 loops=1)
253 Buffers: shared hit=45
254 Planning time: 0.626 ms
255 Execution time: 439.179 ms
258 It now takes only 439.7 ms to execute the query, instead of the previous 4323 ms.
261 \item This was the case with (d) already.\\
262 But if we create indexes we get a merge join + hash semi join combination, which is a little bit slower.
264 Merge Join (cost=1118.26..12199.68 rows=1007007 width=12)
265 (actual time=23.151..517.241 rows=987869 loops=1)
266 Output: r.a, r.b, t.c
267 Merge Cond: (t.a = r.a)
268 Buffers: shared hit=585 read=29
269 -> Index Scan using i5 on public.t (cost=0.29..448.89 rows=10000 width=8)
270 (actual time=0.038..4.266 rows=10000 loops=1)
272 Buffers: shared hit=495 read=29
273 -> Sort (cost=1117.98..1142.98 rows=10000 width=8)
274 (actual time=23.107..186.813 rows=987870 loops=1)
277 Sort Method: quicksort Memory: 853kB
278 Buffers: shared hit=90
279 -> Hash Join (cost=172.27..453.59 rows=10000 width=8)
280 (actual time=11.223..19.180 rows=10000 loops=1)
282 Hash Cond: (r.b = s.b)
283 Buffers: shared hit=90
284 -> Seq Scan on public.r (cost=0.00..145.00 rows=10000 width=8)
285 (actual time=0.019..2.069 rows=10000 loops=1)
287 Buffers: shared hit=45
288 -> Hash (cost=171.01..171.01 rows=101 width=4)
289 (actual time=11.183..11.183 rows=101 loops=1)
291 Buckets: 1024 Batches: 1 Memory Usage: 12kB
292 Buffers: shared hit=45
293 -> HashAggregate (cost=170.00..171.01 rows=101 width=4)
294 (actual time=11.079..11.126 rows=101 loops=1)
297 Buffers: shared hit=45
298 -> Seq Scan on public.s (cost=0.00..145.00 rows=10000 width=4)
299 (actual time=0.019..4.724 rows=10000 loops=1)
301 Buffers: shared hit=45
302 Planning time: 1.320 ms
303 Execution time: 624.356 ms
310 RESET ALL; EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
311 SELECT a,b,c FROM r NATURAL JOIN s WHERE a IN (SELECT t.a FROM t);
314 Without index: hash join + hash semi join
316 Hash Join (cost=440.25..1943.04 rows=112490 width=12)
317 (actual time=20.751..57.164 rows=105628 loops=1)
318 Planning time: 0.836 ms
319 Execution time: 67.660 ms
322 With index: hash join + merge join.
324 Hash Join (cost=440.59..1817.14 rows=112490 width=12)
325 (actual time=19.917..53.721 rows=105628 loops=1)
326 Planning time: 1.938 ms
327 Execution time: 65.131 ms
333 RESET ALL; EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS)
334 SELECT a,b,c FROM s NATURAL JOIN t WHERE a IN (SELECT r.a FROM r);
337 Without index: hash join + hash semi join
339 Hash Join (cost=442.27..56192.50 rows=4884102 width=12)
340 (actual time=17.134..1482.398 rows=4884102 loops=1)
341 Planning time: 0.547 ms
342 Execution time: 1980.768 ms
345 With index: hash join + hash join.
347 Hash Join (cost=442.27..56192.50 rows=4884102 width=12)
348 (actual time=17.631..1541.867 rows=4884102 loops=1)
349 Planning time: 0.978 ms
350 Execution time: 2040.407 ms
355 -> Indizes quasi irrelevant.\\
356 -> performance unterschiede weil unterschiedliche set mächtigkeiten?