Left semi join

Left semi join returns rows from the left table if, and only if, their join key has a match in the right table. It is the opposite of the left anti join seen in the previous section. The output does not include any right-side values, and each left row appears at most once, no matter how many right rows match it. It can perform well, because only the left table is read in full, while the right table is only checked for the join condition.

This is similar to the left outer join, except that we output only the matching records from the left table, cities.csv, without any right-side values.
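As a rough illustration of the semantics only, the following plain Python sketch filters a left table by the keys found in a right table. It is not the job code used in this chapter. The sample rows, the right-side `temperatures` table, and the helper name `left_semi_join` are all assumptions made for the example, since the actual contents of cities.csv are not shown here.

```python
# Hypothetical sample data standing in for cities.csv: (city_id, name)
cities = [
    (1, "Boston"),
    (2, "New York"),
    (3, "Chicago"),
    (4, "Philadelphia"),
]

# Hypothetical right-side table: (city_id, date, temperature)
temperatures = [
    (1, "2018-01-01", 21),
    (1, "2018-01-02", 23),  # second match for city 1
    (3, "2018-01-01", 15),
    (5, "2018-01-01", 30),  # no such city on the left side
]


def left_semi_join(left, right, left_key, right_key):
    """Return the left rows whose key appears in the right table.

    The right table is reduced to a set of keys, so it is used only
    to test the join condition and contributes no output columns.
    """
    right_keys = {right_key(row) for row in right}
    return [row for row in left if left_key(row) in right_keys]


result = left_semi_join(
    cities,
    temperatures,
    left_key=lambda city: city[0],
    right_key=lambda temp: temp[0],
)
print(result)  # [(1, 'Boston'), (3, 'Chicago')]
```

Boston appears once even though it has two matching temperature rows, and no temperature values appear in the output. Inverting the membership test (`not in right_keys`) would give the left anti join instead.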
