HDFS block placement uses rack awareness for fault tolerance by placing at least one block replica on a different rack, as shown in the following diagram:
Let's understand the figure in detail:
- The first replica is placed on the DataNode that initiates the write request, which keeps it on the local rack, for example, Rack 1, DataNode 1
- The second replica is placed on any DataNode of another rack, for example, Rack 2, DataNode 2
- The third replica is placed on a different DataNode of the same rack as the second replica, for example, Rack 2, DataNode 3
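The placement rule in the list above can be sketched in a few lines of Python. This is only an illustration, not the code HDFS runs: the function `choose_replica_nodes`, the `topology` dictionary, and the node names are all hypothetical, and the sketch assumes a replication factor of 3 and at least one remote rack with two or more DataNodes.

```python
import random


def choose_replica_nodes(topology, writer_node, rng=random):
    """Pick three DataNodes following the rule described in the list above.

    topology: dict mapping a rack name to its list of DataNode names
              (a hypothetical, hand-written cluster layout).
    writer_node: the DataNode that initiated the write request.
    """
    # Find the rack that contains the writer node.
    local_rack = next(
        rack for rack, nodes in topology.items() if writer_node in nodes
    )

    # First replica: the writer's own node, which is on the local rack.
    first = writer_node

    # Second replica: any DataNode on a different rack.
    remote_rack = rng.choice([rack for rack in topology if rack != local_rack])
    second = rng.choice(topology[remote_rack])

    # Third replica: another DataNode on the same rack as the second replica.
    third = rng.choice([n for n in topology[remote_rack] if n != second])

    return [first, second, third]


# Assumed two-rack layout, loosely matching the example in the text.
topology = {
    "rack1": ["dn1", "dn4"],
    "rack2": ["dn2", "dn3"],
}
replicas = choose_replica_nodes(topology, "dn1")
print(replicas)
```

With this layout, the first entry is always `dn1`, and the other two entries are `dn2` and `dn3` in either order, so losing all of Rack 1 or all of Rack 2 still leaves at least one replica available.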
A custom rack topology script, which maps each DataNode's address to a rack ID so that HDFS can select appropriate DataNodes, can be developed using a Unix shell, Java, or ...
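As a rough illustration of such a script, the following Python sketch follows the usual contract for Hadoop topology scripts: Hadoop passes one or more IP addresses or hostnames as command-line arguments and reads one rack path per argument from standard output. The address table `RACK_BY_HOST` and its entries are hypothetical; a real script would load site-specific data. The script is typically registered through the `net.topology.script.file.name` property in `core-site.xml`.

```python
#!/usr/bin/env python3
import sys

# Hypothetical address-to-rack table; replace with site-specific data.
RACK_BY_HOST = {
    "10.0.1.11": "/rack1",
    "10.0.2.12": "/rack2",
    "10.0.2.13": "/rack2",
}

# Rack assigned to any address that is missing from the table.
DEFAULT_RACK = "/default-rack"


def resolve_racks(hosts):
    """Return one rack path per host, in the same order as the input."""
    return [RACK_BY_HOST.get(host, DEFAULT_RACK) for host in hosts]


if __name__ == "__main__":
    # Print the rack paths for all arguments, separated by spaces.
    print(" ".join(resolve_racks(sys.argv[1:])))
```

Keeping the lookup in a plain function makes the mapping easy to check on its own before the script is deployed to the cluster.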