1.) As I said in the piece, the source is Harappa Ancestry scores. These are publicly available online on forums like Anthrogenica.
2.) Dalits do not form the majority of the Gangetic Plain. Scheduled Castes and Tribes together are 25% of India's population, and per the British census the true Dalit and Adivasi figures are even lower than that.
3.) The PCA was made via BioVinci, and the dimensions were based on Harappa components. Again, this is all publicly available data. If you are willing to put the work in, you can recreate this same visualization yourself.
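
For anyone who wants to try this without BioVinci, here is a rough sketch in Python of a PCA over admixture component percentages. It assumes NumPy is installed. The function name `pca_scores` is mine, and the matrix is randomly generated placeholder data, not real Harappa scores. This is a generic PCA and not necessarily the exact procedure BioVinci runs.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project the rows of X onto its top principal components.

    X: one row per sample, one column per admixture component.
    Returns (scores, explained_variance_ratio).
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # center each component column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)    # share of variance per axis
    return scores, explained[:n_components]

# Placeholder input (NOT real data): 6 hypothetical samples, 4 hypothetical
# components, each row rescaled to sum to 100 like admixture percentages.
rng = np.random.default_rng(0)
raw = rng.random((6, 4))
admixture = 100 * raw / raw.sum(axis=1, keepdims=True)

scores, explained = pca_scores(admixture)
print(scores)      # PC1 and PC2 coordinates for each sample
print(explained)   # fraction of variance captured by PC1 and PC2
```

To recreate a plot, swap the placeholder matrix for the published scores (one row per sample, one column per component) and scatter `scores[:, 0]` against `scores[:, 1]`.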