Extracts domain frequency from a web archive using DataFrames and Spark SQL.
Input: a DataFrame obtained from RecordLoader.
Returns: a Dataset[Row] with the schema (Domain, count).
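
The DataFrame variant can be sketched in plain Spark SQL as below. This is a minimal illustration, not the toolkit's own implementation: the object name `DomainFrequencySketch`, the helper `domainFrequency`, and the assumption that the input has a string column named `url` are all hypothetical, and a small in-memory list of URLs stands in for the DataFrame that RecordLoader would provide. Domain extraction here uses Spark SQL's built-in `parse_url` and strips a leading `www.`, which may differ from the toolkit's own rules.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, Row, SparkSession}
import org.apache.spark.sql.functions.{desc, expr, regexp_replace}

object DomainFrequencySketch {
  // Hypothetical helper: assumes `df` has a string column named "url".
  // Returns rows with the schema (Domain, count), most frequent first.
  def domainFrequency(df: DataFrame): Dataset[Row] =
    df
      // parse_url is a built-in Spark SQL function; strip a leading "www."
      .withColumn("Domain", regexp_replace(expr("parse_url(url, 'HOST')"), "^www\\.", ""))
      .groupBy("Domain")
      .count()
      .orderBy(desc("count"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("domain-frequency-sketch")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the DataFrame obtained from RecordLoader (assumption).
    val pages = Seq(
      "http://www.example.com/a",
      "http://example.com/b",
      "https://archive.org/web/"
    ).toDF("url")

    domainFrequency(pages).show(truncate = false)
    spark.stop()
  }
}
```

Grouping on a derived column and ordering by `count` keeps the whole computation inside Spark SQL, so the optimizer can plan it as a single aggregation.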
Extracts domain frequency from a web archive using MapReduce.
Input: an RDD[ArchiveRecord] obtained from RecordLoader.
Returns: an RDD[(String, Int)] of (DomainName, DomainFrequency) pairs.
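
The RDD variant follows the usual map-then-reduce pattern, sketched below in plain Spark. This is an illustration under stated assumptions rather than the toolkit's code: `DomainFrequencyRddSketch`, `extractDomain`, and `domainFrequency` are hypothetical names, and an `RDD[String]` of URLs stands in for `RDD[ArchiveRecord]`, since the record type's accessors are not shown here. Domains are taken from `java.net.URI` with a leading `www.` removed, and URLs that fail to parse are skipped.

```scala
import java.net.URI
import scala.util.Try
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object DomainFrequencyRddSketch {
  // Hypothetical helper: host name without a leading "www.", or None
  // when the URL cannot be parsed or has no host.
  def extractDomain(url: String): Option[String] =
    Try(Option(new URI(url).getHost)).toOption.flatten.map(_.stripPrefix("www."))

  // Map each URL to (domain, 1), then sum the counts per domain.
  def domainFrequency(urls: RDD[String]): RDD[(String, Int)] =
    urls
      .flatMap(u => extractDomain(u).toList)
      .map(d => (d, 1))
      .reduceByKey(_ + _)
      .sortBy(_._2, ascending = false)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("domain-frequency-rdd-sketch")
      .getOrCreate()

    // Stand-in for the URLs of records obtained from RecordLoader (assumption).
    val urls = spark.sparkContext.parallelize(Seq(
      "http://www.example.com/a",
      "http://example.com/b",
      "https://archive.org/web/",
      "not a url"
    ))

    domainFrequency(urls).collect().foreach(println)
    spark.stop()
  }
}
```

`reduceByKey` combines counts on each partition before shuffling, which is generally cheaper than grouping all values for a key and counting afterwards.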