Package

io.archivesunleashed

app

package app

Type Members

  1. class NERCombinedJson extends Serializable

    Classifies records using NER and stores results as JSON.

Value Members

  1. object ExtractEntities

    Performs Named Entity Recognition (NER) on a WARC or ARC file.

    Named Entity Recognition applies the rules of a Named Entity Classifier to identify locations, people, and other entities in the data.

  2. object ExtractGraph

    Extracts a network graph using Spark's GraphX utility.

  3. object ExtractPopularImages

    Extracts the most popular images from an RDD.

  4. object WriteGEXF

    UDF for exporting an RDD representing a collection of links to a GEXF file.

  5. object WriteGraphML

    UDF for exporting an RDD representing a collection of links to a GraphML file.
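
To make the idea of exporting a collection of links to GraphML concrete, here is a minimal plain-Scala sketch. It is not the library's WriteGraphML implementation and does not use Spark. The object name `GraphMLSketch`, the `Link` shape of (source, target, weight), and the `weight` edge attribute are all assumptions made for illustration. The real input type and output layout may differ.

```scala
// Hypothetical sketch: NOT the io.archivesunleashed.app.WriteGraphML code.
object GraphMLSketch {
  // Assumed link shape: (source, target, weight).
  type Link = (String, String, Int)

  // Escape XML special characters so URLs are safe inside attribute values.
  def escape(s: String): String =
    s.map {
      case '&'  => "&amp;"
      case '<'  => "&lt;"
      case '>'  => "&gt;"
      case '"'  => "&quot;"
      case '\'' => "&apos;"
      case c    => c.toString
    }.mkString

  // Render the links as a directed GraphML document.
  def toGraphML(links: Seq[Link]): String = {
    // Every distinct endpoint becomes one node.
    val nodes = links.flatMap { case (src, dst, _) => Seq(src, dst) }.distinct
    val nodeXml = nodes.map(n => "    <node id=\"" + escape(n) + "\"/>")
    // Every link becomes one edge carrying its weight as a data element.
    val edgeXml = links.map { case (src, dst, w) =>
      "    <edge source=\"" + escape(src) + "\" target=\"" + escape(dst) + "\">" +
        "<data key=\"weight\">" + w + "</data></edge>"
    }
    val header = Seq(
      "<?xml version=\"1.0\" encoding=\"UTF-8\"?>",
      "<graphml xmlns=\"http://graphml.graphdrawing.org/xmlns\">",
      "  <key id=\"weight\" for=\"edge\" attr.name=\"weight\" attr.type=\"int\"/>",
      "  <graph id=\"G\" edgedefault=\"directed\">"
    )
    val footer = Seq("  </graph>", "</graphml>")
    (header ++ nodeXml ++ edgeXml ++ footer).mkString("\n")
  }
}
```

In a Spark job, a string built this way would typically be produced from a collected (and therefore small enough) set of links and then written to a file. How the library itself handles collection and output is not shown here.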
