Package

io.archivesunleashed

app

package app

Type Members

  1. class CmdAppConf extends ScallopConf

    Constructs a Scallop option reader from a list of command-line argument strings.

  2. class CommandLineApp extends AnyRef

    Main application that parses command-line arguments and invokes the appropriate extractor.

  3. class NERCombinedJson extends Serializable

    Classifies records using NER and stores results as JSON.
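
The parse-then-dispatch flow described for CmdAppConf and CommandLineApp can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the package's actual code: the real CmdAppConf builds on Scallop, and every name here (`DispatchSketch`, `Conf`, the `--extractor`, `--input` and `--output` flags, and the handler bodies) is an assumption made for the example.

```scala
// Hypothetical sketch: parse arguments, then look up and run an extractor.
// The real CmdAppConf extends ScallopConf; this version avoids any dependency.
object DispatchSketch {
  // Parsed options: which extractor to run, its inputs, and an output path.
  final case class Conf(extractor: String, input: List[String], output: String)

  // Parse "--flag value [value ...]" style arguments into a Conf.
  def parse(args: List[String]): Either[String, Conf] = {
    // Group each flag with the values that follow it.
    def loop(rest: List[String], acc: Map[String, List[String]]): Map[String, List[String]] =
      rest match {
        case flag :: tail if flag.startsWith("--") =>
          val (values, remaining) = tail.span(!_.startsWith("--"))
          loop(remaining, acc + (flag.drop(2) -> values))
        case _ :: tail => loop(tail, acc) // ignore values with no preceding flag
        case Nil       => acc
      }

    val opts = loop(args, Map.empty)
    for {
      extractor <- opts.get("extractor").flatMap(_.headOption).toRight("missing --extractor")
      input     <- opts.get("input").filter(_.nonEmpty).toRight("missing --input")
      output    <- opts.get("output").flatMap(_.headOption).toRight("missing --output")
    } yield Conf(extractor, input, output)
  }

  // Placeholder handlers keyed by extractor name (assumed for illustration).
  val extractors: Map[String, Conf => String] = Map(
    "DomainFrequencyExtractor" -> (c => s"domain frequency over ${c.input.size} input(s)"),
    "PlainTextExtractor"       -> (c => s"plain text over ${c.input.size} input(s)")
  )

  // Parse the arguments and invoke the matching handler, or report an error.
  def run(args: List[String]): Either[String, String] =
    parse(args).flatMap { conf =>
      extractors
        .get(conf.extractor)
        .toRight(s"unknown extractor: ${conf.extractor}")
        .map(handler => handler(conf))
    }
}
```

Returning an `Either` keeps argument errors and unknown extractor names on one failure path, so a caller can print the message and exit without catching exceptions.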

Value Members

  1. object AudioInformationExtractor

  2. object CommandLineAppRunner

  3. object DomainFrequencyExtractor

  4. object DomainGraphExtractor

  5. object ExtractEntities

    Performs Named Entity Recognition (NER) on a WARC or ARC file.

    Named Entity Recognition applies rules formed in a Named Entity Classifier to identify locations, people or other objects from data.

  6. object ExtractImageDetailsDF

    Extracts image details given raw bytes.

  7. object ExtractPopularImagesDF

    Extracts the most popular images from a DataFrame.

  8. object ExtractPopularImagesRDD

    Extracts the most popular images from an RDD.

  9. object ImageGraphExtractor

  10. object ImageInformationExtractor

  11. object PDFInformationExtractor

  12. object PlainTextExtractor

  13. object PresentationProgramInformationExtractor

  14. object SpreadsheetInformationExtractor

  15. object TextFilesInformationExtractor

  16. object VideoInformationExtractor

  17. object WebGraphExtractor

  18. object WebPagesExtractor

  19. object WordProcessorInformationExtractor

  20. object WriteGEXF

  21. object WriteGraphML
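
For the ExtractPopularImagesDF and ExtractPopularImagesRDD entries above, one plausible reading of "most popular" is "most frequently occurring byte-identical image". The sketch below illustrates that idea on a plain Scala collection instead of a Spark DataFrame or RDD. The `Image` record, the use of MD5 to identify duplicates, and the `mostPopular` signature are all assumptions for illustration, not the objects' documented behaviour.

```scala
import java.security.MessageDigest

// Hypothetical sketch: rank images by how often identical content appears.
object PopularImagesSketch {
  // Assumed record shape: an image URL and its raw bytes.
  final case class Image(url: String, bytes: Array[Byte])

  // Hex-encoded MD5 digest, used so byte-identical images count as one image.
  def md5(bytes: Array[Byte]): String =
    MessageDigest.getInstance("MD5").digest(bytes).map("%02x".format(_)).mkString

  // Return (first URL seen, occurrence count) for the `limit` most frequent images.
  def mostPopular(images: Seq[Image], limit: Int): Seq[(String, Int)] =
    images
      .groupBy(img => md5(img.bytes))
      .values
      .map(group => (group.head.url, group.size))
      .toSeq
      .sortBy { case (url, count) => (-count, url) } // highest count first, ties by URL
      .take(limit)
}
```

Grouping by a content digest instead of by URL means the same image served from several addresses is counted once per occurrence of its content.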
