Information about an image.
Information about an image, e.g. width and height.
Image sizing utilities.
Computes MD5 checksum.
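As an illustration of the checksum step, here is a minimal sketch using the JDK's `MessageDigest`. The object name `Md5Sketch` and the method `md5Hex` are hypothetical and are not taken from this package; the actual utility may differ in signature and output format.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

// Hypothetical helper, not part of the documented package.
object Md5Sketch {
  /** Returns the MD5 digest of `bytes` as a 32-character lowercase hex string. */
  def md5Hex(bytes: Array[Byte]): String = {
    val digest = MessageDigest.getInstance("MD5").digest(bytes)
    // Mask with 0xff so that negative byte values are not sign-extended.
    digest.map(b => "%02x".format(b & 0xff)).mkString
  }
}
```

For example, `Md5Sketch.md5Hex("hello".getBytes(StandardCharsets.UTF_8))` yields `5d41402abc4b2a76b9719d911017c592`.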
Detects language using Apache Tika.
Detects MIME type using Apache Tika.
Extract raw text content from an HTML page, minus "boilerplate" content (using boilerpipe).
Gets different parts of a dateString.
Extracts the host domain name from a full URL string.
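One way to picture host extraction is with `java.net.URI`. This is a sketch under assumptions: the names `DomainSketch` and `extractDomain` are hypothetical, and returning an empty string for unparseable input is a choice made here, not documented behaviour of the package.

```scala
import java.net.URI
import scala.util.Try

// Hypothetical helper, not part of the documented package.
object DomainSketch {
  /** Returns the host part of `url`, or "" if it cannot be parsed or has no host. */
  def extractDomain(url: String): String =
    // URI's constructor throws on malformed input; getHost may return null.
    Try(Option(new URI(url.trim).getHost)).toOption.flatten.getOrElse("")
}
```

With this sketch, `DomainSketch.extractDomain("https://www.archive.org/about/index.html")` gives `www.archive.org`, and a string that is not a valid URI gives an empty string.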
Extracts image details given raw bytes.
Extracts image links from a webpage given the HTML content (using Jsoup).
Extracts links from a webpage given the HTML content (using Jsoup).
Extracts text from PDFs using Apache Tika.
Extracts URLs found in a string of text.
Returns a list of URLs found in the string.
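A regular expression is one simple way to pull URLs out of free text. The sketch below assumes only `http` and `https` URLs matter and treats whitespace and angle brackets as terminators; the pattern and the names `UrlSketch` and `extractUrls` are hypothetical, and the package's own matching rules may be different.

```scala
// Hypothetical helper, not part of the documented package.
object UrlSketch {
  // Assumed pattern: scheme, then any run of characters up to whitespace or < >.
  private val UrlPattern = """https?://[^\s<>]+""".r

  /** Returns every match of the assumed URL pattern, in order of appearance. */
  def extractUrls(text: String): List[String] =
    UrlPattern.findAllIn(text).toList
}
```

Note that a pattern this loose keeps trailing punctuation attached to a URL, so a real implementation would probably need extra trimming.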
Gets the file extension using the MIME type first, then the URL extension.
Reads in a text string, and returns entities identified by the configured Stanford NER classifier.
Removes HTML markup with Jsoup.
Removes HTTP headers.
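Header removal can be sketched as cutting everything up to the first blank line of an HTTP response. The example assumes the content is a string that starts with `HTTP/` and that headers end with `\r\n\r\n`; the names `HttpHeaderSketch` and `removeHttpHeader` are hypothetical and may not match the actual utility.

```scala
// Hypothetical helper, not part of the documented package.
object HttpHeaderSketch {
  // Assumed delimiter between the header block and the body.
  private val HeaderEnd = "\r\n\r\n"

  /** Drops the leading HTTP header block, if one is present; otherwise returns the input. */
  def removeHttpHeader(content: String): String =
    if (content.startsWith("HTTP/")) {
      val idx = content.indexOf(HeaderEnd)
      // Leave the content untouched when no header terminator is found.
      if (idx >= 0) content.substring(idx + HeaderEnd.length) else content
    } else content
}
```

Content that does not begin with an HTTP status line passes through unchanged, which keeps the function safe to apply to records that were already stripped.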
Tuple formatter utility.
Package object that supplies implicits providing common UDF-related functionality.