#!/bin/sh
#
# Parse HTML to extract the favicon URL.
#
# Syntax:
#
# ... | html-to-favicon-url
#
# Example:
#
# wget -q -O - http://example.com | html-to-favicon-url
# => http://example.com/favicon.ico
#
# ## Implementation
#
# This implementation requires `pup`; it uses a CSS attribute
# selector and the `attr{href}` display function.
#
# We have tried other implementations, such as hxselect, xmlstarlet,
# and XPath-style pup selectors, but these failed on various
# real-world web pages, e.g. www.kickstarter.com:
#
# * The `pup` implementation finds the link, but the
# selector doesn't seem to be able to do `contains`:
#
# ... | pup '//link[contains(concat(" ", @rel, " "), " icon ")]'
#
# * The `tidy` approach works to clean up the HTML, but nothing
# later in the pipe is able to find the link and its href:
#
# ... | tidy -q --show-warnings no -numeric -asxhtml
#
# * The `hxselect` approach works to find the link,
# but the CSS selector is unable to get the href:
#
# ... | hxclean 2>/dev/null | hxselect 'link[rel~="icon"]'
#
# * An implementation using Ruby succeeds, but needs Nokogiri:
#
# ... | ruby -rnokogiri -e \
# 'puts Nokogiri::HTML(STDIN.read).xpath("//link[contains(concat(\" \", @rel, \" \"), \" icon \")]").first["href"]'
#
# Command: html-to-favicon-url
# Version: 1.3.0
# Contact: Joel Parker Henderson ([email protected])
# License: GPL
# Created: 2015-05-31
# Updated: 2016-02-09
##
set -euf
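# Print the href of the first <link> whose rel attribute
# contains the whitespace-separated word "icon".
pup 'link[rel~="icon"] attr{href}' | head -n 1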