Skip to content

Use SQL to instantly query HTML resources. Open source CLI. No DB required.

License

Notifications You must be signed in to change notification settings

turbot/steampipe-plugin-html

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

20 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

html Plugin for Steampipe

HTML plugin for Steampipe

Web pages often contain data in HTML tables. This plugin's html_table table downloads one or more tables from a web page into one or more CSV files that you can query using the CSV.

Web pages also contain links. This plugin's html_link table queries for them.

The file ./config/html.spc is used to define a local path to which downloaded HTML files will be saved.

Get started

Install go, then:

$ git clone https://github.com/turbot/steampipe-plugin-html

$ cp ./config/html.spc ~/.steampipe/config

$ make

$ steampipe query

> select
  base_name,
  name,
  path,
  columns
from
  html_table
where
  url = 'https://simple.wikipedia.org/wiki/List_of_U.S._states_by_population'
  and base_name = 'wiki'
+-----------+--------+---------------+------------------------------------------------------------------------
| base_name | name   | path          | columns
+-----------+--------+---------------+------------------------------------------------------------------------
| wiki      | wiki_0 | /home/jon/csv | "Rankinstates&territories,2019","Rankinstates&territories,2010","State"
+-----------+--------+---------------+------------------------------------------------------------------------

In this example the plugin found one table on the page, and downloaded it as /home/jon/csv/wiki_0.csv (the /home/jon/csv path is specified in ./config/html.spc).

Here is a query of that table.

with data as (
  select
    "State" as state,
    "Rankinstates&territories,2010"::int as rank2010,
    "Rankinstates&territories,2019"::int as rank2019,
    replace("Percentchange,20102019[note1]",'%','')::numeric as pct_change,
    replace("PercentofthetotalU.S.population,2018[note3]",'%','')::numeric as pct_of_us_pop
  from
    wiki_0
)
select
	state,
  rank2010,
  rank2019,
  - (rank2019 - rank2010) as rank_change,
  pct_change,
  pct_of_us_pop
from
  data
order by
  pct_change desc
+---------------+----------+----------+-------------+------------+---------------+
| state         | rank2010 | rank2019 | rank_change | pct_change | pct_of_us_pop |
+---------------+----------+----------+-------------+------------+---------------+
| Utah          | 35       | 30       | 5           | 16.0       | 0.96          |
| Texas         | 2        | 2        | 0           | 15.3       | 8.68          |
| Colorado      | 22       | 21       | 1           | 14.5       | 1.72          |
| NewYork       | 3        | 3        | 0           | 14.2       | 6.44          |
| Nevada        | 36       | 33       | 3           | 14.1       | 0.92          |
| Idaho         | 40       | 39       | 1           | 14.0       | 0.53          |
| Arizona       | 16       | 14       | 2           | 13.9       | 2.17          |
| NorthDakota   | 49       | 48       | 1           | 13.3       | 0.23          |