Fitting in nicely with the discussion on pulling financial data into Google Spreadsheets, the OUseful blog recently demonstrated another Spreadsheet data import function, importHTML(), which allows you to easily link an external HTML table to your workbook.
The Google spreadsheet function =importHTML(“”,”table”,N) will scrape a table from an HMTL web page into a Google spreadsheet. The URL of the target web page, and the target table element both need to be in double quotes. The number N identifies the N’th table in the page (counting starts at 0) as the target table for data scraping.
The author goes on to show you how to pull a country population table from a Wikipedia entry into a spreadsheet, create a graph from it, publish the spreadsheet as a CSV, consume the CSV in Yahoo Pipes, export the Pipe output to KML, and import the KML into a Google Map. Whew!
The importHTML function will accept either “list” or “table” as the second parameter, which allows you to retrieve records from either UL/OL/DL lists or TABLE contents, respectively. If you want to retrieve something that’s not table or list based, the importXML may also come in handy. With importXML, you can pull data from any XML or HTML file using an XPath query to target a specific tag or attribute. For more information on these import functions, consult the official documentation below.
Data Scraping Wikipedia with Google Spreadsheets
Google Docs Documentation: Functions For External Data
Previously:
HOWTO – track stocks in Google Spreadsheets
ADVERTISEMENT