Import XML in Google Sheets Not Working? Here’s How to Fix
Import XML in Google Sheets Not Working? Here’s How to Fix
Zac Cannon
July 3, 2024
How to Import a Large XML Sitemap into Google Sheets and Fix ErrorResource at URL Contents Exceeded Maximum Size
If you’re working with a large website with potentially thousands of URLs, then you may get hit with the error message “Resource at url contents exceeded maximum size”.
Sitemaps can contain up to a maximum of 50,000 URLs and can’t be more than 50MB in size. However IMPORTXML does seem to far break below this limit, and I’ve seen it breaking with XML sitemaps as low as 10,000. Fortunately I’ve managed to overcome this limitation. One way I’ve found is to parse the XML yourself using Google AppScript with XMLService. This will work providing there is nothing blocking you crawling the site
See below code snippet to parse an XML sitemap using AppScript and import the URLs into Google Sheets.
The script works by effectively taking the whole sitemap, fetching the text of the <loc> element which is always at position 0 in the <url> array. It could also be amended to fetch the <lastmod> element or any other elements you wanted as well. It then calls another function to push the urls to a Google Sheet of your choosing after overwriting any old data.
function requestXMLSitemap() {
const url = 'INSERT_SITEMAP_URL' // place your sitemap URL here
const options = {
header: {
'Content-Type': 'application/xml'
},
method: 'GET'
}
const response = UrlFetchApp.fetch(url, options).getContentText()
const xml = XmlService.parse(response)
const root = xml.getRootElement();
const children = root.getChildren();
const data = [];
for (let item in children){
let url = children[item].getChildren()[0].getText();
data.push([url]);
}
pushToSheet(data);
}
function pushToSheet(data) {
// settings
const settings = {
url: 'INSERT_SPREADSHEET_URL_HERE', // place your spreadsheet URL here
sheet: 'INSERT SHEET NAME HERE', // place the sheet name here
}
const ss = SpreadsheetApp.openByUrl(settings.url);
const sheet = ss.getSheetByName(settings.sheet);
sheet.getRange("A2:A").clearContent(); // place the range to push urls to here
sheet.getRange(2,1,data.length,data[0].length).setValues(data) // export data to sheet
}
Import XML in Google Sheets Not Working? Here’s How to Fix
Import XML in Google Sheets Not Working? Here’s How to Fix
Zac Cannon
July 3, 2024
How to Import a Large XML Sitemap into Google Sheets and Fix ErrorResource at URL Contents Exceeded Maximum Size
If you’re working with a large website with potentially thousands of URLs, then you may get hit with the error message “Resource at url contents exceeded maximum size”.
Sitemaps can contain up to a maximum of 50,000 URLs and can’t be more than 50MB in size. However IMPORTXML does seem to far break below this limit, and I’ve seen it breaking with XML sitemaps as low as 10,000. Fortunately I’ve managed to overcome this limitation. One way I’ve found is to parse the XML yourself using Google AppScript with XMLService. This will work providing there is nothing blocking you crawling the site
See below code snippet to parse an XML sitemap using AppScript and import the URLs into Google Sheets.
The script works by effectively taking the whole sitemap, fetching the text of the <loc> element which is always at position 0 in the <url> array. It could also be amended to fetch the <lastmod> element or any other elements you wanted as well. It then calls another function to push the urls to a Google Sheet of your choosing after overwriting any old data.
function requestXMLSitemap() {
const url = 'INSERT_SITEMAP_URL' // place your sitemap URL here
const options = {
header: {
'Content-Type': 'application/xml'
},
method: 'GET'
}
const response = UrlFetchApp.fetch(url, options).getContentText()
const xml = XmlService.parse(response)
const root = xml.getRootElement();
const children = root.getChildren();
const data = [];
for (let item in children){
let url = children[item].getChildren()[0].getText();
data.push([url]);
}
pushToSheet(data);
}
function pushToSheet(data) {
// settings
const settings = {
url: 'INSERT_SPREADSHEET_URL_HERE', // place your spreadsheet URL here
sheet: 'INSERT SHEET NAME HERE', // place the sheet name here
}
const ss = SpreadsheetApp.openByUrl(settings.url);
const sheet = ss.getSheetByName(settings.sheet);
sheet.getRange("A2:A").clearContent(); // place the range to push urls to here
sheet.getRange(2,1,data.length,data[0].length).setValues(data) // export data to sheet
}
Import XML in Google Sheets Not Working? Here’s How to Fix
Import XML in Google Sheets Not Working? Here’s How to Fix
Zac Cannon
July 3, 2024
How to Import a Large XML Sitemap into Google Sheets and Fix ErrorResource at URL Contents Exceeded Maximum Size
If you’re working with a large website with potentially thousands of URLs, then you may get hit with the error message “Resource at url contents exceeded maximum size”.
Sitemaps can contain up to a maximum of 50,000 URLs and can’t be more than 50MB in size. However IMPORTXML does seem to far break below this limit, and I’ve seen it breaking with XML sitemaps as low as 10,000. Fortunately I’ve managed to overcome this limitation. One way I’ve found is to parse the XML yourself using Google AppScript with XMLService. This will work providing there is nothing blocking you crawling the site
See below code snippet to parse an XML sitemap using AppScript and import the URLs into Google Sheets.
The script works by effectively taking the whole sitemap, fetching the text of the <loc> element which is always at position 0 in the <url> array. It could also be amended to fetch the <lastmod> element or any other elements you wanted as well. It then calls another function to push the urls to a Google Sheet of your choosing after overwriting any old data.
function requestXMLSitemap() {
const url = 'INSERT_SITEMAP_URL' // place your sitemap URL here
const options = {
header: {
'Content-Type': 'application/xml'
},
method: 'GET'
}
const response = UrlFetchApp.fetch(url, options).getContentText()
const xml = XmlService.parse(response)
const root = xml.getRootElement();
const children = root.getChildren();
const data = [];
for (let item in children){
let url = children[item].getChildren()[0].getText();
data.push([url]);
}
pushToSheet(data);
}
function pushToSheet(data) {
// settings
const settings = {
url: 'INSERT_SPREADSHEET_URL_HERE', // place your spreadsheet URL here
sheet: 'INSERT SHEET NAME HERE', // place the sheet name here
}
const ss = SpreadsheetApp.openByUrl(settings.url);
const sheet = ss.getSheetByName(settings.sheet);
sheet.getRange("A2:A").clearContent(); // place the range to push urls to here
sheet.getRange(2,1,data.length,data[0].length).setValues(data) // export data to sheet
}
Import XML in Google Sheets Not Working? Here’s How to Fix
Import XML in Google Sheets Not Working? Here’s How to Fix
Zac Cannon
July 3, 2024
How to Import a Large XML Sitemap into Google Sheets and Fix ErrorResource at URL Contents Exceeded Maximum Size
If you’re working with a large website with potentially thousands of URLs, then you may get hit with the error message “Resource at url contents exceeded maximum size”.
Sitemaps can contain up to a maximum of 50,000 URLs and can’t be more than 50MB in size. However IMPORTXML does seem to far break below this limit, and I’ve seen it breaking with XML sitemaps as low as 10,000. Fortunately I’ve managed to overcome this limitation. One way I’ve found is to parse the XML yourself using Google AppScript with XMLService. This will work providing there is nothing blocking you crawling the site
See below code snippet to parse an XML sitemap using AppScript and import the URLs into Google Sheets.
The script works by effectively taking the whole sitemap, fetching the text of the <loc> element which is always at position 0 in the <url> array. It could also be amended to fetch the <lastmod> element or any other elements you wanted as well. It then calls another function to push the urls to a Google Sheet of your choosing after overwriting any old data.
function requestXMLSitemap() {
const url = 'INSERT_SITEMAP_URL' // place your sitemap URL here
const options = {
header: {
'Content-Type': 'application/xml'
},
method: 'GET'
}
const response = UrlFetchApp.fetch(url, options).getContentText()
const xml = XmlService.parse(response)
const root = xml.getRootElement();
const children = root.getChildren();
const data = [];
for (let item in children){
let url = children[item].getChildren()[0].getText();
data.push([url]);
}
pushToSheet(data);
}
function pushToSheet(data) {
// settings
const settings = {
url: 'INSERT_SPREADSHEET_URL_HERE', // place your spreadsheet URL here
sheet: 'INSERT SHEET NAME HERE', // place the sheet name here
}
const ss = SpreadsheetApp.openByUrl(settings.url);
const sheet = ss.getSheetByName(settings.sheet);
sheet.getRange("A2:A").clearContent(); // place the range to push urls to here
sheet.getRange(2,1,data.length,data[0].length).setValues(data) // export data to sheet
}
We grow e-commerce brands.
Get in touch: sales@vida-digital.co.uk
Vida Digital Marketing Limited
Registered in England and Wales
Company number: 14162188
© Copyright 2024
Vida Digital Marketing Limited
Crafted by kreated
We grow e-commerce brands.
Get in touch: sales@vida-digital.co.uk
Vida Digital Marketing Limited
Registered in England and Wales
Company number: 14162188
© Copyright 2024
Vida Digital Marketing Limited
Crafted by kreated
We grow e-commerce brands.
Get in touch: sales@vida-digital.co.uk
Vida Digital Marketing Limited
Registered in England and Wales
Company number: 14162188
© Copyright 2024
Vida Digital Marketing Limited
Crafted by kreated
We grow e-commerce brands.
Get in touch: sales@vida-digital.co.uk
Vida Digital Marketing Limited
Registered in England and Wales
Company number: 14162188
© Copyright 2024
Vida Digital Marketing Limited
Crafted by kreated