Disclaimer: This entry is not a complete tutorial. I share all of my code, but you’ll have to put two and two together yourself :) Maybe seek help from an AI chatbot of choice.
A few months ago, I got hooked on playing Magic the Gathering. It’s a game I’ve always been curious about playing, but I never fully tried to before. As a kid, it seemed difficult and expensive and out of reach, but now with a stable income, it’s perfect. It’s a nice distraction from all of the digital noise and a fun way to hang out with friends and family.
One thing I didn’t expect, was to also getting hooked on the whole “collector” aspect of the cards. I’ve never gambled much, as betting never felt gratifying to me. But with Magic, I’ve found myself drawn into it anyway. I know it’s probably a losing game, but compared to betting on numbers or sports, Magic feels like betting on stocks. New cards are constantly coming out, which makes it interesting for coding prediction algorithms.
There are many tools out there that allow to track your own cards, but from what I’ve found, few, if any, let you model and play with the data and economics.
And so, my journey has begun. Whaddayaknow… Making a digital exercise out of it anyways.
In this entry, I’ll share my initial progress and learnings within this bit of data analysis. This won’t be a tutorial. Just the necessary code and structure to maybe get you started.
Also, I’m not a finance expert. Don’t take my advice. My goal for myself is to use this as an excuse to learn about datascience and analysis.
Let’s get started.
Here’s a screenshot of what I’ve built today in Google Sheets with the code in this post. It’s basically a database of individual cards I own. I don’t own many yet, but I have a few I want to start monitoring. A simple start. Down the road I imagine I will try to apply different prediction techniques.
It is mostly automated. I wanted to minimise my workload. I’ve configured a system where I only need to press “Add Entry,” paste in a URL, and type in my buying price. From that point, the spreadsheet will collect daily trend data for the card and create a graph for each row.
The graph is quite basic at this point: a column graph where each column is red if below my buying price and green if above. As I collect more data, I look forward to coding other models. As I said, this is my entry point into learning about data science and potentially some machine learning.
There are 2 scripts and a bunch of formulas.
Script 1: Daily harvest
This is the main script of the project. You simply add it by going to Extensions -> App Scripts
and pasting it in. Then in the same interface you can configure it to trigger once every day.
function fetchPricesDaily() {
// Get the active spreadsheet and sheet
var sheet =
SpreadsheetApp.getActiveSpreadsheet().getSheetByName("MTG Singles");
// We get the XPath which is configured at C2 in the dataset.
var xpath = sheet.getRange("C2").getValue();
// make sure xpath can be passed to IMPORTXML
xpath = xpath.replace(/"/g, "'");
// Get today's date in a specific format to use as column header
var today = new Date();
var formattedTodayDate = Utilities.formatDate(
today,
Session.getScriptTimeZone(),
"yyyy-MM-dd"
);
// Find the last column with date in the header
var dateRow = 3; // In our setup this is the row with dates on.
var dateColumn = 1; // We just start looking from the first column
var includeRows = 1; // We only want to look at this single row
var lastColumnOnDateRow = sheet
.getRange(dateRow, dateColumn, includeRows, sheet.getMaxColumns())
.getValues()[0]
.filter(String).length;
var dateRange = sheet.getRange(3, 1, 1, lastColumnOnDateRow); // Get the first row, spanning all columns
var dates = dateRange.getValues()[0]; // Get the values of the first row as an array
console.log(dates);
// Get last array item
var lastDate = dates[dates.length - 1];
var formattedLastDate = Utilities.formatDate(
new Date(lastDate),
Session.getScriptTimeZone(),
"yyyy-MM-dd"
);
// If the value of the last date isn't the same as formattedTodayDate, add a new column with formattedTodayDate date
if (formattedLastDate != formattedTodayDate) {
dateColumnIndex = lastColumnOnDateRow + 1;
sheet.getRange(dateRow, dateColumnIndex).setValue(formattedTodayDate);
// Pull prices for all rows of items
getPrices(sheet, xpath, dateColumnIndex);
}
}
function getPrices(sheet, xpath, columnIndex) {
var dataRange = sheet.getRange(4, 1, sheet.getLastRow() - 3, 8);
var data = dataRange.getValues();
// Iterate over each row of data
data.forEach(function (row, index) {
var url = row[0];
if (url == "") {
return;
}
// Create the IMPORTXML formula to fetch the raw price
var rawPriceFormula = 'IMPORTXML("' + url + '"; "' + xpath + '")';
// Set the raw price formula on todays date
sheet.getRange(index + 4, columnIndex).setFormula(rawPriceFormula);
});
// Wait for 5 seconds to allow IMPORTXML to complete
Utilities.sleep(5000);
// Then iterate over all of the cells again and replace their formulas with their values
data.forEach(function (row, index) {
// Replace formulas with values in the range containing IMPORTXML formulas
var range = sheet.getRange(index + 4, columnIndex);
var values = range.getValues();
range.setValues(values);
});
}
The most important part of this script is the `IMPORTXML()` formula. This formula is entered into each cell with a corresponding URL and the XPath that leads to the right dom element on the page.
The XPath took a bit of trial and error for me, but I think I’ve figured out all of the gotcha’s
//*[@id="tabContent-info"]/div/div[contains(@class, 'mx-auto')]/div/div[2]/dl/dt[contains(text(), 'Price Trend')]/following-sibling::dd[1]
It’s basically pointing towards a Cardmarket product’s price section and navigating to the “trend” value. It takes care of the edge cases where some cards only exist as foil, which means the “foil” div toggle doesn’t exist in the dom and it would fail. And it also takes care of the situation where a card doesn’t have any reprints at all which meant that the final element couldn’t just be selected with an index, but I found a way using the sibling trick.
Then, after having added the `IMPORTXML()` formula’s to each cell, you’ll notice that I wait for 5 seconds and then run through the cells again. I do this to overwrite the formula with the scraped value. This turned out to be important because the IMPORTXML formula could (and would) get triggered in cells in previous days and that will obviously corrupt the data. So this way, I only leave a regular number behind in each cell.
Script 2: Adding an entry
This is a small quality of life improvement. I made it easy to add an entry with a button. This ensures that my dataset doesn’t have a lot of empty rows with formulas in them. The reason why that’s important, is that every time I sorted by smallest/highest values, these empty formula rows would mix in as well. So to fix this, I’m not pre-adding formulas to the empty rows, but in stead adding them with a button
function addEntry() {
var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
var lastRow = sheet.getLastRow();
// Insert the new row and set values
sheet.insertRowAfter(lastRow);
// Set the predefined values in the new row (except for "Is Foil" and SPARKLINE)
sheet.getRange(lastRow + 1, 1).setValue("Fill");
sheet.getRange(lastRow + 1, 5).setValue("Fill");
var productNameFormula =
"=SUBSTITUTE(SUBSTITUTE(INDEX(SPLIT(INDEX(SPLIT(A" +
(lastRow + 1) +
' ; "/") ; COUNTA(SPLIT(A' +
(lastRow + 1) +
' ; "/")) - 0) ; "?") ; 1) ; "-" ; " ") ; "Fill" ; "")';
sheet.getRange(lastRow + 1, 2).setFormula(productNameFormula);
var setNameFormula =
"=SUBSTITUTE(SUBSTITUTE(INDEX(SPLIT(A" +
(lastRow + 1) +
' ; "/"); COUNTA(SPLIT(A' +
(lastRow + 1) +
' ; "/")) - 1) ; "-" ; " ") ; "Fill" ; "")';
sheet.getRange(lastRow + 1, 3).setFormula(setNameFormula);
var isFoilFormula = "=REGEXMATCH(A" + (lastRow + 1) + ';"isFoil=Y")';
sheet.getRange(lastRow + 1, 4).setFormula(isFoilFormula);
var sparklineFormula =
"=SPARKLINE(ARRAYFORMULA(IF(G" +
(lastRow + 1) +
":ZY" +
(lastRow + 1) +
' = ""; G' +
(lastRow + 1) +
":ZY" +
(lastRow + 1) +
"; G" +
(lastRow + 1) +
":ZY" +
(lastRow + 1) +
" - E" +
(lastRow + 1) +
')); {"charttype"\\"column"; "negcolor"\\"red"; "color"\\"green"; "nan"\\"ignore"; "empty"\\"ignore"; "rtl"\\false})';
sheet.getRange(lastRow + 1, 6).setFormula(sparklineFormula);
}
Spreadsheet Formulas
As soon as the URL is pasted into the first column, I extract the cards title, set name and if it’s a foil from the URL and create a bar chart. In the examples below the number `9` represents the current row.
Title
=IFERROR(SUBSTITUTE(INDEX(SPLIT(INDEX(SPLIT(A9 ; "/") ; COUNTA(SPLIT(A9 ; "/")) - 0) ; "?") ; 1) ; "-" ; " ") ; "")
Set
=IFERROR(SUBSTITUTE(INDEX(SPLIT(A9 ; "/"); COUNTA(SPLIT(A9 ; "/")) - 1) ; "-" ; " ") ; "")
Foil
=REGEXMATCH(A9;"isFoil=Y")
Column Chart
Finally I have a simple formula for creating a column chart for each row. Eventually, as scraping continue each row will have a red/green graph that reveals if it has dropped or gained price (compared to my buying price).
=SPARKLINE(ARRAYFORMULA(IF(G9:ZY9 = ""; G9:ZY9; G9:ZY9 - E9)); {"charttype"\"column"; "negcolor"\"red"; "color"\"green"; "nan"\"ignore"; "empty"\"ignore"; "rtl"\false})
Note: All of these formulas are as mentioned not directly typed into the fields by me. I created the “Add Entry” button script to serve that purpose. They look a little different there inside the script, but in reality, this is what they look like.
Alright. That’s it! That’s pretty much all there is to it. Now It will just scrape card data every day and the next step would be to figure out what to do with it.
Now, back to playing with friends and family!