Reading .xlsx file and exporting data to Google Spreadsheet with Node: A real case exploration

Calebe Machado
4 min readDec 1, 2020
Photo by Mika Baumeister on Unsplash

Last week I got a task to implement some logic that retrieves data from a certain URL, the first request brought me a filename and with that, I could make a second request to another URL passing that filename to retrieve the file itself.

In this article, we will explore two main requirements of the task: read .xlsx file and export the data inside the file, not the file itself.

The application structure is very simple, we only need to use the npm command to generate the project and start coding, we will look at the libraries used more ahead.

$ mkdir reading-and-writing-project
$ cd reading-and-writing-project
$ npm init -y
$ mkdir src (folder to hold our source code)
$ cd src && touch index.js (our starting code will we here)

Whit the above commands we created our project folder, we initialize the node project with npm init -y (this generates the package.json file). Inside the folder project, we created an src folder to hold the main code at the file index.js (this pattern is only because I don't like to put source code directly on the project folder, mixed with configuration, env, docker, readme, and other files…)

Starting the code itself

The first thing to analyze was the process structure behind our code, I like to read the documentation about it (in this particular case I had a simple .txt file with two URLs of example and a token). To expand the idea, I thought about the steps:

1 — Request to search the file name
2 — Request to search the file itself
3 — Read the file from the buffer and apply some logic
4 — Export the content to Google SpreadSheet

To accomplish those, and to make make it easier I searched some libraries, the first and second steps I could make with Axios. For the third step, I used the Node Filesystem API to create a write stream and store the content inside a file. To read the file and apply some logic validation I had to use the ExcelJS Library. To accomplish the last step we had two possibilities, use the Google Sheets API direct or a library that “translates” the API to Javascript world, I opted to use the second one with the Google Spreadsheet npm package.

$ npm install --save axios exceljs google-spreadsheet dotenv

The Dotenv library was used to store environment variables inside a file on the root of the project with the name “.env”. To access the variables you just need to require the library at the top of the index.js file and when needed to call the process.env.VARIABLE_NAME and it will bring a string representation of the content on the variable of the .env file.

1 require('dotenv').config()

I won't expand on Axios requests here, but it was pretty simple:

With the responde.data we can use the fs.createWriteStream to store the buffer content on a file.

To communicate with Google Spreadsheet first, create a project inside the Google console and add Drive API and Spreadsheet API. More information about authentication here.

In my case, I had to look inside both files (the one that I got from the second URL and the existing spreadsheet on google drive) and compare sheet by sheet, looking for their names and changes. If the structure (nº of columns) was different I had to create a new sheet with the content that I received and put on Google Spreadsheet and rename the old one on GDrive adding something to differentiate a “history” sheet from the new one.

The simple solution was to add “##” at the start of the sheet name and the current date at the end. With that, every time I had to look for the newer version of the sheet on Google Spreadsheet I had to filter sheet names without “##” — as I did on the last snippet.

The last step was to create the google sheet in fact with the function “createGoogleSheet” passing a title, the values as a sparse array, and the Google Document saw at third snippet.

In this step, you need to take care of some things, first is that google API expects that the headers row is sent on creating of sheet or with “setHeaderRow” function. If you sent headers on the same sparse array of the data it may cause an exception. The second is that google API expects unique headers name, you can see that I created a function to verify it and change on necessity. The last thing you have to take care, on inserting data sometimes the rows or columns on google spreadsheet is blocked (as if they wast initialized). To resolve you have to resize the sheet as I made on lines 8,9 and 10.

That's it!

In the next article, I’ll discuss how I managed to make a simple pooling request functionality.

--

--