I published 3000 books on Amazon

Creating a multi-version children's book using Python, LaTeX, JavaScript and Puppeteer

TL;DR: For the impatient, these are my books: HappySophieBooks.com or directly on Amazon. Read on to see how I made this happen.

Amazon KDP (kdp.amazon.com) is an awesome service for publishing digital and print books and selling them on Amazon. Enter the metadata, upload PDFs for the cover and the manuscript, specify pricing, and voila - you are a published author. Books are printed on demand, so there are no upfront expenses. This blew my mind when I first tried it.

During the pandemic, I was a new mom, and had an idea for a children’s book that’s customized with the child’s name, hair color, skin color, etc, and available instantly. I could pre-publish all those names. After all, how many names can there be?

About baby names

The US Social Security Administration publishes a CSV file of baby names every year. The file for 2021 has 35K names covering 3.36M babies; the lowest count listed is 5. There were 3.66M births in 2021, so about 300K babies must have unique-ish names that fall below that threshold.

Below is a histogram of baby names. To reach coverage of 70%, I needed the top 1500 names.

Histogram of baby names in 2021
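The coverage number can be computed by sorting the counts and accumulating them until the target share of births is reached. Below is a minimal sketch of that idea, not the script I actually used; the function name and the toy numbers are made up for illustration, and the real input would be the counts from the SSA file plus the total number of births.

```python
def names_needed(counts, total_births, target):
    """Return how many of the most popular names are needed so that their
    combined count reaches `target` (a fraction) of `total_births`.
    Returns None if the listed names never reach the target."""
    covered = 0
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        covered += count
        if covered >= target * total_births:
            return rank
    return None


# Toy data: 5 listed names covering 100 babies out of 120 births.
toy_counts = [50, 30, 10, 5, 5]
print(names_needed(toy_counts, total_births=120, target=0.7))
```

Note that the target is measured against all births, not only the babies listed in the file, so the unlisted rare names count against coverage.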

And a bit of trivia: the most popular baby name in 2021 was Liam - 20K Liams were born, or, in other words, over 1 in 100 boys was Liam. For girls, the most popular name was Olivia, but if we consider different spellings, Sophia/Sofia took the top spot, with a count similar to Liam’s.

#   Boy’s name   Count   %
1   Liam         20272   0.55%
2   Noah         18739   0.51%
3   Oliver       14616   0.40%
4   Elijah       12708   0.35%
5   James        12367   0.34%
6   William      12088   0.33%
7   Benjamin     11791   0.32%
8   Lucas        11501   0.31%
9   Henry        11307   0.31%
10  Theodore      9535   0.26%

#   Girl’s name  Count   %
1   Olivia       17728   0.48%
2   Emma         15433   0.42%
3   Charlotte    13285   0.36%
4   Amelia       12952   0.35%
5   Ava          12759   0.35%
6   Sophia       12496   0.34%
7   Isabella     11201   0.31%
8   Mia          11096   0.30%
9   Evelyn        9434   0.26%
10  Harper        8388   0.23%

Illustrations

I hired a professional illustrator on Upwork and spent a few months in review cycles. The illustrator was amazing; she produced 26 illustrations, each with 5 versions: 2 hair colors for a boy hero, and 3 hair colors for a girl hero.

The images were in the PSD format (“Photoshop Document”), which contains layers. I used psd-tools, a Python package, to pick layers from each image. The files were organized so that by showing certain layers and hiding others, I could put the desired hero in the picture.

Now I could generate all versions of the illustrations on the fly: gender and hair color.
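Here is a rough sketch of what the layer selection could look like with psd-tools. The layer naming scheme ("hero-<gender>-<hair>") and both function names are assumptions made up for this example; my real files were organized differently in the details. The psd-tools import sits inside render() so the naming logic can be tried without the package installed.

```python
def hero_visibility(layer_names, gender, hair):
    """Map each hero layer name to True (show) or False (hide).

    Assumed, hypothetical naming: hero layers are called
    "hero-<gender>-<hair>", e.g. "hero-girl-dark". Other layers
    (background, scenery) are not included and stay untouched.
    """
    keep = f"hero-{gender}-{hair}"
    return {n: n == keep for n in layer_names if n.startswith("hero-")}


def render(psd_path, out_path, gender, hair):
    """Write one flattened image with only the chosen hero visible."""
    from psd_tools import PSDImage  # third-party: pip install psd-tools

    psd = PSDImage.open(psd_path)
    layers = list(psd.descendants())
    visibility = hero_visibility([layer.name for layer in layers], gender, hair)
    for layer in layers:
        if layer.name in visibility:
            layer.visible = visibility[layer.name]
    # force=True re-renders from the layers instead of the stored preview.
    psd.composite(force=True).save(out_path)


names = ["background", "hero-girl-dark", "hero-girl-fair", "hero-boy-dark"]
print(hero_visibility(names, "girl", "dark"))
```

Calling something like render("page01.psd", "page01-girl-dark.png", "girl", "dark") for every illustration and every variant would then produce the full set of images (the file names here are placeholders).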

Manuscript

I created my manuscript using LaTeX.

I only needed one manuscript file, main.tex (plus cover.tex for the cover), written using commands like \babyname, \babygender, \babyhaircolor, \isbn, etc. For example:

\mbox{\centering Who will \textbf{\babyname{}} be?}

Later I passed these values to generate all the PDFs that I needed:

xelatex -jobname=liam "\def\babyname{Liam}...\input{main}"
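Generating thousands of PDFs means building that command line in a loop. The sketch below shows one way to assemble it in Python. It is an illustration under assumptions: the function name and the jobname scheme are invented here, only three of the macros are set, and any other macros main.tex needs (such as \isbn) would have to be added the same way.

```python
def xelatex_command(name, gender, hair):
    """Build the argument list for one book variant (sketch only)."""
    # Doubled braces in an f-string produce literal { and } for LaTeX.
    defs = (
        f"\\def\\babyname{{{name}}}"
        f"\\def\\babygender{{{gender}}}"
        f"\\def\\babyhaircolor{{{hair}}}"
    )
    # Hypothetical jobname scheme so every variant gets its own PDF.
    jobname = f"{name}-{gender}-{hair}".lower()
    return ["xelatex", f"-jobname={jobname}", defs + "\\input{main}"]


for variant in [("Liam", "he", "dark"), ("Sophia", "she", "fair")]:
    print(" ".join(xelatex_command(*variant)))
```

Each list can be handed to subprocess.run(cmd, check=True); passing a list rather than a single string avoids shell quoting problems with the backslashes and braces.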

At this point, for simplicity, I decided to ignore the girl with red hair and only have 2 hair colors for each gender. That’s about 3000 books.

Manual uploads

I uploaded a few books manually. Clicking through the website quickly turned out to be mind-numbing. Uploading a single book takes about 5 min, mainly spent waiting for the RPCs to complete; in particular, content upload took about 3.5 min in my case - just too short to do something useful in the meantime and long enough to forget to click the next button. Even if I were uploading non-stop, 5 min per book meant uploading 3000 books would take over 10 days without sleeping. And I would have to repeat it after every change. Clearly, I needed a robot to click for me.

Auto-kdp

I implemented the “auto-kdp” robot in JavaScript using Node.js and Puppeteer - a library that drives a Chromium browser, by default in headless mode (headless means the browser runs without the UI).

I published the code here: https://github.com/elutek/auto-kdp.

Puppeteer

The way Puppeteer works, you say something like:

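// Open the "new paperback" details page.
await page.goto('https://kdp.amazon.com/en_US/title-setup/paperback/new/details');
// Wait for the title field to render, then type into it.
await page.waitForSelector('#data-print-book-title');
await page.type('#data-print-book-title', 'My title');
..
await page.click('#save');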

and Puppeteer types and clicks in the Chromium browser on your behalf.

See a real example: the “publish” action in auto-kdp.

You may observe Puppeteer at work, but you don’t have to; you don’t even have to run the UI in the first place. That’s in theory - in practice, the content review step in Amazon KDP does not seem to work in headless mode (I don’t know why), so that step has to be run with the UI on.

Resolution engine

Auto-kdp reads books from a CSV file - one book per row.

Each book is defined by about 30 values, such as: title, author, categories, keywords, prices in different markets, etc. There are also a bunch of status values returned by KDP, e.g. pubStatus, pubDate, ISBN, ASIN, etc.

In my case, many values were the same across books, for example author, illustrator, and book category. To avoid such repetition, per-book values are read from the CSV file and defaults come from a key-value config file.

My books.conf:

title = Who will ${babyName} be?
authorFirstName = Ela
authorLastName = Krepska
isGirl = $vareq ${babyGender} == she
category1 = $varif ${isGirl} ?? JUV006000 :: JUV005000
..

My books.csv:

action                , babyName, babyGender, babyHairColor, pubStatus, ..
updateMetadata:publish, Sophia  , she       , dark         , LIVE, ..
updateMetadata:publish, Sophia  , she       , fair         , LIVE, ..
updateMetadata:publish, Liam    , he        , dark         , LIVE, ..

You can see that some values use ${key} - I implemented a resolution engine so you can define custom keys like babyName and babyGender and reuse them in other values, e.g. title = Who will ${babyName} be?

I also added some very hacky conditional support to my resolution engine: $varif and $vareq.

That way the majority of values are actually stored in the defaults config file, and the CSV file just stores the basic values and KDP status values.
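To make the idea concrete, here is a small Python sketch of such a resolution engine. It is not the auto-kdp code (which is JavaScript); the function name, the "true"/"false" strings and the exact parsing rules are assumptions chosen to match the examples in this post.

```python
import re

VAR = re.compile(r"\$\{(\w+)\}")


def resolve_all(defaults, row):
    """Merge config defaults with one CSV row and expand every value."""
    raw = {**defaults, **row}  # row values override defaults
    resolved = {}

    def resolve(key, stack=()):
        if key in resolved:
            return resolved[key]
        if key in stack:
            raise ValueError(f"circular reference involving {key!r}")
        if key not in raw:
            raise KeyError(f"undefined key {key!r}")
        # Expand ${other} references first, recursively.
        value = VAR.sub(lambda m: resolve(m.group(1), stack + (key,)),
                        raw[key].strip())
        if value.startswith("$vareq "):
            left, right = value[len("$vareq "):].split("==", 1)
            value = "true" if left.strip() == right.strip() else "false"
        elif value.startswith("$varif "):
            cond, rest = value[len("$varif "):].split("??", 1)
            if_true, if_false = rest.split("::", 1)
            value = if_true.strip() if cond.strip() == "true" else if_false.strip()
        resolved[key] = value
        return value

    for key in raw:
        resolve(key)
    return resolved


defaults = {
    "title": "Who will ${babyName} be?",
    "isGirl": "$vareq ${babyGender} == she",
    "category1": "$varif ${isGirl} ?? JUV006000 :: JUV005000",
}
book = resolve_all(defaults, {"babyName": "Sophia", "babyGender": "she"})
print(book["title"], book["category1"])
```

Resolving lazily with a cache means the order of keys in the config does not matter, and the stack check turns an accidental cycle into an error instead of infinite recursion.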

Actions

Each row in the CSV file has a special action column which tells the script what to do, e.g. update metadata, or click publish, etc. There is a special action produceManuscript which invokes the defined command to generate the manuscript in a child process.
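The action column can be pictured as a tiny dispatcher. The Python sketch below is an illustration only: splitting the column on ":" and running the parts in order is my reading of values like updateMetadata:publish, and the handler table and the manuscriptCommand key are hypothetical names, not the ones auto-kdp uses.

```python
import subprocess


def produce_manuscript(book):
    """Run the book's manuscript command in a child process."""
    # "manuscriptCommand" is a hypothetical key used for this sketch.
    subprocess.run(book["manuscriptCommand"], shell=True, check=True)


def run_actions(book, handlers):
    """Run each action named in the book's action column, in order.

    Returns the list of action names that were executed.
    """
    done = []
    for name in book.get("action", "").split(":"):
        name = name.strip()
        if not name:
            continue
        if name not in handlers:
            raise ValueError(f"unknown action: {name}")
        handlers[name](book)
        done.append(name)
    return done


# Stand-in handlers that only record what would happen.
log = []
handlers = {
    "updateMetadata": lambda book: log.append("metadata updated"),
    "publish": lambda book: log.append("published"),
    "produceManuscript": produce_manuscript,
}
print(run_actions({"action": "updateMetadata:publish"}, handlers), log)
```

In the real tool the handlers would drive the browser through Puppeteer; here they only append to a list so the control flow is easy to follow.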

Bringing it all together

auto-kdp is a command-line tool that ties these pieces together:

auto-kdp system diagram

Summary

Over time I uploaded over 3000 books. You can browse them on HappySophieBooks.com or directly on Amazon.

Did I make lots of money? Not at all, but it’s early days :)

Who will Baby be - montage of example bookcovers

Still here? Thank you for reading! If you want to support me, please consider getting the “Who will <Baby> be?” book for a child that you care about :) Search here and it will take you to the right Amazon product. If the name is not covered, send a message to happysophiebooks@gmail.com and I’ll produce it in a few days.