I published 3000 books on Amazon
Creating a multi-version children's book using Python, LaTeX, JavaScript and Puppeteer
TL;DR: For the impatient, these are my books: HappySophieBooks.com or directly on Amazon. Read on to see how I made this happen.
Amazon KDP (kdp.amazon.com) is an awesome service for publishing digital and print books and selling them on Amazon. Enter the metadata, upload PDFs for the cover and the manuscript, specify pricing, and voila - you are a published author. Books are printed on demand, so there are no upfront expenses. This blew my mind when I first tried it.
During the pandemic, I was a new mom, and had an idea for a children’s book that’s customized with the child’s name, hair color, skin color, etc, and available instantly. I could pre-publish all those names. After all, how many names can there be?
About baby names
The US Social Security Administration publishes a CSV file of baby names every year. The file for 2021 has 35K names covering 3.36M babies; names given to fewer than 5 babies are left out. There were 3.66M births in 2021, so the remaining 300K or so babies must have rare names that don't appear in the file.
Below is a histogram of baby names. To reach coverage of 70%, I needed the top 1500 names.
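A minimal sketch of how such a cutoff could be computed. It assumes each row of the names file has the layout name,sex,count with no header row; the function names and the sample data are made up for illustration and are not real SSA figures.

```python
import csv
import io

def read_counts(text):
    """Parse rows of the form name,sex,count (assumed layout, no header)."""
    return [int(row[2]) for row in csv.reader(io.StringIO(text)) if row]

def names_needed(counts, total_births, target):
    """Return how many of the most popular names are needed so that their
    combined count reaches `target` (a fraction) of `total_births`.
    Returns None if the listed names never reach the target."""
    covered = 0
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        covered += count
        if covered >= target * total_births:
            return rank
    return None

# Tiny made-up sample, only to show the call.
sample = "Ann,F,50\nBob,M,30\nCy,M,15\nDee,F,5\n"
print(names_needed(read_counts(sample), total_births=100, target=0.7))
```

Coverage is measured against all births rather than against the babies listed in the file, so the rare names missing from the file still count toward the denominator.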
And a bit of trivia: the most popular baby name in 2021 was Liam - 20K Liams were born, or, in other words, over 1 in 100 boys was Liam. For girls, the most popular name was Olivia, but if we consider different spellings, Sophia/Sofia took the top spot, with a count similar to Liam’s.
# | Boy’s name | Count | % of all births |
---|---|---|---|
1 | Liam | 20272 | 0.55% |
2 | Noah | 18739 | 0.51% |
3 | Oliver | 14616 | 0.40% |
4 | Elijah | 12708 | 0.35% |
5 | James | 12367 | 0.34% |
6 | William | 12088 | 0.33% |
7 | Benjamin | 11791 | 0.32% |
8 | Lucas | 11501 | 0.31% |
9 | Henry | 11307 | 0.31% |
10 | Theodore | 9535 | 0.26% |
# | Girl’s name | Count | % of all births |
---|---|---|---|
1 | Olivia | 17728 | 0.48% |
2 | Emma | 15433 | 0.42% |
3 | Charlotte | 13285 | 0.36% |
4 | Amelia | 12952 | 0.35% |
5 | Ava | 12759 | 0.35% |
6 | Sophia | 12496 | 0.34% |
7 | Isabella | 11201 | 0.31% |
8 | Mia | 11096 | 0.30% |
9 | Evelyn | 9434 | 0.26% |
10 | Harper | 8388 | 0.23% |
Illustrations
I hired a professional illustrator on Upwork and spent a few months in review cycles. The illustrator was amazing; she produced 26 illustrations, each with 5 versions: 2 hair colors for a boy hero, and 3 hair colors for a girl hero.
The images were in the PSD format (“Photoshop Document”), which contains layers. I used psd-tools, a Python package, to pick layers from each image. The files were organized so that by showing certain layers and hiding others, I could put the desired hero in the picture.
Now I could generate every version of each illustration on the fly, for any combination of gender and hair color.
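Roughly, the layer selection could look like the sketch below. The layer naming convention ("girl_dark", "boy_fair", and so on) and both function names are hypothetical, since the real PSD files may be organized differently. The psd-tools calls used are PSDImage.open, the layer.visible attribute, and composite.

```python
def layer_visibility(layer_names, gender, hair):
    """Map each top-level layer name to True (show) or False (hide).

    Assumed convention (hypothetical): hero layers are named
    "<gender>_<hair>", e.g. "girl_dark"; every other layer is scenery
    and stays visible.
    """
    genders = ("boy", "girl")
    wanted = f"{gender}_{hair}"
    visibility = {}
    for name in layer_names:
        is_hero = name.split("_")[0] in genders
        visibility[name] = (name == wanted) if is_hero else True
    return visibility

def render(psd_path, gender, hair, out_path):
    """Render one variant of an illustration to a flat image file."""
    from psd_tools import PSDImage  # third-party: pip install psd-tools

    psd = PSDImage.open(psd_path)
    shown = layer_visibility([layer.name for layer in psd], gender, hair)
    for layer in psd:
        layer.visible = shown[layer.name]
    # force=True asks psd-tools to re-composite from the layers instead of
    # reusing the preview image stored in the file.
    psd.composite(force=True).save(out_path)
```

Keeping the show/hide decision in a pure function makes it easy to check the mapping without opening any PSD file.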
Manuscript
I created my manuscript using LaTeX.
I only needed one manuscript file, main.tex (and another, cover.tex), written using commands like \babyname, \babygender, \babyhaircolor, \isbn, etc.
For example:
\mbox{\centering Who will \textbf{\babyname{}} be?}
Later I passed these values to generate all the PDFs that I needed:
xelatex -jobname=liam "\def\babyname{Liam}...\input{main}"
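A batch of such runs could be driven from a small script. The sketch below is an assumption about how that might look, not the script actually used: the helper names and the job naming scheme are hypothetical, and it defines only the three macros named above, so whatever the command above elides is not reconstructed here.

```python
import subprocess

def xelatex_command(name, gender, hair, main="main"):
    """Build the argument list for one xelatex run."""
    job = f"{name}-{gender}-{hair}".lower()  # hypothetical naming scheme
    defs = (
        rf"\def\babyname{{{name}}}"
        rf"\def\babygender{{{gender}}}"
        rf"\def\babyhaircolor{{{hair}}}"
    )
    return ["xelatex", f"-jobname={job}", rf"{defs}\input{{{main}}}"]

def build_all(books, dry_run=True):
    """Run (or, with dry_run, just print) one xelatex command per book."""
    for name, gender, hair in books:
        cmd = xelatex_command(name, gender, hair)
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if xelatex fails

build_all([("Liam", "he", "dark"), ("Sophia", "she", "fair")])
```

Passing the command as a list avoids shell quoting problems with the backslashes and braces in the macro definitions.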
At this point, for simplicity, I decided to ignore the girl with red hair and only have 2 hair colors for each gender. With the top 1500 names and 2 hair colors each, that’s about 3000 books.
Manual uploads
I uploaded a few books manually. Clicking through the website quickly turned out to be mind-numbing. Uploading a single book takes about 5 min, mainly spent waiting for the RPCs to complete; in particular, content upload took about 3.5 min in my case - too short to do something useful in the meantime, yet long enough to forget to click the next button. Even if I were uploading non-stop, at 5 min per book uploading 3000 books would take over 10 days without sleeping (3000 × 5 min = 250 hours). And I would have to repeat it all after any change. Clearly, I needed a robot to click for me.
Auto-kdp
I implemented the “auto-kdp” robot in JavaScript using Node.js and Puppeteer - a library that drives the Chromium browser, by default in headless mode (headless means the browser runs without the UI).
I published the code here: https://github.com/elutek/auto-kdp.
Puppeteer
The way Puppeteer works, you say something like:
await page.goto('https://kdp.amazon.com/en_US/title-setup/paperback/new/details');
await page.waitForSelector('#data-print-book-title');
await page.type('#data-print-book-title', 'My title');
..
await page.click('#save');
and Puppeteer types and clicks in the Chromium browser on your behalf.
See a real example: the “publish” action in auto-kdp.
You may observe Puppeteer at work, but you don’t have to; you don’t even have to run the UI in the first place. That’s in theory - in practice, the content review step in Amazon KDP does not seem to work in headless mode (I don’t know why), so that step has to be run with the UI on.
Resolution engine
Auto-kdp reads books from a CSV file - one book per row.
Each book is defined by about 30 values, such as: title, author, categories, keywords, prices in different markets, etc. There are also a bunch of status values returned by KDP, e.g. pubStatus, pubDate, ISBN, ASIN, etc.
In my case, many values were the same across books, for example author, illustrator, and book category. To avoid such repetition, per-book values are read from the CSV file and shared defaults from a key-val config file.
My books.conf:
title = Who will ${babyName} be?
authorFirstName = Ela
authorLastName = Krepska
isGirl = $vareq ${babyGender} == she
category1 = $varif ${isGirl} ?? JUV006000 :: JUV005000
..
My books.csv:
action , babyName, babyGender, babyHairColor, pubStatus, ..
updateMetadata:publish, Sophia , she , dark , LIVE, ..
updateMetadata:publish, Sophia , she , fair , LIVE, ..
updateMetadata:publish, Liam , he , dark , LIVE, ..
You can see that some values use ${key} - I implemented a resolution engine so you can define custom keys like babyName and babyGender and reuse them in other values, e.g. title = Who will ${babyName} be? I also added very hacky conditional support to my resolution engine: $varif and $vareq.
That way the majority of values are actually stored in the defaults config file, and the CSV file just stores the basic values and KDP status values.
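To illustrate the idea, here is a small Python sketch of such a resolution step. It is not auto-kdp's implementation (which is JavaScript): the function names are made up, and it assumes that $vareq yields the strings "true" or "false" and that $varif picks the first branch when its condition is "true".

```python
import re

VAR = re.compile(r"\$\{(\w+)\}")

def resolve(key, values, _seen=()):
    """Resolve `key` to a final string, expanding ${other} references."""
    if key in _seen:
        raise ValueError(f"circular reference: {key}")
    raw = values[key].strip()
    # Expand every ${name} by resolving that name first.
    text = VAR.sub(lambda m: resolve(m.group(1), values, _seen + (key,)), raw)
    if text.startswith("$vareq "):
        left, right = text[len("$vareq "):].split("==", 1)
        # Assumption: equality is reported as the string "true" or "false".
        return "true" if left.strip() == right.strip() else "false"
    if text.startswith("$varif "):
        cond, rest = text[len("$varif "):].split("??", 1)
        if_true, if_false = rest.split("::", 1)
        return (if_true if cond.strip() == "true" else if_false).strip()
    return text

def resolve_book(defaults, row):
    """Merge defaults with one CSV row (row wins) and resolve every key."""
    values = {**defaults, **row}
    return {key: resolve(key, values) for key in values}

defaults = {
    "title": "Who will ${babyName} be?",
    "isGirl": "$vareq ${babyGender} == she",
    "category1": "$varif ${isGirl} ?? JUV006000 :: JUV005000",
}
book = resolve_book(defaults, {"babyName": "Sophia", "babyGender": "she"})
print(book["title"], book["category1"])
```

Resolving references recursively means a key such as category1 can depend on isGirl, which in turn depends on babyGender from the CSV row.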
Actions
Each row in the CSV file has a special action column which tells the script what to do, e.g. update metadata, or click publish, etc. There is a special action, produceManuscript, which invokes the defined command to generate the manuscript in a child process.
Bringing it all together
auto-kdp is a command-line tool that:
- Reads books from a CSV file and a defaults file, as well as manuscripts and covers from the content directory.
- Talks to Amazon KDP: for each book, performs the requested “actions” on it.
- Outputs a new CSV file with the new state of the world.
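The read, act, write loop can be sketched in a few lines of Python. This only illustrates the flow and is not how auto-kdp is written: the function names are hypothetical, and it assumes that actions are separated by colons (as in updateMetadata:publish), that completed actions are dropped from the column, and that remaining ones are kept after a failure so a later run can retry them.

```python
import csv
import io

def run_actions(book, handlers):
    """Run the colon-separated actions in book["action"] in order.

    Each handler takes the book dict and returns True on success. Actions
    that succeeded are dropped; on the first failure the remaining actions
    are kept for a later run (assumed behavior).
    """
    pending = [a for a in book["action"].split(":") if a]
    while pending:
        if not handlers[pending[0]](book):
            break
        pending.pop(0)
    return {**book, "action": ":".join(pending)}

def process(csv_text, handlers):
    """Read books from CSV text, run their actions, return the new CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [run_actions(row, handlers) for row in reader]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()

def update_metadata(book):
    return True   # stand-in for the browser automation step

def publish(book):
    return False  # pretend that publishing failed this time

handlers = {"updateMetadata": update_metadata, "publish": publish}
print(process("action,babyName\nupdateMetadata:publish,Liam\n", handlers))
```

Writing the remaining actions back into the CSV is what makes the output file a usable input for the next run.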
Summary
Over time I uploaded over 3000 books. You can browse them on HappySophieBooks.com or directly on Amazon.
Did I make lots of money? Not at all, but it’s early days :)
Still here? Thank you for reading! If you want to support me, please consider getting the “Who will < Baby > be?” book for a child that you care about :) Search here and it will take you to the right Amazon product. If the name is not covered, send me a message at happysophiebooks@gmail.com and I’ll produce it in a few days.