
dreamscape by isak solheim

Generating YouTube Titles Using Machine Learning

Smoothbrain YouTube channel

Last year I created a YouTube channel called Smoothbrain. I wanted a place where I could share skateboarding clips with my friends, and I ended up posting a lot of videos, some of which went trending on YouTube Shorts. Since creating the channel, I have been shitposting one to three videos a day, which adds up to over five hundred uploads at the time of writing.

Because I post so consistently, uploading videos has become something of a routine. The hardest part of creating a video is, of course, coming up with a title. The titles I end up with are often complete trash, which made me wonder whether I could write some code that generates titles of an equal, or even better, standard.

Getting Training Data

Having already posted five hundred videos, I chose to use my existing YouTube titles as training data. Writing down five hundred titles by hand would be very tedious, so I wrote a Go program to do it for me.

// getData requests the page identified by dataPointer.NextPageToken
// and stores the response in dataPointer.
func getData(dataPointer *Data) {
	URL := getUrl(dataPointer.NextPageToken)

	resp, err := http.Get(URL)

	if err != nil {
		log.Fatalln(err)
	}

	defer resp.Body.Close()

	setData(resp, dataPointer)
}

Using the YouTube API, I can request the videos on the Smoothbrain channel. The API returns at most 50 videos per response, so I had to paginate the requests.

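// Assumes data starts out empty, so the first getData call fetches page one.
for {
	getData(data)

	for _, video := range data.Items {
		// Strip the " #shorts" suffix so the model only learns the title itself.
		title := strings.ReplaceAll(video.Snippet.Title, " #shorts", "")
		if _, err := datawriter.WriteString(title + "\n"); err != nil {
			log.Fatalln(err)
		}
	}

	// An empty token means the page we just wrote was the last one.
	if data.NextPageToken == "" {
		break
	}
}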

Running the Go program gives us the file data.txt, which contains all of our titles.

This guy is a poser 😎
This guy is a chad 🥰
Why does the filmer keep grunting
This filmer needs to chill! 😎
This is the fastest popshuvit I have ever seen 😎
This fakie ollie goes hard 💥
He almost had this trick 🔥
...
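The program only strips the exact string " #shorts". As a rough sketch, assuming some titles might carry the tag in a different letter case or at the start of the title, a slightly more forgiving cleanup could look like this. The names cleanTitle and cleanTitles are hypothetical and not part of the program above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// shortsTag matches "#shorts" in any letter case, along with any whitespace
// directly before it. The \b keeps longer tags such as "#shortsighted" intact.
var shortsTag = regexp.MustCompile(`(?i)\s*#shorts\b`)

// cleanTitle removes the tag and trims leftover whitespace.
func cleanTitle(title string) string {
	return strings.TrimSpace(shortsTag.ReplaceAllString(title, ""))
}

// cleanTitles cleans every title and skips the ones that end up empty,
// so the training file gets no blank lines.
func cleanTitles(titles []string) []string {
	var out []string
	for _, t := range titles {
		if c := cleanTitle(t); c != "" {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	raw := []string{
		"This guy is a poser 😎 #shorts",
		"#Shorts This fakie ollie goes hard 💥",
		"#shorts",
	}
	for _, t := range cleanTitles(raw) {
		fmt.Println(t)
	}
}
```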

Training Our Model

I used a Python library called textgenrnn to easily train my own text-generating RNN on top of a pre-trained model.

from textgenrnn import textgenrnn

# train_from_file does not return the model, so keep a reference to it
textgen = textgenrnn()
textgen.train_from_file('../data/data.txt', num_epochs=10)
textgen.generate()

Generating Titles

Having trained our model, we can generate titles using the same library.

from textgenrnn import textgenrnn
textgenrnn('textgenrnn_weights.hdf5').generate(30, temperature=1.0)

Here are some examples of the results:

Yo watch switch hes hard better to chad
Sigma skating after beat straight
Vitkig Kyle mid after this straight...
Chad does switch flip this good?
Yo which one manny or behind heelflip out boardida track
Is Virgin Kyle dominantsey and this halfcab heel hard tailslide?
Skate does fakie heelflip
Dumb looking in the filmer part
"Yo why do you counted this smood thich it"
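The temperature argument in the generate call controls how adventurous the sampling is. As a sketch of the general technique (not a copy of textgenrnn's internals), temperature sampling raises each next-character probability to the power 1/temperature and renormalizes. The function name applyTemperature and the example probabilities are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// applyTemperature rescales a probability distribution: each probability is
// raised to the power 1/temperature and the result is renormalized to sum to 1.
// Temperatures below 1 sharpen the distribution (safer picks), temperatures
// above 1 flatten it (stranger picks), and 1 leaves it unchanged.
func applyTemperature(probs []float64, temperature float64) []float64 {
	out := make([]float64, len(probs))
	sum := 0.0
	for i, p := range probs {
		out[i] = math.Pow(p, 1/temperature)
		sum += out[i]
	}
	for i := range out {
		out[i] /= sum
	}
	return out
}

func main() {
	// Hypothetical probabilities for three candidate next characters.
	probs := []float64{0.5, 0.25, 0.25}
	for _, t := range []float64{0.5, 1.0, 2.0} {
		fmt.Printf("temperature %.1f: %.3f\n", t, applyTemperature(probs, t))
	}
}
```

With a temperature of 1.0, as used here, the model samples from its predictions unchanged, which is likely part of why the titles drift into invented words.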

Uploading Some Videos

Video uploads

As we can see, the videos ended up getting some views. Not as many as I usually get, but not too bad either.

Analytics graph

Am I going to keep using the generated titles? Probably not. But it was fun to try out.

Thanks for reading about this project! ⛄️