Using AI and CGI in Music

Dylan D'Souza
5 min read · Dec 29, 2021

“When music technology takes the place of musicianship, it’s time to pull the plug.”

- Robert Brown

AI is revolutionizing the way we view music (Source: Pexels)

Introduction

Music and artificial intelligence are terms that are rarely associated with each other. Many people are under the impression that using AI in music means robots producing beats or composing songs. Until very recently, I was one of them. I was going through a few articles on Codecademy when I stumbled across a blog titled “Using Machine Learning to Analyze Taylor Swift’s Lyrics”. It used a Kaggle dataset of her song lyrics, together with Natural Language Processing (NLP), an AI tool, to compare how her lyrics changed over time. I found the blog extremely informative but wasn’t particularly impressed: I’ve used NLP before (for sentiment analysis) and figured the same logic was simply being applied to song lyrics instead of ordinary text. One question remained unanswered, though: can AI understand the tone, the beats, and the emotion behind a song? The following blog is a summary of my research on that question.

Using AI to Produce Music

There are 2 common ways in which AI composes music:

1. Using Deep Learning (DL): Deep learning can be used to generate a desirable output (good-quality music). The process is rather involved. In DL, artificial neural networks are built with the intent of mimicking the neurons in our brains. These networks learn why we perceive something as good or not by finding patterns in raw data. Applied to music, this lets a computer build a deep understanding of human musical taste and use that knowledge to create its own beats. Magenta, an open-source Python library powered by TensorFlow, uses DL to produce new content based on existing music; a minimal illustrative sketch follows this list.

2. Using Reinforcement Learning (RL): Reinforcement learning can be used to personalize the generated music to your taste. In RL, the computer learns from its own experience, guided by feedback that the user provides, and keeps improving through trial and error until it reaches a near-perfect level. For example, the user can be asked to rate a few beats on a scale of 1 to 100. The computer then uses these ratings to come up with beats similar to the ones rated close to 100. This continues until the model has narrowed in on a niche, which is essentially your music taste. However, RL can often be time-consuming or tedious. A toy version of this feedback loop is also sketched below.
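
To make the DL idea concrete, here is a minimal sketch. It is not Magenta itself, and it trains on made-up data: a small TensorFlow LSTM that learns to predict the next note of a melody, which is the core mechanism behind many DL music generators.

```python
# A minimal sketch (not Magenta): an LSTM that learns to predict the next
# note in a melody, the core idea behind DL-based music generation.
# The "melodies" here are toy MIDI pitch sequences invented for illustration.
import numpy as np
import tensorflow as tf

VOCAB = 128    # MIDI pitches 0-127
SEQ_LEN = 16   # notes of context used to predict the next note

# Toy training data: random-walk "melodies" standing in for a real corpus.
rng = np.random.default_rng(0)
melodies = np.clip(60 + rng.integers(-2, 3, size=(200, SEQ_LEN + 1)).cumsum(axis=1), 0, 127)
x, y = melodies[:, :-1], melodies[:, -1]

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB, 32),
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(VOCAB, activation="softmax"),  # probability of each next pitch
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(x, y, epochs=5, verbose=0)

# Generate a short continuation by repeatedly sampling the predicted next note.
seed = list(melodies[0, :SEQ_LEN])
for _ in range(8):
    probs = model.predict(np.array([seed[-SEQ_LEN:]]), verbose=0)[0].astype("float64")
    probs /= probs.sum()
    seed.append(int(rng.choice(VOCAB, p=probs)))
print("generated pitches:", seed[SEQ_LEN:])
```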
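
And here is a toy version of the RL-style feedback loop described above. It is closer to simple trial-and-error hill climbing than a full RL algorithm, and the listener's 1-to-100 rating is simulated by a hidden preference function, but it captures the personalization idea.

```python
# Toy feedback loop: the "beat" is reduced to two numbers (tempo, intensity),
# and the user's 1-100 rating is simulated by a hidden preference function.
import random

def user_rating(tempo, intensity):
    # Stand-in for a human rating: prefers ~120 BPM and moderate intensity.
    return max(1, 100 - abs(tempo - 120) - abs(intensity - 60))

best = {"tempo": random.randint(60, 180), "intensity": random.randint(0, 100)}
best_score = user_rating(**best)

for step in range(200):
    # Propose a small variation on the best beat found so far (trial and error).
    candidate = {
        "tempo": best["tempo"] + random.randint(-10, 10),
        "intensity": best["intensity"] + random.randint(-10, 10),
    }
    score = user_rating(**candidate)
    if score > best_score:  # keep only proposals the "listener" rated higher
        best, best_score = candidate, score

print(f"learned beat: {best}, rating: {best_score}")
```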

Applications of Technology in Music

While researching, I came across many interesting applications of tech in the music industry. Here are a few examples:

1. Magenta: As discussed above, the Magenta Python library amazed me with its uniqueness and ease of use. It has many exciting features and can be integrated into a website through its JavaScript API. Magenta Studio is a set of plugins that lets you apply Magenta models to your own music. A small note-sequence sketch follows this list.

2. Analysis of Taylor Swift’s Lyrics: A Codecademy developer used a Kaggle dataset of the lyrics from each of Swift’s albums to train an NLP model. After sufficient training, it could plot the topics of her songs (such as Love, Remorse, Contemplative) across her albums over time, and the model was quite accurate. Her album Fearless was found to be most closely related to the topic of ‘growing up’, which may allude to her moving out of her family house in 2010. A minimal topic-modelling sketch also follows this list.

3. AIVA Technologies: AIVA is the first virtual music composer to be recognized by a music society. It is an artificial intelligence model capable of generating emotional soundtracks. The aim of the team at AIVA is “to personalize music and augment human creativity with Artificial Intelligence”.

4. ‘You Broke Me First’ Immersive CGI Music Video: Recently, Tate McRae, a popular Canadian singer, teamed up with Sony to create an immersive CGI (Computer-Generated Imagery) video for one of her songs. It was shot in an empty room with no props or sets; the only real element was Tate herself, dancing. In my opinion, the finished product is a very well-made music video. It doesn’t end there, though: Sony’s ‘Spatial Reality Display’ lets viewers experience “You Broke Me First” in a previously unimaginable immersive way, bringing the whole journey to life right in front of their eyes. On that display, the entire video can be viewed in 3D, something unique in the music industry.
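
For item 1, here is a minimal sketch of how musical data is represented on the Python side. It assumes Magenta's companion note-seq package (installed with pip install note-seq); the melody itself is made up for illustration.

```python
# A minimal sketch, assuming Magenta's companion `note_seq` package:
# build a tiny melody as a NoteSequence and write it out as MIDI,
# the data format Magenta's models consume and produce.
import note_seq

melody = note_seq.NoteSequence()
melody.tempos.add(qpm=120)

# A simple C-major phrase: (MIDI pitch, start time, end time) in seconds.
for pitch, start, end in [(60, 0.0, 0.5), (64, 0.5, 1.0), (67, 1.0, 1.5), (72, 1.5, 2.5)]:
    melody.notes.add(pitch=pitch, start_time=start, end_time=end, velocity=80)
melody.total_time = 2.5

# Write the phrase to disk so a Magenta model (or a DAW) can read or extend it.
note_seq.sequence_proto_to_midi_file(melody, "phrase.mid")
```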
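
For item 2, the following is not the Codecademy author's exact pipeline; it is just a minimal sketch of the general idea (topic modelling over lyrics), using scikit-learn's LDA on a few invented lines.

```python
# Minimal topic-modelling sketch over "lyrics" (invented lines, not real songs).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

# Toy lyrics grouped loosely around love / growing up.
lyrics = [
    "i love you and my heart beats for you",
    "your love broke my heart but i still love you",
    "leaving home and growing up feels so strange",
    "packing boxes leaving my childhood room behind",
]

vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(lyrics)

# Ask LDA to explain the lyrics as a mixture of 2 hidden topics.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(counts)

# Print the most characteristic words of each discovered topic.
words = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [words[j] for j in topic.argsort()[-4:][::-1]]
    print(f"topic {i}: {top}")
```

With a real dataset, each album's songs can then be scored against these topics and plotted over time, which is the kind of chart the Codecademy blog produced.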

Pitfall to Avoid: Replacing Musicianship with AI

Will AI-generated music make musicians obsolete? Can AI develop music by itself, in the future? Will AI ever be able to come up with lyrics that are emotionally meaningful to humans?

These are questions whose answers only time will determine. However, a few things are certain. When a singer sings in a certain tone, the listener can often feel the joy or pain in their voice. When a guitarist plays with a certain intensity, the listener can tell what kind of music is being made (whether it is a melodic jingle or a loud dance track).

AI will never be able to fully replicate human emotion or human communication, however close it may come. Replacing musicianship with AI is a blunder the human race should not commit. Of course, AI-generated music has its own variety and unique ambiance, but replacing musicianship entirely will only have negative consequences.

My Advice

In my opinion, AI-generated music is here to stay. We may prefer human vocals today, but with the rapid advancement of AI, an artificial voice might emerge with a pitch that no human can match. I don't see this as a threat, at least not where the music industry is concerned.

Ask people about their favorite singer and you will get many different answers. While some may love the voice of Taylor Swift, plenty of others may dislike it. This variety that humans possess may be tough for AI to replicate, leaving AI vocals sounding monotonous. To conclude, I would recommend using AI for lyric analysis or music-video production rather than for generating music, simply because the latter threatens musicianship while the former does not, at least not yet. Nevertheless, this is my own opinion and is open to personal interpretation.

Helpful Links:

My Instagram

https://www.instagram.com/dylan_coding

Magenta Project

https://opensource.google/projects/magenta

https://magenta.tensorflow.org/

Sony Spatial Reality Display

https://www.sony.net/Products/Developer-Spatial-Reality-display/en/develop/AboutSRDisplay.html

Codecademy — Using Machine Learning to Analyze Taylor Swift’s Lyrics

https://www.codecademy.com/resources/blog/taylor-swift-lyrics-machine-learning/

AIVA — Music composed by AI

https://aiva.ai/

‘You Broke Me First | Tate McRae’ Music Video (An Immersive CGI Version)

https://www.youtube.com/watch?v=hSF4DmiaOQk

Using AI in the Music Industry

https://www.sciencefocus.com/science/ai-music-future/

https://cyanite.ai/2021/04/30/4-applications-of-ai-in-the-music-industry/

https://www.forbes.com/sites/bernardmarr/2021/05/14/how-artificial-intelligence-ai-is-helping-musicians-unlock-their-creativity/?sh=7043b5a57004

Picture Source:

https://www.pexels.com/photo/people-at-concert-1105666/
