Matthew Moocarme

Deep Learning Metallica with Recurrent Neural Networks



Goal

I will use recurrent neural networks to generate original guitar music in the style of Metallica, from guitar pro tabs scraped from the internet.

METALLICA

Method

I decided to use guitar pro files because they are popular, so plenty of data is available, and because they are reasonably accurate: they contain a lot of fine detail that conventional guitar tablature can miss. An example of conventional tablature is given below.

Guitar tablature Example


Below is an example of typical guitar tablature from ultimate-guitar.com

Band- Metallica
Song- TuesdAYS gONE

  A5                   E5                   F#5                   D
      x 0 2 2 x x          0 2 2 x x x          2 4 4 x x x          x x 0 2 3 2

         Dsus4                Dsus2                  G                   F#m
      x x 0 2 3 3          x x 0 2 3 0          3 2 0 0 3 3          2 4 4 2 2 2

         GIII
      3 5 5 4 3 3

Gtr I (Eb Ab Db Gb Bb Eb) - 'acoustic'
Gtr II (Eb Ab Db Gb Bb Eb) - 'dobro'
Gtr III (Eb Ab Db Gb Bb Eb) - 'acoustic'
Gtr IV (Eb Ab Db Gb Bb Eb) - 'acoustic'

 Intro
  Slowly H.=50
                   A5                                 E5
 3/4
  Gtr I
                    |           |                      |    ||
                    /           /                      /    //

  Gtr II
                                                       ~~
|-------|-------||-----------|----------------------|------------|
|-------|-------||-14--------|-14------14b16r=(14)--|-12---------|
|-------|-------||-----------|----------------------|------------|
|-------|-------||-----------|----------------------|------------|
|-------|-------||-----------|----------------------|------------|
|-------|-------||-----------|----------------------|------------|
|
| Gtr III
|                                                      ~~
|-------|-------||-----------|----------------------|------------|
|-------|-------||-----------|----------------------|------------|
|-------|-------||-11--------|-11b13r=(11)----------|--9---------|
|-------|-------||-----------|----------------------|------------|
|-------|-------||-----------|----------------------|------------|
|-------|-------||-----------|----------------------|------------|
|
| Gtr IV
|-------|-------||-----------|----------------------|------------|
|-------|-------||-----------|--2-------------------|------------|
|-------|-------||--------2--|----------2-----------|------------|
|-------|-------||-----2-----|------------------2---|--------2---|
|-------|-------||--0--------|----------------------|-----2------|
|-------|-------||-----------|----------------------|--0---------|


                    F#5                                   D
  |        |  |      |         ||    |        |     |     |  | || |
  /        /  /      /         //    /        /     /     /  / // /

                        ~~~~~~~                           ~
|-----------------|----------------|--------------------|----------|
|----12b13r=(12)--|-12-10-(10)-----|----10b12r====(10)--|-7--------|
|-----------------|----------------|--------------------|----------|
|-----------------|----------------|--------------------|----------|
|-----------------|----------------|--------------------|----------|
|-----------------|----------------|--------------------|----------|
|
|                    ~~                                   ~
|-----------------|----------------|--------------------|----------|
|-----------------|----------------|--------------------|----------|
|-----------------|----------------|--------------------|----------|
|-----9b10r==(9)--|--7-------------|----------7b9r=(7)--|-4--------|
|-----------------|----------------|--------------------|----------|
|-----------------|----------------|--------------------|----------|
|
|                    PM---------------------------|
|-----------------|----------------|--------------------|----------|
|-----------------|----------------|--------------------|-------3--|
|-1---------------|----------------|-2------------------|----2-----|
|-----2-----------|-------------4--|-----4--------------|-0--------|
|-------------2---|---------4------|----------------4---|----------|
|-----------------|--2-------------|--------------------|----------|

        

This is an example of the kind of tabs I scraped in previous work on country music (see Deep Learning Country Lyrics under Related Projects). While the guitar tab may seem quite orderly, the tabs are user-submitted, so the format can differ from tab to tab. Additional data, such as whether the guitar is palm-muted, is also often omitted.

Guitar tabs can be easily converted to sheet music since every fret on each string corresponds to a note. Different frets on different strings can produce the same note, so it is a many-to-one mapping, but if we restrict ourselves to the first 4 frets it is a one-to-one mapping. The image below may help: the top shows guitar tablature, and the bottom shows the corresponding sheet music.


fretboard
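
The fret-to-note mapping can be sketched in a few lines of Python. This is only an illustration, not part of the pipeline: it assumes standard tuning (the Metallica tab above is tuned a half step down) and a 24-fret neck, and the function names are made up for this example.

```python
# Open-string MIDI pitches for standard tuning (an assumption for this sketch),
# keyed by string number: 1 is the high E string, 6 is the low E string.
OPEN_STRINGS = {1: 64, 2: 59, 3: 55, 4: 50, 5: 45, 6: 40}
NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']


def fret_to_midi(string, fret):
    """Each fret raises the open-string pitch by one semitone."""
    return OPEN_STRINGS[string] + fret


def midi_to_name(pitch):
    """Convert a MIDI pitch number to a note name with octave (64 is E4)."""
    return NOTE_NAMES[pitch % 12] + str(pitch // 12 - 1)


def positions_for(pitch, max_fret=24):
    """List every (string, fret) pair that produces the given pitch."""
    return [(string, pitch - open_pitch)
            for string, open_pitch in OPEN_STRINGS.items()
            if 0 <= pitch - open_pitch <= max_fret]
```

Going from tab to note is a single addition, while going from a note back to a tab position is ambiguous: `positions_for(64)` lists a position on every string, which is the many-to-one mapping described above.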


Guitar pro files are created using specialized software (Guitar Pro) that has a built-in MIDI editor, so much of the fine data is both available and standardized, and in general the file more accurately resembles the song it is transcribing.

To obtain the guitar pro files I scraped the website ultimate-guitar.com, as it has the most comprehensive repository of guitar tabs on the internet. I scrape the website for 5-star rated tabs only, filter for guitar pro files, and remove duplicate songs (for example, multiple versions) by keeping the tab with the most ratings.

Scraper to get all guitar pro files


Below is a Python script that grabs guitar pro files from ultimate-guitar.com.

# -*- coding: utf-8 -*-
"""
Created on July 30th
@author: matt
"""
# = Import packages

import requests
import re
from bs4 import BeautifulSoup
import pandas as pd
import cgi
import shutil
import os

# = Helper Functions =========================================================

def removeTags(string):
    '''
    Function to remove html tags
    '''
    return re.sub('<[^<]+?>', '', string)


def getBandTree(band, page):
    '''
    Function to get xml tree given the band name
    '''
    page = str(page)  # accept the page number as either an int or a str
    theURL = 'https://www.ultimate-guitar.com/search.php?band_name=' + band + \
        '&type%5B4%5D=500&rating%5B4%5D=5&approved%5B1%5D=1&page=' + page + \
        '&view_state=advanced&tab_type_group=text&app_name=ugt&order=myweight'

    pageBand = requests.get(theURL)
    # Name the parser explicitly so the result does not depend on what is installed
    return BeautifulSoup(pageBand.content, 'html.parser')


def download_url(url, directory):
    """Download file from url to directory

    URL is expected to have a Content-Disposition header telling us what
    filename to use.

    Returns filename of downloaded file.

    """
    response = requests.get(url, stream=True)
    if response.status_code != 200:
        raise ValueError('Failed to download')

    params = cgi.parse_header(
        response.headers.get('Content-Disposition', ''))[-1]
    if 'filename' not in params:
        raise ValueError('Could not find a filename')

    # Drop anything in parentheses or brackets, then remove spaces from the name
    filename = re.sub(r'([\(\[]).*?([\)\]])', '', os.path.basename(params['filename']))
    filename = filename.replace(' ', '')
    abs_path = os.path.join(directory, filename)
    with open(abs_path, 'wb') as target:
        response.raw.decode_content = True
        shutil.copyfileobj(response.raw, target)

    return filename

# ============================================================================

def main():
    band = 'metallica'
    page = 1
    bandURL = getBandTree(band, page)
    #Get max pages
    pages = bandURL.find_all('div', { "class" : "paging" })
    maxPages = str(pages).count('')
    dfs = []
    for page in range(1, maxPages + 1):
        bandURL = getBandTree(band, page)
        mybs = bandURL.find_all('b', { "class" : "ratdig" })
        mybs2 = [removeTags(str(rating)) for rating in mybs]
        songname = bandURL.find_all('a', { "class" : "song result-link" })
        songname2 = [re.sub(r'([\(\[]).*?([\)\]])', '', removeTags(str(song)).strip()).strip() for song in songname]
        tabType = bandURL.find_all('strong')
        tabType2 = [removeTags(str(tab)) for tab in tabType]
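        # Ratings are scraped as strings; convert them so idxmax compares numerically
        df1 = pd.DataFrame({'Rating': pd.to_numeric(mybs2, errors='coerce'), 'Type': tabType2})
        # Copy the filtered frame so the column assignments below do not write to a view
        df2 = df1[df1.Type == 'guitar pro'].copy()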
        df2.loc[:,'Song_Name'] = songname2
        links = [a['href'] for a in songname]
        df2.loc[:,'Song_Links'] = links
        df3 = df2.loc[df2.groupby(['Song_Name'], sort=False)['Rating'].idxmax()]
        dfs.append(df3)
    tot_df = pd.concat(dfs)
#    print(tot_df)

    song_links_list = list(tot_df.Song_Links)
    for i in range(len(tot_df)):
        webPage = requests.get(song_links_list[i])
        soup = BeautifulSoup(webPage.content, 'html.parser')
        tab_id = soup.find_all("input", {"type" : "hidden", "name" : "id", "id" : "tab_id"})
        the_id = tab_id[0].get('value')
        mydir = os.path.join('out', band)
        os.makedirs(mydir, exist_ok=True)  # create the output folder if it is missing
        download_url('https://tabs.ultimate-guitar.com/tabs/download?id='+str(the_id), mydir)


if __name__=='__main__':
    main() 

The scraper grabs 123 songs, which seems pretty good, since there are a total of 151 songs in Metallica's catalog.

I can use the same recurrent neural network that I used to generate country music lyrics, which seemed to work pretty well. To use this model, the guitar pro files need to be in text file format. I do this by extracting all the pertinent information from the guitar pro files and writing it to a txt file. The conversion has to be reversible, since the neural network will output text in the same format.
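
To make the reversibility requirement concrete, here is a minimal round-trip sketch. It works on plain dictionaries rather than real guitar pro objects, it skips the header line, and the helper names and example values are made up for illustration; only the `Num`/`mst`/`vst`/`S`/`V` line layout follows the format used in this post.

```python
def measures_to_text(measures):
    """Serialise measures to the Num/mst/vst/S/V line format."""
    lines = []
    for number, measure in enumerate(measures, start=1):
        lines.append('Num:%d mst:%d' % (number, measure['start']))
        for beat in measure['beats']:
            lines.append('vst:%d' % beat['start'])
            for string, value in beat['notes']:
                lines.append('S:%d V:%d' % (string, value))
    return '\n'.join(lines) + '\n'


def text_to_measures(text):
    """Parse the line format back into the same list-of-dicts structure."""
    measures = []
    for line in text.splitlines():
        if line.startswith('Num:'):
            start = int(line.split('mst:')[1])
            measures.append({'start': start, 'beats': []})
        elif line.startswith('vst:') and measures:
            measures[-1]['beats'].append({'start': int(line[4:]), 'notes': []})
        elif line.startswith('S:') and measures and measures[-1]['beats']:
            s_part, v_part = line.split()
            note = (int(s_part[2:]), int(v_part[2:]))
            measures[-1]['beats'][-1]['notes'].append(note)
    return measures


# Illustrative data only: one measure with a two-note beat followed by a rest.
example = [{'start': 960,
            'beats': [{'start': 960, 'notes': [(6, 0), (5, 2)]},
                      {'start': 1920, 'notes': []}]}]
assert text_to_measures(measures_to_text(example)) == example
```

If the round trip returns the original structure, nothing the network needs to reproduce has been lost in the text encoding.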

Python script to convert guitar pro files to txt file


Below is a Python script showing how guitar pro files are converted to a txt file for the neural network to read.

# -*- coding: utf-8 -*-
"""
Created on Sat Jul 30 14:26:22 2016

@author: matt-666
"""
# = Import packages

import guitarpro
from os import listdir

# = Helper functions =========================================================

def unfold_tracknumber(tracknumber, tracks):
    """Substitute '*' with all track numbers except for percussion tracks."""
    if tracknumber == '*':
        for number, track in enumerate(tracks, start=1):
            if not track.isPercussionTrack:
                yield number
    else:
        yield tracknumber


def transpose(myfile, track):
    '''
    Get pertinent information from guitar pro files and write to text file
    '''
    myfile.write("%s" % 'strgs: ' +' '.join([str(string) for string in track.strings]) + ' ')
    myfile.write("%s" % 'fc: ' +str(track.fretCount) + ' ')
    measure1 = track.measures[0]
    myfile.write("%s" % str(measure1.keySignature) + ' ')
    myfile.write("%s" % 'len:' + str(measure1.length) + ' ')
    myfile.write("%s" % 'tmpo:' + str(measure1.tempo) + '\n')
    # Number the measures from 1 and write each beat and note on its own line
    for i, measure in enumerate(track.measures, start=1):
        myfile.write("%s" % 'Num:' + str(i) + ' ')
        myfile.write("%s" % 'mst:' + str(measure.start) + '\n')
        for voice in measure.voices:
            for beat in voice.beats:
                myfile.write("%s" % 'vst:' + str(beat.start) + '\n')
                for note in beat.notes:
                    myfile.write("%s" % 'S:' + str(note.string) + ' ')
                    myfile.write("%s" % 'V:' + str(note.value) + '\n')

# =================================================================

def main():

    band = 'metallica'
    mydir = 'out/' + band
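    files = listdir(mydir)
    # The context manager closes the output file even if parsing a tab fails
    with open('allTabs.txt', 'w') as myfile:
        for gpfile in files:
            curl = guitarpro.parse(mydir + '/' + gpfile)
            # Only the first track of each song is written out
            transpose(myfile, curl.tracks[0])
            myfile.write('\n\r\n\r')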

if __name__== '__main__':
    main()

Now that all the guitar pro files are converted to one large text file, it can be fed into the recurrent neural network. The model comes from Andrej Karpathy’s great char-rnn library for Lua/Torch. A recurrent neural network feeds its hidden state from each step back in as an input to the next step, so each predicted character depends on the characters that came before it.
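
The recurrence itself can be sketched in pure Python. This is a toy, untrained, plain RNN cell with random weights, written only to show how the hidden state carries context from one character to the next; it is not char-rnn's code, and the function names and sizes are assumptions made for this example.

```python
import math
import random


def rnn_step(x, h_prev, W_xh, W_hh, b_h):
    """One plain RNN step: h_t = tanh(W_xh x_t + W_hh h_(t-1) + b_h)."""
    h = []
    for i in range(len(h_prev)):
        total = b_h[i]
        total += sum(W_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h


def one_hot(char, vocab):
    """Encode a character as a vector with a single 1.0 at its vocab index."""
    return [1.0 if c == char else 0.0 for c in vocab]


def run(text, vocab, hidden_size=4, seed=0):
    """Feed text through the cell one character at a time; return the final state."""
    rng = random.Random(seed)
    W_xh = [[rng.uniform(-0.5, 0.5) for _ in vocab] for _ in range(hidden_size)]
    W_hh = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_size)]
            for _ in range(hidden_size)]
    b_h = [0.0] * hidden_size
    h = [0.0] * hidden_size
    for char in text:
        # The previous hidden state is an input to the current step
        h = rnn_step(one_hot(char, vocab), h, W_xh, W_hh, b_h)
    return h
```

Because the state is threaded through every step, `run('S:6 V:0', vocab)` and `run('V:0 S:6', vocab)` end in different states even though they contain the same characters. A trained network adds an output layer on top of this state to predict the next character.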

The model takes a few days to run on my poor laptop with 400 nodes and 3 layers in the network, which corresponds to about 5 million parameters in the network.

The output of the model is in the same text format as its input. To convert it back, I take an existing guitar pro file and overwrite its measures, notes, and metadata. Creating a guitar pro file from scratch in Python is quite difficult because of the number of settings that have to be included for the file to save properly, so this method was a compromise.

Python script to convert txt to GP5 file


Below is a Python script showing how the txt output is converted back to a guitar pro file.

# -*- coding: utf-8 -*-
"""
Created on Sat Aug  6 15:53:06 2016

@author: matt-666
"""
# = Import packages

import guitarpro
import time

# = Helper Functions ====================================================

def transpose2GP5file(track, totaldict):
    '''
    Function to take an empty gp5 song and fill with song information
    from dictionary generated from txt2songDict function
    '''
    meas = 1
    # Sentinel: keep every measure unless the generated data runs out earlier
    breakMeas = len(track.measures) + 1
    for measure in track.measures:
        #measure.keySignature = totaldict['measure_'+str(meas)]['key']
        #measure.length = totaldict['measure_'+str(meas)]['mlen']
        #measure.start = totaldict['measure_'+str(meas)]['mstart']
        for voice in measure.voices:
            try:
                beats_list = totaldict['measure_'+str(meas)]['beats']
                thebeat = 0
            except KeyError:
                if meas < breakMeas:
                    breakMeas = meas
                break
            for beat in voice.beats:
                if thebeat >= len(beats_list):
                    break  # no more generated beats for this measure
                beat.start = beats_list[thebeat]
                try:
                    strings = totaldict['measure_'+str(meas)]['strings'+str(beats_list[thebeat])]
                    realValues = totaldict['measure_'+str(meas)]['Notes'+str(beats_list[thebeat])]
                except KeyError:
                    break
                notes = []
                for string, realValue in zip(strings, realValues):
                    #print(string)
                    note = guitarpro.base.Note()
                    note.string = string
                    note.value = realValue
                    print(meas, string, note.value)
                    notes.append(note)
                beat.notes = notes
                # Advance inside the loop so each beat gets its own start and notes
                thebeat += 1
        meas += 1
    track.measures = track.measures[:breakMeas-1]
    return(track)

# ================================================================================

def main():
    def txt2songDict(track):
        '''
        Read txt file and go through line by line
        convert all information into dictionary of measures
        '''
        #    myfile =open('xyz.txt', 'r')
        with open(track) as f:
            total_dict = {}
            measures_list = []
            measNum = 0
            measStart = 0
            key = 'CMajor'
            song_len = 3840
            for line in f:
                if line[:5] == 'strgs':

                    fc1 = line.find('fc')
                    try:
                        strings = line[6:fc1].split()
                    except ValueError:
                        break
                    key1 = line.find('Key')
                    fc = line[fc1+4:key1]
                    len1 = line.find('len')
                    key = line[key1+13:len1-1]
                    tempo1 = line.find('tmpo')
                    song_len = line[len1+4:tempo1]
                    try:
                        tempo = line[tempo1+5:tempo1+7]
                    except ValueError:
                        tempo = line[tempo1+5:tempo1+6]
                    init_dict = {}
                    init_dict['tempo'] = tempo
                    gpStrings = [guitarpro.base.GuitarString(string) for string in strings]
                    init_dict['string'] = gpStrings
                    init_dict['fretCount'] = fc
                    total_dict['init_dict'] = init_dict
                elif line[:3] == 'Num':
                    measStart1 = line.find('mst')
                    if measNum > 0:
                        total_dict['measure_'+str(measNum)]['beats'] = beats
                    measNum += 1
                    measStart += 3840
                    meas_dict = {}
                    meas_dict['key'] = key
                    meas_dict['mlen'] = song_len
                    meas_dict['mstart'] = measStart
                    measures_list.append(meas_dict)
                    total_dict['measure_'+str(measNum)] = meas_dict
                    beats = []

                elif line[:3] == 'vst':
                    beatStart1 = measStart + 221820-int(line[4:11])
                    beats.append(beatStart1)
                    strings = []
                    notes = []
                elif line[:1] == 'S':
                    valStart = line.find('V')
                    string = int(line[2])
                    realVal = int(line[valStart+2:valStart+5])
                    strings.append(string)
                    notes.append(realVal)
                    total_dict['measure_'+str(measNum)]['strings' + str(beatStart1)] = strings
                    total_dict['measure_'+str(measNum)]['Notes' + str(beatStart1)]= notes
        return(total_dict)

    curl = guitarpro.parse('Serenade.gp5')
    track = curl.tracks[0]
    for measure in track.measures:
        for voice in measure.voices:
            for beat in voice.beats:
                beat.notes = []
    curl.tracks[0] = track

    songDict = txt2songDict('genMetallica2.txt')

    track = curl.tracks[0]
    track = transpose2GP5file(track, songDict)
    curl.tracks[0] = track
    curl.artist = 'R. N. Net'
    curl.album = time.strftime("%d %B %Y")
    curl.title = 'Metallica Style Song'
    guitarpro.write(curl, 'genMetallica.gp5')

if __name__ == '__main__':
    main() 

Results

The model works well: the training error decreases as the model runs, and the validation error is only slightly higher than the training error, a sign of good model parameters. An example output of the model is shown below and can be played by hitting the play button. The tab is played using AlphaTab, which can play guitar pro files online. This example is an output of 7,000 characters, and the length can be modified accordingly; it seems that a whole song may consist of 15,000-20,000 characters in the text format the model requires, or about 70 musical measures. Works best on Google Chrome!

The model is able to pick up on the song structure, and even power chords, which are often played in Metallica's music. Though the song is a little out-of-order we can see that it starts with a solo or interlude, and continues to a chorus or verse.

This model could be used to generate original music in the style of other artists, depending on the input files. For example, Pearl Jam or Nirvana songs could be used, or even a combination of both to get songs in the style of 90's alternative. To take this even further, guitar pro files exist for bass guitar, drum, and piano music, as well as lyrics, so it is conceivable that whole songs could be generated using this model. It would take a very long time though, since training time scales linearly with the size of the input file. I would recommend having at least 75 songs to accurately train the model on the various characteristic features of the songs, especially if multiple artists are used, where common features may be more subtle.

In the meantime, I'll be the Master of Puppets: I’m pulling your strings, twisting your mind and smashing your dreams.


Related Projects


Deep Learning Country Lyrics

What Makes a Good Playlist Title

