Preamble

This post details my experience importing my "Extended Streaming History" from Spotify into my ListenBrainz account using Elbisaur.

I did this with a listening export from 2022, so my experience was slightly different; any changes I needed to make for this older dataset are noted as such. One major difference worth noting is that my files are named endsong_*.json, whereas the newer Spotify dumps use Streaming_History_Audio_*.json.

Importing

  1. Request download of Spotify listening history
    1. This takes a few days to process; Spotify will then email you a link to download a zip file
  2. Install Deno (a runtime for executing JavaScript software; it is required to run Elbisaur)
    curl -fsSL https://deno.land/install.sh | sh
    
  3. Install elbisaur
    deno install --global --allow-env=LB_USER,LB_TOKEN,ELBISAUR_LISTEN_TEMPLATE --allow-net=jsr.io,api.listenbrainz.org,musicbrainz.org --allow-read --allow-write jsr:@kellnerd/elbisaur
    
  4. Copy ListenBrainz user token
  5. Create a .env file in the working directory you will run elbisaur from, containing the following:
    LB_TOKEN = "YOUR_TOKEN_HERE"
    LB_USER = "YOUR_USERNAME_HERE"
    
  6. Check the CLI works by running elbisaur history (this should show 25 recent listens for your user)
  7. Identify the timestamp of the first listen you imported to ListenBrainz (go to your user feed and click the button that takes you to the oldest listens)
    1. Convert that to the format YYYY-MM-DDTHH:MM:SS
  8. 2022 Data Format Note
    1. I had issues when parsing because some of my Spotify JSON files had entries with no artist, track, album or URI
      1. I filtered these out before using elbisaur by running the files through jq to remove any items without a spotify_track_uri (usually indicative of an empty artist/track/album). An example of doing that for one file is below (it writes with > rather than >> so that a re-run overwrites the output instead of appending a second array to it):
        jq '[.[] | select(.spotify_track_uri != null)]' endsong_0.json > parsed_endsong_0.json
        
  9. Elbisaur filtering for tracks that have not been skipped and have been played for at least 30 seconds
    1. The documented way (using ms_played) is incorrect; the filter should use duration_ms: "skipped!=1&&duration_ms>=30000"
    2. Check output seems sensible (e.g. no songs that you know are longer than duration shown) with elbisaur parse -d -p parsed_endsong_0.json --filter "skipped!=1&&duration_ms>=30000"
    3. Some songs have incorrect listen dates, coming out as the epoch (01-01-1970T01:00:01)
      1. Setting offline_timestamp to 0 (in parsed_endsongX.json) for any tracks that have it set to just 1 prevents that value being used to calculate the "listened at" date (because 1 would resolve to 1 second past the epoch). The sed -i '' form below is the BSD/macOS syntax; GNU sed takes -i without the empty argument
        sed -i '' 's/"offline_timestamp": 1,/"offline_timestamp": 0,/g' parsed_endsong_0.json
        
  10. Create the filtered JSON files. SET -b TO THE TIMESTAMP IDENTIFIED EARLIER (note that this runs against the parsed_*.json files created earlier)
    for i in endsong* ; do elbisaur parse "parsed_$i" --filter "skipped!=1&&duration_ms>=30000" -b YYYY-MM-DDTHH:MM:SS "filtered_$i.jsonl" ; done
    
  11. Import to listenbrainz
    1. Do a dry run and sort the output by the date elbisaur has assigned to each listen, then check the top of the output for any listens that have landed on the epoch
      elbisaur import --preview filtered_endsong_0.json.jsonl |sort -k 1.7,1.10 -k 1.4,1.5 -k 1.1,1.2
      
    2. Run the import
      for file in filtered_endsong_*.jsonl ; do echo "$file" ; elbisaur import "$file" > "import_of_$file.log" ; done
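
Steps 8 and 9.3 above each show the fix for a single file. A minimal sketch of applying both to every export file in one pass might look like the following. It assumes jq is installed and that it is run from the directory containing the endsong_*.json files; clean_endsong is a hypothetical helper name of my own, not part of elbisaur. It resets offline_timestamp inside jq rather than with sed, which sidesteps the sed -i difference between macOS and Linux.

```shell
#!/bin/sh
set -eu

# clean_endsong FILE: write parsed_FILE containing only the entries that have
# a spotify_track_uri, with any offline_timestamp of exactly 1 reset to 0.
# (Hypothetical helper; expects FILE to be a bare filename in the current dir.)
clean_endsong() {
  jq '[ .[]
        | select(.spotify_track_uri != null)
        | if .offline_timestamp == 1 then .offline_timestamp = 0 else . end
      ]' "$1" > "parsed_$1"
}

# Process every export file in the current directory.
for f in endsong_*.json; do
  [ -e "$f" ] || continue   # the glob matched nothing
  clean_endsong "$f"
  echo "$f: $(jq length "$f") entries in, $(jq length "parsed_$f") entries out"
done
```

For a newer export, the glob would presumably need changing to Streaming_History_Audio_*.json; I have only run the steps in this post against the 2022 format.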
      

Voila, all my listening history is now in ListenBrainz! (Why did I pivot to using an iPod Touch and iTunes for 2 years?!)

A screenshot of my ListenBrainz 'All Time' statistics

A screenshot of my ListenBrainz top artists of all time

Troubleshooting

I hit the offline_timestamp issue after already attempting an import, so I had to tidy up every listen that had been imported before my first legitimate listen (the one identified earlier). Following the docs:

  1. Identify listens before the first legit one (replace stuts in the command below with your own username)
    1. Note: --count may need to be set much higher depending on how many listens were imported before the failure! The highest count that worked in my testing was 523
    2. If there are more than 500 to remove, repeat this process, changing --before to the timestamp of the last listen returned by the previous run
    3. Note: If --count is higher than the number of listens before this date you will hit errors, so increase it slowly until you reach the right number. You should be able to compare this list against the "oldest" listen in the ListenBrainz profile view in the browser
    4. Note: If there's a significant time jump (e.g. a couple of years with no listens) this seems to confuse things, so only go as far as that date
      elbisaur history --user stuts --before 2018-11-11T14:20:00 --count 1000 --output remove_these.jsonl
      
  2. Delete the ones we've found
    1. Note: Use --preview to check it removes what you expect
      elbisaur delete remove_these.jsonl
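
Before running the delete, it can be worth sanity-checking remove_these.jsonl by hand. The sketch below is one way to do that with jq. It assumes each line of the file is a ListenBrainz-style listen object with a numeric listened_at field in Unix seconds; I have not confirmed that against elbisaur's output format, so check one line of your own file first. count_listens and newest_listen are hypothetical helper names.

```shell
#!/bin/sh

# count_listens FILE: number of JSON objects in a JSONL file.
count_listens() {
  jq -s 'length' "$1"
}

# newest_listen FILE: the latest listened_at in FILE as an ISO 8601 UTC string.
# ASSUMPTION: every line has a numeric listened_at field (Unix seconds).
newest_listen() {
  jq -s -r 'map(.listened_at) | max | todate' "$1"
}

file=remove_these.jsonl
if [ -e "$file" ]; then
  echo "listens queued for deletion: $(count_listens "$file")"
  echo "newest listen in file (UTC): $(newest_listen "$file")"
fi
```

The newest listen printed should be earlier than the --before value used above. The output is in UTC, so allow for your timezone offset when comparing.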