Preamble
This post details my experience importing my "Extended Streaming History" from Spotify into my ListenBrainz account using Elbisaur.
I did this with a listening export from 2022, so my experience was slightly different; any changes I needed to make for this older dataset are noted as such. One major difference worth noting is that my files are named endsong_*.json, while the newer Spotify dumps use Streaming_History_Audio_*.json.
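Since the file names differ between older and newer exports, a small helper can make later loops agnostic to the naming. This is only a sketch under assumptions: list_export_files is a hypothetical name (not part of elbisaur), it only knows the two patterns mentioned above, and it is written for sh/bash.

```shell
# list_export_files is a hypothetical helper, not an elbisaur command.
# It prints the Spotify export files found in a directory (default: current one),
# whichever of the two naming schemes they use.
list_export_files() {
  dir="${1:-.}"
  for f in "$dir"/endsong_*.json "$dir"/Streaming_History_Audio_*.json; do
    # In sh/bash an unmatched glob stays as a literal pattern,
    # so skip anything that is not a real file.
    [ -f "$f" ] && printf '%s\n' "$f"
  done
  return 0
}
```

Running list_export_files in the unzipped export directory should show which naming scheme your dump uses before you start adapting the commands in this post.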
Importing
- Request a download of your Spotify listening history
  - This takes a few days to process; Spotify then emails you and lets you download a zip file
- Install deno (a runtime used to execute JavaScript software; it is required to run Elbisaur):
  curl -fsSL https://deno.land/install.sh | sh
- Install elbisaur:
  deno install --global --allow-env=LB_USER,LB_TOKEN,ELBISAUR_LISTEN_TEMPLATE --allow-net=jsr.io,api.listenbrainz.org,musicbrainz.org --allow-read --allow-write jsr:@kellnerd/elbisaur
- Copy ListenBrainz user token
- Create a .env file in the working directory that elbisaur will be executed from, with the following contents:
  LB_TOKEN = "YOUR_TOKEN_HERE"
  LB_USER = "YOUR_USERNAME_HERE"
- Check the CLI works by running elbisaur history (this should show 25 recent listens for your user)
- Identify the timestamp of the first listen imported to ListenBrainz (go to your user feed and click the button that takes you to the oldest listen)
  - Convert that to the format YYYY-MM-DDTHH:MM:SS
- 2022 Data Format Note
  - I had issues when trying to parse, as some of my Spotify JSON files had entries with no artist, track, album or URI
    - I filtered these out before using elbisaur by parsing the files with jq to remove any items that have no spotify_track_uri (usually indicative of an empty artist/track/album). An example of doing that to one file:
      cat endsong_0.json | jq '[.[] | select(.spotify_track_uri!=null)]' > parsed_endsong_0.json
- Elbisaur filtering for tracks that have not been skipped and have been played for at least 30 seconds
  - The documented way (using ms_played) is incorrect and should use duration_ms:
    "skipped!=1&&duration_ms>=30000"
    - Check the output seems sensible (e.g. no songs that you know are longer than the duration shown) with:
      elbisaur parse -d -p parsed_endsong_0.json --filter "skipped!=1&&duration_ms>=30000"
    - Some songs have incorrect listen dates, coming out as the epoch (01-01-1970T01:00:01)
      - Setting offline_timestamp to 0 (in parsed_endsong_X.json) for any tracks that have it set to just 1 prevents it being used to calculate the "listened at" date (because 1 would set it to 1 second past the epoch):
        sed -i '' 's/"offline_timestamp": 1,/"offline_timestamp": 0,/g' parsed_endsong_0.json
- Create the filtered JSON files, SETTING -b TO THE TIMESTAMP IDENTIFIED EARLIER (note that this is run against the parsed_*.json files created earlier):
  for i in endsong* ; do elbisaur parse "parsed_$i" --filter "skipped!=1&&duration_ms>=30000" -b YYYY-MM-DDTHH:MM:SS "filtered_$i.jsonl" ; done
- Import to ListenBrainz
  - Dry run it and sort the output by the date it assigns to each listen, then check the top of this output for any epoch borking:
    elbisaur import --preview filtered_endsong_0.json.jsonl | sort -k 1.7,1.10 -k 1.4,1.5 -k 1.1,1.2
  - Run the import:
    for file in filtered_endsong_*.jsonl ; do echo "$file" ; elbisaur import "$file" > "import_of_$file.log" ; done
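A note on the offline_timestamp step above: sed -i '' is the BSD/macOS form, while GNU sed on Linux expects -i without the empty argument. Here is a minimal sketch that sidesteps the difference by writing to a temporary file, plus a quick count to confirm nothing was missed. Both function names are hypothetical helpers (not elbisaur commands), and the pattern assumes the same spacing as the sed command above.

```shell
# zero_offline_timestamp is a hypothetical helper, not an elbisaur command.
# It rewrites "offline_timestamp": 1, to "offline_timestamp": 0, in the given file.
zero_offline_timestamp() {
  file="$1"
  tmp="$(mktemp)" || return 1
  # Write to a temporary file instead of using sed -i,
  # whose syntax differs between GNU and BSD sed.
  sed 's/"offline_timestamp": 1,/"offline_timestamp": 0,/g' "$file" > "$tmp" \
    && mv "$tmp" "$file"
}

# count_offline_ones (also hypothetical) prints how many entries still carry
# the problematic value; 0 means the file is clean.
count_offline_ones() {
  grep -o '"offline_timestamp": 1,' "$1" | wc -l | tr -d ' '
}
```

For example, run zero_offline_timestamp parsed_endsong_0.json and then count_offline_ones parsed_endsong_0.json, which should print 0 before you move on to the filtering step.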
Voila, all my listening history is now in ListenBrainz! (Why did I pivot to using an iPod Touch and iTunes for 2 years?!)
A screenshot of my ListenBrainz 'All Time' statistics
A screenshot of my ListenBrainz top artists of all time
Troubleshooting
I hit the issue with offline_timestamp after trying an import, so I had to tidy up everything that was imported before my first legit listen identified earlier (following the docs):
- Identify listens before the first legit one
  - Note: count may need to be set much higher depending on how many listens were imported before the failure! The highest that count can be is 523, from my testing
    - Repeat this process if there are more than 500, changing --before to the timestamp of the last listen from the previous run
  - Note: If count is higher than the number of listens before this date you will hit errors, so slowly increase count to get to the right number. You should be able to compare this list to the "oldest" listen in the ListenBrainz profile view in the browser
  - Note: If there's a significant time jump (e.g. a couple of years of no listens) this seems to confuse things, so go up until that date
  elbisaur history --user stuts --before 2018-11-11T14:20:00 --count 1000 --output remove_these.jsonl
- Delete the ones we've found
  - Note: Use --preview first to check it removes what you expect
  elbisaur delete remove_these.jsonl
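To judge whether a history run hit the count cap, you can count the listens it wrote out. This is a small sketch, assuming the --output file holds one listen per line (as the .jsonl extension suggests); count_listens and hit_count_cap are hypothetical helpers, not elbisaur commands.

```shell
# count_listens is a hypothetical helper: assuming one JSON object per line,
# counting the non-empty lines gives the number of listens in the file.
count_listens() {
  grep -c . "$1"
}

# hit_count_cap (also hypothetical) succeeds when the file holds at least
# as many listens as the limit given as the second argument.
hit_count_cap() {
  [ "$(count_listens "$1")" -ge "$2" ]
}
```

For example, hit_count_cap remove_these.jsonl 500 && echo "cap reached, run again with an earlier --before" gives a quick hint that another pass is needed.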