9 minute read

This Project uses an agent to create a new playlist for you. It looks at your listening history and finds songs that match your request – choosing a mixture of songs you know and new ones. The UI is all done through Streamlit, with package management via uv and and logging done with Loguru.

This project took me about 2 days thanks to Claude. The most difficult part was debugging why creating a playlist was working, but adding tracks to that playlist was broken. It turned out to be because the backend of the Spotipy package is not up to date with Spotify’s WebAPI, so it was submitting a /post call that was getting rejected to no fault of my own. The workaround was to just directly submit the correct /post call instead of using the python package. Another problem is Gemini’s insistance on hallucinating track id’s. The prompt explicitely tells Gemini not to do this, but a second layer of validation was still necessary.

Below is an explanation of how it works, generated by Claude. The interesting part is the agentic loop. This is where things could be tweaked if the results around to your liking.

How It Works — Codebase Walkthrough

Where it starts

The entry point is app.py, executed by Streamlit when you run the app. Streamlit re-executes the entire file top-to-bottom on every user interaction. The very last line is:

main()

Everything flows from there.


1. Boot sequence (every page load)

uv run streamlit run app.py
        │
        ▼
  setup_logging()          ← logging_setup.py: configures Loguru sinks before
        │                    anything else can log
        ▼
  st.set_page_config()     ← must be the first Streamlit call
        │
        ▼
  st.markdown(font CSS)    ← injects SF Pro system font stack
        │
        ▼
  main()

2. OAuth state machine (main())

Every page load runs through three possible states:

main()
  │
  ├─► _handle_oauth_callback()
  │       checks st.query_params for ?code=
  │       if found → exchanges code for token via SpotifyOAuth
  │                → stores in st.session_state["token_info"]
  │                → clears ?code= from URL (prevents double-processing)
  │
  ├─► _try_get_cached_token()
  │       checks st.session_state["token_info"] first
  │       if expired → refreshes via auth_manager.refresh_access_token()
  │       else checks disk cache (.spotify_cache file)
  │       returns token dict or None
  │
  ├─ token_info is None?
  │       └─► _render_auth_page()   ← STATE 1: show login page, st.stop()
  │
  └─ token_info exists?
          ├─► _initialize_spotify()  ← wraps token in spotipy.Spotify()
          │                            cached in session_state["sp"]
          ├─► _load_user_data()      ← STATE 3: fetch profile + history
          └─► _render_main()         ← draw the app

State 1 — no token. Shows the login page with the Spotify connect button. State 2?code= is in the URL. Spotify just redirected back. The callback handler fires, exchanges the code, then _try_get_cached_token() immediately finds the token and falls through to State 3. State 3 — valid token. The full app renders.


3. Session data loading (_load_user_data)

Runs once per browser session (guarded by session_state checks):

_load_user_data(sp)
  │
  ├─► PlaylistPlanner(sp).get_user_profile()
  │       └─► SpotifyClient.get_current_user()
  │               └─► spotipy: GET /v1/me
  │               └─► returns UserProfile (id, name, image, product, followers)
  │
  └─► PlaylistPlanner(sp).get_listening_context()
          └─► SpotifyClient.build_listening_context()
                  ├─► get_top_tracks(short_term, 20)   → GET /v1/me/top/tracks
                  ├─► get_top_tracks(long_term, 20)    → GET /v1/me/top/tracks
                  ├─► get_top_artists(short_term, 20)  → GET /v1/me/top/artists
                  ├─► get_top_artists(long_term, 20)   → GET /v1/me/top/artists
                  ├─► get_recently_played(20)           → GET /v1/me/player/recently-played
                  └─► _infer_favorite_genres()
                          counts genres across all top artists
                          returns top 10 by frequency

All of this is stored in st.session_state so it only happens once — not on every Streamlit rerun.


4. Playlist creation flow

When the user clicks “Create Playlist”:

_render_main()
  │
  └─► PlaylistPlanner(sp).create_playlist(request, user_profile, listening_context)
          │
          ├─► PlaylistAgent.run()        ← THE AGENT LOOP (see §5)
          │       returns AgentResult
          │           .track_ids        list of Spotify IDs
          │           .playlist_name
          │           .playlist_description
          │           .reasoning_summary
          │           .tool_calls       full log of every tool call
          │           .iterations_used
          │
          ├─► deduplication
          │       dict.fromkeys(agent_result.track_ids)  ← preserves order
          │
          └─► SpotifyClient.create_playlist()
                  ├─► POST /v1/me/playlists           ← creates empty playlist
                  ├─► validate IDs (regex: 22 base62 chars)
                  │       drops hallucinated IDs with a warning log
                  ├─► POST /v1/playlists/{id}/items   ← adds tracks (100 at a time)
                  └─► GET  /v1/playlists/{id}         ← fetches full playlist data
                          returns Playlist model

5. The agent loop (PlaylistAgent.run)

This is the core of the app. It implements a tool-use agentic loop — Gemini decides what to do, executes tools, sees results, decides again, until it calls finalize_playlist.

PlaylistAgent.run()
  │
  ├─► build_system_prompt()    ← injects user's listening history (see §6)
  ├─► build_user_message()     ← wraps the user's text input
  │
  │   contents = [user message]     ← growing conversation history
  │
  └─► LOOP (up to max_iterations=10):
          │
          ├─ iteration >= max_iterations-1 AND have tracks?
          │       inject "you MUST call finalize_playlist now" user message
          │       set tool_config = ANY, only finalize_playlist allowed
          │
          ├─► _generate(contents, config)
          │       calls Gemini API (gemini-2.5-pro)
          │       with exponential backoff on 503
          │       returns candidate with 0..N function_call parts
          │
          ├─► append candidate.content to contents
          │
          ├─ no function_call parts?
          │       if have tracks → inject "you stopped, call finalize now"
          │       force one more _generate() with finalize_playlist only
          │       return AgentResult   ── EXIT
          │       else break loop
          │
          ├─ any part is finalize_playlist?
          │       extract args (track_ids, name, description, reasoning)
          │       return AgentResult   ── EXIT
          │
          ├─► dispatch all tool calls IN PARALLEL (ThreadPoolExecutor)
          │       each call → _dispatch_tool(name, inputs)
          │
          ├─► collect results, build FunctionResponse parts
          ├─► append tool results to contents
          └─► progress_callback(_summarize_iteration())

The conversation history (contents) grows each iteration:

Iteration 1:
  contents = [
    user: "Please create 20 tracks. Request: lo-fi for studying"
  ]
  → Gemini responds with function_calls
  contents = [
    user:  "Please create 20 tracks..."
    model: [function_call: search_tracks("lo-fi hip hop study")]
           [function_call: search_tracks("chill beats instrumental")]
    user:  [function_response: search_tracks → [{id, name, artists...}, ...]]
           [function_response: search_tracks → [{id, name, artists...}, ...]]
  ]

Iteration 2:
  → Gemini sees the search results and calls more tools
  contents = [... + model response + tool results ...]

...until finalize_playlist is called

Gemini sees the entire conversation history on every call — it knows what it has already searched and what tracks it has in hand.


6. What prompts does Gemini receive?

System prompt (set once, sent with every API call):

You are a music curator AI helping {display_name} build a Spotify playlist.

## User's Listening Profile
- Recent favorites (last ~4 weeks): {top 8 short-term artists}
- All-time favorites: {top 8 long-term artists}
- Recently played tracks: {8 recent tracks}
- Top tracks (recent): {8 short-term tracks with artists}
- Top tracks (all-time): {8 long-term tracks with artists}
- Favorite genres: {top 10 genres, inferred from artist tags}

## Your Approach
1. Analyze the request for mood, genre, activity, artist preferences
2. Use search_tracks as primary tool — 3-5 varied searches
   - Good patterns: "genre:k-pop upbeat", "artist:IU", "chill lo-fi 2024"
3. Use get_user_top_items for personalization, then search those artists
4. Never fabricate track IDs
5. Order tracks thoughtfully (energy arc, genre flow, tempo)
6. Call finalize_playlist as soon as you have enough tracks

## Constraints
- Respect explicit content preferences
- Aim for variety — avoid repeating artists

First user message:

Please create a playlist with {target_length} tracks [no explicit content].

Request: {user's raw text input}

Injected user messages (added mid-loop when needed):

# When approaching iteration limit:
"You must now call finalize_playlist with the tracks you have gathered.
 No more search or recommendation calls are allowed."

# When model stops without finalizing:
"You stopped without finalizing.
 Call finalize_playlist now with the tracks you have gathered."

7. Tool dispatch (_dispatch_tool)

Each tool call is a thin bridge between Gemini’s JSON arguments and SpotifyClient:

_dispatch_tool(name, inputs)
  │
  ├─ "search_tracks"
  │       SpotifyClient.search_tracks(query, limit=10)
  │           → spotipy: GET /v1/search?q=...&type=track&market=from_token
  │           → returns list[Track] → serialized to compact JSON
  │               [{"id", "name", "artists", "album", "popularity", "explicit", "duration"}, ...]
  │
  ├─ "get_user_top_items" (tracks)
  │       SpotifyClient.get_top_tracks(time_range, limit)
  │           → spotipy: GET /v1/me/top/tracks
  │           → [{"id", "name", "artists", "popularity"}, ...]
  │
  ├─ "get_user_top_items" (artists)
  │       SpotifyClient.get_top_artists(time_range, limit)
  │           → spotipy: GET /v1/me/top/artists
  │           → [{"id", "name", "genres", "popularity"}, ...]
  │
  └─ "finalize_playlist"    ← never reaches _dispatch_tool
                               handled before the dispatch block

finalize_playlist is intercepted before the parallel dispatch loop. It’s a signal to the loop, not an actual API call.


8. Full end-to-end diagram

Browser
  │  GET http://localhost:8501
  ▼
Streamlit (app.py)
  │
  ├── [first visit] ──────────────────────────────────────────────────┐
  │   _render_auth_page()                                             │
  │   → "Connect with Spotify" link_button                           │
  │              │                                                    │
  │              ▼                                                    │
  │         Spotify OAuth                                             │
  │         /authorize → user approves → redirect to                 │
  │         http://localhost:8501?code=AUTH_CODE                     │
  │                                          │                        │
  │                                          ▼                        │
  │                              _handle_oauth_callback()            │
  │                              → exchange code for token           │
  │                              → store in session_state            │
  └────────────────────────────────────────────────────────────────┘
  │
  ├── [logged in, first load] ─────────────────────────────────────┐
  │   _load_user_data()                                            │
  │   5× Spotify API calls → UserProfile + UserListeningContext    │
  │   cached in session_state                                      │
  └────────────────────────────────────────────────────────────────┘
  │
  ├── [user submits form] ─────────────────────────────────────────┐
  │   st.status() context opens ("Building...")                    │
  │                                                                │
  │   PlaylistPlanner.create_playlist()                            │
  │     │                                                          │
  │     └─► PlaylistAgent.run()   ← AGENTIC LOOP                  │
  │             │                                                  │
  │             │  System prompt (listening history baked in)      │
  │             │  User message ("20 tracks, lo-fi for studying")  │
  │             │                                                  │
  │             │  ┌─────────────────────────────────┐            │
  │             │  │  GEMINI API CALL                │            │
  │             │  │  → returns function_calls       │            │
  │             │  └──────────┬──────────────────────┘            │
  │             │             │                                    │
  │             │    ┌────────┴──────────────────────┐            │
  │             │    │  PARALLEL TOOL DISPATCH        │            │
  │             │    │  ThreadPoolExecutor            │            │
  │             │    │                                │            │
  │             │    │  search_tracks("lo-fi") ──┐    │            │
  │             │    │  search_tracks("chill") ──┤    │            │
  │             │    │  get_user_top_items() ────┘    │            │
  │             │    │    ↓ all hit Spotify API       │            │
  │             │    │    ↓ simultaneously            │            │
  │             │    └────────┬──────────────────────┘            │
  │             │             │                                    │
  │             │    results appended to contents                  │
  │             │    progress_callback("Iteration 1: Searched...") │
  │             │             │                                    │
  │             │    [repeat until finalize_playlist called]       │
  │             │             │                                    │
  │             │    finalize_playlist({track_ids, name, ...})     │
  │             │    → return AgentResult                          │
  │             │                                                  │
  │     └─► SpotifyClient.create_playlist()                        │
  │             ├─ POST /v1/me/playlists                           │
  │             ├─ validate IDs (drop hallucinated ones)           │
  │             ├─ POST /v1/playlists/{id}/items                   │
  │             └─ GET  /v1/playlists/{id} → Playlist model        │
  │                                                                │
  │   st.status() → "Playlist ready" ✓                            │
  │   st.rerun() → renders _render_playlist()                      │
  └────────────────────────────────────────────────────────────────┘

9. Data model relationships

UserListeningContext
  ├── top_tracks_short: list[Track]
  ├── top_tracks_long:  list[Track]
  ├── top_artists_short: list[Artist]
  ├── top_artists_long:  list[Artist]
  ├── recently_played:  list[Track]
  └── favorite_genres:  list[str]      ← derived from artist.genres

Track
  ├── id, name, album_name
  ├── artists: list[Artist]
  ├── duration_ms → .duration_str ("3:42")
  ├── popularity, explicit
  └── album_image_url, spotify_url, preview_url

AgentResult                            ← output of the agent loop
  ├── track_ids: list[str]             ← what gets sent to Spotify
  ├── playlist_name, description
  ├── reasoning_summary
  ├── iterations_used
  └── tool_calls: list[ToolCall]       ← full audit log
          ├── tool_name
          ├── tool_input (dict)
          ├── tool_output (JSON string)
          └── iteration

Key design points

Conversation history grows in one direction. contents is a flat list that both sides append to. Gemini sees everything — past searches, past results, what tracks it already has. This is why it can avoid repetition and build on earlier searches.

Tool calls are parallel, API calls are sequential. Within one iteration, if Gemini requests 4 searches at once, all 4 hit the Spotify API simultaneously via ThreadPoolExecutor. But the Gemini API itself is called sequentially — one call per iteration.

finalize_playlist is a loop-exit signal, not a real API call. It never reaches _dispatch_tool. The agent loop scans for it before the dispatch block and returns immediately when found. The actual playlist creation happens in PlaylistPlanner after the agent returns.

The system prompt is the personalization layer. Gemini doesn’t call any user-data APIs itself — all listening history is pre-fetched, formatted into the system prompt, and baked in before the loop starts. Gemini uses it to pick relevant queries and seed artists.

Two safety nets for loop termination. (1) forcing_finalize injects a mandatory instruction at iteration max-1, and (2) if Gemini returns no tool calls at all, it gets one forced finalize attempt before the loop breaks and raises.