Collected 2024 US tech job postings from Indeed and embedded them with OpenAI's large text-embedding model. Reduced dimensionality with UMAP and clustered with HDBSCAN. Topic-modeled the clusters with the OpenAI chat API. Visualized with DataMapPlot. The full interactive map is on GitHub Pages at https://hazondata.github.io/. I also have real-time insights into tech job postings on my site, hazon.fyi.
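
For anyone curious how the reduce-then-cluster step might look, here is a minimal sketch. It assumes the `umap-learn` and `hdbscan` Python packages and an embeddings array that has already been computed. The parameter values (`n_components`, `min_cluster_size`, the cosine metric, the seed) and both function names are illustrative guesses, not the settings or code used for this map.

```python
from collections import Counter


def cluster_embeddings(embeddings, n_components=2, min_cluster_size=50, seed=42):
    """Project embeddings with UMAP, then cluster the projection with HDBSCAN.

    `embeddings` is an (n_postings, n_dims) array-like of floats.
    Returns (coords, labels); HDBSCAN marks noise points with the label -1.
    """
    # Third-party imports are deferred so that summarize_clusters() below
    # stays usable without umap-learn and hdbscan installed.
    import umap
    import hdbscan

    # Assumed settings: cosine distance is a common choice for text embeddings.
    reducer = umap.UMAP(n_components=n_components, metric="cosine", random_state=seed)
    coords = reducer.fit_transform(embeddings)

    # Cluster in the reduced space; min_cluster_size is a hypothetical value.
    clusterer = hdbscan.HDBSCAN(min_cluster_size=min_cluster_size)
    labels = clusterer.fit_predict(coords)
    return coords, labels


def summarize_clusters(labels):
    """Return (cluster sizes keyed by label, number of noise points)."""
    counts = Counter(int(label) for label in labels)
    noise = counts.pop(-1, 0)  # -1 is the noise label
    return dict(counts), noise
```

The 2-D `coords` would be what gets plotted, and each cluster's postings could then be sent to a chat model to get a topic name.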

https://old.reddit.com/r/dataisbeautiful/comments/1fakvwv/oc_clustering_250k_tech_job_postings_in_2024/

5 points

no wonder it was taking long to load; it’s a 58MB HTML file.

really cool stuff though - I’d love to see more information about what’s on the screen:

  • Number of postings (updated when filtered using the search);
  • Some way to visualize posts in the intersection of these clusters, e.g. Software Dev with Education, or AI and DevOps;
  • Word cloud of most common terms in the posting selection;
  • Ways to export the filtered data.
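
A rough sketch of three of these (filtered count, common terms, export), assuming the postings are available as plain records. The field names, sample rows, stopword list, and function names are all hypothetical, since the actual data schema isn't shown in the post.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical sample records; the real dataset's fields are not shown in the post.
POSTINGS = [
    {"title": "Machine Learning Engineer", "cluster": "AI",
     "text": "Build ML pipelines with Python and Kubernetes"},
    {"title": "DevOps Engineer", "cluster": "DevOps",
     "text": "Maintain Kubernetes clusters and CI pipelines"},
    {"title": "Curriculum Developer", "cluster": "Education",
     "text": "Design Python courses for new developers"},
]

# Tiny illustrative stopword list, not a complete one.
STOPWORDS = {"and", "with", "for", "the", "a", "of"}


def filter_postings(postings, query):
    """Case-insensitive substring search over title and text."""
    q = query.lower()
    return [p for p in postings if q in p["title"].lower() or q in p["text"].lower()]


def top_terms(postings, n=5):
    """Most common non-stopword terms in the selection (word cloud input)."""
    words = Counter()
    for p in postings:
        for w in re.findall(r"[a-z]+", p["text"].lower()):
            if w not in STOPWORDS:
                words[w] += 1
    return words.most_common(n)


def to_csv(postings):
    """Serialize the selection as CSV text for export."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["title", "cluster", "text"])
    writer.writeheader()
    writer.writerows(postings)
    return buf.getvalue()


hits = filter_postings(POSTINGS, "kubernetes")
print(len(hits))           # number of postings matching the search
print(top_terms(hits, 2))  # term counts for the current selection
print(to_csv(hits))        # CSV text ready to save or download
```

The cluster-intersection view isn't covered here; it would depend on how cluster membership is stored.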