Clustering similar words in all languages and plotting them on a world map using Python, Scikit-learn and the Google Cloud Translation service

InterSoftwareBot
2 min read · Mar 4, 2021

Generate an etymology cluster on a world map for any word in any language. You need a Google Cloud Translation account, which comes with £300 of free credit for your first calls.

To start using Cloud Translation, you need a project that has the Cloud Translation API enabled and credentials to make authenticated calls. The following sections detail how to get set up before you make your first call to the Cloud Translation API.

Download the Wordmapix repo from here:

https://github.com/haker88/wordmapix

Read through the Google Cloud setup guide: https://cloud.google.com/translate/docs/setup

In short:

Set up your new app here: https://console.developers.google.com/apis/dashboard

Enable the Google Cloud Translation API: https://console.cloud.google.com/flows/enableapi?apiid=translate.googleapis.com

Generate a JSON key for your app:

https://console.developers.google.com/apis/credentials

Download your JSON key to the main app folder and rename it to ‘key.json’. You are ready to go! In the Jupyter notebook, execute all cells. Feel free to edit the list of phrases or load them from a CSV file.
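If you want to check that the key works before running the whole notebook, here is a minimal sketch of translating one word into a few languages. It assumes the google-cloud-translate package is installed and uses its Basic (v2) client; the helper names make_client and translate_word are made up for this example and are not taken from the Wordmapix repo.

```python
import html
from typing import Dict, Iterable


def make_client(key_path: str = "key.json"):
    """Build a Cloud Translation Basic (v2) client from the service-account key."""
    # Imported here so translate_word below can be tried without the package.
    from google.cloud import translate_v2 as translate

    return translate.Client.from_service_account_json(key_path)


def translate_word(client, word: str, languages: Iterable[str]) -> Dict[str, str]:
    """Return {language_code: translation} for a single word.

    `client` is any object with a translate(text, target_language=...) method
    returning a dict that contains "translatedText" (the v2 client does this).
    """
    translations = {}
    for lang in languages:
        result = client.translate(word, target_language=lang)
        # The Basic API can return HTML entities (e.g. &#39;), so unescape them.
        translations[lang] = html.unescape(result["translatedText"])
    return translations


# Example usage (makes real, billable API calls):
# client = make_client("key.json")
# print(translate_word(client, "python", ["de", "fr", "pl"]))
```

Each (word, language) pair is one call here, which keeps the sketch simple and makes it easy to count calls in the cloud console.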

Go to the cloud console to monitor your calls and set up threshold limits. You should be OK to run 500k calls, which will be enough for a decent number of maps.
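To get a rough feel for how far a call budget goes, here is a back-of-the-envelope sketch. It assumes one call per word per target language, and the figure of 100 languages per map is only an illustrative number, not something taken from the repo. Check the console for how your usage is actually counted.

```python
def maps_within_budget(call_budget: int, languages_per_map: int, words_per_map: int = 1) -> int:
    """How many maps fit in a call budget, assuming one call per word per language."""
    calls_per_map = languages_per_map * words_per_map
    return call_budget // calls_per_map


# Assumed numbers: 500k calls, 100 target languages, one word per map.
print(maps_within_budget(500_000, languages_per_map=100))  # 5000
```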

#scikit #sklearn #datascience #geodata #wordmapix

Some countries have a very uniform name across the whole world
The word “Python” clustered on a world map
Every map gives you the opportunity to learn some interesting insights about word etymologies
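To give an idea of how translations can be grouped into look-alike clusters, here is a small sketch. Note that it does not use Scikit-learn and is not the code from the repo: it swaps in a plain standard-library approach (difflib string similarity plus single-link grouping) just to show the idea. The function names, the 0.6 threshold and the hand-written, romanised sample words are all assumptions made for this example.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def similarity(a: str, b: str) -> float:
    """Character-level similarity between two words, in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def cluster_words(translations: Dict[str, str], threshold: float = 0.6) -> List[List[str]]:
    """Group language codes whose translations look alike.

    Single-link grouping: two languages share a cluster when a chain of
    pairwise similarities >= threshold connects them.
    """
    langs = list(translations)
    parent = {lang: lang for lang in langs}

    def find(x: str) -> str:
        # Follow parent links to the cluster root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(langs):
        for b in langs[i + 1:]:
            if similarity(translations[a], translations[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters: Dict[str, List[str]] = {}
    for lang in langs:
        clusters.setdefault(find(lang), []).append(lang)
    # Largest clusters first, alphabetical inside each cluster.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: (-len(c), c))


# Hand-written sample for the word "tea" (not API output).
sample = {"en": "tea", "de": "Tee", "nl": "thee", "ru": "chai", "hi": "chai", "sw": "chai"}
print(cluster_words(sample))
```

With this sample the "tea" family and the "chai" family land in separate clusters. Each cluster can then be given its own colour when the countries are drawn on the map.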
