I have a programming question about the project architecture for geolocated data on a map. My goal is to have a map that displays realtime data (timeframe: days/weeks) as markers. As an example, let's use traffic reports (incidents, jams, blockages, …) around the driver of a vehicle. The map has no bounds, so datapoints can be located anywhere in the world.
The user can pan and zoom the map and see a selection of the markers most relevant to the current bounds. But now I'm wondering how to efficiently load this data initially and update it on panning/zooming.
Therefore, my question is: are there architectures/approaches already documented (papers, packages, documentation) on how to tackle this problem? My main point of concern is the server side of things.
Context
There are too many datapoints to load all markers at once and do the calculations on the client. Clustering or prioritisation would therefore have to be done dynamically on the server side.
Users are able to like/dislike datapoints, which should change the visibility of those datapoints.
Current approach
Client
At first I initialise the map and center it on the user. Based on the current zoom level, I split the map into tiles and calculate which of them are currently in the viewport. Then I make a call to the API to fetch all data relevant to these tiles. On panning, I calculate which tiles have been added to the viewport and selectively load more data on demand.
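A minimal sketch of this client-side step. It assumes the common Web Mercator "slippy map" tiling (2^zoom x 2^zoom tiles per zoom level, as used by OpenStreetMap) rather than my own tile scheme. The names `Bounds`, `tilesInViewport` and `tilesToFetch` are made up for illustration, and viewports crossing the antimeridian are not handled.

```typescript
// Assumption: standard Web Mercator tiling with 2^zoom tiles per axis.
interface Bounds { north: number; south: number; west: number; east: number; }

// Column index of the tile containing the given longitude.
function lonToTileX(lon: number, zoom: number): number {
  const n = 2 ** zoom;
  return Math.min(n - 1, Math.max(0, Math.floor(((lon + 180) / 360) * n)));
}

// Row index of the tile containing the given latitude (row 0 is the north edge).
function latToTileY(lat: number, zoom: number): number {
  const n = 2 ** zoom;
  const rad = (lat * Math.PI) / 180;
  const y = (1 - Math.log(Math.tan(rad) + 1 / Math.cos(rad)) / Math.PI) / 2;
  return Math.min(n - 1, Math.max(0, Math.floor(y * n)));
}

// Keys ("zoom/x/y") of all tiles intersecting the viewport.
// Limitation: assumes west <= east, i.e. no antimeridian crossing.
function tilesInViewport(b: Bounds, zoom: number): string[] {
  const xMin = lonToTileX(b.west, zoom);
  const xMax = lonToTileX(b.east, zoom);
  const yMin = latToTileY(b.north, zoom);
  const yMax = latToTileY(b.south, zoom);
  const keys: string[] = [];
  for (let x = xMin; x <= xMax; x++) {
    for (let y = yMin; y <= yMax; y++) {
      keys.push(`${zoom}/${x}/${y}`);
    }
  }
  return keys;
}

// On panning: only tiles that are visible but not yet loaded need a request.
function tilesToFetch(visible: string[], loaded: Set<string>): string[] {
  return visible.filter((key) => !loaded.has(key));
}
```

After each pan, the client would call `tilesInViewport` for the new bounds, pass the result through `tilesToFetch`, and add the returned keys to the loaded set once the responses arrive.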
For zooming, I'm not yet sure. When zooming out, I can probably keep the existing markers and hide those which are not relevant at the new zoom level. When zooming in, I'll probably have to refetch the data for all tiles completely, to load data for the more detailed zoom levels.
Server
For the server side, I'm honestly a bit unsure at this point in time. I have ideas, but feel like I should not reinvent the wheel.
Priority at zoom levels
My main problem is the calculation of the priority at different zoom levels. Because users can vote on each datapoint, the marker visibilities are pretty dynamic and keep changing every few minutes/hours. Voting affects the marker's own visibility as well as its priority relative to others. Therefore, either the visibility of markers would need to be calculated every time a tile is requested, or an index that stores the visibilities would have to be updated constantly.
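To make the first option (calculating visibility on every tile request) more concrete, here is a rough sketch. The scoring rule (likes minus dislikes), the `minScore` cutoff and the per-tile `limit` are placeholders I made up for illustration, not a settled design.

```typescript
interface Marker { id: number; likes: number; dislikes: number; }

// Hypothetical scoring rule: net votes. A real rule might also weigh age or type.
function score(m: Marker): number {
  return m.likes - m.dislikes;
}

// Markers of one tile that should be shown: drop markers voted below the cutoff,
// then keep the `limit` best ones. Ties are broken by id so the result is stable.
function visibleMarkers(markers: Marker[], limit: number, minScore = -5): Marker[] {
  return markers
    .filter((m) => score(m) >= minScore) // filter() copies, so the input is not mutated
    .sort((a, b) => score(b) - score(a) || a.id - b.id)
    .slice(0, limit);
}
```

With this approach nothing has to be kept in sync when votes change, but every tile request pays for the filtering and sorting. The index-based option would move that cost to the moment a vote is cast.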
Querying of tiles
I currently calculate different sized tiles for different zoom levels. In order to be able to move data between different sized tiles, the tile size is always doubled or halved from one level to the next. The next higher level therefore always combines 2x2 smaller tiles, and the next lower level splits the current tile into 4 parts (2x2).
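The 2x2 relation between levels can be sketched as follows. This assumes tiles are addressed as (z, x, y) and that z grows as tiles get smaller (the slippy-map convention); the `Tile` type and function names are only for illustration.

```typescript
// Assumption: z increases as tiles get smaller, x/y are integer tile indices.
interface Tile { z: number; x: number; y: number; }

// The larger tile that contains `t` (one level coarser).
function parentTile(t: Tile): Tile {
  if (t.z <= 0) throw new RangeError("the root tile has no parent");
  return { z: t.z - 1, x: Math.floor(t.x / 2), y: Math.floor(t.y / 2) };
}

// The four smaller tiles (2x2) that `t` splits into (one level finer).
function childTiles(t: Tile): Tile[] {
  const z = t.z + 1;
  const x = t.x * 2;
  const y = t.y * 2;
  return [
    { z, x, y },
    { z, x: x + 1, y },
    { z, x, y: y + 1 },
    { z, x: x + 1, y: y + 1 },
  ];
}
```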
But I'm not yet sure how to connect lat/long positions to all the different sizes of tiles in the db/queries. Theoretically I could store only the smallest tile for each datapoint, because larger tiles can be calculated from it. But this would lead to large WHERE tileId IN (2x10, 3x10, 4x10, 5x10, ...) queries. Alternatively, I could have one individual column for each tile size and query the column that matches the current tile size, but this would increase the size of each row in the database.
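One thing that follows from the doubling scheme: if only the smallest tile is stored, and it is stored as two integer columns instead of one id, a larger tile always covers a contiguous range of small-tile x and y indices. A rough sketch, where `MAX_ZOOM`, the column names `tile_x`/`tile_y` and the function name are assumptions for illustration:

```typescript
// Hypothetical finest zoom level at which each datapoint's tile is stored.
const MAX_ZOOM = 16;

interface TileRange { xMin: number; xMax: number; yMin: number; yMax: number; }

// Range of finest-level tile indices covered by tile (z, x, y).
// Assumption: z increases as tiles get smaller, and each level halves the tile size.
function finestTileRange(z: number, x: number, y: number, maxZoom = MAX_ZOOM): TileRange {
  if (z > maxZoom) throw new RangeError("z must not exceed the stored zoom level");
  const span = 2 ** (maxZoom - z); // finest tiles per side of the requested tile
  return {
    xMin: x * span,
    xMax: (x + 1) * span - 1,
    yMin: y * span,
    yMax: (y + 1) * span - 1,
  };
}

// The range could then be bound into a query such as (column names hypothetical):
//   WHERE tile_x BETWEEN :xMin AND :xMax AND tile_y BETWEEN :yMin AND :yMax
```

This would replace the long IN list with two range conditions and needs no extra column per tile size. Whether it performs well enough depends on the database and its indexes, which I have not tested.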
Environment
- API: PHP, Laravel, (maybe Elasticsearch)
- Client: Angular, Google Maps / Apple Maps / OpenStreetMap
Conclusion
So far, this is my status quo. Currently I feel like I could start and build a system that would work, but it would reach its limits quickly.
And I feel like this is a relatively common problem, and I don't want to reinvent the wheel. I was wondering if there is any information on how the big guys solved this, like:
- Google Maps
- Apple Maps
- OpenStreetMap
- ...
They have integrated location markers which are rendered based on zoom levels. Most of their places are less dynamic, but at some point in time their visibility needs to be calculated. And data needs to be queried constantly.
Therefore, I would really appreciate any input that could help me expand my concept and build a reasonable architecture. Preferably one that is not too complex to set up initially, but can be extended / scaled on demand, in case more datapoints or datapoint types are added.
Everything helps:
- Keywords to search the web with - I have already done a lot of research, but feel like there are resources which I have not been able to find so far
- Concepts to tackle this problem
- Packages or services that provide help (preferably open source and free to use)