Advanced Coloring #1

Open
3 of 4 tasks
rjm11010 opened this issue Jan 11, 2017 · 8 comments

Comments

@rjm11010
Collaborator

Summary

We want to make a more general application of color to the pins.
Right now you're able to apply a color to the Provider column, which has classes Network and GPS.

This is a good start, but in and of itself it isn't good enough for a more thorough analysis of GPS data.
We would like to apply a color to any column!

But let's not get too crazy. We'll start out with a column called class. You will only need to worry about this column when applying colors.

If you want a model example, I would check out BatchGeo with their Group By / Thematic Value option.
[Image: screenshot from 2017-01-11 15-48-16]

How do I do that?

Part of the fun is not knowing, and then figuring it out (LOL).
But, I think I should give you at least a heads up on stuff that may help.

Steps

1: Find Possible Classes in the class Column

For this you need to essentially find all distinct values that are contained in a column.
There are a lot of ways of doing this, but one really easy way is to use an abstract data structure you learned about last semester.

2: Generate Random Colors for each Class

This one may be a little tricky. As you may know, the Google API has a finite set of pre-defined colors to use (I think like 6 of them).

The problem is, what happens when there are more classes than their pre-defined colors? .... Mayhem
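One way around the limited palette is to compute colors instead of picking from a fixed list. The sketch below is only an illustration, not the approach the project settled on: rather than purely random colors (which can land very close to each other), it spreads hues evenly around the color wheel and converts them to hex strings. The names `hslToHex` and `colorsForClasses`, and the saturation and lightness values, are made up for this example.

```javascript
// Convert an HSL color (h in degrees, s and l in [0, 1]) to a "#rrggbb" string.
function hslToHex(h, s, l) {
  const a = s * Math.min(l, 1 - l);
  const channel = (n) => {
    const k = (n + h / 30) % 12;
    const value = l - a * Math.max(-1, Math.min(k - 3, 9 - k, 1));
    return Math.round(255 * value).toString(16).padStart(2, "0");
  };
  return "#" + channel(0) + channel(8) + channel(4);
}

// Hypothetical helper: one color per class, hues spread evenly around the wheel.
// `classes` is an array of distinct class names.
function colorsForClasses(classes) {
  const colors = new Map();
  classes.forEach((name, i) => {
    const hue = (360 * i) / classes.length;
    colors.set(name, hslToHex(hue, 0.7, 0.5)); // 0.7 and 0.5 are arbitrary choices
  });
  return colors;
}
```

With many classes the neighboring hues start to look alike, so this only postpones the mayhem; it does not remove it.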

3: Create a Legend

This is a UI element that shows what colors correspond to what class.
You can do this any way you think best. You can put it in the control panel, or elsewhere, as long as it's constantly visible.

4: Plot the Points with the colors

How to apply these random colors using the Google API (IDK, but maybe this may help)
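As one possible starting point, the Maps JavaScript API lets a marker use a vector symbol as its icon, and the symbol takes a `fillColor`. The sketch below builds the marker options for a colored circle. The helper name `coloredMarkerOptions`, the scale and stroke values, and the sample coordinates are assumptions for illustration; the `google.maps` namespace is passed in as `maps` so the function can be tried outside the browser.

```javascript
// Build the options for one colored marker.
// `maps` is the google.maps namespace, `position` is a { lat, lng } object,
// and `color` is any CSS color string such as "#d92626".
function coloredMarkerOptions(maps, position, color) {
  return {
    position: position,
    icon: {
      path: maps.SymbolPath.CIRCLE, // built-in circle symbol
      scale: 6,                     // arbitrary size for this sketch
      fillColor: color,
      fillOpacity: 1,
      strokeColor: "#000000",
      strokeWeight: 1,
    },
  };
}

// Usage in the browser (assumes the Maps JavaScript API is loaded and `map` exists;
// the coordinates are placeholders):
// new google.maps.Marker({
//   map: map,
//   ...coloredMarkerOptions(google.maps, { lat: 41.8, lng: -72.25 }, "#d92626"),
// });
```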

Checklist

  • Find Possible Classes in a Column
  • Generate Random Colors for each Class
  • Create a Legend
  • Plot the Points with the colors
@joh13010
Owner

So as of right now, strictly based off of the month.csv file, it looks like the only column that wouldn't create an unreasonable number of classes (more than 15 or so) is the configAccuracy column. So I will surely add that to the UI drop-down.

But are there other columns that may be slightly more diverse on other common files (for example, is it common to see a "deviceid" that changes throughout the file, or are they always constant throughout the entirety of the file?) or are there additional columns that are included in other files from the database that aren't included in month.csv that would be more of interest to classify?

Otherwise, I could set ranges for some of the columns that could be made into classes such as for the accuracy column. I could do something like 0-500 is one class, then 501-1000 is another, 1001-1500 another, and so on.
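The range idea can be sketched as a small function that turns a number into a range label, which can then be treated like any other class value. This is only an illustration: the name `rangeLabel` is made up, and it assumes non-negative integer values and the 0-500, 501-1000, 1001-1500 pattern described above.

```javascript
// Hypothetical helper: label a numeric value with a fixed-width range,
// following the 0-500, 501-1000, 1001-1500 pattern (integer values assumed).
function rangeLabel(value, width) {
  if (value <= width) {
    return "0-" + width;
  }
  const index = Math.ceil(value / width) - 1; // 501..1000 -> 1, 1001..1500 -> 2
  return (index * width + 1) + "-" + ((index + 1) * width);
}
```

For example, with a width of 500, an accuracy of 730 falls in "501-1000" and an accuracy of 1001 falls in "1001-1500".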

@rjm11010
Collaborator Author

You're right, configAccuracy would be the only one. The other ones like deviceid, userid, and sensorType will always be the same, so they're not so interesting. It's possible that after some data processing we can add additional columns, which may have more meaningful classes within them. So this generalization will apply to any new columns that may be added.

I like your idea about making ranges (in fact I was going to suggest it, you're good!).
You'll probably get stuck making the function to make an evenly spaced array of integers for that though.
I couldn't find a JavaScript library that already has that implemented. In MATLAB it's called linspace. You can make your own. If you need help, I have a version I made in Python as a quick test (this could be like a small interview question. Got to get ready for those too!).
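For reference, a minimal JavaScript sketch of the idea is below. It is loosely modeled on what linspace does, not a port of the Python version mentioned above, and the behavior for a count of 1 (returning just the start) is a choice made for this example. It does not round, so callers that need integers would round the results themselves.

```javascript
// Return `count` evenly spaced numbers from `start` to `stop`, both included.
function linspace(start, stop, count) {
  if (count < 1) return [];
  if (count === 1) return [start]; // choice made for this sketch
  const step = (stop - start) / (count - 1);
  const values = [];
  for (let i = 0; i < count; i++) {
    values.push(start + i * step);
  }
  return values;
}

// linspace(0, 1500, 4) gives [0, 500, 1000, 1500]
```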

Note: In the dataset I gave you, you may see that deviceid changes, but that's because of a bug we had, which we have since fixed. Don't make changes in your code to accommodate this; just treat it like any other column.

@joh13010
Owner

Okay, so each time a file is uploaded we want to evaluate all columns and check if each one is "classifiable", even if it definitely won't be (like the actual GPS coordinates)? Or should I only worry about the ones that matter (like configAccuracy and the ranges of some others) and add more classification code when new columns are added to the data sets in the future? I feel like it might be tough to make a general function that will apply to any given column if we're working with ranges.

And okay, so deviceid will normally not change throughout a single file, correct?

@rjm11010
Collaborator Author

Okay, I see what you mean.
Let's make this simpler. I'll update the requirements for this after this comment.

Update

You can expect a certain column in the file called class. From there you will extract the classes needed to color. You will only need to color the points based on the different classes in this column.

So this eliminates the need to dynamically find the classes needed for any column.

@joh13010
Owner

Okay so I think I might be a little stuck with the initial step of classifying the different distinct values. To keep it general, I would imagine that I should be using some sort of hash table or hash map that takes a key/value pair. I would assume that the key would be the individual value in the class column and the value would just be the marker, but I am having trouble implementing such a data structure. Would it be best to try to somehow keep an array or linked list at each of the key locations that would store all of the different markers in the case of key collisions, or is there some better way of doing it that I'm not quite seeing? I could probably hardcode a function in that just makes a new array every time it reads a value that has not yet been read, but I feel like this isn't really the best way to do it and I'd rather just get it done right.

@rjm11010
Collaborator Author

The abstract data structure you'll need for this is a Set. JavaScript has its own implementation of this data structure (Set).

You're correct to think using a hash table is an answer to this problem, because the best way (i.e. most efficient way) of creating a set is using a hash table!
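A minimal sketch of using the built-in Set to collect the distinct values of a column is below. The helper name `distinctValues`, the shape of the rows (an array of objects keyed by column name), and the sample values are all assumptions for illustration; the real parsed file may look different.

```javascript
// Hypothetical helper: collect the distinct values of one column.
// `rows` is assumed to be an array of objects keyed by column name.
function distinctValues(rows, column) {
  const seen = new Set();
  for (const row of rows) {
    seen.add(row[column]); // adding a value that is already present is a no-op
  }
  return Array.from(seen); // a Set keeps insertion order
}

// Made-up sample rows, only to show the call.
const sampleRows = [
  { provider: "gps", class: "walking" },
  { provider: "network", class: "driving" },
  { provider: "gps", class: "walking" },
];

const sampleClasses = distinctValues(sampleRows, "class");
// sampleClasses is ["walking", "driving"]
```

There is no need to handle collisions by hand here, since the Set takes care of duplicates internally.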

@joh13010
Owner

So I completed the general classifying feature. As of right now, I only included 5 marker colors, but I can add essentially as many as I want since the colors are decided based on the hex code in the source URL.

However, I have been having a problem displaying the colors properly in the legend. Right now, it shows a text alternative to the marker image and not the image itself. Perhaps there is something wrong with my current formatting but I was unable to pinpoint the issue. Perhaps you might be able to check it over and see a problem with how the legend markers are displayed.

@rjm11010
Collaborator Author

Nice, looking good!

Got it, just pushed it (0fc8467bc0). The function you want to use is appendChild.
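To show the general shape of building a legend with appendChild, here is a rough sketch. It is not the code from that commit: it draws a plain colored square per class instead of the marker image, and the name `buildLegend`, its parameters, and the pixel sizes are made up for illustration. The `document` object is passed in as `doc` so the function does not depend on a global.

```javascript
// Hypothetical helper: fill `container` with one legend row per class.
// `doc` is the page's `document`; `colorByClass` is a Map of class name -> CSS color.
function buildLegend(doc, container, colorByClass) {
  colorByClass.forEach((color, name) => {
    const row = doc.createElement("div");

    // Small colored square standing in for the marker.
    const swatch = doc.createElement("span");
    swatch.style.display = "inline-block";
    swatch.style.width = "12px";
    swatch.style.height = "12px";
    swatch.style.marginRight = "6px";
    swatch.style.backgroundColor = color;

    // Class name next to the square.
    const label = doc.createElement("span");
    label.textContent = name;

    // appendChild attaches real element nodes, not their text.
    row.appendChild(swatch);
    row.appendChild(label);
    container.appendChild(row);
  });
  return container;
}

// Usage in the browser (the element id "legend" is a placeholder):
// buildLegend(document, document.getElementById("legend"), new Map([["walking", "#d92626"]]));
```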

Note: For some reason I always have to subtract an extra 1 in the for loops that go through the data file to get the code to run on my laptop. In those particular for loops in MapScript.js, and for the max time in Slider.js, I subtract an extra 1 from the max or end. I think it's something on my end, but I'm not sure what, because the loops are set up correctly (I think).
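One guess, not confirmed by anything in this thread: if the data file ends with a newline, splitting the text on line breaks leaves an empty last entry, and a loop over every entry would then hit a row with no data. The sketch below drops blank lines before looping. The name `csvLines` is made up, and whether the project reads the file this way is an assumption.

```javascript
// Split raw CSV text into lines, dropping blank ones
// (such as the empty entry left by a trailing newline).
function csvLines(text) {
  return text.split(/\r?\n/).filter((line) => line.trim() !== "");
}

// "a,b\n1,2\n3,4\n".split("\n") has 4 entries (the last one is empty),
// while csvLines("a,b\n1,2\n3,4\n") has 3.
```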
