Wednesday, March 21, 2012

Trade Area Analysis - Thiessen Polygons

Authors: Safwan Rahman, Chris Fealy


Introduction


A trade area analysis is performed to understand the trade zones of a particular business. There are a number of methods for performing a trade area analysis; this project focused on a Thiessen polygons approach.

Essentially, a trade area shows a set of information within a particular boundary, designed and manipulated by the author. The information presented in these zones could include household income levels, the education levels of the households within the zone, and so on.

With the Thiessen polygons approach, we took each retail store and created a larger border around it, signifying its "trade zone". The edges of a Thiessen polygon lie at the midway point between one retail store and the next, so every location inside a polygon is closer to that polygon's store than to any other store.
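To make the idea concrete, here is a minimal Python sketch of the rule behind Thiessen polygons: a location belongs to the trade zone of whichever store is nearest. The store names and coordinates are made up for illustration, and distances are plain straight-line distances on a flat plane. This is not the ArcMap tool we used, only the underlying principle.

```python
import math

def nearest_store(point, stores):
    """Return the name of the store closest to `point` (straight-line distance)."""
    return min(stores, key=lambda name: math.dist(point, stores[name]))

# Hypothetical store locations on a flat x/y grid (e.g. kilometres).
stores = {"A": (0.0, 0.0), "B": (10.0, 0.0), "C": (0.0, 10.0)}

print(nearest_store((2.0, 1.0), stores))  # A
print(nearest_store((6.0, 1.0), stores))  # B
```

Every point that returns "A" lies inside store A's Thiessen polygon, and the polygon edges are exactly the places where two stores tie for nearest.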


Study Area


The study area for this project is the Greater Vancouver Area, British Columbia, Canada.


Methodology
  1. The first step in conducting a trade area analysis is identifying a retail store. I will leave it unnamed in this post and refer to it from now on as "retail store". All of the retail stores in the Greater Vancouver Area were identified first. This can be done in several ways. We were given the points, but a simple alternative is to conduct your own research: go to the retail store's website and look up all of its stores in your study area. You can then digitize them by whatever means you prefer. I recommend Google Maps, followed by a conversion to KML (tutorial coming soon).
  2. For the next step, we found boundary information (census tracts) through the Statistics Canada geodatabase. Each census tract already had the demographic information needed. In this project we were concerned with:
    1. Average Household Income
    2. Population Age 15-19
    3. Population Age 20-24
    4. University degrees attained
  3. Next, we took each retail store and applied Thiessen polygons around them, using the Thiessen Polygons tool in ESRI ArcMap. Each Thiessen polygon overlaps census tract boundaries, so the census tracts were divided among the Thiessen polygons. Some census tracts fell within two different Thiessen polygons; for these, we took the percentage of the tract that lies in each polygon. The demographic information in the census tracts was then totaled for each Thiessen polygon, except for Average Household Income, which was averaged in each polygon.
  4. Using our cartographic skills, we created a color-coded map to display all the demographic information.
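The apportioning in step 3 can be sketched in a few lines of Python. This is only an illustration with invented numbers: `share` is the fraction of a census tract that falls inside one Thiessen polygon, counts are summed in proportion to that share, and the income average is weighted by the same share. That weighting is an assumption made for this sketch; the post does not spell out how the average was weighted.

```python
def aggregate_polygon(pieces):
    """Combine census-tract pieces that fall inside one Thiessen polygon.

    Each piece is a dict with:
      share       - fraction of the tract inside the polygon (0 to 1)
      pop_20_24   - tract population aged 20-24 (a count, so it is summed)
      avg_income  - tract average household income (averaged, not summed)
    """
    total_pop = sum(p["share"] * p["pop_20_24"] for p in pieces)
    total_share = sum(p["share"] for p in pieces)
    # Assumption: average income is weighted by the tract share.
    avg_income = sum(p["share"] * p["avg_income"] for p in pieces) / total_share
    return {"pop_20_24": total_pop, "avg_income": avg_income}

# Hypothetical tracts: one fully inside the polygon, one half inside.
pieces = [
    {"share": 1.0, "pop_20_24": 400, "avg_income": 60000},
    {"share": 0.5, "pop_20_24": 300, "avg_income": 90000},
]

print(aggregate_polygon(pieces))
# {'pop_20_24': 550.0, 'avg_income': 70000.0}
```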
Results & Analysis

Below are the maps created for each of the variables. 





In all of the preceding maps, the most marketable zones are circled in yellow. These zones are optimal locations to focus on for any kind of marketing purpose. For example, the degrees attained map shown above highlights the S. Kensington area. If your target customers are people with high levels of education, the zone circled above works well, as it is in a central location, close to high-traffic retail centers (Tier 1 and 2 malls, meaning over 100 shops), and on major roadways.

To conclude, this trade area analysis is just one example of how businesses can leverage GIS technology to find optimal marketing locations. There are many other methods of conducting a trade area analysis, which I shall post about in the near future.


Mapping and Assessing Fire Vulnerability - Alberta & British Columbia: national parks region

Authors: Safwan Rahman; Thomson Lau; Brent Manza; Nicholas Marcheggiani




Abstract


Vulnerability to fire in Banff National Forest
Assessing forest fire vulnerability helps fire hazard specialists focus on the areas of highest importance, so that mitigation efforts can be applied thoroughly and efficiently. A regression-based methodology showed that land cover was the most significant factor influencing a region's vulnerability to fire, while temperature, slope and elevation together accounted for about 30% of the significance in influencing fire risk. These four variables were used as parameters in creating vulnerability maps for the Alberta and British Columbia national parks region. Finally, a secondary study area, Wabakimi Provincial Park in Northern Ontario, Canada, was used to evaluate our methodology’s credibility. The results showed consistency in our method.

Introduction



Forest fires are a growing concern around the world, with negative human and ecological implications. Canada experiences, on average, 9,100 fires each year; it is interesting to note that only 2% of fires account for 98% of the total burned area. While the majority of fires are small in magnitude, it currently costs Canada 300 million dollars each year to deal with the damage (Wagner, n.d.).

Current research in forest fire vulnerability relies on subjective judgment when ranking the factors that affect the risk of fire. This research project aims to identify a quantitative approach to assessing fire vulnerability. Banff National Park in Alberta, Canada is used for this research.

The research approach leverages a regression analysis and Saaty's analytical hierarchy process to identify and justify the relation between physical factors in a forest and its vulnerability to fire.

Study Area


The study area chosen for this research is Banff National Park in Alberta, Canada. It was chosen because resources were readily available. Once the research was completed, the model created (described under the Methodology section) was applied to Wabakimi Provincial Park in Ontario, Canada for further research and application.
Satellite image of Wabakimi Forest


Map of Banff showing historical occurrences of forest fires by area of coverage (in hectares).
 
Data Sources

It is important to keep track of data sources for intellectual property and other organizational purposes. These data sources are provided to you so you may find the data as you desire for your own projects. Note that some sources are open source (free) and some are paid.


Variables

The following is the list of variables used to determine fire risk. Each of these factors carries some weighted value in causing vulnerability to fire.

  1. Temperature
  2. Slope
  3. Elevation
  4. Land cover type
    1. Coniferous
    2. Deciduous
    3. Mixed wood
    4. Shrubs
Methodology

We identified three distinguishable sub-processes: the crux of the first was a regression analysis, the second involved a complete work-through of an analytical hierarchy process, and the third involved careful manipulation of a set of rasters and summing them up. The regression analysis indicated the level of importance of each variable contributing to fire risk. Saaty’s matrix was the core of the analytical hierarchy process, which ultimately produced the final weights of each variable.

Finally, all input variable rasters were reclassified and rescaled using ESRI’s ArcGIS software so that they shared a common scale and could be summed together to produce the final fire vulnerability map.


Regression Analysis


The regression analysis took each variable and plotted it against the size of the forest fire. When plotted against the size, we were able to determine how much a certain variable had an effect on the size of the fire. The graph below shows an example of the plotting.

OLS regression is used specifically to find the parameters a and b, which are the regressed line’s intercept and slope, respectively. The regression can take any number of input factors as its independent (x-axis) variables, which in our case are the fire-determining variables (Statistics.com, 2010). Our study involved seven b parameters for the seven variables defined. Once data was collected for each variable, they were all plotted against the fire size and an OLS regression estimator was produced. The equation below shows the OLS regression equation.
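For readers who want to see what an OLS fit does, here is a minimal Python sketch for a single variable. Our study used seven variables at once; this one-variable version only illustrates how the intercept a and slope b are obtained. The slope and fire-size numbers are invented for the example and are not our data.

```python
def ols_fit(xs, ys):
    """Fit y = a + b*x by ordinary least squares; return (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # b = covariance(x, y) / variance(x); a follows from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical observations: terrain slope (degrees) vs. fire size (hectares).
slope_deg = [5.0, 10.0, 15.0, 20.0]
fire_size_ha = [12.0, 22.0, 32.0, 42.0]

a, b = ols_fit(slope_deg, fire_size_ha)
print(a, b)  # 2.0 2.0
```

In this toy data every extra degree of slope adds 2 hectares of fire size, which is what the b coefficient reports.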


The variables listed above include both topographic and land cover variables. The variable names are clearly listed and defined in the table below.




The following are the results of performing the regression analysis on all the variables. The table below shows the coefficients, and it is followed by our fitted OLS regression equation.



From the table and equation, we can conclude that the land cover types carry significantly higher weights towards vulnerability to fire. Since we wanted to use these coefficients only as indicators to test the weight of each variable, it was necessary to conduct a separate regression on only temperature, slope, and elevation.

The following were the results:

Analytical Hierarchy Process - Saaty's Matrix

An analytical hierarchy process (AHP) is a system for solving complex decision-making situations. In our study, it is used to decide the significance of each of the seven variables by applying weights to them, with the weights adding up to 100%. Thomas L. Saaty developed a method for applying the AHP and arriving at appropriate weights. His method involves setting up a matrix that pairs each variable with every other variable; each pair is compared and their relative level of importance is identified (Dyer, 1990, p. 249).

To perform Saaty’s AHP, we used an automatic matrix calculator to overcome time and difficulty constraints. The Canadian Conservation Institute provides a free-to-use Saaty’s matrix generator (Canadian Conservation Institute, 2005). We used the beta coefficients obtained from the regression analysis to decide the level of importance of each variable relative to the others.

The matrix decision making was carried out in two separate steps. The first step evaluated the topographic variables alongside land cover treated as a single variable, so the first matrix included only four variables. The goal was to work out how much land cover accounts for compared with the other three variables on a scale of 100%. The second step required us to evaluate only the four land cover variables against one another.
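As a rough illustration of what an AHP calculator does with a pairwise comparison matrix, here is a short Python sketch. It uses the geometric-mean approximation of the weights, which is a common shortcut; whether the Canadian Conservation Institute tool uses this or the full eigenvector calculation is not something we verified. The judgments in the matrix are hypothetical and are not the ones from our study.

```python
import math

def ahp_weights(matrix):
    """Approximate AHP weights from a pairwise comparison matrix.

    matrix[i][j] says how much more important criterion i is than j
    (so matrix[j][i] should be its reciprocal). Uses the geometric mean
    of each row, normalised so the weights sum to 1.
    """
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

criteria = ["land cover", "temperature", "slope", "elevation"]

# Hypothetical pairwise judgments on Saaty's 1-9 scale.
pairwise = [
    [1.0,   5.0,   7.0,   9.0],
    [1 / 5, 1.0,   3.0,   5.0],
    [1 / 7, 1 / 3, 1.0,   3.0],
    [1 / 9, 1 / 5, 1 / 3, 1.0],
]

weights = ahp_weights(pairwise)
for name, w in zip(criteria, weights):
    print(f"{name}: {w:.1%}")
```

The printed percentages add up to 100%, with the criterion that wins most of its pairwise comparisons receiving the largest share.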

Sample of AHP:


The results show the final weights of each variable as determined by the automatic AHP calculator. Land cover accounts for about 70% of forest fire risk, while temperature has the second-highest weight, although it is significantly lower.


The following shows the results of evaluating the land cover types against one another.



Raster Creation

The final process in our methodology required us to reclassify all our input rasters so they would be on a common scale. All the rasters needed to be on a common scale in order to be summed together; otherwise the summed values would ultimately be meaningless. For example, suppose we left the DEM showing elevation on a different scale from slope, where slope has values ranging from 0 to 30 and the DEM has values from 100 to 3500: the DEM values would be far overstated.

According to our weights, elevation received the smallest weight, so this overstatement would introduce a significant inaccuracy in our final map product.

The input rasters for the topographic variables were reclassified to a scale of 0 to 10. We chose a scale of 0 to 10 because a small range makes it easier for fire hazard specialists, at whom our product is targeted, to assess the areas that need the most focus. Regions with a value closer to 0 represent areas of low fire vulnerability, and values closer to 10 show higher vulnerability. Table 8 shows how we reclassified slope, elevation and temperature on a 0 to 10 scale.
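The effect of putting rasters on a common scale can be sketched in Python as below. This uses a simple linear stretch from each variable's minimum and maximum, which is an assumption for illustration; our actual reclassification followed the class breaks in the table, and we did it in ArcGIS rather than in code. The cell values are invented, using the slope and elevation ranges mentioned above.

```python
def rescale_0_10(values, low, high):
    """Linearly rescale values from the range [low, high] to [0, 10]."""
    return [10.0 * (v - low) / (high - low) for v in values]

# Hypothetical cell values for two rasters with very different ranges.
slope = [0.0, 15.0, 30.0]             # degrees, range 0-30
elevation = [100.0, 1800.0, 3500.0]   # metres, range 100-3500

print(rescale_0_10(slope, 0.0, 30.0))          # [0.0, 5.0, 10.0]
print(rescale_0_10(elevation, 100.0, 3500.0))  # [0.0, 5.0, 10.0]
```

After rescaling, a mid-range slope and a mid-range elevation both read as 5, so neither variable dominates the sum just because of its units.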


Reclassified raster values
The first part of the process defined land cover to have a weight of 70.01%, according to table 6. This means that all the values in table 7 had to be rescaled to be 70.01% of the four variable weighting scheme. Therefore, the 50% coniferous value had to be rescaled to 70%, since it was the land cover variable with the largest weight. This is done using a scaling factor derived by dividing . The common scaling factor is 1.357. Table 9 below shows the new rescaled values for land cover type.

Equation for final vulnerability map
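The idea of summing weighted rasters can be sketched in Python as follows. This is a generic weighted overlay, not our exact equation. The weights are hypothetical apart from land cover being roughly 70%, and they are only chosen to respect the ordering described above (temperature second, elevation smallest). The tiny one-row rasters are invented and assumed to be on a 0 to 10 scale already.

```python
def weighted_overlay(rasters, weights):
    """Sum rasters cell by cell, multiplying each by its weight.

    rasters: dict of name -> 2D list of cell values (all the same shape)
    weights: dict of name -> weight (expected to sum to 1)
    """
    names = list(weights)
    rows = len(rasters[names[0]])
    cols = len(rasters[names[0]][0])
    return [
        [sum(weights[n] * rasters[n][r][c] for n in names) for c in range(cols)]
        for r in range(rows)
    ]

# Hypothetical weights; only the ~70% for land cover comes from the text.
weights = {"land_cover": 0.70, "temperature": 0.15, "slope": 0.10, "elevation": 0.05}

# Hypothetical 1x2 rasters, already on a 0-10 scale.
rasters = {
    "land_cover": [[10.0, 2.0]],
    "temperature": [[8.0, 4.0]],
    "slope": [[6.0, 0.0]],
    "elevation": [[2.0, 10.0]],
}

result = weighted_overlay(rasters, weights)
print([[round(v, 2) for v in row] for row in result])  # [[8.9, 2.5]]
```

The first cell scores high mainly because of its land cover value, even though its elevation score is low, which mirrors how land cover drives the final map.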
Results

The figure to the right shows the final vulnerability raster map of Banff National Park. Note that the highest vulnerability value is 9.5 and the lowest is 0.35. As the map indicates, the most vulnerable regions are predominantly in the north, and most of them are situated on coniferous land. This is in accordance with our beta values and our weighting scheme.

Below you can see two maps. One shows fire sites, and the other shows fire sites along with camp sites. You can see from the images how close camp sites are to locations where fires have occurred. There is no definite correlation, but one can make assumptions (at one's own risk).






Now, with our newly created model, we apply it to Wabakimi Provincial Park.


It would, however, be tedious to repeat all the aforementioned steps every time we wanted to find the vulnerability to forest fires elsewhere. Hence we designed our own model, shown below, using ESRI ArcMap's ModelBuilder. In this model, as long as the desired rasters are entered into the input sections, the result should be a map such as the one shown above.


References

Akpinar, E., Usul, N. GIS in Forest Fires. ESRI. Retrieved 23 May 2010 <http://proceedings.esri.com/library/userconf/proc05/papers/pap1052.pdf>.

Analytical Hierarchy Process (AHP) Program. Canadian Conservation Institute. 13 May 2005. Retrieved on June 29

Barbosa, M.R., Seoane, J.C.S., Buratto, M.G., Dias, L.S.O, & Raival, J.P.C., Martins, F.L. (2009). Forest fire alert system: a geoweb gis prioritization model considering land susceptibility and hotspots-a case study in the carajas national forest, Brazilian Amazon. International journal of geographic information science, 24(6), 873-901.

Chuvieco, E., & Congalton, R.G. (1989). Application of remote sensing and geographic information systems to forest fire hazard mapping. Remote sensing of environment, 29, 147-159.

Dyer, James S. Remarks on the Analytical Hierarchy Process. Management Science. Vol 36, No 3. EbscoHost. March 1990. <http://web.ebscohost.com/ehost/pdfviewer/pdfviewer?vid=2&hid=9&sid=4e27ed13-140d-4763-a8fe-bc2edeac60d9%40sessionmgr13>.

Flannigan, M.D., Stocks, B.J., & Wotton, B.M. (2000). Climate change and forest fires. The science of the total environment, 262, 221-229.

Jaiswal, R.K., Mukherjee, S, Raju, K.D., & Saxena, R. (2002). Forest fire risk zone mapping from satellite imagery and gis. International journal of applied earth observation and geoinformation, 4, 1-10.

Madry, Scott; Cole, Mathew L.; Siebel, Scott. Archaeological Predictive Modeling: Method and Theory. Retrieved 25 June 2010 <http://www.informatics.org/Poster_SEAC-05.pdf>.

Ordinary Least Squares Regression. Statistics.com. 2010. Retrieved on 20 July 2010 <http://www.statistics.com/resources/glossary/o/olsregr.php>.

Wagner, C.V. (n.d.). The Canadian encyclopaedia: forest fire. Retrieved from http://www.thecanadianencyclopedia.com/index.cfm?PgNm=TCE&Params=a1ARTA000290



Saturday, March 17, 2012

Easy Method to Finding Coordinates

Here is a quick and easy way to find geographical coordinates when needed. Sometimes you may be given the task of finding the location of specific coordinates (e.g. -79.387014, 43.642768), or of determining the coordinates of a certain location (e.g. the coordinates of Mt Everest).

Here are a few easy steps using my favorite, Google Maps (because it is free!) to find the coordinates (in decimal degrees) of a specific location.

1. Find a location you are interested in using Google Maps. I have chosen the Empire State Building.


2. Right click on your desired location and click "What's here?"


3. Check the Google search bar and your coordinates will be displayed. NOTE: Google Maps displays the coordinates as latitude, longitude, which is the opposite of the longitude, latitude (x, y) order used in the example at the top of this post and by many GIS tools. 40.748403, -73.985549 should therefore be read as -73.985549, 40.748403 in longitude, latitude order, meaning 73.985549 degrees west and 40.748403 degrees north. A negative longitude is west and a positive one is east; likewise, a negative latitude is south and a positive one is north.



Now, if you would like to find where specific coordinates are, just place the coordinates in the search bar shown above. Remember to place the north/south coordinate (latitude) first, and then the east/west coordinate (longitude).
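If you copy coordinates out of Google Maps often, the swap can be automated. Here is a small Python sketch; the function names are my own and not part of any Google tool. It takes the "latitude, longitude" text as Google Maps shows it and returns the pair in longitude, latitude order, plus a readable description with N/S and E/W.

```python
def google_to_lon_lat(text):
    """Convert 'lat, lon' text (Google Maps order) to a (lon, lat) tuple."""
    lat_str, lon_str = text.split(",")
    lat, lon = float(lat_str), float(lon_str)
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("latitude or longitude out of range")
    return lon, lat

def describe(lon, lat):
    """Describe a coordinate pair using N/S and E/W instead of signs."""
    ns = "N" if lat >= 0 else "S"
    ew = "E" if lon >= 0 else "W"
    return f"{abs(lat)} {ns}, {abs(lon)} {ew}"

lon, lat = google_to_lon_lat("40.748403, -73.985549")
print((lon, lat))          # (-73.985549, 40.748403)
print(describe(lon, lat))  # 40.748403 N, 73.985549 W
```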

Saturday, March 10, 2012

Calculating Distances Topographically in 5 Easy Steps

Have you ever wanted to calculate a distance from a bird's-eye view? That would mean finding the distance from one point to another in a straight line.

Here is a simple way to calculate distances using one of my favorites, Google Maps. I can use this tool for various objectives.

For example, last month I, for some reason, really wanted to know the distance a boat would have to travel from one island to the next. Or a swimmer, of course. Here's what I did.

Here are 5 easy steps:

1. Sign in to Google with your personal Google account and then go to Google Maps. To satisfy my curiosity, I decided to find out the approximate distance between the westernmost point in Alaska and Russia's easternmost point. I wonder if you can swim the distance.


Note: The feature I am about to show will not work if you do not have a Google/Gmail account. 





  As you can see from above, I have a Google Maps satellite view of Alaska and Russia.

2. On the pane to the left, you should see an option that says "My Places". You will only see this if you are signed in to a Google account. Click on "My Places".


3. Once you have done this, you will see a "Create Map" option in a light red box. Click this. It will give you the option to create a title and description for your map; you may skip this. Essentially, this lets you perform your own "light" GIS functions in Google Maps, such as digitizing, drawing lines, creating points, etc.


4. Now, on the top left corner of the map, click on the "line" feature. This is the third option, next to the "point" feature which appears as a blue balloon. The point feature is between the line feature and the "pan" tool, which is the icon of a hand.

Ensure the "line" feature is actually set to the "draw a line" option.




5. Now that you are on the "draw a line" feature, you can calculate the distance between your two points of interest. Click on an original point of interest. The blue line will now begin. You may now drag this line to the next point of interest. In my case, I dragged it from the western tip of Alaska to the "Little Diomede Island". You will now notice the distance is automatically displayed as you drag the line. This is how you can find the distance quickly and efficiently between two points.




To my surprise, I see that Alaska and Russia are extremely close together, only about 86 kilometers apart. Hmm, I wonder how border security is? My my, the tension.


It is important to note some limitations. A large one to factor in is that you should be careful when using this to measure the distance between points of interest that are very far apart. This should be an obvious one, as the Earth is round. The Google Maps display is drawn in a Web Mercator projection, which stretches distances more and more as you move away from the equator, so a straight line on the map is not the same as the shortest path over the globe. No projection can fully avoid this, as there is always some sort of distortion in flat cartographic presentations of our Earth.
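If you need a distance that accounts for the Earth's curvature, one standard option is the haversine formula, which gives the great-circle distance between two latitude/longitude points. Here is a minimal Python sketch. It assumes a perfectly spherical Earth with a mean radius of 6371 km, so it is still an approximation, and it is not what Google Maps itself runs as far as I know.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# One degree of longitude along the equator.
print(round(haversine_km(0.0, 0.0, 0.0, 1.0), 2))  # 111.19
```

Note that the function takes latitude first and longitude second, the same order Google Maps displays.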

Well I hope this was an interesting free tool to learn. Good luck.

Friday, March 2, 2012

What is the difference between a vector and raster?

Another pair of common terms you hear in the GIS field is "vector" and "raster" images.

What is the difference? I'll try to keep it as concise as possible.

Raster:
A raster is a set of pixels (dots) arranged in a grid, where each pixel contains its own value. That value can represent any kind of information. To make it easier to understand, think of each pixel as containing a value that corresponds to a color, so each pixel is a color. Now think of these pixels arranged in an organized grid: when you look at the grid from a distance, you see a collection of colors.
Below is a grid in which each pixel is either blue or has no color. You can see that the arrangement makes up an image of a fish.
 
Now look at the image from a distance: it looks like a fish. Bring your face closer to the screen, or make the image larger, and it is going to look a little distorted. This is one of the disadvantages of raster images. Because everything is set in pixels, if you blow the image up from a small sheet of paper to a large one, it will look awful.

Vector:
A vector image is a set of mathematical information arranged to make up the intended image. To make it simpler, let's say you have an equation that describes a line. Imagine you manipulate a set of such equations to make the lines curve the way you want. Collected together, these lines make an image.

Since each line is defined by a mathematical equation, if you wanted to enlarge your image several times over, say from an 8x11 inch sheet of paper to a 60x42 inch poster, the image will look just as sharp. This is because the mathematical equation is redrawn on the paper at whatever size is needed, and the proportions stay the same.
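The difference shows up clearly in a toy Python example. The raster is a small grid of 0s and 1s, and the vector is a line described by its end points; both are invented for illustration. Enlarging the raster just repeats each pixel, so the blocks get bigger, while enlarging the vector only multiplies its coordinates and it can be redrawn cleanly at the new size.

```python
def upscale_raster(grid, factor):
    """Enlarge a raster by repeating every pixel `factor` times each way."""
    return [
        [cell for cell in row for _ in range(factor)]
        for row in grid
        for _ in range(factor)
    ]

def scale_line(points, factor):
    """Enlarge a vector line by multiplying its coordinates."""
    return [(x * factor, y * factor) for x, y in points]

raster = [[0, 1],
          [1, 0]]
for row in upscale_raster(raster, 2):
    print(row)
# [0, 0, 1, 1]
# [0, 0, 1, 1]
# [1, 1, 0, 0]
# [1, 1, 0, 0]

line = [(0, 0), (1, 2)]
print(scale_line(line, 10))  # [(0, 0), (10, 20)]
```

The enlarged raster has no new detail, only bigger squares, whereas the enlarged line is still exactly two end points.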