From mn/ifi/camsem
Revision as of 30 June 2010, 14:38, by Larsereb@uio.no

Tutorial 3: Getting a nice view on gender

We feel that Anzo on the Web lacks some features needed to make the best graphs for statistics. For instance, say we want a stacked graph of sick leave, with men in one color and women in another, something like this (figure taken from www.anychart.com):

Tutorial 3 bilde 1.jpg

With the way we have built our ontologies so far, we have not been able to show graphs like this for women's and men's sick leave. We have figured out a workaround, and this tutorial shows step by step how to make columns like those above.

Step 1: Manipulating the Excel spreadsheet
The first step is to restructure the Excel spreadsheet so that we can upload the data in a meaningful way. Instead of letting gender be one of the values in a column, we make a new row for each gender (women, men, both).

This time we use only data from 2008 for the region Nord-Norge (Northern Norway).

Old way:

Tutorial 3 bilde 2.jpg
New way:

Tutorial 3 bilde 3.jpg

As you can see, this gives us half the number of columns but two more rows. The reason for doing it this way will become apparent shortly.
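To make the idea concrete, here is a minimal Python sketch of the reshaping. It is only an illustration under assumptions: the field names, the municipality labels, and the figures are all made up, and the orientation of the real spreadsheet is not reproduced here. The point is just that each gender's figure ends up in a field of its own instead of being tagged by a gender value.

```python
from collections import defaultdict

# Hypothetical "old way" records: gender is a value in its own field.
# Labels and sick-leave figures are made up for illustration.
old_records = [
    {"municipality": "Municipality A", "quarter": 1, "gender": "Women", "sick_leave": 8.0},
    {"municipality": "Municipality A", "quarter": 1, "gender": "Men", "sick_leave": 5.0},
    {"municipality": "Municipality B", "quarter": 1, "gender": "Women", "sick_leave": 7.5},
    {"municipality": "Municipality B", "quarter": 1, "gender": "Men", "sick_leave": 4.5},
]


def split_by_gender(records):
    """Merge the per-gender records of each municipality and quarter into one record."""
    merged = defaultdict(dict)
    for rec in records:
        key = (rec["municipality"], rec["quarter"])
        # The gender value becomes part of the field name.
        merged[key][f"{rec['gender']} on sick leave"] = rec["sick_leave"]
    return [
        {"municipality": municipality, "quarter": quarter, **values}
        for (municipality, quarter), values in merged.items()
    ]


new_records = split_by_gender(old_records)
for rec in new_records:
    print(rec)
```

Each resulting record carries one value for women and one for men, which is what lets a chart treat them as two separate series.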

Step 2: Updating the sick leave ontology
Now we must adjust the ontology to reflect the new layout of the spreadsheet. This is a simple task: we take the ontology from Tutorial 2 and add two properties to the class named Sick Leave: “Men on sick leave” and “Women on sick leave”. The range of both should be Double. You can also delete the gender property, since we won’t use it any more.

Tutorial 3 bilde 4.jpg
Step 3: Upload your data to the Anzo server
Upload the data on Kindergarten coverage and sick leave. This task should be familiar by now, so we won’t go into details.

Step 4: Creating the view in Anzo on the Web

  • Log in to Anzo on the Web.
  • Create a new view.
  • Add the correct data set.
  • Add the data types “Kindergarten coverage” and “Sick leave”.

  • Create a new lens
  1. Make a lens from the chart template
  2. Give it an appropriate name
  3. Add three series:
    i. One about Kindergarten coverage
      a. Use Kindergarten Coverage -> Municipality, County, Name as label
      b. Use Kindergarten Coverage -> Kindergarten Coverage as value
      c. Use the same property as in step a as Group by
      d. Select Avg instead of Value in the drop-down box
      e. Select Line as type in the Plot tab
    ii. One about women's sick leave
      a. Use Sick leave -> Municipality, County, Name as label
      b. Use Sick leave -> Women on sick leave as value
      c. Use the same property as in step a as Group by
      d. Select Avg instead of Value in the drop-down box
      e. Select Bar as type in the Plot tab
    iii. One about men's sick leave
      a. Basically the same as for women, using Men on sick leave as value

  • Now, we want to stack the bars.
  1. In the Properties-dialog select Axes
  2. In the Y Axis, select Stacked as scale-mode
  3. Press Save
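As a rough illustration of what these settings amount to, the sketch below averages a value per municipality (our reading of Avg combined with Group by) and then stacks the men's bars on top of the women's. It is not how Anzo on the Web computes anything internally; the function names, field names, and figures are made up.

```python
from collections import defaultdict

# Made-up quarterly records; "women" and "men" stand in for the two properties.
records = [
    {"municipality": "A", "women": 8.0, "men": 4.0},
    {"municipality": "A", "women": 6.0, "men": 6.0},
    {"municipality": "B", "women": 9.0, "men": 3.0},
]


def average_by(records, group_key, value_key):
    """Average value_key over all records that share the same group_key."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rec in records:
        sums[rec[group_key]] += rec[value_key]
        counts[rec[group_key]] += 1
    return {group: sums[group] / counts[group] for group in sums}


def stack(series_list):
    """Turn a list of {label: value} series into (bottom, top) bar segments."""
    running = defaultdict(float)  # height reached so far for each label
    stacked = []
    for series in series_list:
        segments = {}
        for label, value in series.items():
            segments[label] = (running[label], running[label] + value)
            running[label] += value
        stacked.append(segments)
    return stacked


women = average_by(records, "municipality", "women")
men = average_by(records, "municipality", "men")
women_bars, men_bars = stack([women, men])
print(women_bars)
print(men_bars)
```

In a stacked chart each men's bar starts where the women's bar for the same municipality ends, so the total height shows the sum of the two averages.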

Following all these steps should give you this graph:

Tutorial 3 bilde 8.JPG
Quite similar to the one we found at www.anychart.com, as you can see.

Next big problem!!

Now we have a nice view of the correlation between sick leave and kindergarten coverage. For now, the correlation seems to be practically non-existent, but we of course have too little data to draw any conclusions. (These tutorials are not meant to be exhaustive enough to say anything meaningful; they only show the steps for building nice graphs that could say something meaningful.)

But what if we want to look at just one quarter at a time, and only for one year? Can we do this?

  • Let's add a filter. The title could be "Quarter". The property should be Sick leave -> Year and quarter -> Quarter.
  • Select a quarter from the new filter on the left-hand side, for instance quarter number 3. The graph will now look like this:

Tutorial 3 bilde 9.JPG

As you can see, the data for Kindergarten coverage is missing! The filter removes everything that is not related to the Municipality-Quarter.

We do not have a solution for this! The problem we did solve is a similar one: filtering on counties while showing municipalities. That solution is described in Tutorial 4.
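One way to picture what happens is sketched below in Python. This is an assumed model of the behaviour we observe, not Anzo's actual implementation: the filter is applied to every series in the view, and records that have no quarter at all are dropped along with those from the wrong quarter. The field names and figures are made up.

```python
# Made-up records. Only the sick-leave records carry a quarter;
# kindergarten coverage is assumed here to be recorded without one.
sick_leave = [
    {"municipality": "A", "quarter": 3, "women": 8.0, "men": 5.0},
    {"municipality": "A", "quarter": 4, "women": 7.0, "men": 4.0},
]
kindergarten = [
    {"municipality": "A", "coverage": 90.0},
]


def apply_filter(records, prop, wanted):
    """Keep only records whose property equals the wanted value.

    Records that lack the property altogether are dropped as well.
    """
    return [rec for rec in records if rec.get(prop) == wanted]


filtered_sick_leave = apply_filter(sick_leave, "quarter", 3)
filtered_kindergarten = apply_filter(kindergarten, "quarter", 3)
print(len(filtered_sick_leave))    # one sick-leave record is left
print(len(filtered_kindergarten))  # no kindergarten records are left
```

Under this model the Kindergarten coverage line disappears because none of its records can match a quarter.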

Other problems

Tutorial 3 bilde 10.JPG

The stacking of values in Anzo on the Web is based on the type of plot. That is why we can stack the bars and still show a nice line above them. But when we stack some bars and want to show two lines, the lines get stacked as well! (See first thumbnail.)

We can avoid this problem for two values, for instance by using a marker for one series of data and a line for another. (See second thumbnail.)
Tutorial 3 bilde 11.JPG

However, we cannot solve this in the long run as we add more and more series of data. But perhaps it isn't that appealing to have so much information in one graph anyway? :-)
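The behaviour described in this section can be modelled roughly as follows. The sketch assumes that a separate running total is kept for each plot type, which matches what we see in the thumbnails but is our guess, not documented Anzo behaviour. The series names and values are made up, and only a single municipality is considered.

```python
from collections import defaultdict


def stack_by_plot_type(series_list):
    """Return the top of each series when stacking is done per plot type.

    series_list holds (name, plot_type, value) tuples for one municipality.
    """
    running = defaultdict(float)  # one running total per plot type
    tops = {}
    for name, plot_type, value in series_list:
        running[plot_type] += value
        tops[name] = running[plot_type]
    return tops


# Two bars and two lines: the second line lands on top of the first.
two_lines = stack_by_plot_type([
    ("women", "bar", 7.0),
    ("men", "bar", 5.0),
    ("coverage", "line", 90.0),
    ("second line", "line", 40.0),
])

# Workaround: draw the second series with markers, so it has its own plot type.
with_marker = stack_by_plot_type([
    ("women", "bar", 7.0),
    ("men", "bar", 5.0),
    ("coverage", "line", 90.0),
    ("second line", "marker", 40.0),
])

print(two_lines["second line"])
print(with_marker["second line"])
```

The workaround only goes as far as there are distinct plot types to choose from, which is why it stops helping as more series are added.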

Concluding remarks
By using men's and women's sick leave as separate properties, we can easily show these values in stacked charts and compare men and women in a neat graphical environment. By engineering the ontologies in Anzo carefully, you should be able to draw a lot of interesting graphs.

Appendix pictures:

All bars, no stacking:

Tutorial 3 bilde 6.JPG

Trying to stack when Kindergarten coverage is also shown as bars:

Tutorial 3 bilde 7.JPG