DEVELOPING DATA WAREHOUSE AND BUSINESS INTELLIGENCE – AMELIA TEKEI

Project Description

The goal of  my project was to develop a data warehouse of my own and use dummy datasets of my choosing to demonstrate how such data could be managed and utilised using the following Microsoft Services:

  • SQL Server Integration Services (SSIS)
  • SQL Server’s Analysis Services (SSAS)
  • SQL Server Reporting Services (SSRS)

Further analysis of the dataset was also shown using PowerBi, where I was able to create an interactive visualisation report of my chosen data’s.

Using Microsoft Services and PowerBi allowed me to explore many ways in which large scale databases can be transformed and simplified for the sake of making the data appear more organised and easily readable to any individual analysing the data records.

 

Creating Data Warehouse

For my data warehouse I chose to use Microsoft SQL and created a database of my own under my university username. To populate my database I searched for datasets from kaggle.com where I decided to choose a theme in which I would enjoy working with – Mario Kart. I then used the following Microsoft Services to develop a data integration, analysis and reporting solution to support Data Warehousing.

 

SSIS

The SSIS has features which allowed me to showcase a range of data migration tasks I was able to perform using the data from my datasets. The following SSIS methods are what I performed to create a project supporting Data Warehousing solution.

Enhanced Entity Relational Diagram (EERD)

To provide structure to my relational database and create relationships between my data tables I created an EERD. The diagram also helped me identify patterns within my datasets which I used for Data Analytic and Data Visualisation.

Extracting data from CSV File to Data Warehouse

The dataset downloaded from kaggle came in the form of CSV files which required me to use SSIS to import the data from CSV to my data warehouse. Steps I took to import the data can be seen through this link.

Once the data warehouse was populated with my dataset I then transformed the data using the SSIS methods stated:

  • Derived Column

    Say for example the speed data for one of the Mario character was changed, this would also have an effect on the class type the character is placed under. To allow the user to compare the past and current change to the Mario character speed and type data, the following method was used. A timestamp was an additional column added to allow the user to view the date and time of when the data was changed if needed.

  • Data Conversion and Error Handling

    Sometimes data types may not match and therefore need to be converted. This was the case for my dataset. Some of the numerical data’s had “varchar” as its data type which created issues for myself when it came to using calculation methods further in my project. However there are times when data conversions can create errors which can be easily overseen. The following method allowed me to carry out data conversions and view the data’s that were able to successfully convert and those that were not able to.  In my example, I tried to convert all my data’s in my tires table to one-byte unsigned integer (convert the numerical data to 0 or greater) however there were errors which showed in my errorlog CSV file.

 

SSAS

Further transformations were carried out on the data in my data warehouse using SSAS. The service features allowed me to create a tabular Model Project that would provide high performance analysis of my data warehouse. I was able to perform and demonstrate ways of creating:

  • Calculated Columns – Created a new column within the Character Data table called “Speed Class” and added a calculation method that if a Mario characters speed was less then 3 then they are considered as slow whereas those greater than 4 are considered fast/faster.
  • Measures – Finding out the maximum/minimum/average speed value using the data from the character table.
  • Perspective -As the name implies this method is what allows me to create various perspectives that will enable me to show certain information depending on the user role. For example information that a player will be able to view will be different to the information that is seen by a developer of the gaming system.

I then deployed the SSAS tabular model database on SQL as you can see by the image.

 

SSRS

This service feature allowed me to develop a report project for my data warehouse. SSRS offers various features e.g. the ability to showcase the data analysis through the use of charts and tables. 

I performed the following skills within my data table:

  • Sorting – Alphabetically order the character table
  • Drilling – The table has a toggling feature that will allow the user to either see more/less information about a certain mario characters information.
  • KPI – if the speed values of each mario character exceeds 2.25 then they will either be amber/green in colour. However, if the value is lower then 2.25 then the indicator will be red in colour.

 

PowerBi

Further visuals of my datasets were shown using PowerBi where the software also allows the user to interact with the given reports. An example I created the following report to allow the viewer to select a character, view their acceleration, speed and weight data and use those information to find a kart that will help balance out the characters weaknesses (or vice versa). For example say the player is given a character with a slow speed then combining the kart with the highest speed and acceleration may increase the chances of winning.

Skip to content