Unit 33 Data Analysis and Design Sample Assignment

Unit 33 Data Analysis and Design Sample Assignment

Unit 33 Data Analysis and Design Sample Assignment

Task 1:

1.1-Critically compare different data models and schemas.

Data Models tells us how to create logical design of database. There are three major types of data models:

  • Relational Model –In this model data is organized in form of tables or relations which are related to each other.
    Consider an example, where there is a company with various departments. Each department has certain manager and employees. Every entity is related to each other.
    Relational Model
  • Hierarchical Model - In this model, each entity has one parent or root but may have one or several children or child entities.
    Hierarchical Model
  • Network Model – In this model, all entities are organized in form of graph. Some entities can be accessed from more than one path.
    Network Model

1.2- Compare database systems with file based systems and also discuss the benefits and limitations of different database technologies with reference to above scenario

 File systems and database management systems (DBMS) are two ways of managing data. DBMS is a computer based system to record persistent data.

  • In database management system, data is stored in large databases, while in file system, data is stored in form of files.
  • In file system, tasks like storage, retrieval, and search are done manually while in DBMS such tasks are automated, usually done by help of some tool.
  • Manual process of executing tasks invites problems like data integrity, data inconsistency and data security while such problems can be avoided in DBMS.
  • In DBMS certain control mechanisms are used and reading line by line is not required, whereas this is not the case with file systems.
  • We can avoid unauthorized access to database in DBMS whereas in file systems this is not possible.
  • With DBMS we can make use of backup and recovery, whereas in file systems, data once lost cannot be recovered back.
  • In DBMS, multiple users can access the data at one time, whereas with file systems this is not possible. File systems are basically designed for single user to access it at a time.
  • DBMS avoids data duplication. If a set of data is required by multiple web application, it is made available, whereas in file system this is not possible, one program is not readable by another program.
  • DBMS coordinates both logical and physical access to data, while file system only provides physical access to data.

1.3- Analyze different approaches to database design.

Organizations these day’s follow four major techniques of designing databases. These are:

  • Top-Down Approach – This starts with a general idea of what is required for the system and moves towards more specific details. This process involves identifying various entities and their attributes.
    Top-Down Approach
  • Bottom-Up Approach–This approach is just reverse of top down approach. It starts from more specific details and moves down to general details. In this approach, attributes are identified prior to entities.
    Bottom-Up Approach
  • Centralized Approach–This approach is useful in small organizations, where there are less number of entities and interactions between entities and attributes is easy. This approach is even considered in some large organizations which follow a relatively simple database structure. Normally it has a single database administrator or a small website design team that designs the entire database for the organization. Problem or specification is defined by this person or design team, a conceptual design is created, this conceptual design is verified by user views and system processes and data constraints are defined, to ensure that developed design complies with organization’s requirements.
    Centralized Approach
  • Decentralized Approach–This approach is useful when the system has large number of entities and complex relationships exist among them. A team of carefully selected designers is employed for large or complex projects, to accomplish the job.
    Decentralized Approach

Task 2:

2.1 - Discuss the principles of normalization and steps followed to achieve normal forms

Database normalization is a technique to arrange, manage or organize the data (fields and tables) in a relational database. This process is followed to minimize redundancy and to ensure that data is logically stored.
If our database is not normalized, it becomes difficult to maintain and update database without data loss. We might face three kinds of anomalies if database is not normalized. These are:
Insertion Anomaly:
Consider table Manager:

Manager_id

Manager_name

Manager_address

Department

01

Neil

Gurgaon

Marketing

02

Edwin

Mumbai

Production

03

Stuart

Delhi

Production

04

Neil

Gurgaon

Sales

Suppose for table Manager, for a new entry we have manager_id, manager_name, manager_address, but the newly recruited person has not been assigned any department yet, still then we have to insert NULL there, leading to insertion anomaly.
Deletion Anomaly:
In above table, if for ‘01’ manager_id there is only one department assigned, but if he temporarily drops it, then we might have to delete entire record corresponding to it, rather than only one column.
Updation Anomaly:
To update the address of the manager, who occurs more than once in a table, we need to update address in all entries, else data will become inconsistent. E.g.: In above table, Manager_name = Neil occurs more than once, so manager_address entries needs to be changed two times in the table.
Therefore, to remove such anomalies, we need to normalize our database tables.
There are 4 major types of Normal Forms:
First Normal Form (1NF):According to this rule, no two groups of data or records in table must contain same information. E.g.:
Manager Table: (Before normalization)

Manager_name

Manager_age

Manager_department

Neil

45

Marketing, Sales

Edwin

42

Production

Stuart

49

Textile

Manager Table: (After Normalization)

Manager_name

Manager_age

Manager_department

Neil

45

Marketing

Neil

45

Sales

Edwin

42

Production

Stuart

49

Textile

After normalizing manager table, each row has unique data, but data redundancy increases.
Second Normal Form (2NF): There must not be any partial dependencies, according to this rule. The concatenation must depend on primary key.
Manager table after 2NF:

Manager name

Manager age

Neil

45

Edwin

42

Stuart

49

 

Manager_name

Manager_department

Neil

Marketing

Neil

Sales

Edwin

Production

Stuart

Textile

Third Normal Form (3NF): According to this rule, all the transitive dependencies must be removed from the table and every non-prime attribute must be dependent on primary key.

Manager_id

Manager_name

Manager_age

Manager_street

Manger_city

Manager_zip

 

 

 

 

 

 

New tables after 3NF:

Manager_id

Manager_name

Manager_age

Manager_zip

 

 

 

 

Manager_street

Manger_city

Manager_zip

 

 

 

 

 

       

Boycc Codd Normal Form (BCNF):A table is said to be in BCNF, if it is in 3NF and if it does not have multiple overlapping candidate keys.

2.2 – ER Diagram for English Premier League with all entity sets, attributes and relationships and cardinality constraints

ER Diagram for English Premier League

2.3 –Relational Data Model

Table

Attributes

Data Type

Constraints

Manager

Manager_id

int

Primary Key

 

Manager_fname

Varchar(50)

 

 

Manager_lname

Varchar(50)

 

 

Manager_dob

datetime

 

 

Manager_pob

Varchar(50)

 

 

 

 

 

Players

Player_id

int

Primary key

 

Player_fname

Varchar(50)

 

 

Player_lname

Varchar(50)

 

 

Player_dob

datetime

 

 

Player_pob

Varchar(50)

 

 

Game_id

int

Foreign Key

 

score

int

 

 

 

 

 

Coach

Coach_id

int

Primary key

 

Coach_fname

Varchar(50)

 

 

Coach_lname

Varchar(50)

 

 

 

 

 

Refree

Refree_id

int

Primary Key

 

Refree_fname

Varchar(50)

 

 

Refree_lname

Varchar(50)

 

 

 

 

 

Team

Team_id

int

Primary Key

 

Team_name

Varchar(50)

 

 

Score

int

 

 

city

Varchar(50)

 

 

Manager_id

int

Foreign key

 

Coach_id

int

Foreign Key

 

Player_id

int

Foreign Key

 

Game_id

int

Foreign Key

 

 

 

 

2.4 – Create tables using SQL (DDL) commands

Table: Manager

createtable manager
CODING A

CODING A1

2.5 – Screenshots from SQL Server Management Studio (IDE)

Creation of Table -Manager:
Creation of Table -Manager
Creation of Table -Players:
Creation of Table -Players
Creation of table – Coach:
Creation of table – Coach
Creation of Table – refree:
Creation of Table – refree
Creation of Table – Team:
Creation of Table – Team
Creation of Table – Game:
Creation of Table – Game
Update table with Primary Key Constraint:
Update table with Primary Key Constraint
Insertion of Data into Table:
Insertion of Data into Table
Updation of Data into table:
Updation of Data into table

Need help?

Get Complete Solution From Best Locus Assignment Experts.

Place an order

Task 3:

3.1 - Benefits of using manipulation and query tools in a relational database system

SQL DDL commands – DDL stands for data definition language. These commands help us to define the database structure.
Create Table statement, Drop table statement and Alter table statement are three major commands which help us to define structure of database and schema.
Create Table statement: This statement helps us to create table in database. E.g.:CODING A2

Alter table statement: This statement helps us change a value in a database table. E.g.:
altertable manager1
addprimarykey(manager_id);
Drop table statement: This statement helps us to delete a database table along with its schema. E.g.:
droptable manager1;
SQL DML commands – DML stands for data manipulation language. These help us to manage data with schema objects.
Insert statement, update statement and delete statement are three major commands which help us to manipulate with the date in the database tables.
Insert Statement: It helps us to insert data or records in database table.
E.g.–To insert a record of data into table manager we write following query in SQL Server Management Studio IDE:
Insertinto manager
(manager_id,manager_fname,manager_lname,manager_dob,manager_pob)
values(1,'Neil','Armstrong','february 27,1984','london')
Delete Statement: It helps us to delete data from database table.
E.g. – To delete a record of data from table manager, we write following query in SQL Server Management Studio IDE:
deletefrom manager
where manager_id = 2;
Update Statement : This statement helps us to modify data from database table.
Eg – To update a record of data from table manager, we have to write following query in SQL Server Management Studio IDE:
update manager
set manager_fname ='Edwin'where manager_id ='1';
By above query, the record with manager_id = 1 will update, manager_fname to Edwin.

3.2 – Populate tables with some data and answer queries –

  1. List the managers of each team.
  2. Manager_fname, Manager_lname from manager
  3. team
    On manager.manager_id = team.manager_id;
  4. Output the full name of the top scorer in the league with the number of goals scored.
    Selecttop 1 players. player_fname, players.player_lname, players.score from players
                Innerjoin team
                On players.game_id = team.game_id
                Orderby team.score desc
  5. Output the average number of goals per game of all games played.
  6. AVG(score)as'Average Score'from game
  7. List all of the games that were played in a particular ‘city’.
    Select game_id from Game where city = ‘Manchester’;
  8. List all players who have played a game that was refereed by a particular referee (choose any referee).
    Select players.player_fname, players.player_lname from players, game, team, refree
    Where game.refree_id = team.refree_id and
    players.player_id = team.player_id and
    team.refree_id  = refree.refree_id and
    refree_fname ='smith';
  9. Create a database viewfor your system to produce a list of fixtures yet to be played along with the date of each game.
    Createview fixture as
    Select game_id,date
                From game
                Where(date >='')

3.3 - Critically evaluate the validity of the data extracted using the above queries and comment on the design process followed to ensure that meaningful data is extracted through the use of query tools.

All the queries mentioned above have been executed using SQL Server Management Studio IDE, and are thus successful in retrieving relevant information. By making use of SQL DDL (Data definition language) and DML (Data Manipulation Language), meaningful data can be fetch in IDE.

Task 4:

4.1 - Critically review and test the relational database system designed for the given scenario and thus provide a documentation supporting implementation and testing of the relational database system developed.

There are six entities in this given scenario based upon which I have designed the relational database for the system. Manager, Players, Coaches, Refree, Team, Game. I have developed the relational database for this system by defining tables with primary key and foreign key constraints. Each entity and relationship has been converted into table, and attributes into corresponding fields.
The database thus developed has important data which must be tested from time to time for any discrepancy. In day to day basis, databases are accessed by thousands of users in an organization. These databases even have important functionality like stored procedures, stored functions, triggers, views, instances, queries etc. There must be a complete regression test suite, to be run over the database periodically. There are number of reasons why we must test our database on regular basis:

  • Database is an important asset, by which hundreds or thousands of users are associated. If there is any loss of data, people associated with it will be affected and they may have to incur huge loss.
  • Databases of large organizations incorporate mission critical data in them, which contains their business functionality, so it is essential to secure it from unidentified access and any unforeseen disasters.
  • Testing tells us if there are any defects in the system, and whether we must take any steps to remove those errors.
  • If we make changes or modifications in our database, then sometimes it may result in some errors like overlapping, redundancy or missing data. Regression testing helps us detect those errors.

4.2 - Create brief user documentation for the relational database system developed.

According to given scenario, English premier league has managers, players, and coaches, refree. Teams participate in English premier league and play games. Team can be either home team or away team. The players who play the game can be goalkeeper, midfielder, defender, and striker.
The entire database developed for this scenario is normalized up to 3nf (3rd normal form) and thus there is no transitive dependency or redundancy.
In this project, I have created 6 database tables: Manager, Players, Coach, Refree, Team, and Game.
Description of tables is as follows:

  1. Manager : Manager_id (pk), manager_fname, manager_lname, manager_dob, manager_pob
  2. Players : Player_id(pk), player_fname, player_lname, player_dob, player_pob, score, game_id(fk)
  3. Coach : Coach_id(pk), coach_fname, coach_lname
  4. Refree : Refree_id(pk), refree_fname, refree_lname
  5. Team : Team_id(pk), team_name, score, city, manager_id(fk), coach_id(fk), player_id(fk), score, refree_id(fk)
  6. Game: Game_id(pk), team_id(fk), refree_id(fk), date, score

Where pk represents primary key and fk represents foreign key.

4.3 - By giving a tabular V&V document, explain how verification and validation has been addressed.

We have V-Model in software development, popularly known as verification and validation model. It is executed in sequential manner and each phase must be completed before the next phase begins. These tasks are performed to identify how consumers or users perceives the final product or software. Is the final product satisfy the quality standards and is it good to be used.

Verification

Validation

 

 

It is the process in software development life cycle to check if we are in the right track of developing the product or software.

It is the process in software development life cycle to check whether the final product meets business needs.

 

 

This process involves checking the intermediary documents like ER Diagrams, database design, website design, test cases, test plan, design documents, traceability matrix.

This process involves checking that all the design documents and requirement specification reflect the requirement and are fit to be used. When placed in intended environment, it is to be checked that product fulfills its intended use.

 

 

Verification is done before software execution

Validation involves software execution

 

 

This process involves all techniques of static testing.

This process involves all techniques of dynamic testing.

 

 

Examples of verification are : Reviews, inspection, walkthrough

Examples of validation are : All types of testing like Regression testing, System testing, User Acceptance testing, smoke testing etc.

 

 

In this project, all design documents like ER Diagrams and requirement specifications have been verified and all system requirements have been addressed therein.

After system completion, it will undergo various types of testing to check if it is fit to be used.

 

 

4.4 - Explain control mechanisms and show how these techniques have been used in developing your system.

Control mechanisms are methods or process to define and manage variables in a desirable way. For example a test manager at deployment site might install a variety of control mechanisms to help them monitor various testing activities in software testing life cycle.
We need to apply control mechanisms over our system to ensure that a high standard of quality is met. To keep the entire process on track, it is important to have quality control. It helps us to figure out the problem, fix it and also helps us in judging the effectiveness of implemented solution.
The entire process of quality control must go on smoothly. A control feature must be devised, to manage major portion of work in standardized manner.
If in the entire process of controlling the system work flow, we are not able to achieve any task by straight method, we need to devise some alternative control mechanisms, which comply with the standardized methods.
In the process of controlling, one must also know how to react in case of defects. The areas of system where defects are most likely to occur must be monitored carefully, so that the defects can be caught and fixed there and then, rather than continuing the work flow further. Defects must be caught and treated in their nascent stage or must be reduced to near zero, so that we can say that six sigma is attained.

References:

  1. ER Diagram in DBMS [Online]. [Accessed on 17 November 2014]. Available on world wide web: <http://www.studytonight.com/dbms/er-diagram.php>
  2. Structured Query Language/Data Manipulation Language [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://en.wikibooks.org/wiki/Structured_Query_Language/Data_Manipulation_Language>
  3. Database Testing [Online].  [Accessed on 18 November 2014]. Available on world wide web <http://www.agiledata.org/essays/databaseTesting.html>
  4. Why to test a database [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://www.softwaretestinggenius.com/why-should-we-test-a-relational-database>
  5. What is verification and validation [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://www.softwaretestinghelp.com/what-is-verification-and-validation/>
  6. Six Sigma control phase [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://www.managementstudyguide.com/six-sigma-control-phase.htm>
  7. Normalization of Database [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://www.studytonight.com/dbms/database-normalization.php>
  8. Database design strategies [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://databasemanagement.wikia.com/wiki/Database_Design_Strategies>
  9. Difference between DBMS and File System [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://www.differencebetween.com/difference-between-dbms-and-vs-file-system/>
  10. Difference between file processing system and database management system [Online]. [Accessed on 18 November 2014]. Available on world wide web <http://studychacha.com/discuss/135910-difference-between-file-processing-system-database-management-system.html>