COIT20257
Assessment item 3
Student Name:
Student ID:
2.1 Distributed Databases and Distributed Transactions
a) ACID (Atomicity, Consistency, Isolation and Durability)
b) Distributed transactions and distributed databases
2.2 Ubiquitous systems and social impact
i. Cyber physical system
ii. IoT (Internet of Things)
i. Use of cyber physical system
ii. Use of Internet of Things
2.3 Cloud computing fault tolerance challenges
a) Cloud computing and architectural models
c) Fault tolerance techniques
d) Fault tolerance challenges
e) Techniques used to meet challenges
List of figures
Figure 1: Cyber physical system
Figure 2: IoT
Figure 3: Software as a service
Figure 4: Infrastructure as a service
Figure 5: Platform as a service
Figure 6: Fault tolerance manager
This report discusses distributed databases and distributed transactions, including definitions of the properties that make up ACID and the different types of partitioning applied to databases. It then covers ubiquitous systems and their social impact, including cyber physical system and Internet of Things architectures and how these systems are used. Finally, it discusses cloud computing fault tolerance challenges.
Atomicity means that a database transaction either takes place in full or does not take place at all. If a transaction has started and the connection is disrupted partway through, the database nullifies the partial progress and returns to the state it was in before the transaction started.
Consistency is the property that ensures the integrity of the data is maintained. A consistent transaction moves the data while respecting the integrity constraints specified by the database. If data integrity would be compromised, the database nullifies the changes and returns to the state it was in before the transaction started (Watts 2020).
Isolation is the property that makes database transactions traceable and serializable. Every transaction on the database can be identified individually, and no transaction interferes with another transaction on the same database: transactions appear to take place one by one rather than in tandem. Multiple transactions can therefore run on the database as long as none affects the others.
Durability is the property that makes fully committed transactions permanent. It ensures that the changes made by a committed transaction are recorded and that the database retains those changes even in the case of system failure or connection disruption.
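As a minimal sketch of atomicity and durability in practice, the following Python example uses the standard-library sqlite3 module; the accounts table and the simulated failure are illustrative assumptions, not part of the report's sources:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
conn.commit()

def transfer(conn, src, dst, amount, fail=False):
    try:
        # Both updates belong to one transaction: either both apply or neither does.
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src))
        if fail:
            raise RuntimeError("connection disrupted mid-transaction")  # simulated failure
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst))
        conn.commit()    # durability: once committed, the change is permanent
    except Exception:
        conn.rollback()  # atomicity: the partial debit is discarded

transfer(conn, "alice", "bob", 50, fail=True)
print(conn.execute("SELECT name, balance FROM accounts").fetchall())
# [('alice', 100), ('bob', 0)] -- the pre-transaction state is restored
```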
A distributed database is, as the name suggests, a database distributed among data files stored in different locations on the same or completely different networks or servers. The data is stored in physically separate locations, and computation on the data is performed by distributed machines at those locations. A distributed transaction is a data flow through the database that is carried across two or more network or server hosts. The hosts are responsible for providing the resources for the request, and the transaction is handled by a transaction manager responsible for the operations performed on those resources (Bharati and Attar 2018).
A distributed database is managed by a centralized distributed database management system, which connects the data stored across different locations and machines, centralizes it, and lets it be used as one combined database. It also propagates updates after every transaction so that the changes are reflected in the data stored at all locations. The biggest advantage of a distributed database is that if one component breaks down, the working of the database is not affected (Muzammal et al. 2019). A distributed transaction is a transaction that involves two or more databases and is carried out over two or more networks or servers. It has two possible outcomes: either the transaction is confirmed with the requested changes applied to the databases, or no changes are made because the transaction failed (Wei et al. 2018).
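One common way to obtain this commit-or-nothing outcome is a two-phase commit protocol; the sources cited here do not prescribe one, so the sketch below is a hypothetical illustration in plain Python:

```python
class Participant:
    """One database involved in the distributed transaction (hypothetical)."""
    def __init__(self, name, healthy=True):
        self.name, self.healthy = name, healthy

    def prepare(self):          # phase 1: vote on whether we can commit
        return self.healthy

    def commit(self):           # phase 2a: apply the changes
        print(f"{self.name}: committed")

    def abort(self):            # phase 2b: discard the changes
        print(f"{self.name}: aborted")


def two_phase_commit(participants):
    # Phase 1: ask every participant to prepare and collect the votes.
    if all(p.prepare() for p in participants):
        for p in participants:  # Phase 2: unanimous yes -> commit everywhere
            p.commit()
        return "committed"
    for p in participants:      # Any no-vote -> abort everywhere
        p.abort()
    return "aborted"


print(two_phase_commit([Participant("db1"), Participant("db2", healthy=False)]))
# db1: aborted / db2: aborted -> "aborted": no changes were made
```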
Horizontal partitioning splits the rows of the data stored in the database across different tables so that the traffic is divided into several parts. It also helps store more data and makes the data easier to access by dividing the traffic according to the criteria used to split the rows.
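A minimal sketch of hash-based horizontal partitioning, with the partition count and partitioning key chosen purely for illustration:

```python
import hashlib

NUM_PARTITIONS = 4
partitions = {i: [] for i in range(NUM_PARTITIONS)}

def partition_for(key: str) -> int:
    # Hash the partitioning key so rows spread evenly across partitions.
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % NUM_PARTITIONS

for row in [{"user": "alice"}, {"user": "bob"}, {"user": "carol"}]:
    partitions[partition_for(row["user"])].append(row)

print({k: v for k, v in partitions.items() if v})
```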
i. Deadlock is a problem that occurs in distributed databases. A process can request resources from the database in any order that is not specified or predefined, and it can hold resources it has already reserved while requesting others. If the allocation of resources to processes is not sequenced or controlled, deadlocks can arise: two processes each request resources that have been reserved by the other (Bharati and Attar 2018).
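Such deadlocks can be detected by looking for a cycle in a wait-for graph, where an edge from P1 to P2 means P1 is waiting for a resource reserved by P2; a minimal sketch with a hypothetical graph:

```python
def has_deadlock(wait_for):
    """Detect a cycle in a wait-for graph given as {process: [processes it waits on]}."""
    visited, on_stack = set(), set()

    def dfs(node):
        visited.add(node)
        on_stack.add(node)
        for nxt in wait_for.get(node, []):
            if nxt in on_stack or (nxt not in visited and dfs(nxt)):
                return True  # back edge found: a cycle, hence a deadlock
        on_stack.discard(node)
        return False

    return any(dfs(p) for p in wait_for if p not in visited)

# P1 waits on P2 and P2 waits on P1: the classic two-process deadlock.
print(has_deadlock({"P1": ["P2"], "P2": ["P1"]}))  # True
```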
ii. Range partitioning splits the data stored in the database according to a range predefined on a particular column, which can hold dates, currency amounts, a unique key, or any other data stored in the database.
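A sketch of range partitioning on a date column; the boundaries and partition names are hypothetical:

```python
from datetime import date

# Each partition holds rows with dates up to (but not including) its bound.
BOUNDS = [(date(2020, 1, 1), "p2019"), (date(2021, 1, 1), "p2020")]

def range_partition(order_date: date) -> str:
    for bound, name in BOUNDS:
        if order_date < bound:
            return name
    return "p_current"  # catch-all for dates past the last boundary

print(range_partition(date(2020, 6, 15)))  # p2020
```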
Schema-level partitioning is partitioning of the database done at the schema level: the partitions are defined while the schema is being built, or within the schema of the database itself.
Graph-level partitioning is the partitioning of a graph. A graph can be divided by bisecting it or by splitting it into n sets, cutting either along vertices or along edges.
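A toy illustration of a vertex bisection and the resulting edge cut (the graph is hypothetical; real partitioners search for a bisection that minimises the cut):

```python
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]  # a 4-cycle
part_a, part_b = {"a", "b"}, {"c", "d"}                   # one possible bisection

# The edge cut counts edges whose endpoints land in different partitions.
cut = sum(1 for u, v in edges if (u in part_a) != (v in part_a))
print(cut)  # 2
```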
iii. Partitioning algorithms and consensus algorithms: There are four types of partitioning algorithm: k-means, k-medoids (also known as PAM, Partitioning Around Medoids), CLARA, and CLARANS. The k-means algorithm takes a value for k and divides a dataset of N objects into k clusters. Consensus algorithms come in various types, including proof of work, practical Byzantine fault tolerance, proof of stake, proof of burn, and many more. Proof of work is a type of consensus algorithm that helps choose the miner who generates the next block.
The k-medoids algorithm is carried out in the steps below:
k objects are chosen randomly from the dataset D as the initial medoids.
Assign every other object to the nearest of the k medoids.
For every pair of a medoid K and a non-representative object O, calculate the swapping cost S.
If the value of S is negative, replace K with O.
Repeat until no swap has a negative cost.
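A compact sketch of these steps in pure Python, assuming one-dimensional points and absolute difference as the distance; a production implementation would use a clustering library instead:

```python
import random

def total_cost(points, medoids):
    # Cost: each point contributes its distance to the nearest medoid.
    return sum(min(abs(p - m) for m in medoids) for p in points)

def k_medoids(points, k, seed=0):
    random.seed(seed)
    medoids = random.sample(points, k)              # step 1: random initial medoids
    improved = True
    while improved:                                 # final step: repeat until no gain
        improved = False
        for m in list(medoids):
            for o in points:
                if o in medoids:
                    continue
                candidate = [o if x == m else x for x in medoids]
                # swap K with O whenever the cost change S is negative
                if total_cost(points, candidate) < total_cost(points, medoids):
                    medoids, improved = candidate, True
    # assignment step: group each point under its nearest medoid
    clusters = {m: [p for p in points if min(medoids, key=lambda x: abs(p - x)) == m]
                for m in medoids}
    return medoids, clusters

print(k_medoids([1, 2, 3, 10, 11, 12], k=2))
```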
Proof of work is carried out in the following steps:
The problem (a hash puzzle) is identified.
Miners then try to solve the problem.
When a miner finds the solution, the solution is broadcast to the other miners.
The other miners then verify the solution.
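A minimal proof-of-work sketch using Python's hashlib; the difficulty and block payload are illustrative assumptions:

```python
import hashlib

DIFFICULTY = 4  # the hash must start with this many zero hex digits

def mine(block_data: str) -> int:
    """Search for a nonce so that hash(block_data + nonce) meets the difficulty."""
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * DIFFICULTY):
            return nonce
        nonce += 1

def verify(block_data: str, nonce: int) -> bool:
    # Other miners verify cheaply: one hash instead of a whole search.
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * DIFFICULTY)

nonce = mine("block 42")
print(nonce, verify("block 42", nonce))  # the solution checks out: True
```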
Ubiquitous systems are systems used for ubiquitous computing, a computer science concept in which computation can be done regardless of the time and place of the computation. Unlike desktop computing, it can be done using any device, at any time, and in any place (Villela et al. 2019).
A cyber physical system is a system that combines computation with the physical world. It is generally used for real-time systems and embedded systems. In this type of system, real-world data is captured using sensors, and a central unit computes over that data, performing operations to analyse or manipulate it and obtain the desired results.
Figure 1: Cyber physical system
IoT is a type of ubiquitous system made up of various interrelated systems and computation devices, including digital and mechanical machines, animals, and objects, that can transfer data over servers, process it, and produce the required output without any effort from humans. The Internet of Things can compute data regardless of time or place: data is collected from the real world through sensors, without a human to input it, and processed to produce the pre-specified results.
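A toy sketch of that sensor-to-action pipeline; the readings are simulated and the threshold is hypothetical (a real deployment would read physical sensors, often over a protocol such as MQTT):

```python
import random

THRESHOLD_C = 25.0  # hypothetical set point

def read_temperature() -> float:
    return random.uniform(20.0, 30.0)  # stands in for a physical sensor

def act(reading: float) -> str:
    # A pre-specified rule, applied with no human input.
    return "air conditioner ON" if reading > THRESHOLD_C else "air conditioner OFF"

for _ in range(3):
    reading = read_temperature()
    print(f"{reading:.1f} C -> {act(reading)}")
```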
Cyber physical systems can be used in various fields and industries, such as autonomous automobile systems, industrial monitoring and control systems, robotics, autopilots, medical condition monitoring, and many more. In an autonomous automobile, for example, the system determines the path and self-drives to the destination without any human interference.
The Internet of Things can be used for various purposes, such as smart homes, wearables, connections between cars, industrial internet, smart cities, and agriculture, among many others. It can be used to build a smart home that identifies and responds to human presence and gestures, for example automatic lights, sound systems, voice-controlled air conditioners, and other home appliances (Yordanova et al. 2018).
The social impact of smart homes includes both merits and demerits. A smart home allows various activities, such as switching home appliances on or off, to be performed with minimal human effort. It also reduces the effort in activities that only need human monitoring to complete, such as heating water, and in routines such as making morning coffee, which can be done automatically by using IoT to set a timer according to human movement. There are demerits too: it can make people lazy and less punctual.
To put it simply, cloud computing delivers services over the internet ("the cloud"), including computing power, data centres, networks, applications, analytics, and information, to provide quicker growth, more accessible infrastructure, and economies of scale. You normally pay only for the cloud services you need, which reduces operating expenses, runs your infrastructure more efficiently, and scales as the company's needs shift. Cloud computing is available in three models, each fulfilling a different set of market criteria (Cheraghlou et al. 2016).
SaaS - Software as a Service is a model that offers easy access to online applications built on a cloud. The provider manages the whole computing stack; you only need a web browser to use the application.
Figure 3: Software as a service
IaaS - Infrastructure as a Service is essentially the supply of virtualised computing resources from the cloud. An IaaS provider can supply the whole scope of computing infrastructure, including servers, storage, and networking devices, along with their servicing and maintenance.
Figure 4: Infrastructure as a service
PaaS - Platform as a Service is mainly a cloud platform for designing, testing, and organising enterprise applications. PaaS speeds up the production of company apps: its runtime offers a ready environment in which applications can be built and tested.
Figure 5: Platform as a service
Distributed computing is the area of computer science in which distributed systems are studied. A distributed system contains many individual computers connected over a data network, and the machines cooperate to accomplish a shared purpose. Distributed systems are also used to address numerical problems: a problem is split into several tasks, each of which is resolved by one or more computers. Cloud computing can be viewed as a modern application of existing distributed systems: a computing model that links an extensive pool of resources over public or private networks to offer a scalable platform for servers, data, and file storage. This technology greatly reduces the cost of computation, enterprise applications, and information storage and distribution (Dhingra and Gupta 2017).
Cloud computing offers a real cost gain: it can turn a capital-intensive data centre into a variable-cost model. Its core features are virtualisation, distribution, and dynamically extensible infrastructure. Distribution refers to the pooled nodes the system runs on. Dynamic extension covers the flexible growth of the virtualisation layer; managing, extending, migrating, and backing up the virtual framework are all done through virtualisation.
Fault tolerance techniques for distributed networks are outlined below -
Replication based fault tolerance technique - Replication-based fault tolerance is among the most common approaches. This methodology replicates the data on several systems, and a request can be served by any replica; if one replica fails, the request is transferred to another replica.
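A minimal failover sketch over replicated services; the replica objects and their failure flags are hypothetical stand-ins:

```python
class Replica:
    def __init__(self, name, up=True):
        self.name, self.up = name, up

    def handle(self, request):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request!r}"

def send(replicas, request):
    # Try each replica in turn; the first healthy one serves the request.
    for r in replicas:
        try:
            return r.handle(request)
        except ConnectionError:
            continue  # fail over to the next replica
    raise RuntimeError("all replicas failed")

print(send([Replica("r1", up=False), Replica("r2")], "read x"))  # r2 served 'read x'
```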
Process Level Redundancy Technique - This fault tolerance strategy is used for defects that cannot be repaired, called transient faults. Transient faults occur because a system element fails temporarily or because of external interference. The issue with transient faults is that they are difficult to detect and treat, although they are of a less serious nature.
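Because transient faults tend to disappear on re-execution, a simple mitigation in this spirit is redundant retry of the affected process; a sketch with a simulated flaky task:

```python
import random

def flaky_task():
    # Simulated transient fault: the task occasionally fails at random.
    if random.random() < 0.3:
        raise RuntimeError("transient fault")
    return 42

def run_with_redundancy(task, attempts=5):
    for _ in range(attempts):
        try:
            return task()  # a transient fault usually vanishes on retry
        except RuntimeError:
            continue
    raise RuntimeError("fault persisted across all attempts")

print(run_with_redundancy(flaky_task))
```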
Fusion based technique - Replication is the most commonly used fault tolerance procedure, but its biggest drawback is the number of copies it requires: as faults rise, the backups rise with them, and the management costs become very high. The fusion-based approach fixes this problem, since it needs fewer backup machines than the replication-based strategy (AbdElfattah et al. 2017).
Fault tolerance is a capability that enables a system to continue operating even when part of it fails. Instead of collapsing completely, the machine resumes activities, possibly at a reduced rate. Fault tolerance also helps isolate faults through the identification of faults. Because of its sophistication and interdependence, fault tolerance demands careful study and consideration. The challenges of fault tolerance faced by cloud computing are outlined below -
Autonomous fault tolerance technology is needed for multiple application instances running on different virtual machines.
To create a stable framework, solutions from rival cloud service providers must be combined.
New methodologies are needed that combine these fault tolerance strategies with existing workflow programming algorithms.
In the cloud setting, the output of the fault tolerance component should be measurable by a benchmark-based methodology, relative to related components.
Several services with separate computing stacks can be utilised to maintain optimal stability and availability.
Autonomous fault tolerance should respond to different cloud configurations.
Figure 6: Fault tolerance manager
Cloud computing has a fault tolerance manager, which manages faults, helps resolve them, and keeps the machine running and working while the faults are resolved simultaneously.
The techniques used to solve these cloud computing challenges are listed below.
Google file system - The Google File System (GFS) is a proprietary distributed file system created by Google Inc for its own use, designed to provide powerful and efficient data access across large heterogeneous computing clusters and to satisfy Google's increasing data processing needs. GFS delivers fault tolerance, reliability, scalability, availability, and performance to large networks of connected nodes.
Big table - A Bigtable is a sparse, distributed, multidimensional sorted map. The map is indexed by a row key, a column key, and a timestamp; each value in the map is an uninterpreted byte sequence. Cloud Bigtable is a densely populated table that can store terabytes or even petabytes of data across billions of rows and thousands of columns. Each row is indexed by a single value, called the row key.
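A toy model of that map in Python, using a dict keyed by (row key, column key, timestamp); the cell contents are illustrative:

```python
# Bigtable's data model as a plain dict: (row key, column key, timestamp) -> bytes.
table = {}

def put(row, column, timestamp, value: bytes):
    table[(row, column, timestamp)] = value

def latest(row, column):
    # Return the most recent version of a cell, if any.
    versions = [(ts, v) for (r, c, ts), v in table.items() if (r, c) == (row, column)]
    return max(versions)[1] if versions else None

put("com.example/index", "contents:html", 1, b"<html>v1</html>")
put("com.example/index", "contents:html", 2, b"<html>v2</html>")
print(latest("com.example/index", "contents:html"))  # b'<html>v2</html>'
```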
Map reduce programming model - Map-Reduce is a proprietary Google software framework for processing large amounts of data on computer clusters. The paradigm reuses the map and reduce features typically found in programming languages, although their purpose in the Map-Reduce paradigm is not the same as in their original forms (Kumari and Kaur 2018).
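A minimal word-count sketch of the map and reduce phases in single-process Python; a real framework shards both phases across a cluster:

```python
from collections import defaultdict

def map_phase(document: str):
    # Map: emit (key, value) pairs -- here, (word, 1) for every word.
    for word in document.split():
        yield word, 1

def reduce_phase(pairs):
    # Shuffle/reduce: group values by key and combine them -- here, by summing.
    groups = defaultdict(int)
    for word, count in pairs:
        groups[word] += count
    return dict(groups)

docs = ["the cloud", "the cluster the cloud"]
pairs = [pair for doc in docs for pair in map_phase(doc)]
print(reduce_phase(pairs))  # {'the': 3, 'cloud': 2, 'cluster': 1}
```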
This report helps in understanding distributed databases and distributed transactions, along with the properties of atomicity, consistency, isolation, and durability. It also covers partitioning techniques and ubiquitous systems with their social impact, including the two types of ubiquitous system, cyber physical systems and the Internet of Things. Finally, it covers cloud computing fault tolerance challenges, including cloud computing and its architectural models and the challenges faced in cloud computing.
AbdElfattah, E, Elkawkagy, M and El-Sisi, A 2017, December, ‘A reactive fault tolerance approach for cloud computing’, In 2017 13th International Computer Engineering Conference (ICENCO) (pp. 190-194). IEEE.
Bharati, R D and Attar, V Z 2018, August, ‘A comprehensive survey on distributed transactions-based data partitioning’, In 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA) (pp. 1-5). IEEE.
Cheraghlou, M N, Khadem-Zadeh, A and Haghparast, M 2016, ‘A survey of fault tolerance architecture in cloud computing’, Journal of Network and Computer Applications, 61, pp.81-92.
Dhingra, M and Gupta, N 2017, ‘Comparative analysis of fault tolerance models and their challenges in cloud computing’, International Journal of Engineering & Technology, vol.6.no.2, pp.36-40.
Kumari, P and Kaur, P 2018, ‘A survey of fault tolerance in cloud computing’, Journal of King Saud University-Computer and Information Sciences.
Muzammal, M, Qu, Q and Nasrulin, B 2019, ‘Renovating blockchain with distributed databases: An open source system’, Future Generation Computer Systems, 90, pp.105-117.
Villela, K, Groen, E C and Doerr, J 2019, ‘Ubiquitous requirements engineering: a paradigm shift that affects everyone’, IEEE Software, 36(2), pp.8-12.
Watts, S 2020, ACID: Atomic, Consistent, Isolated, & Durable, viewed 8 February 2021, https://www.bmc.com/blogs/acid-atomic-consistent-isolated-durable/
Wei, X, Dong, Z, Chen, R and Chen, H 2018, ‘Deconstructing RDMA-enabled distributed transactions: Hybrid is better!’, In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) (pp. 233-251).
Yordanova, K, Paiement, A, Schröder, M, Tonkin, E, Woznowski, P, Olsson, C M, Rafferty, J and Sztyler, T 2018, ‘Challenges in annotation of user data for ubiquitous systems: Results from the 1st arduous workshop’, arXiv preprint arXiv:1803.05843.