Monday, June 3, 2019
Review Paper on Fault Tolerance in Cloud Computing
A REVIEW PAPER ON FAULT TOLERANCE IN CLOUD COMPUTING

Deepali Mittal
Ms. Neha Agarwal

Abstract

Demand for cloud computing is increasing, which makes it important to provide correct services even in the presence of faults. Resources in cloud computing can be scaled dynamically, and in a cost-effective manner. Fault tolerance is the process of finding faults and failures in a system. If a fault occurs, or there is a hardware or software failure, the system should still work correctly. Failures should be managed effectively for reliable cloud computing; doing so also ensures availability and robustness. This paper aims to provide a better understanding of the fault tolerance techniques used for managing faults in the cloud. It also deals with some existing fault tolerance models.

Index Terms: Cloud Computing, Fault Tolerance, Reliability.

I. Introduction

Cloud computing is a new method that can be used to represent a computing model in which IT services are delivered via internet technologies. These services have attracted millions of users. Cloud storage provides not only a massive computing infrastructure but also economies of scale. Such a trend requires assurance of the quality of data storage services, which involves two concerns shared by both cloud users and cloud service providers: data integrity and storage efficiency.

It is much simpler than the internet. It is a service that allows users to access applications that actually reside at a location other than the user's own computer or other devices on the network. There are many benefits of this technology; for example, another company hosts the user's application.

Cloud computing is nothing new, as it uses approaches, concepts, and techniques that have already been developed.
On the other hand, everything is new, as cloud computing changes how we invent, develop, deploy, scale, update, maintain, and pay for applications and the infrastructure on which they run. Cloud computing is an efficient way of computing because it centralizes storage, memory, and processing.

Fault tolerance is the capability of a system to react gracefully to an unexpected hardware or software failure. In order to attain robustness or dependability in cloud computing, failures should be detected and handled carefully.

This paper gives basic knowledge about fault tolerance approaches and the methods used for fault management in the cloud. We also study some existing fault management models that tolerate faults in a cloud environment, and then try to figure out the best model of fault tolerance.

Fault tolerance covers all the different approaches that provide robustness, availability, and dependability. The major benefits of enforcing fault tolerance in cloud computing include recovery from different hardware and software failures, reduced cost, and improved performance. Robustness is the property of providing a correct service in an unwanted situation that arises because of an unexpected system state. Dependability is something that needs to be achieved; it is one of the most important aspects for a cloud provider. It includes reliability as well as availability, and it is related to some of the quality-of-service issues of the service delivered by the system. Fault tolerance is intended to accomplish robustness and dependability in the cloud environment.

Fault tolerance techniques can be classified into two types, depending on the fault tolerance policy:

Proactive fault tolerance: Proactive fault tolerance simply means early prediction of a problem before it actually arises.

Reactive fault tolerance: This policy handles the failure after the fact. The effect of a failure is reduced once the failure has actually occurred.
Reactive fault tolerance can be further divided into two sub-procedures:
1. Error processing
2. Fault treatment

The first process eliminates errors from the system. Fault treatment tries to prevent faults from being reactivated. Fault tolerance is accomplished by error processing, which has two main phases. The first phase is effective error processing, which means bringing the effective error back to a latent state, if possible before a failure occurs. The second phase is latent error processing, which aims to ensure that the error is not reactivated.

II. Existing Fault Tolerance Approaches in Cloud

The different techniques used for fault tolerance in the cloud are:

Checkpointing: This is a good fault tolerance approach for applications that have a long running time. In the checkpointing technique, a checkpoint is taken after each change in system state. It is useful when a task cannot complete because it fails midway due to some error. The task is then made to resume from the most recent checkpointed state instead of restarting from the beginning.

Task migration: There may be a case when a task is not able to complete on the specific virtual machine assigned to it. When this type of task failure occurs, the task can be moved to another machine. This can be performed using HAProxy.

Replication: Replication simply means copying. Replicas of a task are executed on distinct resources so that, if the original instance of the task fails, the required result is still obtained. Replication can be implemented using various tools, such as Hadoop, HAProxy, or Amazon EC2.

Self-healing: A big task can be divided into parts. This division is done for better performance. It results in the creation of multiple application instances, which run on distinct virtual machines. In this way, failure management is automated for the instances.

Safety-bag checks: This strategy is quite simple.
It blocks any command that does not meet the requirements for safe execution or proper working of the machine.

S-Guard: This is a stream processing technique. It makes more resources available. It uses the rollback recovery mechanism, and checkpointing is done asynchronously. It is used in distributed environments. S-Guard can be implemented using Hadoop or Amazon EC2.

Retry: A task is made to run repeatedly. This approach tries to re-execute the failed job on the same machine.

Task resubmission: A task failure can make the complete job fail. So when a failed task is identified, it should be resubmitted to either the same or a different resource for re-execution.

Timing check: The timing check is a supervised technique that uses a watchdog. It considers a critical time function.

Rescue workflow: This strategy is used for fault tolerance in workflow execution.

Reconfiguration: In this technique the configuration of the system is changed and the faulty component is removed.

Resource co-allocation: This increases the availability of resources. It takes care of multiple resources, which are allocated so that the execution of a task can complete.

III. Fault Tolerance Models

Various fault tolerance models are designed using these techniques, which are either combined with one another or used individually. Some of the existing fault tolerance models are:

AFTRC: A Fault Tolerance Model for Real Time Cloud Computing is designed with the fact in mind that real-time systems have good computational capability. These systems are also scalable and make use of virtualization techniques, which help in executing real-time applications more effectively. The model is designed around the dependability issue. It uses a proactive fault tolerance strategy and predicts the faulty nodes.

LLFT: Low Latency Fault Tolerance acts as middleware for tolerating faults. It is useful for distributed applications running in the cloud. In this model, fault tolerance is provided as a service by cloud providers.
Applications are replicated by the middleware; in this way, replication helps in handling faults for different applications.

FTWS: Fault Tolerant Workflow Scheduling is a model based on the replication approach. It also uses the resubmission technique. A metric is kept for checking the priority of tasks, and they are submitted accordingly. The model relies on the principle of a workflow, that is, a series of tasks executed in an order decided by data dependencies. Fault management is done while the workflow is scheduled.

FTM: FTM is one of the most flexible models. It delivers fault tolerance as an on-demand service. The user has the advantage of being able to specify the required fault tolerance without knowing how the model works. It is mainly designed for dependability issues. It consists of various components, each with its own functionality.

Candy: Candy is a component-based availability modeling framework. It is mainly designed for availability issues. The Systems Modeling Language (SysML) is used to construct a model from specifications. This is done semi-automatically.

Vega-Warden: Vega-Warden is a uniform user management system. It creates a global workspace for various applications and distinct infrastructures. The model is built for virtual-cluster-based cloud computing environments to overcome two problems, usability and security, that arise from sharing infrastructure.

FT-Cloud: FT-Cloud has a mechanism for automatic detection of faults. It uses frequency to identify the relevant components.

Magi-Cube: Magi-Cube is an architecture for computing in a cloud environment. It is designed for dependability, cost, and performance issues, all three of which relate to storage. This architecture provides highly reliable, low-redundancy storage. The storage system handles metadata as well as file reads and writes.

IV. Fault Tolerant Model for Dependable Cloud Computing

The Fault Tolerant Model for Dependable Cloud Computing is a model designed for dealing with failures in the cloud.
A cloud computing environment is made up of virtual machines, referred to here as nodes. The applications run on these nodes. Using this model, faulty nodes are detected and replaced by correctly performing nodes. This is done for real applications. On what criteria does the model decide that a node is faulty? There can be various parameters for detecting a faulty node, but this model uses a dependability measurement. The criteria can be changed according to the user's requirements.

A. Working of Model

The model is designed for X virtual machines. X distinct algorithms run on the X nodes. An input buffer feeds the data to the nodes: the input data is passed to all the nodes simultaneously. When a node receives the input, it starts its operation, performing the functions specified by its algorithm. In other words, the algorithm runs on the node and produces a result. The functioning of every module is different.

Accepter Module: This module tests the nodes for correct results by verifying the result of each algorithm. If the result is correct, it is forwarded for evaluation of dependability: the appropriate result is sent to the timer module. An inappropriate result is not forwarded; instead, a notification is sent.

Timer Module: This module has a timer set for every node. It checks the time at which each result arrives. Only if the result is generated within the assigned time is it forwarded.

Dependability Assessor: This module is responsible for checking the dependability of nodes. At system start, the dependability of each node is set to its maximum, that is, 100 percent. As computations are performed, the dependability of the nodes changes dynamically. Dependability is decided on the basis of the timeliness and correctness of the result. Dependability increases if the result is accurate and on time. The upper and lower limits of dependability are set at the beginning.
A node whose dependability falls below the lower limit is replaced, and a message is sent to the resource manager. The dependability assessor forwards its results to the decision maker module.

Decision Maker: This module receives the results from the dependability assessor. A node is selected from all the nodes that produced a correct result: the node with the maximum dependability is selected. The decision maker compares the dependability level of the nodes with the system dependability level, which a node must attain. If all the nodes fail to achieve the system dependability level, a failure notification is issued, meaning that all the nodes have failed for this computation cycle. Backward recovery is then performed using checkpoints. The decision maker also asks the resource manager to replace the node with the lowest dependability with a new one.

Check Pointing: Checkpointing saves the state of the system. It is done at regular, short intervals. It is helpful when the system fails completely: the strategy enables automatic recovery from the checkpointed state. This automatic recovery is done only when all the nodes fail; otherwise the system continues to work properly with the rest of the nodes.

Fig. 1. Fault Tolerant Model for Dependable Cloud Computing

B. Mechanism of the Model

Dependability Assessment Algorithm

Begin
  Initially dependability = 1, n = 1
  Input from configuration: RF, maxDependability, minDependability
  Input nodeStatus
  if nodeStatus = Pass then
    dependability = dependability + (dependability * RF)
    if n > 1 then
      n = n - 1
  else if nodeStatus = Fail then
    dependability = dependability - (dependability * RF * n)
    n = n + 1
  if dependability >= maxDependability then
    dependability = maxDependability
  if dependability < minDependability then
    call Add_new_node()
End

Decision Mechanism Algorithm

Begin
  Initially dependability = 1, n = 1
  Input from RA: nodeDependability, numCandNodes
  Input from configuration: SRL
  bestDependability = find_dependability of node with highest dependability
  if bestDependability >= SRL then
    status = success
  else
    perform_backward_recovery
    call_proc remove_node_minDependability
    call_proc add_new_node
End

C. Result

In the first cycle, both VirtualMachine-1 and VirtualMachine-3 have the same dependability, but the result of VM-1 is selected because it has a lower IP address. The output of VM-3 is selected by the decision maker (DM) from cycle 2 to cycle 4, as it has the highest dependability among the competing virtual machines. In cycle 5, VirtualMachine-3 still has the highest dependability, but it is not selected because its result was not passed by AT and TC, so it was not among the competing virtual machines.

TABLE I. Result

V. Conclusion and Future Work

Tolerating faults is an important problem in cloud computing environments. A fault tolerance method is activated when a fault enters the system boundary; that is, these strategies are in principle implemented to detect failures and take appropriate action before failures occur. This paper has looked at the need for fault tolerance and at the various methods for implementing it. Several fault tolerance models have been discussed. At present there are a number of models that provide different mechanisms to improve the system.
However, a number of problems still require attention in every framework. Each model has drawbacks, and none of them can fulfill all the expected aspects of fault tolerance. There is therefore a possibility of overcoming the drawbacks of the previous models and building an appropriate model that covers as many aspects of fault tolerance as possible.

References

Anju Bala and Inderveer Chana, "Fault Tolerance - Challenges, Techniques and Implementation in Cloud Computing," IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 1, No 1, January 2012, ISSN (Online) 1694-0814, www.IJCSI.org.

Sheheryar Malik and Fabrice Huet, "Adaptive Fault Tolerance in Real Time Cloud Computing," 2011 IEEE World Congress on Services.

Ravi Jhawar, Vincenzo Piuri, and Marco Santambrogio, "A Comprehensive Conceptual System-Level Approach to Fault Tolerance in Cloud Computing," in Proc. IEEE Int. Syst. Conf., Mar. 2012, pp. 1-5, DOI 10.1109/SysCon.2012.6189503.

P. Mell and T. Grance, "The NIST Definition of Cloud Computing," Technical report, National Institute of Standards and Technology, 2009.

Wenbing Zhao, P. M. Melliar-Smith, and L. E. Moser, "Fault Tolerance Middleware for Cloud Computing," in 3rd International Conference on Cloud Computing (CLOUD 2010), Miami, FL, USA, 2010.

M. Castro and B. Liskov, "Practical Byzantine Fault Tolerance," in Proc. 3rd Symp. Operating Syst. Design Implementation, 1999, pp. 173-186.
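
The dependability assessment and decision mechanism algorithms of Section IV-B can be sketched in Python roughly as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the names Node, assess, and decide are hypothetical, RF and SRL are taken to be the configured adjustment factor and required system dependability level from the algorithm listings, and candidates are assumed to be passed in ascending IP-address order so that a tie goes to the lower address, as described in the Result section.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical record for one virtual machine (not from the paper).
    name: str
    dependability: float = 1.0  # starts at the maximum, i.e. 100 percent
    n: int = 1                  # penalty multiplier, grows with repeated failures


def assess(node, passed, rf, max_dep, min_dep):
    """Update a node's dependability after one cycle.

    Returns False when the node has dropped below min_dep and should be
    replaced by the resource manager, True otherwise.
    """
    if passed:
        node.dependability += node.dependability * rf
        if node.n > 1:
            node.n -= 1
    else:
        node.dependability -= node.dependability * rf * node.n
        node.n += 1
    # Cap at the configured upper limit.
    node.dependability = min(node.dependability, max_dep)
    return node.dependability >= min_dep


def decide(candidates, srl):
    """Pick the most dependable candidate that reaches the level srl.

    candidates are the nodes whose results passed the accepter and timer
    modules. Returns None when no node qualifies, which is the case where
    backward recovery from a checkpoint would be triggered.
    """
    if not candidates:
        return None
    # max() returns the first maximal item, so ties go to the earlier node.
    best = max(candidates, key=lambda nd: nd.dependability)
    return best if best.dependability >= srl else None
```

With rf = 0.1, for instance, a single failure takes a fresh node from 1.0 to 0.9 and raises n to 2, so a second consecutive failure is penalised twice as heavily as the first.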