OR/MS Today - October 2001

Planning Problems

Disruption Management

Case studies from airline, shipbuilding industries show how OR can get interrupted operations back on track fast.

By Jens Clausen, Jesper Hansen, Jesper Larsen and Allan Larsen

A passenger aboard a Boeing 747 from New York to London suddenly loses consciousness. Fearing the passenger may be having a heart attack, the captain decides to divert to Gander to get immediate help. A delay of the planned arrival at Heathrow is unavoidable, but the airline's Operations Control Center (OCC) takes no action because heavy air traffic over London is delaying flights anyway.

While performing the necessary checks before takeoff from Gander, the captain discovers that one of the checks fails. Normally, this would not pose a severe problem, but the required technical expertise is not present at Gander, and now the situation turns into a serious delay. The deviation from the planned schedule will affect passengers as well as the next planned activity of the aircraft and the crew.

The disruption can be solved in various ways. One solution: fly the necessary personnel in to Gander to carry out the check of the aircraft. However, this gives rise to an overnight stop, and the passengers need accommodations. Unfortunately, there are a number of first-class passengers aboard the plane who are entitled to 5-star accommodations in such a situation, and such accommodations are not available in Gander. Thus, the solution is not feasible.

The airline opts to hire a Boeing 747 from another airline, fly it to Gander to pick up the passengers and continue to Heathrow. This is a very expensive solution. In addition, the airline is left with the problem of getting crew and aircraft back to their planned activities as quickly as possible, which is not an easy task. Is there a better solution to the problem?
Operations research methods have a proven track record of delivering high-quality solutions to a vast range of planning problems, most notably in the airline industry, production management and logistics. For more than a year, researchers at the Department of Informatics and Mathematical Modelling at the Technical University of Denmark (DTU) have been working on applying OR in a new, exciting field: disruption management.

In this context, a disruption is defined as a situation during the operation's execution in which the deviation from plan is sufficiently large that the plan has to be changed substantially. The plan produced by OR-based decision support can be applied on the day of the disruption, it can be adjusted to take last-minute changes into account, or it can produce alternative plans well ahead of potential problems.

The disruption described above was serious enough for those passengers involved, but it was not a major disruption. Major disruptions are closures of airports or airspace due to snowstorms, strikes or, as was the case with the recent terrorist attacks in the United States, events that are beyond comprehension.

Of course, costly disruptions are not limited to the airline industry. In the shipbuilding industry, for example, the just-in-time approach to production gives rise to an increased demand for robustness in plans and calls for enhanced tools to handle disruption situations. Odense Steel Shipyard (OSS), a major shipyard in Denmark, assembles ships in a large dock using a gigantic portal crane as the prime tool. During December 1999, Denmark was hit by the worst hurricane ever recorded. The hurricane blew the OSS portal crane into the dock where a ship was under construction. The disaster immediately closed down production in the dock and disrupted the shipyard's activities for several months.
The Process Cycle of an Operation

The airline and the shipbuilder are involved in different activities, but in order to carry out their daily operations they both produce a plan. As the date of the particular operation approaches, the plan is adjusted to take into account changing circumstances. This is typically called the tracking process. On the day of operation, the plan is implemented, and the operation is monitored during execution.

What happens when the observed situation deviates from the planned situation? If the deviation is marginal, no immediate action may be required in order to continue the operation. If the impact of the deviation on the operation is substantial, either because the current plan becomes infeasible or because the costs or benefits of running the operation according to the current plan change, a disruption has occurred. In order to continue operations, intervention is necessary to resolve the infeasibilities resulting from the disruption or to decrease costs or increase revenues.

The monitoring and re-planning process is referred to as the control process. As opposed to the tracking phase, the time for re-planning in the control phase is so limited that the methods used for generating the original plan cannot be used. In Figure 1, the three processes are shown in the context of the daily operation of an airline company.

Figure 1: The timeline for the daily operation of an airline.

A disruption is not necessarily the result of one particular event. For efficient disruption management, the status of the entire system forming the basis for operation is monitored. The process cycle of an operation thus consists of the three elements described above: planning, tracking and control.
Disruption Management in Action

Case 1: Managing steel plates.

The CIAMM project is a collaboration between DTU, the University of Aalborg and a number of industrial companies, including the Odense Steel Shipyard. The shipyard builds the largest container ships in the world. Ships are built in an assembly line fashion (i.e., several ships are under construction at the same time in different workshops at the shipyard). Hence, it is critical that delays are minimized in each workshop, since a delay in one workshop influences the whole production. Each workshop maintains its own planning unit, while an overall planning unit is responsible for coordinating the flow between workshops.

The first station in the production of a ship is the steel plate storage, where the raw material for the ship is delivered. The steel plates arrive by ship in large bulks, each bulk containing plates to be used for different components and at different times. The plates are stored in an outdoor field with an 8-by-32 grid of stacks until they are requested by the cutting workshop. Each stack contains 20 plates on average. The plates within each stack may vary in size. The plates are stored and retrieved by two portal cranes running on the same pair of tracks. The cranes cannot pass each other. The plates are delivered at one end of the storage and are handed over to the cutting process at the other end. The organization is illustrated in Figure 2.

Figure 2: A 4-by-8 steel plate storage area with two cranes.

At present, the storage is managed using a so-called block-oriented approach in which steel plates to be used in the same section of a ship are stored together. However, there are not enough stacks for each section to have its own. In addition, the plates often arrive weeks or even months prior to the planned use date. The topmost plates in a stack are often not the first to be used, and hence have to be moved in order to get to the relevant plates.
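The cost of digging for buried plates can be made concrete with a small sketch. It is not the project's tool: it assumes a single stack, hypothetical plate labels, and that plates lifted aside are put back in their original order once the requested plate has been removed. The function name `count_dig_up_moves` is likewise an illustration only.

```python
def count_dig_up_moves(stack, retrieval_order):
    """Count how many plates must be lifted aside to serve the requests.

    stack: plate ids listed from bottom to top.
    retrieval_order: plate ids in the order the cutting workshop asks for them.
    Assumption: plates lifted aside are put back in the same order afterwards.
    """
    stack = list(stack)  # work on a copy so the caller's list is untouched
    moves = 0
    for plate in retrieval_order:
        pos = stack.index(plate)
        moves += len(stack) - 1 - pos  # every plate lying above the requested one
        del stack[pos]
    return moves


planned = ["A", "B", "C", "D"]

# Plates stacked in arrival order, so the first plate needed is at the bottom.
print(count_dig_up_moves(["A", "B", "C", "D"], planned))  # 6

# Plates stacked by planned use date, so the first plate needed is on top.
print(count_dig_up_moves(["D", "C", "B", "A"], planned))  # 0

# Same date-ordered stack, but an urgent delivery moves plate C to the front.
print(count_dig_up_moves(["D", "C", "B", "A"], ["C", "A", "B", "D"]))  # 2
```

Even in this toy setting, a stack ordered by planned use date needs no extra moves as long as the plan holds, and a single urgent request is enough to bring some of them back.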
The goal of the project is to investigate alternative approaches to storage organization in order to minimize dig-up moves, taking into account that the planned sequence of plates to be delivered from storage often changes due to urgent deliveries. The project team is investigating two possible organizations: the time-slot organization and the self-adjusting organization. In time-slot organized storage, plates are arranged according to their planned use date. The self-adjusting organization determines the location of each new plate and the location of plates moved in dig-up moves based on the current status of the storage and the knowledge of future demands. In both cases, the quality of the solution as well as the sequencing of the cranes in order to avoid collisions are determined by simulating the activities of the storage for the rest of the day.

There are at least two approaches to disruption management in the daily operation: the control approach and the re-planning approach. In the control approach, the storage and the cranes are continuously monitored, and the next activity of each crane is decided based on the current status without regard to upcoming activities. Clearly, no efforts are wasted on planning for situations that do not occur. On the other hand, due to the limited time horizon, suboptimal decisions are bound to occur.

The alternative strategy is a re-planning approach. Here, a detailed plan based on the expected production of the coming day(s) is constructed prior to the day of operation, and the operation is run according to this plan. In a deterministic world, an optimal operation results. If disruptions occur, however, some mechanism is needed to take care of recovery. Recovery is possible either by online re-planning or by building buffers into the original plan. Building buffers is the current practice of OSS. However, this leads to costly inefficiency.
Re-planning without buffers, on the other hand, is dangerous since delays in one workshop will immediately affect the flow through the complete system.

The CIAMM project partners have developed a planning tool for the time-slot organization and the self-adjusting organization. The running time of this tool is sufficiently short that it may also be used in a disrupted situation to re-plan activities. The tool is based on the heuristic method simulated annealing, in which each suggested new plan is evaluated through simulation. Simulation plays a crucial role in the project since it also provides the interface to the end-users. This approach has been necessary for two reasons: the constraints of the problem are difficult to handle in classical mathematical models (e.g., the cranes cannot pass each other), and the evaluation of a re-plan when disruptions are taken into account is by no means obvious.

Results so far indicate that the time-slot organization and the self-adjusting organization of the steel plate storage are superior to the block-oriented organization. In a number of generated scenarios without disruptions, the number of dig-up moves was reduced by 60 percent compared to the block-oriented approach. For scenarios with disruption, similar experiments indicate that the self-adjusting organization is more robust to disruption than the time-slot oriented approach, and that the savings in terms of dig-up moves are comparable to those for the non-disrupted situation. The operation time is reduced by 40 percent compared to the block-oriented approach.

Case 2: Holistic approach to airline delays.

With more than 22,000 commercial flights each day in the European airspace and the control spread over more than a dozen national air controls, there are plenty of reasons for disruptions in air traffic. In addition, airlines regularly face restrictive weather conditions, maintenance problems and staff shortages.
As a result, one out of four European flights was delayed by more than 15 minutes in the first quarter of 2000.

The DESCARTES project, financed partly by the European Union, includes partners DTU, British Airways and Carmen Systems. The project aims at developing decision-support tools for airline disruptions on the day of operation.

Currently, plans are made for aircraft, flight crews and cabin crews based on an airline's schedule, which is determined at least six months prior to the actual operation date. Making such a plan is complicated for several reasons: aircraft maintenance rules have to be taken into account, the right capacity must be at the right place at the right time, and the characteristics of each airport have to be respected. Crew scheduling has to consider international and national rules regulating flying time, as well as individual airline agreements with unions.

The plans for crew assignments, aircraft assignments and maintenance are handed over from the planning department to the OCC a few days ahead of the operation date. Deadlines differ for different resources. For example, the plans for short-haul aircraft are handed over one day before operation, while long-haul plans are handed over five days in advance. As the plan is handed over, it becomes the responsibility of the OCC to maintain the resources so that the flight plan is feasible even if crewmembers get sick or flights arrive late. The OCC concerns itself with not only the immediate situation, but also the knock-on effects on other parts of the schedule, since flight crews, cabin crews and aircraft are not planned as a unit.

Producing recovery plans is a complex task, as many resources (crew, aircraft, passengers, slots, catering, cargo, etc.) have to be re-planned. When disruption occurs on the day of operation, large airlines usually react by solving the problem in a sequential fashion: aircraft, crew, ground operations and passengers.
Sometimes, the process is iterated with all stakeholders until a feasible plan for recovery is found.

As at many airlines, the controllers at British Airways who perform the recovery have little computerized decision support to help construct high-quality recovery options. Since building a recovery plan is time-consuming, complex work, the controllers are often content with producing only one viable plan. Furthermore, the controllers have little help in estimating the quality of the recovery action they are about to implement.

One recovery option that is almost always available is cancellation of flights or round trips. From a resourcing perspective, cancellation is ideal: it requires no extra resources, it may even free up resources, and little re-planning is required. However, from the passengers' side, it is the worst option, since they don't get where they want to go. Determining the quality of a recovery option is (as was the case for the steel plate storage) difficult. The objective function is composed of several conflicting and non-quantified goals.

The project aims at developing better support for airline operations problems. There are already systems on the market that can help airline controllers resolve disruptions. However, to the best of our knowledge, these systems only consider one resource at a time (e.g., cabin crew). With DESCARTES, we aim to develop an integrated approach that can deliver decision support for several resource areas and that takes the highly complex interaction between the areas into account. At present the focus is on four resources: aircraft, flight crew, cabin crew and passengers. DTU and Carmen Systems developed new optimization methods for this highly time-constrained problem.
The disruption management system is built around an infrastructure, "the Umbrella," which facilitates message-passing between the different stakeholders of the process (the managers of flight and cabin crew, aircraft and passengers) and the underlying systems performing the actual computations leading to recovery options for the current situation. The team has developed systems for crew recovery and aircraft recovery, and now we're working on systems integrating the recovery of different resources. In parallel, the team has developed two simulators: a consequence analyzer that walks through the rest of the day given a suggested option for a disruption and its knock-on effects, and a stochastic simulator that allows strategic analysis of different overall strategies in disruption handling.

Alerting mechanisms will be included in the final system because a disruption is not necessarily the effect of one particular event; it may be the result of a series of smaller events, each of which by itself is not serious. For example, a single crewmember calling in sick is not serious, but many crewmembers calling in sick on the same day can result in a major shortage of staff.

The project, currently in its second year, has produced prototypes that are being tested on real data in a closed environment. Later this year, the systems will be tested with respect to speed and option quality in a simulated online environment, again with real data.

With this project, it's crucial to have tools that allow the staff in the production environment to view and investigate the suggested solution options. The consequence analyzer is valuable as a stand-alone tool, since it allows the decision-makers to simulate the effect of potential decisions and to develop a better understanding of the effect of different types of strategies (avoid cancellations by all means, return to plan as fast as possible, never leave any problems to the next day, etc.).
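To illustrate what walking through the rest of the day can mean for a single aircraft, here is a minimal sketch of knock-on delay propagation. It is not the DESCARTES consequence analyzer; the `Flight` record, the flight names, the times and the 45-minute minimum turnaround are all assumptions made for the example, and crew and passengers are ignored.

```python
from dataclasses import dataclass


@dataclass
class Flight:
    name: str
    dep: int  # scheduled departure, minutes after midnight
    arr: int  # scheduled arrival, minutes after midnight


def propagate_delay(rotation, initial_delay, min_turnaround=45):
    """Return the departure delay of each flight flown by one aircraft.

    rotation: the aircraft's flights for the day, in flying order.
    initial_delay: departure delay (minutes) hitting the first flight.
    min_turnaround: hypothetical minimum ground time between two flights.
    """
    delays = {}
    ready = None  # earliest time the aircraft can depart again
    for i, flight in enumerate(rotation):
        if i == 0:
            actual_dep = flight.dep + initial_delay
        else:
            # Slack in the schedule absorbs delay; otherwise it carries over.
            actual_dep = max(flight.dep, ready)
        delays[flight.name] = actual_dep - flight.dep
        block_time = flight.arr - flight.dep
        ready = actual_dep + block_time + min_turnaround
    return delays


rotation = [
    Flight("F1", dep=480, arr=600),   # 08:00 - 10:00
    Flight("F2", dep=660, arr=780),   # 11:00 - 13:00
    Flight("F3", dep=900, arr=1020),  # 15:00 - 17:00
]
print(propagate_delay(rotation, initial_delay=60))
# {'F1': 60, 'F2': 45, 'F3': 0}
```

In this example the one-hour delay on the first flight shrinks to 45 minutes on the second and is fully absorbed before the third, because the schedule contains more ground time than the assumed minimum turnaround.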
Conclusion

Disruption management is an application area for OR that has huge potential and offers substantial gains in efficiency for the users involved. Applications range from industrial companies to the public sector (see box). Solution methods must be able to produce good and structurally different solutions fast due to the online flavor of the problems. Thus, the technical challenge is to develop methods that produce robust and near-optimal solutions fast for real-life problems. Even with the tremendous development in the field of heuristics, this is by no means a trivial task.
Jens Clausen (jc@imm.dtu.dk), Jesper Hansen, Jesper Larsen and Allan Larsen are researchers in the Department of Informatics and Mathematical Modelling at The Technical University of Denmark.

OR/MS Today copyright © 2001 by the Institute for Operations Research and the Management Sciences. All rights reserved.