Doing homework for my university studies, I stumbled upon the problem of having a somewhat smart logic for hotel bookings. The assignment was to build a model-2 jsp/servlet based website for bookings of a generic hotel chain; the required booking facilities were simple but I wanted to do something smarter, and solve the problem in a satisfactory way.
Let’s say that you are eager to go to Sidney between the 29th Jan and 5th of Feb.. because – well, any reason you want to be there in that period. What bothers me is that most booking facilities just check for a whole availability – a single room available for the entire period. Now, if I really want to go to Sidney, and I’m already willing to take 12+ hours of plane to go there, renew the passport, fight spiders, etc., probably having to change the room at the mid of my stay there is the least of the problems – specially if the alternative is not going to Sidney at all.
Now, apart from the fact that this is a stupid example because I would never want to go to Sidney because there are some of the worst (1) spiders (2) of the world(3) there, the problem was interesting to solve and the solution happened to me while walking towards home, listening to Alesha Dixon – this to justify the non-optimality of the algorithm. (The rest of the project was built over Miles Davis prestige recordings I received as a present for Christmas… much better food for the brain :))
Step 0 – the data structure
As always, the first step of a good algorithm is a good data structure. Since this is a simple algorithm, the data structure is very simple as well.. the data structure is an integer matrix of rooms rows by days columns. This already suggests that the algorithm complexity will be O(rooms*days), and it is, unless the data fetch from database is more complex for some reason. This step is already O(rooms*days) in space, there is no need to zero-set the matrix, every cell will be overwritten in step 1.
Step 1 – fill data structure with room occupation data
The step 1, of O(rooms*days) time, is simple : store a 1 in each cell where the corresponding room is available in the corresponding day, a 0 if the room is already booked.
Step 2 – does the magic
The step 1, of O(rooms*days) time, is also simple and does the magic that allows the final step to run: parse each row from right to left. If you are in zero cell, leave it as it is. If it’s a 1 cell, sum to it value to its immediate right. The final result is that each cell contains a number which says for how many following days that room is free.
Step 3 – gets the result
Step 3 – O(rooms*days) worst case, but O(rooms) on average – leads us to the final result. Let’s parse each column from left to right. For each column, get the maximum number of that column, choose that room for the period until it reaches a 0 and advance of that many columns. Repeat until time ends, or until a column of only zeros is found – which means no booking is possible even with changes.
The algorithm is a greedy algorithm of O(rooms*days) space and time complexity, optimal in the minimization of room changes but sub-optimal in fragmentation. With this I mean that the solutions it finds may lead to small fragments of rooms for future bookings. For sure a human can do better, for example:
does not leave a 3 days hole in room3, leaves room1 free for 4 days, sacrificing one day in room2 which would be hard to book anyway (unless it is already free in the weekend!).
On a side note, for storage a number of solutions are available. However you have to consider that you need the booking to be a transaction and you want the transaction to fail if one or more days/rooms are not available anymore; basically the two ways I found are using constraints and from/to columns in the table, or simply having a separate entry for each day with (room,day) as the table key : this way if the two bookings conflict they will result in a duplicate key violation and one, or both, transactions will fail. Constraints and from/to periods work too, with the pro of having less rows in the database and reducing the algorithm execution time, and the cons of requiring a bit more database-fu. Of course other solutions might be available, but I’m definetely not a database expert (actually, I more of a database rookie :)).